Survey Methods for Transport Planning

Anthony J. Richardson
Elizabeth S. Ampt
Arnim H. Meyburg

TABLE of CONTENTS

1. Introduction
1.1 THE ROLE OF SURVEYS
1.1.1 Types of Surveys
1.1.2 Survey Purposes
1.2 THE TRANSPORT SURVEY PROCESS
1.3 TRADE-OFFS IN TRANSPORT SURVEY DESIGN
1.4 STRUCTURE OF THE BOOK

2. Preliminary Planning of Surveys
2.1 OVERALL STUDY OBJECTIVES
2.2 SURVEY OBJECTIVES
2.3 REVIEW OF EXISTING INFORMATION
2.4 FORMATION OF HYPOTHESES
2.5 DEFINITION OF TERMS
2.6 DETERMINATION OF SURVEY RESOURCES
2.6.1 Costs of Surveys
2.6.2 Time Estimates for Surveys
2.6.3 Personnel Requirements
2.6.4 Space Requirements
2.7 SURVEY CONTENT

3. Selection of Survey Method
3.1 TIME FRAME FOR SURVEY
3.1.1 Longitudinal Surveys
3.2 TYPES OF DATA COLLECTION TECHNIQUE
3.2.1 Documentary Searches
3.2.2 Observational Surveys
3.2.3 Household Self-Completion Surveys
3.2.4 Telephone Surveys
3.2.5 Intercept Surveys
3.2.6 Household Personal Interview Surveys
3.2.7 Group Surveys
3.2.8 In-Depth Interviews
3.3 SUMMARY OF SURVEY METHOD SELECTION

4. Sampling Procedures
4.1 TARGET POPULATION DEFINITION
4.2 SAMPLING UNITS
4.3 SAMPLING FRAME
4.4 SAMPLING METHODS
4.4.1 Simple Random Sampling
4.4.2 Stratified Random Sampling
4.4.3 Variable Fraction Stratified Random Sampling
4.4.4 Multi-Stage Sampling
4.4.5 Cluster Sampling
4.4.6 Systematic Sampling
4.4.7 Non-Random Sampling Methods
4.5 SAMPLING ERROR AND SAMPLING BIAS
4.6 SAMPLE SIZE CALCULATIONS
4.6.1 Sample Sizes for Population Parameter Estimates
4.6.2 Sample Sizes for Hypothesis Testing
4.7 VARIANCE ESTIMATION TECHNIQUES
4.7.1 Variability in Simple Random Samples
4.7.2 Design Effects
4.7.3 Replicate Sampling
4.8 DRAWING THE SAMPLE

5. Survey Instrument Design
5.1 MORE TRADE-OFFS IN TRANSPORT SURVEY DESIGN
5.2 SCOPE OF THIS CHAPTER
5.3 QUESTIONNAIRE CONTENT
5.3.1 Length of the Questionnaire
5.3.2 Relevance of the Questions
5.3.3 Reasonableness of the Questions
5.3.4 The Context of Questions about Trips
5.3.5 Questionnaire Design to Maximise Trip Recording
5.4 PHYSICAL DESIGN OF SURVEY FORMS
5.5 QUESTION TYPES
5.5.1 Classification Questions
5.5.2 Factual Questions
5.5.3 Opinion and Attitude Questions
5.5.4 Stated Response Questions
5.6 QUESTION FORMAT
5.6.1 Open Questions
5.6.2 Field-Coded Questions
5.6.3 Closed Questions
5.7 QUESTION WORDING
5.7.1 Use Simple Vocabulary
5.7.2 Use Words Appropriate to the Audience
5.7.3 Length of Questions
5.7.4 Clarify the Context of Questions
5.7.5 Avoid Ambiguous Questions
5.7.6 Avoid Double-Barrelled Questions
5.7.7 Avoid Vague Words About Frequency
5.7.8 Avoid Loaded Questions
5.7.9 The Case of Leading Questions
5.7.10 Avoid Double Negatives
5.7.11 Stressful Questions
5.7.12 Avoid Grossly Hypothetical Questions
5.7.13 Allow for the Effect of Response Styles
5.7.14 Care with Periodicity Questions
5.7.15 Use of an Activity Framework
5.7.16 Flow of the Question
5.7.17 Always Use a Pilot Test
5.8 QUESTION ORDERING
5.9 QUESTION INSTRUCTIONS
5.9.1 Self-Completion Surveys
5.9.2 Personal Interview Surveys

6. Pilot Surveys
6.1 WHY A PILOT SURVEY?
6.1.1 A Test of ALL Aspects of Survey Design
6.1.2 A Need for Several Tests
6.2 USES OF THE PILOT SURVEY
6.2.1 Adequacy of the Sampling Frame
6.2.2 Variability of Parameters Within the Survey Population
6.2.3 Non-Response Rate
6.2.4 Method of Data Collection
6.2.5 "Skirmishing" of Question Wording
6.2.6 Layout of the Questionnaire
6.2.7 Adequacy of the Questionnaire in General
6.2.8 Efficiency of Interviewer and Administrator Training
6.2.9 Data Entry, Editing and Analysis Procedures
6.2.10 Cost and Duration of Survey
6.2.11 Efficiency of Survey Organisation
6.3 SIZE OF THE PILOT SURVEY

7. Administration of the Survey
7.1 GENERIC PROCEDURES FOR SURVEY ADMINISTRATION
7.1.1 Recruitment of Survey Staff
7.1.2 Survey Staff Training
7.1.3 Pre-Publicity
7.1.4 Survey Execution and Monitoring
7.1.5 Follow-up and Validation Procedures
7.1.6 Confidentiality
7.1.7 Response Rates
7.2 ADMINISTRATION OF SELF-COMPLETION SURVEYS
7.2.1 The Use of a Reminder Regime
7.2.2 Validation Methods
7.2.3 Sponsorship of Survey
7.2.4 Consideration of Respondents
7.2.5 Use of Incentives
7.2.6 Covering Letter
7.2.7 Use of Comments Section
7.2.8 Provision of a Phone-in Service
7.2.9 Preparation of Questionnaires for Mailing
7.2.10 Type of Postage Used
7.2.11 Response Rates
7.3 ADMINISTRATION OF PERSONAL INTERVIEW SURVEYS
7.3.1 Use of a Robust Interview Regime
7.3.2 Training of Interviewers
7.3.3 Checking of Interviews
7.3.4 Satisfactory Response Rate
7.3.5 Correct Asking of Questions
7.3.6 Number of Interviewer Call-Backs
7.4 ADMINISTRATION OF TELEPHONE INTERVIEW SURVEYS
7.4.1 Sampling
7.4.2 Dealing with Non-Response
7.4.3 Improving Data Quality
7.4.4 Other Aspects of Administration
7.5 ADMINISTRATION OF INTERCEPT SURVEYS
7.5.1 Sampling
7.5.2 Training of Staff
7.5.3 Other Administrative Details
7.6 ADMINISTRATION OF IN-DEPTH INTERVIEW SURVEYS
7.6.1 Organisation of "Pre-Interviews"
7.6.2 Preparation for the Main Interview
7.6.3 Training
7.6.4 Preparation of Questionnaires/Data
7.6.5 Time and Resources for Transcriptions
7.6.6 Report Writing
7.7 COMPUTERS AND THE CONDUCT OF TRAVEL SURVEYS
7.7.1 Survey Design and Management
7.7.2 Data Collection

8. Data Processing
8.1 INITIAL QUESTIONNAIRE EDITING
8.2 CODING
8.2.1 Coding Method
8.2.2 Code Format
8.2.3 Coding Frames
8.2.4 Coding Administration
8.3 COMPUTER-BASED DATA ENTRY AND EDITING
8.3.1 Data Entry
8.3.2 Data Editing

9. Weighting & Expansion of Data
9.1 POPULATION EXPANSION FACTORS
9.2 CORRECTIONS FOR NON-REPORTED DATA
9.3 CORRECTIONS FOR NON-RESPONSE

10. Data Analysis
10.1 EXPLORATORY DATA ANALYSIS
10.2 CONFIRMATORY DATA ANALYSIS
10.2.1 Bivariate regression
10.2.2 Multivariate regression
10.2.3 Factor analysis
10.2.4 Discriminant analysis
10.2.5 Maximum likelihood estimation
10.2.6 Logit analysis

11. Finalising the Survey
11.1 PRESENTATION OF RESULTS
11.1.1 Avoid Distorted Graphics
11.1.2 Maximise the Data-Ink Ratio
11.1.3 Minimise Chart Junk
11.1.4 Shape and Orientation of Graphs
11.1.5 The Friendly Data Graphic
11.2 DOCUMENTATION OF SURVEY
11.3 TIDYING-UP

References
Index
Appendix A - Areas under the Normal Distribution Curve
Appendix B - A Table of Random Numbers
Appendix C - Survey Design Checklist

LIST of FIGURES

Figure 1.1 The Transport Survey Process
Figure 1.2 Trade-Offs in the Transport Survey Process
Figure 1.3 Effects of Instrument Quality and Sample Size on Uncertainty
Figure 1.4 Effects of Instrument Quality and Sample Size on Cost
Figure 1.5 Combined Effects of Instrument Quality and Sample Size
Figure 1.6 Optimal Combinations of Instrument Quality and Sample Size

Figure 2.1 The Transport Planning Process
Figure 2.2 The Dimensions of System Boundaries
Figure 2.3 The Systems Modelling Process

Figure 3.1 Trade-offs in Selection of the Survey Method
Figure 3.2 A Completed HATS Display Board
Figure 3.3 Decision Trees in the Situational Approach Survey
Figure 3.4 The Roles of Interactive Interviews

Figure 4.1 A Population of 100 Sampling Units
Figure 4.2 A Population of 100 Sampling Units with Identifiers
Figure 4.3 A Simple Random Sample of 10 Sampling Units
Figure 4.4 A Simple Random Sample from a Stratified Population
Figure 4.5 A Stratified Random Sample from a Stratified Population
Figure 4.6 Examples of Respondent Selection Grids
Figure 4.7 A Systematic Sample from a Stratified Population
Figure 4.8 The Distinction between Accuracy and Precision
Figure 4.9 The Confusion between Accuracy and Precision
Figure 4.10 Distribution of the Parameter in the Population
Figure 4.11 Distribution of the Means of Independent Samples
Figure 4.12 Confidence Limit Estimator for Continuous Variables
Figure 4.13 Confidence Limit Estimator for Discrete Variables
Figure 4.14 Input Screen for Sample Size Design Parameters
Figure 4.15 Input Screen for Expected Values of Variables in Population
Figure 4.16 Output Screen for Estimated Values of Variables in Sample
Figure 4.17 Probability of a Type I Error
Figure 4.18 Probability of a Type II Error
Figure 4.19 Variation in b as a function of d

Figure 5.1 Questionnaire Imprecision (Variability) and Inaccuracy (Bias)
Figure 5.2 Recall-Only Method of Trip Reporting
Figure 5.3 Maximising Trip Reporting in a Self-Completion Form
Figure 5.4 Verbal Activity Recall Framework in Personal Interviews
Figure 5.5 Example of an Activity Diary
Figure 5.6 Sample Page of Personal Interview Survey Form
Figure 5.7 Example of Self-Completion Questionnaire Survey
Figure 5.8 Classification Question, with Branching
Figure 5.9 Distribution of Response to a Single Stimulus
Figure 5.10 Distribution of Response to Two Different Stimuli
Figure 5.11 Part of a Paired Comparison Question
Figure 5.12 Similarities Ranking Question
Figure 5.13 Category Scale Question
Figure 5.14 Likert Scale Question
Figure 5.15 Semantic Differential Scale Question
Figure 5.16 Comparing Category Scale and Magnitude Estimation Ratings
Figure 5.17 An Open Question in a Personal Interview
Figure 5.18 Field-Coded Questions (D&E) in a Personal Interview
Figure 5.19 A Closed Question in a Personal Interview
Figure 5.20 An Open Alternative in a Closed Question
Figure 5.21 A Show Card for a Closed Question
Figure 5.22 The Quintamensional Plan of Question Design
Figure 5.23 Poor Placement of Tick-Box Labels
Figure 5.24 An Introduction to a Set of Questions
Figure 5.25 The Use of a Completed Sample Questionnaire

Figure 7.1 Graphical Representation Used in Validation Interviews
Figure 7.2 Typical HATS Interview Procedure

Figure 8.1(a) Map used in Self-Coding of Location Coordinates (front)
Figure 8.1(b) Map used in Self-Coding of Location Coordinates (back)
Figure 8.2 General Procedure for Geocoding of Locations
Figure 8.3 The Location of Full Street Addresses using MapInfo
Figure 8.4 The Location of Cross Street Addresses
Figure 8.5 The Occurrence of Multiple Cross Street Locations
Figure 8.6 Cross Streets with Multiple Bordering CCDs
Figure 8.7 The Allocation of a Cross Street Address to a CCD
Figure 8.8 Sample Questionnaire to be Coded
Figure 8.9 A Simple Spreadsheet Layout
Figure 8.10 A Replicate Spreadsheet Layout
Figure 8.11 A Relational Database for Travel Survey Data
Figure 8.12 A Relational Database Created in FoxBASE
Figure 8.13 Selection of a Trip Record in the Trip File
Figure 8.14 Selection of Corresponding Person Record in the Person File
Figure 8.15 Selection of the Corresponding Household Record
Figure 8.16 A Flatfile Created for Statistical Analysis

Figure 9.1 The Effect of Reminders on Questionnaire Response Rates
Figure 9.2 The Speed of Response by Response Type
Figure 9.3 The Effect of Household Size on Level and Speed of Response
Figure 9.4 The Effect of Age on Level and Speed of Response
Figure 9.5 The Effect of Employment and Sex on Speed of Response
Figure 9.6 Travel Characteristics as a Function of Response Speed
Figure 9.7 Average Stop Rate as a Function of Response Wave
Figure 9.8 Stop Rate as a Function of Cumulative Response

Figure 10.1 A Flat File Created for Statistical Analysis
Figure 10.2 Data Desk's Representation of Variables as Icons
Figure 10.3 The Contents of Data Desk Variable Icons
Figure 10.4 A Histogram of the Number of Trips per Person per Day
Figure 10.5 Summary Statistics of the Number of Trips per Person per Day
Figure 10.6 A Boxplot of the Number of Trips per Person per Day
Figure 10.7 A Scatterplot of Trip Rate vs Household Vehicles
Figure 10.8 Splitting the Data into Groups Based on Vehicle Ownership
Figure 10.9 Boxplots of Trips per Day for the Vehicle Ownership Groups
Figure 10.10 Scatterplot of Person trips vs Household Vehicles
Figure 10.11 A Simple, Bivariate Linear Relationship
Figure 10.12 A Simple, Bivariate, Monotonic Non-linear Relationship
Figure 10.13 Specification Error and Model Complexity
Figure 10.14 Measurement Error and Model Complexity
Figure 10.15 Relationship between Total Error and Model Complexity
Figure 10.16 The Effect of Bad Data on Total Model Error
Figure 10.17 Scatter Diagram of a Positive Linear Relationship
Figure 10.18 Scatter Diagram of a Negative Linear Relationship
Figure 10.19 Scatter Diagram showing No Relationship
Figure 10.20 Scatter Diagram showing Non-Linear Relationship
Figure 10.21 Geometric Representation of the Regression Hypothesis
Figure 10.22 Scatter Diagram of n Data Points
Figure 10.23 Prediction if Error Is Assumed to Be in Xi Measures
Figure 10.24 Selection of Linear Regression Option in Data Desk
Figure 10.25 Results of Linear Regression Analysis in Data Desk
Figure 10.26 Selecting a Residuals Scatterplot in Data Desk
Figure 10.27 Results of Residuals Scatterplot in Data Desk
Figure 10.28 Results of the Ungrouped Linear Regression
Figure 10.29 Distribution of the Residuals
Figure 10.30 Raw Attribute Scores for Four Car Models
Figure 10.31 Factor Scores for Four Car Models
Figure 10.32 Frequency Distribution of Car versus Transit Users
Figure 10.33 Definition of Discrimination between Two Subpopulations
Figure 10.34 Logit and Probit Curves for Equations 10.65 and 10.66

Figure 11.1 Comparison of 3-D and 2-D Pie-Charts
Figure 11.2 Comparison of Non-Zero and Zero Origin Charts
Figure 11.3 Representing One-Dimensional Quantities with 2-D Objects
Figure 11.4 Comparison of Low Data-ink and High Data-ink Ratio Graphs
Figure 11.5 Poor Choices of Shading Patterns
Figure 11.6 Superfluous Gridlines and 3-D Effects

LIST of TABLES

Table 2.1 Sources of Existing Transport Data
Table 3.1 Comparison of Successive Sample Surveys and Panel Studies
Table 3.2 Uses of Each Survey Method
Table 4.1 Possible End-States of Hypothesis Testing
Table 4.2 Results of the Stratified Sample Airport Upgrading Survey
Table 4.3 Results of the Cluster Sample Airport Upgrading Survey
Table 4.4 Results of the Replicated Sample Airport Upgrading Survey
Table 4.5 Sample Data for Half-Sample Replication Example
Table 4.6 Sample Data for Balanced Half-Sample Replication
Table 5.1 Summary of Scale Types
Table 5.2 A Simple Stated Response Experimental Design
Table 6.1 Pilot Studies for the 1981 Sydney Travel Survey
Table 7.1 Response Behaviour of Non-Response Validation Households
Table 9.1 Responses by Age and Sex
Table 9.2 Trip-Rates in Responses by Age and Sex
Table 9.3 Population Breakdown by Age and Sex
Table 9.4 Response Rates by Age and Sex
Table 9.5 Population Expansion Factors by Age and Sex
Table 9.6 Total Trips in Population by Age and Sex
Table 9.7 Marginal Population Totals by Age and Sex
Table 9.8 Expanded Population Totals after First Iteration
Table 9.9 Expanded Population Totals after Second Iteration
Table 9.10 Estimated Population Expansion Factors by Age and Sex
Table 9.11 Total Trips in Population by Age and Sex
Table 9.12 Incomplete Information for Various Question Types
Table 9.13 Incomplete Information for Various Question Types
Table 9.14 Incomplete Information for Various Types of Respondent
Table 9.15 Non-Reported Trips for Various Types of Respondent
Table 9.16 Trip Characteristics of Non-Reported Trips
Table 9.17 Increases in Mobility after Allowing for Non-Reported Trips
Table 9.18 Non-Reporting Correction Factors for Expected Added Stops
Table 9.19 Non-Reporting Correction Factors for Unexpected Added Stops
Table 9.20 Non-Reported Stop Weights (phone connected)
Table 9.21 Non-Reported Stop Weights (phone not connected)
Table 9.22 Reasons for Non-Response in Self-Administered Surveys
Table 9.23 Household Characteristics of SEQHTS Respondents by Wave
Table 9.24 Personal Characteristics of SEQHTS Respondents by Wave
Table 10.1 Input Data and Results for Weighted Regression
Table 10.2 Attributes Used to Rate Various Car Models
Table 10.3 Factor Groupings for Attributes of Table 10.2
Table 10.4 Results of Discriminant Analysis on Chicago Data
Table 10.5 Additional Chicago Commuters to be Classified
Table 10.6 Observations on Carbon Monoxide Level (parts per million)
Table 11.1 Characteristics of the Friendly Data Graphic



1. Introduction

1.1 THE ROLE OF SURVEYS

All persons involved in transport and land-use planning will at some stage be involved with data collection. Even if not directly concerned with the design and conduct of surveys, they will certainly wish to use data at some time, and at that stage they will realise what should have been done in the design and administration phases of the survey. Each individual's introduction to survey data may have widely differing emotional connotations. For some, it may be a horrific experience as they try to grapple with data which has been collected by someone else, only to find that the documentation associated with that dataset is incomplete, misleading or simply non-existent. For others, who face the prospect of collecting a new set of data, it may be a challenging professional experience with ample opportunity for initiative and the exploration of new frontiers of knowledge in survey methodology. This book is intended to guide the latter group, and console the former.


1.1.1 Types of Surveys

Surveys are of particular relevance to transport and land-use planning in several specific areas. Land-use surveys are an integral component of transport planning information requirements. This is due to the fact that travel is a so-called "derived demand". That is, travel, in itself, has no inherent value; it is useful only insofar as it facilitates participation in other activities. Thus with respect to passenger transport, travel enables individuals to participate in an activity at a location some distance away from their present location. If the activity did not exist, or if it could be undertaken at the individual's present location, then there would be no need for travel to occur; that is, if there were no land-use activities, there would be no travel. The spatial location and intensity of land-use activities are measured by land-use surveys.

The amount of travel which takes place between land-uses will depend on the quality and quantity of the transport system which connects the land-uses, and surveys of the transport system inventory play a major role in specifying the location and characteristics of the available transport system. This system, which includes both public and private modes of transport, may be described in terms of three basic components: the right-of-way, terminals, and vehicles. The right-of-way includes the roads, tracks and paths which may be used by different types of vehicles and which can be described in terms of length, direction of movement, capacity, and design standards. The terminals include public transport stations, modal interchanges and parking stations and can be described in terms of location, throughput capacity, and holding capacity. The vehicles include both public and private vehicles and may be described in terms of total number, passenger (or goods) capacity, comfort level, and various operating characteristics.

The combination of land-use activity and a transport system invariably results in trip-making, and to measure the type and extent of trip-making it is necessary to conduct travel pattern surveys by one means or another. Such travel patterns may be described in terms of who is going where, with whom, at what time, by which mode and route, and for what purpose. The measurement of such travel patterns is perhaps the unique part of transport survey methods, but determining the most effective way of obtaining the above information has often received little attention.
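As a rough illustration only, the attributes listed above could be held in a single trip record such as the one sketched below. The class name, field names and example values are hypothetical and are not taken from any particular survey.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class TripRecord:
    """One reported trip (hypothetical layout, for illustration only)."""
    person_id: str            # who is travelling
    origin: str               # where from
    destination: str          # where to
    departure_time: datetime  # at what time
    mode: str                 # by which mode
    route: str                # by which route
    purpose: str              # for what purpose
    companions: List[str] = field(default_factory=list)  # with whom


# A made-up trip, used only to show how the fields map to the questions.
example_trip = TripRecord(
    person_id="P001",
    origin="home",
    destination="workplace",
    departure_time=datetime(1995, 3, 6, 8, 15),
    mode="bus",
    route="Route 42",
    purpose="work",
    companions=["P002"],
)
```

Keeping one record per trip, with a person identifier on each, is only one possible arrangement; a real survey would define these items to suit its own coding frame.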

One effect of trip-making is to change the way in which the transport system operates. To establish whether the transport system is coping adequately with the demands being placed on it, it is therefore necessary to conduct transport system performance surveys. Such surveys seek to measure performance characteristics, such as travel times, travel time variability, passenger waiting times, vehicle occupancies, and system safety.

Each of the above types of surveys has the common characteristic that it attempts to measure system characteristics as they now exist. However, a major task of the transport planner is to attempt to predict changes which will occur in the system. In particular, the planner is often required to predict changes in travel demand as a result of changes in the physical system or changes in the operating characteristics of that system. In attempting to predict the demands which will be placed on a transport system, it is well recognised that various groups in the population will react differently to changes in the transport system. To identify these groups, it is necessary to incorporate demographic and socio-economic surveys within the overall transport survey framework. It is also well recognised that individuals react not to actual changes in the transport system but to perceived changes in that system. For that reason, perception and attitude surveys often form a vital component of transport planning surveys. Data from these surveys are often used as inputs to travel demand models. In many cases, a transport survey will fulfil more than one of the above roles.

While all types of surveys in transport are referred to, and while the principles of questionnaire design can apply to any survey type, the main focus of this book is on travel pattern surveys, that is, surveys which ask people how, where or why they travel. Furthermore, while we cover those surveys which we have called intercept surveys (i.e. on board public transport vehicles, at the road side, and at activity centres such as shops and airports), a great deal of attention is placed on household surveys, since the methodology used here can frequently be applied to all other survey types.

1.1.2 Survey Purposes

Regardless of the subject matter to be covered within a survey, transport surveys may serve several purposes, either alone or in combination. First, they may merely attempt to describe existing conditions at a given time in order to ascribe an order of magnitude to various transport phenomena. Secondly, they may seek to establish causal explanations of conditions at a given time so that greater understanding of transport system behaviour may be obtained. Thirdly, it may be desired that after analysis of the survey results, predictive models will be derived so as to forecast future transport conditions or to predict the effects of system changes. Fourthly, rather than predict the effects of system changes, it is often more appropriate, or convenient, to measure the effects of system changes. In this case, before-and-after surveys may be used to assess the effects of these system changes. Fifthly, an extension of the before-and-after concept (where surveys are generally conducted at two points in time) is the establishment of a regular series of monitoring surveys whereby changes in transport system characteristics or behaviour may be established over a long period.

In addition to the above-mentioned purposes, surveys may also play two further roles which, while perhaps technically undesirable, should be clearly recognised. Surveys, for example, may often be used as "report padding" to fill an otherwise empty document and to lend weight to the conclusions or opinions contained in it. Alternatively, the conduct of a survey may be a convenient method of putting off a decision. The use of surveys in this way, as an aid to procrastination, may often be embedded within the more general gambit of calling for a Committee of Inquiry when decisions are not wanted (yet). A clear recognition of the purpose of a survey, in terms of any one of the above seven categories, can greatly aid in the initial selection of the survey technique, the design of the survey method and the later interpretation of results.

This is not to imply, however, that the latter two survey purposes should be accepted unquestioningly by the professional transport surveyor. If the purpose of the survey appears to be report-padding or procrastination, the client should first be questioned more insistently on the objectives of the survey. If no professionally acceptable objectives can be specified, then you, as the survey designer, may face a difficult decision. You can either refuse to carry out the survey or else agree to design the survey (perhaps under protest). The grounds for refusing to carry out the survey are not simply academic piety; rather, the reason is highly pragmatic: if too many such surveys are carried out, the whole area of transport surveys may be brought into ill-repute in the public mind. If, however, the conduct of the survey is inevitable, either by you or by someone else, then you should attempt to make the most of the situation and try to incorporate as much experimentation into the conduct of the survey as possible. As will be seen later, there are many ideas in survey design and administration which require empirical validation. If the survey can be used to test several survey design hypotheses, then that survey will have gained a legitimate (research) objective.

1.2 THE TRANSPORT SURVEY PROCESS

The conduct of a survey is not an informal procedure. Rather, it should follow a series of logical, interconnected steps which progress toward the final end-product of the survey. The stages in a typical sample survey are shown in Figure 1.1, and the issues to be addressed within each of these stages are listed below.

[Figure: boxes labelled Preliminary Planning; Selection of Survey Method; Sample Design; Survey Instrument Design; Pilot Survey; Survey Administration; Data Coding; Data Editing; Data Correction and Expansion; Data Analysis; Presentation of Results; Tidying-Up]

Figure 1.1 The Transport Survey Process

(a) Preliminary Planning
    (i) Overall Study Objectives
    (ii) Specific Survey Objectives
    (iii) Review of Existing Information
    (iv) Formation of Hypotheses
    (v) Definition of Terms
    (vi) Determination of Survey Resources
    (vii) Specification of Survey Content

(b) Selection of Survey Method
    (i) Selection of Survey Time Frame
    (ii) Selection of Survey Technique
    (iii) Consideration of Survey Errors

(c) Sample Design
    (i) Definition of Target Population
    (ii) Sampling Units
    (iii) Sampling Frame
    (iv) Sampling Method
    (v) Sampling Error and Sampling Bias
    (vi) Sample Size and Composition
    (vii) Estimation of Parameter Variances
    (viii) Conduct of Sampling

(d) Survey Instrument Design
    (i) Types of Survey Instrument
    (ii) Question Content
    (iii) Trip Recording Techniques
    (iv) Physical Nature of Forms
    (v) Question Types
    (vi) Question Format
    (vii) Question Wording
    (viii) Question Ordering
    (ix) Question Instructions

(e) Pilot Survey(s)
    (i) Adequacy of Sampling Frame
    (ii) Variability within Survey Population
    (iii) Estimation of Non-Response Rate
    (iv) Size of the Pilot Survey
    (v) Suitability of Survey Method
    (vi) Adequacy of Questionnaire (schedule)
    (vii) Efficiency of Interviewer Training
    (viii) Suitability of Coding, Data Entry, and Editing Procedures
    (ix) Suitability of Analysis Procedures
    (x) Cost and Duration of Surveys
    (xi) Efficiency of Organisation

(f) Administration of the Survey
    (i) Procedures for Survey Administration of: Self-Completion, Personal Interview, Telephone, Intercept and In-depth Interview Surveys
    (ii) Survey Execution and Monitoring
    (iii) Quality Control
    (iv) The Use of the Computer in Transport Surveys

(g) Data Processing
    (i) Selection of Coding Method
    (ii) Preparation of Code Format
    (iii) Development of Data Entry Programs
    (iv) Coder and Data Entry Training
    (v) Coding Administration

(h) Data Editing
    (i) Editing of Field Sheets
    (ii) Verification of Data Entry
    (iii) Development of Editing Computer Programs
    (iv) Consistency and Range Checks

(i) Data Correction and Expansion
    (i) Editing Check Corrections
    (ii) Secondary Data Comparisons
    (iii) Corrections for Internal Biases

(j) Data Analysis and Management
    (i) Exploratory Data Analysis
    (ii) Model Building
    (iii) Interpretation of Results
    (iv) Database Management
    (v) Provision of Data Support Services

(k) Presentation of Results
    (i) Verbal Presentations
    (ii) Visual Presentations
    (iii) Preparation of Reports
    (iv) Publication of Results

(l) Tidying-Up
    (i) Documentation of Survey Method
    (ii) Storage and Archival of Data
    (iii) Completion of Administrative Duties

In the survey process outlined above, there are three types of linkages between activities: forward, feedback and backward linkages. The forward linkages are relatively obvious, e.g. the questionnaire design cannot begin until the survey method has been selected. The feedback linkages indicate that two or more activities must be performed sequentially in a closed loop, e.g. having performed the pilot survey, it may be necessary to redesign the questionnaire and then pilot test the new questionnaire. Backward linkages indicate that information must be transferred from an activity which occurs later in the process to one which occurs early in the process. For example, the design of the questionnaire will be affected by the coding procedure to be used later, while the coding procedure will depend on the type of analysis to be performed on the data. While such backward linkages may not be highly visible, it is important that consideration be given to them so that decisions made early in the process will not proscribe options for later data analysis.

1.3 TRADE-OFFS IN TRANSPORT SURVEY DESIGN

As will become evident at several points throughout this book, the authors believe that the essence of good survey design is being able to make trade-offs between the competing demands of good design practice in several areas (such as sample design, survey instrument design, conduct of surveys, and data weighting and expansion) so as to arrive at the most cost effective, high quality survey which meets the needs of the client within budget constraints. The overall nature of these trade-offs is shown in Figure 1.2.

The underlying nature of this trade-off process is the so-called "Architect's Triangle", in which quantity and quality are traded off against cost. A trade-off occurs because it is impossible to control all three of the major elements in Figure 1.2; at best, only two of the three can be controlled by the survey designer. Thus given a fixed budget, as is normally the case, the decision to obtain data of a specified quality will automatically control the quantity of data which can be collected. Alternatively, within a fixed budget, specification of the quantity of data to be collected will immediately dictate the quality of data which can be collected. That is, we can collect a greater quantity of low quality data or we can collect a limited amount of higher quality data for a given budget. Generally, the latter course of action is to be preferred.
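The arithmetic behind this trade-off can be sketched in a few lines. The budget, the fixed cost and the two unit costs below are purely hypothetical figures, chosen only to show that once the budget and the cost per completed response (standing in here for the chosen level of quality) are fixed, the number of responses follows automatically.

```python
def affordable_responses(budget, fixed_cost, cost_per_response):
    """Return how many completed responses a fixed budget can fund."""
    if cost_per_response <= 0:
        raise ValueError("cost_per_response must be positive")
    available = budget - fixed_cost
    if available <= 0:
        return 0
    return int(available // cost_per_response)


BUDGET = 100_000     # hypothetical total budget
FIXED_COST = 20_000  # hypothetical design and set-up cost

# Hypothetical unit costs: the higher quality method costs more per response.
low_quality_n = affordable_responses(BUDGET, FIXED_COST, cost_per_response=20)
high_quality_n = affordable_responses(BUDGET, FIXED_COST, cost_per_response=80)

print(low_quality_n, high_quality_n)  # 4000 1000
```

Under these assumed figures, the same budget buys either 4000 lower quality responses or 1000 higher quality ones; the designer chooses two of budget, quality and quantity, and the third is determined.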

[Figure: SURVEY RESOURCES divided between QUALITY OF DATA (Survey Method: Instrument Design, Quality Control; Sample Quality: Sampling Frame, Selection Process) and QUANTITY OF DATA (Number of Respondents: Sample Size, Response Rates; Information per Respondent: Number of Questions, Depth of Questions)]

Figure 1.2 Trade-Offs in the Transport Survey Process

The quality of data to be collected is a function of the survey method selected and the quality of the sample (insofar as the sample is free of bias). The quality of data obtained from any survey method will, in turn, be a function of the quality of the survey instrument design (i.e. does it collect information on the variables of interest in an unbiased way) and the quality control procedures put in place for the implementation of that survey method (i.e. what follow-up procedures will be used to verify the quality of the data collected). The quality of the sample will depend on the ability of the sampling frame to truly represent the population, and the extent to which the sample selection procedures result in a random selection from the sampling frame.

The quantity of data collected will be a function of the number of respondents in the final dataset and the amount of information obtained from each respondent. This, in itself, presents a trade-off situation because any attempt to collect more information from each respondent (beyond a threshold level of information) may result in fewer respondents responding. The total number of respondents will obviously depend on the size of the sample drawn from the population and the response rate obtained from that sample. The amount of information obtained from each respondent will depend on the number of questions asked in the survey, as well as the depth of the questions asked. Thus some surveys can be effective with only a large number of highly directed questions, while others need to explore a few topics in depth. The extent of this trade-off is therefore a specific design decision on the part of the survey designer. The trade-off will also be partly determined by the respondents themselves. As the length of the survey increases, the response rate will generally decrease (the rate of decrease will depend on such factors as the interest level of the survey topic to the respondents, and the overall quality of the survey instrument design). There will therefore be a point at which an increase in the number of questions asked will result in the collection of less data in total, because of the more than proportional decrease in the response rate. The survey designer should therefore be cognisant of this interaction when making the explicit trade-off between number of respondents and information obtained per respondent.
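The interaction between questionnaire length and response rate can be sketched numerically. The following is a minimal illustration only: the exponential response-rate curve, the base rate of 0.7, the decay of 0.02 per question and the sample of 1000 are all hypothetical assumptions chosen for the example, not values reported in this book.

```python
import math


def response_rate(num_questions, base_rate=0.7, decay=0.02):
    """Hypothetical response rate that falls as the questionnaire gets longer."""
    return base_rate * math.exp(-decay * num_questions)


def total_answers(num_questions, sample_size=1000):
    """Expected number of answered questions across the whole sample."""
    respondents = sample_size * response_rate(num_questions)
    return respondents * num_questions


# Scan questionnaire lengths and find the one yielding the most data in total.
lengths = range(1, 201)
best_length = max(lengths, key=total_answers)

print("questions giving most data in total:", best_length)
print("answers at that length:", round(total_answers(best_length)))
print("answers with twice as many questions:", round(total_answers(2 * best_length)))
```

Under these assumed values, the total amount of data rises with questionnaire length up to a point and then falls, which is the behaviour described above; a different response-rate curve would move the turning point.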

The nature of these trade-offs can be illustrated by a simple example related to the quality of the survey instrument and the size of the sample. Both of these components of survey design have an effect on the quality of data obtained from a survey, in that each can affect how well the reported information matches the true situation in the population. As will be described later in this book, a larger sample can reduce the uncertainty involved in assuming that the sample results represent the results which would have been obtained from the population. Similarly, a good survey instrument (e.g. a good questionnaire) can reduce the uncertainty that the answers obtained from respondents represent the real conditions about which we intended to obtain information. These relationships between instrument quality, sample size and the uncertainty involved in the sample results are shown in Figure 1.3.

It can be seen that uncertainty can be reduced either by designing a better survey instrument or by selecting a larger sample, or both. However, improvements in the survey instrument or the sample do not come without some cost. In each case, there will be fixed costs involved (either in the design of a very poor quality instrument, or the selection of a very small sample), plus marginal costs for increases in the sample size and improvements in the survey instrument, as shown in Figure 1.4.


Figure 1.3 Effects of Instrument Quality and Sample Size on Uncertainty

Figure 1.4 Effects of Instrument Quality and Sample Size on Cost

Figure 1.4 postulates that the marginal costs associated with improvements in instrument design and increases in sample size are different, but linear. In reality, the marginal costs will probably not be linear, but this does not detract from the present argument.

Since uncertainty is affected by both the instrument design and the sample size, each combination of instrument design and sample size will possess a specific degree of uncertainty in the final results. This can be displayed in the form of uncertainty isoquants (lines of equal uncertainty) for combinations of instrument design and sample size, as shown in Figure 1.5. It can be seen that as either the instrument design is improved or the sample size is increased, the level of uncertainty in the sample results decreases. Importantly, different combinations of instrument design and sample size can produce the same level of uncertainty, as shown by the combinations of the two factors lying along a single isoquant. This implies that the two factors can be traded off against each other to achieve the same end results in terms of uncertainty of the sample results. Note, however, that this argument does not extend to the case of reducing bias in the results, where a poor survey instrument or a poor sample is consistently giving the wrong result. In such a situation, taking a bigger sample and using a biased survey instrument will simply give us the wrong answers with more certainty.
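The distinction between sampling uncertainty and bias can be made concrete with the standard decomposition of the error in a sample mean into a bias term, which does not depend on sample size, and a sampling term, which shrinks as the sample grows. The sketch below assumes simple random sampling and ignores any finite population correction; the bias of 2 units and the standard deviation of 10 units are hypothetical values chosen only for illustration.

```python
import math


def rmse(bias, sigma, n):
    """Root-mean-square error of a sample mean.

    bias  : systematic error introduced by the instrument or sample (assumed)
    sigma : standard deviation of the variable in the population (assumed)
    n     : number of respondents
    """
    return math.sqrt(bias ** 2 + sigma ** 2 / n)


# Compare an unbiased design with a biased one as the sample grows.
for n in (100, 10_000, 1_000_000):
    unbiased = rmse(0.0, 10.0, n)
    biased = rmse(2.0, 10.0, n)
    print(f"n={n:>9}: unbiased error={unbiased:.3f}  biased error={biased:.3f}")
```

In the unbiased case the error keeps falling as n increases, whereas in the biased case it can never fall below the size of the bias, no matter how many respondents are added.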

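[Figure 1.5: two panels, each with Instrument Quality on the horizontal axis and Sample Size on the vertical axis. The left panel shows isoquants of decreasing uncertainty; the right panel shows isoquants of increasing cost.]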

Figure 1.5 Combined Effects of Instrument Quality and Sample Size

Figure 1.5 also shows that different combinations of survey instrument quality and sample size will result in different total costs, as shown by the cost isoquants in the right half of Figure 1.5. As either the instrument design is improved or the sample size is increased, the cost of the survey increases. Importantly, different combinations of instrument design and sample size can produce the same survey cost, as shown by the combinations of the two factors lying along a single isoquant. This implies that the two factors can be traded off against each other to achieve the same end results in terms of the cost of the survey. One of these isoquants represents the budget for the survey and, even though combinations of instrument design and sample size above this isoquant would result in decreased uncertainty in the results, such combinations are not feasible within a fixed budget. The essence of effective survey design is to find the right combination of survey instrument quality and sample size which minimises the uncertainty of the results within the available budget.

The search for the most cost-effective survey design can be depicted by the diagrams shown in Figure 1.6, where the cost isoquant line from Figure 1.5 corresponding to the budget is overlaid on the uncertainty isoquant diagram of Figure 1.5. The point where the cost isoquant is tangential to the uncertainty isoquant (as indicated by the small circle) is the combination of instrument quality and sample size which minimises the level of uncertainty for a given level of expenditure. In the case of the diagram on the left of Figure 1.6, where the cost isoquant reflects a lower marginal cost for improving instrument quality (see Figure 1.4), the optimal situation is one where more attention is paid to improving instrument quality than increasing the sample size.

Figure 1.6 Optimal Combinations of Instrument Quality and Sample Size

However, if the marginal cost of increasing sample size were lower than the marginal cost of improving instrument quality, the cost isoquant line would be steeper and this would result in a different optimal design, as shown in the diagram on the right of Figure 1.6. Thus the "best" survey design is dependent on the quality and the relative costs of instrument quality and sample size.
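The search for the tangency point can be imitated with a simple numerical search along the budget line. This is a purely illustrative sketch: the uncertainty function, the idea of measuring instrument quality in "units", and all cost figures are hypothetical assumptions, since no quantitative relationship between instrument quality and uncertainty is established in this book. Fixed costs are omitted and marginal costs are taken as linear, as postulated in Figure 1.4.

```python
import math


def uncertainty(quality, sample_size):
    """Hypothetical uncertainty: falls with both factors, with diminishing returns."""
    return 1.0 / math.sqrt(quality) + 1.0 / math.sqrt(sample_size)


def best_design(budget, cost_per_quality_unit, cost_per_respondent, steps=1000):
    """Search the budget line for the split of spending that minimises uncertainty.

    Returns (uncertainty, quality, sample_size) for the best split found.
    """
    best = None
    for i in range(1, steps):
        spend_on_quality = budget * i / steps
        quality = spend_on_quality / cost_per_quality_unit
        sample_size = (budget - spend_on_quality) / cost_per_respondent
        u = uncertainty(quality, sample_size)
        if best is None or u < best[0]:
            best = (u, quality, sample_size)
    return best


# Case 1: improving the instrument is relatively cheap.
u1, q1, n1 = best_design(budget=10_000, cost_per_quality_unit=10, cost_per_respondent=40)
# Case 2: adding respondents is relatively cheap.
u2, q2, n2 = best_design(budget=10_000, cost_per_quality_unit=40, cost_per_respondent=10)

print(f"cheap quality: quality={q1:.0f} units, sample size={n1:.0f}")
print(f"cheap sample : quality={q2:.0f} units, sample size={n2:.0f}")
```

With these assumed forms, the optimum shifts towards whichever factor is cheaper to improve, mirroring the left and right diagrams of Figure 1.6, yet in neither case is the whole budget spent on one factor alone.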

The foregoing argument has been made to illustrate the nature of the trade-offs inherent in survey design, and to demonstrate that concentrating on either the instrument design or the sample design, to the virtual exclusion of the other, will most likely be a pointless strategy in trying to find the most cost-effective solution for the client's needs.

However, at this point in time in the evolution of the field of survey methodology, the above argument must remain qualitative rather than quantitative. This is because the areas of sample design and instrument design have developed in different directions and at different rates. Sample design is highly quantified and estimates of uncertainty can be calculated as a function of sample size. However, the area of instrument design has largely developed through the experience of individual survey designers and the dissemination of a limited amount of research findings. At this point in time, we do not know (quantitatively) how improvements in survey instrument design will affect the uncertainty in the results obtained from respondents, nor has much been documented about the costs involved in improving survey instrument designs.

One of the objectives of this book is to promote the adoption of a balanced approach to total survey design, and to advance the state of knowledge about the role of survey instrument design in the conduct of transport and activity surveys.

1.4 STRUCTURE OF THE BOOK

The structure of this book follows from the survey process outlined in Figure 1.1. Thus Chapter 2 describes some of the tasks to be addressed in the preliminary planning of a survey and, in particular, describes some steps which are often overlooked but which are essential to a survey if it is to function smoothly in later stages of the process. Chapter 3 describes the types of survey methods commonly used in transport and land-use planning and the circumstances where each method might be used most effectively. Advantages and disadvantages of each method are discussed and, in particular, the types of biases inherent in each method are described. Chapter 4 outlines some procedures which can be used in selecting a sample and highlights the important, but distinctively different, concepts of sampling error and sampling bias. Chapter 4 also follows on by describing the techniques used to determine minimum sample size requirements on the basis of achieving an acceptable sampling error.

Chapter 5 is concerned with the principles involved in the design of survey instruments (e.g. questionnaires and interview schedules). The various options available for obtaining detailed records of trip and activity patterns are also described in this chapter. The techniques involved in recording trip patterns are of particular importance and serve to distinguish transport survey methodology from the more general social survey methodology. The trip recording techniques are, however, more generally applicable to the recording of many forms of activity patterns.

The conduct of pilot surveys is the focus of attention in Chapter 6. Pilot surveys are often ignored, or undervalued, by survey designers; this chapter highlights the reasons for conducting pilot surveys and emphasises the value of such pilot surveys and pre-tests. Chapter 7 describes the design of administrative procedures for various survey types, and highlights the importance of a pre-planned approach to follow-up procedures to enable later corrections for non-response and other sources of survey bias.

Once the data has been collected via the methods outlined in Chapters 2 through 7, attention turns to working with the data which has been collected. It should be realised that, at this stage, not much more can be done to improve the quality of the data; if the design has not accounted for the various aspects of bias and error before now, then little can be done subsequently. In Chapter 8, the areas of data processing known as coding, editing and data entry are covered. These tasks, which are often tedious, are essential prerequisites to the analysis of the data and should not be skipped over in one's hurry to get to the analysis phase of the survey process. The quality of the analysis is only as good as the quality of the input data; so ensure that the data set is "clean" (i.e. free from error and bias) before embarking on the analysis.

An important component of obtaining clean data is the use of various data correction and expansion techniques which attempt to make the sample data more nearly representative of the population which it is trying to represent. These techniques are described in detail in Chapter 9, where attention is focussed on the use of a variety of data weighting techniques to account for socio-demographic differences between the sample and the population, as well as for the effects of non-response and non-reported information.

Chapter 10 outlines the wide variety of data analysis techniques which can be used on the clean data set. Attention is focussed on the use of widely available microcomputer spreadsheets and database programs, which facilitate the entry, editing, analysis and presentation of data in one integrated package, and on the use of commercially available microcomputer statistical packages. This chapter outlines the concepts of "exploratory data analysis" which can be used to get a good feel for the nature of the data collected. There is also a relatively brief overview of a number of multivariate analysis techniques which can be used in the "model-building" phase of data analysis. Aspects of on-going database management are also covered in this chapter.

Chapter 11 concludes the book by providing some guidelines for the preparation of survey reports and the presentation of survey results in a variety of ways. It also highlights the importance of adequate survey method documentation and tidying-up of the many details associated with a survey project.

A special feature of this book is the Survey Design Checklist located in the yellow-paged Appendix at the rear of the book. This Checklist consists of a series of questions which record the major features of a survey, and which serve as a reminder of the decisions which must be made in the design and conduct of the survey. Each question is linked back to the content of the book by the provision of references to relevant sections in the book. The Checklist pages can be photocopied and used as a design aid for any survey. The completed Checklist also serves as a component of the documentation of the survey. Finally, when this book is used as a textbook, the Survey Design Checklist serves as a useful means of revision for the student of transport survey methodology.


Whilst this book attempts to provide a comprehensive framework for the conduct of transport and land-use activity surveys, it will be obvious that it does not attempt to cover all aspects of sample survey design in equal detail. In particular, it does not cover the detailed statistical design associated with various types of complex sampling strategies; numerous textbooks have been written on the details of sample design (e.g. Cochran, 1977; Kish, 1965; Yates, 1971) and the reader is referred to these for further details. Similarly, although an overview of exploratory and multivariate data analysis is given, relatively little is provided about the details of the many techniques which can be used in the analysis phase, since the range of such techniques is quite wide and the selection of the analysis technique will vary between individual surveys. For a comprehensive description of many multivariate data analysis methods, the interested reader is referred to Stopher and Meyburg (1979).

The reader should also be aware that this book concentrates on the application of social survey techniques to the recording of personal travel and activity behaviour. While many of the same principles apply, this book does not concentrate on traffic surveys (see Taylor and Young (1988) for an exposition of traffic survey methods) or commercial vehicle surveys.

Finally, as the reader either now knows, or will soon find out, the real learning experience in survey design comes not from reading books but from conducting surveys in the field. Only then will some of the seemingly trivial points made in this book assume their full importance; while this book provides a framework for the conduct of transport surveys, it is no substitute for experience.



2. Preliminary Planning of Surveys

At the preliminary planning stage of the survey process, a number of basic issues must be addressed before a decision is taken to proceed with the design and conduct of a sample survey. It is assumed throughout this book that it is always a sample survey being considered, because in transport it is usually very difficult to survey an entire population; it is also usually unnecessary to sample the entire population, as will be shown in Chapter 4.

2.1 OVERALL STUDY OBJECTIVES

It should be realised from the outset that the collection of data is one part of a more comprehensive transport planning process. Many contemporary authors have suggested various ways in which the urban transport planning process should operate (e.g. Hutchinson, 1974; Stopher and Meyburg, 1975; Dickey et al., 1975; Morlok, 1978; Ortúzar and Willumsen, 1994). By adapting the critical components of each of these versions of the planning process, a general transport planning systems process can be postulated as shown in Figure 2.1. It can be seen that data collection is but one of the tasks leading towards the evaluation, selection and implementation of a particular transport strategy. For example, in planning for the introduction of a toll road system, collecting data about who currently uses the route would be an important component of the process leading to a decision about whether a toll should be introduced, how much it should cost, and which route it should take.


[Figure 2.1: flow diagram linking Problem Definition, Values, Goals, Objectives, Criteria, System Boundaries, Resources, Data Collection, Models, Alternatives, Consequences, Evaluation, Selection, Constraints, Implementation, Monitoring and Re-examination of Goals.]

Figure 2.1 The Transport Planning Process

The starting point in the planning process, however (if indeed a starting point can be defined in such a continuous process), is the definition of the problem (or problems) under consideration. It is probably the most important single component of the planning process. Very often, the careful definition of the problem will greatly assist in suggesting possible solutions. Indeed, the explicit enunciation of the problem may well be a crucial step in the solution of the problem itself, and may obviate the need for surveys and subsequent data analysis.

As an aid to understanding the significance of problem definition, consider the following definition of a problem: "A problem for an individual or a group of individuals is the difference between the desired state for a given situation at a given time and the actual state. This difference cannot be eliminated immediately (if ever)." (Dickey et al., 1975). Four factors concerning this definition are of significance. Firstly, the problem is the concern of a limited population and is not necessarily of general concern to the population at large. This fact is of some importance when attempting to define the "goals of the community"; there is no one set of goals, they are different for everybody. This disparity is often a source of considerable misunderstanding, since those people (such as planners) not directly affected by the problem may find it difficult to comprehend the nature and/or magnitude of the problem.


Secondly, the identification of the desired state may in itself present a problem, in that it may be very difficult to determine just what that desired state is, or should be, for a particular situation. For example, the identification of a desirable level of air pollution may present problems because the desirable state may not be represented by a zero pollution level. There are finite levels of air pollution which are easily tolerated by human, and other living, organisms. However, increasing air pollution levels above a threshold brings with it substantial problems. The determination of these threshold levels, as a possible representation of a desired state, may involve considerable difficulty and controversy.

Thirdly, the fact that the problem may never be solved makes transport problems somewhat different to other mathematical or physical science problems which may have been previously encountered. This insolubility is due to the complex interweaving of various transport problems. Thus, the best solution for one problem may well create problems in other areas. For example, reducing the air pollution emissions from internal combustion engines may result in increased fuel consumption. Alternatively, solving a problem for one group in the community may simply create a problem for another group of individuals, e.g. shifting heavy trucks off one road and onto another road.

Fourthly, it is important to talk about problems "at a given time". This is because, firstly, the nature of transport problems changes over time and, secondly, even if a transport problem could be solved, the solution to that problem would create a rise in the level of expectation of the population in question such that a gap would once again appear between the desired and actual state. This dilemma can be discouraging for the planner since it implies that most problems can never be solved "once and for all"; the planner can never satisfy the community.

Nonetheless, the importance of problem definition is extremely high, as highlighted by Armstrong (1978) who states that "a good job of identifying the problems will reduce the likelihood of Type III Errors (Type III Errors are good solutions to the wrong problems)."

Since a problem has been defined as a discrepancy between the desired and actual states of the system, it is obviously necessary to ascertain these two states before attempting the definition of the problem. The desired state of the system may be reflected in the values and goals of the community. Stopher and Meyburg (1975) made the useful distinction between values, goals, objectives and criteria. Each of these elements represents a restatement, and refinement in more operational terms, of the preceding element. To attempt problem definition, both the values and goals must have been defined. Thus, for example, if the value was defined as being "Environmental Amenity", one possible goal might be defined as "the containment of traffic noise levels in residential areas".


To determine whether a problem existed with respect to this goal, it would be necessary to ascertain whether, in fact, traffic noise levels were presently considered to be satisfactory in residential areas. To do this, it would be necessary to resort to some form of data collection to determine the present state of the system. At this stage of the planning process, the data collection might be relatively informal; for example, letters of complaint to the authorities about traffic noise levels in residential areas might suffice as an indication of the present state of the system with respect to this parameter.

If a discrepancy exists between the desired and actual states, then a problem may be deemed to exist with respect to this aspect of the system. If no discrepancy exists, then other aspects of the system should be examined to determine whether problems exist in those areas.

Having defined the problem, or at least expressed one's perception of the problem, it is then possible to determine the system boundaries within which solutions to the problem will be sought. The definition of the system boundaries is performed on the basis of an intuitive assessment of the likely impacts of the problem and may be subject to later revision. As shown in Figure 2.2, these system boundaries may be defined in three dimensions: spatial, social and temporal.

The spatial dimension defines the system boundaries with respect to the geographical area to be considered in an evaluation study or a sample survey. With respect to transport, a useful trichotomy is between site, corridor and network studies. Alternatively, the spatial dimensions may be defined in accordance with geopolitical boundaries pertaining to local government, state government and national government boundaries. The social dimension in Figure 2.2 relates to the social groups who are to be included within the system boundary of the study. These groups may be defined in terms of their relationship to the transport system. For example, as a result of any change to the transport system there may be system users who benefit from the change, users who lose as a result of the change, and non-users who may be either adversely or beneficially affected by the change. The introduction of a high occupancy vehicle (or transit) lane readily brings to mind examples of all three of these. Alternatively, the social groups may be defined with respect to a wide range of socio-economic variables where, for example, the objective of the study might be to examine the distributional impacts of the current system or changes to that system. The temporal dimension in Figure 2.2 relates to the time horizon over which a solution is to be sought. Thus problems, or impacts, may be considered with respect to their short-term, medium-term, or long-term implications. Obviously, the interaction between solutions identified for each of these time-frames needs to be identified and assessed for compatibility.



[Figure 2.2: three axes defining the system boundaries. Spatial: Site, Corridor, Network. Social: User Gainers, User Losers, Non-Users. Temporal: Short-term, Medium-term, Long-term.]

Figure 2.2 The Dimensions of System Boundaries

Given an explicit statement of the problem, and the boundaries within which the problem is perceived to exist, it is then possible to refine the goals into more specific, operational measures, termed objectives and criteria, which will later be used as a basis for comparison in the evaluation phase of the planning process.

Continuing the traffic noise example quoted earlier, the objective might be "acceptable traffic noise levels on residential streets in inner urban areas", whilst the final criterion might be "a maximum noise level of 68 dB(A) L10 (18 hour) at a distance of 1 metre from the building facade for residential buildings along Minor Street in Hushville". At this stage, there now exists a specific numerical desired state to work towards in seeking a solution. Obviously, other criteria could equally well be specified for this aspect of the transport system, as well as for other aspects. The degree to which each criterion is satisfied will depend, among other things, on the importance placed on each criterion.
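Once a criterion is stated numerically, testing for a problem reduces to comparing the measured state with the desired state. The sketch below assumes one common convention for the L10 (18 hour) index, namely the arithmetic mean of the eighteen hourly L10 values between 06:00 and 24:00; the text does not define the index, so this convention and the hourly readings used are assumptions made for illustration.

```python
def l10_18_hour(hourly_l10_db):
    """Arithmetic mean of 18 hourly L10 levels in dB(A) (assumed convention)."""
    if len(hourly_l10_db) != 18:
        raise ValueError("expected 18 hourly L10 values (06:00 to 24:00)")
    return sum(hourly_l10_db) / len(hourly_l10_db)


def discrepancy_db(hourly_l10_db, criterion_db=68.0):
    """Actual state minus desired state; a positive value indicates a problem."""
    return l10_18_hour(hourly_l10_db) - criterion_db


# Hypothetical hourly L10 readings, 1 metre from a facade on Minor Street.
readings = [70.0] * 6 + [72.0] * 6 + [68.0] * 6

gap = discrepancy_db(readings)
print(f"L10 (18 hour) = {l10_18_hour(readings):.1f} dB(A), discrepancy = {gap:+.1f} dB(A)")
```

A positive discrepancy indicates that the actual state exceeds the criterion, so a problem would be deemed to exist with respect to this aspect of the system.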

The allocation of resources to the solution of the problem is a critical step since it affects other key components of the planning process. This allocation, in terms of financial and manpower commitments, will determine the amount and type of data which can be collected, the type, number and scale of feasible alternative solutions which can be considered, the degree of sophistication needed, and if possible, the models which will be used in the analysis.

A key feature of the transport planning process is the use of models to describe the operation of the system. Conceptually, it would be possible to investigate the operation of the system by actually implementing an alternative in the field and then observing its effects on the surrounding ecological, economic and social environment. However, this approach has a number of economic, financial, political, social and engineering drawbacks and, apart from a limited number of trial schemes, this method is seldom used. Hence some form of system model, or simplified abstraction of the real world, must be relied upon to generate predictions of consequences. A framework for consideration of the different types of system model is shown in Figure 2.3. There are three different types of model: supply models, which describe how physical systems work; demand models, which describe how users of the system make their decisions; and impact models, which describe the impacts of usage of the system in terms of the economic, environmental and social consequences.

[Figure 2.3: diagram linking Physical System Characteristics, Supply Models, System Operating Characteristics, Population Characteristics, Demand Models, Usage of System, Impact Models, and Economic, Environmental and Social Equity Impacts.]

Figure 2.3 The Systems Modelling Process

The formulation of the system models is governed by the resources available and the objectives of the analysis. In many cases, the "model" is no more than the experience of the particular planner involved in the analysis. In other situations, the model is a complex mathematical model which takes account of the many systems interactions. In all cases, the model simply makes predictions of the likely consequences of the alternatives to be analysed.

The generation of the alternatives by which the problem might be solved is possibly the most challenging part of the process from a professional point of view, in that it requires considerable creativity on the part of the planner to generate alternatives which will meet the desired criteria within the constraints imposed on the problem solution. While much of the transport planning process is concerned with the application of logical and reasoned thought processes, the generation of alternatives should concentrate on illogical and unreasonable thought processes. All transport planners should become familiar with the writings on lateral thinking by Edward deBono (e.g. deBono, 1967, 1972, 1988). In his works, deBono stresses that no new ideas can come from logical thinking; all new ideas come from illogical and somewhat random thoughts. With this in mind, deBono invented a number of thinking strategies such as provocation, random words and "po" statements (po stands for "provocation operation"). deBono describes how such strategies can result in significantly new ideas which can overcome problems which could not have been solved through the application of conventional ideas. However, deBono also stresses that, once discovered, all of these new ideas must be able to be explained and justified in terms of conventional logical thought processes; in essence, all great new ideas are obvious in hindsight. Combining logical and illogical thinking is one of the great challenges in transport planning.

The range of alternatives which might be considered is quite wide and may include one or more of the following:

• do nothing
• change transport technology
• construct new facilities
• change methods of operation
• substitute some other activity for transport
• change the regulations or legislation
• change pricing policies
• change public attitudes
• tackle the urban problems which cause transport problems

The prediction of the consequences of various alternatives may necessitate a revision of the system boundaries if it appears that there are likely to be substantial impacts outside of the existing system boundaries. This may then involve a revision of the objectives and criteria and a change in the models to be used to predict an expanded set of consequences.

The comparison of the predicted consequences with the stated criteria is then performed in the evaluation process. If no alternatives are deemed to be acceptable as a result of the evaluation, then a search should be made for new alternatives which may be acceptable in accordance with the stated criteria. If, after an intensive search, there appear to be no acceptable alternatives then it may be necessary to perform a re-examination of the goals and objectives to determine whether they are unattainable and whether it may be possible to lower the standards of the criteria without serious consequences.

If one or more of the alternatives are finally deemed acceptable, then a selection of the best alternative is made on the basis of the stated criteria. This alternative is then the subject of implementation in the field, depending on any constraints which may be imposed on implementation by parties outside of the transport planning process. Examples of such external constraints include political and electoral considerations.

The final phase of the planning process is the monitoring of the performance of the selected alternative under actual field conditions. This monitoring process will give rise to data on the actual operation of the alternative. This data on operation and consequences may provide a basis for recalibration, or reformulation, of the system models to enable better predictions to be made of future consequences. This monitoring may also suggest changes which should be made to the selected alternatives to improve operations. The changes can then be modelled and evaluated to predict new operating conditions. Finally, monitoring should be performed to ascertain any changes in goals and objectives which may affect the selection of alternatives over time. The inclusion of this monitoring step is essential and highlights the fact that planning is a continual process which does not stop with the generation of a set of plans for implementation. These plans must be continually revised in accordance with changes in community goals, changing economic conditions and developing technology.

Only after the overall study objectives have been specified can the question of how to design and conduct the survey be considered. In particular, the objectives of the study will determine the content of the survey and the analytical techniques required to address the stated problem. The definition of the system boundaries will assist in specifying the survey population and other components relevant to the design of the sample. It should also be recognised and accepted that the explicit statement of the problem may well obviate the need for a survey, either because the statement of the problem provides the solution to the problem or because it becomes clear that a sample survey may not assist in devising solutions to the problem.

2.2 SURVEY OBJECTIVES

Given the objectives of the overall study, and noting the various survey purposes as outlined in Chapter 1, it is necessary at this stage to define in quantitative and qualitative terms the objectives of the survey. Basically, one must ask just what questions are to be answered by the survey and how the information obtained from such a survey can be used to assist in the overall planning process. Unless these questions can be answered clearly at this stage, it is highly likely that problems will occur at several later stages in the survey process.

It may well be useful, however, to distinguish between two different types of survey: project planning surveys, which address a specific issue (e.g. what kind of people use public transport?), and research study surveys, where data is being collected for more general use with multiple purposes rather than with a specific application in mind. At the outset, the questions to be answered in a research survey will be much less clearly defined than in a project planning survey. This is understandable because, by the very nature of research, if one knew all the questions to be asked at the outset, then the study would no longer be a research study. For project planning surveys, however, there would be much less uncertainty about the concepts and questions involved; all that is missing is the empirical validation of these concepts. In the discussion to follow, it will be assumed that project planning surveys are the central topic and hence it should be possible to clearly define the survey objectives.

An example of survey objectives for a large-scale travel survey may be drawn from the 1981 Sydney Travel Survey, which was a personal interview survey of 20,000 households carried out in Sydney, Australia between July and December 1981 (Ampt, 1981). The stated objectives of this survey were:

(a) To provide a description of current travel behaviour, so that requests for information from various transport and planning authorities could be accommodated; and

(b) To collect data for specific policy analysis purposes, and for short-term forecasting using disaggregate behavioural models of transport choice.

In order to achieve these major stated objectives, the following informal objectives were also formulated (Ampt and West, 1985):

(c) To improve on aspects of the methodology deemed to be problematic in the 1971 Sydney Area Transportation Study (SATS) survey and in other documented surveys;

(d) To ensure that the data reflected important travel-related factors such as flexible working hours; and

(e) To include information on behavioural (travel and activity) alternatives other than those reported on the sample travel day.


2.3 REVIEW OF EXISTING INFORMATION

Before embarking on the collection of a new set of data, it is prudent to ascertain just how much is already known on the subject in question. Whilst this would appear to be a commonsense step, it is often overlooked, with the result that many surveys either obtain data which simply duplicates that which is already available, or else obtain data which is not completely appropriate to the problem at hand. The review of existing information can provide assistance in two major ways.

First, it may be possible to unearth existing relevant data sources. If one is extremely fortunate, it may be possible to find data which are exactly those required, and hence the need for a further survey is completely obviated. In many cases, however, even though appropriate data is found it may still be necessary to re-analyse the data in the light of the problem at hand. Wigan (1985) gives an excellent review of some of the opportunities and pitfalls involved in such secondary use of transport survey data.

Often not all the required data is found, and hence the need for a further survey is only partially removed. Whilst not removing the need for further data collection, other data sources may be of great assistance in the design of that survey. For example, estimates of population variance for relevant parameters from a previous survey may assist in sample size calculations, while knowledge of the composition of the population (e.g. what percentage of people own cars) may assist in stratified sample design. Finally, data may be found in a different form but one which nevertheless will be useful as a cross-check on the accuracy of the data which is subsequently collected. Many sources of existing data are available for transport planning purposes and a comprehensive list of such data sources would be extremely long - and would vary from country to country. An example of such data sources, compiled for Australian conditions by Bowyer (1980), is shown in Table 2.1. Axhausen (1994) has also prepared a very comprehensive list of major travel surveys which have been carried out throughout the world since the 1950s.

The second way in which an information review might provide assistance is in the revelation of methodological procedures which may be appropriate in the survey. The art of survey design is based on the acquisition of knowledge through experience, and there is no substitute for learning from one's mistakes in the field. Unfortunately, many survey designers appear to take this generalisation to extremes, with the result that their surveys contain the same basic mistakes that have been made in hundreds of previous surveys. Whilst there is no substitute for experience, there are many pitfalls which have been reasonably well documented in the literature. A cursory review of this literature will help avoid some of the common faults in survey design. There is a problem in transport surveys in that the literature is widely dispersed. Often the survey method is only briefly described as an adjunct to the main topic of the paper, which is usually the analysis of the survey data to assist in policy formulation. Nonetheless, there is a growing body of literature which will assist in suggesting survey techniques which are most appropriate for particular problems. For example, the 1981 Sydney Travel Survey adopted the use of a verbal activity recall framework on the basis of work done by Jones and colleagues in the late 70s (reported in Jones et al., 1983) and in response to dissatisfaction with other trip recall techniques used in the 1971 SATS survey.

Table 2.1 Sources of Existing Transport Data

| SOURCE | VALUE | LIMITATIONS |
|---|---|---|
| National Bureau of Statistics: Population Census | Provide a broad outline of population groups and characteristics | Limited transport-related items. Individual records not available. |
| Urban Transport Studies; Home Interview Surveys | Detailed person and household travel data | Little data on non-travel activities nor on likely attitudes to future transport options. |
| Family Expenditure Surveys | Household expenditure behaviour. | Small nationwide sample; not yet suitable for longitudinal studies. |
| Community Services; Patient and attendee records at hospitals, welfare centres, schools, play groups. | Identifies "disadvantaged" groups; design and operation of small paratransit services. | Does not cover "disadvantaged" persons who are not on the records. Some records not easily accessed. |
| Employers and Unions; Employee and Membership data | Basis for identifying potential users of employer-based transport schemes. | Possible problems with privacy and availability. |
| Operator Data; Transit and Taxi Systems. | Identifying and monitoring users of these services | May not reflect characteristics and attitudes of non-users. |
| Political "Network"; Petitions by individuals or groups to government. | Enables local "grass roots" needs to surface. | Danger of demands snowballing or wrong solution being fitted to an expressed need. |
| Monitoring Schemes; Tracking Patronage and Productivity. | Can aid in "fine-tuning" a particular scheme and provides research data for other schemes. | Requires commitment to monitor over adequate time period, and a statement of success criterion. |
| Universities; Institutes; Special Purpose Studies. | Exploratory "research value" insights into population attitudes and activity/travel behaviour. | Location specific. |
| Government Departments and Authorities (e.g. Electric Utility Companies, Motor Vehicle Registrations) | Survey sample frames | Access to files |

Source: Bowyer (1980)

In addition to identifying survey techniques and practical pitfalls, a preliminary literature review should also be conducted with respect to the subject of the survey. This may enable the survey objectives to be specified more precisely, in the light of both previous survey results and also of theoretical considerations with respect to the subject matter.


2.4 FORMATION OF HYPOTHESES

If it appears that new data must be collected in order to satisfy the survey objectives, then it is necessary that the correct type of data be collected. Valuable guidance can be obtained in this respect by indulging in a little "theorising" or "kite-flying" as to the possible hypotheses which one may wish to test once the data have been collected. Such theorising is far from being an academic exercise; it is in fact fundamental to the methods of scientific enquiry on which much of transport systems analysis is based. As noted by Stopher and Meyburg (1979), the scientific approach consists of four major phases: hypothesis formation, observation, testing and refinement of hypotheses. Survey methods are the basis of the observation phase, but this must be preceded by hypothesis formation.

The rationale for hypothesis formation is relatively simple; without it, we have no idea of what we want to measure and to what level of accuracy such measurement is required. However, given an hypothesis about how the component of the transport system under investigation works, it is possible to clarify several aspects of the survey process. First, the dependent and independent variables in the hypothesised relationship are established. For example, the dependent variable may be trips per person per day while the independent variables might be personal income, age, sex and car ownership. Second, the types of analysis techniques which may be appropriate to derive the hypothesised relationship are determined. Third, the unit of analysis may be determined by the theoretical construct, e.g. is choice of a mode of travel a function of an individual or of the household to which the individual belongs? Fourth, the requirements of the specified analysis techniques will give an indication of the degree of accuracy needed in the survey data. Bates (1979) provides a good example of this stage of the survey process with respect to the construction of disaggregate models of transport choice.
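As an illustration only, the example hypothesis mentioned above could be written down as a simple relationship between the dependent and independent variables. The linear form, the coefficient symbols and the error term are assumptions made here for illustration; the text does not specify a functional form.

```latex
T_i = \beta_0 + \beta_1\,\mathrm{income}_i + \beta_2\,\mathrm{age}_i
      + \beta_3\,\mathrm{sex}_i + \beta_4\,\mathrm{cars}_i + \varepsilon_i
```

Here $T_i$ would be trips per day for person $i$. Writing the hypothesis out in this way makes the unit of analysis (the person) and the list of items the survey must measure explicit before any questions are drafted.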

It should be noted that not all hypotheses which are proposed before a survey is conducted turn out to be correct hypotheses. Therefore, it is prudent to establish a range of plausible hypotheses and then determine the survey requirements such that each of the hypotheses can be adequately tested.

2.5 DEFINITION OF TERMS

At the end of the hypothesis formation stage, the survey designer is usually left with a number of concepts and terms whose general meanings are understood, but which are not totally clear with respect to details. It is therefore important, at this early stage of survey design, to specify clearly the definition of terms to be used in the survey so that everyone concerned with the design and conduct of the survey is clearly aware of the interpretations to be used. The problem of communicating these definitions to survey respondents in a clear and unambiguous fashion is a very different issue which will be covered later when dealing with survey instrument design.

In transport surveys, some common terms which need specific definition include:

(a) Household
In the past, for many of the cases encountered in transport surveys, the definition of a household was relatively straightforward where the household consisted of the standard nuclear family, i.e. husband, wife and dependent children. However, current estimates show that such a nuclear family is now in the minority and is further decreasing. Single parent families, extended families, and non-related people living in a family situation make the definition of a household more difficult. There do exist some standard definitions of households (related to how people in a dwelling utilise such items as the cooking facilities) although for the purpose of some transport surveys these definitions may not be appropriate. For example, if the survey objectives pertain to the way in which household members interact in choosing to take part in different types of activities outside the home and the methods of transport they use to reach them, non-related household members will generally behave very differently from those who are related.

(b) Trip
Probably the most difficult concept to describe in transport surveys is that of a "trip". There are many possible definitions, and different definitions may be appropriate under different conditions. Axhausen (1994) describes many of these in some detail. The definition of a trip may be related to a change in land-use activity at the origin and destination, it may be influenced by the mode used for the trip, by the length of the trip, and by the length of participation in activities at each end of the trip. Most trips are defined in terms of one-way movements, but some reported trip rates are given in terms of round-trips (sometimes called journeys or sojourns).

The most common differentiation between trips, however, is between unlinked trips (sometimes called stages or legs or stops) and linked trips. Unlinked trips are usually defined as all movements on a public street, meaning that walking to the bus stop, travelling on the bus, and walking from the bus to the destination would be three separate unlinked trips or stages. This type of definition is important in many cases; for example, when researching exposure to the risk of accident, since walk trips are as vulnerable to accidents as other types of travel. In general, walking is being seen as a more important component of travel than in the past, so that this definition is becoming increasingly used. Linked trips are usually defined based on activities. Each time the activity changes, a new trip is said to occur. In the above example, there would be only one linked trip to the destination.

Since it is clear that the definition of a trip is vital to the survey results, it is critical that the survey designer(s) agree on a definition at this stage so that questions can be designed with a single definition of "trip" in mind.
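The difference between the two trip definitions can be sketched in a few lines of code. The following is a minimal sketch, not a standard coding scheme: the `Stage` record, its field names and the `"change mode"` marker for a pure transfer are all hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    """One movement on a public street (an unlinked trip or stage)."""
    mode: str
    activity_at_end: str  # hypothetical coding: "change mode" marks a transfer only


def count_trips(stages):
    """Return (unlinked, linked) trip counts for one person's ordered stages.

    Every stage counts as one unlinked trip. A linked trip is counted only
    when a stage ends in a real activity rather than a change of mode.
    """
    unlinked = len(stages)
    linked = sum(1 for s in stages if s.activity_at_end != "change mode")
    return unlinked, linked


# The example from the text: walk to the bus stop, ride the bus, walk to the destination.
journey = [
    Stage("walk", "change mode"),
    Stage("bus", "change mode"),
    Stage("walk", "work"),
]
print(count_trips(journey))  # (3, 1)
```

The same travel record therefore yields three trips or one trip depending on the definition chosen, which is why a single agreed definition is needed before questions are written.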

2.6 DETERMINATION OF SURVEY RESOURCES

In most cases, the available resources dictate the design of a survey, rather than vice versa. Whilst it may be theoretically desirable to first design the survey to achieve pre-specified aims, and then to work out the resources required, it is the case in most real-life surveys that the survey must be designed to fit within certain pre-specified resource constraints. The resources needed for the conduct of any survey can be defined in terms of three items: money, time and manpower.

2.6.1 Costs of Surveys

The cost of a survey can be divided into five major areas of expenditure:

(a) Salaries
    (i) Professional staff
    (ii) Administrative staff
    (iii) Computer data entry staff
    (iv) Field staff
    (v) Specialised consultants

(b) Travel Costs and Living Expenses for Field Staff

(c) Services
    (i) Printing of questionnaires
    (ii) Vehicle operating costs
    (iii) Data entry costs
    (iv) Publication costs
    (v) Postage costs
    (vi) Telephone and fax costs

(d) Equipment and Supplies
    (i) Computers

(e) Other Costs
    (i) Overheads
    (ii) Publicity
    (iii) Transport of material
    (iv) Rental of office space
    (v) Incentives and give-aways



Because of inevitable cost over-runs, a good rule-of-thumb found in many surveys is to ensure that there is about 150% of the estimated cost actually available for conduct and analysis of the survey, if necessary. As will be discussed in Chapter 6, the pilot survey is a good time to check the estimated costs of the main survey.
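The 150% rule-of-thumb amounts to simple arithmetic over the five areas of expenditure. The sketch below assumes a hypothetical function name and entirely hypothetical cost figures; only the 1.5 multiplier comes from the text.

```python
def funds_to_secure(estimated_costs, contingency_factor=1.5):
    """Total funds that should be available, given itemised cost estimates.

    contingency_factor=1.5 encodes the "about 150% of the estimated cost"
    rule-of-thumb; it is a planning allowance, not a precise forecast.
    """
    return sum(estimated_costs.values()) * contingency_factor


# Hypothetical estimates (any currency), one entry per area of expenditure.
estimates = {
    "salaries": 120_000,
    "travel_and_living": 15_000,
    "services": 40_000,
    "equipment_and_supplies": 10_000,
    "other": 15_000,
}
print(funds_to_secure(estimates))  # 300000.0
```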

2.6.2 Time Estimates for Surveys

The time needed to complete the survey should be estimated by ensuring that adequate time is available for each of the stages of the survey described in Chapter 1. A bar chart, or critical path diagram, showing overlapping and sequential activities is a useful aid in estimating the survey time requirements. However, so many things can go wrong in a survey that it is virtually impossible to state fixed rules for making time estimates under a wide range of circumstances. Therefore, it is again useful to ensure that there is at least 150% of the estimated time available, since several stages of the survey process can consume far more time than originally envisaged, e.g. editing of data.

The two components which are most often overlooked in terms of time allowance are the pilot surveys and time for analysis of the data. Both are critical to the success of the survey, and more importantly, to the quality and usefulness of the data being collected. Their omission can be to the detriment of an otherwise potentially robust data set.

2.6.3 Personnel Requirements

In estimating personnel requirements, five different categories of staff may be required: professional (for design and analysis), administrative, computer support, field staff, and specialist consultants. Depending on the size of the survey and the way in which it is to be conducted, some of the categories defined above may either be overlapped or eliminated. Thus, for small surveys, the professional and administrative tasks may be handled by one person (or group) while the field and editing tasks may be performed by the one group of people. Alternatively, for large surveys it has been common practice for authorities to contract the entire survey out to survey consultants, and to maintain liaison through a small professional or administrative staff group. Irrespective of the manner in which the survey is to be conducted, it is essential that adequate staff be available at all five levels; otherwise delays and/or inefficiencies will inevitably occur. It should also be remembered that not all tasks can necessarily be performed by the same person. The designer of a questionnaire may not necessarily be the best type of person to administer the questionnaire in the field; similarly, a person who likes the uncertainties of the field survey situation may not be suited for the repetitious task of entering data into the computer.


2.6.4 Space Requirements

One of the components of survey resources which is often overlooked is the requirement of space for all types of surveys. The storage of questionnaires, the seating of administrative staff and the location of numerous data enterers (with their computers) can take up considerable office and storage space and needs to be considered at an early stage of planning. For example, a self-completion travel survey of the kind described in Section 7.2 with a gross sample of 20,000 households needs storage for about 10 tonnes of paper - that is about 40 cubic metres, or a double garage (floor to ceiling) worth of space!

2.7 SURVEY CONTENT

Determining the content of a survey is an important yet sometimes difficult task. The degree of difficulty is increased greatly if the objectives of the survey have not yet been clearly defined. In such a case, where it is not known how the data is to be analysed, the determination of survey content is often left to the intuition of the survey designer; many topics are included in the survey content "just in case" they may turn out to be useful at a later date. Large-scale household travel surveys have often suffered from this "shot-gun" approach to survey content determination.

On the other hand, if the survey objectives are clearly defined, then the task of determining survey content becomes relatively straightforward; if an item of information is required for the intended analysis, then it should be included in the survey content. Even proceeding in this relatively systematic fashion, however, it is possible that the survey content list can be quite large. The task of refining the list to arrive at those questions actually asked on the questionnaire will be discussed later in the chapter on survey instrument design, when the trade-offs between desired information and questionnaire length are considered. At the moment, then, the survey content list is like a "wish-list" of desired information which one would like in order to achieve the objectives of the survey and the overall study.

It is important to realise at this early stage that the survey content list should include items which may not be directly related to the survey topic under investigation, but which will be necessary when weighting the data to represent the population from which the sample is drawn. Many surveys have floundered at the data weighting stage because the variables in the secondary data source, to which the weighting or expansion factors are to be keyed, have not been included in the questionnaire in an appropriate form.
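A minimal sketch of why the weighting variable must be on the questionnaire: expansion factors are computed per category of some variable (car ownership is assumed here) as the population count from the secondary source divided by the sample count. The function name, the categories and all counts are hypothetical, and this simple ratio is only one of several possible weighting schemes.

```python
def expansion_factors(population_counts, sample_counts):
    """Population count divided by sample count for each category.

    Both inputs must use the same categories; this only works if the
    questionnaire recorded the variable in the form used by the
    secondary data source.
    """
    factors = {}
    for category, population in population_counts.items():
        n = sample_counts.get(category, 0)
        if n == 0:
            raise ValueError(f"no sample households in category {category!r}")
        factors[category] = population / n
    return factors


# Hypothetical household counts by car ownership.
census = {"0 cars": 20_000, "1 car": 50_000, "2+ cars": 30_000}
sample = {"0 cars": 100, "1 car": 500, "2+ cars": 400}
print(expansion_factors(census, sample))
# {'0 cars': 200.0, '1 car': 100.0, '2+ cars': 75.0}
```

If the questionnaire had recorded car ownership only as "has a car / has no car", the "1 car" and "2+ cars" census categories could not be matched and the factors could not be computed in this form.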



3. Selection of Survey Method

The task of selecting the appropriate survey method is crucial to the efficiency of the overall survey effort. The choice of survey method will usually be the result of a compromise between the objectives of the survey and the resources available for the survey. This compromise, or trade-off, can be neatly illustrated as shown in Figure 3.1.

A trade-off occurs because it is impossible to control all three of the major elements in Figure 3.1; at best, only two of the three can be controlled by the survey designer. Thus given a fixed budget, as is normally the case, the selection of the survey method, with an associated degree of quality control, will automatically control the quantity of data which can be collected. Alternatively, within a fixed budget, specification of the quantity of data to be collected will immediately dictate the quality of data which can be collected. That is, we can collect lots of low quality data or we can collect a limited amount of higher quality data for a given budget. Generally, the latter course of action is to be preferred.

In determining the total quantity of data to be collected, a further trade-off is present between the sample size from which data is collected and the amount of data collected from each respondent in the sample (i.e. the number of questions in the questionnaire or interview). Within a limited budget for coding, editing and analysis, it will be necessary to trade off the number of questions against the sample size; which one takes precedence will depend on the purposes of the survey and the length of the survey content list. Thus some surveys can be effective with only a limited number of highly directed questions, while others need to explore topics in depth. The extent of this trade-off is, therefore, a specific design decision on the part of the survey designer. The trade-off will also be partly determined by the respondents themselves. As the length of the survey increases, the response rate will generally decrease (the rate of decrease will depend on such factors as the interest level of the survey topic to the respondents, and the overall quality of the survey instrument design). There will, therefore, be a point at which an increase in the number of questions asked will result in the collection of less data in total, because of the more than proportional decrease in the response rate. The survey designer should therefore be cognisant of this interaction when making the explicit trade-off between sample size and survey length.
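The interaction between survey length and response rate can be sketched numerically. The linear response-rate curve below (90% for a zero-length survey, falling by one percentage point per question) is purely an assumption for illustration; real curves depend on the topic and the instrument, and would need to be estimated, for example from pilot surveys.

```python
def response_rate(n_questions, base=0.9, drop_per_question=0.01):
    """Hypothetical response rate that declines linearly with survey length."""
    return max(0.0, base - drop_per_question * n_questions)


def total_answers(sample_size, n_questions):
    """Expected number of answered questions across the whole sample."""
    return sample_size * response_rate(n_questions) * n_questions


# Search over questionnaire lengths for a gross sample of 1,000.
best_length = max(range(1, 91), key=lambda q: total_answers(1000, q))
print(best_length)  # 45 under the assumed curve
```

Under this assumed curve, adding questions beyond 45 reduces the total amount of data collected, which is the turning point described in the text. The location of that point is entirely a consequence of the assumed response-rate function.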

[Figure 3.1: diagram relating three major elements (SURVEY RESOURCES, QUANTITY OF DATA, QUALITY OF DATA) to four design choices (Survey Method, Quality Control, Survey Length, Sample Size)]

Figure 3.1 Trade-offs in Selection of the Survey Method

In making the above trade-offs, it is important to remember that the survey resources are defined in three dimensions: time, money and personnel. The above trade-offs should be considered separately with respect to each of the three resources. Thus, a survey method and sample size combination which is feasible with respect to the money available may not be feasible with respect to the time which is available for the survey, or vice versa.

3.1 TIME FRAME FOR SURVEY

One basic decision which must be made for all surveys is the choice of time frame for the data collection effort. It is possible to collect data from a cross-sectional survey or a time-series survey. A cross-sectional survey involves the collection of data at one point in time over a large number of identifiable groups which constitute a fraction of the total population. On the other hand, a time-series survey involves the collection of data using the same survey method, usually from a smaller number of groups in the population, at a number of successive points in time (a before-and-after survey is a limited form of time-series survey).

Some authors (e.g. Hensher, 1985) make the useful distinction between time-series and longitudinal data, whereby time-series data commonly defines time itself as the independent variable in the data set, whereas longitudinal data defines the independent variables to be those factors which are causally related to the dependent variables. Both types of data are collected over several periods of time, but the emphasis in the analysis is different. Time-series data is typified by quarterly data bases used by economists studying macro-economic relationships, or annual traffic counts used by traffic engineers to estimate traffic growth rates. The major emphasis in transport planning data, however, is on longitudinal data which can assist in increasing understanding of the underlying causes of travel behaviour.

Possibly the best known, and certainly longest running, longitudinal survey in transport is the Dutch National Mobility Panel survey which has been carried out by the Netherlands Ministry of Transport and Public Works since 1984 (van Wissen and Meurs, 1989). It was designed as a general-purpose travel survey, with the primary objective of analysing public transport policies. A large volume of research in many aspects of travel has in fact been drawn from the panel data. A more recent general purpose longitudinal study was begun in the Puget Sound Region of the United States in 1989 (Murakami and Watterson, 1992). Other examples of specific studies are a car ownership study in Sydney, Australia (Hensher, 1986), a survey of car and bus use in the U.K. (Goodwin, 1989) and of telecommuting in California (Goulias et al., 1992).

A further distinction is often made between longitudinal surveys and longitudinal data. Longitudinal data need not be generated from time-series surveys only; it is possible to use retrospective reports and prospectively focussed intentions surveys to obtain quasi-longitudinal data at a cost which is much closer to that of a conventional cross-sectional survey. However, given the problems with the recall ability of respondents and the propensity of respondents to "tidy up the past" to be more in accord with known outcomes (the cognitive dissonance effect, Festinger, 1957), it is likely that the use of retrospective reports creates deliberate or unconscious distortion of information supplied by the respondent (Wall and Williams, 1970). While retrospective reports may have a function in establishing respondents' perceptions of prior conditions, they should not be used as surrogates for objective data which could have been, but were not, collected at those prior times. The use of surveys which ask people to say what they will do in the future (often under changed conditions) is discussed in Chapter 5 under stated preference surveys. The use of these surveys without using retrospective data, however, is not generally considered to provide longitudinal data.

In many cases, it is implicitly assumed that longitudinal data and cross-sectional data will yield the same behavioural relationships. For example, if one wanted to predict future household trip generation rates at a time when car ownership is at a generally higher level than at present, a standard method would be to conduct a cross-sectional survey and, from this data, derive trip generation rates for households with different levels of car ownership. It would then be assumed that the relationship between car ownership and trip generation, which was obtained across different households in the population at one point in time, will be valid when used for one household over a period of time. Thus, over time, as a household moves to a higher car ownership level, it would be assumed that they will take on the trip generation characteristics of those households who currently have that higher car ownership level. While in many cases this assumption may be a reasonable approximation, it should be realised that it is an unproven assumption. In some cases, the assumption may well be grossly incorrect because the dynamic changes which occur as a result of a change in one variable may not be considered.
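The standard cross-sectional method described above can be sketched as follows. All trip rates and household counts are hypothetical figures chosen for illustration; the point of the sketch is to make the embedded assumption visible, not to endorse it.

```python
def forecast_trips(trip_rates, households_by_cars):
    """Total daily trips, applying one trip rate per car-ownership level.

    Assumption built into this method: a household that moves to a higher
    car-ownership level adopts the trip rate observed today for households
    already at that level.
    """
    return sum(trip_rates[cars] * n for cars, n in households_by_cars.items())


# Hypothetical trips per household per day, from a single cross-sectional survey.
trip_rates = {0: 4, 1: 7, 2: 10}

# Hypothetical household counts by number of cars owned.
today = {0: 300, 1: 500, 2: 200}
future = {0: 200, 1: 500, 2: 300}  # car ownership shifted upwards

print(forecast_trips(trip_rates, today))   # 6700
print(forecast_trips(trip_rates, future))  # 7300
```

The forecast increase of 600 trips rests entirely on `trip_rates` staying fixed as households change category, which is exactly the unproven assumption that longitudinal data would allow one to test.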

If survey data is to be used to develop predictive models of travel and activity behaviour, it therefore appears that longitudinal data, which can account for dynamic changes in behaviour, is the preferred type of data to collect (see Hensher (1982, 1985) for further arguments to this effect).

3.1.1 Longitudinal Surveys

Within longitudinal surveys, there are two distinct variations: successive samples and panel studies.

• The successive samples survey involves the selection of a separate sample from the same population at each stage in the time-series. Thus, while each sample represents the same population or market segment within a total population, the individuals within each sample are different from survey to survey. This is often called a repeated cross-section survey.

• A panel study, on the other hand, selects a sample at the start of the time-series and retains the same individuals within the sample for the entire series of surveys.

Longitudinal surveys exhibit a number of advantages and disadvantages compared to cross-sectional surveys. Each type of longitudinal survey also has relative advantages compared to each other. A discussion of some of the advantages and disadvantages of longitudinal panel studies will therefore highlight some of the features of each longitudinal survey method, and of longitudinal surveys in general.

The advantages of a longitudinal panel study include:

(a) Whereas the successive sample survey provides information only on the net change in the characteristics of the sample, the panel study survey also provides information on the gross changes. Table 3.1 illustrates these features. For example, a successive samples survey may indicate that usage of a bus service within the population of interest has decreased from 40% to 35%. A panel study will provide the extra information that, as well as there being a net decrease of 5%, this change is the result of 25% of the total population ceasing to use the bus while a different 20% of the population started to use the bus. This extra information is useful to the operator of the bus service in two ways. First, while there has been an overall trend of declining patronage, this trend is not entirely one-way. It would appear that there is scope for increased usage of the bus within a certain segment of the population; the operator should identify this group, establish the reasons for their increased usage of the bus, and then seek out ways in which this potential can be maximised. Second, while overall there has only been a modest decline in patronage, there has been a dramatic loss of patronage from the existing users (62.5% [250/400] of the original users stopped using the bus between the two surveys). Once again, the operator should identify those who discontinued using the bus, identify the reasons for their decisions, and adopt strategies to arrest the loss of existing users.

Table 3.1 Comparison of Successive Sample Surveys and Panel Studies

(Assume total population surveyed = 1000)

                          1993                 1994

Successive Survey
   Bus use                40%                  35%

Panel Survey
   • 1993 only            250 (25% of 1000)    -
   • 1993+1994            150                  150
   • 1994 only            -                    200
   i.e. Total users       400                  350
   i.e. Bus use           40%                  35%
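The arithmetic behind Table 3.1 can be sketched in a few lines of Python. This is only an illustration of the net versus gross change calculation; the function name `panel_summary` and its dictionary keys are assumptions introduced here, while the counts are those of Table 3.1.

```python
# Panel membership counts taken from Table 3.1 (population surveyed = 1000).
POPULATION = 1000
USED_1993_ONLY = 250   # used the bus in 1993 but not in 1994
USED_BOTH_YEARS = 150  # used the bus in both years
USED_1994_ONLY = 200   # used the bus in 1994 but not in 1993


def panel_summary(only_first, both, only_second, population):
    """Return net and gross change measures (hypothetical helper)."""
    users_first = only_first + both
    users_second = both + only_second
    return {
        # What a successive samples survey would also reveal.
        "share_first": users_first / population,
        "share_second": users_second / population,
        "net_change": (users_second - users_first) / population,
        # What only a panel study reveals.
        "gross_loss": only_first / population,
        "gross_gain": only_second / population,
        "loss_among_original_users": only_first / users_first,
    }


summary = panel_summary(USED_1993_ONLY, USED_BOTH_YEARS, USED_1994_ONLY, POPULATION)
for name, value in summary.items():
    print(f"{name}: {value:.1%}")
```

The first three entries are all that repeated cross-sections could provide; the last three depend on knowing which individuals changed state, which is why they require a panel.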


(b) By tracing the behaviour of a group of individuals over a long period of time, one obtains an idea of the dynamics of change. That is, not only is it possible to observe static equilibrium states at different points in time, but it is also possible to trace the way in which individual members of the sample arrived at the equilibrium states (if, indeed, they are equilibrium states). In this way we can identify the conditions which triggered a response and the way in which the panel study member adapted over time to changes in conditions or to the responses that they actually made. For example, what pressures or opportunities caused a household to buy a second car, how did they adapt to the pressures before they bought the second car, and how have their behaviour and travel patterns changed after purchase of the car? Tracing panel study members in this way enables a much richer and more valid understanding of their responses to change. Trying to understand such dynamic behaviour by means of a cross-sectional survey is like Newton trying to estimate the gravitational constant by looking at a photograph of an apple falling from a tree (if he had a camera available in those days!).

(c) By adding a few new questions to each round of a panel study survey, it is possible to accumulate more information about each respondent than would be possible in either a cross-sectional or a successive samples survey. A good discussion of updating a panel survey questionnaire can be found in Goulias et al. (1992). These new questions should pertain only to those respondent characteristics which are expected to remain relatively stable over time, such that it does not matter at which interview they are asked. The advantage of this procedure is that more information can be obtained without drastically increasing the size of any one interview.

(d) By interviewing the same people on a number of occasions, a greater degree of rapport and cooperation can be built up between the interviewer and the respondent. This may allow questions (of a slightly sensitive nature) to be asked on later occasions where they could not successfully be asked in an initial interview.

(e) A great advantage of panel studies is that the between-sample variance is eliminated from the sample design. Thus, whereas for successive samples surveys it is assumed that the samples at each stage of the time-series are (statistically) the same, for panel studies it is known that they are the same. This means that, all other things being equal, smaller samples can be chosen for panel studies than for successive samples surveys.
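The sample size argument in (e) can be illustrated with the standard variance formulas for an estimated change in a proportion. The sketch below is not taken from the text: it assumes simple random sampling, ignores finite population corrections, and uses hypothetical function names. The first set of inputs comes from Table 3.1; the second `p_both` value is invented to show a more stable population.

```python
def variance_of_change_independent(p1, p2, n):
    """Variance of (p2 - p1) when each wave is a separate sample of size n."""
    return p1 * (1 - p1) / n + p2 * (1 - p2) / n


def variance_of_change_panel(p1, p2, p_both, n):
    """Variance of (p2 - p1) when the same n people are observed in both waves.

    p_both is the proportion in the state of interest in both waves, so the
    covariance between the two binary observations is p_both - p1 * p2.
    """
    covariance = p_both - p1 * p2
    return (p1 * (1 - p1) + p2 * (1 - p2) - 2 * covariance) / n


n = 1000
independent = variance_of_change_independent(0.40, 0.35, n)

# Table 3.1: 15% of the population used the bus in both years.
panel_table = variance_of_change_panel(0.40, 0.35, 0.15, n)

# Hypothetical, more stable population: 30% used the bus in both years.
panel_stable = variance_of_change_panel(0.40, 0.35, 0.30, n)

print(f"panel/independent variance ratio (Table 3.1): {panel_table / independent:.3f}")
print(f"panel/independent variance ratio (stable case): {panel_stable / independent:.3f}")
```

Under these assumptions the variance ratio is also the ratio of sample sizes needed for equal precision, so the saving from a panel grows with the stability of individual behaviour between waves and is small when, as in Table 3.1, many individuals change state.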



While panel studies do possess some significant theoretical advantages, as noted above, they also are fraught with a number of potentially serious survey methodology problems as follows:

(a) It is often difficult to find respondents who are willing and able to be interviewed on repeated occasions. More importantly, given that such people can be found, it is highly probable that, unless special care is taken in sample selection, they will represent a particular segment of the population. Significantly, for transport surveys, these people may well be less mobile in terms of both locational and travel patterns, i.e. they stay in one residence for a long time, and they are more often at home with nothing to do (except answer interview questions!). The selection of people in the initial sample may therefore be biased toward people who will agree to be interviewed on repeated occasions. Such self-selection poses special problems in sample design; although the socio-demographic composition of the sample may match that observed in secondary sources, there may well be an internal invisible substitution of willing panel members for unwilling panel members. Many of the issues of panel survey sampling design are discussed in van de Pol (1987).

(b) Even if an unbiased sample can be initially chosen for the panel study, sample mortality is a major problem as the panel study progresses. People die, move out of an area, change jobs or socio-economic characteristics, or move to another stage in the life cycle. While such changes do not automatically rule panel members out of continued participation in a panel study, these effects (and others) can tend to rapidly diminish the number of people who remain eligible for membership of the panel study population. Not only is sample mortality a problem in terms of reduced sample size, but more importantly it can result in a biased sample remaining in the study if, for example, it is the more mobile members of the population who leave the panel. This attrition bias can cause serious problems in the interpretation of trends observed over the life of the panel. While it is extremely dubious to simply replace panel members who drop out of the study (Moser and Kalton, 1979), it is possible to adopt a hybrid design (such as a rotating panel design) which combines the features of panel studies and successive samples surveys (Curtin, 1981).

(c) Repeated interviewing can have a conditioning effect on the respondent whereby an initial attitude of indifference towards the subject of the survey can change to one of interest and greater involvement in the subject of the survey. This may, in some circumstances, result in changes in respondent behaviour (which may be measured in later surveys). More likely, however, it will result in changes in respondent attitudes (which may again be the subject of measurement). If these effects are not allowed for in the initial design, then serious biases can occur in the results. On the other hand, one can use these effects to advantage as an instrument of attitude formation as is done, for example, in the Delphi survey method (Dalkey and Helmer, 1963).

(d) Repeated interviewing can also have the reverse effect whereby respondents tend to lose interest in what might have initially been an interesting subject to them. The same can also happen to interviewers. In such circumstances, both respondent and interviewer may, in later interviews, simply go through a stereotyped routine where both the interviewer and the respondent know the answer which will be given before the question is asked. Additional interviews of this type may result in very little additional information.

(e) In an effort to reduce some of the negative effects of repeated interviewing with the same interviewer, it is possible to change interviewers in each round of the study. However, unless careful experimental design is used in this procedure, the change in interviewer can produce equally undesirable effects (Jones et al., 1986). This is because changes in response attributed to the respondent may really be due to changes in the respondent-interviewer interaction over the different rounds of the survey.

(f) If a longitudinal survey is being used to develop an index for monitoring of a system (e.g. surveys of customer satisfaction, or monitoring trip rates over time) then it is essential that the questions used to obtain information for construction of the index should remain stable over time. This desire for stable questions, however, can pose a problem where it is realised after the early rounds of the longitudinal survey that the questions being asked may not be the most appropriate questions to ask. In this case, one is forced to make a trade-off between the stability and the appropriateness of the questions. This problem occurs, for example, in the selection of questions on transport for inclusion in the national Census of Population. The use of a pilot survey is, therefore, all the more critical in this case since it can largely eliminate the problem if designed carefully.

(g) Where a longitudinal survey is being used to assess changes in system performance as a result of change in the system's physical characteristics (as, for example, in before-and-after surveys), it is often difficult to ensure that no other changes occur, either in the system or in the external environment, which would have an effect on the system performance measures. The longer the period that the longitudinal survey covers, the more difficult will be this problem.



(h) While panel surveys have the advantage that they eliminate between-sample variance, they have the problem that one must take extra precautions to ensure that the initial sample is indeed representative of the population under consideration, since one must stay with this sample for the entire series of surveys (van de Pol, 1987).

(i) As with all longitudinal surveys, panel studies have the considerable disadvantage that, because of their very nature, one must be prepared to wait for some time before the results will become available. Obviously one can allay this problem to some extent by releasing results from each wave of the panel study as if they were simple cross-sectional surveys. Because of the delay in obtaining the final results, however, it is sometimes difficult to convince prospective clients of the need for a panel study. This is particularly the case where the client is involved in the political arena, and where the length of the panel study may be longer than the term for which the politician is elected. Because of the difficulty in securing funding, it is often necessary to make a panel study part of an omnibus survey, which serves the different objectives of several different clients, in order to obtain sufficient funding to make the survey viable. In situations where results are needed relatively quickly (as is most often the case), longitudinal surveys are relatively impractical unless suitable data happen to have been collected over several time periods in the past. The corollary of this is that for longitudinal surveys, one needs a considerable degree of foresight to predict what the issues requiring data will be when one completes the surveys at some future date. One then only needs to convince clients that they should start funding the collection of that longitudinal data now!

In general then, it can be seen that while longitudinal data is often the most theoretically appropriate type of data, it is also beset by many practical (and some theoretical) difficulties which militate against its use. There is, however, a body of literature which will help the prospective user of this methodology, some part of which has already been mentioned. Choosing the sample and dealing with the idiosyncrasies of a changing population has been addressed by van de Pol (1987). Jones et al. (1986) have set out a series of guidelines on motivating interviewers and respondents in longitudinal research. If personal interviews are to be used, Morton-Williams (1993) has many useful comments. For a public transport users panel, Golob and Golob (1989) have some practical advice. And finally, Meurs et al. (1989) have some interesting comments on that vexing question of measurement error in panel surveys.

Despite the recent interest in the longitudinal method, cross-sectional surveys remain the most common form of survey used in transport research.


3.2 TYPES OF DATA COLLECTION TECHNIQUE

Having decided on the time frame for the survey, the next step is to decide on the type of data collection technique to be employed either once in a cross-sectional survey or repeatedly in a longitudinal survey. It should be noted that if a longitudinal survey is to be undertaken, then it is imperative that the same data collection technique be employed in each wave of the longitudinal survey. As will be shown later, each data collection technique has its own peculiar biases. By using different techniques in each wave of a longitudinal survey, it may be that any differences in results which are observed from wave to wave are due to the different data collection techniques used, rather than to any real differences in the subject of the survey.

Essentially, there are eight different data collection techniques which may be employed:

(a) Documentary searches
(b) Observational surveys
(c) Household self-completion surveys
(d) Telephone surveys
(e) Intercept surveys
(f) Household personal interview surveys
(g) Group surveys
(h) In-depth surveys

These survey methods vary in complexity, in the types of information which can feasibly be collected, and in the level of interaction between the survey designer and the respondents in the survey.

In documentary searches, the subjects of the survey are inanimate objects (documents) and there is no response required of these objects. In observational surveys, the subjects of the survey may be either inanimate (e.g. roadside features in an inventory survey) or animate (e.g. pedestrians in a pedestrian flow survey), but no specific response is required from these objects; they are merely expected to behave in their normal manner while they are being observed. As noted in Chapter 1, these two types of survey techniques are not the main focus of this book, although they are discussed briefly in the following pages to put them in context. In self-completion surveys, the subjects of the survey are definitely animate, and they are required to respond and participate in the survey. However, the contact between the respondent and the surveyor is second-hand, and is made only via a written questionnaire. With a telephone survey, while there is contact with an interviewer, it is not face-to-face. Intercept surveys are those surveys conducted outside the home and while the respondent is in the process of using a mode of transport (e.g. on a train) or of participating in an activity (e.g. shopping). They can be either personal interview or self-completion. Even with self-completion methods, however, the questionnaires are usually distributed by survey staff, meaning that there is some contact with the research group. With the personal interview survey, contact is more direct and involves face-to-face contact between the respondent and the interviewer. Under these circumstances, there is a far greater possibility for interaction (both beneficial and harmful to the purposes of the survey) between respondent and interviewer. While this interaction may allow more complex data to be collected, it also allows for a greater degree of bias to enter into the survey results. This interaction is compounded in the group survey which takes place with a group of people (often outside the home). In the in-depth survey, however, interaction is not just a by-product of the survey, but rather it is a design feature. In most cases, it is expected that there will be considerable interaction between the members of the group being surveyed, and that this interaction will yield much richer data than could be collected in one-on-one interviews with individual members of the household or other group.

3.2.1 Documentary Searches

A documentary search is simply a search of existing published and unpublished documents and databases in an attempt to uncover the type of information which is required in the survey. As noted in Section 2.3, a documentary search may either obviate the need for further data collection or else it may simply assist in the design of further data collection efforts. The use of documentary searches to generate items of information for secondary analysis is best illustrated by a number of documents which are regularly published containing collations of transport statistics. In the U.S. context, such documents would include the National Transportation Statistics Annual Report (USDOT, 1986), the Transportation Energy Data Book (Holcomb and Kosky, 1984), the National Urban Mass Transportation Statistics (Section 15 Data) (USDOT, 1982), the Transit Fact Book (APTA, 1985), and the Aircraft Operating Cost and Performance Report (USDOT, 1985). Further examples are Transport Indicators (BTCE, 1994) in Australia and, in the UK, the statistical publications of the UK Department of Transport (1993a) and the Central Statistical Office (1994).

These documents, which are only a sample of the wide range of documents covering various modes and geographic locations, provide much basic data which can at least provide order-of-magnitude estimates of many important parameters. In assembling data from such documentary searches, however, care must be taken to ensure that the conditions surrounding the initial collection of the data are known and allowed for. Thus, definitions and initial data collection procedures should be assembled as well as the results obtained from the initial surveys. In addition, it is often useful in interpretation of the assembled data, if the background of the originating organisation and the purpose of the initial data collection are also known. In this way, any potential biases in the data can more readily be recognised. Wigan (1985) and Thoresen (1983) provide some interesting comments on experiences with documentary searches in an Australian context, which should be of general relevance in other geographic circumstances.

Documentary searches may also be useful when attempting to obtain transport system inventory data. Thus, timetables and route maps for public transport systems may provide useful information for the construction of a public transport network model. Similarly, street maps of various scales will often provide enough information for the construction of a network model for a road system. Aerial photographs may be useful in establishing a sampling frame of houses in a study area for use in a household interview survey (although it should be noted that the aerial photographs will not be able to determine whether multiple households exist within a single residential building).

In some cases, the documents themselves become the unit of analysis in a survey. For example, in a study of transport legislation, the Acts, Bills and Regulations become the units of the survey population. Similarly, in a study of editorial opinion on transport matters (Olsson and Kemp, 1976), the unit of analysis was a documentary source: the newspapers containing the editorials. In other cases, the documents become the subject of the survey (rather than the units of analysis) as, for example, in the study by Cohen et al. (1981), which examined which transport journals were most frequently read by the transport profession (at least the United States transport profession).

3.2.2 Observational Surveys

Observational surveys, while being relatively infrequent in social science surveys, are commonplace in transport and, more particularly, traffic surveys. Two basic types of observational survey are the direct and the indirect observational survey. Examples of direct observational surveys include:

(a) Transport inventory surveys (e.g. using techniques such as video recording (Fahey and Chuck, 1992), digital imaging (Dickinson and Waterfall, 1984; Wigan and Cullinan, 1984), and instrumented vehicles (e.g. Bonsall and Parry, 1991));

(b) Traffic count surveys of different types (e.g. link counts, intersection counts, cordon counts, screen-line counts, transit route counts, boarding and alighting counts, etc.);

(c) System performance surveys (such as travel time surveys, intersection delay surveys (Richardson and Graham, 1982), and public transport performance surveys (Attanucci et al., 1981));



(d) Traveller tracing, either to check accuracy of reported trip characteristics (e.g. Hensher and McLeod, 1977), to obtain direct information on route choice characteristics (Wright, 1980), or to observe the speed and acceleration characteristics of vehicles in traffic systems (Akcelik, 1983);

(e) Vehicle classification surveys, in which vehicle types are identified by various means such as manual visual recognition, profiles from inductance loops, and pattern recognition from single tube axle counters (Bayley, 1983). In addition to identifying vehicle types, a greater emphasis is being placed on obtaining the weights of heavy vehicles by means of a wide variety of weighing-in-motion (WIM) devices.

A critical feature of survey types (b) through (e) is that it should be recognised that the system performance measures obtained on any one day are but single points from the distributions of the system performance measures. For example, travel time will vary from day to day for any given trip, as will public transport passenger waiting time which is dependent on the coordination of arrivals and departures of different modes of transport. In designing surveys of system performance, allowance must be made for the variability inherent in each of the parameters under observation.
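As a minimal sketch of what allowing for this variability can mean in practice, the following Python fragment summarises repeated observations of one performance measure rather than relying on a single day. The travel times are invented purely for illustration, and the variable names are assumptions.

```python
import statistics

# Hypothetical travel times (minutes) for the same trip on ten survey days.
travel_times = [21.5, 24.0, 19.8, 30.2, 22.1, 23.4, 27.9, 20.6, 25.3, 22.7]

mean_time = statistics.mean(travel_times)
# Sample standard deviation: the day-to-day spread of the measure.
std_dev = statistics.stdev(travel_times)
# Standard error: the uncertainty in the estimated mean itself.
std_error = std_dev / len(travel_times) ** 0.5

print(f"mean travel time:       {mean_time:.1f} min")
print(f"day-to-day std dev:     {std_dev:.1f} min")
print(f"standard error of mean: {std_error:.1f} min")
print(f"observed range:         {min(travel_times):.1f} to {max(travel_times):.1f} min")
```

A single day's observation could have fallen anywhere in the observed range, which is the point made above about single points drawn from a distribution.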

Examples of indirect observational surveys include:

(a) Wear patterns (caused by vehicles or pedestrians) which may indicate predominant traffic flows;

(b) Accident debris or skid marks to indicate hazardous sites in a road network; and

(c) Fuel sales, and other economic indicators, to estimate total activity in various transport sectors.

In general, indirect observational techniques are used less than direct observational techniques in transport surveys. This contrasts with the social science survey field where a large group of indirect observational survey techniques has been developed under the general title of "unobtrusive" survey techniques (Webb et al., 1966).

One of the major uses of observational surveys in transport planning has been to obtain data which has been used to check on the validity of results obtained from personal interview or self-completion surveys. Thus screen-line counts have been used to check on origin-destination predictions from a household survey. One should, however, realise certain factors when making such a comparison. First, the observational survey should account for variability in the system performance measures, as mentioned earlier. Second, it must be realised that many household surveys are conducted over a reasonably long period of time (e.g. six months). In such circumstances, it is somewhat difficult to decide just which traffic counts should be used for comparison. Third, it should be ensured that the same definitions are used in each type of survey (e.g. personal interview surveys would generally predict person-trips, whereas cordon counts would observe vehicle-trips). Finally, biases in either or both of the survey methods may need to be accounted for before the results of the surveys are compared (e.g. the under-reporting of particular types of trips in personal interview surveys).
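The definitional point in this comparison can be made concrete with a small sketch. It assumes, purely for illustration, that survey person-trips by car can be converted to vehicle-trips with a single mean occupancy figure and then compared with counts from several days; all numbers and function names are hypothetical, and the sketch ignores complications such as commercial vehicles, through traffic and under-reporting.

```python
def person_trips_to_vehicle_trips(person_trips, mean_occupancy):
    """Convert car person-trips to vehicle-trips (hypothetical helper)."""
    if mean_occupancy <= 0:
        raise ValueError("mean_occupancy must be positive")
    return person_trips / mean_occupancy


def within_observed_range(estimate, daily_counts):
    """Check whether an estimate lies inside the range of daily counts."""
    return min(daily_counts) <= estimate <= max(daily_counts)


# Hypothetical household survey estimate of car person-trips across a screen-line.
survey_person_trips = 50000
assumed_occupancy = 1.25  # assumed persons per vehicle, not from the text

# Hypothetical screen-line vehicle counts on five different days.
screenline_counts = [38500, 41200, 39800, 40600, 37900]

vehicle_trip_estimate = person_trips_to_vehicle_trips(
    survey_person_trips, assumed_occupancy
)
print(f"survey-based vehicle-trips: {vehicle_trip_estimate:.0f}")
print(f"consistent with counts: "
      f"{within_observed_range(vehicle_trip_estimate, screenline_counts)}")
```

Comparing against a range of counts rather than one day's figure reflects the first factor listed above; the occupancy conversion reflects the third.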

3.2.3 Household Self-Completion Surveys

Self-completion questionnaire surveys are one of the most widely used forms of survey technique in transport. Self-completion surveys are defined to be those which the respondent completes without the assistance of an interviewer. With this type of survey, respondents are required to perform three tasks on their own. These three tasks are to read and understand the question, to mentally formulate an answer, and to transcribe this answer onto the questionnaire form itself. In personal interview surveys, the respondent is wholly responsible only for the second of these three tasks; the interviewer is primarily responsible for the first and third.

Several types of basic survey format can be described, depending on the methods used for collection and distribution of the questionnaire forms. These variations include:

(a) Mail-out/mail-back surveys

This is the most basic form of the self-completion survey and the one that is most often employed. The questionnaire is mailed to the respondents and they are asked to return it by mail after they have answered the questions. The return postage is generally paid by the survey designer, although the precise method by which this postage is paid can have an effect on response rate as will be described in Chapter 7.

(b) Delivered to respondent/mailed-back

Where it is suspected that response rates may be particularly low, or where it is believed that respondents may need some personal instructions on how to complete the survey form, it may be desirable to personally deliver the questionnaire form to the respondent. In this way the purpose of the survey can be explained, any questions about the survey can be answered immediately, instructions can be given, and questions answered, about how to fill out the survey form. Such personal contact will generally increase the response and will also result in a higher quality of response to the questions.



(c) Delivered to respondent/collected from respondent

In addition to delivering the questionnaire form, it is also possible to collect the form from the respondent at some later date. This further increases the response rate by putting some pressure on the respondent to complete the survey before the collector returns, and also enables the collector to resolve any specific problems which the respondent encountered while filling out the questionnaire form. Naturally, however, the increased response obtained in methods (b) and (c) can only be obtained at considerable extra expense for the personal delivery and collection of the questionnaire forms. However, where a high response rate is essential, as in a National Census, then this method may be the most cost-effective way of obtaining these responses. This method is frequently used when "long-term" diaries (e.g. 7 day diaries) are distributed (e.g. in the National Travel Survey in the U.K. (UK Department of Transport, 1993b)).

Self-completion surveys have both advantages and disadvantages. The primary advantages of a self-completion survey include:

(a) Self-completion surveys are generally much less expensive than a comparable personal interview survey. Hitlin et al. (1987) estimate that a telephone interview survey was approximately three and one half times more expensive per completed response than a hand delivered and collected self-completion survey. A personal interview survey is even more expensive than a telephone interview survey (Ampt, 1989). Care needs to be taken, however, that this cost reduction is not a false economy. Self-completion surveys almost always have lower response rates than personal interview surveys, and hence the cost per returned questionnaire is much higher than the cost per distributed questionnaire. More important, the value of a returned questionnaire is a function of the overall response rate because of the effect of non-response bias. Thus, the value-for-money from self-completion surveys may be lower than it initially appears unless care is taken to minimise, or otherwise account for, non-response effects.

(b) A wide geographic coverage is possible in the sample, because postal charges do not generally vary as a function of the distance involved. It is usually just as expensive to send a questionnaire 1 kilometre in the mail as it is to send it 1000 kilometres. There is therefore no incentive to limit the sample to a local area (as there would be in a personal interview survey). The only restriction which needs to be considered is the time required for the mail to reach the more remote areas of the study area; this may be a critical factor in the design of a schedule of reminder letters.


(c) Because there is no interviewer involved in the survey, interviewer effects are eliminated as a possible source of response bias. The type of interviewer effects which can occur will be discussed in Section 3.2.6.

(d) The respondent has ample time to consider the questions before providing an answer, and hence it is possible to obtain considered responses to questions in a self-completion survey. It is also possible for a respondent to consult documents, if necessary, in order to provide factual information as an answer to a question (e.g. consult a log book to provide information about vehicle operating costs).

(e) The respondent can choose the time and place in which to complete the questionnaire (except for questionnaires which must be completed quickly, such as on-board a vehicle). Because of this, there is less incentive for respondents to hurry through the questionnaire in order to resume the activity they were engaged in when they received the questionnaire. This generally results in a more complete answering of the questions. The danger with allowing the respondents to defer answering the questionnaire is that they will completely forget about answering it.
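The cost caveat in advantage (a) rests on simple arithmetic: the cost per returned questionnaire is the cost per distributed questionnaire divided by the response rate. The sketch below illustrates this with a hypothetical unit cost; the function name and the cost figure are assumptions, and the calculation says nothing about non-response bias, which affects the value rather than the cost of each return.

```python
def cost_per_returned(cost_per_distributed, response_rate):
    """Cost per returned questionnaire for a given response rate (0 < rate <= 1)."""
    if not 0 < response_rate <= 1:
        raise ValueError("response_rate must be in the interval (0, 1]")
    return cost_per_distributed / response_rate


# Hypothetical cost of printing and mailing one questionnaire (currency units).
unit_cost = 5.0

for rate in (0.25, 0.50, 0.75):
    print(f"response rate {rate:.0%}: "
          f"cost per return = {cost_per_returned(unit_cost, rate):.2f}")
```

At a 25% response rate each usable return costs four times the distribution cost, which is why measures that raise the response rate can pay for themselves.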

While the above advantages are attractive features of a self-completion survey in some circumstances, the self-completion questionnaire survey is not without some considerable difficulties, such as:

(a) The most consistent criticism of self-completion surveys has been the high level of non-response. Reported response rates of between twenty and fifty percent are not uncommon, and this allows ample opportunity for serious biases (see Chapter 9) to enter the data (Brög and Meyburg, 1981). However, by incorporating rigorous design and validation measures, more recent self-completion travel surveys in Australia have achieved response rates of up to 75% (Richardson and Ampt, 1993a). The most reliable way to address the response rate is to incorporate a series of reminder letters and/or questionnaires into the survey design - a methodology which raises the cost of the survey method to some extent.

(b) The layout and wording of the questionnaire MUST be extremely clear and simple. Definitions must be clear and easily understood by the population being studied (not just to the survey designer), because there is no interviewer on hand to clarify the intent of the questions. The amount of time and effort which must be put into questionnaire design is therefore considerable.

(c) With a self-completion survey, it is difficult to ensure that the correct person fills out the questionnaire form. Even though the questionnaire may state quite specifically who should fill out the form, there is no guarantee that this occurs and there is no simple way of checking whether this is the case. In travel surveys, it is important that each person relates their own experiences to avoid the problems of under-reporting associated with proxy data (Ampt, 1981). It is therefore vital to incorporate validation measures which give some information on proxy reporting in the self-completion survey design.

(d) Responses from self-completion surveys tend to be skewed towards the more literate sectors of the population, which tend to travel differently from the remainder of the population. This means that rigorous follow-up procedures for non-respondents are required to ensure robust data.

(e) In general, only simple, one-stage questions can be asked, since questions which require complex filtering, which depends on answers given previously, usually need the skill of an interviewer.

(f) In self-completion surveys the answers on the questionnaire form must be accepted as final; there is often no chance to probe further to clarify ambiguous or unclear answers. For this reason it is important to plan and test the inclusion of ways of following up respondents. One way to do this is to ask for their phone number and phone if clarification is needed, and another is to follow up a sample of people with a personal interview (Section 7.2.2).

(g) Spontaneous answers cannot be obtained in a self-completion survey, so opinions given may not in fact be the respondent's own opinion at that time, but may be the result of discussion with others at a later time. For this reason, self-completion surveys are generally not the best instruments for conducting attitudinal surveys.

(h) A self-completion questionnaire offers no opportunity for the collection of supplementary information on such things as the respondent's residential environment or the respondent's attitude towards the survey.

(i) Answers to questions cannot be treated as independent since the respondent has the opportunity to scan the entire list of questions before answering any of them. For this reason, obvious cross-checks which are transparent to the respondent cannot be used very effectively.

In summary, self-completion questionnaires are characterised by the ease with which they can cover a wide geographical area, and by their moderate cost. They need to have substantial effort invested in their physical design and appearance. Finally, the desirability of using reminder letters or questionnaires means that the entire survey period needs to be considerably longer than that for personal interviews.

3.2.4 Telephone Surveys

The telephone survey has been used for many years in market research surveys outside the area of transport (Blankenship, 1977; Frey, 1983; Groves et al., 1988). In fact, in the United States it has become the most commonly used mode of conducting structured market research interviews (Frankel, 1989). It has been used to a much lesser extent in transport research, but there are a number of transport studies which report its use in one form or another (Stopher and Sheskin, 1982b; Hitlin et al., 1987; Stopher, 1992; Ampt, 1994).

The growth of telephone interviewing in the 1970s and early 1980s led to the setting up of centralised telephone interviewing installations for many surveys, a development which revolutionised telephone interviewing. Dedicated telephone interviewing facilities allow for stricter control and closer supervision of interviewer behaviour than is possible with from-home telephone surveys or with personal interviews (Morton-Williams, 1993).

The telephone survey method has a number of advantages which include:

(a) The telephone survey offers the possibility of wide geographic coverage - particularly in a given urban area where rates for phone calls frequently do not vary with distance. This is in significant contrast to the personal interview survey where cost factors usually mean that samples are clustered. And when compared with self-completion methods, although telephone rates are not as geographically uniform as postage rates, it is almost as easy and inexpensive to contact a remote site in a telephone survey as it is to contact an accessible site.

(b) Because telephone interviews are usually performed from a central location, it is possible to have much better supervision of the interviewers in order to maintain a higher level of quality control on the completed interviews.

(c) By centralising the interview facility, it is possible to use Computer Assisted Telephone Interviews (CATI). In this method, the interviewer reads the questions from the computer screen and then types the responses directly into the computer as they are received over the phone. This allows rapid checking of responses for consistency, and eliminates the task of transcribing the answers onto paper and then later entering these answers into the computer.



(d) Telephone surveys are generally cheaper than personal interview surveys because of the reduction in labour needed to conduct the survey and the absence of field and travel costs associated with having interviewers in the field. The cost advantage is more noticeable in the United States than in other countries, since there special telephone lines are available at concessionary rates and random sampling for face-to-face interviewing is relatively complex and expensive. In Britain, however, the situation is reversed; long distance telephone calls are relatively expensive and random sampling for telephone surveys is more problematic than for personal interview surveys (Morton-Williams, 1993, p. 156).

(e) Use of the telephone is ideal for validating and clarifying queries about responses people have made in self-completion surveys (Richardson and Ampt, 1994). Respondents are phoned after data enterers have identified "suspected" errors in their replies. This method achieves an almost 100% response rate (since people being phoned have already responded once to the survey) and has the advantage of significantly increasing the validity of results, with minimal additional cost.

(f) In conducting surveys in today's multilingual societies, dealing with people who do not speak the language of the interviewer occurs fairly often. Telephone surveys offer an effective method of dealing with these occurrences by having a central core of multilingual interviewers who can be called to the phone as required to complete an interview when language problems arise. This is much more efficient than trying to have a large number of multilingual interviewers who must perform interviews in the field (where their special talents are most often not needed).

(g) Because of the speed and low cost of contacting households, it is sometimes possible to use a telephone survey to identify rare populations (i.e. groups in the population who constitute a very small fraction of that population). By using an appropriate set of filter questions at the beginning of the interview, the interviewer can quickly identify whether the household belongs to the rare population or not.
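The consistency checking mentioned under advantage (c) can be illustrated with a minimal sketch. It is not taken from any particular CATI package; the `Trip` record, its field names and the three rules applied are assumptions chosen only to show the kind of check that can be run while the respondent is still on the phone.

```python
from dataclasses import dataclass


@dataclass
class Trip:
    """One reported trip (hypothetical record layout)."""
    depart_min: int    # departure time, minutes after midnight
    arrive_min: int    # arrival time, minutes after midnight
    origin: str
    destination: str


def consistency_problems(trips):
    """Return a list of plain-language problems in one respondent's trips."""
    problems = []
    for i, trip in enumerate(trips):
        # Rule 1: a trip cannot arrive before it departs.
        if trip.arrive_min < trip.depart_min:
            problems.append(f"trip {i + 1}: arrives before it departs")
        if i > 0:
            prev = trips[i - 1]
            # Rule 2: each trip should start where the previous one ended.
            if trip.origin != prev.destination:
                problems.append(
                    f"trip {i + 1}: starts at {trip.origin!r} but the "
                    f"previous trip ended at {prev.destination!r}"
                )
            # Rule 3: a trip cannot start before the previous one finished.
            if trip.depart_min < prev.arrive_min:
                problems.append(
                    f"trip {i + 1}: departs before the previous trip arrived"
                )
    return problems


if __name__ == "__main__":
    reported = [
        Trip(480, 510, "home", "work"),
        Trip(1020, 1000, "shop", "home"),   # two problems in this entry
    ]
    for problem in consistency_problems(reported):
        print(problem)
```

In an interview, any problem returned would prompt the interviewer to query the answer immediately rather than leaving it to be detected during later editing.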

The telephone survey method, however, has some potentially serious disadvantages, including:

(a) There is a limit on the length of survey which can be successfully completed over the phone. While some individuals will be willing to spend as much as 20 minutes or more being interviewed by telephone, the overall response rate drops rapidly after about 10 to 15 minutes (Stopher, 1985a).


(b) The number of people in a household with whom it is possible to carry out the interview is almost always limited to one. It is rarely possible to encourage more than one person to respond. As will be discussed in Section 7.4, proxy interviewing (one person reporting on behalf of another) is not recommended at all in travel surveys. This, combined with the limited length-tolerance for phone interviews, is a significant reason for not using telephone interviews for household travel surveys where all household members should be interviewed.

(c) With an increasing amount of direct marketing being done by means of the telephone (some of which is disguised as a sample survey), it is becoming more and more difficult for sample survey researchers to establish their credibility at the beginning of an interview. The general public is becoming rightfully wary of anyone who introduces themselves as conducting a survey. It is imperative that the survey designer identify a means whereby the credibility of the survey can be established before the telephone call is made. This can be by means of an introductory letter (although this can be difficult since the respondent's address will often not be known in advance if, for example, random digit dialling is used to select the sample) or by means of an introductory telephone call in which an appointment is made to call back at a more convenient time for the respondent. This latter method also has the possibility of reducing the intrusiveness of a telephone interview survey.

(d) Because only those households with phones can be included in a telephone survey, there is an obvious potential for sample bias to occur. The extent of this bias will depend on the extent of phone ownership in the population under consideration, and the extent to which those households with phones display differing characteristics to those households without phones or with unlisted numbers. Certainly it has been shown that non-phone owners and people with unlisted phone numbers have different socio-demographic characteristics from those with listed numbers: non-phone owners tend to be younger, in single-person households, less affluent, and more likely to be unemployed, low car owning or of ethnic background (Morton-Williams, 1993), while unlisted households are generally the reverse.

More important, however, is the extent to which non-phone owners differ from phone owners with respect to the variables under consideration (e.g. travel patterns). It has been demonstrated (Richardson, 1986) that such differences can occur, but that they can be corrected for by means of standard socio-demographic weighting techniques (to be described in Chapter 8). Nonetheless, it is essential to recognise the potential for such sample bias and to plan the survey design such that these corrections can be made at a later stage.

(e) The use of the telephone interview usually means that telephone books have been used to select the sample. In addition to the problem of non-phone owning households being excluded from the sample, there are other problems such as out-datedness associated with their use. These are elaborated in Section 4.3.

(f) Unlike other forms of survey, there is no chance of follow up for non-respondents in a telephone survey. If a respondent refuses to participate in the survey, then little or no background information on this respondent can be obtained. This makes it extremely difficult to correct for response bias at a later stage.

(g) Because of the nature of a telephone survey, no visual aids can be employed in such a survey. All communication to and from the respondent must be by means of the spoken word, and this results in severe limitations on how questions can be asked and answered. For example, it is not possible to use show cards to overcome the "order bias" which occurs when a list of possible responses is read to the respondent. Some possibilities for overcoming this problem include sending visual aid materials to the respondent before the telephone interview takes place. Like sending an introductory letter, however, this requires that an address is known for each respondent. A novel technique which does away with voice communication is the use of digital touch-phone signals which can be used to enable the respondent to reply to a pre-supplied list of questions.
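The socio-demographic weighting referred to under disadvantage (d) is treated in Chapter 8; the following is only a minimal sketch of one simple form of it (post-stratification to known population shares), and should not be read as the procedure used in any of the studies cited. The grouping by car ownership, the sample counts and the population shares are all hypothetical.

```python
def poststratification_weights(sample_counts, population_shares):
    """Weight for each group = population share / sample share.

    sample_counts:     respondents obtained in each group
    population_shares: each group's share of the target population
                       (e.g. from a census); should sum to 1
    """
    n = sum(sample_counts.values())
    weights = {}
    for group, count in sample_counts.items():
        sample_share = count / n
        weights[group] = population_shares[group] / sample_share
    return weights


if __name__ == "__main__":
    # Hypothetical telephone sample in which households without a car
    # (more often without a phone) are under-represented.
    sample_counts = {"no_car": 50, "one_car": 250, "two_plus_cars": 200}
    population_shares = {"no_car": 0.20, "one_car": 0.45, "two_plus_cars": 0.35}

    weights = poststratification_weights(sample_counts, population_shares)
    for group, weight in weights.items():
        print(f"{group}: weight {weight:.3f}")
```

Such a correction is only possible if the questionnaire collects the variables on which the weighting is based, which is why the survey must be designed with the correction in mind. It also assumes that, within each group, households reached by phone travel in the same way as those not reached.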

While telephone surveys are seen by some as an area with significant potential in the collection of transport survey data, we would recommend that they should be used with caution, especially for data which is not factual and straightforward.

Telephone surveys for gathering data on travel and activities can be summed up as follows. They are generally not useful:

• for surveys in which all persons in the household are to be interviewed, or

• for any collection of travel data where the household has not been contacted by letter or in person.

They are generally useful:


• to follow up queries arising from self-completion or personal interview surveys, and

• for commercial or business-based surveys.

For a good discussion of the advantages and pitfalls of telephone interviewing, see Groves et al. (1988).

3.2.5 Intercept Surveys

Intercept surveys are those surveys which take place at a site which is not in a household - where people are intercepted in the course of carrying out an activity of some type. They include surveys on-board public transport vehicles, at cordon points on roads, and at other activity sites such as shopping centres, work places or transport nodes such as airports. The surveys which are carried out at these places can have more or less interaction between surveyor and respondents, depending on the objectives of the survey and the location of the intercept. All intercept surveys, however, involve personal contact with respondents in one form or another - either to distribute questionnaire forms or to actually ask a series of questions.

The major types of intercept surveys are:

(a) On-board vehicle distribution/mail-back

In many cases, it is desired to conduct a survey of a particular group of transport system users, e.g. public transport patrons. To attempt to find these people by means of a general household survey would be almost impossible, because they represent such a small percentage of the total population. A more efficient method is to limit the population to include only those people, and to use a survey method which will identify only members of that population. On-board vehicle surveys are an effective means of conducting such surveys. In on-board surveys, surveyors ride on-board the vehicle and distribute questionnaire forms to the passengers on the vehicle. The passengers may then be required to fill out the questionnaire forms at their convenience, and then return them via the mail. A comprehensive description of such surveys is given in Stopher (1985). They have the advantage of being moderately cheap, but the disadvantage of generating low response rates, since it is not possible to encourage or remind people to respond in any way.

(b) On-board vehicle distribution/on-board vehicle collection

As an alternative to method (a), it may be possible to collect the completed questionnaire forms before the respondents leave the vehicle. For some modes, such as in-flight surveys, this poses no particular problems since there will generally be ample time for the passenger to complete the survey before the end of the trip. They may even welcome the survey as something constructive to do to pass the time on the flight. A very successful way of distribution/collection on planes is to distribute questionnaires to all persons on entry and to have surveyors at the destination airport to collect them from disembarking passengers (Ampt Applied Research, 1988).

It should be noted that these types of survey on public transport vehicles may pose considerable practical difficulties. For example, not everyone is seated in the vehicle, not everyone has a writing instrument, and not everyone has sufficient time to complete the questionnaire form before the end of their trip. Nonetheless, such surveys are often used, and it is then a matter of careful design and administration to ensure that the survey can be successfully completed under the particular conditions expected to be encountered. Pilot surveys are always critical to the successful execution of this type of survey.

(c) On-board distribution/collection plus mail-back

In some studies, hybrid on-board surveys, which combine elements of both methods (a) and (b), have been used successfully (Sheskin and Stopher, 1982a; Hensher, 1991). The method involves using a two-part questionnaire form. The first part is the more usual postcard style, clearly marked for filling out and return on the bus. The second part is a more lengthy form to be taken away by the bus traveller, filled out later in the day, and mailed back. This method allows for considerably more information to be obtained than can be acquired from the standard on-board distribution/mail-back method.

The robustness of the results of the intercept surveys which are carried out on-board a vehicle and described here in (a) - (c) depends very heavily on the rigour of the sampling method used to select the vehicles, routes and people which are included in the sample survey. For a discussion of issues related to sampling for on-board surveys, see Section 4.4.4.

(d) Roadside distribution/mail-back surveys

Where the mode of transport under consideration is the private car, then the method of distribution to pin-point those users is often the roadside questionnaire survey. In this survey method, questionnaire forms are distributed to motorists as they pass by a particular point, or set of points, on the road. To enable the questionnaire forms to be distributed, it is necessary that the motorist be stationary at that point. This can be achieved in one of two ways; either the motorist can be stopped at a natural feature of the roadway (such as a traffic signal or a toll-booth), or else the motorist can be deliberately stopped by the survey team (preferably with the assistance of local police officers). After motorists are stopped, they are given a questionnaire form and a brief explanation of the purpose of the survey. Respondents are then asked to fill out the questionnaire form at their convenience and return it by mail. These surveys have many similarities with the on-board vehicle distribution/mail-back surveys in (a) above and are mostly differentiated by the longer introduction time given to the surveyor. Descriptions of these types of surveys may be found in Richardson and Young (1981) and Richardson et al. (1980).

The small amount of research done to validate surveys such as these suggests that there can be some significant biases using this method. For example, an analysis of the 1961 London Travel Survey (LCC 1963) compared results of intercept interviews of passengers using London rail termini with results from mail-back surveys handed out at the same sites, and concluded that the mail-back survey over-represented regular commuters to central London. Similarly, Sammer and Fallast (1985) reported on a mail-back survey near Salzburg which showed that local (Austrian) residents were over-represented relative to foreign (German) residents. In both cases it was hypothesised that the over-represented groups had a greater stake in the journeys they were making because they participated in them more frequently, and therefore had a greater incentive to respond.

(e) Intercept interviews

Sometimes intercept surveys involve personal interviews with the drivers of vehicles or travellers as they are stopped at the intercept point. In these cases, the respondents are stopped by an interviewer who asks them a series of questions - most usually about origin, destination and times of travel, with some details on socio-demographic status. The presence of an interviewer generally ensures a much higher response rate than for methods which involve mailing back a postcard.

Some recent work in Shipley in England presented an opportunity to test the difference between road-side or on-vehicle distribution/mail-back and intercept interviews (Bonsall and McKimm, 1993). It was found that response rates varied from 60% (at a morning peak site where a prize draw was on offer) to 33% (at an all-day site with no prizes) with responses being as high as 62% for the journey to work and as low as 29% for the journey to employer's business. In addition, people who made the trip more frequently tended to respond better as did males. Although roadside and on-vehicle distribution surveys have frequently been used to define trip matrices, trip frequency distributions and trip purpose, there is clearly a fairly high risk in doing this without carrying out verification surveys such as intercept interviews where response rates are much higher.
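The size of the distortion implied by such differential response can be seen in a small worked sketch. Only the two response rates (62% and 29%) come from the Shipley results quoted above; the numbers of forms distributed are hypothetical and are set equal purely to make the effect easy to see.

```python
def returned_shares(distributed, response_rate):
    """Share of each trip purpose among the forms actually returned."""
    returned = {p: distributed[p] * response_rate[p] for p in distributed}
    total = sum(returned.values())
    return {p: count / total for p, count in returned.items()}


def expansion_weights(response_rate):
    """Inverse-response-rate weights that restore the distributed mix."""
    return {p: 1.0 / rate for p, rate in response_rate.items()}


if __name__ == "__main__":
    # Hypothetical: equal numbers of forms handed to each trip purpose.
    distributed = {"work": 1000, "employers_business": 1000}
    # Response rates by purpose as quoted in the text.
    response_rate = {"work": 0.62, "employers_business": 0.29}

    for purpose, share in returned_shares(distributed, response_rate).items():
        print(f"{purpose}: {share:.1%} of returned forms")

    for purpose, weight in expansion_weights(response_rate).items():
        print(f"{purpose}: expansion weight {weight:.2f}")
```

Under these assumptions, journeys to work make up about 68% of the returned forms although they were only half of the trips intercepted. The weights can correct this only if the response rate for each purpose is known, and that in turn requires information on non-respondents of the kind obtained from verification interviews.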

(f) Activity centre distribution/mail-back or interview

This type of survey is similar to the above intercept methods in that the distribution of the questionnaire or administration of the interview is designed to capture the population of interest at a natural point of congregation (e.g. shopping centres, airports). If the activity centre is of a type that repeat visits by the population could be expected on a regular basis (such as work or school), then a mail-back could be wholly or partly replaced by having respondents deposit the completed forms at some strategic locations at the activity centre.

The biggest challenge in activity centre surveys is choosing a sample which is representative of all people visiting the centre. This is a particularly vexing issue in public areas of airports or shopping centres where there is no "funnel-mechanism" to allow distribution of questionnaires or interview of all people or of a random sample of people. The usual sampling approach to surveys in these activity centres is to use an uncontrolled quota sample (Section 4.4.7), i.e. to survey a certain number of people of given types (e.g. certain age groups, gender, destination) without knowing the proportion of these types which exist in the population (of people who visit the activity centre). These types of surveys (using uncontrolled quota sampling) have three major problems:

• there is no knowledge of the population from which the sample is drawn;

• there is no rigid sampling rule applied in the selection of respondents; and

• there is no information collected on those people approached who do not respond.

If each of these problems is addressed, surveys at activity centres can be as reliable as other intercept surveys.

A characteristic which is common to all intercept surveys is the inability (or limited ability) of the researcher to follow up non-respondents. As already noted in the discussion on self-completion surveys, and as will be seen in that on personal interview surveys, one of the key factors which ensures a high degree of reliability of the data is the ability to gain information on non-respondents, and thereby to weight the data accordingly. Consideration should be given to this limitation of the methods at the time the survey is being designed. In many cases it may be possible to set up a small control survey-within-the-survey where either a follow-up procedure is implemented (e.g. using car registration numbers at roadsides, or ticket numbers on planes) or where a small part of the survey is conducted with a close to 100% response rate (using personal interview with multiple interviewers).

3.2.6 Household Personal Interview Surveys

A personal interview survey is defined as one in which an interviewer is present to record the responses provided by the respondent in answer to a series of questions posed by the interviewer. For this reason, many of the intercept surveys discussed in Section 3.2.5 as well as the telephone surveys outlined in Section 3.2.4 could also readily fall under this category. Our discussion in this section is, however, limited to personal interviews which take place in the home. Personal interview surveys have long been associated with transport, with home interview surveys providing the major means of data collection for the transport studies of the 1960s, 70s and 80s.

A household personal interview survey may be chosen, in preference to a self-completion survey, for any of several reasons:

(a) In general, higher response rates may be obtained from personal interview surveys than from self-completion surveys. Response rates of the order of 75% to 85% are not uncommon. This tends to minimise the effects of non-response bias, although it does not completely eliminate this bias as will be discussed in Chapter 9.

(b) The personal interview survey allows for considerable flexibility in the type of information collected. Attitudes, opinions, open-ended verbal answers and other non-quantitative information are much more readily collected in a personal interview survey than in a questionnaire survey. Complex sequence guides or filters can be used if required, since interviewers (unlike respondents) are always given training prior to the commencement of the survey.

(c) The presence of an interviewer means that explanations can be given regarding the meaning of questions or the method in which answers are to be given. As will be explained later (Section 7.3.2), the interviewer must generally adhere to a fixed set of questions, some of which need to be asked verbatim, but explanations of the meaning of questions are generally permitted so long as they do not influence the answer by the respondent. In a travel survey, this is particularly important in relaying information to respondents about the level of detail required for reporting trip and activity behaviour.



(d) Personal interview travel surveys can be carried out over a much shorter time period than self-completion surveys which need up to 6 weeks elapsed time to incorporate sufficient reminder notices into the survey procedure (see Section 7.2.1).

(e) Since many surveys can be quite long, an interviewer can be effective in maintaining respondent interest and in ensuring that the full set of questions is completed.

(f) By noting the interest of the respondent in the survey and the way in which the questions (especially attitudinal questions) are completed, the interviewer can make a valuable assessment of the validity of the recorded answers.

(g) The interview situation is valuable where it is desired to obtain spontaneous answers from a particular individual. Thus interview surveys are particularly suited, perhaps even essential, for attitude surveys.

While being particularly effective in several aspects of transport data collection, personal interview surveys are not without their own distinct disadvantages, including:

(a) Personal interview surveys are relatively expensive. Typically they would be three to ten times more expensive per returned questionnaire than a self-completion survey (this of course depends on the quality of the self-completion survey). The high cost of personal interview surveys is primarily due to the high labour content of interview surveys. A comparison of costs between personal interview and self-completion surveys appears in Ampt and Richardson (1994).

(b) In order to reduce travel expenses and interviewer lost-time, many household-based personal interview surveys make use of clustering of households or survey sites on a geographic basis. This causes the "effective sample size" to be reduced with consequent reductions in the accuracy of estimates from the data.

(c) The interview situation is basically a human interaction between an interviewer and a respondent. Such interactions are rarely, if ever, completely neutral. The resulting interaction (often termed interviewer bias) may affect each participant (and the data which is collected) in various ways including:

(i) The personal characteristics of the interviewer (e.g. age, sex, nationality, general appearance) may influence the answers because of the impression made by the interviewer on the respondent (and vice versa).


(ii) Respondents are being asked to disrupt their normal routine in order to answer the interview questions. If such a disruption is inconvenient, this may distort the answers given by respondents (if, for example, they wish to return to their normal routine as soon as possible).

(iii) An interviewer with strong opinions may subconsciously communicate these opinions to the respondent by the way in which the questions are asked or the answers are received. In some circumstances, some respondents may agree, or disagree, with statements depending on their perception of the temperament of the interviewer.

(iv) Answers to questions early in the interview allow the interviewer to build up a picture of the respondent's general behaviour and attitudes. The interviewer may then interpret later answers (especially vague answers) to be consistent with this picture even though the respondent may have intended to give a contradictory or apparently inconsistent answer. Alternatively, the interviewer may construct a picture of the type of person being interviewed (based, for example, on socio-economic characteristics of the respondent) and then interpret vague answers to fit within an idea of the expected response from that type of person.

(d) Personal interview surveys are not suited for situations where questions require a considered response, or where factual information is required which is not immediately available. The time delay involved in obtaining such responses is either a waste of the interviewer's time, or else the respondent feels embarrassed at making the interviewer wait for the answer.

In summary, personal interview surveys are best for attitude surveys, for surveys where the concepts are complex or where there is a complex series of sequencing required. They are more expensive than their self-completion counterparts and have to be designed thoroughly to minimise interviewer bias, but their high response rates and their ability to be carried out within a relatively short time interval make them ideal in cases where high quality data is required within a medium time frame.
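The reduction in "effective sample size" noted under disadvantage (b) can be given a rough order of magnitude. The sketch below uses a widely quoted approximation for equal-sized clusters, in which the design effect is 1 + (m - 1)ρ, where m is the number of interviews per cluster and ρ is the intra-cluster correlation of the variable being estimated. It is offered as an illustration only and is not necessarily the formulation developed in Section 4.7.2; the sample size, cluster size and correlation used are hypothetical.

```python
def design_effect(cluster_size, intraclass_corr):
    """Approximate design effect for equal-sized clusters: 1 + (m - 1) * rho."""
    return 1.0 + (cluster_size - 1) * intraclass_corr


def effective_sample_size(n, cluster_size, intraclass_corr):
    """Size of the simple random sample giving about the same precision."""
    return n / design_effect(cluster_size, intraclass_corr)


if __name__ == "__main__":
    n = 1000    # households interviewed (hypothetical)
    m = 10      # households interviewed per cluster (hypothetical)
    rho = 0.05  # intra-cluster correlation (hypothetical)

    print(f"design effect: {design_effect(m, rho):.2f}")
    print(f"effective sample size: {effective_sample_size(n, m, rho):.0f}")
```

With these values the design effect is 1.45, so 1000 clustered interviews provide roughly the precision of 690 interviews drawn by simple random sampling. The saving in interviewer travel therefore has to be weighed against this loss.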

3.2.7 Group Surveys

In the personal interview surveys described above, attention was focussed on the role of interviewer effects in the survey procedure. Interaction between individuals is inherent in many forms of survey procedure which attempt to record human behaviour, and in most surveys our intention is to minimise this interaction in an attempt to produce results which are unaffected by the presence of an interviewer (Collins, 1978). The survey techniques outlined in this and the following section, however, take a substantially different view of these interaction effects. Instead of seeing interaction as being a totally negative phenomenon, they utilise the interaction between interviewer and respondent, and more particularly, the interaction between respondents, to enable the collection of a data base which is much richer in terms of its ability to explain the dynamics of travel and activity patterns.

A wide range of interactive and/or group survey techniques is available for use. Because the output of these surveys is unlikely to be a hard set of statistics, but rather a better understanding of the problem at hand, such survey methods have often been referred to as "qualitative" survey methods (Gordon and Langmaid, 1988). Basically, there are two types of qualitative methodologies: the group discussion and the in-depth interview (Section 3.2.8). This section deals with group discussions, which are also often known as focus groups.

The basic concept of group discussions is that a small number of people (usually between seven and nine) who are specially recruited according to a predetermined set of criteria, exchange experiences, attitudes and beliefs about a particular issue. Depending on the survey objectives, the criteria may be that the group be similar (for example, they may all be public transport users, or they may all live in a certain area) or dissimilar (for example to include professional drivers, regular drivers and those who do not have a driver's licence). The discussion is carried out under the guidance of a trained facilitator who:

• directs the flow of the discussion over areas that are important to the purposes of the survey;

• recognises important points and encourages the group to explore these and elaborate upon them;

• observes all non-verbal communication between respondents, or between respondents and the facilitator, or between respondents and the issue being discussed;

• creates an atmosphere that allows respondents to relax and lower some of their defences;

• synthesises the understanding gained with the problems and objectives of the survey; and

• tests out hypotheses generated by the information gained so far.


Group discussions are usually tape-recorded to ensure that there is an accurate record of the interactions and to relieve the facilitator of the responsibility of taking notes during the session. They are sometimes also video-recorded to allow analysis of non-verbal, as well as verbal, reactions after the discussion.

These discussions have several advantages which contribute to their usefulness in such situations, including:

• the group environment with 'everyone in the same boat' is less intimidating than a one-on-one in-depth interview;

• one respondent's experiences or feelings tend to trigger reactions from other respondents, whereby ideas which had lain dormant in the second respondent are now brought to the surface. It is, therefore, a good vehicle for creative expression from all respondents;

• the process highlights the differences between respondents (especially if respondents have deliberately been chosen with different backgrounds and experiences), thus making it possible to observe a range of attitudes and behaviours in a relatively short time;

• groups can be observed (by people other than the facilitator), thus making it particularly useful for professional staff of a transport agency who can experience respondents' vocabulary, attitudes and reactions first-hand;

• spontaneity of response is encouraged in a group setting, yielding insights which may not be available from a one-on-one interview; and

• by careful selection of members of the group, the social and cultural influences on attitudes and behaviour are highlighted.

Group discussions are not, however, without their disadvantages, including:

• group processes may inhibit the frank exchange of attitudes and beliefs, especially from minority members of the group, and may lead to unrealistic and excessive recounting of behaviour;

• the group may react negatively to the facilitator, the subject matter or the discussion environment, and may freeze up;

• the strong personality of one respondent may overawe the other respondents who either withdraw or simply agree; and


• the group may lose perspective on the real issue, by getting too close to the problem and by discussing something in great depth which, in reality, may be more of an instinctive reaction.

For the above and other reasons, group discussions by themselves are not the most appropriate survey technique under the following circumstances:

• when statistically significant data is needed;

• where the subject matter is of a personal nature or where it involves personal finances;

• where social norms strongly predominate, creating extra pressure for conformity within the group;

• where a detailed understanding of a process is involved, and where it is unrealistic to expect all in the group to have the required level of understanding;

• where personal opinions on the subject matter are highly varied, and where the group becomes too heterogeneous to obtain useful information which could be generalised outside of the group; and

• where difficulties are experienced in recruiting the target sample (e.g. members of minority groups with specific transport system usage patterns).

Within the overall structure of group discussions, there are a number of variations. Mini-group discussions consist of 4-6 respondents instead of the usual 8-9, and are useful when focussed information is required quickly (e.g. immediate reactions to a new public transport marketing campaign), and where it may be difficult to form a larger discussion group. While normal group discussions may last from 1-2 hours, extended discussions may last from 3-4 hours (usually with some form of break). Extended discussions are particularly useful when you wish to use a variety of methods which involve the respondents in actually performing some task, which will provide you with more information about their real attitudes and beliefs. This is particularly the case where the respondents need more time to become comfortable with each other, or where there may be some time required to explain the tasks to them and allow them some time to familiarise themselves with the tasks. Reconvened group discussions, as the name implies, involve the group meeting on more than one occasion. These discussions are particularly useful when the respondents are asked to engage in a particular activity between the two meeting times (e.g. when car drivers are asked to try various forms of public transport). This provides the researcher with the opportunity to obtain immediate reactions to such activities in a controlled environment.

Brainstorming is a particular form of group discussion in which the respondents are asked to suspend evaluation of any ideas proposed by others in the group (or by themselves) and to simply concentrate on thinking about innovative solutions to the problem proposed by the facilitator. While these sessions sometimes appear to be virtually out of control, they require a very disciplined approach from the facilitator to ensure that the respondents behave in this fashion. The facilitator needs to be specially trained in techniques to aid creativity such as various methods of lateral thinking, thinking by analogy, fantasy solutions, word associations and imagery. Role-playing is sometimes used to help direct respondents in this case. For example, a respondent who is a professional driver may be asked to take on the role of a person who always walks and be asked to comment on a proposal to reduce pedestrian access in order to favour cars.

In summary, group discussions are very useful for exploring issues for which the researcher has no clear-cut expectations about what will be found, but where there may be some hypotheses which need to be tested in a fairly informal setting. They are, therefore, ideal for pre-pilot testing of questionnaires, and for exploring the jargon used by the population on a particular survey topic. They are not appropriate for providing statistically significant data.

3.2.8 In-Depth Interviews

The rise in popularity of in-depth interviews in transport studies follows directly from viewing travel as a "derived demand"; that is, we almost always travel in order to carry out an activity outside the home - not for the sake of travel itself. Because of this, a full understanding of travel behaviour can only be achieved by viewing travel as but one of a series of activities which are carried out in time and space (Jones, et al., 1983; Carpenter and Jones, 1983).

In-depth interviews are defined as those which are orientated to penetrating below the superficial question-and-answer format of structured or semi-structured personal interviews, and in which attention is paid to building rapport in order to facilitate the expression of sincere beliefs and attitudes. The interviews usually last for an hour or more, and are tape-recorded (in general, no written notes are taken in the interview, so that the interviewer can concentrate on what the respondent is saying).

3.2.8.1 Individual In-Depth Interviews

In-depth interviews usually take place with a group of people (most commonly all people in a household) but they can also take place with just one person. Particular benefits of the individual in-depth interview are:


• longitudinal information can be gathered on one respondent at a time (e.g. travel patterns or expenditure information);

• problems with dominance and intra-group rivalry are absent, enabling both majority and minority opinions to be expressed;

• personal material can be discussed, especially with an experienced interviewer (e.g. values which may be considered socially unacceptable, such as a love of V-8 engines).

3.2.8.2 Interactive Group Interviews

The most common form of in-depth interview in transport studies is the interactive group interview. These interviews usually take place in households where all members of the household take part in the interview. Jones (1985) provides an excellent comprehensive description of these techniques.

As noted by Jones, the interactive group surveys developed in transportation (of which Jones (1979a) and Brög and Erl (1980) are the best known) share three common features: interaction between participants, use of visual aids as a structuring device, and the development of gaming simulation techniques. With respect to interaction between participants, the interactive group survey is different, compared to a conventional structured personal interview travel survey, in that:

(a) interaction in the interview is exploited rather than suppressed;

(b) interviews are tape recorded, rather than recorded in written form during the interview, because the interviewer would not have enough time to conduct the interview and record the responses;

(c) interviewers are highly skilled and require a detailed understanding of the problem being studied;

(d) there is no formal questionnaire, although the interviewer may refer to a topic list during the interview;

(e) questions are framed during the course of the interview, in response to previous discussion; and

(f) probes and supplementary questions do not need to be phrased neutrally, as they would be in a structured interview; in some circumstances, the interviewer may act as a "devil's advocate" in order to elicit more detailed responses from the respondents.

Visual aids have an important role in a successful interactive survey methodology by introducing some element of structure into the discussion. In particular they may provide

- an aid to comprehension;
- a (passive) prompt;
- an aide-mémoire;
- a check on the feasibility of responses;
- a means of dealing with complex issues;
- a device for obtaining quantifiable data; and
- a means of contributing to a more relaxed interview environment.

As an example of the use of visual aids in an interactive survey, consider the interactive survey described by Jones (1979a, 1980) as the Household Activity-Travel Simulator (HATS). This technique employs a display board, known as a HATS-board, as a central focus for the interactive survey (see Figure 3.2). Travel or activity diaries which have been completed by members of the household for a specified period before the interview are translated into a physical representation of a given day by means of coloured blocks along three parallel axes to indicate at-home activities, away-from-home activities and connecting travel activities. The spatial arrangement of these activities and trips is also shown on the HATS-board map. Household members are then given the chance to describe their activity-boards and to discuss possible interactions between the activity-boards of several members of the household. The HATS-board provides the focus for these discussions, by fulfilling the functions listed above.

The third element of interactive surveys is the use of a gaming simulation approach within the interview. The use of these "what-if" techniques enables the analyst to investigate the range of likely adaptations to a variety of policy options. For example, in the HATS survey, after the group members have finished discussing their current travel and activity patterns, they are asked to respond to various changes in policies which may affect either their activity or travel patterns. The responses are shown by rearrangement of the blocks on the HATS-board and/or rearrangement of the spatial patterns on the map. Problems which arise as a result of these responses immediately become visible on the activity-boards and map, in terms of gaps between blocks, overlapping blocks, no travel block linking at-home and away-from-home activities, and inconsistencies between activity-boards for various members of the household.


[Figure not reproduced. Labelled elements: a map of the area with coloured markers to show the location of activities; time-of-day axes (marked 7 to 11) for non-home activities, home activities and travel, with activities represented by colour-coded blocks.]

Figure 3.2 A Completed HATS Display Board (Source: Jones, 1985)

These inconsistencies result in individual adaptations being tested by one member of the household and then re-checked with other members' adaptations. In Monopoly-game fashion, the members of the household engage in conversation and trade-offs to find feasible household adaptations to the policy changes. In the process, numerous infeasible adaptations are suggested and rejected. After a final adaptation is adopted by all members of the group, the HATS-board displays are coded onto new activity diaries and these, together with a tape recording of the discussion, form the basis for future analysis.
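The kinds of inconsistency that the HATS-board makes visible (gaps between blocks, overlapping blocks, and home and away-from-home activities with no travel block between them) can be illustrated with a small sketch. This is not part of the HATS technique as published; the `Block` record, the `check_day` function and the 7:00 to 23:00 day window (taken from the time axis of Figure 3.2) are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """One coloured block on a hypothetical activity-board."""
    kind: str     # "home", "away" or "travel"
    start: float  # hours since midnight
    end: float


def check_day(blocks, day_start=7.0, day_end=23.0):
    """Return a list of (problem, from_hour, to_hour) tuples for one person."""
    problems = []
    ordered = sorted(blocks, key=lambda b: b.start)
    cursor = day_start   # latest time covered so far
    previous = None
    for block in ordered:
        if block.start > cursor:
            problems.append(("gap", cursor, block.start))
        elif previous is not None and block.start < cursor:
            problems.append(("overlap", block.start, min(cursor, block.end)))
        # A home block next to an away block means a travel block is missing.
        if previous is not None and {previous.kind, block.kind} == {"home", "away"}:
            problems.append(("no travel link", previous.end, block.start))
        cursor = max(cursor, block.end)
        previous = block
    if cursor < day_end:
        problems.append(("gap", cursor, day_end))
    return problems


# Hypothetical day: the trip home after work has been left off the board.
day = [
    Block("home", 7, 8.5),
    Block("travel", 8.5, 9),
    Block("away", 9, 17),
    Block("home", 17.5, 23),
]
for problem in check_day(day):
    print(problem)
```

In an actual interview these problems are spotted visually and resolved through discussion; the sketch only shows that the same checks are simple to state once diaries have been coded as timed blocks.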

The HATS technique can be most effectively used to investigate the adaptation to forced changes (e.g. curtailment of a public transport route), but may also be used to investigate the effects of a permissive change (e.g. providing a new activity at an away-from-home location) which allows, but does not necessitate, a change to be made. In the latter case, greater attention needs to be paid to the feasibility of any suggested adaptations.

The HATS technique, or variations of it, has been used in a variety of situations including the assessment of the attitudes of bus drivers towards a variety of patterns of shift work (Bottom and Jones, 1982), the reactions of households to various policy measures aimed at reducing transportation energy consumption (Phifer et al., 1980), the identification of transportation needs of disabled people (Faulkner, 1982), and possible reactions to congestion management strategies (Ampt and Jones, 1992).

A slightly different approach to the use of simulation in interactive surveys has been adopted by Brög and Erl (1980). In their methods, they adopt the concept of the "situational approach" which assumes that an individual's travel patterns are not simply a function of that individual's preferences, but also depend on the social and environmental situation in which that individual exists. Once again, they use the idea of an in-depth survey in which an interactive "what-if" simulation is employed, but they present the results of the simulation in terms of a hierarchical decision-tree which illustrates the effects of situational constraints on that individual. An example of such a decision-tree is given in Figure 3.3, which demonstrates the potential for usage of a new transit route in Melbourne, Australia (Ampt et al., 1985). It can be seen that although a large proportion of the population under study had the new mode objectively available to them (25% in Reservoir and 75% in Kingsbury), most of these potential users were subsequently eliminated from consideration by a variety of situational and perceptual filters. This method of presenting the results of an interactive group survey has been found to be highly effective in explaining the results of such a survey to a non-technical client. The method has been used in a range of applications, including studies of the potential of bicycle travel (Brög and Otto, 1981), long distance personal travel (Brög, 1982), and urban public transport (Brög and Erl, 1981).


[Figure not reproduced. The two decision trees show the following percentages of the study population (100 in each suburb) answering No or Yes at each successive filter:]

                           Suburb of Reservoir    Suburb of Kingsbury
Filter                     No       Yes           No       Yes
Objective Option           73.8     26.2          24.6     75.4
Constraints                 1.6     24.6          13.6     61.8
Information                12.8     11.8          38.2     23.6
Time                        0.7     11.1           5.6     18.0
Cost                        0.0     11.1           0.0     18.0
Comfort                     2.3      8.8           6.2     11.8
Subjective Willingness      0.3      8.5           3.0      8.8
Final Potential Market               8.5                    8.8

Figure 3.3 Decision Trees in the Situational Approach Survey (Source: Ampt, et al., 1985)
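The arithmetic behind a decision tree of this kind is a chain of subtractions: each filter removes some share of the study population, and what remains is passed to the next filter. The short sketch below reproduces that chain, assuming the values in Figure 3.3 are read as percentage points of the whole study population lost at each filter. The function name `apply_filters` and the data layout are illustrative assumptions, not part of the situational approach itself.

```python
def apply_filters(available, losses):
    """Subtract each filter's loss (in percentage points of the whole
    study population) in turn; return the share remaining after each."""
    remaining = available
    trail = []
    for name, lost in losses:
        remaining = round(remaining - lost, 1)
        trail.append((name, remaining))
    return trail


# Values as read from Figure 3.3 (percentage of the study population).
reservoir = apply_filters(26.2, [
    ("constraints", 1.6),
    ("information", 12.8),
    ("time", 0.7),
    ("cost", 0.0),
    ("comfort", 2.3),
    ("subjective willingness", 0.3),
])
kingsbury = apply_filters(75.4, [
    ("constraints", 13.6),
    ("information", 38.2),
    ("time", 5.6),
    ("cost", 0.0),
    ("comfort", 6.2),
    ("subjective willingness", 3.0),
])

for name, share in reservoir:
    print(f"Reservoir after {name}: {share}%")
print("Final potential market:", reservoir[-1][1], "and", kingsbury[-1][1])
```

Read this way, the two suburbs end with similar final potential markets even though the new mode was objectively available to about three times as many people in Kingsbury, with lack of information being the largest single filter in both.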

While interactive survey techniques have been found to possess a number of distinct advantages when attempting to understand travel behaviour, they do have a number of limitations, including:

(a) The interactive survey method is expensive. The cost per interview may be three to four times that of a structured personal interview survey;

(b) Interactive surveys must be carried out by experienced interviewers who have a good knowledge of the subject matter of the survey. Often it is the researcher or principal investigator who must conduct the interview. This is necessary because only a person in this position will have enough detailed understanding of the topic to be able to follow up avenues of thought which arise spontaneously in the interview and which appear to be most productive. Thus, as well as being costly, interactive surveys, because of their limited manpower base, can also be very time consuming;

(c) Because of their time and cost requirements it follows that, within a limited budget, only a relatively small number of interactive surveys may be completed for any one study. This may pose problems when attempting to make generalisations from such a small sample; and

(d) Interactive surveys do not yet provide data in a form that is amenable to the construction of detailed mathematical models of travel behaviour or traffic flows. While progress has been made in this direction (e.g. the CARLA simulation model developed at Oxford University (Clarke, 1984)), the end-product of interactive surveys is not the production of seemingly precise estimates of traffic flows or transit patronage. Rather, they seek to provide a better basic understanding of travel behaviour.

On the other hand, there are many advantages to these methods. Although there are some cases where interactive methods may form the main approach to studying a transport problem or issue, most applications of this method lie in using it in the context of a more conventional, quantitative travel study. Jones (1985) suggests that there are three broad roles which interactive techniques can perform in this context, as shown in Figure 3.4.

The first of these is exploratory. Before the main study is begun, interactive measurement can be used as an aid to define transport problems and/or policy issues, and to help formulate the appropriate quantitative methodology. This can include content and wording of the questionnaires, the hypothesis formation or the provision of guidelines for appropriate modelling structure. This is probably the best developed and most frequently used role of the interactive interview.


Figure 3.4 The Roles of Interactive Interviews (Source: Jones, 1985)

The second role for interactive surveys is investigative. Once a conventional travel survey has been completed, the analysis may raise a number of issues which cannot be answered by the data in a statistical way. This occurs often when the survey concentrated on what was happening in the study area, not why it happened - as is frequently the case in travel surveys. Interactive interviews can be employed in these situations to explore certain issues in greater depth with respondents, or as a means of communicating findings and their implications to decision makers.

Finally, interactive surveys may be used as investigative tools. Rather than being confined to a role in preparation or conclusion of a survey, they may actually be used as a component of the main study phase (e.g. Ampt and Jones, 1992), as one means of investigating the subject matter. This is often achieved by using in-depth interviews or group discussions alongside structured surveys as a complementary means of obtaining information about behaviour. In some cases the association may be even closer, with interactive methods being used to inform model development through the use of gaming simulation to guide model inputs, structures and forecasts (e.g. Brög and Zumkeller, 1983).

3.3 SUMMARY OF SURVEY METHOD SELECTION

The final choice of survey method will depend upon a matching of the characteristics of the individual survey methods, as outlined above, with the objectives of the survey. This will be tempered by the resources available for the conduct of the survey. Stopher and Banister (1985) summarise the issues to be faced in the selection of a survey method and Table 3.2 builds on this to include the survey types discussed in this chapter.

Table 3.2 Uses of Each Survey Method

                                                  Data Type
Survey Type                     Factual Travel   Factual Demographics   Attitudes, Opinions
Documentary searches            Yes              Yes                    Yes
Observational surveys           Yes              No                     No
Household self-completion       Yes              Yes                    Limited
Telephone surveys
    Household                   No               Yes                    Limited
    Individual                  Yes              Yes                    Limited
    Validation                  Yes              Yes                    Yes
Intercept surveys               Limited          Yes                    No
Household personal interview    Yes              Yes                    Yes
Group surveys                   Limited          Limited                Yes
In-depth surveys                Limited          Limited                Yes

In line with the "Total Design" concept proposed by Dillman (1978), several authors have suggested the need for dual or multiple survey mechanisms for the collection of data. In such a process, a battery of survey methods would be used to collect data, with each method being used in its most appropriate context. For example, as noted earlier, Sheskin and Stopher (1980) used a combined on-board questionnaire survey and a take-home questionnaire survey for gathering data on public transport users. Stopher (1985a) suggests the combination of personal interviews with mail-back questionnaire surveys, and telephone surveys with mail-out/mail-back surveys (Stopher, 1982). Shaw and Richardson (1987) report on the combination of an activity centre interview with a mail-back questionnaire survey. Brög and Erl (1980) point to the need to use a variety of methods within an overall study design and offer the following general guidelines:

(a) Structured questionnaires are appropriate for obtaining socio-demographic data at a person and household level;

(b) Diaries are the best means of obtaining activity and travel information (otherwise respondents tend to recall typical behaviour rather than provide an accurate record for the survey period);

(c) Land-use and transport system information is best obtained from official statistics or by carrying out an inventory;

(d) Face-to-face, in-depth unstructured interviews are appropriate for exploring perceptions and attitudes;

(e) Group discussions are a better means of obtaining information about household organisation, decision rules, etc. than direct questioning, since household members are often unaware of their own organisational structures; and

(f) Simulation techniques are a useful means of exploring response to change, as they make explicit the constraints, options and decision rules which contribute to the observed adaptation.

In general, after reading this chapter, we think it becomes clear that the selection of the survey method can only sensibly take place after a careful period of preliminary planning to ensure that the survey method chosen is best suited to measuring what is needed to fulfil the study and survey objectives. Furthermore, we hope it also becomes clear that the selection of a survey method is vital to ensuring that the highest quality data is collected in the most cost effective way.



4. Sampling Procedures

The selection of a proper sample is an obvious prerequisite to a sample survey. A sample is defined to be a collection of units which is some part of a larger population and which is specially selected to represent the whole population. Four aspects of this definition are of particular importance: first, what are the units which comprise the sample; second, what is the population which the sample seeks to represent; third, how large should the sample be; and fourth, how is the sample to be selected?

4.1 TARGET POPULATION DEFINITION

The target population is the complete group about which one would like to collect information. The elements of this group may be people, households, vehicles, geographical areas or any other discrete units. The definition of the target population will, in many cases, follow directly from the objectives of the survey. Even so, there may be several problems to be resolved in the definition of the population.

To outline the types of questions which may need to be resolved in the definition of a survey population, consider a survey which had the objective of determining the travel patterns of elderly, low-income residents living in a particular area of Melbourne, Australia (Richardson, 1980). The objective of the study for which the survey was undertaken was to identify the degree to which this group was transport-disadvantaged and to suggest means by which they might be provided with an improved level of transport service. To define the population for this survey, several issues had to be resolved. First, a definition of "elderly" had to be adopted. With no particular theoretical justification, it was decided to define elderly as being equivalent to being "of retirement age". At the time of the survey, this meant 60 years of age for women and 65 years of age for men. Second, "low-income" also had to be defined. Whilst many income and/or wealth indicators could theoretically have been used for this purpose, there appeared to be no easily identifiable level of income which would serve as a definition of "low-income". In fact, the practical definition of "low-income" used in this survey was not resolved until later in the survey process when the selection of a sampling frame was being made. Third, the geographical extent of the study area had to be defined. Since the study was initiated and supported by the Sherbrooke Shire Council, it was easily resolved that the study should concentrate on residents of the Sherbrooke Shire. However, given that, there was still some debate as to whether the survey should cover the entire Shire or whether it should concentrate on a number of relatively inaccessible areas within the Shire. Mainly because very little was known of elderly travel patterns in the area at the time, it was decided to survey the entire Shire to build up a general picture of elderly travel patterns. The survey population thus defined was low-income (yet to be defined) residents of retirement age living in the Shire of Sherbrooke.

4.2 SAMPLING UNITS

The survey population is composed of individual elements. Thus, continuing with the example of the Shire of Sherbrooke survey, the elements of the population were the individual elderly residents. However, the selection of a sample from this population was based on the selection of sampling units from the population. Sampling units may or may not be the same as elements of the population; in many cases, they are aggregations of elements.

Thus in the Sherbrooke survey, it was decided to select elderly households rather than individual elderly residents. This decision was made for three reasons. First, it was thought that elderly travel patterns and needs were best addressed by considering the total travel of the elderly household. Thus, even though an individual may not travel greatly, the needs of that individual may be serviced if another member of that household was able to travel on their behalf, at least for needs like shopping and personal business. Second, it was considered that where an elderly household consisted of two elderly individuals it would be difficult to restrict the survey to just one of them without the other feeling left out of things. Third, it was considered that the productivity of the interviewers would be increased by having them interview all elderly residents in one household rather than just one selected individual in each household.

Given the definition of the sampling unit as being an elderly household, it was therefore necessary to define an elderly household. The adopted definition of an elderly household was one in which at least one elderly individual (as previously defined) resided on a permanent basis.
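Definitions of this kind are easiest to apply consistently when they are written down as explicit rules. The sketch below restates the two definitions used in the Sherbrooke survey (retirement age of 60 for women and 65 for men, and at least one permanently resident elderly individual per household) as simple predicates. The `Resident` record and the function names are hypothetical, and the "low-income" criterion is deliberately left out because it had not yet been defined at this stage of the survey.

```python
from dataclasses import dataclass

# Retirement ages that applied at the time of the survey, as stated in the text.
RETIREMENT_AGE = {"female": 60, "male": 65}


@dataclass
class Resident:
    """Hypothetical record for one person living at a dwelling."""
    age: int
    sex: str              # "female" or "male"
    permanent: bool = True


def is_elderly(resident):
    """An element of the population: a person of retirement age."""
    return resident.age >= RETIREMENT_AGE[resident.sex]


def is_elderly_household(residents):
    """A sampling unit: at least one elderly person resides permanently."""
    return any(r.permanent and is_elderly(r) for r in residents)


# Hypothetical household: a 62-year-old man and a 61-year-old woman.
household = [Resident(62, "male"), Resident(61, "female")]
print(is_elderly_household(household))  # the woman qualifies, the man does not
```

Note that the element (the elderly individual) and the sampling unit (the elderly household) are tested by different functions, which mirrors the distinction drawn above between elements of the population and sampling units.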

In more general situations, sampling units may typically include such entities as:

(a) Individuals
(b) Households
(c) Companies
(d) Geographic regions (zones, cities, states, nations)
(e) Vehicles
(f) Intersections or road links
(g) Other features of the transport network

4.3 SAMPLING FRAME

Having identified the desired survey population and selected a sampling unit, it is necessary to obtain a sampling frame from which to draw the sample. A sampling frame is a base list or reference which properly identifies every sampling unit in the survey population. Clearly, the sampling frame should contain all, or nearly all, the sampling units in the survey population.

Depending on the population and sampling units being used, some examples of sampling frames which could be used for various transport surveys include:

(a) Electoral rolls
(b) Block lists (lists of all dwellings on residential blocks)
(c) Lists by utility companies (e.g. electricity service connections)
(d) Telephone directories
(e) Mailing lists
(f) Local area maps
(g) Census lists (if available)
(h) Society membership lists
(i) Motor vehicle registrations

Each of these sampling frames, however, suffers from one or more of the deficiencies outlined below:

(a) Inaccuracy
All sampling frames will generally contain inaccuracies of one sort or another. Lists of individuals will contain mis-spelt names and incorrect addresses. Maps will often have incorrect boundary definitions, will not be true to scale and will have streets and other features which simply do not exist.

(b) Incompleteness
As well as having incorrect entries, sampling frames may simply not have some valid entries at all. For example, electoral rolls may not have the names of individuals who have recently moved into an area or of non-citizens of the country; temporary workers may not be included on a list of company employees; off-road recreational vehicles may not be on motor vehicle registration records; some telephone numbers are unlisted.

(c) Duplication
Entries may also be duplicated on some sampling frames. For example, telephone directories may list individuals and companies more than once under slightly different titles; an individual may appear on the electoral roll for two different areas soon after they have moved from one area to another; an individual may appear on a Local Government electoral roll more than once, if they own land in more than one ward of the Local Government Area.

(d) Inadequacy
A sampling frame is said to be inadequate if it simply does not provide a close correspondence with the desired survey population, but has been adopted for the sake of convenience.

(e) Out-of-date
Whilst a sampling frame may once have been adequate, accurate, complete and with no duplications, this situation may not last forever. Conditions change and, with these changes, sampling frames go out-of-date. Thus telephone directories commonly do not include people who have moved in the previous 18 months; electoral rolls are usually up-to-date only at election time; lists of all sorts are compiled for special purposes but are not kept current after the immediate need for the list has passed.

In many cases, the reason for the deficiencies listed above is that the list which one wishes to use as a sampling frame has been compiled for a completely different reason. Thus electoral rolls only need to be current at the time of elections; telephone directories only list those who have telephones; maps are compiled for specific purposes to an acceptable degree of accuracy for that purpose. While lists from utility companies are probably the most up-to-date, they too can rapidly become out-of-date as people move and houses become vacant. Before adopting a list as a sampling frame, it is wise to ascertain the reasons why the list was initially compiled, the criteria for inclusion on the list and the methods used for up-dating of the list. Having said this, it should be noted that the acceptability of lists for sampling frames will depend entirely on the scope and purpose of the survey to be conducted. If the deficiencies in a list will not affect the results of the survey, then such a list may be entirely satisfactory as a sampling frame for that survey. The question of acceptability revolves entirely around whether the deficiencies introduce bias into the survey results.
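Where a candidate list is available in electronic form, two of the deficiencies described above (duplication and incompleteness) can be given a rough first check before the list is adopted. The sketch below is only an illustration under stated assumptions: the `audit_frame` and `normalise` functions are hypothetical, the addresses are invented, and the "field check" list stands for a small set of sampling units known independently to exist (for example, from a walk around a few blocks). Matching is exact after simple normalisation, so entries listed under genuinely different titles would not be detected.

```python
from collections import Counter


def normalise(entry):
    """Lower-case an entry and strip punctuation and extra spaces."""
    cleaned = entry.lower().replace(".", "").replace(",", " ")
    return " ".join(cleaned.split())


def audit_frame(frame, field_check):
    """Report duplicated frame entries and field-check units missing from the frame."""
    keys = [normalise(entry) for entry in frame]
    counts = Counter(keys)
    duplicates = sorted(key for key, n in counts.items() if n > 1)
    known = {normalise(entry) for entry in field_check}
    missing = sorted(known - set(keys))
    coverage = 1 - len(missing) / len(known) if known else None
    return {"duplicates": duplicates, "missing": missing, "coverage": coverage}


# Invented addresses, for illustration only.
frame = ["12 Hill Rd, Northtown", "12 hill rd northtown", "3 Fern St, Easton"]
field_check = ["3 Fern St, Easton", "40 Ridge Ave, Westville"]

report = audit_frame(frame, field_check)
print(report)
```

A low coverage figure from such a check does not by itself rule out a frame; as noted above, what matters is whether the units that are missing or duplicated differ from the rest in ways that would bias the survey results.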

To illustrate the restrictions imposed, and the opportunities offered, by the availability of certain sampling frames, consider the sampling frame used in the Sherbrooke Shire elderly travel survey (Richardson, 1980). It will be remembered that the desired population was all low-income, elderly residents living in the Shire of Sherbrooke at the time of the survey (April 1979). The question, then, was where could one find a list of such elderly residents. Since elderly had been defined as "of retirement age" and since all retired people were eligible for old-age pensions, the obvious source was the Department of Social Security pension mailing-lists for the Shire of Sherbrooke. However, enquiries revealed that confidentiality requirements would deny accessibility to individual names and addresses on that list. Even if such names and addresses were available, however, the problem of defining "low-income" would still have remained. Another sampling frame was therefore sought.

Several possibilities were suggested but each had attendant deficiencies. Two proposals were given serious consideration. The first entailed the use of an informal network of contacts established through Senior Citizens Clubs in the area and through the Shire's community services activities (such as Meals-on-Wheels). It was considered that, by taking the names of those directly involved in these activities and then asking them for the names of other elderly residents whom they knew to live in the area, a reasonable sampling frame would be established. This method, however, had a number of problems. First, the establishment of the network would require considerable time and effort; second, the completeness of the sampling frame would be unknown, but would probably be biased towards those who already found it possible to travel to the Senior Citizens Clubs in the area and hence were less likely to be transport-disadvantaged; third, the accuracy of the addresses supplied for other elderly residents was open to question.


The second proposal entailed the use of a list compiled by the Shire Council detailing all those pensioners who had applied for a rebate of the land rates paid to Council in the previous year. Such rate rebates were means-tested and hence this provided a definition of "low-income" which had so far proved elusive. This sampling frame was, however, not without its own problems. First, the list contained all pensioners who had received rate rebates and not just old-age pensioners; it would therefore be necessary to separate the old-age pensioners at a later stage. Second, the list obtained was for the previous year and hence might have been slightly out-of-date. Third, and most importantly, because of its very nature the list contained names of only those pensioners who owned a house (or land) in the Shire. Thus low-income elderly residents who rented property would not be on the list. Since it might be expected that renters would have lower incomes than owners, this omission could be of some significance. To the extent that renters comprised a portion of the total elderly resident population, the sampling frame would be in error. Despite this shortcoming, this sampling frame was used in the study since time and resources prevented the establishment of a more comprehensive sampling frame. Nonetheless, the results of the study had to be interpreted with reference to the sampling frame used.

In the example cited above and in many other surveys, especially of relatively rare populations, the establishment of a sampling frame can be a major problem. In many cases, the sampling frame available may largely define the population and the sampling method. As a result, the detailed survey planning must often await the identification of the available sampling frames. If no adequate sampling frames can be found, then it may be necessary to conduct a preliminary survey with a view to establishing a suitable sampling frame. Alternatively, the survey can be designed using a larger than required sampling frame and using filter questions at the beginning of each questionnaire (or interview) to eliminate non-relevant sampling units from the survey.

4.4 SAMPLING METHODS

The object of sampling is to obtain a small sample from an entire population such that the sample is representative of the entire population. It is therefore important that the sample is drawn with care so that it is indeed representative. The need for sampling is based on the realisation that in transport studies we are often dealing with very large populations. To attempt to survey all members of these populations would be impossible. For example, the 1981 Sydney Travel Survey took approximately 12 months to collect, code and edit data on the travel patterns of 20,000 households. To attempt a 100% sample survey on the same annual budget would take over 50 years! Sample surveys are also used because not only is it often not possible to collect data on all members of a population, but it is also not necessary. As will be seen later, quite accurate estimates of population characteristics can be obtained from relatively small samples.

The accuracy of sample parameter estimates, however, is totally dependent on the sampling being performed in an acceptable fashion. Almost always, the only acceptable sampling methods are based on some form of random sampling. Random sampling entails the selection of units from a population by chance methods such as flipping coins, rolling a die (not two dice), using tables of random numbers or through the use of pseudo-random numbers generated by recursive mathematical equations.

The essence of pure random sampling is that sampling of each unit is performed independently and that each unit in the population has an equal probability of being selected in the sample (at the start of sampling).

There are many types of sampling methods, each of which is based on the random sampling principle. The most frequently encountered methods are:

(a) Simple random sampling
(b) Stratified random sampling
(c) Variable fraction stratified random sampling
(d) Multi-stage sampling
(e) Cluster sampling
(f) Systematic sampling

In addition, there are a number of other sampling methods which, while used in transport surveys, are not based on random sampling and are therefore not highly recommended. These methods include quota sampling and expert sampling. This section will describe each of the above sampling methods, indicating their strengths and weaknesses.

4.4.1 Simple Random Sampling

Simple random sampling is the simplest of all random sampling methods and is the basis of all other random sampling techniques. In this method, each unit in the population is assigned an identification number and then these numbers are sampled at random to obtain the sample. For example, consider a population of 100 sampling units, as depicted by the 100 asterisks in Figure 4.1. The task is to select a random sample of 10 sampling units from this population.


* * * * * * * * * *

* * * * * * * * * *

* * * * * * * * * *

* * * * * * * * * *

* * * * * * * * * *

* * * * * * * * * *

* * * * * * * * * *

* * * * * * * * * *

* * * * * * * * * *

* * * * * * * * * *

Figure 4.1 A Population of 100 Sampling Units

The first step in simple random sampling is to name each of the sampling units. This is often done by assigning a unique identification number to each of the sampling units, even if they already have unique identifying names. Thus the population of 100 sampling units now appears as shown in Figure 4.2.

00 01 02 03 04 05 06 07 08 09

10 11 12 13 14 15 16 17 18 19

20 21 22 23 24 25 26 27 28 29

30 31 32 33 34 35 36 37 38 39

40 41 42 43 44 45 46 47 48 49

50 51 52 53 54 55 56 57 58 59

60 61 62 63 64 65 66 67 68 69

70 71 72 73 74 75 76 77 78 79

80 81 82 83 84 85 86 87 88 89

90 91 92 93 94 95 96 97 98 99

Figure 4.2 A Population of 100 Sampling Units with Identifiers


Using random number selection methods (to be described later in Section 4.8), a set of ten random numbers is now selected, and the sampling units corresponding to these numbers are included in the sample, as shown by the bold numbers in Figure 4.3.

00 01 02 03 04 05 06 07 08 09

10 11 12 13 14 15 16 17 18 19

20 21 22 23 24 25 26 27 28 29

30 31 32 33 34 35 36 37 38 39

40 41 42 43 44 45 46 47 48 49

50 51 52 53 54 55 56 57 58 59

60 61 62 63 64 65 66 67 68 69

70 71 72 73 74 75 76 77 78 79

80 81 82 83 84 85 86 87 88 89

90 91 92 93 94 95 96 97 98 99

Figure 4.3 A Simple Random Sample of 10 Sampling Units

The question arises in this method as to whether sampling should be performed with replacement, or without replacement. The usual practice is to sample without replacement such that each unit can be included in the sample only once; this is particularly so when dealing with individuals or households which are being sampled for an interview survey. It is possible, however, to sample with replacement and simply include the results from those sampling units selected more than once as many times as they are selected. That is, if a household is selected twice, then you do not interview the household twice but merely include the results from this household twice in the data set. Notwithstanding the above, conventional practice in transport surveys is to sample without replacement.
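The difference between the two practices can be sketched in a few lines of Python. This is only an illustration, assuming the 100 two-digit identifiers of Figure 4.2; the function name `simple_random_sample` and the fixed seed are hypothetical choices made for the example, not part of the procedure described in Section 4.8.

```python
import random

def simple_random_sample(unit_ids, n, replace=False, seed=None):
    """Draw n units by chance from a list of identifiers (illustrative helper)."""
    rng = random.Random(seed)
    if replace:
        # With replacement: a unit may be drawn more than once; its results
        # would then be counted once for each time it was drawn.
        return [rng.choice(unit_ids) for _ in range(n)]
    # Without replacement: each unit can appear in the sample only once.
    return rng.sample(unit_ids, n)

population = [f"{i:02d}" for i in range(100)]   # identifiers 00 to 99
sample = simple_random_sample(population, 10, seed=1)
sample_wr = simple_random_sample(population, 10, replace=True, seed=1)
print(sorted(sample))
```

Sampling without replacement always yields ten distinct units, whereas the with-replacement draw may contain repeats.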

A variation on the random sampling method which is sometimes used in transport surveys is to select a number and then let the sampling units themselves introduce the randomness. For example, in licence plate travel time surveys, the usual practice is to record arrival and departure times of all vehicles whose licence plate ends in a certain digit (for a 10% sample) or digits (for higher sampling rates). Since it would be expected that the occurrence and time of arrival of such vehicles would be random, the resulting estimates of traffic flow and travel time should be unbiased.
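As a rough sketch of the licence plate rule, the filter below keeps only plates ending in the chosen digit or digits. The plate strings and the function name `in_plate_sample` are invented for illustration, and the 10% rate assumes that final digits are evenly spread across the vehicle fleet.

```python
def in_plate_sample(plate, target_digits=("7",)):
    """Return True if the plate's final character is one of the chosen digits."""
    return plate[-1] in target_digits

# Hypothetical plates observed at a survey station.
observed = ["ABC123", "XYZ457", "KLM907", "QRS550", "TUV337"]
matched = [p for p in observed if in_plate_sample(p)]
print(matched)  # ['XYZ457', 'KLM907', 'TUV337']
```

Passing two digits, for example `target_digits=("3", "7")`, would correspond to a nominal 20% sample.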


Depending on the sample size, simple random sampling often gives highly variable results from repeated applications of the method. It would therefore be desirable if this variability could be reduced whilst still maintaining the characteristics of a random sample. In other situations, the cost of simple random sampling would be excessive. Several improvements on simple random sampling have therefore been developed.

4.4.2 Stratified Random Sampling

The sample of ten sampling units identified in Figure 4.3 may well be a good representation of the 100 sampling units in the population. However, if we have some prior information about the population, it may be clear that this is not the case. For example, assume that the sampling frame depicted in Figure 4.2 comes from a list of employees in a company and that, for some other reason, the employees are listed by gender such that the first 40 employees are female and the second 60 employees are male, as shown in Figure 4.4.

00 01 02 03 04 05 06 07 08 09

10 11 12 13 14 15 16 17 18 19

20 21 22 23 24 25 26 27 28 29 Female

30 31 32 33 34 35 36 37 38 39

40 41 42 43 44 45 46 47 48 49

50 51 52 53 54 55 56 57 58 59 Male

60 61 62 63 64 65 66 67 68 69

70 71 72 73 74 75 76 77 78 79

80 81 82 83 84 85 86 87 88 89

90 91 92 93 94 95 96 97 98 99

Figure 4.4 A Simple Random Sample from a Stratified Population

It is clear from Figure 4.4 that we have inadvertently over-sampled females (selecting 5 out of 40) and under-sampled males (selecting 5 out of 60). As a result, any inferences drawn from this sample will be biased towards the behaviour or attitudes of females because they are over-represented in the sample compared to their representation in the population.

To overcome this problem, stratified random sampling makes use of prior information to subdivide the population into strata of sampling units such that the units within each stratum are as homogeneous as possible with respect to the stratifying variable. Each stratum is then sampled at random using the same sampling fraction for each stratum. When the same sampling fraction is used in each stratum, this method is sometimes called proportionate stratified sampling. The resulting sample will then have the correct proportion of each stratum within the whole population, and one source of error will have been eliminated. For example, if the males and females in Figure 4.4 were each sampled at a rate of 10% then the sample shown in Figure 4.5, where sampling unit 55 is substituted for sampling unit 33, would be a more representative sample than that shown in Figure 4.4.

00 01 02 03 04 05 06 07 08 09

10 11 12 13 14 15 16 17 18 19

20 21 22 23 24 25 26 27 28 29 Female

30 31 32 33 34 35 36 37 38 39

40 41 42 43 44 45 46 47 48 49

50 51 52 53 54 55 56 57 58 59 Male

60 61 62 63 64 65 66 67 68 69

70 71 72 73 74 75 76 77 78 79

80 81 82 83 84 85 86 87 88 89

90 91 92 93 94 95 96 97 98 99

Figure 4.5 A Stratified Random Sample from a Stratified Population

While the stratified sample in Figure 4.5 is obviously more representative than that in Figure 4.4 (in that males and females are represented in their correct proportions), the question remains as to whether the stratified sample is still a random sample of the population. This question can be answered by reference to the two criteria for random sampling noted earlier in this chapter. That is, a sample is random if:

• each unit is sampled independently; and

• each unit in the population has an equal probability of being selected in the sample (at the start of the sampling process).

With respect to stratified random sampling, it is clear that the second condition holds, in that all males and females each have the same chance of selection (i.e. 10%) at the start of the process. With respect to the first condition, within each stratum each unit is selected independently because simple random sampling is being employed within each stratum. Therefore, given that each stratum is sampled independently of the other, the addition of two independent random samples will produce a third random sample.
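The procedure can be sketched as follows, using the 40 female and 60 male employees of Figure 4.4 and a 10% sampling fraction. The helper name `proportionate_stratified_sample` is hypothetical, and the sketch assumes that the stratum size multiplied by the sampling fraction rounds sensibly to a whole number.

```python
import random

def proportionate_stratified_sample(strata, fraction, seed=None):
    """Apply the same sampling fraction independently within each stratum."""
    rng = random.Random(seed)
    sample = {}
    for name, units in strata.items():
        n = round(len(units) * fraction)      # stratum sample size
        sample[name] = rng.sample(units, n)   # simple random sample within stratum
    return sample

strata = {
    "female": list(range(0, 40)),    # units 00 to 39
    "male": list(range(40, 100)),    # units 40 to 99
}
sample = proportionate_stratified_sample(strata, 0.10, seed=1)
print({name: len(units) for name, units in sample.items()})
# {'female': 4, 'male': 6}
```

Whatever the random draw, the sample always contains four females and six males, which is the property illustrated in Figure 4.5.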

To use stratified sampling, it is necessary that some prior information about the population is available before sampling takes place. The prior information should also relate to the variables which are to be measured in the survey. For example, if one were attempting to measure trip generation rates in a survey, then stratification on the basis of car ownership would be more useful than stratification on the basis of the day of the week on which the respondent was born (assuming both data sets were available prior to sampling). Whilst the latter stratification would ensure that we got the correct number of people born on each day in our sample, we would not expect this to improve our estimate of trip generation rates. On the other hand, by having the correct average car ownership in our sample, rather than (by chance) too high or too low an estimate of car ownership, we would expect a better estimate of trip generation rate.

Therefore stratified sampling requires that we have some prior information about each unit in our population which is relevant to the objectives of the survey.

Whilst stratified sampling is useful, in general, to ensure that the correct proportions of each stratum are obtained in the sample, it becomes doubly important when there are some relatively small sub-groups within the population. With simple random sampling, it would be possible to completely miss out on sampling members of small sub-groups. Stratified random sampling at least ensures that some members of these rare population sub-groups are sampled (assuming these sub-groups were used as strata and that the product of the sub-group population size and the sampling rate produces a number greater than one).

A final advantage claimed for stratified sampling is that it allows different survey methods to be used for each of the strata. An example given by Stopher and Meyburg (1979), concerning stratification on the basis of access distance to a rail station, suggests that while the strata with shorter access distances may be surveyed by a postal questionnaire, the stratum with the longest access distance should be surveyed by personal interview on the basis that they are less likely to be transit-oriented travellers. Whilst such a variation in survey method is possible, care should be taken when comparing, or combining, the results for the different strata because of the different biases built into each of the survey methods.

A variation on stratified sampling is the use of multiple stratifications. Thus, instead of stratifying with respect to only one variable, the stratification can be performed with respect to several variables thus creating an n-dimensional matrix of stratification cells. In selecting the number of dimensions for stratification and the number of strata within each dimension, attention should be paid to the total number of stratification cells produced. Since the number of cells increases geometrically with the number of dimensions or strata, it is possible to produce a large number of cells inadvertently. In such a case, the average number of units in the sample within each cell could be quite small (perhaps fractional). Under these conditions, the necessary round-off errors in drawing a sample could defeat the purpose of stratification, unless carefully controlled.
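A quick calculation shows how rapidly the cell count grows. The figures used here (a sample of 200 units stratified on four variables with 4, 5, 3 and 7 strata respectively) are hypothetical and serve only to illustrate the arithmetic.

```python
from math import prod

def average_units_per_cell(sample_size, strata_per_dimension):
    """Return the number of stratification cells and the mean sample size per cell."""
    cells = prod(strata_per_dimension)   # cells multiply across dimensions
    return cells, sample_size / cells

cells, avg = average_units_per_cell(200, [4, 5, 3, 7])
print(cells, round(avg, 2))  # 420 0.48
```

With fewer than one sampled unit per cell on average, most cells would have to be rounded to zero or one unit, which is the round-off problem noted above.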

The method by which stratification is conducted will depend to a large extent on the structure of the sampling frame to be used. In some sampling frames, the stratification may have already been performed in the compilation of the sampling frame list. For example, students at a University may already be categorised by Faculty. In such a case, an unrestricted random sample is conducted separately within each of the stratified lists. In other sampling frames, the list ordering may be completely random but it may be known how many sampling units belong in each stratum and, therefore, how many in the sample should come from each stratum. In this case, a random sample may be drawn from the entire list and, upon selection, each unit is placed in its correct stratum. When the required quota for each stratum has been sampled, further selections for that stratum are rejected and another selection is made.

Finally, it should be noted that the concept of stratification can also be used after the data have been collected by means of a simple random sample survey. In such a case, the survey results can be adjusted so that each stratum is represented in the correct proportion. Such weighted "expansion" (see Section 9.1) is frequently performed when the required stratification information does not become available until after the survey has been performed (e.g. when travel surveys are performed in Census years). However, it should be noted that such a procedure is strictly valid only when there is a sufficiently large sample size within each of the strata to enable reasonable confidence to be held in each of the strata results.
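One simple way of making such an adjustment, sketched here under the assumption that the population share of each stratum is known, is to weight each stratum by the ratio of its population share to its sample share. The example reuses the unstratified sample of Figure 4.4 (5 females and 5 males from a population that is 40% female and 60% male); the function name is illustrative, and the procedures of Section 9.1 may differ in detail.

```python
def post_stratification_weights(population_shares, sample_counts):
    """Weight = population share of the stratum / sample share of the stratum."""
    n = sum(sample_counts.values())
    return {
        stratum: population_shares[stratum] / (count / n)
        for stratum, count in sample_counts.items()
    }

weights = post_stratification_weights(
    {"female": 0.4, "male": 0.6},   # shares in the population
    {"female": 5, "male": 5},       # counts obtained in the sample
)
print(weights)
```

Here each female response would be weighted by about 0.8 and each male response by about 1.2, so that the weighted sample reproduces the 40:60 split of the population.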

4.4.3 Variable Fraction Stratified Random Sampling

The above discussion of stratified random sampling has implicitly assumed that within each stratum, the same sampling fraction will be used. Whilst this may often be the case, an added advantage of stratified sampling is that it allows different sampling fractions to be used in each stratum. Such variable fraction sampling may be desirable in three distinct situations.

First, as will be shown later, the accuracy of results obtained from a sample depends on the absolute size of the sample, not on the fraction of the population included in the sample. In some populations, stratification on the basis of a particular parameter may result in a highly variable number of sampling units in each stratum. If a constant sampling fraction were used for each stratum, one would obtain highly variable sample sizes in each of the strata, and hence highly variable degrees of accuracy in each stratum, all other things being equal. To obtain equal accuracy in each of the strata, it would be necessary to use different sampling fractions in each stratum so that approximately equal sample sizes were obtained for each stratum.

The second factor which affects the accuracy of a parameter estimate obtained from a sample is the variability of that parameter within the population; higher variability parameters require higher sample sizes for a specified degree of accuracy. If the sampling units within different strata exhibit different degrees of variability with respect to a parameter of interest, then it would be necessary to use high sampling fractions for strata with high variability if equal degrees of accuracy were to be obtained for each stratum.

The third reason for choosing variable fraction sampling is more pragmatic than theoretical. The costs of sampling and/or collecting the data may vary across the different strata. In such a case, a trade-off is necessary between the anticipated costs of reduced accuracy and the known costs of obtaining the data. It may therefore be desirable to reduce the sampling fraction in those strata where data are more expensive to obtain.

Therefore, while the sample shown in Figure 4.5 is more representative of the population, the sample shown in Figure 4.4 may be more efficient in reducing the total sampling error in the sample of ten. The entire question of tailoring strata sampling fractions in accordance with the above principles is related to the idea of sampling optimisation, whereby the efficiency of the data collection procedure is maximised with respect to value-for-money.

For the moment, however, two drawbacks with variable sampling fraction methods should be noted. First, the method may require far greater prior information about the population, including the size of each stratum, the variability of the specified parameter within each stratum, and the cost of data collection within each stratum. Second, because sampling units in each stratum no longer have the same chance of selection (because some strata are being deliberately over-sampled), one of the two basic conditions for random sampling has now been violated. Therefore, the raw data obtained are no longer a random sample representation of the entire population. It will be necessary to assign weightings to each of the strata samples during analysis to generate population estimates which are truly representative of the population. The added complications during design and analysis of the survey generally mean that variable fraction sampling is considered only for relatively large surveys where there is the opportunity for the extra costs involved to be recouped by large savings due to increased survey efficiency.
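The weighting step can be sketched as follows. The stratum sizes are those of Figure 4.4 (40 females and 60 males, with 5 sampled from each), while the stratum mean trip rates of 3.0 and 4.0 are invented purely for illustration; the function names are likewise hypothetical.

```python
def expansion_weights(stratum_sizes, sample_sizes):
    """Each sampled unit represents N_h / n_h units of its stratum."""
    return {s: stratum_sizes[s] / sample_sizes[s] for s in stratum_sizes}

def weighted_estimate(stratum_sizes, stratum_sample_means):
    """Population mean estimated by weighting each stratum mean by stratum size."""
    total = sum(stratum_sizes.values())
    return sum(stratum_sizes[s] * stratum_sample_means[s] for s in stratum_sizes) / total

sizes = {"female": 40, "male": 60}
sampled = {"female": 5, "male": 5}
means = {"female": 3.0, "male": 4.0}   # hypothetical trip rates per person

print(expansion_weights(sizes, sampled))   # {'female': 8.0, 'male': 12.0}
print(weighted_estimate(sizes, means))     # 3.6
```

The unweighted mean of the ten responses would be 3.5, which over-represents the over-sampled female stratum; the weighted estimate of 3.6 restores the population proportions.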


4.4.4 Multi-Stage Sampling

In simple random sampling, the first stage in the process is to enumerate (give names or numbers to) the entire population. While this may be feasible for small populations, it is clearly more difficult with larger populations. For example, identifying every individual in a large city or a nation is clearly a non-trivial task. In such circumstances, another variation of random sampling is called for. Multi-stage sampling is a random sampling technique which is based on the process of selecting a sample in two or more successive, contingent stages. Consider, for example, a multi-stage survey of travel patterns for an entire nation. Within an Australian context, the process may proceed in five stages as follows:

(a) First-stage: divide nation into states and sample from total population of states.

(b) Second-stage: divide selected states into Local Government Areas and sample from these Local Government Areas within each selected state.

(c) Third-stage: divide selected Local Government Areas into Census Collectors' Districts and sample Census Collectors' Districts.

(d) Fourth-stage: divide selected Census Collectors' Districts into households and sample households.

(e) Fifth-stage: divide selected households into individuals and sample individuals.

At the end of this process we have a random sample of individuals from the nation (i.e. every individual had an equal chance of being selected at the start of the process) provided that appropriate sampling procedures are used at each of the stages. Thus at the first three stages, it would be necessary to sample states, Local Government Areas and Census Collectors' Districts by means of a selection procedure with probabilities proportional to size (PPS) if all individuals are to have an equal probability of selection. Thus, larger population states would have a higher probability of selection at the initial stage. The PPS sampling procedure can be easily applied to the initial three stages because the population within each state, Local Government Area and Census Collectors' District would generally be known in advance of sampling (from other sources such as National Census statistics).
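The reason PPS selection equalises individual probabilities can be seen in a simplified two-stage sketch: if an area is chosen with probability proportional to its population and a fixed number of individuals is then drawn within it, the two factors cancel. The area names, populations and the fixed take of 50 are hypothetical, and `pps_draw` samples with replacement for simplicity, which is only one of several ways of implementing PPS.

```python
import random

def pps_draw(sizes, k, seed=None):
    """Draw k first-stage units with probability proportional to size (with replacement)."""
    rng = random.Random(seed)
    names = list(sizes)
    return rng.choices(names, weights=[sizes[n] for n in names], k=k)

def individual_selection_probability(sizes, unit, take):
    """P(individual) = P(unit chosen on one PPS draw) x P(individual chosen within unit)."""
    first_stage = sizes[unit] / sum(sizes.values())
    second_stage = take / sizes[unit]
    return first_stage * second_stage

# Hypothetical populations of three first-stage units.
sizes = {"Area A": 5000, "Area B": 2000, "Area C": 500}
probs = {u: individual_selection_probability(sizes, u, take=50) for u in sizes}
print(probs)   # every value is approximately 50 / 7500
print(pps_draw(sizes, k=2, seed=1))
```

An individual in the small area is less likely to have the area selected, but more likely to be chosen once it is, so the overall probability is the same in every area.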

At the fourth stage, however, a problem arises because, without detailed knowledge of the size of each household in the selected Census Collectors' Districts, it would not be possible to use PPS sampling at this stage. Such detailed knowledge about individual household structure would generally be unavailable. Without this information, it can easily be seen that if one individual is to be selected from each selected household then an individual in a small household has a higher probability of selection than an individual in a large household. To correct for this it may be necessary to place households in strata in the field by means of filter questions at the start of the interview. The number of households in each stratum would be directly proportional to the household size. When each stratum is filled, no further questions would be asked of that household. Thus if x interviews were conducted in single person households, then 2x interviews should be conducted in two-person households etc., such that each individual has an equal chance of selection, irrespective of household size. Alternatively, households could be selected randomly as if they were all of equal size and then adjustments could be made to the survey results by means of appropriate weighting factors to reflect the distribution of household sizes found in the population.
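The weighting alternative can be illustrated with a small sketch. If households are selected with equal probability and one person is interviewed in each, a person in a household of size s is selected with probability proportional to 1/s, so a weight proportional to s restores the balance. The response values are invented, and the helper name `weighted_person_mean` is an assumption of this example.

```python
def weighted_person_mean(responses):
    """responses: (household_size, value) pairs, one interviewed person per household.

    Each response is weighted by household size, the inverse of the
    within-household selection probability 1 / household_size.
    """
    total_weight = sum(size for size, _ in responses)
    return sum(size * value for size, value in responses) / total_weight

# Hypothetical daily trip counts reported by one person in each of four households.
responses = [(1, 2.0), (2, 3.0), (4, 5.0), (1, 1.0)]
unweighted = sum(value for _, value in responses) / len(responses)

print(unweighted)                        # 2.75
print(weighted_person_mean(responses))   # 3.625
```

The unweighted mean is pulled towards the single-person households, whose members were certain to be interviewed once their household was selected.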

The fifth stage also requires care in sampling. The interview should not be conducted with whoever opens the door or with anyone who is simply willing to be interviewed. Rather, if individuals are the unit of investigation, random sampling should be performed across all members of the household who are members of the population under investigation (perhaps, for example, there is an age-limit on members of the population). This random sampling may be formalised by printing, on the interview form, instructions to the interviewer for selection of a household member depending on the size of the household. These sampling instructions would be varied randomly, or systematically, from form to form to ensure the desired distribution of household members was obtained in the sample (see Kish, 1965). Examples of such selection grids are provided by Stopher (1985a) for selection of adults aged 18 or over, and are reproduced in Figure 4.6. The interviewer uses each grid on alternating occasions (odd and even grids) and then, depending on the answers to filter questions about the number of adults aged 18 or over and the number of males aged 18 or over, asks to speak with a specified member of the household as indicated in the appropriate selection grid.


ODD

                                  Adults 18 and Over
Males 18+   1           2                   3                   4
0           THE WOMAN   THE YOUNGER WOMAN   THE OLDEST WOMAN    THE SECOND YOUNGEST WOMAN
1           THE MAN     THE MAN             THE OLDER WOMAN     THE YOUNGEST WOMAN
2           -           THE OLDER MAN       THE OLDER MAN       THE OLDER MAN
3           -           -                   THE OLDEST MAN      THE YOUNGEST MAN

EVEN

                                  Adults 18 and Over
Males 18+   1           2                   3                       4
0           THE WOMAN   THE OLDER WOMAN     THE OLDEST WOMAN        THE OLDEST WOMAN
1           THE MAN     THE WOMAN           THE YOUNGER WOMAN       THE SECOND YOUNGEST WOMAN
2           -           THE OLDER MAN       THE OLDER MAN           THE YOUNGEST MAN
3           -           -                   THE SECOND OLDEST MAN   THE SECOND OLDEST MAN

Figure 4.6 Examples of Respondent Selection Grids (Source: Stopher, 1985a)

Whilst multi-stage sampling may appear to be somewhat complicated from the above description, its major advantage over simple random sampling lies, in fact, in its convenience and economy, especially for surveys of large populations. Thus, in multi-stage sampling, it is not necessary to enumerate all the sampling units in the population. At each stage, the only sampling units which need to be listed are those which belong to the higher level sampling units selected in the previous stage. Thus the expensive and time-consuming compilation of a complete sampling frame list is avoided.

The disadvantage of multi-stage sampling is that the level of accuracy of parameter estimates for a given sample size tends to be less than if a simple random sample had been collected (see Section 4.7 for more details). However, this reduction in accuracy needs to be traded off against the reduction in costs. In many cases, an increase in sample size for multi-stage samples can be paid for by the savings accrued in not having to prepare a full sampling frame.

It should also be noted that at each stage in the multi-stage sampling process, different sampling methods can be applied. Thus stratified sampling and variable fraction sampling can be applied to meet certain objectives. For example, if certain states must be represented in the final sample (perhaps for political or other reasons outside the scope of the survey) then these states can be segregated into strata by themselves and sampled with certainty. The higher probability of selection of the state must, however, be compensated for by lower probabilities of selection at later stages in the process, such that individuals in all states have an equal probability of selection. This latter criterion is, in fact, the guiding light of multi-stage sampling; virtually anything is allowable at each stage provided that individuals (or whatever the last-stage sampling element happens to be) have an equal chance of selection after all stages have been completed.

Multi-stage sampling can also be used in the design of on-board transit surveys. In such surveys, the sampling unit is the transit passenger, but it would be impractical to have a sampling frame based on the names of all transit passengers. Rather, the sample of transit passengers can be drawn in a four stage process, where each stage takes account of a different dimension in the transit passenger population (see Stopher, 1985b; Fielding, 1985):

Stage 1: Geographically-stratified sampling of routes

Stage 2: Sampling of vehicles from the selected routes

Stage 3: Time-stratified sampling of runs on selected vehicles

Stage 4: Surveying of all passengers on selected runs

4.4.5 Cluster Sampling

Cluster sampling is a variation of multi-stage sampling. In this method, the total population is first divided into clusters of sampling units, usually on a geographic basis. These clusters are then sampled randomly and the units within the cluster are either selected in total or else sampled at a very high rate. Like multi-stage sampling, cluster sampling can be much more economical than simple random sampling both in drawing the sample and in conducting the survey. For example, interviewers' travel costs can be reduced substantially by the use of cluster sampling. Also, if interviews are being conducted in a small number of relatively well-defined areas, it is easier to maintain a higher degree of quality control on the conduct of the interviews.

Like multi-stage sampling, the main problem with cluster sampling is that sampling error will tend to be increased for any given sample size compared to simple random sampling. This is even more of a problem for cluster sampling than for multi-stage sampling. The effect of clustered sampling on sampling error will depend on the degree of similarity between the units in the cluster and those in the total population (see Section 4.7). At one extreme, if all units in a cluster were identical to each other, but totally different from units outside the cluster, then each cluster could equally well be described by just one unit within each cluster. The other units in the cluster would add no new information. In such a case, the effective sample size would be the number of clusters, not the number of sampling units in all the clusters. At the other extreme, if sampling units within a cluster showed equal dissimilarity to other units in that cluster and to units in the total population, then cluster sampling would result in the same distribution of sampling units as would simple random sampling. In this case, the effective sample size for cluster sampling would be equal to the number of sampling units in all the clusters. Most sampling situations fall between these extremes and hence the effective sample size will be somewhere between the number of clusters and the total number of sampling units in the clusters. For any one survey, the effective sample size will depend on the definition of the clusters and the parameter to be estimated. The homogeneity of different parameters within different clusters can vary substantially. The art of cluster definition is to find economical clusters which maintain heterogeneity in the parameters to be estimated.
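The two extremes can be reproduced with a widely used approximation that is not taken from this section: for clusters of equal size m and an intra-cluster correlation rho, the design effect is roughly 1 + (m - 1) * rho, and the effective sample size is the actual sample size divided by this factor. The cluster counts and the value rho = 0.1 below are hypothetical, and the treatment in Section 4.7 may differ.

```python
def effective_sample_size(n_clusters, units_per_cluster, rho):
    """Approximate effective sample size for equal-sized clusters.

    rho is the intra-cluster correlation: 1.0 means units within a cluster
    are identical, 0.0 means they are no more alike than units in general.
    """
    n = n_clusters * units_per_cluster
    design_effect = 1 + (units_per_cluster - 1) * rho
    return n / design_effect

print(effective_sample_size(20, 10, 1.0))   # 20.0  (the number of clusters)
print(effective_sample_size(20, 10, 0.0))   # 200.0 (the number of units)
print(round(effective_sample_size(20, 10, 0.1), 1))   # 105.3
```

Even a modest degree of similarity within clusters roughly halves the effective sample size in this example, which is why heterogeneous clusters are to be preferred.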

4.4.6 Systematic Sampling

When random sampling is being performed in conjunction with a sampling frame list, it is frequently more convenient to use a technique called systematic sampling rather than rely on the use of random numbers to draw a sample. Systematic sampling is a method of selecting units from a list through the application of a selection interval, I, such that every Ith unit on the list, following a random start, is included in the sample. The selection interval is simply derived as the inverse of the desired sampling fraction. For example, Figure 4.7 shows a systematic sample drawn from our population of 100 sampling units, where 04 has been randomly drawn as the starting number.
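The selection rule can be sketched in a few lines of code. This is a minimal illustration, assuming a hypothetical sampling fraction of 10% (the text does not state the fraction used in Figure 4.7) and a frame whose units are labelled 00 to 99. The function name and its parameters are illustrative only.

```python
import random


def systematic_sample(frame, sampling_fraction, start=None, rng=None):
    """Select every I-th unit from the frame after a random start.

    The selection interval I is the inverse of the sampling fraction.
    A fixed start may be supplied to reproduce a particular draw.
    """
    interval = round(1 / sampling_fraction)
    if start is None:
        rng = rng or random.Random()
        start = rng.randrange(interval)  # random start within the first interval
    return frame[start::interval]


# Frame of 100 units labelled 00..99; start 04 as in the text's example,
# combined with an assumed 10% sampling fraction (I = 10).
frame = [f"{i:02d}" for i in range(100)]
sample = systematic_sample(frame, 0.10, start=4)
print(sample)
```

Whatever start is drawn, the sample size is fixed by the interval, and only I distinct samples are possible.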

Systematic sampling is much simpler than truly random selection and, once the sampling frame list has been prepared, it can be carried out by inexperienced clerical staff. The major task in systematic sampling lies in the preparation of an appropriate sampling frame list. An appropriately structured list can result in a systematic sampling procedure which automatically performs stratification with respect to a number of variables. For example, it can be seen in Figure 4.7 that the systematic sample has also resulted in a stratified sample because of the way in which the sample frame list was constructed in a stratified fashion.

00 01 02 03 04 05 06 07 08 09
10 11 12 13 14 15 16 17 18 19
20 21 22 23 24 25 26 27 28 29   Female
30 31 32 33 34 35 36 37 38 39
40 41 42 43 44 45 46 47 48 49
50 51 52 53 54 55 56 57 58 59   Male
60 61 62 63 64 65 66 67 68 69
70 71 72 73 74 75 76 77 78 79
80 81 82 83 84 85 86 87 88 89
90 91 92 93 94 95 96 97 98 99

Figure 4.7 A Systematic Sample from a Stratified Population

Whilst the ordering of the sampling frame list can be used beneficially as described above, it can also be a source of problems which cause the systematic sample to deviate substantially from a truly random sample. Three aspects of list structure are of particular concern in this respect. First, and perhaps most importantly, care should be taken to ensure that the lists do not exhibit a periodicity with respect to the parameter being measured, especially where the periodicity corresponds to the selection interval, I. As an extreme example, consider a household survey to determine traffic noise nuisance effects in an inner urban area. With repetitious grid-street layouts, which predominate in such areas, and the existence of uniform house block sizes, it is likely that the number of houses between intersections will be constant. If the selection interval happens to be equal to this number, then the sampled houses will all be in the same position with respect to intersections (e.g. perhaps all corner blocks). The effect of this on the resultant measurements of perceived traffic noise could be quite serious. Whilst this is an extreme example, there are many instances where periodicity does occur in lists. Ways of overcoming periodicity include choosing a selection interval which is neither a multiple nor a fraction of the periodicity, or else dividing the list into several segments and then choosing a new starting point at the start of each segment.
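The corner-block example can be reproduced numerically. The following sketch assumes a hypothetical street list of 200 houses with exactly 10 houses per block, and reports the position within its block (0 denoting the corner house) of each house selected. All names and figures are illustrative assumptions.

```python
def block_positions(n_houses, houses_per_block, interval, start):
    """Position within its block (0 = corner) of each systematically selected house."""
    return [i % houses_per_block for i in range(start, n_houses, interval)]


# Selection interval equal to the block length: every sampled house is a corner house.
same_as_period = block_positions(n_houses=200, houses_per_block=10, interval=10, start=0)

# An interval that is neither a multiple nor a fraction of the block length
# spreads the sample over every position in the block.
offset_interval = block_positions(n_houses=200, houses_per_block=10, interval=7, start=0)

print(sorted(set(same_as_period)))
print(sorted(set(offset_interval)))
```

In the first case the only position ever sampled is the corner; in the second, all ten positions within the block are represented.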

The second problem with list ordering is that, when using systematic sampling, only certain combinations of sampling units can be sampled. Specifically, only I combinations of sampling units can be sampled. Whilst this does not prevent each unit from having an equal probability of selection, it does violate the other condition of random sampling, namely that each unit be chosen independently. Where the ordering of the list is completely random in itself, this will not be a problem. However, few lists are completely random. An alphabetical list, such as an electoral roll, has people of the same surname (often husband and wife) next to each other on the list. If one person is included in the sample then the next person cannot be in the sample. Thus husbands and wives will rarely both be selected by systematic sampling. Similarly, where systematic sampling is based on house order in a street, neighbours in adjacent households cannot both be included in the sample. Thus to the extent to which there is serial correlation between neighbours on a list with respect to the parameter to be estimated, the resultant sample will not be truly random. This non-randomness will have a consequent effect on the sampling error for the sample. In general, as the similarity between neighbours on the list increases, the sampling error estimated from the systematic sample will be a greater overestimate of the sampling error which would have been achieved from an equal size random sample.

The third problem with systematic sampling, especially when the selection interval is large, occurs when there is a trend in the parameter being measured within the list. The estimated average value of the parameter will then depend on the point at which sampling commences within the list. The degree to which this is a problem will depend on how much the parameter varies between the first and last units within each selection interval.
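A small numerical sketch may help to show the size of this effect. It assumes a hypothetical list of 100 units in which the parameter rises steadily along the list (the value of each unit is simply its position), sampled with a selection interval of 10.

```python
from statistics import mean


def systematic_mean(values, interval, start):
    """Mean of the systematic sample that begins at the given start position."""
    return mean(values[start::interval])


# Hypothetical list with a steady upward trend: unit i has value i.
values = list(range(100))
true_mean = mean(values)

# One estimate for each of the I = 10 possible starting points.
estimates = [systematic_mean(values, interval=10, start=s) for s in range(10)]
print(true_mean, estimates)
```

The estimates differ by exactly as much as the parameter changes between the first and last units of a selection interval, which is the dependence on starting point described above.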

Despite the potential problems with systematic sampling, it is a very useful and simple sampling method which can be used whenever sampling frame lists are available. Care should simply be taken to ensure that whatever ordering is present in the list is used to advantage rather than disadvantage.

4.4.7 Non-Random Sampling Methods

In addition to the above methods, which are based to a greater or lesser extent on random probabilistic sampling, there are a number of other sampling methods which are not based on random sampling. The two principal methods are quota sampling and expert sampling.

Quota sampling, as the name suggests, is based on the interviewer obtaining responses from a specified number of respondents. The quota may be stratified into various groups, within each of which a quota of responses must be obtained. This method, for example, is often used when interviewing passengers disembarking from aircraft or other transport modes and for many types of street interviews where passers-by are stopped and asked questions. The major problem with quota sampling is not that quotas are used for each sub-group (after all, this is the basis of stratified sampling), but that the interviewer is doing the sampling in the field and this sampling procedure may be far from random, unless strictly controlled. Left to themselves, interviewers will generally pick respondents from whom they feel they will most readily obtain a response. Thus passers-by who appear more willing to cooperate, are not in a hurry, and are of a social class comparable to the interviewer will more likely be interviewed. In a household survey, households which are closer to the interviewer's residence (and hence require less travel to reach), households whose members are more often at home, and households without barking dogs are more likely to be interviewed. Such preferential selection can often cause gross biases in the parameters to be estimated in the survey.

Expert sampling, on the other hand, takes the task of sampling away from the interviewer and places it in the hands of an "expert" in the field of study being addressed by the survey. The validity of the sample chosen then relies squarely on the judgement of the expert. While such expert sampling may well be appropriate in the development of hypotheses and in exploratory studies, it does not provide a basis for the reliable estimation of parameter values since it has been repeatedly shown that people, no matter how expert they are in a particular field of study, are not particularly skilled at deliberately selecting random samples. A more appropriate role for the expert in sample surveys is in the definition of the survey population and strata within this population, leaving the task of selecting sampling units from these strata to the aforementioned random sampling methods.

4.5 SAMPLING ERROR AND SAMPLING BIAS

Despite all our best intentions in sample design, the parameter estimates made from sample survey data will always be just that: estimates. There are two distinct types of error which occur in survey sampling and which, combined, contribute to measurement error in sampled data.

The first of these errors is termed sampling error, and is the error which arises simply because we are dealing with a sample and not with the total population. No matter how well designed our sample is, sampling error will always be present due to chance occurrences. However, sampling error should not affect the expected values of parameter averages; it merely affects the variability around these averages and determines the confidence which one can place in the average values. Sampling error is primarily a function of the sample size and the inherent variability of the parameter under investigation. More will be said about sampling error when techniques for the determination of sample size are discussed.

The second type of error in data measurement is termed sampling bias. It is a completely different concept from sampling error and arises because of mistakes made in choosing the sampling frame, the sampling technique, or in many other aspects of the sample survey. Sampling bias is different from sampling error in two major respects. First, whilst sampling error only affects the variability around the estimated parameter average, sampling bias affects the value of the average itself and hence is a more severe distortion of the sample survey results. Second, while sampling error can never be eliminated and can only be minimised by increasing the sample size, sampling bias can be virtually eliminated by careful attention to various aspects of sample survey design. Small sampling error results in precise estimates while small sampling bias results in accurate estimates.

The difference between these two sources of error is sometimes confused, with attention being paid to reducing sampling error while relatively little attention is paid to minimising sampling bias. In an attempt to underscore the difference between the two concepts, consider an analogy with rifle marksmanship as illustrated by the targets shown in Figure 4.8.

[Figure: four rifle targets in a 2 x 2 grid. Columns: ACCURATE (left), INACCURATE (right). Rows: PRECISE (top), IMPRECISE (bottom).]

Figure 4.8 The Distinction between Accuracy and Precision

These targets illustrate four essentially different ways in which rifle shooters may hit the target. The top left target shows a marksman who consistently hits the bullseye. The bottom left shows one who centres his shots around the bullseye but also tends to spray his shots; he seems to be able to aim at the right point but tends to suffer from slight movement of the rifle at the last moment so that his shots are not consistent. The top right target shows the results of a marksman who consistently misses the bullseye; he holds the rifle rock-steady but unfortunately he is aiming at the wrong point on the target, maybe because the telescopic sights on the rifle are out of adjustment. The bottom right shows a shooter who appears to be aiming at the wrong point, but because he also suffers from nervous jitters he sometimes hits the bullseye even though he is not aiming at it. These four situations may be categorised in terms of the precision and the accuracy of the shots; precise shooters always hit the same spot, while accurate shooters aim at the right point on the target.

It is fairly clear which of the four shooters would be regarded as the best; the top left shooter shoots with both accuracy and precision in that he consistently hits the bullseye. It is also probably safe to say that the bottom left shooter is the second best in that he is at least on target (on average). However, it is not quite so clear which of the remaining two is the worst. Is it better to be consistently off-target, or inconsistently off-target (where at least you have some chance of hitting the bullseye)? This judgement of the quality of marksmanship is made more difficult when the bullseyes are removed to leave only the holes left by the rifle shots, as shown in Figure 4.9. In this case, it is difficult to say whether the top left or the top right group of shots came from the better marksman. Indeed, one may argue that both groups are equally good. In the absence of any knowledge about where the marksmen were aiming, one is more readily swayed by the precision of the shots in judging the quality of the shooter. Indeed, the top right group of shots is now vying for the best group of shots, whereas in Figure 4.8 it was vying for being the worst group of shots.


[Figure: the same four groups of shots with the bullseyes removed. Rows: PRECISE (top), IMPRECISE (bottom).]

Figure 4.9 The Confusion between Accuracy and Precision

The above description of the marksman can be applied, by analogy, to the design and use of sample surveys. A precise survey is one which displays repeatability; that is, if administered on repeated occasions under similar conditions it will yield the same answers (irrespective of whether the answers are right or wrong). On the other hand, an accurate survey is one which displays validity, in that the survey is aimed at a correct sample of the correct target population. The precision of a sample survey can be increased by increasing the sample size so as to reduce the possibility of unobserved members of the population having, by pure chance, characteristics which are different to those observed. The accuracy of a sample survey can be increased by ensuring that, first, the sampling frame does not systematically eliminate some members of the population and, second, that the sample is obtained from the sampling frame in a truly random fashion.

Much attention is often paid to reducing sampling error (i.e. increasing precision) by means of elaborate sampling designs and large sample sizes. Relatively little attention, however, is generally paid to increasing accuracy by means of reducing sampling bias to ensure that the questions are being asked of the right people. We are often guilty of "Type-III Errors", described by Armstrong (1979) as "good solutions to the wrong problems". By simply increasing sample sizes, and not paying attention to the quality of the sample, we can always ensure that we will be able to spend enough money to get precisely wrong answers! Indeed, by analogy with Figure 4.9, when we do not know much about the true population we are trying to survey, then we assume that a precise answer is better than an imprecise answer, irrespective of whether it is accurate or not.
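The notion of a "precisely wrong" answer can be made concrete with a small calculation. The sketch below is not from the text: it assumes that a fixed sampling bias and the sampling error of a mean (the standard deviation divided by the square root of the sample size) combine in the usual root-mean-square way, and all figures are hypothetical.

```python
import math


def total_error(bias, std_dev, n):
    """Root-mean-square error of a sample mean that carries a fixed bias.

    The sampling-error component shrinks as n grows; the bias does not.
    """
    standard_error = std_dev / math.sqrt(n)
    return math.sqrt(bias ** 2 + standard_error ** 2)


if __name__ == "__main__":
    # Hypothetical survey: bias of 3 units, population standard deviation of 40.
    for n in (100, 1_000, 10_000, 100_000):
        print(n, round(total_error(bias=3, std_dev=40, n=n), 3))
```

However large the sample becomes, the total error in this sketch never falls below the size of the bias, which is why money spent on sample size alone cannot buy accuracy.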

In an attempt to improve the accuracy of sample surveys, we therefore need to be more aware of the likely sources of sampling bias (the issue of increasing accuracy by improving survey instrument validity will be discussed in Chapter 5). Some common sources of sampling bias include:

(a) Deviations from the principles of random sampling including: the deliberate selection of a "representative" sample which results in too many observations at the extremes of the population distribution; the deliberate selection of an "average" sample which results in too many observations near the average value and not enough at the extremes; the initial selection of a random sample from which the investigator discards some values because they are not considered to be random.

(b) Use of a sampling frame whose characteristics are correlated with properties of the subject of the survey, e.g. using a telephone interview survey to obtain car ownership rates. Richardson (1985) shows the effect of sampling bias in telephone surveys, and suggests means of correcting for this bias.

(c) Substitution sampling, where the interviewer in the field changes the specified sample because of difficulties experienced in obtaining the originally selected sample (e.g. barking dog in front garden, lack of time to reach next specified household, non-response from selected household).

(d) Failure to cover the selected sample. This can result in bias if those sampling units left unsurveyed are atypical of the total sample. For example, in a household survey of transit usage, where the interviewer uses transit to reach the survey area, lack of time may result in those households closest to the transit line being surveyed whilst those farther away are left unsurveyed.

(e) Pressures placed on interviewers by the method of payment adopted. If payment is by interview completed, then there is an incentive to the interviewer to complete as many interviews as possible in the shortest possible time. This increases pressures for substitution sampling and also encourages the interviewer to complete each interview as quickly as possible. For travel surveys, this will result in fewer trips being reported by respondents during the interview. If payment is by the hour, then the reverse incentives are present. While more desirable than hurried interviews, the expenditure of large amounts of time on each completed interview inevitably means that interviewer productivity must fall.

(f) Falsification of data by the interviewer when the interview has not even been conducted. Where such falsification is caused by difficulties in contacting the respondent, then sampling bias may be introduced.

(g) Non-response effects. The bias introduced by non-response varies with the type of survey method used. Thus for mail-back questionnaires, non-response is generally an indication of a low level of interest in the subject of the survey by the non-respondent. For self-completion travel surveys, this often means that non-respondents travel less than respondents, at least for the purpose and/or mode which is the subject of the survey. Empirical verification of this trend can be found in Brög and Meyburg (1981) and Richardson and Ampt (1994). On the other hand, for personal interview surveys, non-response and in particular non-contact is generally more of a problem for those respondents who are more mobile, and hence less often at home to be contacted by the interviewer (see Brög and Meyburg, 1982). Thus for self-completion mail-back questionnaire surveys, non-response will bias travel estimates upwards whereas for personal interview surveys, non-response will bias travel estimates downwards.
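The direction of the non-response bias described in item (g) follows from simple arithmetic. The sketch below treats the mean for the whole selected sample as a weighted average of the respondent and non-respondent means, so that the bias in the respondent mean is the non-response rate multiplied by the difference between the two groups. The response rates and trip rates used are hypothetical and are not drawn from the studies cited.

```python
def nonresponse_bias(response_rate, mean_respondents, mean_nonrespondents):
    """Bias of the respondent mean relative to the mean of the full selected sample."""
    full_sample_mean = (response_rate * mean_respondents
                        + (1 - response_rate) * mean_nonrespondents)
    return mean_respondents - full_sample_mean


# Hypothetical mail-back survey: non-respondents travel less than respondents.
mail_back = nonresponse_bias(0.6, mean_respondents=4.0, mean_nonrespondents=3.0)

# Hypothetical personal interview survey: the people not contacted are more mobile.
interview = nonresponse_bias(0.8, mean_respondents=3.5, mean_nonrespondents=4.5)

print(round(mail_back, 2), round(interview, 2))
```

The first case yields an upward bias in the estimated trip rate and the second a downward bias, in line with the pattern described in the text.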

Whilst all the above sources of bias are potentially serious, there are a number of safeguards against the introduction of sampling bias including:

(a) Use a random sample selection process and adopt, in full, the sample generated by the process.

(b) Design the survey procedure and field administration such that there is no opportunity or need for interviewers to perform "in-field" sampling.

(c) Perform random call-backs on some respondents who have been interviewed to check on the accuracy of the data obtained and the adherence of the interviewers to the specified random sample.

(d) Perform cross-checks with other secondary sources of data to check on the representativeness of the respondents.

(e) Make every attempt to increase response rates by means of reminder letters for self-completion, mail-back surveys and repeated call-backs for personal interview surveys.

(f) Attempt to gain as much information about the characteristics of the entire sample (i.e. in terms of Figure 4.9, try to know the target at which you are shooting) by identifying the characteristics of non-respondents so that adjustments can be made to the survey results to account for the degree of non-response (see Brög and Meyburg, 1980; Meyburg and Brög, 1981).

One final point with respect to sampling bias is that it will vary with the type of survey method being used and with the parameters which the sample survey seeks to estimate. Only careful consideration of the individual circumstances will determine whether significant bias is likely to exist in the survey results. In all cases, however, it is only possible to correct for sampling bias if sufficient effort has been made to gather information about the entire sample and the entire population which the sample purports to represent.

4.6 SAMPLE SIZE CALCULATIONS

Of all the questions concerned with sample design, the one most frequently addressed is that of required sample size. As mentioned earlier, one way of reducing sampling error is to increase sample size; the question remains, however, as to how much one should increase sample size in order to obtain an acceptable degree of sampling error. This section will attempt to provide some guidance on this matter, particularly for the case of simple random sampling. Estimation of required sample sizes for other sampling methods rapidly becomes more complex (see, for example, Kish 1965) and will not be covered in detail in these notes.

In discussing required sample sizes for simple random samples, it is emphasised that guidance only can be given. Much to the chagrin of many investigators, no firm rules can be given for sample size calculations for use in all circumstances. Whilst the calculations are based on precise statistical formulae, several inputs to the formulae are relatively uncertain and subjective and must be provided by the investigator after careful consideration of the problem at hand. Importantly, it is often difficult for the survey designer to convey the nature of sample size calculations to clients, who are most often ignorant of the statistical concepts involved. This chapter will attempt to provide some assistance in conveying these concepts to clients with little or no statistical background.

The essence of sample size calculations is one of trade-offs. Too large a sample means that the survey will be too costly for the stated objectives and the associated degree of precision required. Too small a sample will mean that results will be subject to a large degree of variability and this may mean that decisions cannot reliably be based on the survey results. In such a situation, the entire survey effort may have been wasted. Somewhere between these two extremes there exists a sample size which is most cost-effective for the stated survey objectives.

In the context of survey objectives, it is useful to distinguish between two broad purposes for which survey data may be collected:

(a) The main purpose is often to estimate certain population parameters, e.g. average person trip rates, car ownership, mode split, etc. In such cases, a sample statistic is used to estimate the required population parameter. However, because all sample statistics are subject to sampling error, it is also necessary to include an estimate of the precision which can be attached to the sample statistic. This level of precision will be affected, inter alia, by sample size.

(b) A second purpose of a survey (or surveys) may be to test a statistical hypothesis concerning some of the population parameters, e.g. are there significant differences in trip rates in different areas, or has mode use risen following introduction of a new transport service? To test such hypotheses, it is necessary to compare two sample statistics (each being an estimate of a population parameter under different conditions), each of which has a degree of sampling error associated with it. The tests are performed using statistical significance tests where the power of the test is a function of the sample size of the survey(s).

Whilst the use of sample survey data to fulfil each of these objectives requires different statistical techniques, they are linked by a common usage of the concept of standard error. This concept will now be described, initially with reference to the former objective of sample surveys - that of obtaining population parameter estimates.

4.6.1 Sample Sizes for Population Parameter Estimates

The determination of required sample size for the estimation of population parameters depends, as will be shown later, on three principal factors:

(a) The variability, over the population, in the parameters to be measured;
(b) The degree of precision required for each of the parameter estimates;
(c) The population size.

Of these three factors, the first two are by far the most important. This may at first seem surprising because, to many, it seems intuitive that larger samples will be required from larger populations to maintain accuracy of parameter estimates. This intuitive feeling is often summarised by statements which imply that sample sizes are always expressed in percentage form, e.g. "a 10% sample should be big enough". However, as shall be seen later, except for very small populations, the population size does not significantly affect the required sample size: it is the absolute sample size which is important.

This finding is so important that it is worth repeating. Except in surveys of very small populations, it is the number of observations in the sample, rather than the sample size as a percentage of the population, which determines the precision of the sample estimates. A sample of 200 people from a population of 10 million is just as precise as a sample of 200 people from a population of ten thousand.
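The claim about the sample of 200 can be checked by anticipating the standard error formula with the finite population correction that is developed later in this section (equation 4.2). The sketch below assumes a hypothetical standard deviation of 10 for the parameter; the function name is illustrative.

```python
import math


def standard_error(std_dev, n, population=None):
    """Standard error of the mean, with the finite population correction
    (N - n) / N applied when a population size is supplied."""
    fpc = 1.0 if population is None else (population - n) / population
    return math.sqrt(fpc * std_dev ** 2 / n)


# Same sample of 200, drawn from two very different population sizes.
se_small_pop = standard_error(std_dev=10, n=200, population=10_000)
se_large_pop = standard_error(std_dev=10, n=200, population=10_000_000)

print(round(se_small_pop, 4), round(se_large_pop, 4))
```

The two standard errors differ by only about one percent, even though the sampling fractions (2% and 0.002%) differ by a factor of one thousand.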

4.6.1.1 Sample Sizes for Continuous Variables

Before proceeding too far in the determination of required sample sizes, it will be useful if we review one statistical theorem which is at the very heart of sample size estimation. This theorem is called the Central Limit Theorem. This theorem states that estimates of the mean of a sample tend to become normally distributed as the sample size n increases. This normality of sample means applies irrespective of the distribution of the population from which the samples are drawn provided that the sample size is of reasonable size (n > 30). For small sample sizes, the theorem still applies provided that the original population distribution is approximately bell-shaped.

This theorem often causes confusion but it is so basic to sampling theory that it must be understood before any progress can be made in understanding sample size determination. So let's restate it. Assume that we have, for example, a continuous variable (x) whose variability among sampling units in the population may be described by the distribution shown in Fig. 4.10. Such a variable may be, for example, the income of people in our population. The distribution may be of any form (for example, negatively skewed as shown). Assume that the population is of size N and the population distribution has some true mean value μ and a true standard deviation σ.

If we were to now draw a sample of size n from this population, we could calculate the mean income for that sample as m1 and the standard deviation for that sample as S1. We could then draw a second sample of size n from the total population and calculate m2 and S2. This could be repeated for a third sample to obtain m3 and S3, a fourth sample to get m4 and S4, etc. Having drawn x samples, we could then construct a frequency distribution of the values m1, m2, m3, ..., mx. The Central Limit Theorem states that this distribution, as shown in Fig. 4.11, is normally distributed with mean m (which is an unbiased estimate of the population mean μ).

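[Figure: a skewed frequency distribution f(x) plotted against x, with the mean μ and the standard deviation σ marked.]

Figure 4.10 Distribution of the Parameter in the Population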

Figure 4.11 Distribution of the Means of Independent Samples

The standard deviation of this distribution of sample means, which is referred to as the standard error of the mean (s.e.(m)), is given by:

$$\text{s.e.}(m) = \sqrt{\frac{N-n}{N} \cdot \frac{\sigma^2}{n}} \tag{4.1}$$
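The repeated-sampling experiment described above is easy to simulate. The following sketch is an illustration only: it builds a hypothetical skewed "income" population of 10,000 units, draws 2,000 independent samples of size 50 without replacement, and compares the standard deviation of the resulting sample means with the value given by equation (4.1). The population, the seeds and the function names are all assumptions made for the example.

```python
import math
import random
import statistics


def simulate_sample_means(population, n, n_samples, seed=0):
    """Means of repeated simple random samples (without replacement) of size n."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.sample(population, n)) for _ in range(n_samples)]


def theoretical_standard_error(population, n):
    """Equation (4.1): sqrt((N - n) / N * sigma^2 / n)."""
    N = len(population)
    sigma = statistics.pstdev(population)
    return math.sqrt((N - n) / N * sigma ** 2 / n)


# Hypothetical skewed population of incomes (exponential, mean about 40,000).
pop_rng = random.Random(42)
population = [pop_rng.expovariate(1 / 40_000) for _ in range(10_000)]

means = simulate_sample_means(population, n=50, n_samples=2_000)
observed = statistics.stdev(means)        # spread of the simulated sample means
expected = theoretical_standard_error(population, 50)

print(round(statistics.fmean(population)), round(statistics.fmean(means)))
print(round(observed), round(expected))
```

Although the population itself is far from normal, the simulated sample means centre on the population mean and their spread should be close to the standard error predicted by the formula.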

The above discussion has been based on taking repeated samples from a population. Generally, however, this is not possible and therefore it is necessary to make some estimates based on a single sample of size n. In such a situation our best estimate of μ is given by m1 and similarly the best estimate of σ is given by S1 (hereafter referred to as S). Therefore on the basis of a single sample, we can estimate what the standard error of the mean would have been, if repeated samples had been drawn, as:

$$\text{s.e.}(m) = \sqrt{\frac{N-n}{N} \cdot \frac{S^2}{n}} \tag{4.2}$$

As noted earlier, the standard error is a function of three variables: the variability of the parameter in the population (represented by the standard deviation σ), the sample size (n) and the population size (N). However, for large populations and small sample sizes (which is often the case in transport surveys), the finite population correction factor (N-n)/N is very close to unity. In such situations, the equation for standard error of the mean may be reduced to the more familiar form of:

$$\text{s.e.}(m) = \sqrt{\frac{S^2}{n}} = \frac{S}{\sqrt{n}} \tag{4.3}$$

This equation highlights a most important aspect of sample size determination. That is, as sample size increases, the standard error of the mean will decrease but only in proportion to the square root of the sample size. Thus, quadrupling the sample size will only halve the standard error of the mean. Increasing sample size is therefore a clear case of diminishing marginal returns with respect to decreases in standard error of the mean.
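The square-root relationship can be turned around to show what a given improvement in precision costs in terms of sample size. The two small functions below are a sketch based directly on equation (4.3); their names are illustrative.

```python
import math


def relative_standard_error(n_old, n_new):
    """Standard error at n_new as a fraction of the standard error at n_old."""
    return math.sqrt(n_old / n_new)


def sample_size_multiplier(se_reduction_factor):
    """Factor by which n must grow to divide the standard error by the given factor."""
    return se_reduction_factor ** 2


if __name__ == "__main__":
    print(relative_standard_error(100, 400))   # effect of quadrupling the sample
    print(sample_size_multiplier(10))          # cost of a tenfold gain in precision
```

Halving the standard error requires four times the sample, and reducing it to one tenth requires one hundred times the sample.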

Reference to the properties of the normal distribution, dictated by the Central Limit Theorem, also enables an estimate to be made of the accuracy of the sample mean m as a reflection of the true population mean μ. Such estimates are calculated using the concept of confidence limits associated with the normal distribution. Thus, some 95% of all sample means (from samples of size n) would lie within two standard errors on either side of the true mean, so that there is a probability of only about one in twenty that the deviation between a sample mean and the true mean will exceed twice the standard error.

Given the foregoing discussion, the required sample size can be estimated by solving for n in equation (4.2). This is most easily done in two stages by first solving for n in equation (4.3) such that:

$$n' = \frac{S^2}{(\text{s.e.}(m))^2} \tag{4.4}$$

and then correcting for the finite population effect, if necessary, such that:

$$n = \frac{n'}{1 + (n'/N)} \tag{4.5}$$
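The two-stage calculation in equations (4.4) and (4.5) can be sketched as a short function. The function name, the choice to round up to a whole number of units, and the example values are assumptions made for illustration.

```python
import math


def required_sample_size(std_dev, target_se, population=None):
    """Required sample size for a target standard error of the mean.

    Stage 1 (equation 4.4): n' = S^2 / s.e.(m)^2
    Stage 2 (equation 4.5): n = n' / (1 + n'/N), applied only if N is given.
    The result is rounded up to a whole number of sampling units.
    """
    n_prime = std_dev ** 2 / target_se ** 2
    if population is None:
        return math.ceil(n_prime)
    return math.ceil(n_prime / (1 + n_prime / population))


if __name__ == "__main__":
    # Hypothetical inputs: S = 10, acceptable standard error of 1.
    print(required_sample_size(10, 1))                     # very large population
    print(required_sample_size(10, 1, population=400))     # small population
    print(required_sample_size(10, 1, population=10**6))   # correction is negligible
```

The finite population correction matters only when n' is an appreciable fraction of N; for the population of one million in this example the corrected and uncorrected sample sizes are the same.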

Whilst the above procedure for the determination of sample size looks relatively straightforward and objective, there are two major problems in the application of the method: the estimation of the population standard deviation (σ) and the selection of an acceptable standard error of the mean (s.e.(m)). The problem with the estimation of the standard deviation is that this is one of the statistics which will be calculated after the survey has been conducted, and yet we are required to estimate it before we conduct the survey in order to calculate the sample size. It is therefore necessary to derive an estimate of the standard deviation from other sources. Three major sources suggest themselves:

(a) Previous surveys of the same, or a similar, population may provide an estimate of the standard deviation of the parameter in question. Due allowance should be made for any differences in the sampling method used in the previous and the current survey.

(b) There may be some theoretical foundations on which to base an estimate of the standard deviation. This technique was used, for example, in the Australian National Travel Survey (Aplin and Flaherty, 1976).

(c) Where little previous information exists about the population, it may be necessary to conduct a pilot survey to obtain information needed to design the main survey. A problem with this method, however, is that often time and money resources do not permit the conduct of a large enough pilot survey to enable serviceable estimates of the standard deviation to be obtained. In such circumstances, the standard deviation estimates may be more misleading than informative.

Sometimes the estimated sample size can be adjusted during the course of the main survey to overcome any uncertainty in the initial estimate of the standard deviation. Thus using the initial standard deviation estimate, a sample of minimum size could be collected. The standard deviation in this sample could then be computed and compared with the initial estimate. If the standard deviation is larger than estimated, thus indicating that a larger sample should be collected, then a supplementary sample could be collected to augment the initial sample. Whilst this two-step procedure sounds attractive in being able to lessen the demands of accurate estimation of standard deviation, it is only feasible in certain circumstances. Thus the conduct of the survey must be spread over a reasonable time period so that coding, editing and analysis of an initial sample can be completed in time for the supplementary sample data collection to follow on reasonably soon after the collection of the initial sample. Where strict time limitations are placed on the conduct of a survey, supplementary samples may not be feasible.
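The two-step procedure can be sketched as follows. This is a simplified illustration which ignores the finite population correction and assumes that the initial sample size is set by equation (4.4) from the initial standard deviation estimate; the function name and the figures are hypothetical.

```python
import math


def supplementary_sample_size(initial_sd, observed_sd, target_se):
    """Extra observations needed once the initial sample's standard deviation is known.

    Both sample sizes come from equation (4.4), n' = S^2 / s.e.(m)^2.
    Returns 0 if the observed standard deviation is no larger than estimated.
    """
    initial_n = math.ceil(initial_sd ** 2 / target_se ** 2)
    revised_n = math.ceil(observed_sd ** 2 / target_se ** 2)
    return max(0, revised_n - initial_n)


if __name__ == "__main__":
    # Initial guess S = 10, but the initial sample shows S = 12; target s.e. = 1.
    print(supplementary_sample_size(10, 12, 1))
    # If the observed standard deviation is smaller, no supplement is needed.
    print(supplementary_sample_size(10, 8, 1))
```

Because the required sample size depends on the square of the standard deviation, a 20% underestimate of the standard deviation in this example calls for a supplement of 44% of the initial sample.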

The second problem in the estimation of sample sizes using the above equations is the specification of an acceptable standard error of the mean. This task basically expresses how confident we wish to be about using the sample mean as an estimate of the true population mean. The specification of a standard error is rarely performed per se; rather it is usual to specify confidence limits of a specified size around the mean and at a certain level of confidence. For example, you may specify that you wish, with C% confidence, to obtain a sample mean which is within a specified range, either relative or absolute, of the population mean. Specified in this way, two judgements must be made in order to calculate the acceptable standard error.

First, a level of confidence must be chosen for the confidence limits. Basically, the level of confidence expresses how frequently the client is prepared to be wrong in accepting the sample mean as a measure of the true mean. For example, if a 95% level of confidence is used, then implicitly it is being stated that the client is prepared to be wrong on 5% of occasions. If such a risk is deemed to be unacceptably large, then a higher level of confidence (such as 99%) can be used. Higher levels of confidence will, however, require larger sample sizes.

The specification of levels of confidence is a difficult task for the client and the survey designer. Not understanding the subtleties of sampling, most clients are unwilling to accept that the survey, for which they are paying, will come up with anything but the correct answer. On the other hand, the survey designer should know that nothing but a full population survey will produce results that are absolutely correct (assuming that everything else about the survey is acceptable). The task of the survey designer, therefore, is to get some indication from the client of what they think is an acceptable level of confidence in the results. By convention, 95% levels of confidence are often assumed for sample surveys in transport. This means that if repeated sample surveys were to be conducted on this topic with a sample of this size, then 5% of the estimates of the mean would lie outside the range of the population mean, plus or minus two standard errors.

Second, it is necessary to specify the confidence limits around the mean, either in absolute or relative terms. If relative measures are used (i.e. the confidence limit is a proportion of the mean) then this requires that an estimate of the mean be available so that an absolute measure of the confidence limit can be calculated. If the parameter being estimated is of some importance then smaller confidence limits can be specified, but again this will result in larger sample sizes being necessary. The size of the confidence limits will depend on the use to which the results of the survey are to be put.

The important point to note about acceptable standard errors is that the specification of both the confidence limits and the level of confidence is relatively subjective. More important parameters can be assigned smaller confidence limits and/or higher levels of confidence. Each of these actions will result in a smaller acceptable standard error and thus a larger required sample size. The decision, however, lies in the hands of the sample designer in liaison with the client: is accuracy of a parameter estimate sufficiently important to warrant the higher costs involved in a larger sample size?

To illustrate the points outlined above, consider, as an example, a survey of a population of 1000 households in which an estimate of average household income is required such that there is a 95% probability that the sampling error will be no more than 5% of the sample mean. From a pilot survey of this population, it has been found that with a sample size of 30 the mean income was $24,000 and the standard deviation was $5,000.

The acceptable standard error can be calculated from the specified confidence limits and level of confidence. From a table of unit normal distribution values (see Appendix A), a 95% level of confidence corresponds to a value of 1.96 times the standard error. That is, there is a 95% probability that the error of the mean estimate will be no more than 1.96 times the standard error. However, in our case we want this error to be no more than 5% of our estimated mean. Using the pilot survey mean value as an initial estimate, the confidence limit will therefore be equal to $1,200 (= 0.05 x 24,000). The acceptable standard error is then given by:

$$\text{s.e.}(m) = \frac{\text{confidence limit}}{z} = \frac{1200}{1.96} = 612 \tag{4.6}$$

The sample size, for an infinite population, is then given by:

$$n' = \frac{S^2}{\text{s.e.}(m)^2} = \frac{5000^2}{612^2} = 67 \tag{4.7}$$

Applying the finite population correction factor, the final sample size is given by:

$$n = \frac{n'}{1 + (n'/N)} = \frac{67}{1 + 67/1000} = 63 \tag{4.8}$$

Having collected data from these 63 households, it may be found that the estimated sample mean income has fallen to $23,200 while the sample standard deviation has increased to $6,000. The acceptable standard error then becomes 592 (= 0.05 x 23,200 / 1.96), and a new estimate of the required sample size may be given by:

$$n' = \frac{6000^2}{592^2} = 103 \tag{4.9}$$

and hence

$$n = \frac{103}{1 + 103/1000} = 93 \tag{4.10}$$

If it is convenient, and if the extra expense is deemed worthwhile, then an extra 30 households should be sampled and surveyed. It should be noted that without the extra households in the survey, the sampling error of the mean, at a 95% confidence level, is equal to 6.2% of the sample mean. The question must be asked as to whether the expense of the extra surveys is warranted by the reduction in sampling error from 6.2% to 5.0% of the mean.
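The income example can be retraced numerically. The following is a minimal sketch, not the authors' own procedure: the function names are illustrative, z is entered as the tabulated 1.96, results are rounded to the nearest household as in the text, and the achieved error uses the finite-population standard error in the same form as equation 4.11.

```python
import math

def required_sample_size(sd, limit, z, population):
    """Return (n_infinite, n_finite) following equations 4.6 to 4.8."""
    se = limit / z                              # acceptable standard error (4.6)
    n_inf = sd ** 2 / se ** 2                   # infinite-population size (4.7)
    n_fin = n_inf / (1 + n_inf / population)    # finite population correction (4.8)
    return n_inf, n_fin

def relative_error(sd, mean, n, z, population):
    """Confidence-limit half-width as a fraction of the mean for a sample of n."""
    se = math.sqrt((population - n) / population * sd ** 2 / n)
    return z * se / mean

Z95 = 1.96   # tabulated z for a 95% level of confidence
N = 1000     # households in the population

# Design based on the pilot survey (mean $24,000, s.d. $5,000).
pilot_inf, pilot_fin = required_sample_size(5000, 0.05 * 24000, Z95, N)
print(round(pilot_inf), round(pilot_fin))       # 67 63

# Revision after the first 63 households (mean $23,200, s.d. $6,000).
rev_inf, rev_fin = required_sample_size(6000, 0.05 * 23200, Z95, N)
print(round(rev_inf), round(rev_fin))           # 103 93

# Sampling error if the supplementary 30 households are not collected.
print(round(100 * relative_error(6000, 23200, 63, Z95, N), 1))  # 6.2
```

In practice a designer might round up rather than to the nearest whole household; rounding to nearest is used here only so that the figures agree with those in the text.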

4.6.1.2 Sample Sizes for Discrete Variables

The preceding discussion on sample size estimation has concentrated on the collection of data on continuous variables. In many cases, however, we may wish to collect data on discrete variables which are characterised by their presence or absence in a household. An example of a discrete variable would be car ownership, where a household either owns a car or it does not. In such a case, the Central Limit Theorem still applies but this time it is applied to the proportion of a sample possessing a certain characteristic, e.g. owning a car. The standard error for estimation of a proportion p is given by:

$$\text{s.e.}(p) = \sqrt{\frac{N-n}{N} \cdot \frac{p(1-p)}{n}} \tag{4.11}$$

To illustrate the use of this equation, consider our previous example. In the pilot survey, it may have been found that 20% of the households did not own a car. Assume that we wished to estimate the percentage of households not owning a car such that, at a 95% level of confidence, the sampling error was not greater than 5 percentage points. Note that when estimating percentages it is important to define the acceptable sampling error clearly; in our example, we have specified 5 percentage points (i.e. 20% +/- 5%), and not 5 percent of the mean percentage (i.e. 20% +/- 1%). Also note that if relative error is specified, it should be clear whether the relative error refers to the percentage possessing a characteristic or the percentage not possessing it. Thus if relative errors are specified as 5% of the mean, does this mean 5% of the percentage not possessing a car (i.e. 5% of 20%) or 5% of the percentage possessing a car (i.e. 5% of 80%)? The acceptable standard error may be calculated as:

$$\text{s.e.}(p) = \frac{5}{1.96} = 2.55 \text{ percentage points} \tag{4.12}$$

The required sample size may be obtained in a two-step calculation by:

$$n' = \frac{p(1-p)}{\text{s.e.}(p)^2} = \frac{20 \times 80}{2.55^2} = 246 \tag{4.13}$$

Applying the finite population correction factor, the final required sample size is given by:

$$n = \frac{n'}{1 + (n'/N)} = \frac{246}{1 + 246/1000} = 197 \tag{4.14}$$

Note that in this case the finite population correction factor has a much larger effect on the required sample size because the sample size is a substantial proportion of the population.
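The car-ownership calculation in equations 4.12 to 4.14 can be sketched in the same way. The function name is illustrative, proportions are entered as fractions rather than percentages, and the second call simply shows how much the answer changes if the error is instead read as 5 percent of the 20% figure (i.e. 1 percentage point).

```python
def required_sample_size_proportion(p, limit, z, population):
    """Return (n_infinite, n_finite) for estimating a proportion p.

    limit is the acceptable half-width expressed as a fraction,
    so 5 percentage points is entered as 0.05.
    """
    se = limit / z                              # acceptable standard error (4.12)
    n_inf = p * (1 - p) / se ** 2               # infinite-population size (4.13)
    n_fin = n_inf / (1 + n_inf / population)    # finite population correction (4.14)
    return n_inf, n_fin

# 20% of households without a car, +/- 5 percentage points, 95% confidence.
n_inf, n_fin = required_sample_size_proportion(0.20, 0.05, 1.96, 1000)
print(round(n_inf), round(n_fin))               # 246 197

# Same target read as a relative error: 5% of 20% = 1 percentage point.
rel_inf, rel_fin = required_sample_size_proportion(0.20, 0.05 * 0.20, 1.96, 1000)
print(round(rel_fin))
```

The second result is several times larger than the first, which is why the text insists that the error specification be stated unambiguously.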

The two calculations described above highlight a problem with sample size calculations for real-life surveys. Very few surveys seek to estimate just one parameter in the population; the normal situation is that a survey seeks to estimate a large number of parameters for a population. However, as shown above, carrying out sample size calculations separately for each parameter may result in widely varying estimates of required sample size. Thus, in our example, an initial sample size of 63 was required for income but a sample of 197 was required for (non) car-ownership. Obviously a fail-safe procedure would be to select the largest calculated sample size across all the parameters. In this way, all parameters would be estimated (at least) as precisely as desired. However, this procedure may also be very inefficient if, for example, only one parameter requires a large sample size due to its own inherently high variability across the population. The more usual procedure in selecting an overall sampling rate therefore involves a degree of compromise across the parameters. In this way, some parameters will be obtained more precisely than desired, while other parameters will be estimated with less precision than desired. If possible, the more important parameters should have their precision requirements fulfilled in preference to those of less important parameters. An overall average sample size, weighted by parameter importance, should therefore influence the final selection of a required sample size.
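One way to inform this compromise is to tabulate the precision that each candidate sample size would deliver for every parameter. The sketch below does so for the two parameters of the worked examples, using the pilot-survey values; the function names are illustrative and the standard errors include the finite population correction of equation 4.11.

```python
import math

def half_width_mean(sd, n, z, population):
    """Confidence-limit half-width for a mean, with finite population correction."""
    return z * math.sqrt((population - n) / population * sd ** 2 / n)

def half_width_proportion(p, n, z, population):
    """Confidence-limit half-width for a proportion (equation 4.11 times z)."""
    return z * math.sqrt((population - n) / population * p * (1 - p) / n)

Z95, N = 1.96, 1000
for n in (63, 197):
    income = half_width_mean(5000, n, Z95, N) / 24000      # fraction of mean income
    no_car = half_width_proportion(0.20, n, Z95, N)        # fraction of households
    print(f"n = {n}: income +/- {100 * income:.1f}% of mean, "
          f"non-car-ownership +/- {100 * no_car:.1f} points")
# n = 63: income +/- 5.0% of mean, non-car-ownership +/- 9.6 points
# n = 197: income +/- 2.6% of mean, non-car-ownership +/- 5.0 points
```

Laid out this way, the client can see that the smaller sample meets the income requirement but roughly doubles the error on car ownership, while the larger sample estimates income more precisely than was asked for.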

A final problem to be noted in the above sample size calculations is that often a survey will be seeking to obtain parameter estimates not just for the entire sample, but for many different sub-samples within the total sample. In such situations, it is necessary to know what those sub-samples will be before the survey is designed so that adequate sample sizes can be obtained for each of the sub-samples to be analysed later.

4.6.1.3 Explaining Sample Sizes to Clients

Because of the complexities and subtleties involved in sample design, it is often difficult for a survey designer to explain the concepts to clients and to obtain useful information from the client to assist in the sample design. This input of information from the client is essential because, after all, it is the client who has to live with the results after the survey is completed. Therefore, every attempt should be made to tailor the survey and the sample to the needs of the client. To assist in explaining sample size calculations to clients, and in obtaining the required input from clients, the design aids outlined in this section have proven useful.

The basis of these design aids is a set of two spreadsheet tables, shown in Figures 4.12 and 4.13. Figure 4.12 shows how the confidence limits around a continuous variable change as the sample size is changed, given estimates of the population mean and standard deviation, the size of the population and the level of confidence required by the client. Specification of the level of confidence automatically calculates the value of z used in equation 4.6. The outputs of Figure 4.12 are the upper and lower confidence limits which could be expected from any specified sample size.

Continuous Variable Confidence Limit Calculator

Population Mean     = 3.60
Population S.D.     = 2.50
Population Size     = 10000
Level of Confidence = 95% => z = 1.96

Sample Size    s.e.(m)    z*s.e.(m)    Lower Limit    Upper Limit
      50        0.35        0.69          2.91           4.29
     100        0.25        0.49          3.11           4.09
     150        0.20        0.40          3.20           4.00
     200        0.18        0.34          3.26           3.94
     250        0.16        0.31          3.29           3.91
     300        0.14        0.28          3.32           3.88
     350        0.13        0.26          3.34           3.86
     400        0.12        0.24          3.36           3.84
     450        0.12        0.23          3.37           3.83
     500        0.11        0.21          3.39           3.81

    5000        0.03        0.05          3.55           3.65

Figure 4.12 Confidence Limit Estimator for Continuous Variables

For example, if the expected trip rate per person per day is 3.60 (with a standard deviation of 2.50), then with a sample of 300 persons in any one stratum, we would expect that the mean trip rate would fall between 3.32 and 3.88 in 95% of samples of this size.

If the client believed that this range was too great, then they could experiment either with different sample sizes or with changing the level of confidence. The bottom line in Figure 4.12 is provided to allow the specification of any desired sample size, which may be outside the range of those provided in the table. Figure 4.12 may be re-used for any continuous variable in the survey by simply changing the population mean and standard deviation.
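The calculation behind Figure 4.12 can be sketched in a few lines. This is an illustrative reconstruction rather than the original spreadsheet: the function name is an assumption, z is derived from the level of confidence with the standard library's `NormalDist`, and the finite population correction is applied because it is needed to reproduce the bottom row of the figure.

```python
import math
from statistics import NormalDist

def confidence_limits(mean, sd, population, confidence, n):
    """Lower and upper confidence limits for a mean from a sample of size n."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)      # two-sided z, 1.96 for 95%
    se = sd / math.sqrt(n) * math.sqrt(1 - n / population)
    return mean - z * se, mean + z * se

# Inputs of Figure 4.12: mean 3.60, s.d. 2.50, population 10000, 95% confidence.
for n in (50, 300, 5000):
    lower, upper = confidence_limits(3.60, 2.50, 10000, 0.95, n)
    print(f"{n:>5}  {lower:.2f}  {upper:.2f}")
# Limits printed: 2.91 and 4.29 for n = 50, 3.32 and 3.88 for n = 300,
# 3.55 and 3.65 for n = 5000
```

Changing the `confidence` argument or the sample size plays the same role as editing the corresponding spreadsheet cell.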

A similar set of calculations is carried out in Figure 4.13 for discrete variables. In this case, however, what needs to be specified is the expected proportion in the population possessing a certain feature. For example, if the expected proportion of trips by bus is 20%, then with a sample of 300 trips in any one stratum, we would expect that the proportion of trips by bus would lie between 16% and 24% in 90% of samples of this size (90% being the level of confidence set in Figure 4.13).
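The discrete-variable calculation can be sketched in the same way, using the inputs shown in Figure 4.13 (proportion 0.20, population 20000). The function name is illustrative and the finite population correction of equation 4.11 is assumed. The loop compares the 90% level of confidence set in the figure with the 95% level used elsewhere in the chapter.

```python
import math
from statistics import NormalDist

def proportion_limits(p, population, confidence, n):
    """Lower and upper confidence limits for a proportion from a sample of size n."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)      # 1.64 for 90%, 1.96 for 95%
    se = math.sqrt(p * (1 - p) / n * (1 - n / population))
    return p - z * se, p + z * se

for confidence in (0.90, 0.95):
    lower, upper = proportion_limits(0.20, 20000, confidence, 300)
    print(f"{confidence:.0%}: {lower:.3f} to {upper:.3f}")
# 90%: 0.162 to 0.238
# 95%: 0.155 to 0.245
```

Both intervals round to 16% and 24% at whole-percent precision, so the choice of confidence level only becomes visible when more decimal places are shown.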

Discrete Variable Confidence Limit Calculator

Population Proportion = 0.20
Population Size       = 20000
Level of Confidence   = 90% => z = 1.64

Sample Size    s.e.(p)    z*s.e.(p)    Lower Limit    Upper Limit
      50        0.06        0.09          0.11           0.29
     100        0.04        0.07          0.13           0.27
     150        0.03        0.05          0.15           0.25
     200        0.03        0.05          0.15           0.25
     250        0.03        0.04          0.16           0.24
     300        0.02        0.04          0.16           0.24
     350        0.02        0.03          0.17           0.23
     400        0.02        0.03          0.17           0.23
     450        0.02        0.03          0.17           0.23
     500        0.02        0.03          0.17           0.23

    2000        0.01        0.01          0.19           0.21

Figure 4.13 Confidence Limit Estimator for Discrete Variables

Figures 4.12 and 4.13 show the effect of changing sample sizes on the confidence limits which could be expected for one variable at a time. However, as noted earlier, one of the problems in sample design for a real survey is that sample sizes must be calculated for many variables across many strata. The effects of varying sample size on the precision obtained for all variables can be summarised for the client as shown in Figures 4.14 through 4.16. These tables are constructed within a standard spreadsheet program, and are designed to be used interactively with a client to give them a feel for the implications of using various sized samples and varying numbers of strata (e.g. geographic regions). In Figure 4.14, the client can specify the number of strata and the sample size, the population size (in terms of number of households in the study area) and the required level of confidence (the latter item may be selected on the advice of the survey designer).

Sample Size Design Parameters

Number of Strata                 = 12
Households in Sample per Strata  = 200
Persons in Sample per Strata     = 600 (assuming 3 persons per household)
Trips in Sample per Strata       = 2400 (assuming 4 trips per person)
Population Size                  = 60000 households
Level of Confidence              = 95% => z = 1.96

Total Households in Sample       = 2400
Total Survey Cost                = $144,000 (assuming $60 per responding household)

Figure 4.14 Input Screen for Sample Size Design Parameters

The survey designer can also input a unit price per responding household in Figure 4.14 to give the client an indication of the cost implications of their sample design decisions. A second input screen, shown in Figure 4.15, requires the client, or the survey designer, to specify the expected values of key variables in the population together with the expected variability of continuous variables.

The spreadsheet program calculates the standard error for each variable and, using the value of z corresponding to the stated level of confidence, then calculates the upper and lower confidence limits for each key variable. These limits are then stated in simple English as shown in Figure 4.16. If one or more of these ranges are not acceptable to the client, because they think that the precision is not adequate for their purposes, then they can go back to the first input screen in Figure 4.14, change the sample size and observe the effects on the precision shown in Figure 4.16. They can also experiment with the precision obtained with different levels of stratification, by either increasing the number of strata and decreasing the sample size in each stratum, or by decreasing the number of strata and increasing the sample size in each. In this way, the client and the survey designer can interactively experiment with different sample designs and observe the effects on the precision of sample estimates and the cost of the survey, thereby experiencing the nature of the trade-offs in sample design.

Expected Population Values for Key Variables

Household Variables               Proportion     Mean     S.D.
Persons per Household                             3.00     1.50
Vehicles per Household                            1.50     0.80
Households without Vehicles          0.10
Trips per Household                              12.00     5.00

Person Variables                  Proportion     Mean     S.D.
Trips per Person                                  4.00     2.50
% Male                               0.50
% Unemployed                         0.10
Average Personal Income ($K)                     28.00     8.00

Trip Variables                    Proportion     Mean     S.D.
Average Trip Length (minutes)                    20.00    12.00
% Trips by Bus                       0.05
% Trips to School                    0.10
Average People per Vehicle                        1.40     0.20

Figure 4.15 Input Screen for Expected Values of Variables in Population

Precision of Sample Estimates

Household Variables
The estimated value of Persons per Household lies between 2.79 and 3.21
The estimated value of Vehicles per Household lies between 1.39 and 1.61
The estimated value of Households without Vehicles lies between 0.06 and 0.14
The estimated value of Trips per Household lies between 11.31 and 12.69

Person Variables
The estimated value of Trips per Person lies between 3.80 and 4.20
The estimated value of % Male lies between 0.46 and 0.54
The estimated value of % Unemployed lies between 0.08 and 0.12
The estimated value of Average Personal Income ($K) lies between 27.36 and 28.64

Trip Variables
The estimated value of Average Trip Length (minutes) lies between 19.52 and 20.48
The estimated value of % Trips by Bus lies between 0.04 and 0.06
The estimated value of % Trips to School lies between 0.09 and 0.11
The estimated value of Average People per Vehicle lies between 1.39 and 1.41

Figure 4.16 Output Screen for Estimated Values of Variables in Sample
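The link between the inputs of Figures 4.14 and 4.15 and the statements of Figure 4.16 can be sketched as follows. This is an illustrative reconstruction, not the original spreadsheet: the data layout and function names are assumptions, z is taken as the tabulated 1.96, and no finite population correction is applied, since the published limits are reproduced without one.

```python
import math

Z = 1.96  # 95% level of confidence, as set in Figure 4.14
SAMPLE_PER_STRATUM = {"household": 200, "person": 600, "trip": 2400}

# (variable name, sampling unit, expected value, standard deviation);
# a standard deviation of None marks a proportion.
VARIABLES = [
    ("Persons per Household", "household", 3.00, 1.50),
    ("Vehicles per Household", "household", 1.50, 0.80),
    ("Households without Vehicles", "household", 0.10, None),
    ("Trips per Household", "household", 12.00, 5.00),
    ("Trips per Person", "person", 4.00, 2.50),
    ("% Male", "person", 0.50, None),
    ("% Unemployed", "person", 0.10, None),
    ("Average Personal Income ($K)", "person", 28.00, 8.00),
    ("Average Trip Length (minutes)", "trip", 20.00, 12.00),
    ("% Trips by Bus", "trip", 0.05, None),
    ("% Trips to School", "trip", 0.10, None),
    ("Average People per Vehicle", "trip", 1.40, 0.20),
]

def limits(value, sd, n, z=Z):
    """Confidence limits for a mean (sd given) or a proportion (sd is None)."""
    if sd is None:
        se = math.sqrt(value * (1 - value) / n)
    else:
        se = sd / math.sqrt(n)
    return value - z * se, value + z * se

def report():
    """Return one plain-English precision statement per variable."""
    lines = []
    for name, unit, value, sd in VARIABLES:
        lower, upper = limits(value, sd, SAMPLE_PER_STRATUM[unit])
        lines.append(f"The estimated value of {name} lies between "
                     f"{lower:.2f} and {upper:.2f}")
    return lines

print("\n".join(report()))
```

Editing `SAMPLE_PER_STRATUM` corresponds to returning to the first input screen and changing the sample size per stratum.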

4.6.2 Sample Sizes for Hypothesis Testing

The second purpose of a survey (or surveys) may be to test a statistical hypothesis concerning some of the population parameters, e.g. are there significant differences in trip rates in different areas, or has use of a specific mode risen following introduction of a new transport service? To test such hypotheses, it is necessary to compare two sample statistics (each being an estimate of a population parameter under different conditions), each of which has a degree of sampling error associated with it. The tests are performed using statistical significance tests where the power of the test is a function of the sample size of the survey(s).

In using sample survey data to test hypotheses about population behaviour, it is first necessary to ensure that the hypothesis to be tested is correctly specified. While hypotheses are often described as having been rejected or accepted, it should be realised that the rejection of a hypothesis is to conclude that it is false, while the acceptance of a hypothesis merely implies that we have insufficient evidence to believe otherwise. Because of this, the investigator should always state the hypothesis in the form of whatever it is hoped will be rejected. Thus if it is hoped to prove that car ownership is higher in one area than in another, the hypothesis should be that car ownership is equal in the two areas, and then we try to reject the hypothesis (statistically).

A hypothesis that is formulated with the hope of rejecting it is called the null hypothesis and is denoted by $H_0$. The rejection of $H_0$ leads to the "acceptance" of an alternative hypothesis denoted by $H_1$. Thus in the case of testing for differences in two average values, the null hypothesis could be specified as:

$$H_0 : m = m_0 \tag{4.15}$$

The alternative hypothesis could be specified in a number of different ways depending on the purpose of the comparison. Possible alternative hypotheses might include:

$$H_1 : m > m_0 \tag{4.16}$$

$$H_1 : m < m_0 \tag{4.17}$$

$$H_1 : m \neq m_0 \tag{4.18}$$

Note that only one of these alternative hypotheses can be used in any particular test. The first two alternative hypotheses would constitute a one-tailed test, while the third alternative hypothesis would constitute a two-tailed test.

Since the data to be used in the testing of these hypotheses are to be collected using a sample survey, we would be reluctant to base our decision on a strict deterministic interpretation of the stated hypotheses. For example, if we wished to test the null hypothesis that $m = m_0$, then we would base our decision on testing whether $m$ fell within a critical region ($D$) around $m_0$. Thus, if $m_0 - D < m < m_0 + D$, then we would, in practice, not reject the hypothesis that $m = m_0$. Similarly if $m < m_0 + D$, then we would not reject the hypothesis that $m = m_0$ in favour of the alternative hypothesis that $m > m_0$. The definition of the critical region, $D$, is somewhat arbitrary and merely serves to give a workable rule for the rejection of hypotheses. Obviously a smaller critical region will make rejection of the null hypothesis easier.

In testing hypotheses, there are four possible end-states of the hypothesis testing procedure. Two of these states signify that a correct decision has been made while the other two indicate that an error has been made. The four end-states may be depicted as shown in Table 4.1.

Table 4.1 Possible End-States of Hypothesis Testing

DECISION         TRUE STATE
                 H0               H1
Accept H0        Correct          Type II Error
Reject H0        Type I Error     Correct

Thus if the true state is described by the null hypothesis $H_0$ and we accept $H_0$ as being a description of the true state, then we have made no error. Similarly, we will be equally correct if we reject $H_0$ when the true state is actually described by the alternative hypothesis $H_1$.

A Type I Error will have been committed if we reject the null hypothesis when it is in fact true. For example, we conclude from two sample surveys conducted at different points in time that trip rates have risen over time, when in fact it was simply chance variations in the two samples which gave the appearance of an increase in trip rate. A Type II Error will have been committed if we accept the null hypothesis when it is in fact false. For example, we assume that two groups of households have the same car ownership when, in fact, one group has more cars than the other.

Obviously in testing hypotheses we would be interested in trying to minimise the chances of making either a Type I or Type II error. Which one we would be most interested in avoiding will depend on the relative costs associated with each type of error. The degree to which we wish to avoid each type of error is expressed in terms of the maximum probability which we will accept for making each type of error. The acceptable probability of committing a Type I error is called the level of significance of the test and is denoted by $\alpha$. The acceptable probability of committing a Type II error is denoted by $\beta$. The value $1 - \beta$ is often called the power of the test. The power of the test cannot be calculated unless a specific alternative hypothesis is stated in the form of an equality, such as $H_1 : m = m_0 + d$.

The inter-relationships between the variables described above can be more clearly seen by reference to Figures 4.17 and 4.18. Both figures depict the distribution of sample means which would be obtained from repeated sampling from a population. In Figure 4.17, it can be seen that the null hypothesis expressed in equation (4.15) is in fact correct, i.e. $m = m_0$. If this was known with certainty, we would definitely accept the null hypothesis. However, because $m$ is estimated from a sample survey, we would not know that the true mean is $m_0$. Rather, all we know is our sample estimate of the mean (denoted by a value of $m$). We also know, from the Central Limit Theorem, that the sample estimates of $m$ will be normally distributed around $m_0$ (for large enough sample sizes), as shown in Figure 4.17. In such circumstances, a certain proportion ($\alpha$) of our estimates will be greater than $m_0 + D$, and hence according to our decision rule, involving rejection of values outside the range of the critical region, we will reject $H_0 : m = m_0$ even though $H_0$ is, in fact, correct. We will therefore have committed a Type I Error (on $\alpha$ percent of occasions).

Figure 4.17 Probability of a Type I Error

Consider, now, the situation depicted in Figure 4.18, in which the specific alternative hypothesis is, in fact, correct, i.e. $H_1 : m = m_0 + d$. Once again, however, our sample estimates of $m$ will be distributed normally around $m_0 + d$. Given the same critical region $D$ (around $m_0$), we can see that on $\beta$ percent of occasions the sample estimate of $m$ will be less than $m_0 + D$ and hence we will be led to accept the null hypothesis when it is in fact false. Thus, a Type II error will have been committed.

Figure 4.18 Probability of a Type II Error

It should be obvious from Figures 4.17 and 4.18 that we can change the values of $\alpha$ and $\beta$ by changing the critical region $D$. Thus increasing $D$ will reduce $\alpha$ but will at the same time increase $\beta$. Thus, we can make fewer Type I errors at the expense of making more Type II errors. Alternatively, decreasing $D$ will increase $\alpha$ but reduce $\beta$. Importantly, no adjustment of $D$ is possible which will simultaneously reduce both $\alpha$ and $\beta$. All one can do by changing $D$ (i.e. changing the decision rule by which we accept or reject the null hypothesis) is to trade off the frequency of Type I and Type II errors.
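The trade-off can be made concrete by computing both error probabilities for the decision rule "reject the null hypothesis if the sample mean exceeds m0 + D". The sketch below assumes normally distributed sample means, as in Figures 4.17 and 4.18; the function name and the numerical values (a difference d of 2.0 and a standard error of 0.6) are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

def error_probabilities(D, d, se):
    """Return (alpha, beta) for the rule: reject H0 if the sample mean > m0 + D.

    alpha: chance the sample mean exceeds m0 + D when the true mean is m0.
    beta:  chance the sample mean falls below m0 + D when the true mean is m0 + d.
    """
    unit_normal = NormalDist()
    alpha = 1.0 - unit_normal.cdf(D / se)
    beta = unit_normal.cdf((D - d) / se)
    return alpha, beta

# Hypothetical values: d = 2.0, standard error of the mean = 0.6.
for D in (0.5, 1.0, 1.5):
    alpha, beta = error_probabilities(D, d=2.0, se=0.6)
    print(f"D = {D:.1f}: alpha = {alpha:.3f}, beta = {beta:.3f}")
```

As D grows from 0.5 to 1.5, alpha falls and beta rises; at D = d/2 the two are equal.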

One can reduce $\beta$, without affecting $\alpha$, by increasing $d$. Thus, if one wishes to test for a larger difference between the null hypothesis and the specific alternative hypothesis, then one can reduce the probability of making a Type II error. However, specifying a large value of $d$ makes the test procedure of little value, since it is then incapable of detecting small differences between the null and alternative hypotheses. All that the test then says is that if a large difference in a parameter is detected then we can be fairly sure that such a difference is real.

The only way in which $\alpha$ and $\beta$ can be simultaneously reduced is by increasing the sample size $n$. This is because, from equation (4.3), increasing the sample size will reduce the standard error (i.e. the standard deviation of the distributions shown in Figures 4.17 and 4.18). Reducing the standard error of each distribution will obviously reduce the area under the curve lying in the tails of the distribution to the right of the critical region line ($m = m_0 + D$) for $\alpha$ and to the left of the critical region line for $\beta$. The question then remains as to how much the sample size must be increased in order to reduce both $\alpha$ and $\beta$ to acceptable levels.
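The effect of sample size can be illustrated by holding the critical region line fixed and letting the standard error shrink with n. In this sketch the standard error is taken as the standard deviation divided by the square root of n (no finite population correction), the function name is illustrative, and the values of the standard deviation, D and d are hypothetical; D must lie between 0 and d for both tails to shrink.

```python
import math
from statistics import NormalDist

def error_probabilities_for_n(n, sigma, D, d):
    """Return (alpha, beta) for sample size n with the line m0 + D held fixed."""
    se = sigma / math.sqrt(n)                  # standard error falls as n grows
    unit_normal = NormalDist()
    alpha = 1.0 - unit_normal.cdf(D / se)      # right tail under H0
    beta = unit_normal.cdf((D - d) / se)       # left tail under H1: m = m0 + d
    return alpha, beta

# Hypothetical values: standard deviation 4.0, D = 1.0, d = 2.0.
for n in (25, 50, 100):
    alpha, beta = error_probabilities_for_n(n, sigma=4.0, D=1.0, d=2.0)
    print(f"n = {n:>3}: alpha = {alpha:.4f}, beta = {beta:.4f}")
```

Both probabilities fall together as n increases, which no adjustment of D alone can achieve.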

In testing a statistical hypothesis, the significance level ($\alpha$) is normally controlled by the investigator, while $\beta$ is controlled by using an appropriate sample size. Given an estimate of the standard error, the critical region ($D$) will automatically be determined as a consequence of meeting the requirements of attaining a fixed level of $\alpha$. For the one-tailed test where:

$$H_0 : m = m_0$$

$$H_1 : m = m_0 + d \quad \text{or} \quad H_1 : m = m_0 - d \tag{4.19}$$

it can be shown (Walpole and Myers, 1978) that the required sample size to meet the requirements of $\alpha$ and $\beta$ is given by:

$$n = \frac{(z_\alpha + z_\beta)^2 \sigma^2}{d^2} \tag{4.20}$$

where $z_\alpha$, $z_\beta$ = critical values from standard normal distribution tables at levels of significance $\alpha$ and $\beta$ (see Appendix A).

$\sigma^2$ = population parameter variance.

$d$ = desirable detectable difference in hypotheses.

In the case of a two-tailed test of the form:

$$H_0 : m = m_0$$

$$H_1 : m = m_0 + d \tag{4.21}$$

then the minimum sample size is given by:

$$n = \frac{(z_{\alpha/2} + z_\beta)^2 \sigma^2}{d^2} \tag{4.22}$$
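Equations 4.20 and 4.22 differ only in the critical value used for the significance level. A minimal sketch follows, with an illustrative function name, critical values taken from the standard library's `NormalDist` rather than from Appendix A, and hypothetical inputs (a standard deviation of 4, a detectable difference of 2, and both error probabilities set to 0.05).

```python
import math
from statistics import NormalDist

def one_sample_size(sigma, d, alpha, beta, two_tailed=False):
    """Sample size from equation 4.20 (one-tailed) or 4.22 (two-tailed)."""
    tail = alpha / 2 if two_tailed else alpha
    z_alpha = NormalDist().inv_cdf(1 - tail)    # upper-tail critical value
    z_beta = NormalDist().inv_cdf(1 - beta)
    return (z_alpha + z_beta) ** 2 * sigma ** 2 / d ** 2

# Hypothetical inputs: sigma = 4, d = 2, alpha = beta = 0.05.
one_tailed = one_sample_size(4.0, 2.0, 0.05, 0.05)
two_tailed = one_sample_size(4.0, 2.0, 0.05, 0.05, two_tailed=True)
print(math.ceil(one_tailed), math.ceil(two_tailed))
```

The two-tailed test always needs the larger sample, because the significance level is split between the two tails and the critical value is correspondingly larger.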

The discussion so far has concentrated on testing the results obtained from one sample survey ($m$) against a known benchmark ($m_0$). In many cases, however, it is useful to compare the results obtained from two sample surveys (such as before-and-after studies), or to compare the results obtained from two groups within the same sample survey (e.g. car ownership rates for two different socio-economic groups in a population). In such cases, it is necessary to determine the sample size required for each of the surveys in order to obtain pre-determined values of $\alpha$ and $\beta$. Assuming that the sample size for each survey, or for each sub-population within a single survey, is the same, then for a one-tailed test the required sample size is:

$$n = \frac{(z_\alpha + z_\beta)^2 (\sigma_1^2 + \sigma_2^2)}{d^2} \tag{4.23}$$

and for a two-tailed test, the required sample size is:

$$n = \frac{(z_{\alpha/2} + z_\beta)^2 (\sigma_1^2 + \sigma_2^2)}{d^2} \tag{4.24}$$

where $\sigma_1^2$, $\sigma_2^2$ = population parameter variances for each of the samples.

Typically, in before-after surveys where one is attempting to detect an improvement (or degradation) following a change in the transport system, one is concerned only with a one-tailed test. Also, since the same population is being surveyed on both occasions, it is usually safe to assume that $\sigma_1^2 = \sigma_2^2 = \sigma^2$. Equation (4.23) then reduces to:

$$n = \frac{2 (z_\alpha + z_\beta)^2 \sigma^2}{d^2} \tag{4.25}$$

Three implications of this equation, while obvious on reflection, are worth noting. First, the larger the values of $z_\alpha$ and/or $z_\beta$, the larger the required sample size. Large values of $z$ will occur if $\alpha$ and/or $\beta$ are small. Thus the smaller the probabilities that one wishes to accept of making either Type I or Type II errors, the larger the required sample size. For example, if one wishes to reduce the values of both $\alpha$ and $\beta$ from 10% to 5%, then, all other things being equal, the required sample size will increase by a factor of 1.64. Second, as noted when estimating sample sizes for parameter estimation, the required sample size is directly proportional to the population variance of the parameter to be tested. Third, the sample size is inversely proportional to the square of the difference which one would like to detect in the parameter in the before-after situation. Halving the desired detectable difference will result in a fourfold increase in the required sample size.
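The first and third implications can be checked directly from equation 4.25. In this sketch the function name is illustrative and the standard deviation and detectable difference are hypothetical, since both cancel out of the ratios. Critical values come from the standard library's `NormalDist`; with these unrounded values the first ratio is about 1.65, in line with the factor of 1.64 obtained from two-decimal table values.

```python
from statistics import NormalDist

def before_after_sample_size(sigma, d, alpha, beta):
    """Equation 4.25: sample size per survey, one-tailed test, equal variances."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(1 - beta)
    return 2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / d ** 2

base = before_after_sample_size(sigma=4.0, d=2.0, alpha=0.10, beta=0.10)
tighter = before_after_sample_size(sigma=4.0, d=2.0, alpha=0.05, beta=0.05)
halved = before_after_sample_size(sigma=4.0, d=1.0, alpha=0.10, beta=0.10)

print(round(tighter / base, 2))   # 1.65: effect of moving from 10% to 5%
print(round(halved / base, 2))    # 4.0: effect of halving the difference d
```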

To illustrate the use of the above equations, consider that a study wished to examine household public transport trip generation rates in an area before and after the introduction of a new public transport system. Because it would be unrealistic to expect a sample survey to detect a change of any size (no matter how small), it is necessary to have an idea of the expected change ($d$) and to test whether this change has been realised. Assume that the initial weekly trip rate by public transport in the area is estimated to be 8 trips per household, with a standard deviation of 4 trips per household, and that we wish to test whether this rate has increased by 25% (i.e. by 2 trips per household), as expected. For no good reason, other than convention, let us assume that $\alpha = \beta = 0.05$. The minimum sample size for each survey would then be given by:

$$n = \frac{2 (z_\alpha + z_\beta)^2 \sigma^2}{d^2} = \frac{2 (1.645 + 1.645)^2 \cdot 4^2}{2^2} = 87 \tag{4.26}$$
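The arithmetic of equation 4.26 can be retraced as follows. The function name is illustrative, the critical values are entered as the tabulated 1.645, and the value 4 is treated as the standard deviation, following the substitution of 4² for the variance in the equation.

```python
import math

def before_after_sample_size(z_alpha, z_beta, sigma, d):
    """Equation 4.25 with the critical values supplied directly."""
    return 2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / d ** 2

initial_rate = 8.0            # weekly public transport trips per household
d = 0.25 * initial_rate       # expected increase of 25%, i.e. 2 trips
n = before_after_sample_size(1.645, 1.645, sigma=4.0, d=d)
print(round(n, 1), math.ceil(n))   # 86.6 87
```

Note that 87 households are needed in each of the two surveys, not in total.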

Further examples of sample size calculations for hypothesis testing may be found in Skelton (1982).

As an example of a slightly different situation in which sample surveys might be used to test hypotheses, consider the following situation. The suburb of Fairfield has recently had a new free minibus service established to serve residents of that suburb. The residents of nearby Glossodia, on hearing of this, petition their transport authority for establishment of a similar service in their area. They are informed that Fairfield received the service only because of the relatively high number of households in that suburb which do not own a car. The Glossodia residents reply that their suburb has even more non-car-owning households than Fairfield, and therefore they too deserve a free minibus service. They offer to back up this claim by conducting a survey of households in Glossodia and Fairfield to show the difference in car-ownership. The transport authority agrees to provide a free bus service for Glossodia if it can be shown, by an unbiased sample survey, that Glossodia has more non-car-owning households than Fairfield. However, to ensure that the claim is fully justified, the transport authority insists that the difference in the proportion of non-car-owning households be at least 5 percentage points at the 5% significance level. The residents agree to this, perhaps after some bartering, and proceed about the task of designing the survey (obviously with professional help).

The first thing which must be done is to define the hypotheses to be tested. Since the residents wish to prove that non-car-ownership is higher in Glossodia than in Fairfield, they choose as their null hypothesis the idea that the non-car-ownership is equal in both areas. That is:

$$ H_0 : p_1 = p_2 \tag{4.27} $$

where $p_1$ = proportion of households in Glossodia with no car

$p_2$ = proportion of households in Fairfield with no car

The alternative hypothesis, which they would like to prove, can be stated as:

$$ H_1 : p_1 > p_2 \tag{4.28} $$

However, to enable calculation of the power $(1 - \beta)$ of the test, it is necessary to state the alternative in a specific manner such as:

$$ H_1 : p_1 = p_2 + d \tag{4.29} $$

where d = expected difference in car-ownership.

The second step is to agree on the significance and power of the test to be performed. To the residents, this means agreeing on how certain they would like to be when drawing conclusions from the sample survey data. Initially, the residents may say that they want to be 100% certain of making the right decision. However, it is then explained to them, by their consultants, that to be 100% certain they would need to survey all households in both areas. Assuming that each area is quite large, the cost of such a survey would be quite prohibitive: they would be able to buy several minibuses of their own for less than the cost of the survey! It is explained to them that if they are willing to accept being wrong with a certain small probability, then the cost of the survey could be reduced quite substantially. They agree that this seems reasonable and therefore set about determining how often they would be prepared to be wrong.

They must set two probabilities: one each for Type I and Type II errors. To them, a Type I error means accepting the conclusion that they do have more non-car-owning households than Fairfield when it is, in fact, not the case. A Type II error means accepting that Glossodia does not have more non-car-owning households than Fairfield, and hence they will not get the free bus service, even though in reality they do have more non-car-owning households and should have got the free bus. Quite naturally the Glossodia residents are not very concerned about Type I errors; from their point of view they would like them to happen as often as possible. However, the transport authority has wisely constrained them in this regard by stipulating a 5% level of significance (i.e. $\alpha = 0.05$). In fact, the transport authority has constrained them more than they probably realise, because as well as stipulating the probability of Type I errors they have also stipulated the size of the critical region for the test (i.e. $D = 0.05$). By doing this, the transport authority has effectively predetermined the sample size needed for the surveys in each area, as will now be shown.

During their discussion with the residents, the authority let it be known that the proportion of non-car-owning households in Fairfield was 20%. Given this, the variance in this proportion within Fairfield would be given by:

$$ s^2 = p_2 (1 - p_2) = 0.20 \, (1 - 0.20) = 0.16 \tag{4.30} $$

Since there is expected to be no drastic difference in car-ownership between the two areas it would be reasonably safe to assume the same variance for Glossodia.

If Type I errors are limited to 5%, the critical region is set at 0.05 and the variance of the difference in car ownership is estimated to be 0.32 ($= 2s^2$) then, as seen by reference to Figure 4.17, the sample size is fully determined by:

$$ n = \frac{2 s^2}{\left(\text{s.e.}(p_1 - p_2)\right)^2} = \frac{2 s^2 \, z_\alpha^2}{D^2} = \frac{0.32 \, (1.645^2)}{0.05^2} = 346 \tag{4.31} $$
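As a check on equation (4.31), the calculation can be sketched as follows. The function name `sample_size_fixed_critical_region` is hypothetical, and the $40 cost per completed survey is the illustrative figure used in the example.

```python
from statistics import NormalDist


def sample_size_fixed_critical_region(p, critical_diff, alpha=0.05):
    """Sample size per area when alpha and the critical difference D are both fixed (eq. 4.31)."""
    s2 = p * (1 - p)                # variance of a proportion, equation (4.30)
    unit_var_diff = 2 * s2          # same variance assumed in both areas
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # one-tailed, 1.645 for alpha = 0.05
    return unit_var_diff * z_alpha ** 2 / critical_diff ** 2


n_each = sample_size_fixed_critical_region(p=0.20, critical_diff=0.05)
households_per_area = round(n_each)            # 346, as in the text
total_cost = 2 * households_per_area * 40      # two areas at $40 per completed survey
print(households_per_area, total_cost)
```

Note that the required sample size does not depend on the Type II error probability here; once the significance level and the critical difference are stipulated, nothing is left to choose.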

Since surveys of this size would need to be carried out in both areas, a total of 692 households must be surveyed. At a cost per completed survey of (say) $40 this represents a total cost of $27,680. This appears to be quite a high outlay in order to get a free minibus and hence the residents begin to question whether it is worth doing the survey at all. To help in making this decision, it is useful to calculate the probability of the residents making a Type II error, i.e. accepting that Glossodia does not have more non-car-owning households than Fairfield, even though in reality they do and should have got the free bus.

To perform this calculation, it is necessary for the residents to specify a value of the expected difference in car-ownership between the two areas so that a specific alternative hypothesis can be formulated as per equation (4.29). Obviously if the difference is confidently expected to be quite large then it may still be worthwhile spending the money on the survey since the results are highly likely to be favourable to Glossodia residents. On the other hand, if they do not expect the difference to be much more than the 5% stipulated by the transport authority then the chances of achieving a favourable result are diminished. The chances of not being able to prove that Glossodia deserves the free bus, when in fact it does, can be expressed in terms of $\beta$. The variation in $\beta$ as a function of the expected difference in the proportion of non-car-owning households is shown in Figure 4.19.

Figure 4.19 Variation in $\beta$ as a function of d

As the expected difference in the proportion of non-car-owners in each area becomes greater, the probability of an unfavourable result for the Glossodia residents decreases. Assume that after consultation, and perhaps some reference to existing data sources, the residents decide that the most likely difference in the proportion of non-car-owning households in Fairfield and Glossodia is 0.07, i.e. given that the proportion of households without cars in Fairfield is 0.20, the expected value for Glossodia is 0.27. In this case, with a sample size of 346 households in each area, there is a probability of 25% (from Figure 4.19, $\beta = 0.25$) that the survey will show a difference less than 0.05 even though the true difference may well be 0.07. Thus there is a 25% chance that the free bus will not be provided when in fact it should be provided. The question remains as to whether this probability is too high compared to the cost of the survey. Using simple expected utility theory, the expected gain from conducting the survey will be given by:

Expected Gain = (probability of getting bus)(value of bus) - survey cost

= (0.75)($40,000 (say)) - 27,680

= $2,320 (4.32)

In such a situation, it is just worthwhile for the residents to go ahead with their survey; on the balance of probabilities they should come out ahead, provided that their assumptions along the way have been accurate.
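The Type II error probability and the expected gain in equation (4.32) can be reproduced approximately with the sketch below. It assumes a normal approximation for the difference in sample proportions and the common variance of 0.16 used in the example; the function names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_probability(true_diff, n, s2=0.16, critical_diff=0.05):
    """Probability that the observed difference falls below the critical difference."""
    se_diff = sqrt(2 * s2 / n)  # standard error of the difference in proportions
    return NormalDist().cdf((critical_diff - true_diff) / se_diff)


def expected_gain(prob_success, bus_value, survey_cost):
    """Expected gain from conducting the survey, as in equation (4.32)."""
    return prob_success * bus_value - survey_cost


beta = type_ii_probability(true_diff=0.07, n=346)
print(round(beta, 3))  # close to the value of 0.25 read from Figure 4.19

gain = expected_gain(prob_success=0.75, bus_value=40_000, survey_cost=27_680)
print(gain)
```

Evaluating `type_ii_probability` over a range of expected differences traces out a curve of the kind shown in Figure 4.19.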

The foregoing example has shown the application of decision analysis to sample survey design (in this case to the decision of whether to proceed with a survey or not). It has also shown very clearly that the probabilities of Type I and Type II errors cannot be adequately specified without a consideration of the likely costs involved in committing each type of error. For example, if the value of the bus had been $60,000 in the previous example, then the value of $\beta$ could have been as high as about 0.54 (the point at which the expected gain falls to zero) and the decision to conduct the survey would still have been correct. While fraught with numerous practical difficulties, the concepts of decision analysis can be most useful in bringing a more rational basis to sample survey design (see Ramsay and Openshaw (1980), and Jessop and Gilbert (1981) for further discussion of this matter).

4.7 VARIANCE ESTIMATION TECHNIQUES

In sample surveys, the analyst is usually interested not only in measuring the mean of a variable for the sample in question, but in estimating the value of this mean for the total population from which the sample was drawn. If the sample in question is a random sample from the population, then the population mean may be taken to be the same as the sample mean. If, however, a new sample was drawn from the population (using the same methods), one would not expect the two sample means to be the same. Therefore, in stating that the population mean is the same as the sample mean, one is really stating that the population mean is best represented by the sample mean. However, repeated sampling will result in a distribution of estimates of the mean, and the standard deviation of this distribution is known as the standard error of the estimate of the mean.

4.7.1 Variability in Simple Random Samples

For simple random samples, the standard error of the mean may be calculated easily by the following equation:

$$ \text{s.e.}(m) = \sqrt{\frac{S^2}{n}} = \frac{S}{\sqrt{n}} \tag{4.33} $$

where s.e.(m) = standard error (of the mean)

S = standard deviation of the variable in the population

n = sample size

Thus larger samples produce lower standard errors (i.e. more precise estimates of the mean), while variables with less inherent variability can be estimated more precisely.

In real-life surveys, however, there are often considerable deviations from the ideas of simple random sampling. Such complex surveys often include design refinements such as stratification, multi-stage sampling and the use of clusters as sampling units. Whilst it is theoretically possible to extend equation (4.33) to account for these complexities, such extensions often become cumbersome, if not mathematically intractable. It is therefore desirable to use other methods to estimate the degree of sampling error in a sample estimate of the mean.

4.7.2 Design Effects

A particularly simple, but often used, method of estimating sampling error for complex survey designs is the use of "design effects". A design effect (Kish, 1965) is a factor which relates the variance in a sample design to the variance which would have been obtained if the sample had been chosen by simple random sampling. The design effect can be greater than or less than one, depending on whether the sample design tends to increase or decrease the sampling error. An example of a sample design which decreases sampling error is stratified sampling; designs which increase sampling error include multi-stage sampling and cluster sampling.

The following calculations, based on an example described by Moser and Kalton (1979), show the effect of stratification on the sampling error obtained from a survey and also show the calculation of the design effect for this sampling plan. Consider a sample survey of one person from a household in different municipalities within a city. The objective of the survey was to determine the proportion of households favouring the upgrading of an airport. Since it was known beforehand that households in different areas of the city would have different views on the upgrading (e.g. those near the airport would prefer no upgrading, while those living away from the airport in more affluent suburbs would prefer the upgrading), the sample was chosen such that each municipality within the city was represented in the sample in proportion to the number of households within each municipality. The results of the surveys are shown in Table 4.2, together with the proportion in each municipality favouring the upgrade ($p_i$), and a product term ($n_i p_i (1 - p_i)$) required for the calculation of the standard error for a stratified sample.

Table 4.2 Results of the Stratified Sample Airport Upgrading Survey

Municipality   Households in   Number Favouring   p_i     n_i p_i (1 - p_i)
               Sample (n_i)    Upgrade
Dryden              95              86            0.905        8.170
Freeville           43              22            0.512       10.736
Lansing             25              18            0.720        5.040
Montrose            39              31            0.795        6.355
Waverley            32              20            0.625        7.500
Belgrave            66              33            0.500       16.500
TOTAL              300             210                        54.301

The proportion in the city in favour of the airport can be obtained by calculating a weighted average of the proportions in each of the strata (municipalities). Since the same sampling rate was adopted in each stratum, this average can be obtained as the proportion of the total number favouring the upgrade divided by the total sample size, i.e. 70% (210/300) of city households can be seen to be in favour of the upgrade. If these results had been obtained by a simple random sampling plan, then the standard error of the estimate of this proportion (s.e.($p_{srs}$)) could be calculated by the use of equation 4.11. Thus, assuming that the sample size is a small proportion of the total households in the city:

$$ \text{s.e.}(p_{srs}) = \sqrt{\frac{0.700 \times 0.300}{300}} = 0.0265 \tag{4.34} $$

However, it is known that the estimate of standard error for a proportionate stratified random sample is given by:

$$ \text{s.e.}(p_{prop}) = \sqrt{\frac{\sum n_i p_i (1 - p_i)}{n^2}} = \sqrt{\frac{54.301}{300^2}} = 0.0246 \tag{4.35} $$

Comparison of the above standard errors shows the reduction gained by the use of stratified random sampling. This gain can be expressed in terms of the design effect by calculating the ratio of the sampling error variances:

$$ \text{Design Effect} = \frac{(0.0246)^2}{(0.0265)^2} = 0.862 \tag{4.36} $$

This design effect of 0.862 indicates that the use of stratified sampling brings about a reduction of 14% in the variance of the estimate of the mean, compared to what would have been obtained using simple random sampling. Put in more practical terms, it means that a simple random sample of 348 households (= 300/0.862) would be needed to give the same level of accuracy as provided by this stratified sample of 300 households. The design effect can therefore be interpreted as a measure of sample size savings by the use of the more complex sample design (i.e. we can use a sample which is only 86.2% of the simple random sample if we are willing to use a stratified sample design). We can then trade off this reduction in sample size against the possible increase in cost per sampling unit to determine which is the most efficient way of obtaining a sample of specified precision.
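The calculations in equations (4.34) to (4.36) can be reproduced from the counts in Table 4.2. The sketch below works from the unrounded proportions, so the product terms differ very slightly from the tabulated values; the variable names are illustrative.

```python
from math import sqrt

# (households in sample, number favouring upgrade) for each municipality, Table 4.2
strata = {
    "Dryden": (95, 86),
    "Freeville": (43, 22),
    "Lansing": (25, 18),
    "Montrose": (39, 31),
    "Waverley": (32, 20),
    "Belgrave": (66, 33),
}

n = sum(n_i for n_i, _ in strata.values())
p = sum(f_i for _, f_i in strata.values()) / n

# Simple random sample, ignoring the finite population correction (eq. 4.34)
se_srs = sqrt(p * (1 - p) / n)

# Proportionate stratified sample (eq. 4.35)
product_sum = sum(n_i * (f_i / n_i) * (1 - f_i / n_i) for n_i, f_i in strata.values())
se_prop = sqrt(product_sum / n ** 2)

# Design effect as a ratio of variances (eq. 4.36)
design_effect = se_prop ** 2 / se_srs ** 2
equivalent_srs_size = n / design_effect

print(round(se_srs, 4), round(se_prop, 4), round(design_effect, 3), round(equivalent_srs_size))
```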

While the above example has used the estimation of a proportion as the basic rationale for the survey, the same arguments apply when one is attempting to estimate the mean of a continuous variable. In this case, the standard error of the sample mean based on a small proportionate stratified random sample of size n is:

$$ \text{s.e.}(m_{prop}) = \sqrt{\frac{\sum n_i S_i^2}{n^2}} \tag{4.37} $$

Given this demonstration of the calculation of a design effect for a stratified random sample, the question arises as to whether there is a generally applicable value of the design effect which could be used for all samples of this type. Unfortunately, the answer to this question is in the negative. While it is generally true that stratification will reduce the standard error of estimates, the extent to which it reduces it will vary from case to case. As noted in Section 4.4.2, there is no gain in precision if the stratification of the sample is based on a variable which has no connection with the parameters under investigation (e.g. estimating trip rates for people born on different days of the week). The gain provided by stratified sampling can be expressed as the difference in variances of the estimate, as given by:

$$ \text{Var}(m_{srs}) - \text{Var}(m_{prop}) \approx \frac{\sum n_i (m_i - m)^2}{n^2} \tag{4.38} $$

where $m_i$ is the population mean for the ith stratum and $m$ is the overall population mean. This expression shows that if all the strata have the same mean, then there is no gain from stratification (i.e. the design effect is equal to one). As the differences between the stratum means increase, stratification will have a greater effect on reductions in variance (i.e. the design effect will become smaller and smaller). The size of the design effect will therefore depend on the extent to which the stratification variables are able to make significant differences in the strata means for the parameters in question. An estimate of the size of the design effect can be made before the survey if you are able to estimate the likely means within each stratum. Such information may come from the sources identified in Section 4.6.1.
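Equation (4.38) can be illustrated with the airport upgrading data of Table 4.2, where the stratum proportions play the role of the stratum means. This is a sketch only; it uses the sample proportions in place of the (unknown) population values and ignores finite population corrections.

```python
# (households in sample, number favouring upgrade) for each municipality, Table 4.2
strata = [(95, 86), (43, 22), (25, 18), (39, 31), (32, 20), (66, 33)]

n = sum(n_i for n_i, _ in strata)
p = sum(f_i for _, f_i in strata) / n

var_srs = p * (1 - p) / n
var_prop = sum(n_i * (f_i / n_i) * (1 - f_i / n_i) for n_i, f_i in strata) / n ** 2

# Right-hand side of equation (4.38): spread of the stratum means about the overall mean
between_strata = sum(n_i * (f_i / n_i - p) ** 2 for n_i, f_i in strata) / n ** 2

print(var_srs - var_prop, between_strata)
```

The two printed values agree, and most of the gain comes from Dryden and Belgrave, the two municipalities whose proportions lie furthest from the city-wide figure of 0.70.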

The use of complex sample designs does not always reduce the standard error of the estimates. Whenever cluster sampling (including multi-stage sampling) is employed, then the design effect will generally be greater than one. As an illustration, consider once again the data presented in Table 4.2, and examine the results obtained from the municipality of Belgrave. The number of households in this municipality was 660, of which 10% were chosen for the survey. In the discussion so far, it has been assumed that these households were chosen in a simple random sampling manner. However, consider for the following discussion that they were chosen as clusters, rather than individually. There may have been many practical reasons for this, but the usual one is to reduce field costs when performing the survey. For example, the regular grid street pattern within Belgrave may have conveniently divided the households consistently into groups of six along one side of a street between adjacent intersections. To reduce travel costs and time between interviews at households, a decision may have been made to select 11 of these clusters and then perform the interview with all heads of households in the selected clusters (for the moment we are assuming a 100% response rate to this survey). Thus we are selecting 10% of the clusters rather than 10% of the individual households to obtain our sample of 66 households. Assume that the results obtained from the 11 clusters in Belgrave are as shown in Table 4.3, together with some values required for later calculations of standard errors.

Table 4.3 Results of the Cluster Sample Airport Upgrading Survey

Cluster   Households in   Number Favouring   p_i     (p_i - p)^2
          Cluster (n_i)   Upgrade
1              6               3             0.500     0.000
2              6               5             0.833     0.111
3              6               2             0.333     0.028
4              6               3             0.500     0.000
5              6               4             0.667     0.028
6              6               1             0.167     0.111
7              6               4             0.667     0.028
8              6               2             0.333     0.028
9              6               6             1.000     0.250
10             6               1             0.167     0.111
11             6               2             0.333     0.028
TOTAL         66              33                       0.723

The proportion favouring the airport upgrading in the entire Belgrave sample is obtained by dividing the number favouring it by the total number of households in the sample. To estimate the standard error of this estimate, it is useful to think of the survey not as a random sample of households but as a random sample of clusters (i.e. change the sampling unit from household to cluster as described in Section 4.2), where the parameter being measured is the average proportion of households within clusters favouring the airport upgrading. Since it is a simple random sample of clusters, each with a value of $p_i$, then the standard error of the mean of $p_i$ can be obtained by use of a modification of equation 4.2, such that:

$$ \text{s.e.}(p_{i,clust}) = \sqrt{\frac{M - m}{M} \cdot \frac{S_c^2}{m}} \tag{4.39} $$

where we have drawn a sample of m clusters from a population of M clusters, and where $S_c^2$ is the variance of the cluster values $p_i$ (i.e. $S_c$ is their standard deviation). This variance may be estimated from the sample results as:

$$ S_c^2 = \frac{1}{m - 1} \sum_{i=1}^{m} (p_i - p)^2 \tag{4.40} $$

In this case, m = 11, M = 110, and $S_c^2$ = 0.723/10 = 0.072. Hence,

$$ \text{s.e.}(p_{i,clust}) = \sqrt{\frac{99}{110} \cdot \frac{0.072}{11}} = \sqrt{0.0059} = 0.077 \tag{4.41} $$

If the results for Belgrave had been based on a simple random sample of households, then the standard error would have been given by:

$$ \text{s.e.}(p_{i,srs}) = \sqrt{\frac{N - n}{N} \cdot \frac{p (1 - p)}{n - 1}} = \sqrt{\frac{594}{660} \cdot \frac{0.5 \, (1 - 0.5)}{65}} = 0.059 \tag{4.42} $$

The design effect for this cluster sample of households within Belgrave is therefore given as:

$$ \text{Design Effect} = \frac{(0.077)^2}{(0.059)^2} = 1.70 \tag{4.43} $$
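The cluster sample calculations in equations (4.39) to (4.43) can be reproduced from the counts in Table 4.3. The sketch below keeps full precision throughout, so the design effect comes out marginally above the rounded value of 1.70; variable names are illustrative.

```python
from math import sqrt

# Number of households favouring the upgrade in each of the 11 sampled clusters (Table 4.3)
favouring = [3, 5, 2, 3, 4, 1, 4, 2, 6, 1, 2]
cluster_size = 6
M = 110            # clusters in Belgrave (660 households / 6 per cluster)
N = 660            # households in Belgrave

m = len(favouring)
p_i = [f / cluster_size for f in favouring]
p = sum(p_i) / m   # clusters are of equal size, so this is also the household proportion

# Variance of the cluster proportions (eq. 4.40) and cluster standard error (eq. 4.39)
sc2 = sum((x - p) ** 2 for x in p_i) / (m - 1)
se_clust = sqrt((M - m) / M * sc2 / m)

# Standard error had the 66 households been a simple random sample (eq. 4.42)
n = m * cluster_size
se_srs = sqrt((N - n) / N * p * (1 - p) / (n - 1))

design_effect = se_clust ** 2 / se_srs ** 2
print(round(se_clust, 3), round(se_srs, 3), round(design_effect, 2))
```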

Such a value for the design effect in the range from 1 to 2 is typical for a cluster sample, although by no means universal. Just as the design effect for a stratified random sample depends on the degree to which the selected strata are homogeneous, so too the design effect for cluster samples depends on the degree of homogeneity within the cluster. It is more than likely that members of a cluster are more like each other than they are like others outside the cluster. In our airport upgrading example, it is likely that members of households living on the same side of the street on the same block may well have discussed the issue of the airport upgrading, or for many other reasons may have come to a similar conclusion based on where they live. The degree to which the members of a cluster are similar is measured by the intra-cluster correlation. This correlation measure (r) can be integrated into our considerations of design effect by noting that for a cluster sample with C members per cluster:

$$ \text{Var}_{clust} \approx \text{Var}_{srs} \, [1 + (C - 1) r] \tag{4.44} $$

Since the design effect is given as the ratio of the cluster sample variance to the simple random sample variance, then:

$$ \text{Design Effect} = [1 + (C - 1) r] \tag{4.45} $$

Several points arise from this equation which are worth noting:

(a) if C = 1, that is if each cluster contains only one member, then the design effect is equal to one since there is essentially no clustering taking place, and the cluster sample degenerates to a simple random sample;

(b) if r = 0, then there is no intra-cluster correlation, in that the members are just as similar to members of the population at large as they are to members of their own cluster. In such a case, each cluster can be seen as just a random mini-sample of the population at large, and the summation of these clusters will simply be the same as a simple random sample, as shown by the calculated design effect being equal to one; and

(c) so long as the value of r is positive (and this is almost always the case), the design effect will be greater than one. The greater the similarity between members of a cluster, the higher will be the value of r, and hence the higher will be the design effect.

In the airport upgrading example, the design effect was calculated from the data as being equal to 1.70. Given a cluster size (C) of 6, this results in a value of r of +0.14. This is a typical value of intra-cluster correlation, although it can sometimes be far greater.
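Equation (4.45) and its inverse are simple enough to express as a pair of helper functions. The function names below are illustrative, not part of any standard library.

```python
def design_effect(cluster_size, r):
    """Design effect for a cluster sample with the given intra-cluster correlation (eq. 4.45)."""
    return 1 + (cluster_size - 1) * r


def intra_cluster_correlation(deff, cluster_size):
    """Intra-cluster correlation implied by an observed design effect (eq. 4.45 rearranged)."""
    return (deff - 1) / (cluster_size - 1)


r = intra_cluster_correlation(deff=1.70, cluster_size=6)
print(round(r, 2))  # 0.14, as in the airport upgrading example

# Points (a) and (b) above: no clustering or no correlation gives a design effect of one
print(design_effect(1, 0.5), design_effect(6, 0.0))
```

Because the design effect grows with (C - 1), even a modest intra-cluster correlation can produce a large design effect when clusters are large.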

The differences in design effect for stratified sampling and cluster sampling are worth noting. Both come about because of homogeneity within the strata or clusters, but the effect is in different directions. Thus greater homogeneity within strata results in a greater reduction in the design effect for stratified sampling, but greater homogeneity within clusters results in a greater increase in the design effect for cluster sampling. The difference lies in the fact that whereas we wish to include all the strata in a stratified sample, we wish to deliberately exclude some of the clusters in the cluster sample.

The concept of cluster sampling can be further extended, and the effects on sampling error mitigated to some extent, by employing the concept of multi-stage sampling in which a sample of sub-clusters is selected from within the originally selected clusters. In this way, the design effect can be reduced by spreading the final sample over a greater number of clusters. It is also possible to trade off the effects of clustering and stratifying by the use of a stratified multi-stage sample, although usually this still results in a design effect greater than one since the clustering effect on sampling error is generally greater than the effect of stratification. Further details on the calculation of design effects may be found in Moser and Kalton (1979).

4.7.3 Replicate Sampling

The use of design effects gives only a rough approximation to the expected standard error and would be used mainly when trying to estimate sample sizes before the data is collected. Once the data has been collected, however, one can employ the methods of replication to directly calculate standard errors. Drawing on the idea expressed earlier, whereby the standard error is simply the standard deviation of the distribution of estimates of the mean from repeated sampling, it is possible to draw a number of independent samples from the population thus enabling the standard error to be estimated directly. When two separate samples are drawn, this method is sometimes called a paired selection design (Kish, 1965). When more than two samples are drawn, this is often referred to as replicated or interpenetrating sampling (Deming, 1960). The idea of replication has been used in transport surveys in the 1974 Survey of Seat Belt Utilization conducted for Transport Canada, in a study reported by Ochoa and Ramsey (1986), and in travel surveys conducted in Australia by the Transport Research Centre (Richardson and Ampt, 1994; Richardson and Cuddon, 1994).

With replicated sampling, sample estimates of the mean (or for that matter any other parameter) can be calculated for each sub-sample, and the variation of these estimates is an indication of the sampling error which would have been obtained from the total sample. The calculated sampling error includes the effects of all elements of the sample design, and hence replication techniques can be used for sample designs of any degree of complexity.

The number of replications to be used is open to question. Choosing a few large sub-samples means that the variation within each sub-sample is well estimated, but the variation between sub-samples is based on only a few observations. On the other hand, choosing a large number of small sub-samples means that the variance within the sub-samples is based on only a small sample size. Somewhere a compromise needs to be struck, using a moderate number of moderately sized sub-samples. One guideline that is often adopted is that originally proposed by Deming (1960), whereby he recommends the use of ten sub-samples. This method has the added advantage that it is easy to choose the sub-samples by sub-dividing the entire sample based on the last digit of the case identification number, i.e. all households with identification numbers ending in "1" belong in sub-sample 1, all those with numbers ending in "2" belong in sub-sample 2 and so on.

If a total sample is selected by choosing r independent samples of equal size, and if $z_i$ is the estimate of a parameter for the ith sample, then the overall estimate of this parameter is given by:

$$ z = \frac{1}{r} \sum z_i \tag{4.46} $$

and the standard error of z is estimated by:

$$ \text{s.e.}(z) = \sqrt{\frac{s_z^2}{r}} \tag{4.47} $$

$$ = \sqrt{\frac{\sum (z_i - z)^2}{r (r - 1)}} \tag{4.48} $$

where $s_z$ is an estimate of the standard deviation of the sub-sample estimates of the parameter in question.

Suppose, for example, in the airport upgrading example introduced in the previous section that the total sample of 300 households was collected by means of 10 separate and random sub-samples with 30 households in each sample. The results from the 10 sub-samples are shown in Table 4.4.

Table 4.4 Results of the Replicated Sample Airport Upgrading Survey

Replication   Households in   Number Favouring   z_i     (z_i - z)^2
              Replicate       Upgrade
1                  30              18            0.600     0.0100
2                  30              21            0.700     0.0000
3                  30              23            0.767     0.0044
4                  30              18            0.600     0.0100
5                  30              25            0.833     0.0178
6                  30              19            0.633     0.0044
7                  30              24            0.800     0.0100
8                  30              19            0.633     0.0044
9                  30              22            0.733     0.0011
10                 30              21            0.700     0.0000
TOTAL             300             210                      0.0622

Given that the standard deviation of the $z_i$ values is calculated as 0.083, then the standard error of the mean may be estimated from equation (4.47) as:

$$ \text{s.e.}(z) = \sqrt{\frac{0.083^2}{10}} = 0.0263 \tag{4.49} $$

The standard error of the mean may also be estimated from equation (4.48) as:

$$ \text{s.e.}(z) = \sqrt{\frac{0.0622}{10 \, (10 - 1)}} = 0.0263 \tag{4.50} $$
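The replicate calculations in equations (4.46) to (4.50) can be reproduced from the counts in Table 4.4. The sketch below uses only the standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Number of households favouring the upgrade in each replicate of 30 (Table 4.4)
favouring = [18, 21, 23, 18, 25, 19, 24, 19, 22, 21]
replicate_size = 30

z_i = [f / replicate_size for f in favouring]
r = len(z_i)

z = mean(z_i)            # overall estimate, eq. (4.46)
s_z = stdev(z_i)         # sample standard deviation of the replicate estimates

se_from_sd = sqrt(s_z ** 2 / r)                                      # eq. (4.47)
se_from_sums = sqrt(sum((x - z) ** 2 for x in z_i) / (r * (r - 1)))  # eq. (4.48)

print(round(z, 3), round(s_z, 3), round(se_from_sd, 4), round(se_from_sums, 4))
```

Both routes give the same standard error, since equation (4.48) is simply equation (4.47) with the definition of the sample variance written out.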

This value of 0.0263 compares favourably with the value of 0.0265 estimated in equation (4.34) for the case of simple random sampling. Such agreement is not, however, necessary and simply reflects the fact that the assumptions underlying the analytical calculation of equation (4.34) appear to have been approximately met in this particular data set.

One obvious problem with this approach, however, is that by selecting a small number of independent samples, the estimate of the standard error could be, in itself, subject to considerable sampling error. This could be overcome to some extent by drawing a larger number of independent samples. However, if the size of the sub-samples were to remain at 30, this would result in an excessively expensive total sample. Conversely, if the total sample size was to remain constant at 300, then the size of the sub-samples would decrease and this may create problems within each sub-sample (e.g. with respect to the number of observations within stratification cells). In order to overcome the problem of sufficient replicates of sufficient size, several concepts of half-sample replication have been proposed as described below.

4.7.3.1 Half-Sample Replication

When working with a stratified sample, one can employ the idea of half-sample replications to generate a large number of "pseudo-replicates". The basic idea in such a scheme is to draw two independent samples within each stratum. By combining the half-samples within each stratum with the half-samples in the other strata, in all possible ways, it is possible to generate a large number of different sub-samples. This concept is also termed "inter-penetrating samples" by Deming (1960). For example, consider a stratified sampling procedure where two independent selections are made in each stratum, and where each stratum is assigned a weight W when estimating the population parameters from the strata values. Let the population and sample characteristics be denoted as follows (readers not interested in the following derivation may skip ahead to the conclusion following equation (4.60) without loss of meaning):

Table 4.5 Sample Data for Half-Sample Replication Example

Stratum   Weight   Population   Population   Sample          Sample
                   mean         variance     observations    mean
1         W_1      Y_1          S_1^2        y_11, y_12      y_1
2         W_2      Y_2          S_2^2        y_21, y_22      y_2
...       ...      ...          ...          ...             ...
h         W_h      Y_h          S_h^2        y_h1, y_h2      y_h
...       ...      ...          ...          ...             ...
L         W_L      Y_L          S_L^2        y_L1, y_L2      y_L

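An unbiased estimate, $y_{st}$, of the population mean $Y$ is:

$$ y_{st} = \sum_{h=1}^{L} W_h \, y_h \tag{4.51} $$

The ordinary sample estimate $v(y_{st})$ of the variance of the mean $V(y_{st})$ is given by:

$$ v(y_{st}) = \frac{1}{2} \sum_{h=1}^{L} W_h^2 \, s_h^2 \tag{4.52} $$

$$ = \frac{1}{4} \sum_{h=1}^{L} W_h^2 \, d_h^2 \tag{4.53} $$

where $d_h = (y_{h1} - y_{h2})$, and $s_h^2 = d_h^2 / 2$ is the estimate of the stratum variance $S_h^2$ obtained from the two observations in stratum h.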

Under these circumstances, a half-sample replicate is obtained by choosing one of $y_{11}$ and $y_{12}$, one of $y_{21}$ and $y_{22}$, ..., and one of $y_{L1}$ and $y_{L2}$. The half-sample estimate of the population mean, $y_{hs}$, is:

$$ y_{hs} = \sum_{h=1}^{L} W_h \, y_{hi} \tag{4.54} $$

where i is either 1 or 2 for each h. There are $2^L$ possible half-samples, and it is easy to see that the average of all half-sample estimates is equal to $y_{st}$. That is, for a randomly selected half-sample:

$$ E(y_{hs} \mid y_{11}, y_{12}, \ldots, y_{L1}, y_{L2}) = y_{st} \tag{4.55} $$

where $E(y_{hs} \mid y_{11}, y_{12}, \ldots, y_{L1}, y_{L2})$ is the expected value of $y_{hs}$ given a sample randomly drawn from the population consisting of $y_{11}, y_{12}, \ldots, y_{L1}, y_{L2}$.

If one considers the deviation of the mean determined by a particular half-sample, for example $y_{hs,1} = \sum (W_h \, y_{h1})$, from the overall sample mean, the result is obtained that:

$$ (y_{hs,1} - y_{st}) = \sum_{h=1}^{L} W_h \, y_{h1} - \frac{1}{2} \sum_{h=1}^{L} W_h (y_{h1} + y_{h2}) $$

$$ = \frac{1}{2} \sum_{h=1}^{L} W_h (y_{h1} - y_{h2}) $$

$$ = \frac{1}{2} \sum_{h=1}^{L} W_h \, d_h \tag{4.56} $$

In general, these deviations are of the form:

$$ (y_{hs} - y_{st}) = \frac{1}{2} \left( \pm W_1 d_1 \pm W_2 d_2 \ldots \pm W_L d_L \right) \tag{4.57} $$

where the deviation for a particular half-sample is determined by making an appropriate choice of a plus or minus sign for each stratum. In the example given above, each sign was taken as plus, indicating that the first of the two stratum values was used to form the half-sample. The squared deviation from the overall sample mean is, therefore, of the general form:

$$ (y_{hs} - y_{st})^2 = \frac{1}{4} \sum_{h=1}^{L} W_h^2 \, d_h^2 + \frac{1}{2} \sum_{h<k} \pm W_h W_k \, d_h d_k \tag{4.58} $$

where the plus or minus signs in the cross-product summation are determined by the particular half-sample that is used (plus signs correspond to using the first value in the stratum, minus signs indicate that the second value was used).

If the squared deviations of a half-sample estimate from the overall sample mean are summed over all possible half-samples, then it is possible to demonstrate that the cross-product terms appearing in the separate squared deviations cancel one another. Thus, for a randomly selected half-sample:

$$ E[(y_{hs} - y_{st})^2 \mid d_1, d_2, \ldots, d_L] = \frac{1}{4} \sum_{h=1}^{L} W_h^2 \, d_h^2 = v(y_{st}) \tag{4.59} $$

Since $v(y_{st})$ is known to be an unbiased estimate of the true variance $V(y_{st})$, if one takes expected values over repeated selections of the entire sample, we have the result that:

$$ E(y_{hs} - y_{st})^2 = \frac{1}{2} \sum_{h=1}^{L} W_h^2 \, S_h^2 = V(y_{st}) \tag{4.60} $$

Thus, although we know that the $2^L$ sub-samples are not independent, because they contain many common elements, it is possible to eliminate the effects of these covariances by taking the average variance across all of the possible half-sample combinations. This, however, poses a potentially serious computational problem. The number of possible sub-samples is known to be $2^L$, where $L$ is the number of strata. In a typical sample survey project, the number of stratification cells could easily be between 20 and 40. In this case, the total number of pseudo-replicates would be between $2^{20}$ and $2^{40}$ (i.e. between about 1 million and 1 trillion). Clearly, the task of calculating the variance estimates for several million pseudo-replicates would be excessive. What is needed is a method by which we can obtain most of the benefits of pseudo-replication without the excessive computational costs involved. Two methods have been proposed for this purpose.

The first method is to choose a random sample of pseudo-replicates from the available population of $2^L$. Clearly, taking a larger sample of pseudo-replicates will more nearly eliminate the effects of pseudo-replicate covariance. Techniques have been derived to determine appropriate sample sizes under these conditions. A more elegant technique, however, employs the concept of balanced half-sample replicates.
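The cancellation of the cross-product terms over the full set of half-samples, and the extra variability introduced when only a random subset is used, can be checked numerically when the number of strata is small. The following Python sketch is illustrative only: the stratum weights, the paired observations, the seed and all function names are invented for the example and are not taken from any survey package.

```python
import itertools
import random

# Hypothetical data: three strata, two observations per stratum.
WEIGHTS = [0.5, 0.3, 0.2]                           # W_h, summing to 1
PAIRS = [(10.0, 14.0), (20.0, 17.0), (31.0, 36.0)]  # (y_h1, y_h2)


def stratified_mean(weights, pairs):
    """Overall sample mean y_st = sum of W_h * (y_h1 + y_h2) / 2."""
    return sum(w * (y1 + y2) / 2 for w, (y1, y2) in zip(weights, pairs))


def half_sample_mean(weights, pairs, signs):
    """Half-sample mean; sign +1 uses y_h1 and sign -1 uses y_h2."""
    return sum(w * (y1 if s > 0 else y2)
               for w, (y1, y2), s in zip(weights, pairs, signs))


def mean_squared_deviation(weights, pairs, sign_rows):
    """Average of (y_hs - y_st)^2 over the supplied half-samples."""
    y_st = stratified_mean(weights, pairs)
    squared = [(half_sample_mean(weights, pairs, signs) - y_st) ** 2
               for signs in sign_rows]
    return sum(squared) / len(squared)


def direct_variance(weights, pairs):
    """v(y_st) = (1/4) * sum of W_h^2 * d_h^2, as in equation (4.59)."""
    return sum(w ** 2 * (y1 - y2) ** 2
               for w, (y1, y2) in zip(weights, pairs)) / 4


# All 2^L sign patterns, one per possible half-sample.
all_half_samples = list(itertools.product([1, -1], repeat=len(PAIRS)))
v_direct = direct_variance(WEIGHTS, PAIRS)
v_all = mean_squared_deviation(WEIGHTS, PAIRS, all_half_samples)

rng = random.Random(1)  # arbitrary seed, used only for repeatability
random_subset = rng.sample(all_half_samples, 4)
v_subset = mean_squared_deviation(WEIGHTS, PAIRS, random_subset)

print(f"direct v(y_st):        {v_direct:.4f}")
print(f"all {len(all_half_samples)} half-samples:    {v_all:.4f}")
print(f"random subset of 4:    {v_subset:.4f}")
```

Averaging over all eight half-samples reproduces $v(y_{st})$, whereas a random subset will generally differ from it because its cross-product terms need not cancel.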

4.7.3.2 Balanced Half-Sample Replication

As noted above, taking different samples of pseudo-replicates will introduce variability into the estimates of half-sample variance because of the covariance between the half-samples. These covariances are represented by the cross-product terms involving $d_h d_k$ in equation (4.58). These cross-product terms cancel one another over the entire set of $2^L$ half-samples, or when one uses an "infinite" number of half-sample replications. The question then arises as to whether one can choose a relatively small subset of half-samples for which these terms will also disappear. If this can be done, then the corresponding half-sample estimates of variance will contain all the information available from the total sample.

A simple example will show that it is possible to select a subset of half-sample replications that will have the desired property. Consider a three-strata situation with observations $(y_{11}, y_{12})$, $(y_{21}, y_{22})$ and $(y_{31}, y_{32})$. There are ($2^3$ =) 8 possible half-sample replicates. Now consider the following subset of four replicates:

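Table 4.6 Sample Data for Balanced Half-Sample Replication

Replicate   Stratum 1   Stratum 2   Stratum 3   Deviation from Mean $(y_{hs,i} - y_{st})$
1           $y_{11}$    $y_{21}$    $y_{31}$    $(1/2)(+W_1 d_1 + W_2 d_2 + W_3 d_3)$
2           $y_{11}$    $y_{22}$    $y_{32}$    $(1/2)(+W_1 d_1 - W_2 d_2 - W_3 d_3)$
3           $y_{12}$    $y_{22}$    $y_{31}$    $(1/2)(-W_1 d_1 - W_2 d_2 + W_3 d_3)$
4           $y_{12}$    $y_{21}$    $y_{32}$    $(1/2)(-W_1 d_1 + W_2 d_2 - W_3 d_3)$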

The signs of the separate terms in the deviations are determined by the definition of $d_h = (y_{h1} - y_{h2})$. It is, of course, immaterial how the two observations within a stratum are numbered originally. Once the numbering is set, however, as in the first replicate, it is maintained in determining the remaining replicates. If these deviations are squared, the first part of each expression is $W_1^2 d_1^2/4 + W_2^2 d_2^2/4 + W_3^2 d_3^2/4$, which is the desired estimate of variance. The second part of each expression contains the cross-product terms, and it can easily be checked that all these cross-product terms cancel when the squared deviations are added over the four replicates. This follows from the fact that the columns of the matrix of signs in the deviations are orthogonal to one another. Thus this set of balanced half-samples can be identified as:

+ + +
+ - -
- - +
- + -

where a plus sign indicates $y_{h1}$, while a minus sign denotes $y_{h2}$. Notice that this particular set of replicates has the property that each of the two elements in a stratum appears in half the samples. Thus, the mean of the replicates is an unbiased estimate of the mean of the population, and because of the nature of the "cross-product balance" the variance estimate is also unbiased and unaffected by the correlations inherent in the composition of the individual half-samples.
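The properties claimed for the four balanced replicates can be verified directly. The sketch below uses invented weights and observations (they are assumptions, not survey data) and checks three things: that the columns of the sign matrix are orthogonal, that the average squared deviation over the four replicates equals $(1/4) \sum W_h^2 d_h^2$, and that the mean of the four half-sample means equals the overall sample mean.

```python
# Hypothetical data: three strata, two observations per stratum.
weights = [0.5, 0.3, 0.2]                           # W_h
pairs = [(10.0, 14.0), (20.0, 17.0), (31.0, 36.0)]  # (y_h1, y_h2)

# Rows are half-samples, columns are strata; +1 selects y_h1, -1 selects y_h2.
balanced = [
    [+1, +1, +1],
    [+1, -1, -1],
    [-1, -1, +1],
    [-1, +1, -1],
]


def columns_orthogonal(matrix):
    """True if every pair of distinct columns has a zero inner product."""
    n_cols = len(matrix[0])
    return all(
        sum(row[j] * row[k] for row in matrix) == 0
        for j in range(n_cols)
        for k in range(j + 1, n_cols)
    )


# Overall sample mean and the mean of each balanced half-sample.
y_st = sum(w * (y1 + y2) / 2 for w, (y1, y2) in zip(weights, pairs))
half_means = [
    sum(w * (y1 if s > 0 else y2) for w, (y1, y2), s in zip(weights, pairs, row))
    for row in balanced
]

# Variance estimate from the four replicates and from the direct formula.
v_balanced = sum((m - y_st) ** 2 for m in half_means) / len(half_means)
v_direct = sum(w ** 2 * (y1 - y2) ** 2 for w, (y1, y2) in zip(weights, pairs)) / 4
mean_of_half_means = sum(half_means) / len(half_means)

print(columns_orthogonal(balanced))
print(round(v_balanced, 6), round(v_direct, 6))
print(round(mean_of_half_means, 6), round(y_st, 6))
```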

If one wishes to obtain a set of half-samples that will have this feature of "cross-product balance", for any fixed number of strata, then it becomes necessary to have a method of generating matrices of + and - signs whose columns are orthogonal to one another. A method is described by Plackett and Burman (1943-46, p.323) for obtaining k x k orthogonal matrices, where k is a multiple of 4. Suppose, for example, that we have 5, 6, 7 or 8 strata. The Plackett-Burman method produces the following 8 x 8 matrix, which is the smallest that can be used for these cases because of the multiple-of-4 restriction. The rows identify a half-sample, while the columns refer to strata.

+ - - + - + + -
+ + - - + - + -
+ + + - - + - -
- + + + - - + -
+ - + + + - - -
- + - + + + - -
- - + - + + + -
- - - - - - - -

Any set of 5 columns for the 5 strata case (or in general, n columns for the n strata case) defines a set of eight half-sample replicates which will have the property of "cross-product balance". If it is necessary to use the eighth column, the resulting set of half-samples will not have each element appearing an equal number of times. This will not destroy the variance estimating characteristics of the set of half-samples, but it does mean that the average of the eight half-sample means will not necessarily be equal to the overall sample mean. When the number of strata is a multiple of four, it may then be wise to use the next highest multiple of four as the number of half-samples.

Since orthogonal matrices of plus and minus ones can be obtained whenever the order of the matrix is a multiple of four, it is always possible to find a set of half-sample replicates having cross-product balance. It follows that the number of half-samples required will be at most four more than the number of strata. Consider, for example, if the survey design were to be based on 21 stratification cells. In such a case, it would be necessary to use 24 half-sample pseudo-replicates. A possible set of balanced half-samples for this particular case is shown below. This design was obtained by using the first 21 columns of the construction given in the Plackett-Burman paper. Any two columns of this design are orthogonal, and each element appears in 12 of the 24 replicates. The entire pattern is determined by the first column. The 2nd column is obtained from the 1st by moving each sign down one position and placing the 23rd sign at the top of the second column. This rotation is applied repeatedly to obtain the remaining columns. The 24th position is always '-' and is not involved in the rotation.

Half StratumSample 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 1 + - - - - + - + - - + + - - + + - + - + +

2 + + - - - - + - + - - + + - - + + - + - +

3 + + + - - - - + - + - - + + - - + + - + -

4 + + + + - - - - + - + - - + + - - + + - +

5 + + + + + - - - - + - + - - + + - - + + -

6 - + + + + + - - - - + - + - - + + - - + +

7 + - + + + + + - - - - + - + - - + + - - +

8 - + - + + + + + - - - - + - + - - + + - -

9 + - + - + + + + + - - - - + - + - - + + -

10 + + - + - + + + + + - - - - + - + - - + +

11 - + + - + - + + + + + - - - - + - + - - +

12 - - + + - + - + + + + + - - - - + - + - -

13 + - - + + - + - + + + + + - - - - + - + -

14 + + - - + + - + - + + + + + - - - - + - +

15 - + + - - + + - + - + + + + + - - - - + -

16 - - + + - - + + - + - + + + + + - - - - +

17 + - - + + - - + + - + - + + + + + - - - -

18 - + - - + + - - + + - + - + + + + + - - -

19 + - + - - + + - - + + - + - + + + + + - -

20 - + - + - - + + - - + + - + - + + + + + -

21 - - + - + - - + + - - + + - + - + + + + +

22 - - - + - + - - + + - - + + - + - + + + +

23 - - - - + - + - - + + - - + + - + - + + +

24 - - - - - - - - - - - - - - - - - - - - -
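The rotation rule described above lends itself to a short program. The following Python sketch rebuilds the 8 x 8 and 24 x 24 sign matrices from their first columns and checks the stated properties (orthogonal columns, and each element appearing in half of the replicates). The function names are invented for this illustration, and the sketch covers only the cyclic case described in the text, not every construction in the Plackett-Burman paper.

```python
def parse_signs(text):
    """Convert a string of '+' and '-' characters to a list of +1/-1."""
    return [1 if ch == "+" else -1 for ch in text if ch in "+-"]


def cyclic_design(first_column):
    """Build a k x k sign matrix, where k = len(first_column) + 1.

    Each column is the previous one moved down one position, with the
    last rotating sign wrapped round to the top. The final row and the
    final column are all minus signs and take no part in the rotation.
    """
    n = len(first_column)
    rows = [[first_column[(r - c) % n] for c in range(n)] + [-1]
            for r in range(n)]
    rows.append([-1] * (n + 1))
    return rows


def columns_orthogonal(matrix):
    """True if every pair of distinct columns has a zero inner product."""
    n_cols = len(matrix[0])
    return all(
        sum(row[j] * row[k] for row in matrix) == 0
        for j in range(n_cols)
        for k in range(j + 1, n_cols)
    )


# Rotating part of the first column, read from the two designs in the text.
design_8 = cyclic_design(parse_signs("+++-+--"))
design_24 = cyclic_design(parse_signs("+++++-+-++--++--+-+----"))

# Keep only as many columns as there are strata (21 in the example).
half_samples_21 = [row[:21] for row in design_24]

print(columns_orthogonal(design_8), columns_orthogonal(design_24))
print("".join("+" if s > 0 else "-" for s in half_samples_21[0]))
```

Each row of `half_samples_21` is one half-sample: a plus sign selects the first observation in that stratum and a minus sign the second.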

The use of replication, random half-sample replication, or balanced half-sample replication provides a means whereby the variability in a sample estimate can be obtained from samples of any degree of complexity. It should be noted, however, that this estimate of variability can be obtained only after the sample has been drawn and the data has been collected. Therefore, replication methods cannot be used to assist in the determination of the sample size. However, replication methods can be used on existing samples of various designs to enable the calculation of design effects which can then be used in sample design as outlined in the previous section.

There are many other ways of estimating variance directly, including various forms of "jack-knifing" techniques (Brillinger, 1966; Efron, 1981, 1983). The essence of jack-knifing is to remove one observation from the sample and recalculate the mean of the resultant sample. By choosing which observation to remove in a systematic manner, akin to the orthogonal matrix technique described for replication, the mean and variance of the jack-knife samples can be made to provide unbiased estimates of the population parameters.

4.8 DRAWING THE SAMPLE

The final stage in the sampling process is the actual drawing of a sample from the sampling frame. In some cases (for example, with systematic sampling), the procedure is very simple and can be easily automated, either in the office or in the field (although care should be taken to ensure that the sampling procedure is adhered to in the field). In most situations, however, the sample should be drawn by reference to a random process.

Ideally, the process should be truly random, i.e. independent events with each outcome being equally probable. However, the only true random processes are strictly physical events such as tossing a coin or rolling a die. In most situations, such random processes are too time consuming to be useful in sample selection. We must, therefore, resort to some form of "pseudo-random" process which can quickly and easily generate a set of random numbers for use in sampling.

Two forms of such random number generation processes are commonly used: look-up tables and recursive mathematical equations. The use of tables of random numbers is widespread and for this purpose several publications have compiled tables of random numbers. The most well-known of these is "Rand's One Million Random Numbers" (The Rand Corporation, 1955) although there are several other compilations (Owen, 1962; Kendall and Smith, 1939). In addition, most statistics textbooks contain a reproduction of part of one of the more complete compilations. A table of random numbers is included as Appendix B in this book.

In using random number tables to select a sample, the first step is to number all sampling units in the sampling frame. The order of this numbering is immaterial, and should be done for maximum convenience. The next step is to pick a starting point in the table of random numbers to be used (anywhere will do) and then systematically work through the table until the required number of random numbers has been selected. In sampling without replacement, the final sample of random numbers must contain no replications. In using tables of random numbers, any selection procedure is acceptable so long as it is systematic. For example, numbers may be read down or across the page, from left to right or right to left. In addition, numbers may be truncated in any way to obtain numbers in the desired range, e.g. if the tabulated numbers are in five digit groups, as in Appendix B, and the user wants random numbers between 0 and 99 then either the first two or the last two (or any other two) digits of the five digit group may be used. The numbers may also be modified systematically to obtain numbers in the desired range, provided that no bias is introduced. For example, if random numbers in the range 0 to 39 are required, the user could sample numbers between 0 and 99, discarding numbers greater than 39. This procedure, while not introducing bias, is, however, wasteful of random numbers. A more efficient procedure is to subtract 40 from those numbers in the range 40 to 79 and then use the result as a valid random number in the range 0 to 39. Those numbers from 80 to 99 would still need to be discarded because they do not have the same range as the required numbers and hence, if included (by subtracting 80), would bias the selection towards numbers in the range 0 to 19.
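The truncation and range-reduction rules just described can be written as two small functions. This Python sketch is illustrative: the function names and the `table_size` parameter are assumptions introduced here. Because every two-digit value from 00 to 99 is equally likely in a table of random numbers, counting how often each result arises over all 100 values shows whether a rule is biased.

```python
from collections import Counter


def two_digits(group, use="last"):
    """Truncate a five-digit group to two digits (the first two or the last two)."""
    return group // 1000 if use == "first" else group % 100


def reduce_to_range(value, upper, table_size=100):
    """Map a number in 0..table_size-1 to 0..upper-1 without bias.

    Values at or beyond the largest multiple of `upper` that fits into
    `table_size` are discarded (None is returned); the remainder after
    dividing by `upper` is used otherwise, which for upper = 40 is the
    same as subtracting 40 from the numbers 40 to 79.
    """
    usable = (table_size // upper) * upper   # 80 when upper = 40
    if value >= usable:
        return None
    return value % upper


# Apply each rule to every possible two-digit value exactly once.
unbiased = Counter(reduce_to_range(v, 40) for v in range(100))
biased = Counter(v % 40 for v in range(100))   # keeps 80..99: not recommended

print("discarded values:", unbiased[None])
print("unbiased counts: ", sorted({unbiased[k] for k in range(40)}))
print("biased counts:   ", sorted({biased[k] for k in range(40)}))
```

Under the unbiased rule every number from 0 to 39 arises equally often, while the rule that also keeps 80 to 99 gives the numbers 0 to 19 one extra chance each.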

Sometimes, tables of random numbers are not readily available for use in the field. In such situations, it is useful to be aware of other, more readily available sources of random numbers. One source which is almost universally available is a telephone directory. Whilst not recommended for large scale surveys, a telephone directory does provide a useful source of random numbers if used with due care. Thus, for example, random numbers could be selected by taking digits in the right-hand columns of the telephone number. For most telephone directories a maximum of four digits should be selected from any one telephone number (otherwise the biasing effect of a limited number of telephone exchange codes can become significant). Within these restrictions, the principle is again to choose numbers systematically. For example, choose a page and column at random, and then read down the list of numbers in that column.

The major difference in using a telephone directory to obtain random numbers, as opposed to using a specific table of random numbers, is that whereas the table of random numbers has been checked for randomness before publication, the numbers obtained from the telephone directory carry no such guarantee. It is therefore necessary to ensure that the numbers obtained are indeed (approximately) random, by means of a series of simple checks. As a minimum, three tests should be conducted: first, plot the frequency distribution of the listed numbers and perform a goodness-of-fit test, e.g. the Kolmogorov-Smirnov test; second, calculate the mean and standard deviation of these numbers; and third, perform a runs test to check whether the listed numbers are ordered non-randomly. The use of these tests is described in Fishman (1973) and Knuth (1969).
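A minimal sketch of the three checks is given below, using only the Python standard library. It rests on several assumptions that should be stated plainly: the numbers are treated as four-digit integers from 0 to 9999; the Kolmogorov-Smirnov statistic is compared with the usual large-sample 5% critical value of about 1.36/√n, which is only approximate for discrete data; and the runs test is the version that counts runs above and below the median. The function names are invented for the example, and the references cited above should be consulted for the formal tests.

```python
import math
import statistics


def ks_uniform(values, upper):
    """Kolmogorov-Smirnov statistic D against a uniform distribution on [0, upper)."""
    xs = sorted(values)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        cdf = x / upper
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d


def runs_test_z(values):
    """Runs test about the median; returns an approximately normal z statistic."""
    median = statistics.median(values)
    above = [v > median for v in values if v != median]
    n1 = sum(above)
    n2 = len(above) - n1
    runs = 1 + sum(a != b for a, b in zip(above, above[1:]))
    expected = 2 * n1 * n2 / (n1 + n2) + 1
    variance = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
                / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    return (runs - expected) / math.sqrt(variance)


def check_numbers(values, upper=10000):
    """Collect the summary statistics used in the three checks."""
    n = len(values)
    return {
        "mean": statistics.mean(values),
        "expected_mean": (upper - 1) / 2,
        "sd": statistics.pstdev(values),
        "expected_sd": math.sqrt((upper ** 2 - 1) / 12),
        "ks_d": ks_uniform(values, upper),
        "ks_critical_5pct": 1.36 / math.sqrt(n),
        "runs_z": runs_test_z(values),
    }


# A deliberately non-random list: evenly spread, but in increasing order.
ordered = list(range(0, 10000, 100))
for name, value in check_numbers(ordered).items():
    print(f"{name:>18}: {value:.3f}")
```

The ordered list passes the goodness-of-fit and moment checks but fails the runs test (a z value far outside ±1.96), which illustrates why all three checks are needed.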

Whilst the above methods of using "look-up" tables are convenient when one is drawing a relatively small sample, they become rather cumbersome to use when one is drawing a large sample, or drawing repeated samples, especially when using a computer. The storage space required to store tables of random numbers can be quite large, especially when large independent repeated samples are required. For that reason, use is often made of truly "pseudo-random" numbers which are generated by a recursive mathematical equation. The use of an equation to generate random numbers may at first appear to be in direct contradiction of the concept of random numbers because each pseudo-random number so generated is completely determined by its predecessor and, consequently, all numbers are determined by the initial "seed" number. While this is true, the critical point is that the random numbers so generated can pass the statistical tests for uniformity and independence required of truly random numbers. They are therefore indistinguishable from real random numbers.

While there exist a number of different random number generator algorithms (see Knuth, 1969) the most common type is the linear congruential method. In this method, numbers are generated by an equation of the form:

$$x_i = (a x_{i-1} + c) \bmod m \tag{4.61}$$

where $x_0$ = an initially specified seed number

Expanding equation (4.61) such that the modulo notation (mod m) is removed, we obtain:

$$x_i = a x_{i-1} + c - \left[ \frac{a x_{i-1} + c}{m} \right] \cdot m \tag{4.62}$$

where [ ] = integer portion of value inside brackets.

To convert the integer number obtained from equation (4.62) to a random number within a smaller specified range (A,B), the following transformation is applied:

$$x_i(A, B) = \frac{x_i}{m} \cdot (B - A) + A \tag{4.63}$$
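Equations (4.61) to (4.63) translate almost line for line into code. The Python sketch below is a toy illustration: the values a = 5, c = 3, m = 16 and the seed of 0 are chosen only so that the sequence can be followed by hand, and are not recommended for real sampling; the function names are likewise invented for the example.

```python
from itertools import islice


def lcg(seed, a, c, m):
    """Yield x_1, x_2, ... from x_i = (a * x_{i-1} + c) mod m, equation (4.61)."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x


def lcg_step_expanded(x_prev, a, c, m):
    """One step written without the modulo operator, as in equation (4.62)."""
    value = a * x_prev + c
    return value - (value // m) * m   # value // m is the integer portion


def scale(x, m, low, high):
    """Transform x in 0..m-1 to the range (low, high), equation (4.63)."""
    return x / m * (high - low) + low


# Toy parameters, chosen only to keep the arithmetic small.
A, C, M, SEED = 5, 3, 16, 0
first_five = list(islice(lcg(SEED, A, C, M), 5))
scaled = [scale(x, M, 0, 40) for x in first_five]

print(first_five)
print(scaled)
```

When the scaled values are to be used as unit numbers on a sampling frame, they would normally be truncated to integers, for example with `int(scale(x, M, 0, 40))`.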

The critical factor in the use of such random number generators is to choose appropriate values of a, c and m. The selection of these values will depend on the computer being used to perform the calculations. Generally, each computer will have a random number generator routine with appropriate values of a, c and m which have been found to give satisfactory statistical results on that computer. One problem often noted with pseudo-random number generators is that, because they are deterministically calculated, then once a number is repeated an entire sequence of repetitions must follow. Obviously no more than m different random numbers can be generated, although the period p can be substantially less than m with inappropriate choice of a and c. Specific rules apply for maximising p (see Fishman, 1973). However, this problem of periodicity is more of a concern when using pseudo-random numbers in discrete event simulation modelling than it is in survey sampling procedures. Only when the sample size is very large might periodicity become a problem (and even then it can be avoided by suitable choice of a, c and m).
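The dependence of the period p on the choice of a and c can be seen with a very small modulus. The sketch below counts the length of the cycle that a linear congruential generator eventually enters; the parameter values are toy assumptions chosen so that the cycles can be listed by hand, and the function name is invented for the example.

```python
def lcg_period(a, c, m, seed=0):
    """Length of the cycle eventually entered by x_i = (a * x_{i-1} + c) mod m."""
    first_seen = {}   # value -> step at which it first appeared
    x = seed
    step = 0
    while x not in first_seen:
        first_seen[x] = step
        x = (a * x + c) % m
        step += 1
    return step - first_seen[x]


# Two toy generators sharing m = 16 and c = 3 but differing in a.
full = lcg_period(a=5, c=3, m=16)
short = lcg_period(a=3, c=3, m=16)

print("a = 5:", full)    # every value 0..15 is visited before repeating
print("a = 3:", short)   # only half of the possible values are visited
```

With a = 5 the generator attains the maximum period of m = 16, whereas with a = 3 the sequence repeats after only 8 numbers.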

Following selection, and checking, of a set of random numbers by one of the above methods, it then remains to select those units on the sampling frame with the corresponding number and then include them on the sample list for use in the survey.
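As a hedged illustration of this last step, the sketch below numbers the units of a sampling frame from 1 to N, draws the required quantity of distinct random numbers, and returns the corresponding units as the sample list. It relies on Python's built-in pseudo-random generator rather than on a table; the frame of household identifiers, the sample size and the seed are all hypothetical.

```python
import random


def draw_sample(frame, sample_size, seed=None):
    """Draw a simple random sample without replacement from a sampling frame.

    Units are numbered 1..N in the order in which they appear in the frame.
    The selected unit numbers are returned together with the units themselves.
    """
    rng = random.Random(seed)
    numbers = sorted(rng.sample(range(1, len(frame) + 1), sample_size))
    return [(number, frame[number - 1]) for number in numbers]


# Hypothetical sampling frame of 500 household identifiers.
frame = [f"household-{i:03d}" for i in range(1, 501)]
sample_list = draw_sample(frame, sample_size=25, seed=2024)

for number, unit in sample_list[:5]:
    print(number, unit)
```

Recording the seed alongside the sample list allows the same sample to be redrawn later if the selection needs to be audited.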



5. Survey Instrument Design

5.1 MORE TRADE-OFFS IN TRANSPORT SURVEY DESIGN

The survey process described in this book has different levels of precision at each of its many stages. On the one hand, sampling theory, which was described in the previous chapter, has been developed to a relatively high degree of sophistication such that, with due care, sampling error can be reduced to pre-specified acceptable levels of precision.

The same cannot be said, however, about the design of the survey instrument. The challenge of this stage of the survey process is dealing with the fact that it is much more of an art form than a science. Indeed, the design of survey forms fits very neatly the definition of an art form as "an established form of composition (e.g. a novel, sonata, sonnet, etc.)" (Concise Oxford Dictionary). Here we have an art form which, like a sonata, has rules which must be followed to ensure the best possible results, but which is free to adapt to the constraints imposed by the overriding objectives of the study and to the skills of the survey designer.

Nonetheless, very similar types of errors to those discussed in Chapter 4 can apply when talking about survey instrument design. Our designs of survey instruments can result in errors of variability (since this makes the results non-repeatable, it is rather like sampling error) and bias. In other words, our instrument design can result in:

• data which gives unacceptable levels of variability when measuring the respondent's real behaviour, or real attitudes (we will call this instrument uncertainty); and/or

• data which is not at all what we are trying to measure (we call this instrument bias).

The first of these errors - instrument uncertainty - occurs, for example, when poor survey instrument design leads a group of respondents who actually all behaved in the same way to report a variety of different behaviours. This mostly happens when bad design means that respondents understand the same question in different ways. For example, the designer may want respondents to report linked trips, but the poor (or untested) design means that some people report unlinked/staged trips (too many trips in this case), and some people report journeys (too few trips).

The second type of error which is a result of poor survey instrument design - instrument bias - occurs when the questions actually lead to the wrong answers. An example of this would be a questionnaire design which led to people reporting their travel of more than, say, 10 minutes to places which they went regularly (like to school and work) very accurately, but often forgetting all about small trips to irregular activities (like the corner shop). These examples of instrument bias (sometimes called systematic bias) are serious because, without careful testing, they often go completely unnoticed. In our example, the results may show that very few people make short trips within their local areas, with the result that there may seem to be no need to improve pedestrian or bicycle facilities; an improvement which might have encouraged the use of these facilities - possibly giving economic, environmental and even social benefits to local communities.

Before proceeding with a detailed discussion of survey instrument design, it may be useful to review briefly the marksmanship analogy from Chapter 4 in the current context (Figure 5.1). Again, the first of the two sources of error - instrument variability - is analogous to imprecision. This occurs when the survey instrument asks the same question and gets a range of answers even when people have actually behaved in the same way.

On the other hand, if people are answering in the same (but incorrect) way (always telling us about longer, more regular trips), the resultant instrument bias means that we may be recording very precise (repeatable) data, but unless we carry out careful testing, we are not sure if the data is actually accurate!


[Figure 5.1: panels labelled PRECISE and IMPRECISE]

Figure 5.1 Questionnaire Imprecision (Variability) and Inaccuracy (Bias)

Unless the measurements being made are valid, it does not really matter how precise those measurements are.

Relative to sampling error (uncertainty) and sampling bias (Chapter 4), very little research has been carried out on instrument uncertainty and bias. One of the main reasons for this is that instrument errors are not often obvious - either to the survey designer or to the analyst. While there are no fixed rules for designing a survey instrument, there are sound principles which have been derived from some controlled experiments as well as from recorded experience. Application of these principles, in conjunction with a good deal of common sense pertaining to the problem at hand, should result in a design which actually measures what the investigator set out to measure.

There are probably two important reasons why the art of survey instrument design has not progressed as far as that of sample design. When we design survey instruments we are often reluctant:

• to test the instruments; and

• to admit that someone else may understand the questions we design in a different way than we do ourselves.

It is possibly the latter point that makes us so reluctant to execute the former. This chapter is designed to make you more willing to test survey instruments by giving you an insight into the rewards of better understanding human behaviour (travel behaviour, in particular, in our case), and hence being able to be confident about the survey results you get - not just dismissing anomalies as "survey problems", but accepting them as being extremely likely to represent actual travel behaviour and attitudes.

5.2 SCOPE OF THIS CHAPTER

Throughout this chapter we will refer to "questionnaires". More correctly we should continue referring to "survey instruments", since the form of the questionnaire will vary depending on the survey method being used. Thus self-completion surveys, telephone interviews, household interviews, and interactive group interviews will all require some means by which the data is to be recorded. However, the details of these data recording techniques will vary considerably.

Despite the many different types of questionnaires, survey instrument design techniques can essentially be divided into:

• those which design for completion by the respondent and

• those which design for completion by a trained person (interviewer).

The major difference between these two methods is that, in the former, the respondents read and answer the questions by themselves whereas in interview surveys, the respondent is assisted by an interviewer who, by and large, reads the questions and records the answers. Obviously, there is a much greater onus on clear questionnaire design when an interviewer is not present. This is because, while it is possible to train a small number of interviewers to deal with considerable complexities in survey design, it is clearly not possible to expect the same of a large number of unknown respondents. However, since interviewer bias (all error associated with the use/presence of an interviewer) can be exacerbated by poor questionnaire design and layout, making it difficult for interviewers to conduct a survey without error, even personal interview questionnaires need to be designed with great care.

As with other components of the survey process, it is instructive to examine questionnaire design within the context of a number of specific factors which are important for the overall task. With respect to questionnaire design, the principal factors which will be addressed are:


(a) Questionnaire content;

(b) Physical nature of forms;

(c) Question types;

(d) Question format;

(e) Question wording;

(f) Question ordering;

(g) Question instructions.

The aim of this chapter is to present a number of general principles, and some specific recommendations, with respect to each of these key factors.

5.3 QUESTIONNAIRE CONTENT

This section deals with the issue of deciding exactly what information needs to be collected by a transport survey instrument. The discussion centres around the quantity of data being collected and therefore focuses on the length of the survey instrument. In addition, since gathering information on travel behaviour is often an integral part of a transport survey, particular emphasis is given to ways in which this type of data (about trips) is best collected.

In deciding on what information to include in the questionnaire, there are three basic guidelines:

• the data must be relevant to the purpose of the survey;

• the data must be reliable, i.e. the same results would be gained if the survey was replicated - this minimises instrument uncertainty;

• the data must accurately represent what is being examined - this minimises instrument bias.

5.3.1 Length of the Questionnaire

Already in the preliminary planning stage (Section 2.7), we should have constructed a wish-list of survey content. The task at this stage is to whittle down this list to those items which are particularly relevant. Specifically, the survey designer should now derive an explicit rationale for each item in the survey, covering not only why the information is needed but also how the data obtained is to be analysed. As noted in Chapter 1, this requires a backward linkage from the coding and analysis phases of the survey process. Oppenheim (1992) suggests that one way of testing the adequacy of the design (including content of the questionnaire) is to run through the natural sequence of survey stages in reverse order. For example, if we expect at the end of the survey to be able to show whether men make more trips to shopping than women, we would need some cross tabulations which related gender to trip rate. At this point we would have to draw up some dummy tables, showing the relevant variables cross-tabulated with certain sub-groups, for example, gender. In order to generate these tables we must have asked questions about trip making as well as about shopping (at the level of detail required in the cross-tabulations) and we must also know the sex of each of our respondents.
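A dummy table of this kind can be mocked up before any questionnaire is printed. The Python sketch below uses a handful of invented respondent records (the field names, trip bands and values are all hypothetical) to show which variables must be present in the coded data for the gender by shopping-trip cross-tabulation to be produced.

```python
from collections import Counter

# Invented coded records, one per respondent; not survey results.
records = [
    {"sex": "female", "shopping_trips": 2},
    {"sex": "male", "shopping_trips": 0},
    {"sex": "female", "shopping_trips": 1},
    {"sex": "male", "shopping_trips": 1},
    {"sex": "male", "shopping_trips": 3},
]

BANDS = ["0", "1", "2+"]


def trip_band(trips):
    """Group a daily shopping-trip count into the bands used in the table."""
    return "0" if trips == 0 else "1" if trips == 1 else "2+"


def cross_tabulate(rows):
    """Count respondents in each (sex, trip band) cell."""
    return Counter((r["sex"], trip_band(r["shopping_trips"])) for r in rows)


def mean_trip_rate(rows, sex):
    """Average number of shopping trips per respondent of the given sex."""
    trips = [r["shopping_trips"] for r in rows if r["sex"] == sex]
    return sum(trips) / len(trips)


table = cross_tabulate(records)
print("sex     " + "".join(f"{band:>4}" for band in BANDS) + "  mean")
for sex in ("female", "male"):
    cells = "".join(f"{table[(sex, band)]:>4}" for band in BANDS)
    print(f"{sex:<8}{cells}  {mean_trip_rate(records, sex):.2f}")
```

If either field cannot be filled from the draft questionnaire, the table cannot be produced, which is precisely the gap that the reverse run-through is meant to expose.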

Having identified the need for various items of data, the final selection of items will be the result of a trade-off between:

• the expressed needs (i.e. the wish-list of questions to be covered);

• the survey resources (in terms of money, personnel and so on); and,

• the effect of survey length on response rates and validity of responses.

While there are always exceptions to the rule, it is generally agreed that, once above a critical-mass survey length, longer questionnaires and/or interviews result in poorer response rates, completion rates and/or response validity. Thus fewer people respond to longer questionnaires and, if they do respond, they may not complete all the questions. Even if they do complete the questions, the validity of responses to questions at the end of the questionnaire may be dubious.

An interesting way of asking more questions than would usually be possible in an on-board survey is described by Sheskin and Stopher (1982b). Here they used the dual survey mechanism described in Section 3.2.5, which collected some information via self-completion on-board a bus, and gave respondents a further questionnaire to take home and mail back. This resulted in a significant increase in response rates, and hence decrease in sample bias.

The issue of trading off the wish-list vs. survey resources vs. effect of length presents an interesting idea for the way in which an experimental design may be introduced into a pilot study. For example, it would be possible to test the effect (in terms of resources and response rates) of having, say, 50 questions vs. 20 questions. This could be done by testing two questionnaires in parallel - one with the 50 "dream" questions and one with the 20 "absolutely essential" questions. Analysis of the results would then indicate any variation in response rates between the two designs, and, together with a comparison of the costs of each method, it would be possible to make an informed decision on which method to use.
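One way of analysing such a parallel pilot is sketched below in Python. The counts are hypothetical (400 forms distributed for each design, with 180 and 120 returned), and the comparison uses a standard two-sided z-test for the difference between two proportions, which is an assumption of this illustration rather than a procedure prescribed in the text. The last two lines also count the answered questions obtained under each design, on the simplifying assumption that every returned form is complete.

```python
import math


def compare_response_rates(returned_a, sent_a, returned_b, sent_b):
    """Two-sided z-test for a difference between two response rates."""
    rate_a = returned_a / sent_a
    rate_b = returned_b / sent_b
    pooled = (returned_a + returned_b) / (sent_a + sent_b)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / sent_a + 1 / sent_b))
    z = (rate_a - rate_b) / std_error
    p_value = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal tail area
    return rate_a, rate_b, z, p_value


# Hypothetical pilot results: 20-question form vs. 50-question form.
short_rate, long_rate, z, p_value = compare_response_rates(180, 400, 120, 400)

# Answered questions obtained, assuming every returned form is complete.
short_items = 180 * 20
long_items = 120 * 50

print(f"response rates: {short_rate:.2f} vs {long_rate:.2f}")
print(f"z = {z:.2f}, p = {p_value:.6f}")
print(f"answered questions: {short_items} vs {long_items}")
```

In this invented case the shorter form has a clearly higher response rate, yet the longer form still yields more answered questions in total, so the decision would turn on cost, on the likely bias from the lower response rate, and on the validity of the later answers.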


In economic terms, what one is faced with is an elasticity of response with respect to questionnaire length. The decision to be made is whether increased questionnaire length will provide more or less useful information overall (compare this problem with that of a transit operator who wishes to know whether increasing or decreasing fares will increase revenue). While better questionnaire design will improve response (for any length questionnaire) there is always a limit to the amount of information which can be sought in any questionnaire.
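The analogy with fare revenue can be made explicit under some simplifying assumptions introduced here for illustration: $N$ questionnaires are distributed, each of length $\ell$ questions, the response rate $r(\ell)$ depends only on length, every returned form is complete, and every answered question is counted as equally useful. The total amount of information collected is then:

$$I(\ell) = N \, r(\ell) \, \ell$$

Differentiating with respect to length gives:

$$\frac{dI}{d\ell} = N \left( r(\ell) + \ell \, \frac{dr}{d\ell} \right) = N \, r(\ell) \, (1 + \varepsilon), \qquad \varepsilon = \frac{dr}{d\ell} \cdot \frac{\ell}{r(\ell)}$$

where $\varepsilon$ is the elasticity of response with respect to questionnaire length. On these assumptions, lengthening the questionnaire increases the total information collected only while $\varepsilon > -1$, just as raising fares increases revenue only while demand is inelastic. The sketch ignores incomplete forms and any loss of validity towards the end of a long questionnaire, both of which would make the practical limit tighter.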

Given that the desired amount of information will often be greater than the possible amount of information which can be collected (based either on a response rate or resources argument), it is necessary to assign priorities to the items of information. Thus the items should be ranked in order of importance to the study in question (realising that some items of information may be complementary). The final selection should then be made in accordance with the available resources and the diminishing marginal value of extra items of information.

5.3.2 Relevance of the Questions

In making this final selection of which questions to include in the survey, two vital factors should be considered. First, in dealing with human populations the information sought should not only be relevant to the study purposes but should appear to be relevant to the respondent. Special care (and pilot testing) needs to be used because, to us as survey designers, it is always perfectly obvious why a question is included. After all, we have done the preliminary planning, set our objectives and even worked backwards to ensure that the questions we ask are the right ones for the analysis. Hence we know, for example, that the "usual activity" of a respondent (such as whether the respondent works, studies, is retired, etc.) is an important variable influencing travel behaviour. However, if the respondent sees (or hears) a questionnaire on travel behaviour beginning with a question about what they usually do (without explanation), they can certainly be excused for querying the relevance of this question to a travel survey. (A simple way of dealing with this particular case is shown in Figure 5.16.)

If questions are not perceived to be relevant then a number of adverse effects are possible. At the very least, the respondent may be annoyed at having to answer "irrelevant" questions. This may lead, in a personal interview survey, to a diminution in rapport between respondent and interviewer, and hence have a lasting adverse effect on the interviewer. In later surveys, the interviewer may omit or rephrase the question (perhaps thereby changing the intent of the question) in order to avoid annoying the respondent. More seriously, "irrelevant" questions (especially when their purpose is not adequately explained) can create a mistrust of the stated survey purpose and this may well lead to spurious and inaccurate answers from a wary respondent. In the long term, such mistrust can only be to the detriment of all sample survey efforts as respondents become reluctant to respond to any such surveys. It is therefore wise to restrict questions to those which are, and can be explained to be, relevant to the immediate survey purposes.

Of course, this reinforces the importance of ensuring that all survey staff who are likely to have any dealings with respondents (primarily interviewers and those people answering phone queries) know a great deal about the purpose of the survey. This is discussed further in Chapter 7.

5.3.3 Reasonableness of the Questions

Another factor to consider in defining the questionnaire content is whether it is reasonable to expect respondents to be able to answer questions we ask of them. This difficulty applies equally to questions of opinion, knowledge and fact. It should, however, be considered that, in general, respondents find questions of fact "easier" than questions of opinion - since one is stating what has already occurred, while the other involves consideration of, or speculation about, an issue (which may or may not have been done by the respondent). Sections 5.6 and 5.7 discuss this in more detail.

It is unwise to assume that respondents will voluntarily admit ignorance. On the contrary, they (like many of us!) will generally attempt to give some answer even if they are ill-informed or have never thought about the subject matter before. If first impressions or perceptions are all that the investigator is interested in, then this may pose no serious problem. However, if the answer is to be interpreted as being a considered response or a correct statement, then respondents should be given ample opportunity to admit their ignorance without fear of recrimination.

5.3.4 The Context of Questions about Trips

Many (perhaps most) surveys which gather information for transport planning have data on people's trips as an "absolute essential" component of their wish-list for questionnaire content. For this reason it is important to understand the context in which people can be asked about trip-making.

Day-to-day travel behaviour can be gathered in two contexts:

(1) Travel-only context - people can be asked to report details about their travel only. This will almost always give information on the purpose of travel, which is usually an activity (see Preface), but no further information on activities is obtained.



(2) Activity context - here people are asked to report all activities in which they take part - both at home and away from home. Travel will be a natural part of these activities, and in this context is often seen as an activity in itself. Research has indicated that this context not only gives data on activities, but also results in a much more accurate recording of travel (Clarke, Dix and Jones, 1981).

At the beginning of designing a survey instrument in which travel data is to be collected, it is essential to determine which of these approaches will be adopted. Each requires substantially different question types (Section 5.5) and each can be done using different survey methods (Chapter 7). A further discussion of these details appears in the relevant sections of this book.

5.3.5 Questionnaire Design to Maximise Trip Recording

Whichever of the above contexts is used in a travel survey, it is necessary to realise that travel outside the home can be recorded either by asking respondents to recall what happened at a past time (recall technique), or by announcing to respondents in advance that they will have to report travel about a future time (announce-in-advance technique). Both of these methods have variations as well as advantages and disadvantages.

The recall technique, by and large, generates the greatest error in reporting of actual travel. This is particularly the case when respondents are asked simply to report travel-only for a period of time in the past (Figure 5.2).

To test these problems for yourself, try remembering in detail where/when/how you travelled two days ago, or even yesterday! Early household travel surveys used this technique almost exclusively. There are two ways to assist with getting better data from the simple recall technique. First, respondents could be assisted with prompts such as "Where did you go next?". Another improvement would be to ask respondents to think in terms of activities and not just trips. In either case, however, there remain severe disadvantages in accurate reporting of trip data due to forgetting of travel.


Figure 5.2 Recall-Only Method of Trip Reporting Source: Bureau of Transport Economics (1981)

The announce-in-advance technique has been shown to improve trip reporting considerably. The methodology involves contacting the respondent in one way or another prior to the "Travel Day" (the day about which trip reporting should occur). This means that respondents are more likely to either record travel as it is made or at least to be alert to the fact that they will have to report their travel, and thereby pay more attention to details on the Travel Day. The method does not totally exclude the possibility that respondents will sometimes fail to record their travel, leaving recall as their last resort, but reliance on recall is generally limited significantly by the survey method used.

This method gives respondents the opportunity to actively remember their travel patterns and even to take notes on what they do. The latter option has led survey researchers to accompany the announce-in-advance method with a diary of some type to facilitate the note-taking behaviour. In personal interview surveys, a very brief diary (memory jogger) has been left at the pre-travel day visit (Ampt, 1981), and in self-completion surveys, although the questionnaires themselves are a kind of diary, some researchers (notably Stopher, 1992) have sent an additional memory jogger to stimulate even better travel recording. Both of these methods are usually accompanied by further assistance to ensure that travel and activities are not forgotten. The self-completion method can sometimes include specific reminders such as "Did you go anywhere else after this?" (Figure 5.3) which help to reduce the chance that major trips will be forgotten.



Figure 5.3 Maximising Trip Reporting in a Self-Completion Form Source: Socialdata Australia (1987)

The personal interview method, on the other hand, can use prompts such as "And what did you do next?" (even while the respondent is referring to a completed memory jogger) (Figure 5.4). This type of prompt is appropriate only for personal interviews since, while respondents are asked to think in the framework of activities ("The next thing I did was bake a cake..") the activities themselves are not recorded - only the next trip ("..and I found I was out of flour, so I walked to the shop"). Just the walk to the shop would be recorded.

Figure 5.4 Verbal Activity Recall Framework in Personal Interviews Source: Ampt (1992b)


Possibly the most thorough method of collecting comprehensive activity and travel data uses the announce-in-advance method, and gives respondents activity diaries (Figure 5.5) to carry with them throughout the day. As well as improving the recall of trips, activity diaries can also provide basic information on activity patterns which can be used in assessing potential trade-offs between travel and activities. The advantage of diary techniques is that they rely very little on the recall of past events (provided the diaries are filled in regularly). On the other hand, they require a greater degree of cooperation from respondents and there is also the possibility that the mere fact that a diary has to be filled in will affect the phenomena that are being measured. Certainly, in activity diary surveys, there must be an entry each day entitled "Filled in activity diary!"

In the personal interview prompting approach, which is sometimes called the "verbal activity recall framework" (Ampt, 1981), relatively minor trips which involve returning immediately to the same location (e.g. going to a local shop, going to lunch from work) are less likely to be forgotten as a coherent activity pattern is established.

Figure 5.5 Example of an Activity Diary Source: Jones et al. (1983)

Notwithstanding possible problems with the mechanics of diary completion, the concept of placing trips within the total context of a daily travel or activity pattern is one way of effectively minimising the problem of respondents not being able to answer questions accurately through lapses of memory.



5.4 PHYSICAL DESIGN OF SURVEY FORMS

An often overlooked aspect of questionnaire surveys is the physical nature of the form on which the data is to be recorded. Careful attention to this matter, however, can often lead to more efficient job performance by respondent, interviewer and data enterer. In addition, in self-completion surveys, an attractive, professional appearance of the survey form will always lead to higher response rates. In fact, for self-completion surveys, the overall appearance of the survey form is of vital importance since it is the only point of contact with the respondent. While the cost of producing a high quality survey form is obviously higher than for a single photocopied sheet of questions, it is money well spent, especially if the interest of the respondent needs to be aroused.

While more attention needs to be given to the physical nature of forms in self-completion surveys, it can never be disregarded altogether - even in personal interview surveys. The following guidelines will help in the design of forms for personal interview surveys and self-completion questionnaires; there are some special notes for intercept surveys.

For personal interview surveys, attention needs to be given to the following points:

(a) It is first of all necessary to determine whether the respondent or the interviewer will actually fill out the form. If the respondent is to do the writing, then the form will need to be designed more as a self-completion questionnaire, since the interviewer is merely there to give assistance in the interpretation of questions. Normally, however, the interviewer reads the questions and records the answers on the form. It is in this situation that the following comments are more applicable.

(b) Generally, the form should require a minimum amount of writing. If the interviewer is required to do a lot of writing when recording responses, then it is quite likely that the attention of the respondent will be lost while they wait for the next question to be asked.

(c) The interviewer should be provided with a separate list of instructions for each question so that they may guide the interview and provide interpretations of questions to respondents. Generally, these detailed instructions do not need to be included on every copy of the interview form since they should be contained in the Interviewers' Manual (Section 7.3). However, brief reminders of instructions and specific prompts can be included at appropriate points within the form.

(d) The interviewers should be trained to give an introduction which explains the purpose and background of the survey. Usually ample training will mean that it is not necessary to write this on the questionnaire form, although notes may be useful as reminders of points to cover.

(e) The form should contain a detailed flow chart of the sequence of questions, by means of arrows or Go to instructions, especially when there are many filtering and branching questions.

(f) Each form should be identified by a unique identification number to enable records to be kept of the status of that unit in the sample, e.g. has the interview been completed, refused or not yet attempted.

(g) It is extremely useful if each interview form is enclosed in a cover sheet for office and interviewer use only. It is used for administrative control and for recording selected data which the interviewer can obtain by observation. The cover sheet normally serves a range of functions such as:

(i) helping the interviewer to locate the sample household.

(ii) checking the accuracy of information about sample households.

(iii) maintaining a precise record of what happens at each sample household. Times and dates of successful and unsuccessful contacts should be recorded.

(iv) recording supplementary data, such as type of household and quality of surrounding environmental conditions.

(v) providing a space for recording response reports.

(vi) recording appointment times.

The cover sheet is ideally made of heavier quality paper than the rest of the questionnaire.

(h) All questions should be numbered consecutively throughout the form with no omissions or repetitions. This applies even when the interview is broken up into several discrete sections since consecutive numbering facilitates easy and non-ambiguous cross-referencing and branching control.

(i) Different type faces or fonts should be used for different elements of the interview form to facilitate easier administration of the survey by the interviewer (Figure 5.6). Three elements, in particular, should be segregated:

(i) Instructions to the interviewer;

(ii) QUESTIONS TO BE READ VERBATIM TO THE RESPONDENT;

(iii) Coding categories for the recording of responses.



Figure 5.6 Sample Page of Personal Interview Survey Form Source: Ampt (1992b)

For self-completion questionnaire surveys (with no interviewer), the survey form takes on extreme importance and should be the subject of extra attention in design, wording, and layout. Some general guidelines include:

(a) The overall layout should be clear, concise and, in general, should lead respondents to the next question. In this respect, arrows from branching questions can be a useful device, provided that there are not too many of them; too many arrows tend to confuse rather than assist.

(b) A minimal amount of writing should be required. Questions should require a "tick the box" reply if at all possible (Figure 5.7). Many people rarely need to write in their day-to-day lives, so that it can be quite threatening to have to complete a survey form. In addition, many people with a less than perfect grasp of a language, who are quite happy to speak it, find it embarrassing to write down their mistakes for perpetuity (as they perceive it).

(c) A short, non-technical summary of the aims of the survey should be included to increase respondent interest in the survey.

(d) Include general instructions on how to fill out the form at the start of the questions. If an example completed form is provided, refer to it in the introductory comments.

(e) Any detailed instructions for particular questions should be attached to the questions to which they refer.

(f) Assurances of confidentiality should generally be stated in the introduction, but should not be over-emphasised. Too much emphasis, especially when the survey covers material of a not very confidential nature, may have the counterproductive effect of arousing people's suspicions.

(g) The questionnaire should be endorsed and signed by someone with authority so as to lend credence to the survey effort. Just who this person should be varies with each survey (see Section 7.1). This is usually done on an accompanying letter.

(h) There should always be a phone number for respondents to ring about questions of clarification or legitimacy of the survey. The phone needs to be staffed on evenings and weekends as well as weekdays, since those are the times when respondents will usually want to ask questions.

(i) The form should be as small as possible, consistent with clarity, legibility, and sufficient space for recording of answers. For mail-back questionnaires, attention should be given to the approved maximum sizes for various postage rates. To maximise efficient use of space, consideration should be given to printing the introduction and instructions on a portion of the survey form which can be torn off and not returned through the mail. Going one step further, the questions can also be printed on the tear-off portion so that all that is returned in the post is a postcard containing the answers. This maximises the ratio of information received to return postage charges paid.

(j) Using two sides of a sheet of paper is better than using two sheets of paper to ensure that the questionnaire does not look longer than it actually is. However, if questions are printed on both sides then care should be taken in the selection of paper stock to ensure that it is relatively opaque. For single sheet surveys, the use of good quality card stock is recommended.

(k) The survey form should look professional and be printed in clear, easily readable type face. Professional artwork design is recommended for all questionnaires. The use of desktop publishing and object-oriented graphics programs has greatly facilitated the production of professional-looking survey forms in recent years. For major surveys, the use of multi-colour printing has been found to be a cost-effective way of increasing response rate. An example of the type of survey form which is relatively easy to produce these days is shown in Figure 5.7. This trip form was part of a package of survey forms produced by the authors for a self-completion questionnaire travel survey, and was loosely based on the KONTIV survey form design described by Brög and colleagues (Brög et al., 1983; Brög et al., 1985). Such forms can be produced using object-oriented graphics programs which are becoming more and more common. They can then be colour printed directly from the final camera-ready copy. The use of readily-available graphics programs in this way greatly facilitates the preparation of draft versions of the questionnaire for committee discussion, and enables changes to be easily made right up until the day before the questionnaire forms go to the printers.

Bear in mind that even the advertising you receive in your mail box looks very professional! Who will complete a less than professional looking survey form?

(l) In summary, self-completion questionnaire survey forms should be designed to encourage every respondent to reply - whether they are used to filling out forms or not.


Figure 5.7 Example of Self-Completion Questionnaire Survey Source: Richardson and Ampt (1993a)

For intercept surveys (e.g. public transport patronage studies), the following points need to be given special consideration, since often the questionnaires will be handed out or filled in on a moving vehicle:

(a) The form should be ergonomically designed. This means that ample space should be left for writing (if this is necessary) and data should not need to be recorded towards the edges of the sheets (which is difficult with no firm support for the survey form).

(b) The forms should be weather-resistant if necessary (for example if they are being done at the roadside) and, if rain-affected, should still be able to be filled in with pen or pencil. Note that instructions may need to be given to interviewers about the types of pen or pencil which are permissible. (Some tips! Felt-tipped pens smudge in the rain, ballpoint pens do not work at temperatures below 3°C, and pencils need lots of sharpening!)

(c) The forms should be of convenient size and format such that, for example, pages are easy to turn, and that the forms can be used by both right-handers and left-handers.

(d) The cardinal rule is for the investigator to test the survey form under actual, or simulated, field conditions before adopting the form for final use. The form should be tested under various conditions and especially when the survey workload is at its highest (e.g. with very full buses).

(e) The rule about the minimal amount of writing required is particularly important in intercept surveys. Questions should require a "tick the box" reply if at all possible. This is particularly important for on-board public transport surveys where vehicle movement can make writing extremely difficult. Another small but important point with on-board questionnaires is to provide the respondent with a pen or pencil to fill in the survey form. The pen could then be given to the respondent as a "freebie".

(f) Although response rates are best when intercept surveys are collected immediately, always provide a mail-back option for people who are genuinely in a hurry, have forgotten their reading glasses, or who simply cannot fill in the form while holding a brief case in one hand, and a child in the other.

5.5 QUESTION TYPES

In constructing a questionnaire there are three basic types of questions which may be asked: classification, factual and opinion questions.

5.5.1 Classification Questions

Classification questions are those questions which need to be asked in order to obtain a basic description, or classification, of the respondent. Such questions usually relate to socio-economic and demographic characteristics of the respondent, e.g. age, sex, income. Responses to classification questions are typically used to form sub-groups within the sample for later analysis or cross-checking with secondary data sources.

Classification questions can often be used to perform a screening function whereby members of certain sub-populations are either included or excluded from our final population. For example, questions relating to car ownership could eliminate car-owners from a population for a study of the transport disadvantaged. Such screening questions would be asked at the very beginning of the interview (perhaps when contact has first been made with the respondent at the door). The use of screening questions in this way can be a useful method of locating members of rare populations, when no sampling frame list for this population is available. Thus a totally random sample is first selected, and then only those passing the screening test are retained in the final sample.
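
The two-step screening procedure can be sketched in a few lines of Python. This is only an illustration: the function name `screen_sample`, the household records and the car-ownership rule are all hypothetical, and a real survey would apply the screen at first contact rather than to a ready-made list.

```python
import random

def screen_sample(households, sample_size, passes_screen, seed=None):
    """Draw a simple random sample, then keep only units passing the screen."""
    rng = random.Random(seed)
    initial = rng.sample(households, sample_size)   # step 1: random sample
    return [h for h in initial if passes_screen(h)]  # step 2: screening test

# Hypothetical frame of 300 households; every third one owns no car.
frame = [{"id": i, "cars": 0 if i % 3 == 0 else 1} for i in range(300)]

# Retain only non-car-owning households, as in a study of the
# transport disadvantaged.
retained = screen_sample(frame, 60, lambda h: h["cars"] == 0, seed=1)
print(len(retained), "of 60 sampled households passed the screen")
```

Because only a fraction of the initial sample passes the screen, the initial sample must be correspondingly larger than the final sample required.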

Classification questions can also be used as branching questions within the interview, so that respondents are only asked questions which are relevant to them, e.g. non-car-owners are identified so that they are not asked detailed questions about car ownership (see Figure 5.8).
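
For surveys administered by computer, the same branching idea might be expressed as a routing table in which each question names its successor. The sketch below is a hypothetical illustration in Python; the question numbers, wording and the `route` function are invented for the example and are not taken from Figure 5.8.

```python
# Each question names the next question to ask, depending on the answer.
QUESTIONS = {
    "Q1": {"text": "How many cars does your household own?",
           "next": lambda answer: "Q2" if answer > 0 else "Q3"},
    "Q2": {"text": "How many of these cars were used yesterday?",
           "next": lambda answer: "Q3"},
    "Q3": {"text": "How many people live in your household?",
           "next": lambda answer: None},   # None marks the end of the form
}

def route(answers, start="Q1"):
    """Return the ordered list of question numbers a respondent is asked."""
    asked, current = [], start
    while current is not None:
        asked.append(current)
        current = QUESTIONS[current]["next"](answers[current])
    return asked

print(route({"Q1": 0, "Q3": 4}))            # ['Q1', 'Q3']
print(route({"Q1": 2, "Q2": 1, "Q3": 4}))   # ['Q1', 'Q2', 'Q3']
```

The non-car-owning household is never shown Q2, which is the effect that arrows and Go to instructions achieve on a paper form.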

Since many classification questions may be seen by respondents as being somewhat tangential to the main purpose of the survey, it may be wise for the interviewer to explain why such details are needed (generally along the lines of comparing results for different groups in the population). Also, because some respondents may find the classification questions intrusive, it is often a good plan to have categories decided upon before the interview (on the basis of prior knowledge) and have the categories printed on cards. The interviewer can then simply present the card to the respondent and say "which of these groups applies to you?"

Classification data is very important when the dataset is likely to be used for secondary analysis - so that the new investigator can identify sub-populations within the original dataset. In these circumstances, it makes very good sense to use classification categories which are standard across many surveys, e.g. asking respondents the income categories used in the Census questions, since it is very likely that Census data will be used as secondary data at a later stage.



Figure 5.8 Classification Question, with Branching Source: Ampt Applied Research (1990)

5.5.2 Factual Questions

Factual questions deal with the respondent's experiences and knowledge and are the types of questions which are most suited to self-completion questionnaire surveys. Two points are of particular importance when shaping factual questions. First, it is necessary to ensure that the definitions of words and phrases used in the question (e.g. trip) are completely clear to the respondent so that the "facts" given by them are indeed the facts which we seek. The accuracy of the information received can only be as great as the clarity with which the question is asked. Second, one should ensure that it is reasonable for the respondent to be able to provide the facts we require.

The effect of memory lapses with respect to trip recall was discussed earlier as an example of this problem. One should also be careful in this regard when asking respondents to reply on behalf of other members of the family (i.e. to provide "proxy" responses). All that can be expected is the respondent's perception of the "facts" as they apply to the other family members.

One useful point to realise about factual questions in interview surveys is that probing can be used by the interviewer without great fear of biasing the responses; it is even possible to rephrase the question if necessary. The successful use of such strategies depends largely on the amount of training given to interviewers. If they are made fully aware of the scope and intention of such factual questions, then they can appraise the quality of the response and judge whether further clarification through probing is required.

At the outset, it is necessary to consider that reporting travel is essentially reporting factual information - i.e. what actually happened rather than opinions or attitudes. Since the respondent actually takes part in their own travel, the inability to answer factual questions accurately in this case could not be due to ignorance per se. If the questions are asked about past travel behaviour or trip making, it is much more likely to be due to the fallibility of human memory.


Past studies have consistently shown that, in surveys where respondents are required to recall the trips made at some time in the past, they do forget trips they have made (Clarke, Dix and Jones, 1981; Meyburg and Brög, 1981; Stopher and Sheskin, 1982). This leads to the phenomenon of under-reporting of trips, meaning simply that the number of trips reported is less than the number which actually occurred.

5.5.3 Opinion and Attitude Questions

In contrast to factual questions, opinion and attitude questions seek to obtain the opinions and attitudes, rather than the knowledge, of respondents. Because of this, they are much more sensitive to the wording and type of probing used. Questions of this type should be specifically identified on personal interview questionnaires as a reminder to the interviewer that they, like all other questions, must be asked verbatim. The difference between opinions and attitudes is somewhat difficult to define and there is, in fact, a large body of literature which addresses the topic (e.g. Pratkanis et al., 1989). For ease of understanding, some social psychologists make a rough distinction between different levels of a person's philosophy, calling the most superficial level "opinions", the next one "attitudes", a deeper level "values" or "basic attitudes", and a still deeper level, "personality" (Oppenheim, 1992). There are certainly relationships and patterns of connections between all of these layers. Usually the most important thing for the survey designer is to discover the way in which any of these levels is likely to affect behaviour.

Moser and Kalton (1979) suggest that opinion questions merely seek to determine whether a respondent agrees or disagrees with a given opinion statement. The Gallup Poll is an example of a collection of opinion questions. Attitude questions, on the other hand, often form a battery of coordinated opinion questions which attempt, through specific psychological theories, to form an assessment of the respondent's overall attitude towards a particular subject. The techniques of attitude measurement are highly developed and cannot be covered in detail here (see Golob (1973) and Tischer (1981) for detailed descriptions of attitude measurement). However, a few general comments on the types and use of opinion or attitude questions are warranted.

Unidimensional vs. Multidimensional Attitude Data

The collection of attitude data, particularly with respect to attitudes towards transport options, can proceed under one of two assumptions. The first states that each attribute of an alternative can be separately assessed in terms of its relative importance and its degree of satisfaction. These unidimensional attitude ratings can then be combined in some manner to obtain an overall attitude rating for the alternative.
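
The text leaves the method of combination open. One simple possibility, shown here purely as an assumed illustration and not as the method used by any particular study, is an importance-weighted average of the satisfaction ratings. The function name, attribute names and ratings below are hypothetical, and the calculation treats the ratings as if they were measured on at least an interval scale.

```python
def overall_rating(importance, satisfaction):
    """Importance-weighted mean of attribute satisfaction ratings."""
    total_weight = sum(importance.values())
    if total_weight == 0:
        raise ValueError("at least one attribute needs non-zero importance")
    weighted = sum(importance[a] * satisfaction[a] for a in importance)
    return weighted / total_weight

# Hypothetical ratings (1 = low, 5 = high) for a bus service.
importance = {"travel time": 5, "cost": 3, "comfort": 2}
satisfaction = {"travel time": 2, "cost": 4, "comfort": 3}

# (5*2 + 3*4 + 2*3) / (5 + 3 + 2) = 28 / 10
print(overall_rating(importance, satisfaction))   # 2.8
```

Other combination rules are equally possible; the point is only that each attribute is rated separately before an overall rating is formed.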



The second assumption follows in the Gestaltist tradition and states that an alternative can only be assessed in its totality. Thus attitude ratings can only be obtained for the alternative and not for the individual attributes. Information integration theory (Anderson, 1974) is one example of the use of multidimensional attitude data.

The vast majority of attitude data used in transport choice analysis, and elsewhere, is of the unidimensional type. The majority of discussion in this section will therefore concentrate on unidimensional attitude data, although some examples of multidimensional attitude measurement techniques will be given in this section, and will be explored later when discussing stated preference survey methods (Section 5.3.4.).

Profile vs Similarities Data

A further general distinction which must be made in relation to attitude data is between profile and similarities data. Profile data characterises one or more alternatives in terms of one or more attributes. Thus, for example, several alternatives may be rated in terms of the satisfaction gained from several attributes.

Similarities data, on the other hand, attempts to show how alike two alternatives are seen to be, either in general or with respect to a specific attribute. Both profile and similarities data may be collected on either a unidimensional or multidimensional basis.

Types of Measurement Scales

Before we discuss several specific examples of attitude measurement techniques, consider the types of scales which may be used. There are essentially four different types of scales which may be constructed:

• Nominal Scales
• Ordinal Scales
• Interval Scales
• Ratio Scales

Nominal scales serve simply to categorise people and objects into groups. The codes attached to each group do not imply any ordering. Examples of such scales include categorisations by sex and occupation. The only information obtained from a nominal scale is that objects with the same rating belong to the same category.

Ordinal scales serve not only in a classification manner but also impart order to objects rated on such a scale. If, for example, three objects (A, B and C) were rated on an ordinal scale, then the only information available on these objects, with respect to the characteristic in question, would be their order. Hence it could be stated, for example, that A > B > C but nothing would be known about the relative size of the difference between A and B and between B and C. An example of an ordinal scale is the Mohs Hardness Scale which simply states that materials with a higher number on the scale are harder than (i.e. will scratch) materials with lower numbers on the scale. Another example is the ranking procedure for destination zones as used in the intervening opportunities model of trip distribution. All that is known is that one zone is more attractive than another; how much more attractive is not known.

One consequence of using an ordinal scale is that it is impossible to perform any mathematical operations on the scale numbers (except ordering). For this reason, they are of limited usefulness as inputs to mathematical models of travel choice.

Interval scales impart both order and position to objects rated on such a scale. That is, they not only rank objects but they also give meaning to the distance between object ratings on the scale. Thus, relative values on the scale possess some meaning. However, there is no meaning attached to absolute values on the scale. This is because interval scales possess no fixed zero point; rather the selection of a zero point is completely arbitrary. Examples of interval scales in common usage are the Fahrenheit and Celsius temperature scales (each with a conventionally chosen zero point, such as the freezing point of water for Celsius), and the Gregorian calendar of years (with zero point determined by the birth of Christ).

As a result of the absence of a fixed zero point, it is not possible to multiply or divide with interval scale numbers although it is permissible to add and subtract such numbers.

Ratio scales impart order and length to an object rating and also have a determinate zero point which enables all four mathematical operations to be performed with ratio scale numbers. Thus the distance between two ratings has a meaning as does the ratio of two ratings. Examples of ratio scales are the Kelvin scale of temperature, the Decibel scale of loudness and numerous other physical scales such as length, weight and duration. A summary of the properties of the four types of scale is presented in Table 5.1.


Table 5.1 Summary of Scale Types

            ---------------------------- Statistics ---------------------------
Scale Type  Central Tendency  Variability Measure           Individual Position    Permissible Uses          Permissible Transformations

RATIO       Geometric Mean    Coefficient of Variation      Absolute Score         Find Ratios Between       Multiplication and Division
INTERVAL    Arithmetic Mean   Variance, Standard Deviation  Relative Score         Find Differences Between  Addition and Subtraction
ORDINAL     Median            Range                         Rank Percentile        Establish Rank Order      Any that Preserve Order
NOMINAL     Mode              Number of Categories          Belonging to Category  Identify and Classify     Substitution within Category

Note: The higher level scales subsume all the features of the lower level scales.

Attitudinal Measurement Techniques

Given these general definitions of scale types, consider some specific examples of attitudinal scaling techniques which are of use in transport choice analysis. The techniques discussed in this section are all unidimensional scaling techniques. Multidimensional techniques are discussed in the next section.

Paired Comparisons

Assume that there exists an ordering to a set of objects. One way of determining such an ordering would be by comparing all objects two at a time and noting the higher order object on each occasion. After carrying out all possible comparisons, the highest order object should have been selected every time it appeared in a comparison. The second-highest order object should have been selected every time it appeared except for when it was compared with the highest order object. This finding can be extended to all lower order objects, down to the lowest order object, which is never selected in any comparison. The number of times an object is selected will constitute an ordinal scale since it imparts order but not separation distance to the objects on that scale (i.e. all objects will be equally spaced on this scale). The concept of a paired comparison test has been extended to produce an interval scale by Thurstone (1959) based on his famous Law of Comparative Judgement (Thurstone, 1927). This law states that a stimulus - whether physical or otherwise - gives rise to a perceptual response within an individual which, for various random reasons, varies from presentation to presentation of the same stimulus as shown in Figure 5.9.

[Figure: Frequency of Response plotted against Response, with the distribution centred on the Stimulus]

Figure 5.9 Distribution of Response to a Single Stimulus on Repeated Occasions

If the individual is presented with a different stimulus this too will result in a distribution of responses within the individual. If the two stimuli levels are close enough and the response variances are large enough then the two distributions will overlap as shown in Figure 5.10.

[Figure: Frequency of Response plotted against Response, with two overlapping distributions centred on Stimulus 1 and Stimulus 2]

Figure 5.10 Distribution of Response to Two Different Stimuli

If now the individual is required to compare the two stimuli and make a selection of the higher order (e.g. the biggest, the best), then the probability of arranging the two stimuli in the correct order will be a function of the distance between them (and hence the degree of overlap of the two distributions).

Hence by allowing for an error in the perception of the stimuli levels, the number of times an object is selected will now constitute an interval scale since the number of incorrect selections will be directly related to the distance between the mean values of the perceived stimulus distributions.
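The conversion from selection counts to interval-scale values can be sketched in a few lines. The text does not say which form of the Law of Comparative Judgement to apply, so the sketch below assumes Thurstone's Case V (equal and uncorrelated response variances), in which each proportion of selections is converted to a standard normal deviate and the deviates are averaged. The function name, the clipping of unanimous results and the data are all hypothetical.

```python
from statistics import NormalDist


def thurstone_case_v(wins):
    """Estimate interval-scale values from a paired-comparison win matrix.

    wins[i][j] is the number of times object i was selected over object j.
    Assumes Thurstone's Case V (equal, uncorrelated response variances).
    """
    n = len(wins)
    normal = NormalDist()
    scale = []
    for i in range(n):
        z_sum = 0.0
        for j in range(n):
            if i == j:
                continue  # an object compared with itself contributes z = 0
            total = wins[i][j] + wins[j][i]
            p = wins[i][j] / total
            # Assumed safeguard: a unanimous result means the distributions
            # do not overlap, so the distance cannot be estimated; clip it.
            p = min(max(p, 0.01), 0.99)
            z_sum += normal.inv_cdf(p)
        scale.append(z_sum / n)
    # The origin of an interval scale is arbitrary: put the lowest at zero.
    lowest = min(scale)
    return [value - lowest for value in scale]


# Hypothetical data: 100 comparisons of each pair among features A, B and C.
wins = [
    [0, 60, 85],   # A selected over B 60 times and over C 85 times
    [40, 0, 75],   # B selected over A 40 times and over C 75 times
    [15, 25, 0],   # C selected over A 15 times and over B 25 times
]
scale_values = thurstone_case_v(wins)
print([round(value, 2) for value in scale_values])
```

With these illustrative counts the spacing between A and B is smaller than the spacing between B and C, which is information that a simple count of selections would not reveal.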

An example of a paired-comparison task is shown in Figure 5.11 for the estimation of system attribute importances. Paired-comparison tasks have been used by Golob et al. (1972) and Gustafson and Navin (1973).

In each of the following questions, please select the feature (A or B) which you would most prefer to be included in a public transport service for your journey to work.

1. A. Guarantee of obtaining a seat for the entire journey.
   B. Low waiting times at stations.

2. A. Low door-to-door travel time.
   B. Low fare.

Figure 5.11 Part of a Paired Comparison Question

Two major problems exist with the paired-comparisons technique. Firstly, the levels of adjacent stimuli must be close enough such that there exists some overlap between the two distributions in Figure 5.10. If there is no overlap, then no estimate of the distance between them can be inferred from the results of the comparisons and the resultant scale reverts to an ordinal scale. Secondly, the number of comparisons needed to compare all n objects in all possible ways is given by $n(n-1)/2$. Thus when the number of objects is large, the number of comparisons becomes prohibitive. It is possible to compare only some pairings (Gullikson, 1956; Bock and Jones, 1968) but this makes the analysis more complicated and less accurate.
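The growth in respondent burden can be illustrated with a short sketch. The feature names are taken from Figure 5.11; the function name is hypothetical.

```python
from itertools import combinations


def number_of_pairs(n):
    """Number of distinct pairings among n objects: n(n-1)/2."""
    return n * (n - 1) // 2


features = ["seat guarantee", "waiting time", "travel time", "fare"]

# Every question a respondent would face in a complete paired-comparison task.
pairs = list(combinations(features, 2))
for first, second in pairs:
    print(f"A. {first}   B. {second}")

# The count grows quadratically with the number of objects.
for n in (4, 10, 20):
    print(n, "objects require", number_of_pairs(n), "comparisons")
```

Four features need only six questions, but twenty objects would need 190, which is why partial pairings or a rank ordering task are usually preferred for long lists.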

Rank Ordering

In a rank ordering task, an individual is asked to rank a set of alternatives with respect to some attribute. For example, modes of transport may be ranked in order of the level of comfort associated with each. For one individual, such a ranking produces an ordinal scale of comfort measurement. However, by again employing Thurstone's Law of Comparative Judgement, and by having the individual perform repeated rankings, an interval scale may again be generated. However, because all alternatives are presented for comparison at the one time in the rank order task, whereas they were presented only two at a time in the paired comparisons task, it is likely that there will be less inconsistency in judgements in the rank order task for any one individual (i.e. the variance in the distributions will be reduced). For this reason, it is more difficult to produce an interval scale with individual rank order data. It is therefore more usual to obtain interval scales from rank order data by means of utilising the results obtained from a number of different individuals with the assumption that the individuals come from a homogeneous population.

The obvious advantage of the rank ordering method is that the task is not as onerous as the paired-comparisons task and that the task can be more easily expanded to take account of new alternatives for comparison. However, like paired-comparisons, the alternatives must be sufficiently close with respect to the attribute in question such that the distributions overlap to some extent. The rank order method can be used to obtain both profile and similarities data. An example of the rank-order task for similarities data is given by Nicolaidis (1977) and is shown in Figure 5.12. The data obtained from this method was used by Nicolaidis in a multidimensional scaling analysis technique known as INDSCAL (Carroll and Chang, 1970).

Category Scales

The allocation of objects to categories can produce scales of a nominal, ordinal or interval nature depending on the way in which the categories are defined. At the nominal scale level, objects may be allocated to categories on the basis of unorderable classifications (e.g. sex). At the ordinal scale level, such categories may indeed possess an order such as income categories or statements of preference for an object. Interval scales may be derived by ensuring that the descriptions of each category accord with the numerical value associated with each category. To construct an interval category scale it is necessary to know the relationships between standard words and phrases in terms of the numerical interpretation of such labels. A semantic atlas showing dimensional intensity loadings for several hundred English words has been constructed for this purpose (Jenkins et al., 1958).


For each of the modes of travel shown below, please specify which of the remaining modes you think are most similar with respect to the comfort experienced whilst using each mode. Indicate the most similar mode by putting its identification letter in the first set of brackets under the name of each mode of transport. Put the identification letter of the second most similar in the second set of brackets and so on for all the other modes.

The identification letters for the modes are:
A. Automobile
B. Bus
C. Bicycle
D. Motorcycle
E. Taxi
F. Walking
G. Hitch-hiking

Automobile   Walking   Taxi     Bicycle   Bus      Hitch-hiking   Motorcycle
1 [ ]        1 [ ]     1 [ ]    1 [ ]     1 [ ]    1 [ ]          1 [ ]
2 [ ]        2 [ ]     2 [ ]    2 [ ]     2 [ ]    2 [ ]          2 [ ]
3 [ ]        3 [ ]     3 [ ]    3 [ ]     3 [ ]    3 [ ]          3 [ ]
4 [ ]        4 [ ]     4 [ ]    4 [ ]     4 [ ]    4 [ ]          4 [ ]
5 [ ]        5 [ ]     5 [ ]    5 [ ]     5 [ ]    5 [ ]          5 [ ]
6 [ ]        6 [ ]     6 [ ]    6 [ ]     6 [ ]    6 [ ]          6 [ ]

Figure 5.12 Similarities Ranking Question

An example of such a category scale is shown in Figure 5.13. All the respondent has to do is to mark the importance of each attribute in the appropriate box. An analysis by Miller (1956) suggests that the most appropriate number of categories for such a scale, based on the limits of human discrimination, is seven plus or minus two. There should always be an odd number of categories to allow for a neutral rating by the respondent, if so desired.

There exists some debate as to whether scales of this type are truly interval or merely ordinal. Some researchers (e.g. Anderson, 1972) claim that the scale values are interval and use the numerical labels of the categories at face value. Others (e.g. Bock and Jones, 1968) state that they are really only ordinal but, by employing the Law of Comparative Judgement, they then proceed to convert the scale values into an interval scale. The category scale method has been applied to transport by Paine et al. (1969) and Brown (1977).

In considering the use of public transport for the journey to work, show the importance of the following features by circling one of the numbers 1 to 7 in accordance with the degree of importance you associate with each feature.

IMPORTANCE (column headings, in order): Extremely Important; Very Important; Somewhat Important; Neither Important nor Unimportant; Somewhat Unimportant; Very Unimportant; Extremely Unimportant

FEATURE
Guarantee of getting a seat for the entire trip ..... 1 2 3 4 5 6 7
Low waiting times at stations ....................... 1 2 3 4 5 6 7
Low door-to-door travel time ........................ 1 2 3 4 5 6 7
Low fare ............................................ 1 2 3 4 5 6 7

Figure 5.13 Category Scale Question

Likert Scales

The Likert scale consists of a number of attitudinal statements of different polarities and degrees of extremity. The respondent rates each statement along a five-point dimension denoted by: strongly disagree, disagree, uncertain, agree and strongly agree. This type of scale is shown in Figure 5.14. The final score with respect to the attitude in question is given by a combination of the ratings on each statement and the extremity of view expressed by each statement.
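As a rough illustration of how the ratings might be combined, the sketch below uses the simplest form of summated rating: statements of negative polarity are reverse-scored and the ratings are then added. This is an assumption, since the text does not give a scoring formula, and it ignores any weighting for the extremity of each statement. The statements, polarities and answers are hypothetical.

```python
# Numerical values assumed for the five response categories.
RATINGS = {
    "strongly disagree": 1,
    "disagree": 2,
    "uncertain": 3,
    "agree": 4,
    "strongly agree": 5,
}


def likert_score(responses, polarities):
    """Sum the ratings, reverse-scoring statements of negative polarity."""
    score = 0
    for statement, answer in responses.items():
        rating = RATINGS[answer]
        if polarities[statement] < 0:
            rating = 6 - rating  # 1 <-> 5, 2 <-> 4, 3 unchanged
        score += rating
    return score


# Hypothetical statements about bus travel: +1 favourable, -1 unfavourable.
polarities = {
    "Buses usually run on time": +1,
    "Buses are too crowded to be comfortable": -1,
    "I enjoy travelling by bus": +1,
}
responses = {
    "Buses usually run on time": "agree",
    "Buses are too crowded to be comfortable": "disagree",
    "I enjoy travelling by bus": "strongly agree",
}
total = likert_score(responses, polarities)
print(total)  # 4 + 4 + 5 = 13 out of a possible 15
```

A higher total indicates a more favourable attitude; disagreeing with an unfavourable statement counts in the same direction as agreeing with a favourable one.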

The Likert scale is similar in concept to two other techniques which seek responses to statements of varying polarity and extremity. These are the Thurstone scale and the Guttman scale (Guttman, 1950; Fishbein, 1967). Although the question format and analysis techniques are somewhat different, the three scales have been shown to give very similar results.


Figure 5.14 Likert Scale Question

An important aspect of the use of Likert, and other, scales is the need to pre-test extensively with a panel of respondents to establish the extremity of the view expressed by each statement (Oppenheim, 1992). This is done by writing a set of attitude statements, and then ascertaining the opinions expressed towards these statements by a panel of representative respondents. Once the polarity and extremity of each statement are established, a balanced pool of questions is then selected for use in the final survey.

Semantic Differential Scales

One of the most widely used of the attitude scales is the semantic differential scale developed by Osgood et al. (1957). The scale consists of a seven-point scale which is labelled at each end with bipolar adjectives describing the quality in question, e.g. good/bad, satisfactory/unsatisfactory, expensive/cheap. The respondent rates the attribute in question by placing an X on the scale at a position which is indicative of the strength of the response. A sample semantic differential scale is shown in Figure 5.15. The semantic differential scale can be seen to be closely related to a category scale. The principal difference lies in the use of end-anchors only with the semantic differential whereas intermediate labels are used for category scales.

In considering the use of public transport for the journey to work, show the importance of the following features by marking with an X on the scale-line to correspond with the degree of importance you associate with each feature.

IMPORTANCE (each scale-line is labelled "Extremely Important" at one end and "Extremely Unimportant" at the other)

FEATURE
Guarantee of getting a seat for the entire trip
Low waiting times at stations
Low door-to-door travel time
Low fare

Figure 5.15 Semantic Differential Scale Question

Examples of transport studies which have used semantic differential scales include Golob (1970), Nicolaidis (1977), Ackoff (1965), Sherret (1971), Hartgen and Tanner (1971) and Golob, Dobson and Sheth (1973). Such scales can be used equally well to obtain profile or similarities data.

Ratio Scales

Although there are many methods for the generation of ratio scales (see, for example, Guilford, 1954; Carterette and Friedman, 1974; Torgerson, 1958), three methods are of particular importance in transport choice analysis.

(i) Fractionation

This scaling procedure involves presenting the respondent with an assumed rating for one object and then requiring the respondent to select another object whose rating is a particular fraction of the initial object's rating. Thus, for example, a respondent may be told the accessibility of one location to the city centre and then asked to nominate other locations whose accessibility is a fraction (e.g. a half) of the original location. The scale of accessibility so derived would be a true ratio scale of perceived accessibility.

(ii) Multiple judgement

Closely related to judgements of fractions are judgements of multiples. The respondent is given an initial rating for an object and then asked to select another object whose rating is a given multiple of the initial rating.

Although the methods of fractionation and multiple judgements have been considered by some (Torgerson, 1958; Hanes, 1949) to be similar, it is not altogether clear that they represent the same psychological process. For example, fractionation methods may be considered as an interpolation between the nominated initial rating and a fixed zero point, thus generating a true ratio scale. On the other hand, multiple judgement methods may be considered as an interpolation between the nominated rating and the highest rating of that type of object that the respondent has ever experienced. In this case the zero point is arbitrary, and hence the ratings do not necessarily generate a ratio scale.

(iii) Magnitude Estimation

In the magnitude estimation method, the respondent is asked to assign scale values to a series of objects in accordance with the subjective impressions they elicit. No arbitrary reference point is specified at the start; the respondent chooses both the reference point and the subsequent ratios to this reference point. The origins of this method are attributable to Richardson (1929) but the modern development of this theory has been most strongly associated with Stevens (1956, 1967).

As an example of this method, a respondent may be asked to assign a number (any number) to represent the accessibility of a location to the city centre. The presentation of other locations to the respondent should then elicit responses which are in direct proportion to the accessibility ratio between the first and the present locations.

Considerable debate exists as to whether category scales or ratio scales are the most appropriate measurement techniques. Stevens (1974), for example, states that "For the purposes of serious perceptual measurement, category methods should be shunned. The deliberate and ill-conceived imposition of a limited set of response categories forces the subject into portioning. At that point, the hope for a ratio scale must fail". On the other hand, Anderson (1976) is equally as forceful when he states that "It seems appropriate, therefore, to conclude that the rating method can yield true interval scales and that the method of magnitude estimation is biased and invalid". To be sure, there exists a non-linear relationship between category rating results and ratio scale results as demonstrated by Stevens and Galanter (1957), and acknowledged by Anderson (1976). The form of such a relationship is shown in Figure 5.16, and demonstrates the concave relationship usually found between category and ratio scale results.

[Figure: Category Scale Rating (vertical axis, from 1 = Very very Soft, through 4 = Medium, to 7 = Very very Loud) plotted against Magnitude Estimation Rating of Loudness in sones (horizontal axis, 0 to 70)]

Figure 5.16 Comparing Category Scale and Magnitude Estimation Ratings (Source: Stevens and Galanter, 1957)

Although there is obviously a difference between the results there is no clear indication of which is correct. The problem is confused even further when it is realised that the number system which is used is also subject to considerable doubt as to which type of scale it represents. Is it an interval or ratio scale, and is it perceived as a linear scale? (Jones, 1974). It has been empirically demonstrated, for example, that the number series (from 1 to 10, at least) is subjectively perceived as a power function with an exponent of 0.49 (Rule, 1971). That these basic questions about scaling remain unresolved after a century of psychological research should be some comfort for transport planners attempting to come to grips with the area of attitudinal measurement.

Constant Sum Allocation

A final unidimensional ratio scaling technique which is of relevance to transport choice is the method of constant sum allocation (Comrey, 1950). The original technique involves the division of 100 points between pairs of objects in such a way that the assigned values indicate the relative amounts of some characteristic which each object possesses. The method used by Comrey is based on a paired comparison format. However, considerable economies in effort in data collection can be effected by considering all objects at one time and dividing the 100 points between all the objects.
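The all-objects-at-once variant can be sketched as follows. The allocation shown is hypothetical, as are the function and variable names; the check that the points add to 100 reflects the defining constraint of the method.

```python
def constant_sum_shares(points, total=100):
    """Validate a constant sum allocation and express it as shares of the total."""
    allocated = sum(points.values())
    if allocated != total:
        raise ValueError(f"points add to {allocated}, not {total}")
    return {name: value / total for name, value in points.items()}


# Hypothetical response: 100 points divided between four service features.
allocation = {
    "seat guarantee": 10,
    "waiting time": 20,
    "travel time": 30,
    "fare": 40,
}
shares = constant_sum_shares(allocation)

# Because zero points means "no importance", ratios of allocations are meaningful.
fare_to_seat = allocation["fare"] / allocation["seat guarantee"]
print(shares)
print("fare is rated", fare_to_seat, "times as important as a seat guarantee")
```

The ratio statement in the last line is what distinguishes this from a category scale, where only differences (at best) could be interpreted.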

This constant sum allocation method has been used in transport choice analysis to measure attribute importances (Hensher, 1972) and behavioural intent (Hartgen and Keck, 1974). It appears that this relatively straightforward method could be used more often in the measurement of attitudes in terms of a ratio scale.

Attitudinal Questions in Pilot Surveys

One of the most important aspects of the use of attitude or opinion questions - and one which is almost invariably overlooked in travel-related research - is the pilot test. Its importance is documented again and again in the social science literature (e.g. Oppenheim, 1992). That literature indicates that these pilot tests need to take the form of in-depth interviews, the essential purpose of which is two-fold:

1) to explore the origins, complexities and effects of the attitude areas in question in order to decide more precisely what is to be measured; and

2) to get vivid expression of these attitudes from the respondents in a form that could be used in the statements on an attitude scale.

Suppose we were trying to construct a scale to measure people's attitudes to improving the frequency of buses in their area. We may well find in exploratory interviews that almost everyone is in favour of this measure. A simple "for or against" scale on improving frequency would, therefore, show little differentiation. It can often happen that the pilot test actually causes a change in the aim of the scale (e.g. levels of frequency rather than to increase or not to increase), and possibly of the aim of the investigation.

Next, we may propose to build a scale dealing with "relevance of bus frequency to mode choice" - dealing with the extent to which considerations of bus frequency enter into people's minds when they choose a mode of transport. At first, it would seem that this attitude is directly related to people's knowledge about bus frequencies; and it would seem relatively easy to differentiate between those with more or less knowledge about the bus service on the basis of a few well-chosen factual knowledge questions. Further exploration may show, however, that many people with little correct knowledge about bus frequencies nevertheless are very interested in them and are influenced by friends'/neighbours'/newspapers' claims about them. We begin to find various links between a person's attitudes to bus frequencies and their attitudes to other aspects of their life; for example, community/neighbourhood awareness may influence the perception of bus frequencies, while concern about house prices may also influence whether people want increased bus frequencies. And we could continue to find a variety of other linkages outside the narrow realm of transport services! "This is a part of social psychology where clinical acumen, intuition and a capacity for listening with the third ear are invaluable and where chairbound, preconceived frameworks may constitute a real hindrance." (Oppenheim, 1992).

After doing perhaps thirty or forty preliminary interviews, it is then necessary to decide what it is we wish to measure. Only then will we be in a position to draw up a conceptual sketch of the clusters of attitudes in question with the likely linkages.

The Asking of Attitudinal Questions

It is essential when asking opinion questions in personal interview surveys that interviewer effects be reduced to a minimum. Thus, in addition to asking the questions verbatim, it is important to realise that even seemingly innocuous comments may have considerable impact on the way in which opinion questions are answered. All answers must be accepted by the interviewer as perfectly natural and no signs of positive or negative reaction should be displayed. These reactions may either be verbal (such as saying "I see" with an inflection after a response is given) or visual (such as raising the eyebrows).

Apart from possible interviewer effects, attempts to obtain opinions may meet with other problems. First, respondents may simply have no opinion on a particular topic - they may not have thought about it before or else, if they have, then the topic may be so unimportant to them that they have not bothered to form an opinion. Second, respondents may have conflicting opinions on a topic. They may see both good and bad sides to the topic and may not be able to give a definite opinion without a large number of qualifying statements. Third, although an opinion may be elicited from a respondent, the intensity of opinion may vary considerably between respondents. Thus some respondents may feel very strongly about a topic (either positively or negatively) while others may feel less strongly but in the same direction. It is particularly important that there is room for respondents to record all these types of responses either on the questionnaire form or to the interviewer.

Because opinion questions generally require a spontaneous and individual response from a respondent, they are generally more suited to personal interview surveys than to self-completion questionnaires, despite the above reservations about interviewer effects.

The preceding discussion of attitudinal measurement techniques has been necessarily brief. A vast array of literature awaits those who wish to delve more deeply into the subject. In terms of general reading on attitudinal measurement principles the principal texts are those by Carterette and Friedman (1974), Guilford (1956), Bock and Jones (1968), Torgerson (1958) and Oppenheim (1992). Examples of studies reviewing the application of attitudinal measurement techniques to transport choice analysis are Hartgen (1970), Golob (1973), Golob and Dobson (1974), Michaels (1974), Dobson (1976), Spear (1976), Levin (1979a), Golob, Horowitz and Wachs (1979), Dix (1981), Tischer (1981) and Levin (1979b).

5.5.4 Stated Response Questions

Two types of multidimensional scaling technique are of particular relevance to transport choice analysis. The first involves the rating of an alternative, overall, by means of one of the techniques mentioned above - that is, the application of a unidimensional scaling technique to a multidimensional object. This method is frequently used to ascertain how the unidimensional ratings of the individual attributes might be combined into an overall rating of the alternative.

The second method is known by various titles such as conjoint measurement (Luce and Tukey, 1964; Krantz and Tversky, 1971), information integration (Anderson, 1971, 1974), functional measurement (Anderson, 1970; Levin, 1979a; Meyer, Levin and Louviere, 1978), and, in recent years, stated preference or stated response (Pearmain et al., 1991; Hensher, 1994). The principal feature of each of these methods is that they seek the respondent's reaction to a series of hypothetical combinations of attribute levels. The set of questions is determined on the basis of an experimental design which seeks to present a balanced set of situations to the respondent.

Stated response methods are particularly useful in two contexts:

• when a substantially new alternative is being introduced and there is little or no historical evidence of how people might respond to this new alternative

• when the investigator is trying to determine the separate effects of two variables on consumers' choices, but where these two variables are highly correlated in practice.

Because of the manner in which the set of questions has been determined by an experimental design, the investigator has control over the combinations of attributes to which the respondent will respond. This is particularly important in the second context listed above, because it enables the investigator to isolate the individual effects of the various attributes.

The design of the choice situations to be presented to the respondents is an important component of the overall design of stated response surveys. Pearmain et al. (1991) offer a simple example of such design by considering a situation involving three attributes for a public transport service: fare, travel time and frequency. If each attribute has only two levels (viz, high-low, slow-fast, frequent-infrequent), then there are eight different combinations of these options as shown in Table 5.2.

Table 5.2 A Simple Stated Response Experimental Design

          Public Transport Attributes
Option    Fare    Travel Time    Frequency

1         Low     Fast           Infrequent
2         Low     Fast           Frequent
3         Low     Slow           Infrequent
4         Low     Slow           Frequent
5         High    Fast           Infrequent
6         High    Fast           Frequent
7         High    Slow           Infrequent
8         High    Slow           Frequent

The respondent could then be asked to rank these options in order of preference, and from the combined responses of a sample of respondents, the relative importance attached to fares, travel times and frequency could be determined. Importantly, because of the orthogonal nature of the experimental design (i.e. each variable is independent of all other variables in the set of options presented to the respondent), the importances attached to each attribute are true reflections of the separate effects of each attribute.
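The full-factorial design of Table 5.2 and its orthogonality can be checked with a short sketch. The attribute levels are those of the table; the variable and function names are hypothetical, and "orthogonal" is tested here in the simple sense that every pair of levels of any two attributes appears equally often.

```python
from collections import Counter
from itertools import combinations, product

# Attribute levels, listed in the order used in Table 5.2.
levels = {
    "fare": ["Low", "High"],
    "travel_time": ["Fast", "Slow"],
    "frequency": ["Infrequent", "Frequent"],
}

# Full-factorial design: every combination of one level per attribute.
options = [dict(zip(levels, combo)) for combo in product(*levels.values())]


def is_balanced(attr_a, attr_b):
    """True if each pair of levels of the two attributes occurs equally often."""
    counts = Counter((opt[attr_a], opt[attr_b]) for opt in options)
    expected_pairs = len(levels[attr_a]) * len(levels[attr_b])
    return len(counts) == expected_pairs and len(set(counts.values())) == 1


orthogonal = all(is_balanced(a, b) for a, b in combinations(levels, 2))

for number, opt in enumerate(options, start=1):
    print(number, opt["fare"], opt["travel_time"], opt["frequency"])
print("orthogonal:", orthogonal)
```

Because fare, travel time and frequency are uncorrelated across the eight options, any difference in preference between low-fare and high-fare options cannot be attributed to the other two attributes.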

As with unidimensional scales, the respondent may be asked to perform different tasks with the presented information. For example, they could be asked to:

• rank the alternatives in order of preference

• assign a rating to each alternative to reflect their degree of preference

• select the single alternative which they prefer the most

• select choices in a paired comparison manner from a series of two-way choice situations

Each of these methods has its own strengths and weaknesses, both from the point of view of the respondent and the analyst.

One of the problems with stated response methods is that the set of options shown in Table 5.2 is extremely limited. For example, it is likely that more than two levels of each of the attributes would need to be tested, and perhaps more than three attributes will need to be evaluated. However, as the number of attributes and attribute levels increases, so too does the number of possible combinations of the attribute levels. For example, if we wish to test three levels of three attributes, that would result in 27 combinations; three levels of four attributes would require 81 combinations. Clearly, it is impossible to expect respondents to be able to consider this many different situations. Kroes and Sheldon (1988) suggest that a maximum of 9 to 16 options is acceptable, with most current designs now adopting the lower end of this range. With a maximum of 9 options for the respondent to consider, this severely limits the number of attributes that can be considered. For example, the following designs are available with 9 or fewer options:

• with two attribute levels
      $2^2 = 4$ options (2 attributes with 2 levels)
      $2^3 = 8$ options (3 attributes with 2 levels)

• with three attribute levels
      $3^2 = 9$ options (2 attributes with 3 levels)

• with mixed levels
      $2^1 \times 3^1 = 6$ options (1 attribute with 2 levels, 1 attribute with 3 levels)
      $2^1 \times 4^1 = 8$ options (1 attribute with 2 levels, 1 attribute with 4 levels)
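The list above can be reproduced by enumeration: the number of options in a full-factorial design is the product of the numbers of levels of its attributes. The sketch below is illustrative only; the function name and the search limits (at least two attributes, at least two levels each) are assumptions.

```python
from itertools import combinations_with_replacement
from math import prod


def feasible_designs(max_options=9, max_levels=4, max_attributes=3):
    """List full-factorial designs whose number of options is within the limit.

    Each design is a tuple giving the number of levels of each attribute.
    """
    designs = []
    for n_attributes in range(2, max_attributes + 1):
        level_choices = range(2, max_levels + 1)
        for design in combinations_with_replacement(level_choices, n_attributes):
            size = prod(design)  # options = product of levels per attribute
            if size <= max_options:
                designs.append((design, size))
    return designs


for design, size in feasible_designs():
    print(" x ".join(str(levels) for levels in design), "=", size, "options")

# The larger designs mentioned in the text exceed the limit by a wide margin.
print(prod([3, 3, 3]), prod([3, 3, 3, 3]))
```

Only five designs survive the limit of 9 options, and none of them has more than three attributes, which is the constraint that the strategies discussed next are intended to relax.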

To overcome this limitation, and yet be able to consider more attributes and/or more attribute levels, it is necessary to adopt one of the following strategies (Pearmain et al., 1991):

• use a "fractional factorial" design, whereby combinations of attributes which do not have significant interactions are omitted from the design. A significant interaction is said to exist when the combined effect of two attributes is significantly different from the combination of the independent individual effects of these two attributes.

• remove those options that will "dominate" or "be dominated" by all other options in the choice set. For example, in Table 5.2, option 7 is dominated by all other options, while option 2 dominates all others. These options could be removed from the choice set, on the assumption that all "rational" respondents would always put option 2 first and option 7 last in any ranking, rating or comparison process.

• separate the options into "blocks", so that the full choice set is completed by groups of respondents, but with each group responding to a different sub-set of options. Each group responds to a full-factorial design within each sub-set of options, and it is assumed that the responses from the different sub-groups will be sufficiently homogeneous that they can be combined to provide the full picture.

• present a series of questions to each respondent, offering different sets of attributes, but with at least one attribute common to all to enable comparisons to be made. Often the common attribute will be time or cost to enable all other attributes to be measured against easily understood dimensions.

• define attributes in terms of differences between alternatives (e.g. travel time difference between car and train). In this way, two attributes are reduced to one attribute in the experimental design. However, they may still be presented as separate attributes to the respondent on the questionnaire.

Adoption of one, or more, of the above strategies will allow more information to be obtained from stated response questionnaires while keeping the task relatively manageable for the respondent.
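As an illustration of the first two ideas, the following minimal Python sketch enumerates a full-factorial design for one attribute with 2 levels and one attribute with 4 levels, and then removes the option that dominates all others and the option that is dominated by all others. The attribute names and levels are hypothetical, and the sketch assumes that the levels of each attribute can be ordered from worst to best for every respondent.

```python
from itertools import product

# Hypothetical attributes; levels are listed from worst to best (an assumption).
ATTRIBUTES = {
    "fare": ["high", "low"],  # 1 attribute with 2 levels
    "frequency": ["every 60 min", "every 30 min",
                  "every 15 min", "every 10 min"],  # 1 attribute with 4 levels
}

def full_factorial(attrs):
    """Return every combination of attribute levels as a list of options."""
    names = list(attrs)
    return [dict(zip(names, combo))
            for combo in product(*(attrs[n] for n in names))]

def rank(attrs, option):
    """Position of each chosen level within its attribute (0 = worst)."""
    return tuple(attrs[name].index(option[name]) for name in attrs)

def dominates(attrs, a, b):
    """True if option a is at least as good as b on every attribute and differs from b."""
    ra, rb = rank(attrs, a), rank(attrs, b)
    return ra != rb and all(x >= y for x, y in zip(ra, rb))

def drop_extreme_options(attrs, options):
    """Remove options that dominate, or are dominated by, all other options."""
    kept = []
    for opt in options:
        others = [o for o in options if o is not opt]
        dominates_all = all(dominates(attrs, opt, o) for o in others)
        dominated_by_all = all(dominates(attrs, o, opt) for o in others)
        if not (dominates_all or dominated_by_all):
            kept.append(opt)
    return kept

options = full_factorial(ATTRIBUTES)
reduced = drop_extreme_options(ATTRIBUTES, options)
print(len(options), len(reduced))  # prints "8 6"
```

The remaining six options could then be divided into blocks or reduced further by a fractional factorial design; those steps depend on assumptions about interactions and are not attempted here.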

The major weakness of stated response methods, however, is that they seek the reactions of respondents to hypothetical situations and there is no guarantee that respondents would actually behave in this way in practice. This is particularly the case if the respondent does not fully understand the nature of the alternatives being presented to them. There is thus a high premium on high-quality questionnaire design and testing to ensure that respondents fully understand the questions being put to them. Unfortunately, this level of care does not appear to be the norm at the present time. While a lot of attention has been placed on refining the nature of the experimental designs, and on increasing the sophistication of the analysis techniques to be employed after the data has been collected, relatively little attention has been paid to improving the quality of the questions being put to the respondents. With few exceptions (e.g. Bradley and Daly, 1994), relatively little attention has been focussed on testing for methodological deficiencies in the survey techniques used to obtain stated response data. There are numerous examples of stated response questionnaires in which the questions being asked of the respondent are almost unintelligible (even to a trained professional). Future work in this area must pay much greater attention to the quality of the survey instrument.

5.6 QUESTION FORMAT

The format of a question describes the way in which the question is asked and, more importantly, the way in which the answer is recorded. The choice of question format is closely related to the choice of data processing procedures to be used later in the survey process. Three basic types of question format are available:

(a) open questions
(b) field-coded questions
(c) closed questions

5.6.1 Open Questions

Open questions are answered by the respondent in their own words which are then recorded verbatim (as much as possible) and coded at a later date (Figure 5.17). Open questions can be used in personal interview surveys, where the interviewer does the recording, or in self-completion questionnaire surveys where the respondent does the recording by means of a written answer.

WHAT IS YOUR COMPANY POLICY FOR BUSINESS TRAVEL? (i.e. specific modes, costs - upper limits, etc.)

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

Figure 5.17 An Open Question in a Personal Interview
Source: Ampt Applied Research and Transport Studies Unit (1989)

The open question format has a number of distinct advantages:

(a) This format generally improves the relationship with respondents because they feel that their own personal views are of some importance. In this respect, open questions are often good introductory questions in interviews to make the respondent feel at ease, and to stimulate interest in the survey topic. They are also useful concluding questions at the end of any survey to enable the respondent to "let off steam" about the survey topic.

(b) There is more opportunity for probing by the interviewer to bring out further points in discussion. Specific probes may be provided to each interviewer to guide discussion.

(c) Open questions can be of great help to the investigator in developing ideas and hypotheses which may later be tested by more structured questions.

(d) Open questions are most useful in pilot surveys where the range of responses to a question may be initially unknown (see Section 5.5.3 for a discussion on this aspect of attitude questions). The pilot survey open question responses can be used to define an expected range of responses to make it possible to develop a satisfactory closed question format for the main survey.

(e) The use of an open question, immediately after a closed question, can be useful for interpreting the respondents' perception or understanding of the closed question. This technique is particularly useful in pilot studies where wording is being tested.

(f) The quotations obtained from open question responses may be of direct value when writing the survey report, to provide a degree of human interest and to emphasise dramatically the meaning and significance of otherwise sterile statistical results.

Open questions are not, however, without their disadvantages. In particular, it has been found that open questions can have the following problems:

(a) By forcing respondents to express their feelings in their own words, they can be threatening.

(b) They can give answers which are vague and not related to the subject matter of the survey.

(c) They can also give quite valid answers, the range of which is so wide as to be completely unmanageable from a coding viewpoint.

(d) When used in personal interview surveys, they are generally subject to interviewer effects in two main ways. First, the amount of information obtained from an open question will depend, to a large extent, on the amount of probing done by the interviewer. Differences in response may therefore be a function of the interviewer rather than of the respondent. Second, although the answers to open questions are intended to be recorded verbatim, this is generally impossible in an interview situation. Interviewers are therefore somewhat selective in recording responses, recording those aspects which they feel are most important. In this way, interviewer bias will affect the results.

(e) The response to open questions can also be subject to respondent bias. Thus, in personal interviews, a larger number of comments will be received from the more loquacious members of the population irrespective of whether they have more information to contribute or not. Similarly, in self-completion questionnaires, a greater number of comments will be received from the more literate members of the population. Since both these characteristics are related to the social class and level of education of the respondent this could introduce a bias into the interpretation of the results of open questions.

(f) For personal interview surveys, open questions generally mean that a double-processing of responses is necessary - once in the field when the response is recorded verbatim and once in the office where the responses are assigned a numerical code.

(g) If too many open questions are used in an interview, it can be very tiring, and/or annoying, for both interviewer and respondent. It can be particularly annoying if the same subject matter is covered later in the interview in closed question format. The respondent may rightly wonder why they are being asked a question to which they have already given a complete answer.

The general recommendation for open questions in personal interviews and in self-completion questionnaires is therefore to use them sparingly. Be aware that although they will often provide a richer source of information, they also carry with them some significant practical disadvantages.

5.6.2 Field-Coded Questions

In an attempt to avoid some of the disadvantages of open-ended questions in personal interview surveys, use is often made of field-coded questions, in which the interviewer asks what is apparently an open question but then, instead of recording the answer verbatim, classifies the response into one of several predetermined categories (Figure 5.18). The categories available are, however, known only to the interviewer.


Figure 5.18 Field-Coded Questions (D&E) in a Personal Interview
Source: Ampt and Waters (1993)

The principal advantage of field-coded questions is that they eliminate double-processing of the responses by having the interviewer do the coding. However, this is also the source of their prime disadvantage since considerable undetectable interviewer bias may exist in the interpretation of the responses and the selection of the code to match the response. As in all situations, the training of the interviewer is crucial in these cases. The Interviewers' Manual (Section 5.9) becomes essential here since it should contain all definitions needed (e.g. a hairdresser might be coded as "Personal Business/Services" and "Education" may exclude hobby courses). It is essential that the interviewer not only knows what to do, but also understands the reasons for asking each question, so that their coding is done with comprehension and not simply according to pre-specified rules.
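The kind of definitions held in an Interviewers' Manual can be written down as a simple code list. The Python sketch below is only an illustration of how such a list might be checked for consistency in the office; the keywords, other than the hairdresser and hobby course examples mentioned above, are hypothetical, and a keyword lookup of this kind is not a substitute for the interviewer's understanding of the question.

```python
# Hypothetical extract from a code list for trip purpose.
CODE_LIST = {
    "Personal Business/Services": ["hairdresser", "bank", "post office"],
    "Education": ["school", "university", "lecture"],
    "Shopping": ["supermarket", "shops", "market"],
}

# Definitions that exclude an answer from a category (from the text: "Education"
# may exclude hobby courses).
EXCLUSIONS = {"Education": ["hobby course"]}

def field_code(answer: str) -> str:
    """Return the first category whose keywords match the verbatim answer."""
    text = answer.lower()
    for category, keywords in CODE_LIST.items():
        if any(term in text for term in EXCLUSIONS.get(category, [])):
            continue  # the manual's definition rules this category out
        if any(word in text for word in keywords):
            return category
    return "Other (record verbatim)"

print(field_code("I went to the hairdresser"))      # Personal Business/Services
print(field_code("A hobby course at the school"))   # Other (record verbatim)
```

Answers which fall through to the "Other" category are recorded verbatim, so that they can be coded later in the office as for an open question.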

5.6.3 Closed Questions

In an effort to eliminate interviewer bias altogether, most questionnaires make greatest use of closed questions, which may be defined as those questions where the respondent is presented with a list of possible responses and is then asked to fit themselves into the appropriate category in response to the question (Figure 5.19).

Figure 5.19 A Closed Question in a Personal Interview
Source: Ampt and Waters (1993)

A number of features of closed questions deserve specific mention:

(a) Closed questions are most useful for factual questions, although opinion questions can be asked by means of psychometric scaling techniques (see Tischer, 1981).

(b) The use of pre-selected categories is very valuable in helping to define the meaning and scope of the question for the respondent.

(c) On the other hand, the scope of the responses must not be unnecessarily restricted by the range of the response categories offered. The response categories should be:

(i) exhaustive (Note that by presenting some, but not all, categories you run the risk of suggesting to the respondent that their responses should fit into those options offered);

(ii) mutually exclusive;

(iii) unambiguous.

(d) To avoid forcing people into categories which are not really appropriate (which may bias the results and will certainly annoy the respondent), an open alternative should be offered, where appropriate (e.g. "don't know", "not applicable", "other (please write in)") as shown in Figure 5.20.


Figure 5.20 An Open Alternative in a Closed Question
Source: Richardson and Ampt (1993a)

(e) If one does offer an open alternative category, as described above, it must be realised that this may attract a higher than acceptable response, since it offers respondents a chance to reply without really thinking about the question. In some circumstances it may be more desirable to force respondents to come down on one side of the fence or the other.

(f) In the search for exhaustiveness of categories, one must be careful to avoid creating confusion by offering too many categories. In opinion questions, a limit of seven response categories is often taken as a general rule-of-thumb (Miller, 1956).

(g) When there are numerous possible responses in a personal interview, it is preferable to show respondents a card on which the categories are printed rather than have the interviewer read out the possible responses (see Figure 5.21). Apart from the possibility of having the interviewer show preference for one response by the mere tone of their voice, it has been demonstrated that respondents show preferences for alternatives at the start and end of the list when they are read out (because of short-term memory effects). The use of show cards also allows the interviewer to obtain a variation in the ordering of the alternatives by a random selection of the card to be shown.



Per Week           Per Year             Group
No income          No income            1
$1 - $58           $1 - $3,000          2
$59 - $96          $3,001 - $5,000      3
$97 - $154         $5,001 - $8,000      4
$155 - $230        $8,001 - $12,000     5
$231 - $308        $12,001 - $16,000    6
$309 - $385        $16,001 - $20,000    7
$386 - $481        $20,001 - $25,000    8
$482 - $577        $25,001 - $30,000    9
$578 - $673        $30,001 - $35,000    A
$674 - $769        $35,001 - $40,000    B
$770 - $961        $40,001 - $50,000    C
$962 - $1,155      $50,001 - $60,000    D
$1,156 - $1,346    $60,001 - $70,000    E
$1,347 - $1,538    $70,001 - $80,000    F
Over $1,538        Over $80,000         G

Figure 5.21 A Show Card for a Closed Question
Source: Ampt and Waters (1993)

The use of cards with variable ordering can eliminate the possibility of a "donkey vote" biasing the results.

(h) If respondents are required to answer "Yes/No" questions or provide "Agree/Disagree" responses, it should be realised that it has been shown that certain individuals are prone to respond in a positive way regardless of the question. Low interest in the survey topic results in a larger tendency to respond positively.

(i) As a general rule, it is necessary to pre-test closed questions very carefully to ensure that the proposed categories are understood, relevant, and more-or-less comprehensive.
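For numerical categories, such as the income groups on the show card in Figure 5.21, the requirements that categories be exhaustive and mutually exclusive can be checked mechanically before the pre-test. The Python sketch below uses the annual income bands from that show card; the function names are hypothetical, and it assumes that incomes are reported in whole dollars.

```python
# Annual income bands from Figure 5.21 as (group, lower, upper) in whole dollars.
# An upper limit of None means the group is open-ended ("Over $80,000").
BANDS = [
    ("1", 0, 0), ("2", 1, 3000), ("3", 3001, 5000), ("4", 5001, 8000),
    ("5", 8001, 12000), ("6", 12001, 16000), ("7", 16001, 20000),
    ("8", 20001, 25000), ("9", 25001, 30000), ("A", 30001, 35000),
    ("B", 35001, 40000), ("C", 40001, 50000), ("D", 50001, 60000),
    ("E", 60001, 70000), ("F", 70001, 80000), ("G", 80001, None),
]

def check_bands(bands):
    """List gaps (not exhaustive) and overlaps (not mutually exclusive)."""
    problems = []
    for (g1, _, hi1), (g2, lo2, _) in zip(bands, bands[1:]):
        if hi1 is None:
            problems.append(f"group {g1} is open-ended but is not the last group")
        elif lo2 > hi1 + 1:
            problems.append(f"gap between groups {g1} and {g2}")
        elif lo2 <= hi1:
            problems.append(f"overlap between groups {g1} and {g2}")
    if bands[-1][2] is not None:
        problems.append("last group has an upper limit, so high values have no category")
    return problems

def group_for(income, bands):
    """Return the group code for a whole-dollar annual income."""
    for group, lower, upper in bands:
        if income >= lower and (upper is None or income <= upper):
            return group
    raise ValueError(f"no category for income {income}")

print(check_bands(BANDS))        # prints []
print(group_for(30000, BANDS))   # prints 9
```

A check of this kind addresses only the first two requirements; whether the categories are unambiguous to respondents can only be established by pre-testing.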


In personal interview surveys, a combination of open, closed and field-coded questions may be most effective in exploring a topic. For example, Gallup (1947) suggested a "quintamensional plan of question design" in which a series of five types of question are used to explore various components of an issue, as shown in Figure 5.22.

Q.1 Awareness of issue (Open or field-coded question)

Q.2 General feelings (Open)

Q.3 Answers on specific parts of issue (Open, field-coded or closed)

Q.4 Reasons for views (Open)

Q.5 Intensity of belief (Open or closed)

Figure 5.22 The Quintamensional Plan of Question Design
Source: Gallup (1947)

In general, the balance between open and closed questions will depend on a number of situational factors. These include the objectives of the survey, the prior knowledge held by the investigator, the extent to which the respondent has thought about the subject matter, and the ease with which the material in question can be communicated to the respondent.

5.7 QUESTION WORDING

It is obvious that a question must be understood by the respondent before a suitable answer can be given. The meaning of a question is communicated primarily by the words used in the question. While there are numerous studies on the problems of specific wording (e.g. Oppenheim, 1992) the following points are the essential features of question wording which should be borne in mind in the design of a questionnaire.

5.7.1 Use Simple Vocabulary

Use simple and direct vocabulary with no jargon or slang. The language used should be suited to the subject and the population in question (note that in some surveys a language other than English may be most appropriate). Many words and phrases which you think are simple may not be to the general population or, more importantly, to your particular survey population. For example, transport planners frequently refer to "modes" of travel to denote different forms of transportation such as cars, buses, trains, walking, etc. The general public, however, does not usually refer to these means of transportation as travel "modes". It will be much more easily understood if you say "What means of transport did you use to get there?" or, even better, "How did you get there?" "Destination" is another word with similar problems. Try variations on "Where did you go?"

Make sure that words which you use have the same meaning to your respondents. In some cases, it may be necessary to define the meaning of certain words (e.g. trip) which will be used in the questionnaire. If you are going to use a word or phrase once only, it is best to give a full description immediately. For example, if you are only going to use the word trip once, rather than to say "a trip - which means...", it is easier to say something like "every time you move along a public street...". Pilot testing with people from the population to be surveyed will assist with checking these words.

5.7.2 Use Words Appropriate to the Audience

While the language used should be simple, you should not give the appearance of "talking down" to your respondents. If this situation is perceived by respondents, you run the risk of alienating them for the entire survey. It should be realised, however, that in surveying a large general population, it is virtually impossible to abide by both these first two points on question wording if a single questionnaire is used for the entire population. To attempt to satisfy both points, it would be necessary either to use different question wording for different segments of the population (and assume that both wordings are measuring the same phenomena) or else provide for an interviewer who can adapt and interpret the question wording according to each respondent's circumstances (and assume that interviewer effects are minimal). The best solution is to use the day-to-day language you use when speaking to your friends (who are not colleagues) about travel and transport.

A particular problem is encountered when dealing with a multilingual population. Not all members of this population are equally fluent in the majority language, and for this reason it would be desirable to have different versions of the survey in the languages which are most likely to be encountered. However, this poses a potential problem in ensuring that each version of the survey is in fact measuring the same thing because of subtle differences which may creep into open-ended questions during the translation process. A recent survey in Singapore (Olzewski, et al., 1994) used the personal interview method described in Section 7.3 which meant that short diaries were left for respondents to complete before the personal interview. By having these diaries ("Travel Memos") available in Mandarin, Malay and Tamil as well as English, it was possible to conduct many interviews in English since the difficult task (writing) had been done in the native language.


5.7.3 Length of Questions

Questions which are long, complicated, or which involve multi-dimensional concepts are better split up into a number of simpler, shorter questions. On the other hand, with open questions it has been found that the length of the response given by the respondent is directly proportional to the length of the question asked by the interviewer. Short abrupt questions generally elicit short abrupt replies. The choice of question length will therefore depend on the objectives of an open question. For closed questions, where the length of the response is pre-specified, short simple questions are recommended.

5.7.4 Clarify the Context of Questions

The context and the scope of questions should be made clear by giving the respondent an adequate frame of reference in which to answer the question. For example, in a study of residential location, the question "Do you like living here?" would be relatively useless (unless asked purely as an introductory not-to-be-coded question). Unless the respondents are advised on which factors to consider in their answer (e.g. closeness to work, quality of neighbourhood) the range of possible answers would make the question extremely difficult to evaluate.

5.7.5 Avoid Ambiguous Questions

Obviously, every attempt should be made to avoid ambiguous questions. The trouble with this guideline is that often the survey designer does not realise that a question could be legitimately interpreted by a respondent in a completely different fashion to that which was intended. If ambiguities were obvious, they would be easy to eliminate. Generally, though, ambiguities are only detected by someone other than the question designer. This emphasises another important role for pre-tests and pilot surveys. To properly test for ambiguity it is particularly important that these tests are done with people of the type who will be surveyed. Asking colleagues is the least likely method of removing any ambiguities. Never carry out even the most simple survey without a pilot test for wording.

One source of ambiguity, which is related more to layout than wording of a question, concerns the placement of labels for tick-the-box closed questions. The labels should clearly refer to one and only one box. For this reason labels should not be placed between adjacent boxes on one horizontal line unless there is ample separation between label/box pairs. This confusion applies especially to categories in the middle of the list of categories, as shown in Figure 5.23.



How many vehicles were garaged at this household last night?

0 1 2 3 4 5 >5

Figure 5.23 Poor Placement of Tick-Box Labels

5.7.6 Avoid Double-Barrelled Questions

"Double-barrelled" questions should be avoided. Answers to questions like "Do you like buses and trains?" are difficult to interpret by both the respondent and the analyst. More subtle, but equally incorrect, is the question, "How do you feel about public transport?" To some extent, the problem is ameliorated if respondents are asked to respond in an open way. If, however, they are given a scale, the answers to these questions can be considered basically worthless since it cannot be determined which form of public transport is being commented on.

5.7.7 Avoid Vague Words About Frequency

It is desirable to avoid vague words like "usual", "regular", "often", etc., unless there is a specific intention for the respondent to interpret the word in their own way as, for example, in the question "How do you usually travel to work?" The reason is that there will be almost as many interpretations of such a question as the number of times it is asked. "Regular", for example, can mean "each day", "each week" or "each month" and would therefore give no indication of frequency, should that be the reason it was being asked.

Such a question could be used where it is impossible and/or unnecessary to obtain complete details of the journey to work or where it is expected that there is little variation in the method of travelling to work. Generally, however, less bias will be introduced into the response if the question is made more specific, e.g. "How did you travel to work today?"

5.7.8 Avoid Loaded Questions

In most situations, "loaded" questions should be avoided. The classic example of a loaded question is "Have you stopped beating your wife yet?" The essence of such loaded questions is that they presume that the question is relevant to the respondent with respect to their current activities. To eliminate such loaded questions, use can be made of filter questions to determine whether subsequent questions are relevant to the respondent.
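The routing implied by a filter question can be sketched in a few lines of Python. The question identifiers, the wording of Q1 and Q2, and the routing rule below are hypothetical; the point is only that the follow-up question is asked solely of those respondents for whom the filter has shown it to be relevant.

```python
# Hypothetical questions; Q1 acts as a filter for Q2.
QUESTIONS = {
    "Q1": "Did you travel by bus at any time last week?",
    "Q2": "How many bus trips did you make last week?",
    "Q3": "How many vehicles were garaged at this household last night?",
}

# Filter rules: question -> (filter question, answer required to be asked).
FILTERS = {"Q2": ("Q1", "yes")}

def questions_to_ask(answers):
    """Return the question identifiers relevant to a respondent, in order."""
    asked = []
    for qid in QUESTIONS:
        rule = FILTERS.get(qid)
        if rule is None or answers.get(rule[0]) == rule[1]:
            asked.append(qid)
    return asked

print(questions_to_ask({"Q1": "no"}))   # prints ['Q1', 'Q3']
print(questions_to_ask({"Q1": "yes"}))  # prints ['Q1', 'Q2', 'Q3']
```

On a printed questionnaire the same rule would appear as a skip instruction (for example, "If No, go to Q3").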


In some situations, however, the loaded question has been used successfully to obtain answers to potentially embarrassing questions. A famous situation can be found in the work of Kinsey et al. (1948, 1953) in their study of sexual behaviour. Rather than asking respondents whether they engaged in certain sexual practices, they went straight into questions about frequency and detail. In this way, respondents were made to feel that such practices were perfectly normal behaviour and were able and willing to answer questions in great detail. In transport surveys, however, questions are seldom likely to be as socially delicate as these, and, therefore, the use of filter questions is generally advised.

5.7.9 The Case of Leading Questions

While loaded questions severely limit the range of possible replies, leading questions, which merely encourage the respondent to reply in a particular manner, are an equal, though perhaps more subtle, threat to the validity of responses. Leading questions can be phrased in several different ways, as described in the following items, each of which should be recognised.

• The use of emotionally charged words can induce respondents to answer in a particular fashion, e.g. "Do you think that powerful left-wing trade unions should have more say in the control of public transport systems?" would probably bring a different response if the question were phrased as "Do you think employees should provide an input to the management of public transport systems?"

• The partial mention of possible answers will bias responses towards those answers, e.g. "Did you perform any activities on the way home from work last week (e.g. go shopping, play sport, etc.)?" will result in a higher recall of these activities and a lower reporting of other activities which have not been mentioned.

• The mention of a likely response framework can also lead to a leading question. For example, "Do you think congestion has increased in the last five years?" is more leading than "Do you think travel conditions have changed in the last five years?"

• An appeal to the status quo is another example of a leading question. If a question implies or states that one of the alternatives to be chosen from a list is a representation of the present state of affairs, then there will be a tendency for respondents, in the general population, to choose this alternative because of the widespread tendency among the community to accept things as they are now in the social order. This biasing effect could also be used in reverse if the population in question is, for example, a group of radical university students.



• The use of qualifying phrases at the beginning or end of questions is an obvious way to construct a leading question, e.g. "Don't you think that . . . ?" or ". . . isn't it?" Such qualifiers are definitely taboo.

• The degree of leading involved will also depend on the context within which a question appears in a questionnaire. If respondents are asked to agree with one statement in a list of statements, they will more likely agree with the statement if the other statements are supportive of, rather than antagonistic towards, the statement in question.

• The context of a question can also be affected by the sequencing of previous questions. Such sequencing can result in obtaining predictable replies to questions by forcing respondents into a corner because of their previous answers. This technique is a favourite ploy of salesmen, as illustrated by the following abbreviated dialogue between an encyclopedia salesman and the parents of young children.

SALESMAN: (after knocking on door of household) Good evening! I am conducting a "survey" concerned with the state of education in today's schools. Do you have any children and are you interested in their education?

PARENTS: Why, yes, we do have two kids, and we are also concerned about what's happening in the schools today. Come inside.

S: (After coming inside and getting settled) What do you think are the major problems with the education today's children are receiving?

P: (The parents are allowed to talk freely about the issue, giving them the feeling that they "own" this "interview". The salesman carefully notes any mention they make about reading, sources of information, reference material, etc.)

S: I note that you mentioned several times that you thought reading was important and that kids spend too much time in front of the television these days.

P: (The parents are given a further opportunity to commit themselves to the idea that reading is important and that kids should have an alternative to watching television all the time.)

S: And who do you think should be responsible for providing the kids with this alternative to watching television at home?

P: (The parents are able to say anything they wish, but almost invariably they will, at some stage, say that it is their responsibility as parents.)

S: As parents, then, you feel that it is largely your responsibility to provide adequate educational material for your children to read at home? What sources of educational reading do you think are most useful?

P: (Once again, the parents are able to say anything they wish, but again almost invariably they will, at some stage, say that encyclopedias are a valuable resource material.)

S: So, as responsible parents, then, you feel that encyclopedias are a valuable educational resource for your children to have available in their home?

At this stage, the salesman has committed the parents to such an extent by their previous answers that they can hardly turn around and say that they do not believe that they should consider buying the encyclopedias. By careful sequencing of questions, the salesman has turned their position from one of probable indifference, or even antagonism, towards the encyclopedias to one of support. While this example is somewhat extreme, the same process can be seen at work in more subtle ways in many examples of survey design.

The effect which question sequencing can have on the outcome of later questions is well demonstrated by the following extract from the popular television series "Yes, Prime Minister" (Lynn and Jay, 1989, pp. 106-107), in which Bernard is instructed by Sir Humphrey Appleby on the more subtle points of survey design:

"He was most interested in the party opinion poll, which I had seen as an insuperable obstacle to changing the Prime Minister's mind.

His solution was simple: have another opinion poll done, one that would show that the voters were against bringing back National Service.

I was somewhat naive in those days. I did not understand how the voters could be both for it and against it. Dear old Humphrey showed me how it's done.

The secret is that when the Man In The Street is approached by a nice attractive young lady with a clipboard he is asked a series of questions. Naturally the Man In The Street wants to make a good impression and doesn't want to make a fool of himself. So the market researcher asks questions designed to elicit consistent answers.

Humphrey demonstrated the system on me. 'Mr. Woolley, are you worried about the rise in crime among teenagers?'



'Yes,' I said.

'Do you think there is a lack of discipline and vigorous training in our Comprehensive Schools?'

'Yes.'

'Do you think young people welcome some structure and leadership in their lives?'

'Yes.'

'Do they respond to a challenge?'

'Yes.'

'Might you be in favour of reintroducing National Service?'

'Yes.'

Well, naturally I said yes. One could hardly have said anything else without looking inconsistent. Then what happens is that the Opinion Poll published only the last question and answer.

Of course, the reputable polls don't conduct themselves like that. But there weren't too many of those. Humphrey suggested that we commission a new survey, not for the Party but for the Ministry of Defence. We did so. He invented the questions there and then:

'Mr. Woolley, are you worried about the danger of war?'

'Yes,' I said, quite honestly.

'Are you unhappy about the growth of armaments?'

'Yes.'

'Do you think there's a danger in giving young people guns and teaching them how to kill?'

'Yes.'

'Do you think it is wrong to force people to take up arms against their will?'

'Yes.'

'Would you oppose the reintroduction of National Service?'

I'd said 'Yes' before I'd even realised it, d'you see?

Humphrey was crowing with delight. "You see, Bernard," he said to me, "you're the perfect Balanced Sample."

5.7.10 Avoid Double Negatives

In the interests of clarity, double-negatives should normally be avoided even though, to the survey designer, the idea to be tested might most accurately be described in terms of a double-negative. An example of a double negative: "Do you not think that the reduction of trams is undesirable?"

5.7.11 Stressful Questions

Questions should generally be non-stressful and non-threatening. Respondents should not be forced into admitting anti-social or non-prestigious behaviour in order to answer a question truthfully.

5.7.12 Avoid Grossly Hypothetical Questions

Grossly hypothetical questions should be avoided, or, at least, the answers should be treated with some caution unless the answers are cross-referenced to other questions or information, or unless the respondent is faced with a clear and realistic trade-off situation when making a response. Thus the question "Would you like a more frequent bus service?" is useless because most people would obviously answer in the affirmative. However, the question "Would you prefer the frequency of buses to be doubled if the fare also increased by 50 percent?" could provide useful information on the trade-off between frequency and fares.

This issue is of particular importance as the use of stated preference surveys (Section 5.5.4) gains popularity in transportation planning (Hensher and Louviere, 1979; Bonsall, 1985; Kroes and Sheldon, 1988; Polak, 1994).

5.7.13 Allow for the Effect of Response Styles

The effect of response styles should be accounted for in the wording and layout of closed questions, especially opinion questions. A response style is a tendency to choose a certain response category regardless of the question content. Several response styles are of particular importance. First, there is the acquiescence response style where some respondents, in an agree/disagree question format, consistently choose "agree" even when the content of the question is reversed. To overcome this, one could specify an equal number of positively and negatively worded statements such that the acquiescence effect is counterbalanced. Alternatively, one could rephrase the question such that a specific response, rather than an agree/disagree response, is required.



A second style is governed by social desirability where a respondent always gives answers which are most favourable to self-esteem irrespective of the respondent's true opinions.

A third response style consists of order or position biases when answering multiple choice questions or using rating scales. Thus, for example, Payne (1951) has noted that respondents will be more inclined towards the middle in a list of numbers, towards the extremes in a list of ideas, and towards the second alternative in a list of two ideas. Similarly, some individuals will consistently mark on the left, right, or centre of horizontal rating scales. Order and position biases can be controlled by having a number of alternative forms on which the positions and orders are reversed or randomly distributed. The expense incurred in printing multiple versions of a questionnaire, however, can only be justified if the biases involved are severe, although the increasing use of computer-based surveys makes this option much more feasible.
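In a computer-based survey, the alternative forms described above can be generated automatically. The following is a minimal sketch in Python, not a prescribed procedure: the function names `option_order` and `reversed_forms`, the `seed` parameter and the example options are all hypothetical. It assumes that each respondent has a unique identifier, and it seeds the random number generator from that identifier so that the order shown to each respondent can be reproduced at the analysis stage.

```python
import random


def option_order(options, respondent_id, seed=0):
    """Return a reproducible, randomly ordered copy of the response options.

    The same respondent_id and seed always give the same order, so the
    analyst can later map each position back to the option shown there.
    (respondent_id and seed are hypothetical names used for illustration.)
    """
    rng = random.Random(f"{seed}-{respondent_id}")
    shuffled = list(options)  # copy, so the master list is left untouched
    rng.shuffle(shuffled)
    return shuffled


def reversed_forms(options):
    """Return two alternative forms: the original order and its reverse."""
    return [list(options), list(reversed(options))]


if __name__ == "__main__":
    modes = ["Car", "Bus", "Train", "Tram"]  # illustrative options only
    for respondent in (101, 102, 103):
        print(respondent, option_order(modes, respondent))
```

The reversed pair is the cheaper option for printed forms, while per-respondent randomisation is only practical when the form is generated by computer.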

5.7.14 Care with Periodicity Questions

When asking questions concerning the frequency of periodical behaviour (such as moving house, or performing holiday travel) there are a number of alternative question formats. One could ask, for example, "How many holiday trips do you usually make per year?", or "How many times did you go to work last week?" The definition of last week, for example, can be any of the following:

• Sunday to Saturday
• Monday to Sunday
• Monday to Friday
• Today - back seven days
• Yesterday - back seven days

Clearly these self-defined weeks can lead to significantly different response time-frames and hence results. Possibly the best solution is to say "How many times in the last seven days...?" In the 1981 Sydney Travel Survey, this was supplemented with a question which ran, "Did you work yesterday (say, Tuesday)? Did you work Monday? Did you work Sunday?" etc., i.e. working backwards through each of the preceding seven days.

Naturally, the level of effort used in this type of question relates to the survey objectives, but it is possible to gather very precise data if the questionnaire is designed carefully.
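The extent to which the five definitions of "last week" diverge can be seen by computing the dates that each one covers. The sketch below is illustrative only. It assumes that the three calendar definitions refer to the most recent complete week before the interview day, and that "back seven days" means a window of seven days counted inclusively; respondents may, of course, interpret the phrases differently again. The function name `last_week_windows` is hypothetical.

```python
from datetime import date, timedelta


def last_week_windows(today):
    """Return inclusive (start, end) dates for five readings of 'last week'."""
    # Monday of the current week (date.weekday() gives Monday = 0 ... Sunday = 6)
    monday_this_week = today - timedelta(days=today.weekday())
    last_monday = monday_this_week - timedelta(days=7)

    # Most recent Sunday on or before today, then the Sunday one week earlier
    days_since_sunday = (today.weekday() + 1) % 7
    last_sunday = today - timedelta(days=days_since_sunday + 7)

    return {
        "Sunday to Saturday": (last_sunday, last_sunday + timedelta(days=6)),
        "Monday to Sunday": (last_monday, last_monday + timedelta(days=6)),
        "Monday to Friday": (last_monday, last_monday + timedelta(days=4)),
        "Today - back seven days": (today - timedelta(days=6), today),
        "Yesterday - back seven days": (
            today - timedelta(days=7),
            today - timedelta(days=1),
        ),
    }


if __name__ == "__main__":
    interview_day = date(2024, 5, 15)  # a Wednesday, chosen arbitrarily
    for label, (start, end) in last_week_windows(interview_day).items():
        print(f"{label:28s} {start} to {end}")
```

For an interview on Wednesday 15 May 2024, the five windows begin on 5, 6, 6, 9 and 8 May respectively, so no two definitions cover the same set of days.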

5.7.15 Use of an Activity Framework

As discussed in Section 5.3.5, the recording of trips will be done most accurately by respondents if the trip is placed within an activity framework. This means that respondents are asked "what they did" rather than "where they travelled" to encourage them to put travel in context. Further, if a diary is used to assist in the recording of trips as they are made, the results can be improved even further.

5.7.16 Flow of the Question

An important test of questionnaire wording is to check whether the question reads well. Several features should be considered, such as ensuring that the key idea in the question appears last. To avoid the respondent prejudging the intent of the question, all qualifiers, conditions and other less important material should be placed at the start of the question. Punctuation, in the form of commas, semi-colons, etc., should be kept to a minimum. The objective is to get a question which reads well, not one which is necessarily strictly correct grammatically. For example, "Who did you travel with?" should actually be "With whom did you travel?", but the former is likely to give the impression of a much more respondent-friendly questionnaire than the latter.

Key words in the question should be identified by means of underlining or the use of a different type-face. Finally, abbreviations should not be used no matter how much space they save or how well you think the respondent might know the abbreviation.

5.7.17 Always Use a Pilot Test

The acid test of questionnaire wording is the conduct of a pre-test or pilot survey. As we have mentioned before, this test should not be confined to your work associates, who probably think along the same lines as you do anyway, but should include people from the same population that are to be surveyed in the main survey.

5.8 QUESTION ORDERING

Given a set of well-worded questions, consideration should be given to the ordering of questions to ensure a smooth, successfully completed interview. The opening questions may be of two forms. First, any screening questions should be asked so that wasted time is kept to a minimum. If respondents are not eligible for membership of the population it is best to find this out quickly to avoid wasting both the respondent's and the interviewer's time. Having asked these questions, or if screening questions are not needed, the opening questions should be used to put the respondent at ease, to establish rapport and to get the respondent interested in the survey. Open questions are often very useful in this context to enable the respondent to quickly air their views on the subject. Care should be taken, however, to ensure that this open-ended discussion does not carry on for too long. Otherwise, the respondent may tire before reaching the main part of the interview and may become annoyed at having to repeat some of the points in closed question form later in the interview.

The body of the questionnaire should be arranged in a logical sequence moving from point to point. Where there is an unavoidable break in the train of thought, the interviewer should warn the respondent and briefly explain what the next set of questions will be about, as shown in Figure 5.24.

Figure 5.24 An Introduction to a Set of Questions
Source: Ampt (1992b)

The end of the interview is a good place for two types of questions. First, any personal questions which you believe may occasionally meet with refusal if asked earlier in the questionnaire may be asked at the end. The reason for this is twofold. First, the respondent (and the interviewer in a personal interview) will have had the chance to gain rapport by this time and hence personal questions will have a higher chance of being answered. Second, even if the respondent refuses to answer these questions at this stage and halts the interview, at least the rest of the questions will have already been answered and hence not much is lost by concluding the interview at this stage.

A second type of question at the end of an interview is an open-ended question in which the respondent's general comments are sought. These comments may be irrelevant to the survey topic but at least they will make the respondent feel better. Alternatively, the comments may be very useful in highlighting points which were not thought of in the design of the questionnaire. Either way, it provides a smooth, pleasant way of finishing an interview and taking your leave.

Within the body of the questionnaire there are a number of points worth noting. It is important to be aware of the conditioning effect of previous questions and the way in which answers to earlier questions can lead the respondent to answer later questions in a particular manner. In exploring a particular issue, one can use a funnel sequence of questions which starts out very broad and successively narrows the topic down to very specific questions. Alternatively, one can use an inverted funnel in which initially narrow and specific questions lead on to more and more general questions. The use of either type of sequence will depend on whether it is difficult to elicit answers on either the general or specific topic.

While it has been stated several times that filter or skip questions or sequence guides should be used to save time, effort and frustration, one must be careful how and where such filter questions are used. For example, if filter questions are placed before each set of questions throughout the interview and if a "NO" response to a filter question means that a set of questions is skipped, it is not beyond the wit of a reluctant respondent to figure out that a "NO" response to all these questions will result in a very short interview!

A better way to use filter questions is to mix the required responses between "YES" and "NO" so that the respondent is not sure how to skip a set of questions or, alternatively, to ask all filter questions at the start of the interview so that the respondent does not realise that the answers are being used to determine whether or not to ask certain sets of questions later in the interview.
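Both suggestions can be combined in a computer-assisted questionnaire. The sketch below is one possible arrangement, given here only as an illustration: the filter questions, the block names and the function `blocks_to_ask` are all hypothetical. All filters are asked first, and the answer which triggers a follow-up block is "YES" for some filters and "NO" for others.

```python
# Each entry: (filter question id, answer that triggers the block, block name).
# All identifiers are hypothetical and serve only to illustrate the routing.
FILTERS = [
    ("owns_car", "YES", "car_use_questions"),
    ("works_at_home", "NO", "commute_questions"),
    ("used_bus_last_week", "YES", "bus_questions"),
]


def blocks_to_ask(filter_answers):
    """Return the follow-up blocks triggered by the up-front filter answers.

    filter_answers maps a filter question id to "YES" or "NO"; a filter
    that was not answered triggers nothing.
    """
    return [
        block
        for question_id, trigger, block in FILTERS
        if filter_answers.get(question_id) == trigger
    ]


if __name__ == "__main__":
    always_no = {question_id: "NO" for question_id, _, _ in FILTERS}
    print(blocks_to_ask(always_no))  # a blanket "NO" still leads to one block
```

Because the triggering answers are mixed, a respondent who answers "NO" throughout is still routed to the commuting questions.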

A final point on question ordering is to realise that fatigue is likely to set in towards the end of a long interview or questionnaire and that this may tend to bias the answers to later questions. It is therefore desirable to keep the more important questions away from the end of the interview, if possible. Better still, keep the questionnaire short!

5.9 QUESTION INSTRUCTIONS

The questionnaire or interview form is only one part of the documentation needed to conduct the survey. Also needed is a set of instructions explaining to respondents how to fill in the questionnaire, or a set of instructions explaining to interviewers how to conduct the interview.

5.9.1 Self-Completion Surveys

With self-completion surveys, instructions may be of two types: general instructions which apply to the entire questionnaire and specific instructions which apply to particular questions. General instructions should be placed at the beginning of the questionnaire while specific instructions should be placed directly in front of the question to which they refer. Often, for small questionnaires, it is possible to include an example questionnaire which has already been completed to provide some guidance on the methods to be used in completing the questionnaire, as shown in Figure 5.25.



Care should, however, be taken to ensure that the responses on the example questionnaire do not lead the respondent to give specific answers. Also, if an example questionnaire is provided it should be referred to in the introductory instructions for the questionnaire.

In general, instructions for self-completion questionnaires should be kept to a minimum. In some respects, good design and wording of the questionnaire should largely eliminate the need for extensive instructions. Besides, many respondents do not read long instructions but merely prefer to answer the questions as they see fit. Therefore, instructions should only be used to define key terms, to give directions for skipping questions and to explain questions which, after extensive pilot-testing, still appear to need instructions for unambiguous completion.

Another set of question instructions in a self-completion survey should be the Administrator's Manual - a manual very similar to the Interviewers' Manual described in the next section. It contains information on the survey objectives, the reasons for all questions, data definitions, and so on, to ensure that when the administrative personnel are phoned by any respondents, they are as knowledgeable as the researchers on all these aspects.


Figure 5.25 The Use of a Completed Sample Questionnaire
Source: Richardson and Ampt (1993a)

5.9.2 Personal Interview Surveys

With personal interview surveys, the need for instructions is much more broadly based. Not only are instructions needed to assist the interviewer in asking the questions, but the instructions should help in all facets of the conduct of the interview. Often these instructions are formulated in the form of an Interviewer Instructions Manual. One widely-quoted manual is that produced by the Survey Research Center at the University of Michigan (University of Michigan, 1976). An excellent summary is available in Warwick and Lininger (1975).



In addition to an Interviewers' Manual which interviewers can refer to at any time, a training session is important. At this training session it is important that the survey designer instil in the interviewers not only the answers to a set of rules, but an understanding of the entire survey process. In particular, interviewers need to understand why the survey is taking place, how the sample was chosen, why and for whom the work is being done, and so on. Only with this level of background can interviewers feel confident and hence achieve the levels of response which are necessary for gaining robust data. This implies that the traditional one- to two-hour "briefing" is rarely enough to adequately provide interviewers with the proper level of information about the survey. It is usually better to think in terms of "interviewer training" - even for experienced interviewers it is likely that the topic and application will be new - and plan to set aside a whole day for the exercise. It is certain to pay off in terms of the quality of the data obtained - and usually in the loyalty of the interviewers who find this type of treatment a sign of respect for their abilities. For new interviewers using the personal interview survey described in Section 7.3, the training sessions need to be about 3 days in length, including practice sessions and actual field experience in the evenings.

In this training session and in the Interviewers' Manual, guidance should be provided to the interviewer on the following components of the interview task:

(a) Understanding the survey objectives. The interviewers should know and understand the survey objectives, since this helps clarify the need for many of the features of the survey. If, for example, the objective of the survey is to measure people's exposure to the risk of accident, it becomes relatively easy to understand why one of the questions for each walk trip is, "How many streets did you cross?"

(b) Explaining the survey - it is essential that respondents know a bit about the background of the survey. The manual should provide a graduated series of explanations for the interviewer to use to satisfy varying degrees of curiosity. Knowledge of the survey helps give the interviewers a professional demeanour which is essential to gaining high response rates.

(c) Finding the sample - often the sample may not be easy to locate, e.g. finding households in high-density inner urban areas or, conversely, in low density semi-rural or rural settings. The interviewer should also be instructed on how many call-backs to make if the respondent cannot be contacted immediately.

(d) The method by which the sample was selected. This provides the interviewer with an understanding of what to do if there are problems in finding the sample and in explaining the way the sample was chosen to the respondents.

(e) Gaining entry to household - often the most difficult task is perceived to be gaining initial acceptance by the respondent. The Interviewers' Manual should provide guidance on factors such as the effects of the appearance of the interviewer, the timing of the initial contact, the opening introductions and the methods of conveying a sense of respectability.

(f) Handling problem situations - often the interviewer will have to deal with comments like "I'm too busy", "I'm not interested", "Who's behind this survey?", "Do I have to fill it out?", "What's the use of the survey?", or "Sorry, he's not at home". The manual should provide methods of dealing with these, and other, potentially embarrassing situations. The manual should also describe how to deal with outright refusals.

(g) Dealing with third parties - often when asking opinion questions it is desirable for the interviewer to be alone with the respondent. In family situations, however, this may not always be possible. The manual should describe how to "get rid of" unwanted third parties.

(h) Asking the questions - instructions to interviewers for asking the questions may be included generally in the instructions manual and specifically on the questionnaire form. In the instructions manual the following guidelines are generally appropriate:

    (i) Use the questionnaire carefully, making sure to ask the questions verbatim. Asking questions in a different way is the same as asking different questions;
    (ii) Know the specific purpose of each question;
    (iii) Know the extent of probing allowed on each question;
    (iv) Follow the order of questions indicated on the questionnaire;
    (v) Ask every (appropriate) question;
    (vi) Do not suggest answers;
    (vii) Do not leave any questions blank.

(i) Probing techniques - in an effort to obtain an adequate response to each question, the manual should describe some of the types of probes which are acceptable, and give examples of those which are not acceptable. The type of probe allowed for each question (if any) should be indicated either on the questionnaire or in the manual. Some types of acceptable probes are:



    (i) The "pregnant pause" which encourages respondents to feel they should fill in the silence with an answer (not to be confused with the "embarrassed silence");
    (ii) Overt encouragement;
    (iii) Elaboration;
    (iv) Clarification;
    (v) Repetition.

All probes are acceptable only if they are neutral and do not lead the respondent to answer in a specified way.

(j) Recording the responses - particular attention needs to be paid to the methods of recording responses to open and field-coded questions. The following instructions provide some useful guidelines:

    (i) Record all answers immediately;
    (ii) Abbreviate words and sentences according to an agreed system;
    (iii) Include all probes used;
    (iv) Think about the writing device to be used. It is easier for interviewers (and data processors) to record responses in a colour different from the one in which the questionnaire is printed. If it is an outdoor survey, consider the weather (see Section 5.4).

(k) Concluding the interview - convenient and non-hasty ways of concluding the interview should be suggested in the instructions.



6. Pilot Surveys

This chapter is relatively short but very important! At this stage in the survey process, we have a questionnaire and a sample of respondents to whom we wish to administer the questionnaire. It would therefore seem appropriate to go ahead and do just that. However, as shown in the survey process diagram (Figure 1.1), it is wise, if not essential, to first perform a pilot survey before embarking on the main survey.

6.1 WHY A PILOT SURVEY?

We can hear you saying that you know of many survey designers who have "jumped in at the deep end" without using a pilot survey and survived. They may have survived, but the credibility of their data has undoubtedly been questioned. And it is unlikely that they have had reasonable answers to justify the queries. If a survey question was asked badly, or if a sample was chosen incorrectly, there often is no real answer. The survey was simply a waste of time.

Check with anyone who has designed a survey - ask them if they can be sure the data they collected was reliable (ask yourselves if you have designed a survey!). We can almost guarantee that those people who did not do a pilot survey will be able to mention many weaknesses. Indeed, those of us who carry out pilot surveys as a matter of course still find small problems as we go into the main survey, so it is not conceivable that pilot-free surveys do not have problems.

Although pilot testing is one of the most important components of the survey procedure, it is also one of the most neglected. The usual reasons for not doing pilot surveys are said to be either lack of time or money (or both). However, not doing a pilot survey, or at least a series of pre-tests, almost always turns out to be a false economy. If the survey designer has been correct in all the assumptions they have made in the design of the sample and the questionnaire, then the pilot survey will not pick up any problems and, in many cases, the data obtained can be combined with the rest of the data obtained from the main survey. In such a case the pilot survey will have effectively cost nothing. If, on the other hand, the survey designer has been less than perfect and the pilot survey does detect some problems, then at least these problems can be rectified before the main survey is carried out. In this case, the pilot survey will have saved the cost of the errors occurring in the main survey. For the above reasons, a pilot survey is a useful fail-safe precaution to take before the conduct of the main survey.

Even survey techniques and questionnaires which have been used successfully in similar circumstances in other surveys by the same survey designer, and which have therefore effectively been subjected to extensive pilot testing, need to be tested again if they are carried out on anyone other than the original population. An example of this was a large-scale self-completion travel survey carried out by the Transport Research Centre in Brisbane, Australia (Richardson and Ampt 1993a). The survey was conducted in 1992 and in 1993 the same questionnaire was to be used in Melbourne. Although it was known that the changes were very minor (different fare structures for public transport, no toll system in Melbourne etc.) a pilot survey was carried out. Some significant differences were, in fact, found, which needed moderately major modifications to the questionnaire and survey design. For example, the large ethnic population needed addressing, there were many fewer holiday homes in Melbourne (which affected the sample) and a much larger number of people describing themselves as "other pensioners" needed to be dealt with in the questionnaire design (Ampt 1993, 1994; Richardson and Ampt, 1993b).

6.1.1 A Test of ALL Aspects of Survey Design

Pilot surveys are most often associated with testing the adequacy of the questionnaire instrument itself. The association of pilot surveys with questionnaire design is, however, slightly misleading. As described comprehensively by Ampt and West (1985) and Oppenheim (1992), pilot surveys are designed to test all aspects of the survey process, not just the adequacy of questionnaire design. Informal trial-and-error testing of various components of the survey process is performed by pre-tests, whereas pilot surveys are a systematic dress rehearsal of the main survey.

In many cases, various options may be tested within a pilot survey, within the context of a controlled experimental design, to establish which option will prove to be most effective in the main survey. An excellent example of this technique in a transportation setting is provided by Sheskin and Stopher (1982b). The survey being tested was an intercept survey of people travelling on-board buses and was testing the new design of asking respondents to fill out some questions while travelling and to take another part of the questionnaire home to fill in during their own time. Two versions of the on-board form and three versions of the take-home form were devised. The versions were distributed in a systematic mix to consecutive bus travellers as they boarded, to ensure that, as far as possible, the full range of survey instruments was distributed at each stop.
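A systematic mix of this kind can be sketched as a simple rotation through every combination of form versions. The source does not state the exact rotation that was used, so the following Python fragment is only one plausible arrangement; the version labels ("A", "B" and "1", "2", "3") and the function name `systematic_mix` are hypothetical.

```python
from itertools import cycle, product


def systematic_mix(n_passengers, onboard=("A", "B"), takehome=("1", "2", "3")):
    """Assign (on-board version, take-home version) pairs to consecutive passengers.

    The pairs are handed out in a fixed rotation, so every run of
    len(onboard) * len(takehome) consecutive boarders receives each
    combination exactly once.
    """
    rotation = cycle(product(onboard, takehome))
    return [next(rotation) for _ in range(n_passengers)]


if __name__ == "__main__":
    for position, forms in enumerate(systematic_mix(8), start=1):
        print(f"boarder {position}: on-board {forms[0]}, take-home {forms[1]}")
```

With two on-board and three take-home versions there are six combinations, so any stop at which six or more people board receives the full range of instruments.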

There were many interesting findings, one of which was that the longer version of the on-board form actually gained a better response than its shorter counterpart. Apparently the presence of some perceptual questions sparked respondents' interest in the longer form.

Not surprisingly, some rewording was suggested after careful scrutiny of the returns in the pilot. In addition, it was found that it was possible to remove one entire section of the survey questions (about a specific mode), since sufficient responses were actually being obtained elsewhere in the questionnaire. Thus, some very positive and, in the long run, cost-saving measures were learned from the extensive pilot study of the on-board and take-home forms. More importantly, a small in-house pre-test on secretarial staff had failed to uncover the full extent of the problem revealed in the pilot study. As the authors summarise, "Had a decision been made on the basis of the in-house pre-test to use the form, the expensive main survey might have failed to generate data of sufficient quality to support the modelling effort".

6.1.2 A Need for Several Tests

In large surveys it may well be advisable to conduct more than one pilot survey. For example, in connection with the 1981 Sydney Travel Survey, Ampt and West (1985) describe the conduct of four different pilot surveys as described in Table 6.1.

The four pilot surveys were spread over a considerable period of time (nearly 2 years). The information gathered very early in the first exploratory skirmish was collected at a point in time when the final form of the survey was far from clear, but nonetheless it had a considerable impact on shaping the final survey. For example, it was initially hoped that some of the more advanced methods of travel demand modelling, exemplified, for example, by the work of Jones et al. (1983) and Brög and Erl (1980), could have been applied at a large metropolitan area scale. It soon became clear, however, that such an ambition was not feasible at that scale, and the direction of the data collection and modelling efforts was adjusted accordingly very early in the process before too much time, money and intellectual effort had been invested in further development of the modelling processes.

Table 6.1 Pilot Studies for the 1981 Sydney Travel Survey
Source: Ampt and West (1985)

PILOT STUDY               TIME          TESTING OBJECTIVES

1. Exploratory Skirmish   July 1979     To observe travel determinants and to test the
                                        feasibility of collecting behavioural data on a
                                        large scale.

2. Skirmish               Aug. 1980     To test the effectiveness of questionnaire
                                        wording and design.

3. Pilot                  Dec. 1980     Continuing test of design. To gain preliminary
                                        data and response rates and throughput data
                                        with varying pre-contact methods.

4. Dress Rehearsal        April 1981    Final test of all aspects including coding and
                                        all operational details.

6.2 USES OF THE PILOT SURVEY

Pilot surveys (including various types of pre-test) are used to give guidance on all of the following aspects of survey design:

6.2.1 Adequacy of the Sampling Frame

Is the chosen sampling frame complete, accurate, up-to-date and easy to use? The basic question to be answered is how well does the sampling frame correspond to the desired survey population. If the frame gives very poor correspondence or is grossly deficient in any other way, it is as well to know this before the main survey begins. If the frame is only slightly deficient it may be possible to remedy this and still use it in the main survey. For example, the pilot survey might suggest some screening questions which might be added to the start of the interview to refine the final target population.

6.2.2 Variability of Parameters Within the Survey Population

As described in Chapter 4.6, it is necessary to have an estimate of the population variability of the parameter to be estimated before the required sample size can be calculated. Pilot surveys can be useful in supplying estimates of variability for a number of parameters of interest. It is of dubious value, however, to use pilot studies as the sole source of information on parameter variability. Pilot studies are generally restricted to a relatively small sample and, hence, the reliability of an estimate of the variance (which is itself subject to a standard error) will not be very great. To obtain reliable estimates of parameter variance would require a large sample size, which is generally not feasible in a pilot survey. The pilot survey should therefore be relied on to provide supporting evidence only, except where no other sources of prior information are available.
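The limited reliability of a variance estimated from a small pilot sample can be quantified under simplifying assumptions. If the variable is normally distributed and the pilot is a simple random sample of size n, the standard error of the sample variance, expressed as a proportion of the true variance, is the square root of 2/(n - 1). The sketch below applies that result; the function name `relative_se_of_variance` is hypothetical, and real travel data (which are often skewed) would usually give an even less reliable estimate than the normal-theory figure suggests.

```python
import math


def relative_se_of_variance(n):
    """Relative standard error of the sample variance for a sample of size n.

    Assumes a normally distributed variable and simple random sampling,
    for which SE(s^2) / sigma^2 = sqrt(2 / (n - 1)).
    """
    if n < 2:
        raise ValueError("a variance estimate needs at least two observations")
    return math.sqrt(2.0 / (n - 1))


if __name__ == "__main__":
    for pilot_size in (21, 51, 201, 801):
        share = relative_se_of_variance(pilot_size)
        print(f"n = {pilot_size:4d}: relative standard error = {share:.1%}")
```

Under these assumptions a pilot of 51 respondents gives a variance estimate with a relative standard error of 20 percent, and about 800 respondents are needed to bring this down to 5 percent.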

6.2.3 Non-Response Rate

The probable number of refusals or non-contacts (households where no contact can be made after repeated visits in a personal interview survey) and instances of sample loss (e.g. vacant dwellings, non-existent phone numbers, etc.) can be estimated from a pilot survey, provided that the same methods are used in the pilot survey as will be used in the main survey. Alternatively, a number of different methods of reducing non-response may be tried in a pilot survey and the results then compared to determine the most effective method.

Causes of non-response should be noted, to assist in the possible redesign of the survey procedure. The magnitude of non-response can also assist in estimating the total number of sampling units to be chosen from the population (to obtain a specified number of completed questionnaires).
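The arithmetic behind this last point can be sketched as follows. The calculation assumes that the sample loss rate is the proportion of selected units which turn out not to exist or to be ineligible, and that the response rate is the proportion of the remaining units which return a completed questionnaire; both would be estimated from the pilot survey. The function name `units_to_select` and its parameters are hypothetical.

```python
import math


def units_to_select(completed_target, response_rate, sample_loss_rate=0.0):
    """Number of sampling units to draw to expect completed_target responses.

    response_rate    : completed questionnaires / valid (non-lost) units
    sample_loss_rate : lost units (vacant dwellings etc.) / units selected
    """
    if not 0.0 < response_rate <= 1.0:
        raise ValueError("response_rate must be in (0, 1]")
    if not 0.0 <= sample_loss_rate < 1.0:
        raise ValueError("sample_loss_rate must be in [0, 1)")
    valid_share = 1.0 - sample_loss_rate
    # Round up: a fractional sampling unit cannot be selected
    return math.ceil(completed_target / (response_rate * valid_share))


if __name__ == "__main__":
    print(units_to_select(1000, response_rate=0.6, sample_loss_rate=0.1))
```

For example, if 1000 completed questionnaires are required and the pilot survey suggests a 60 percent response rate and 10 percent sample loss, 1852 sampling units would need to be selected.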

If possible, some indication of the characteristics of non-respondents should be obtained to determine whether non-response bias is likely to be a major problem in the main survey. This can be done by following up non-respondents to check whether their travel behaviour is significantly different from that of respondents. A methodology for doing this is described in Chapter 7.

6.2.4 Method of Data Collection

The ease or difficulty of collecting the required data may indicate whether an appropriate data collection method is being employed. For example, very low response rates and a general lack of understanding of a self-completion questionnaire may indicate that the wrong survey method is being used. Perhaps a personal interview survey is necessary to increase response rates and to provide for the assistance of an interviewer in asking questions and recording responses.

An interesting example of testing various survey methods to collect the same data (Ampt, 1992a) used 6 different data collection methods: personal interview, and self-completion methods which tested 1 and 2 day variations of "linked" and "un-linked" (stage) data collection methods (see Chapter 2.5 for a definition).


6.2.5 "Skirmishing" of Question Wording

An important reason for carrying out a pilot survey is to test the wording of the questions in the survey. One way of doing this is to get respondents to complete the questionnaire (if it is self-completion) or to answer the questions in a personal interview, and then go over each of the questions with an interviewer and, using a completely open question, check the respondent's understanding of the question. This process, which is often time-consuming in the short term, is an invaluable method of ensuring that respondents not only understand the questionnaire in the same way as the survey designer, but also that everyone is understanding the question in the same way. As we have mentioned several times, pilot surveys - but tests of wording in particular - need to be carried out with the type of people who will complete the main survey.

A similar procedure which has been used with success in self-completion surveys is a type of "participant observation" survey. This involves giving the respondents the questionnaires in the envelope (in the way they would receive them in the mail) and observing all behaviour from emptying the contents of the envelope to filling in the forms. This has proved invaluable in understanding many of the problems which cannot otherwise be understood in trying to develop these survey forms. Small hesitations, and observing common "wrong moves" with the pen, all give clues about the user-friendliness of the design.

6.2.6 Layout of the Questionnaire

Apart from the question wording of the questionnaire, layout is also of critical importance in both self-completion and personal interview surveys. For example, is the location of the tick-boxes relative to the respective answers causing a problem (see Section 5.7.5)? To discover this, it will often be necessary to actually ask respondents or interviewers what they meant, since the analyst will only see a ticked box and the respondent/interviewer will have had no trouble ticking a box according to their own perceptions.

Colour and contrast can be used to good effect on the questionnaire form, but when things are written too small, together with the use of the "wrong" colours, they tend to be overlooked by the person completing the form, and the form may need re-working after the pilot survey.

6.2.7 Adequacy of the Questionnaire in General

This is the reason most often voiced for the need for pilot surveys. Although the questions will probably already have been tested on colleagues and friends, a pilot test using the same kind of respondents (and interviewers) as will be encountered in the main survey is the only true test. The aspects of questionnaire design which need to be specifically checked in pilot surveys include:

(i) The ease of handling the questionnaire in the field - are instructions clear and are answers easy to record? Are the survey forms of a manageable size and shape?

(ii) Are definitions clear? Are there any consistent misinterpretations on the part of either the respondents or the interviewers?

(iii) Are questions clear and unambiguous? Are there signs that respondents, or interviewers, have misunderstood the intent of the question?

(iv) Are the letters of introduction signed by the right people - are the signatories too prestigious, or not prestigious enough? Is the phone query line being used?

(v) Are special techniques such as attitude rating scales producing valid answers? Too much bunching of answers may indicate a leading question or badly chosen categories. Too many "Don't know" responses might indicate a vague question, one which is misunderstood by respondents or one which is simply pointless. Too many refusals to a question may indicate that it should be asked more delicately, or that the order of questions should be changed or that the question should be omitted.

(vi) Is the questionnaire too long? Too many unanswered questions or hurried answers towards the end of the questionnaire indicate that perhaps the questionnaire is too long for the amount of interest shown in the subject matter. Perhaps the questionnaire should be shortened or else attempts should be made to raise the level of the respondents' interest in the survey.

(vii) Can some open questions be converted into closed questions in the main survey? The responses obtained in the pilot survey can be used to define the closed-question categories.

(viii) Is the sequencing of questions clear? Are respondents being forced to answer a series of questions which do not pertain to them? Are more branching and skip questions needed? Is it clear what question should be answered next after a branching question?

(ix) Are the coding categories clear and adequate for closed and field-coded questions? Should some categories be removed, combined or added to the list? Is there a problem with the lack of an open alternative with any of the closed questions?

One of the most important ways to test these aspects in a personal interview questionnaire is for the survey designer (usually you!) to actually carry out some interviews on the same population as will be sampled in the main survey. In our experience, there is no better way to understand the complexities of survey design!
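Several of the checks listed above, notably item (v), lend themselves to a simple tabulation of the pilot responses. The following is a minimal sketch only: the answer codes "DK" and "REF" and the three numeric thresholds are illustrative assumptions, since the text gives no numeric cut-offs, and any flag it raises is a prompt to re-read the question rather than a verdict on it.

```python
from collections import Counter

# Hypothetical thresholds (assumptions): tune them to the survey at hand.
DONT_KNOW_LIMIT = 0.20   # share of "Don't know" answers that triggers a flag
REFUSAL_LIMIT = 0.10     # share of refusals that triggers a flag
BUNCHING_LIMIT = 0.70    # share of substantive answers in a single category


def flag_question(responses):
    """Return a list of warnings for one question's pilot responses.

    `responses` is a list of answer codes. "DK" marks a "Don't know"
    and "REF" marks a refusal (both codes are assumptions of this sketch).
    """
    n = len(responses)
    if n == 0:
        return ["no responses"]
    counts = Counter(responses)
    flags = []
    if counts["DK"] / n > DONT_KNOW_LIMIT:
        flags.append("many don't-knows: vague, misunderstood or pointless?")
    if counts["REF"] / n > REFUSAL_LIMIT:
        flags.append("many refusals: reword, reorder or omit?")
    # Bunching is judged on substantive answers only.
    substantive = [r for r in responses if r not in ("DK", "REF")]
    if substantive:
        top_count = Counter(substantive).most_common(1)[0][1]
        if top_count / len(substantive) > BUNCHING_LIMIT:
            flags.append("answers bunched: leading question or poor categories?")
    return flags
```

Running such a function over every question in the pilot data set gives a short list of questions to discuss with respondents and interviewers before the main survey.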

6.2.8 Efficiency of Interviewer and Administrator Training

The pilot survey provides an ideal opportunity for testing the interviewer training procedures as well as for in-field training of interviewers. The performance of interviewers under real survey conditions may indicate areas in which retraining is necessary. Areas of poor performance can be detected by checking the collected questionnaires and by supervision of selected interviews by a field supervisor or the survey designer.

Training of administration staff is an important part of many surveys, particularly those using mail-back self-completion questionnaires. The day-to-day management of the survey office requires extremely well-organised and knowledgeable people. Not only do they have to understand the daily tasks of mailing (even though this is usually computer-automated, their understanding is essential for problem-solving on bad days!), but they have to be extremely competent in dealing with respondents' phone enquiries, and often also with data entry (where this is part of the responsibility). This means that these staff need substantial training - very similar to interviewer training, in fact (Section 7.1.2). Failing to test this in the pilot survey can have significant negative effects - particularly in the first busy days of the main survey!

6.2.9 Data Entry, Editing and Analysis Procedures

The pilot survey should include these aspects of the survey design process because of the backward linkages of these parts of the process to earlier parts of the survey. The effectiveness of pre-coded questions, the ease of data entry, the completeness of coding information, the required logic and editing checks, and the relevance of the analysis output to the objectives of the survey are examples of the sort of items which should be checked in the pilot survey. There have been many occasions when a data item collected in a survey has not been in the exact form anticipated by the user of the data, and has been subsequently rejected from the analysis. A simple check in the pilot survey could readily have averted this problem.

Do not put off testing these parts of the survey process just because it seems that it is still a long time before you will have to analyse the real data - it will be too late to rectify problems in the structure of the data when it comes time to perform the real analysis.

6.2.10 Cost and Duration of Survey

The pilot survey can also provide estimates of the cost and time requirements per completed interview which will assist in the determination of the total survey resources required for the survey of the total sample. If this amount exceeds the budgeted amount, then a decision can be made at this stage as to whether to seek an enlarged budget, or to accept a reduced accuracy (through smaller sample size) or to abandon the main survey altogether.

In pilot studies for personal interview surveys, it may be possible to pay interviewers by the hour and the distance travelled, and then to calculate a per-interview rate for the main survey. This not only simplifies administration procedures for the main survey, but also ensures that interviewers work efficiently during the main study.

One of the costs which is most often under-estimated in survey design is that of data-entry, validation, editing and analysis. Particularly in large-scale travel surveys, time and again we hear stories of a lack of funds for these key processes. Here again, the simple inclusion of these processes in the pilot - together with detailed costing of each stage - can help to avoid this problem.

To minimise error in the projections from the pilot survey costs to the main survey, it is necessary to plan the accounting process prior to the pilot survey. Even though the pilot survey may be relatively low cost, it is important to document all items and time spent to ensure that proper projections can be made.
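The projection from pilot costs to main survey costs can be illustrated with a short calculation. This is a sketch under a strong simplifying assumption, namely that the variable cost per completed interview observed in the pilot carries over unchanged to the main survey; the function names and the split into fixed and variable costs are illustrative and are not taken from the text.

```python
def project_main_survey_cost(pilot_variable_cost, pilot_completed,
                             main_sample, fixed_cost=0.0):
    """Project the main survey cost from pilot accounting records.

    pilot_variable_cost: pilot costs that scale with the number of
        completed interviews (interviewing, data entry, editing, ...).
    pilot_completed: number of completed interviews in the pilot.
    main_sample: number of completed interviews sought in the main survey.
    fixed_cost: costs assumed not to scale with sample size.
    """
    if pilot_completed <= 0:
        raise ValueError("pilot must contain at least one completed interview")
    unit_cost = pilot_variable_cost / pilot_completed
    return fixed_cost + unit_cost * main_sample


def affordable_sample(budget, pilot_variable_cost, pilot_completed,
                      fixed_cost=0.0):
    """Largest number of completed interviews that fits within the budget."""
    if pilot_completed <= 0:
        raise ValueError("pilot must contain at least one completed interview")
    unit_cost = pilot_variable_cost / pilot_completed
    return max(0, int((budget - fixed_cost) // unit_cost))
```

For example, a pilot that cost 5000 in variable items for 100 completed interviews implies 50 per interview; a main survey of 2000 interviews would then be projected at 100000, and a budget of 80000 would support about 1600 interviews. The comparison of these figures is what feeds the decision described above: seek a larger budget, accept a smaller sample, or abandon the main survey.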

6.2.11 Efficiency of Survey Organisation

The efficiency of office administration and field supervision can be checked by treating the pilot survey as a small-scale dress rehearsal of the main survey. Is the workload for interviewers and supervisors manageable in a personal interview survey? Are there any problems with travel to the interview sites, security while performing the interviews, problems with minority languages? For a self-completion mail survey, does the outbound and return mail system work as it should? How long does it take for mail to travel in each direction? What are the postage rates and how does the weight of your survey material (both outbound and inbound) compare with any thresholds in postal rates? Are the procedures for stuffing and addressing envelopes working satisfactorily and, importantly, will they work the same when you are dealing with a much larger volume of mail in the final survey? Many of these issues may seem trivial at first, but taking the time to test them in the pilot may mean that you will not have to spend a lot of time trying to work around their deficiencies in the main survey when you will have a million-and-one things to do already!
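The postage question raised above is easily checked once the pilot materials have been weighed. The sketch below is illustrative only: the item weights and the weight thresholds are made-up numbers, and real thresholds must be taken from the local postal service.

```python
def postage_band(item_weights_g, thresholds_g):
    """Locate a mail package within a set of postal weight bands.

    item_weights_g: weights in grams of every item in the envelope
        (envelope, covering letter, forms, reply envelope, ...).
    thresholds_g: upper weight limits in grams of each postal band.

    Returns (total_weight, band_index, headroom). band_index counts
    from 0 for the cheapest band; headroom is the weight that could
    still be added before moving to the next band, or None if the
    package exceeds every listed threshold.
    """
    total = sum(item_weights_g)
    for index, limit in enumerate(sorted(thresholds_g)):
        if total <= limit:
            return total, index, limit - total
    return total, len(thresholds_g), None
```

With hypothetical bands of 50, 125 and 250 grams, a package of items weighing 5, 12 and 30 grams falls in the cheapest band with 3 grams to spare, whereas adding a 10 gram reminder card would push it into the second band. Small margins of this kind are worth knowing about before the forms are printed in bulk.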

6.3 SIZE OF THE PILOT SURVEY

The size of a pilot survey is a trade-off between cost and efficiency. It cannot be as extensive as the main survey but nevertheless it should be large enough to yield significant results. This is especially important if the pilot survey is being used to compare alternative procedures in survey method, sample design or questionnaire design. In such a situation the pilot survey should be of sufficient size to ensure that if substantial differences between the methods do exist then they can be detected statistically.

Usually, however, the size of the pilot survey will be related to the size of the main survey. A reasonable rule of thumb is to expect to spend about five to ten percent of the total budget on the pilot survey. While this may seem, to some, to be a waste of resources, it generally turns out to be a wise investment. Remember that many of the tasks required for the pilot survey will be required to be done later in any event, so they may as well be done early in the process when you have a chance of correcting your mistakes. In addition, if there are no problems detected in the pilot, then it may be possible to combine the pilot survey data with the main survey data in a final data set.

Finally, with suitable cautionary notes, you can use the results of the pilot survey to present an interim report to the sponsors of the survey to let them know that the survey is in progress, and to give them a feeling for the type of results which they could expect to see from the main survey. This, in fact, can also be seen as simply a pilot survey of the final stage of the survey process, which is the presentation of results and the documentation of the survey.
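Where the pilot survey is used to compare two alternative procedures, as discussed at the start of this section, the required size can be approximated before fieldwork begins. The sketch below uses the standard normal-approximation formula for a two-sided comparison of two independent proportions (for example, the response rates of two questionnaire designs). It is a generic textbook calculation offered as an illustration, not a procedure prescribed here, and the default significance level and power are assumptions.

```python
from math import ceil, sqrt
from statistics import NormalDist


def pilot_size_per_group(p1, p2, alpha=0.05, power=0.80):
    """Respondents needed in EACH pilot group to detect p1 versus p2.

    p1, p2: the two proportions (e.g. response rates) that the pilot
        should be able to tell apart.
    alpha: two-sided significance level (assumed default 0.05).
    power: probability of detecting the difference (assumed default 0.80).
    """
    if p1 == p2:
        raise ValueError("p1 and p2 must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p1 - p2) ** 2)
```

Under these assumptions, distinguishing a 50% response rate from a 60% response rate needs roughly 390 respondents in each group, while a larger difference needs far fewer. This makes concrete why a pilot intended only to debug procedures can be small, whereas one intended to choose between methods may have to be substantial.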


7. Administration of the Survey

While pilot surveys form a vital component of survey design, as discussed in the previous chapter, several other aspects of survey design must be considered before a pilot survey can actually be administered. This chapter will describe particular aspects of the administration of various types of survey including the roles of recruitment and training of interviewers, pre-publicity, field supervision and follow-up validation procedures. It will then provide some more specific comments with respect to the five major types of survey described earlier in Section 3.2 (viz., self-completion, personal interview, intercept surveys, telephone surveys, and in-depth surveys). Other aspects of survey administration may be found described in texts such as Warwick and Lininger (1975), Parten (1965), Hoinville and Jowell (1978) and Pile (1991). A particularly good reference is the book on Total Design Principles by Dillman (1978). While this book was written in a pre-computer era, it highlights the myriad of details which must be attended to in the design and administration of surveys.

7.1 GENERIC PROCEDURES FOR SURVEY ADMINISTRATION

The recruitment, training, and supervision of survey personnel and the need for follow-up validation procedures are four highly interrelated tasks, since the amount of effort required on any one of these tasks is closely related to the amount of time spent on the other three. For example, a poor recruitment campaign will mean that more time and effort must be placed into training and field supervision at a later date. Because of the interdependent nature of these tasks they will be discussed together in general terms in the first section of this chapter. Some of the tasks relate more to one type of survey than another, although there are some common threads running through the tasks for all types of survey discussed in this book.

7.1.1 Recruitment of Survey Staff

There are three different sources of recruits for survey staff for transport surveys:

(a) Captive markets, e.g. yourself, members of your company, students, members of organisations.
(b) Market research firms.
(c) Respondents to advertisements.

Captive market recruits usually have a number of particular problems. First, the staff tend to be atypical of the general population. They may be all students, or all professional people or all members of an organisation with distinct characteristics. Second, if they have been chosen because the topic of the survey is of special relevance to them and if they are to be employed as interviewers, they may be overzealous in their pursuit of answers thereby introducing a certain degree of interviewer bias into the survey. On the other hand, if they have been directed to do the interviewing or survey administration as an adjunct to their normal duties within an organisation, they may have little motivation to do a good job (especially as it may often include evening work) and thereby introduce a different type of bias. Third, some of the people chosen may not have suitable personalities or aptitudes for the task. This is problematic since the recruits may, for example, all be volunteers from a community organisation, and it is difficult to reject recruits without appearing to be ungrateful for the organisation's assistance. This usually means that considerable extra training is needed to overcome these weaknesses. In general the main advantage of using a captive market of recruits lies in the considerable cost advantage over other methods of recruitment.

The practice of using market research firms to provide interviewers can be undertaken in one of two ways. Either the firms can be approached to simply provide a group of interviewers or else the firm can be approached to conduct a much larger part of the survey. Either method has a number of advantages. First, it eliminates the need for the investigator to get involved in the actual recruitment process. Second, it is most likely that a market research firm would not need to go through a full recruitment program since it probably has a number of interviewers already "on the books".



Using market research firms does, however, have some disadvantages, especially if they are used for more than just the supply of interviewers. First, the cost of using a market research firm can be considerable. Second, it is possible that very little of the expertise gained in the conduct of the survey will stay within the sponsoring authority. Third, it is necessary to realise that the quality of market research firms varies. In particular, a market research firm which is skilled in general survey techniques may not necessarily possess the special skills and experience needed for travel surveys. For example, many market research interviewers do not understand the severe travel-data implications of choosing a neighbouring household to replace one in the sample where no-one was at home, or of not calling back to a household numerous times to ensure that not only the easily accessible respondents are reached (see Section 7.3.6). It is therefore advisable for the investigator to first determine which aspects of the survey can be handled "in-house" and then to bring in specialised consultants for the remaining tasks for which they possess special skills. It is wise to ascertain the methods used by the market research firm and their previous experience and then retain some control over survey procedures.

The third type of recruitment is the most complex, but if performed correctly, it is the most satisfactory method of recruiting staff for a particular survey. In this method, the staff are hired by the investigator for the survey at hand. The first step in this recruitment process is for the investigator to decide just what type of person is needed to do the job. In particular, the following questions need to be answered:

(a) What type of person can do the work?

(b) Does it require highly skilled people with a special knowledge of the subject matter of the survey, or will a comprehensive training program be provided?

(c) What facilities are required by the staff/interviewers, e.g. is it essential that they have a driver's licence and access to a car?

(d) What degree of availability is required of the staff/interviewers, e.g. will they be required to work during the day or at night or on weekends?

(e) Are any specific abilities required such as the ability to speak a foreign language?

Given that the investigator can adequately answer these questions, the next step is to draw up an application form which will obtain a profile of the person applying for a survey staff position, which can then be matched against the desired staff profile to assist in staff selection.

The application form should ideally include questions on the following topics:



(a) Name;
(b) Home address;
(c) Home phone number;
(d) Date of birth;
(e) Highest education level reached;
(f) Previous study or work experience which may be relevant to interviewing;
(g) Foreign languages spoken (if any);
(h) Car availability;
(i) Current employment and hours of work/study;
(j) Health condition;
(k) Availability for training;
(l) References;
(m) Reasons for applying;
(n) Source of information on interviewing job.

Items (a) to (g) on the above list give a general description of the applicant and their abilities which may be compared with the desired profile identified earlier. Items (h) to (k) provide information on possible constraints which the applicant might face in carrying out the interviewing task. These questions are included not only to provide information to the investigator, but also to make the applicant aware of the constraints which they may face. It is probable that full realisation of these constraints may dissuade many potential applicants from making an application, thereby saving the investigator time and money by not having to process applications which will, eventually, probably be rejected. In asking for this information, you need to be aware of any legal restrictions you may face locally with respect to Equal Employment Opportunities and other legislation. Item (l), references, is a useful piece of supportive information for the earlier claims. Item (m), reasons for applying for the job, should provide some indication of the general motivation of the applicant. It may also be a warning to watch for any excessive interviewer bias if the applicant expresses a keen interest in the subject of the survey. Finally, item (n) can provide useful data to assess the effectiveness of different types of advertising. This information may be useful for subsequent recruitment of more survey staff.

Following on from this last point, it is necessary at this stage to determine where advertisements should be placed to attract the appropriate type of person. The choice of media for the placement of advertisements essentially lies between:



(a) Newspapers - national, urban, and local;
(b) Specialised journals and newspapers;
(c) Electronic media - radio, television.

The choice of media will depend to a large extent on the scale of the survey, both in terms of the number of staff required and the geographic spread of the interview locations in cases where interviewers are needed. The use of electronic media will generally attract a lot of applicants, but they tend to be of lower quality and this will entail a lot of expense in screening out unsuitable applicants.

The type of newspaper advertisements used can significantly affect the type of applicants obtained. First, the choice of newspaper will result in a considerable variation in the characteristics of applicants, depending on the readership's educational level, political orientation, etc. Second, the placement of the advertisement within the newspaper will also affect response. Different applicants will be attracted depending on whether it is placed in the classified section or in the body of the newspaper. If placed in the body of the newspaper, different responses will result if it is placed in the business news section as compared to near the shopping specials or in the sports section. The guiding rule in all this is to know, beforehand, the type of person wanted and then to place advertisements where that type of person is most likely to see them.

As an alternative to placing advertisements in the general media, it might be useful to approach several sources of possible applicants directly. First, the local unemployment office can be approached. This has been shown to be very successful in Australia and New Zealand. It has the advantage that people are often very keen to do the work, and there is rarely the problem of "untraining" habits which market researchers sometimes exhibit. Second, most universities have a Careers and Appointments Office where students register for part-time and vacation employment. These should only be considered if the survey work is confined to holiday periods, or the employment is to be part-time. Third, it might be useful to contact schools with a view to offering part-time employment to teachers during vacation periods. Applications received from each of these three sources should, however, be treated the same as applications received in response to media advertisements, i.e. they will still need to be screened since not all applicants are suitable.

The selection of suitable applicants from the applications received should proceed in three stages. First, all applicants would be required to complete the formal application form described earlier. The completion of this form serves three functions in the selection process. By requiring all applicants to fill in the form in their own handwriting one can quickly see whether their writing is sufficiently clear not to present problems to other people who may later have to read their writing (e.g. data enterers). Completion of the form can also test the applicant's ability to follow written instructions. Finally, the information on the form can give a general indication of the applicant's suitability and, more importantly, will indicate any constraints which would make the applicant completely unsuitable.

The second stage of selection may involve some formal testing procedures to screen out applicants with insufficient skills in a number of areas. The general types of skill required for all survey staff may be thought of as intelligence, word knowledge, clerical skills (an ordered mind), familiarity with the computer if the staff are to do data entry, and telephone skills if the use of the phone is important.

The third stage, for applicants who have passed the two initial stages, is a personal interview to determine whether applicants have the necessary personal qualities to be good survey staff. The qualities which one would look for in survey staff would include:

(a) The person must be able to understand the general purpose of the survey.

(b) If they are to be interviewers:
• they should be of presentable appearance to encourage acceptance into respondents' homes;
• they should have a similar background to the potential respondents to encourage the development of empathy during the interview.

(c) If they are to be employed as either interviewers or office staff dealing with phone enquiries:
• they must be enough of an extrovert to enable them to keep an interview moving and to elicit answers from reluctant respondents;
• they must be resilient enough to withstand the occasional refusal.

(d) If necessary, the person should be able to speak a foreign language which will likely be encountered in the study area.

(e) They must be of general honest character such that there is no great danger of fabrication of results.

At the end of these three stages, a number of applicants will be selected for training.

7.1.2 Survey Staff Training

Having recruited apparently suitable applicants, it is now necessary to train them in the skills of survey administration, interviewing, or data entry. Three factors need to be considered when discussing staff training:

(a) Why should we train applicants?
(b) On what should we train them?
(c) How should we train them?

7.1.2.1 Why train staff?

It seems obvious that people should be trained for interviewing, but what particular reasons can be offered for the expenditure required in the training program? There appear to be three basic reasons for interviewer training:

(a) Increased response rates - experience has shown an 8-10% difference in response rates between "trainee" and "trained" interviewers. Besides increasing the value of the data in terms of reduced sampling bias, higher response rates also have the pragmatic advantage of decreasing the probability of irate respondents complaining to supervisors.

(b) Increased likelihood of following instructions - by training people in the purpose of the survey, they are more likely to see the need for following instructions exactly. For example, if people are trained in the purpose of sampling, there is likely to be no substitution sampling in the field. Similarly, awareness of the biases which can arise from changes in question wording will induce people to ask the questions verbatim where necessary.

(c) Reduced error rate - training can reduce clerical errors in completing the interview form quite substantially. This will result in a consequential reduction in editing effort required. Trained interviewers have about half the error rate of trainee interviewers.

But what about training for administration staff and data enterers? Administration staff in a self-completion survey have several important roles to fill - each of which is carried out best if the staff have a comprehensive understanding of the total survey process.

(a) Mailing of questionnaires. Knowledge of sampling procedures allows survey staff to understand important objectives such as why questionnaires have to reach the household before the "travel day", and what to do with inadequate addresses.

(b) Receipt of questionnaires. Appropriately designating returned questionnaires or completed interviews as fully or partly responding, or as sample loss, can only be done with a good understanding of the survey objectives.

(c) Answering phone enquiries. Since every caller - even an irate one - can be considered to be a potential respondent, response rates can be increased considerably by knowledgeable, friendly staff answering phones.

Many travel survey designers are opting to use "intelligent" data enterers as part of the quality control process. If properly trained, these people can not only enter data, but can highlight suspected inconsistencies in the data and initiate procedures to follow up respondents.

7.1.2.2 Content of Training Sessions

Given that we see training to be essential, the question then becomes one of what do we train these people to do. While training is often only considered to apply to interviewers, administrators (particularly of self-completion surveys) and data entry personnel should also attend training sessions. There are seven broad areas of training - some are suitable for all three types of people, and others are confined to interviewers.

(a) Survey objectives and design - the underlying reason for carrying out the survey, the objectives as set out by the investigator and/or client, and the reason for selecting the methodology which has been chosen. Interviewers, administrators and data entry people all need to know these in order to carry out their jobs efficiently and to make wise decisions throughout the survey period.

(b) Subject matter of survey - all three groups of people need to be aware of several factors concerning the subject matter of the survey. They need to know the background of the survey organisation, the subject matter technical details, the survey administration procedure, and how the data will be used. Training on the subject matter will enable the interviewer and administrator to answer any questions raised by respondents in the field. More importantly, good training will give all these people confidence in, and commitment to, the aims of the survey. Such confidence is highly contagious to respondents and generally results in higher response rates and respondent co-operation.

(c) Questionnaire details - interviewers, administrators and data entry personnel should be familiarised with the details of each question on the questionnaire to be used.

(d) Technique - the technique and skills of personal interviewing are a critical area in need of training for interviewers. Definitions of terms should be clarified, coding conventions for field-coded questions outlined, and the degree of probing and clarification allowed for each question should be specified.

The types of issue which should be covered are:

(i) How to gain co-operation;
(ii) How to handle difficult cases;
(iii) How and when to clarify answers;
(iv) How and when to probe;
(v) Clerical recording techniques.

These issues should also be addressed in the Interviewers' Manual (Section 5.9) and in a manual for administrators and data enterers. All these manuals need to be given to the client as vital articles of documentation at the end of the study.

(e) Sampling frame - all people involved in the survey need to know how the sample was selected and what sampling frame was used. This very often helps answer respondents' questions about "why me?". For example, if dwellings were selected from a list of people connected to water, it is possible simply to say that it was not they who were selected, but their dwelling.

(f) Sampling - interviewers should be trained in the basic principles of sampling and, in particular, they should be told about the various sources of sampling bias which exist. They should be made to realise the importance of adhering to the selected sample. In this way there is a greater chance of them adhering to sampling instructions or else, if in a difficult position in the field, they will be able to make a decision of whether to contact their supervisor for assistance or whether they can change the selected sample without affecting the randomness of the sample.

(g) Administrative details - information on this aspect of the survey is necessary for interviewers, in particular, if the administration of the survey is to proceed smoothly. Factors which need to be covered include:

(i) Who is the immediate supervisor in case of field difficulties?
(ii) Who do they report to for collection and return of interview forms?
(iii) How are time sheets to be filled out?
(iv) How do they get paid?
(v) What times are they expected to work?
(vi) Where do they obtain their lists of households for interviewing?



7.1.2.3 Training Methods

The methods to be used in training are many. It has been found that the distribution of a Manual prior to the training session (e.g. Ampt 1994), with some Home Study Exercises including a tape recording of the way an interview is likely to run, can be invaluable in ensuring that participants learn the maximum during the training sessions. Lectures and tutorials with extensive use of audio-visual materials (especially video-tapes) can also be very useful.

For interviewers, however, the most important aspect of training is practice, which can be of two types: group practice in class and field practice under supervision of a more experienced interviewer. Group practice in class is a convenient method of gaining practice where three or four interviewers interview each other in turn. In this way, lessons are learned by being both an interviewer and a respondent. It works best where mock answers are arranged beforehand, although if this is not possible, interviewers can simply interview each other. Make sure that when they are "playing respondent" they use real examples and not fictitious ones since this can get very confusing. These group practice sessions should include both the asking of questions and the recording of answers, and should be video-taped if possible.

Survey administrators primarily need to be trained on the underlying reasons for the survey and the design of the questionnaire. This gives them important background information for all their work. If they are to be supervisors of interviewers, however, they should always complete the interviewer training course, including carrying out some interviews. Similarly, if they are to supervise data entry personnel, they should both undertake that training course and carry out some data entry themselves.

During training, data entry staff need to be given plenty of chances to see (and correct) the errors they make, as this is the best way of ensuring that their skills are of high quality.

While the procedures finally adopted in recruitment and training will depend to a large extent on the scale of the survey and the amount of previous experience held by the survey staff, the above discussion will provide a framework for such a process. More complete details on recruitment and on training may be found in Warwick and Lininger (1975) and Moser and Kalton (1979).

7.1.3 Pre-Publicity

In an attempt to increase response rate, it is often considered useful to conduct some pre-publicity for the survey some time before the survey takes place. This publicity may be pitched at two levels: at the general population, and at the respondents in the sample.



Publicity aimed at the general population is normally by way of the general media, particularly the newspapers. As with the recruitment of survey staff, the selection of newspapers for publicity, by means of a community interest story, should be guided by the location and characteristics of the survey population. As well as arousing interest in the survey, the newspaper clipping of the story can be a very strong focal point when attempting to gain initial acceptance at the household's front door on the day of the interview. With media publicity, care should be taken not to oversell the survey since this may arouse negative feelings and suspicions about the survey's purpose.

Another issue lies in the fact that if the survey is being run over a period longer than a few weeks, it is difficult to decide when the publicity should be run so as to affect all respondents equally. This is important, because response rates may be affected (either positively or negatively) by publicity, and clearly it is important to know and measure which response levels are "normal" and which are affected by publicity. For this reason, it is often argued that no publicity should be done for longer-term surveys.

The more important form of pre-publicity is direct contact with the respondents in the sample. The normal form of contact is by means of a letter of introduction explaining the survey purposes and advising that an interviewer will be calling in the near future or that a questionnaire will be sent. The letter should be fairly brief and should be sent out only a few days before the interviewer is to call. The purpose of this letter is threefold. First, it makes respondents aware that an interviewer will be calling or that a questionnaire will be arriving in the mail, and hence they will be expected. In a personal interview survey, the interviewer is therefore less likely to be mistaken for a door-to-door salesman when initial contact is made, and hence refusal rates should decrease. Secondly, it gives the respondent the opportunity to think about the general subject matter and this may improve the quality of response.

In some survey designs, the interviewer is required to contact the respondent directly by phone to arrange for a mutually convenient time at which to hold the interview. While phone contact is sometimes useful as a follow-up to an introductory letter, it should never be used alone. It is much easier for someone to refuse an interview over the phone than it is when you are standing at the door. Phone contact should generally only be used in special circumstances.

The combination of local newspaper articles, introductory letter and phone contact was used to good effect in a survey of elderly travel in Sherbrooke Shire, Victoria, Australia (Richardson, 1980). In this survey, described earlier in Chapter 4.3, it was felt that the elderly residents may have been somewhat reluctant to grant the interviews since many interviews were performed at night and most of the interviewers were male (a rarity in household interviews). Effective pre-publicity, including the provision of a phone number for respondents to find out more about the survey, resulted in no outright refusals in a sample of 72 households.

7.1.4 Survey Execution and Monitoring

While attention paid to recruitment and training of survey staff will reduce the problems encountered in the field, such problems will never be eliminated and the survey administration process needs to take this into account with a rigorous monitoring process.

7.1.4.1 Personal Interview Surveys

Some interviewers, while performing well in training, cannot cope with the demands of real interviewing, while others who initially perform well in the field tend to become careless after some time. For these reasons, it is generally necessary that the administration of any survey of reasonable magnitude be managed by a field supervisor. The main tasks of a field supervisor are to organise work groups, to distribute work assignments, to supervise in the field, to review completed work, and to act as a liaison with the survey office. In addition, the administrative tasks connected with the survey, including payment of interviewers, will need to be handled efficiently to ensure smooth survey operation.

For large personal interview surveys, it is generally advisable to break up the total number of interviewers into work groups of five to ten interviewers who work under one supervisor in the same geographic area. The formation of small groups enables personal interaction between interviewer and supervisor and allows the development of a sense of camaraderie within the group. This is important if the survey is to continue over a long time period or if interviews must be made under difficult conditions.

The distribution of work assignments should be carefully managed by the supervisor. The general idea is to assign work to interviewers in relatively small bundles with the requirement that all completed interviews must be returned to the supervisor before more work will be assigned. In this way, tight control is kept over the flow of work and interviewers are prevented from deferring those interviews which they perceive to be most difficult. It is also a good general rule to require interviewers to return completed interviews no more than one or two days after they were completed. This ensures that interviewers carry out their editing duties on each questionnaire before they have forgotten the details of that interview. Prompt return of completed questionnaires also ensures a continuous flow of work to the data enterers.



Before passing on completed interview forms to the survey office, the field supervisor should check through the questionnaires in some detail. In this way, recording errors and other problems can quickly be drawn to the attention of the interviewer who can then make the appropriate correction. This procedure also serves to alert the interviewer to such problems before more interviews are conducted, thereby avoiding the perpetuation of systematic error or misunderstanding.

At least once during every survey, a supervisor should attend one interview with each field person to ensure that the correct procedures are being carried out. Naturally, while notes can be taken during the interview, any comments to the interviewer should be given in private after the interview. Good supervisors are quick to spot interviewers who are not in the habit of reading questions verbatim, who usually probe too much, and so on, and can help them (and the data quality) by reiterating the principles of good data collection after the supervised interview.

7.1.4.2 Office Procedures

To maintain control over the administration of any type of survey, it is necessary for the survey office to keep administrative records detailing the current status of the work program. Three types of record are particularly important. They are best kept as part of a computer package or spreadsheet, but can also be kept manually:

(a) Sample control log - this is a listing of the total sample by address, with space allowed to record the current status of each sample element, such as: not yet issued, in field, questionnaire returned/completed (with date of interview and name of interviewer if appropriate), refusal, sample loss, etc.

(b) Questionnaire control log - a list of the status of each questionnaire - and of all interviewers and the interviews assigned to them in a personal interview survey. This information is useful in assessing the quality and quantity of the interviewer's work. This log is also designed to keep control of questionnaires when they are given to data entry staff and when they are returned for shelving or filing.

(c) Completed questionnaire log - a list of all completed questionnaires which have been returned to the survey office. Details of the date of return of each questionnaire, the interviewer's name, etc. should be kept in the office records.
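As a rough illustration of how the sample control log in (a) might be held in a computer package, the following Python sketch stores one record per sample address and counts the records by status. The status names follow the list given in (a); the class names, field names and example addresses are hypothetical and are not taken from any particular survey management package.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    # Statuses listed in the description of the sample control log
    NOT_YET_ISSUED = "not yet issued"
    IN_FIELD = "in field"
    COMPLETED = "questionnaire returned/completed"
    REFUSAL = "refusal"
    SAMPLE_LOSS = "sample loss"


@dataclass
class SampleElement:
    address: str
    status: Status = Status.NOT_YET_ISSUED
    interview_date: Optional[str] = None  # hypothetical field (ISO date string)
    interviewer: Optional[str] = None     # hypothetical field


def status_summary(log):
    """Count how many sample elements currently hold each status."""
    return Counter(element.status for element in log)


# Made-up entries for illustration only
log = [
    SampleElement("1 Example St"),
    SampleElement("2 Example St", Status.IN_FIELD, interviewer="A"),
    SampleElement("3 Example St", Status.COMPLETED, "1994-05-02", "A"),
    SampleElement("4 Example St", Status.SAMPLE_LOSS),
]
summary = status_summary(log)
print(summary[Status.COMPLETED], summary[Status.SAMPLE_LOSS])
```

A summary of this kind gives the survey office the current counts of completed questionnaires and sample loss at any point during the field period.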

Several computer program packages are available for survey management and control, including some which are suitable for use on micro-computers. As early as 1981, Neffendorf (1981) described the Survey Management Information System (SMIS). More recently Bethlehem and Hundepool (1992) gave examples of using a computer for all stages of the survey process including administration. Eash (1987) describes how the management of a small telephone-based home interview survey was enhanced in a number of ways by using a microcomputer to carry out the major administrative tasks. The computer selected the sample, mailed the pre-contact letters, organised interviews, checked the status of the interviews, kept a log of successful interviews and performed some data analysis.

One aspect of survey administration which has been the subject of debate for many years is the relative merits of different forms of payment to interviewers. The two major methods are "piecework rates", where interviewers are paid per completed interview, and hourly rates. Those in favour of piecework rates argue that it provides a strong incentive to interviewers to make contact and to complete interviews, allows for greater ability to predict the total cost of the survey, and results in higher productivity in terms of number of completed interviews per dollar. It also has the advantage of being much simpler to determine the payments to be made to each interviewer.

Proponents of hourly rates, on the other hand, first point to the disadvantages of piecework rates. They argue that rather than provide an incentive to interviewers to make contacts, there is a definite incentive for interviewers not to make contacts unless there is a high chance of completing an interview. Thus difficult households tend to be avoided and call-backs are neglected since they are less likely to result in payment. This can result in substantial sampling bias. It is also claimed that once a contact has been made, there is pressure on the interviewer to finish that interview as soon as possible so that another contact can be made. In these circumstances the interviewer tends not to probe too deeply since probing can be time consuming. In travel surveys, this probably results in under-reporting of trips by respondents who cannot immediately remember their complete travel pattern. Piecework rates are therefore seen to contribute towards quantity of interviews rather than quality of interviews.

The use of hourly rates, however, also has its practical problems. Interviewers have to keep time-sheets and have to be relied upon to fill them out correctly and honestly. There is an incentive for interviewers to pad each interview, spending longer than is really necessary in each household. The productivity in terms of interviews per dollar must therefore fall.

Thus each method of payment has its own advantages and disadvantages, with opinion being divided between the two methods. Generally, the hourly rates method appears to be more favoured because of its greater emphasis on quality of response. With an efficient survey management information system, it is possible to identify those interviewers who appear to be particularly slow and checks can be made to determine whether they are padding their time-sheets or whether the responses they are getting are actually of higher than average quality.

One commonly used approach to payment methods is to use hourly rates for the pilot survey, when only experienced interviewers are used, to allow the "time and mileage data" from the pilot to be converted into piecework payments for the main survey. This ensures fairness of payment and has the second benefit of allowing more realistic calculation of the survey costs in advance.
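The text does not specify how the pilot "time and mileage data" are converted into a piecework payment. One simple possibility, shown here purely as an assumption, is to divide the total time and mileage cost observed in the pilot by the number of completed pilot interviews. The function name, parameter names and all figures below are hypothetical.

```python
def piece_rate_from_pilot(hours, km, interviews, hourly_rate, rate_per_km):
    """Payment per completed interview implied by pilot time and mileage totals.

    Assumed conversion: (time cost + mileage cost) / completed interviews.
    """
    if interviews <= 0:
        raise ValueError("pilot must contain at least one completed interview")
    total_cost = hours * hourly_rate + km * rate_per_km
    return total_cost / interviews


# Illustrative figures only, not taken from any survey
rate = piece_rate_from_pilot(hours=40, km=200, interviews=50,
                             hourly_rate=20.0, rate_per_km=0.5)
print(rate)  # 18.0
```

Multiplying the resulting rate by the expected number of completed interviews in the main survey then gives an advance estimate of interviewer payments.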

7.1.5 Follow-up and Validation Procedures

The field component of many surveys ends when the first round of completed questionnaires arrives at the survey office for coding. It is considered that "we did our best" and that the analysis will just have to proceed with the data which have already been obtained. Often, there is a time deadline for the results to be delivered to the sponsor, and it is considered that it is simply not worthwhile to "chase" those in the sample who did not respond. However, it is becoming increasingly common for follow-ups to be seen as an integral part of the total process of conducting a survey, whether it be a personal interview or a self-completion survey, and for the time and costs of these processes to be calculated as part of the survey process. Various methods of follow-up and validation which are applicable to the different types of survey will be addressed in the following sections.

7.1.6 Confidentiality

Confidentiality of survey data is an extremely important aspect of the administration of all surveys. Since respondents are giving the information to the researcher in good faith, we should respect this in every way possible. There are several simple things which can be done to assist this:

• Mark the questionnaires with "In Confidence" or something similar.

• Train interviewers and all other staff to "not only be confidential, but be seen to be confidential". This involves things like not taking anyone else in the car in a personal interview, covering the questionnaires in the back seat so that passers-by cannot see them, and so on.

• All respect should be given to confidentiality in the office. Data enterers, coders and administrative staff should not discuss individual respondents in the office or at home.

• Questionnaire forms should be shredded at some time after the survey data has been analysed.



Several more substantive things may also be done (and, indeed, are mandatory in some countries):

• Addresses may be removed from the computer files subsequent to coding these to zones or x-y co-ordinates.

• Specific locked rooms (for which only certain persons have permission to enter) may be designated within the survey office for all completed questionnaire forms.

7.1.7 Response Rates

Every survey has a response rate associated with it, and it is imperative that this is reported correctly. The response rate measures the extent to which the sample responds to the survey instrument. The objective is to obtain a high response rate, so that there is a greater probability that the set of respondents closely represents the chosen sample, which should in turn represent the target population. Response rates, in principle, are calculated in the following way. From the gross sample size are subtracted those members of the sample from whom a response could not possibly be obtained. These forms of sample loss (i.e. invalid households, such as vacant or demolished dwellings, invalid phone numbers) do not affect the quality of the sample, and are sometimes said to be quality neutral. The number left after subtracting the sample loss from the gross sample size is the net sample size. The number of total responses is then taken as a percentage of this net sample size. The following example is useful:

Given 100 households in the gross sample, 5 vacant dwellings (sample loss) and 64 full responses, the response rate would be:

Gross sample size     100
Sample loss           minus 5
Net sample size       95

Responses             64
Response rate         64/95 = 67.4%

While the above method of calculating response rates is simple in concept, it is not quite so simple when being used for a particular type of survey. Importantly, however, consistent methods of calculating response rate must be used across all types of survey, especially when comparisons are being made between survey types to decide which type of survey method to select.
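The calculation described in this section can be written as a short Python sketch, which reproduces the worked example of 100 households, 5 sample losses and 64 responses. The function and argument names are illustrative only.

```python
def response_rate(gross_sample, sample_loss, responses):
    """Return (net sample size, response rate as a fraction of the net sample)."""
    net_sample = gross_sample - sample_loss
    if net_sample <= 0:
        raise ValueError("net sample size must be positive")
    return net_sample, responses / net_sample


net, rate = response_rate(gross_sample=100, sample_loss=5, responses=64)
print(net, f"{rate:.1%}")  # 95 67.4%
```

Using one function of this kind for every survey type is a simple way of keeping the definition of response rate consistent when survey methods are being compared.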

7.2 ADMINISTRATION OF SELF-COMPLETION SURVEYS

While it is often thought that organising a personal interview survey is more complex than organising a mail-out/mail-back survey, self-completion questionnaire surveys have very special administrative needs - particularly with regard to quality control. As noted in Chapter 2, a major problem with self-completion surveys is that very often the response rate is quite low, and therefore the opportunity for non-response bias to occur is quite high. The greatest contribution to high quality in self-completion surveys is therefore to raise the response rate as much as possible, while also collecting information about non-respondents in order to account for the residual effects of non-response.

Several factors are important determinants of the response rate for self-completion surveys and, as such, are important components of the administration of self-completion surveys. They are outlined in the following sections.

7.2.1 The Use of a Reminder Regime

Reminder letters are undoubtedly the most effective way of increasing the response rate. A general procedure for a mail-out/mail-back self-completion survey, including reminder letters, is based on the well-tested KONTIV method, originating in Germany (Brög, Fallast, et al., 1985). The method described here has been widely used (for example, in the U.S. and Europe (Brög, Meyburg et al., 1985) and in Australia (Richardson and Ampt, 1995)) for a household travel survey where all people in the household are asked to report travel for a specified travel day. The same principles could, however, be used for any self-completion survey.

(a) Initial contact. This stage is to introduce the respondents to the fact that they have been selected to participate in the survey and to legitimise it in some way. This is done with an introductory letter and informational brochure which is sent just over one week prior to the Travel Day allocated to the household (each household is asked to provide complete travel and activity data for one pre-specified Travel Day).

(b) First mailing. The first mailing includes the following items:
• A follow-up covering letter
• A household and person form
• 6 trip forms (to cover the maximum expected number of persons in the household)
• A trip form with a pre-printed completed example
• A postage-paid return envelope
This mailing is sent in an envelope with a postage stamp to make the letter seem more personal. The letters are sent so that they arrive 2 working days prior to the Travel Day.



(c) First reminder. This takes the form of a post-card either to thank respondents who have already returned their forms or to remind respondents to return the questionnaire and to allocate them a new travel date (one week after the initial one) in case the forms have not yet been filled in.

(d) Second reminder. The second reminder is a letter sent in an ordinary business shaped envelope, again signed by the Survey Director. Once again, a new travel date is suggested for those people who have not yet filled in the forms.

(e) Third reminder. This reminder contains all the items sent in the first mailing with the addition of a cover letter from the Survey Director stressing the importance of cooperation by respondents in returning the forms. Again, a new travel date is proposed.

(f) Fourth reminder. For this (final) reminder a postcard is again used - but in a different colour. A new travel date is again proposed.

Notice that each element of the reminder regime is a different shape or colour. This is designed to discourage people from discarding (without reading) "yet another letter from these survey people".

As a rough method of estimating returns, it appears that about the same percentage of persons sent questionnaires respond to each mailing. For example, if 50% respond to the first mailing then 50% of the remainder will respond to the second mailing (i.e. 50% of 50% = 25%), while 50% of the remainder (i.e. 50% of 25% = 12.5%) will reply to the third mailing etc. Therefore while the use of reminder letters will increase the response rate, the initial response is critical in determining what the final response might be. It is therefore important to consider all factors which influence the initial response rate.
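The rule of thumb above implies that, if a constant share p of the remaining non-respondents replies to each mailing, the cumulative response after n mailings is 1 - (1 - p)^n. The following Python sketch reproduces the 50% example; the function name is illustrative, and the constant share is the rough assumption stated in the text, not an observed result.

```python
def cumulative_response(rate_per_mailing, n_mailings):
    """Cumulative share of the sample responding after each mailing,
    assuming the same share of remaining non-respondents replies each time."""
    remaining = 1.0
    cumulative = []
    for _ in range(n_mailings):
        remaining *= (1.0 - rate_per_mailing)
        cumulative.append(1.0 - remaining)
    return cumulative


print(cumulative_response(0.5, 3))  # [0.5, 0.75, 0.875]
```

The increments between successive values (50%, 25%, 12.5%) match the figures in the example, and show how strongly the final response depends on the response to the first mailing.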

7.2.2 Validation Methods

In addition to the postal reminders, a number of other techniques can be used to improve response rates and the quality of the reported data (Ampt and Richardson, 1994; Ampt, 1993), particularly in self-completion surveys.

7.2.2.1 Phone Interviews

When the data from the returned forms is initially entered into the data base, queries or apparent mistakes are "tagged" by the data enterers. These are then followed up by phone interviewers who telephone these households in order to clarify any points of uncertainty. The phone numbers are provided by the respondents in response to a question on the survey form (in Australia, about 85% of respondents provide their phone numbers), and approximately 60% of all responding households are phoned. During phone interviews, there is a check made of which person in the household completed each travel form in order to gain a measure of proxy reporting.

7.2.2.2 Validation Interviews

A sample of responding households can be selected for a personal interview to check on the quality and completeness of the data provided in the self-completion phase of the survey. Each household member is asked to go through the information provided for their travel day. A variety of techniques have been used for this interview.

One method is to carry out a full personal interview (using the original self-completed form as a memory jogger). In this way data on all travel is verified personally. Since respondents are also asked who filled in the original trip form, this is of particular value for measuring the effects of proxy reporting. The personal interview also means that the method is directly comparable with data obtained in the non-response interviews.

In many cases a graphical summary of the travel and out-of-home activities is used in these validation interviews. The representation is based on the time line concept (Jones 1977) with a line for each of home, travel and out-of-home activities (Figure 7.1).

[Figure: a time line for one person's travel day, running from 4 am to 4 am the following day in two-hour steps, with separate rows for Home, Travel and Outside. Travel segments are keyed by Mode (Walk/Bike, Public Transport, Car/Taxi, Other Mode) and the remaining segments by Activity (Get on/off, Accompany someone, Buy something, P/D something, P/D someone, Eat/drink, Education, Work, At home, Other).]

Figure 7.1 Graphical Representation Used in Validation Interviews

This was developed to assist interviewers and respondents to view the travel day at a glance. In the above example, for instance, it would be easy to check whether the respondent left work for lunch by simply asking "Did you stay in the same place between 9 am and 5 pm?".

The main purpose of these interviews was to obtain information on the under-reporting of trips in the self-completion phase of the survey and thereby to be able to calculate non-reporting weights (Section 9.3).



7.2.2.3 Non-Response Interview Surveys

Finally, a sample of non-responding households can be selected for a personal interview to check on the reasons for their non-response. As shown in Table 7.1, in about half the cases in the Brisbane study area in South East Queensland, the household agreed to complete a travel survey when contacted by the interviewer, and this information is used later in the calculation of non-response weights.

Table 7.1 Response Behaviour of Non-Response Validation Households

Final Response Type                No. of Households    % of Households

Valid Response                     107                  45%
Sample Loss
    No such address                2                    1%
    Vacant                         29                   12%
    Other sample loss              38                   16%
Other Loss
    Non-contact (after 5 visits)   16                   7%
    Refusal                        46                   19%

TOTAL                              238                  100%

(Source: Richardson and Ampt, 1993a)

These non-response interviews have proven to be especially valuable in identifying those households which contained stubborn non-respondents, those who were merely forgetful, and those households which did not actually exist (i.e. sample loss) - again important pieces of information for the weighting process.

Using the above survey design and quality control procedures, average response rates of 65%-75% have been achieved.

7.2.3 Sponsorship of Survey

Obtaining official sponsorship for the survey from a well-known and respected group or individual is likely to markedly increase the response rate, and therefore needs particular attention in self-completion survey design. Non-controversial government authorities, research institutions, universities and public figures are useful sponsors whose name, or letterhead paper, can be used in the covering letter. As noted earlier, if there is any doubt about the impact of these sponsors, it is important to check their effect during a pilot survey.

An interesting example occurred in New Zealand where (at the time of the survey) the Ministry of Transport was not only a policy and planning body, but also the policing authority for all traffic offences. Although it seemed as if the prestige of the Ministry was a positive aspect of the survey design, people who received a letter in the mail that could have been an offence notice did not react positively to the survey - and the sponsoring authority was changed after the pilot survey!

7.2.4 Consideration of Respondents

In general, self-completion surveys have most success where the population under study is literate and concerned with and/or interested in the subject under study. For surveys of the general public, there is evidence to suggest that non-response is highest among the lower socio-economic groups. This reinforces the need for special measures to be introduced to ensure participation by all groups, if this is required by the objectives of the survey, and to carry out follow-up surveys to give information on non-respondents.

7.2.5 Use of Incentives

A special case of the consideration of respondents is the use of incentives. Opinions vary as to whether incentives actually help or retard response rates. Intuitively, the use of a small payment or gift would seem appropriate. There is some evidence to suggest, however, (e.g. Chipman, et al., 1992 and Bonsall and McKimm, 1993) that this is not always the case. It is likely to be safest to use incentives such as special postage (see Section 7.2.10 below). We would argue that one of the best incentives is a survey design where the purpose and layout is easily understood and where it is easy to contact someone if questions need to be asked.

7.2.6 Covering Letter

Since there is no opportunity to personally introduce and explain the questionnaire to the respondent, the use of a covering letter is essential in all self-completion surveys to increase response rate and the understanding of the questions. The letter need not be overly personalised but should be clear, friendly and not officious. Handwritten notes urging reply in reminder covering letters have been found to be effective. In addition, it is sometimes useful to enclose a brochure explaining the survey in a more informal manner than is generally possible in a letter.

7.2.7 Use of Comments Section

The use of a comments section at the end of the questionnaire can often improve response rates by giving an opportunity to respondents to air their own views on the subject, independent of the formal questions which may have been asked in the main part of the questionnaire. These comments may or may not be coded and used in the analysis.



7.2.8 Provision of a Phone-in Service

One of the key elements of the administration of a self-completion survey is the provision of a phone-in service for respondents. Since there is often no other personal contact with survey investigators, it is imperative that this service is available (free, if possible) for as many hours of the day as practicable. Given that most people are away from home during the day, and are likely to be completing the forms at nights and on weekends, it is not at all practical to limit the hours of operation to work times. This may mean having someone on duty during non-work times, or switching the phone through to the private homes of the survey administrators. As noted earlier (Section 7.1), all people who answer the phone need to have comprehensive training in the survey objectives as well as the questionnaire content.

7.2.9 Preparation of Questionnaires for Mailing

For small self-completion surveys, this may be a fairly minor task, but for large-scale household surveys, with complicated reminder regimes (e.g. Section 7.2.1), the preparation of questionnaires for mailing can be a major operation. In this case, piloting of times taken as well as space used, and ergonomic methods of preparation become critical. As an example, for a survey of 20,000 households (gross) using the 6 stage method of reminders it will take about 100 person days to prepare the questionnaires for mailing - and this does not include all reminders, since in some cases it is necessary to know which household needs reminding before customised questionnaires can be prepared!

7.2.10 Type of Postage Used

The type of postage used on outward and return letters can have an effect on response rates. On outward letters, the use of first class mail, priority paid post and registered letter (in that order) can increase response rates by demonstrating the apparent importance of the survey to the respondent. For return letters, three types of postage in increasing order of effectiveness are commonly used:

(i) No return postage paid. This usually results in very low response rates and is not recommended. Those people who do reply are likely to be very much more committed to the survey than those who do not.

(ii) Reply-paid post permit. This method involves payment of postage on return of the letter to the investigator. A licence number must usually be obtained from the Post Office for this method. While reducing postage costs, because only returned letters are paid for, it does not necessarily increase response rate to an optimal level because the respondent feels under no obligation not to waste the investigator's money. There is some evidence to suggest that it may be an effective method when the size of the letter (and hence postage) is quite large. It seems that the temptation to remove stamps (method (iii) below) is greater for these larger letters, making the reply-paid method equally effective (while somewhat cheaper).

(iii) Stamped self-addressed return envelope. This involves placing a normal stamp on the return envelope before the questionnaire is sent out. On receiving this envelope the respondent has three options:
1. Throw it away with the questionnaire;
2. Throw the questionnaire away but steam the stamp off the envelope; or
3. Return the completed questionnaire in the envelope.
Option 1 leaves the respondent with the feeling of having wasted a resource - a perfectly good stamp. Option 2 leaves some respondents with a guilt feeling of having gone to a lot of trouble to get a stamp. The only way to overcome both of these guilt feelings is to adopt option 3. While this method appears to obtain better response rates when the questionnaire is small, the cost of individually affixing stamps to each envelope needs to be compared with simply getting the reply-paid envelopes printed. This process is being monitored in detail in the current Melbourne survey (Richardson and Ampt, 1993b).

7.2.11 Response Rates

As stated before, the objective in postal questionnaires is to increase the response rate. Every effort should be made to achieve this objective. Some valuable references on various aspects of self-completion surveys and response rates are Kanuk and Berenson (1975), Galin (1975) and Richardson and Ampt (1994).

Despite our best attempts to increase survey response rates, there will always be a certain proportion of people who do not respond. If these non-respondents are similar to the respondents, then there is no great cause for alarm since the non-response bias should be minimal. However if, as is usually the case, the non-respondents are atypical of the general population, then steps should be taken to account for the biasing effect of this non-response. It should be emphasised that the similarity between respondents and non-respondents should be assessed not simply on the basis of socio-economic characteristics but on the basis of the survey parameters of particular interest, e.g. in a travel survey, on the basis of trip rates. In a transport context, useful methods of accounting for non-response bias have been proposed by Brög and Meyburg (1981) and Stopher and Sheskin (1982), and are discussed further in Chapter 9.



7.3 ADMINISTRATION OF PERSONAL INTERVIEW SURVEYS

A vital component of the administration of personal interview surveys is the maintenance of quality control by means of various procedures. There are six major items to check to ensure high quality data.

7.3.1 Use of a Robust Interview Regime

A personal interview survey to gain information for transport planning must be one which ensures a high response rate and which is as easy as possible for the interviewer to carry out and for the respondent to take part in. The following is a summary of a methodology for a household travel survey which has had widespread use in several countries since its inception in 1981. It is described in detail elsewhere (Ampt 1981).

It uses a verbal activity recall framework (Chapter 5.3) and is not based on recall of travel, since respondents are notified in advance of the day about which they will be required to give information.

7.3.1.1 Pre-Contact Letter

This letter is sent from the client informing the respondent of the survey and legitimising it, in the same way as it is done for self-completion methods. To add to its authority, the name of the interviewer can also be written on each letter, a measure which has been shown to be particularly conducive to high response rates. This letter is sent so that it arrives several days prior to the first attempt to carry out a pre-contact interview (below).

7.3.1.2 Pre-Contact Interview

At this initial contact, the interviewer gains information on the household, including such things as structure type, the number of household members, and the number and type of vehicles and bicycles in the household. Since the information is entirely factual, a very structured method of interviewing can be used.

In addition to asking for the above information, interviewers also leave Memory Joggers for each person in the household. These Joggers are personalised, diary-like notepads on which respondents keep track of all travel at the level of detail described to them by the interviewer. Finally, appointments are made to speak personally with each household member over a certain age (often 9 years) after the travel day/s. Data for children younger than 9 years is usually accepted by proxy.

This pre-contact interview is best attempted about 3 days before the Travel Day (the day about which the main survey is conducted) to ensure that up to 4 calls can be attempted. This is important because each household must have an opportunity of being found at home (Section 7.3.6). Surveys in which only 1-2 pre-contact calls are attempted suffer serious under-reporting of trips because the people who are most easily found at home are clearly not travelling - at least at that time!

7.3.1.3 Main Interview

The final stage of the personal interview takes place when the interviewer returns to carry out interviews with each household member after the Travel Day. Again it is best to use a structured form if factual travel data is being asked. This means that the Memory Joggers are used by respondents only for the purpose their name implies and it is not necessary for the interviewer to collect them.

In order to eliminate sample bias due to households "choosing" their own travel day, each household is allocated a specific travel date which should not be varied.

7.3.2 Training of Interviewers

This has already been mentioned in Section 7.1 but cannot be stressed enough. The two-fold aim of the interviewer training sessions is to give the interviewers a thorough understanding of the survey objectives and processes, and to teach the specific skills of interviewing related to the survey in question.

7.3.3 Checking of Interviews

Checking whether all interviews have, in fact, been carried out requires some form of follow-up contact with the respondent/s. This can be performed in one of several ways.

First, a sample of respondents could be followed up with a telephone validation check. A few short questions can usually ascertain whether or not the interview has been carried out. It is usually useful to ask some questions of content in addition to whether or not the survey actually took place to ensure that the interview was done as required. The phone check can also be used to ensure that no proxy interviews were carried out, if this was an important objective of the survey (which is usually the case in travel surveys).

The phone validation method should never be used alone since it would be possible for the "astute" interviewer wanting to avoid detection to omit the phone number from any questionnaires, claiming perhaps that respondents did not want to give it. For this reason, a sample of respondents who do not have the phone can be sent postcards containing a small questionnaire. These postcards could ask similar questions to the phone interviews - whether they were interviewed, when it took place, how long it took and what were their impressions of the interview. The main problem with postcards is that respondents may not return them. In particular, those respondents who were not interviewed may think that they have received them by mistake and therefore not bother with them. Ideally, a small sample of non-respondents to these postcards should then be visited. As an alternative method, all respondent/s to be validated could be sent postcards, although combination with the phone validation method is usually the most efficient.

A third means of follow-up is for call-backs to be made to selected households by a field supervisor to check whether the interview has been made. Such a process is however quite a costly method.

Three points need to be made with respect to the use of follow-up contacts.

(a) In general, if interviewers are trained to understand the importance of their task and of the study in general, and are treated as professional members of the study team, there is usually very little difficulty with interviews not being carried out. Quality control measures should, of course, still be carried out, but interviewers should be informed of the results. Often, for example, words of praise are heard of the interviewer's performance, and in any case, general information about the quality control process helps to encourage a high standard of performance from team members.

(b) Furthermore, it is usually the case that the mere statement that random call-backs will be made is enough to dissuade most interviewers from attempting falsification. Whether the follow-up procedures are fully effective in detecting non-contacts may be of secondary importance.

(c) In some cases, validation can be limited to cases where it is suspected that an interviewer may be guilty of falsification. Such suspicion arises after a routine statistical comparison of each interviewer's recorded responses with those obtained from the rest of the interviewers. Just as most people cannot deliberately pick a random sample from a list of numbers, so most interviewers are unable to falsify data such that they produce the correct distribution with respect to each of the survey parameters. Even more difficult is the task of correctly assessing the degrees of multiple correlation between all the parameters. Thus in their attempt to create realistic false data, interviewers give themselves away. Routine statistical assessment of each interviewer's recorded responses should therefore be a standard component of survey quality control. Note that this procedure would be effective only if each interviewer has enough completed interviews to produce statistically reliable comparisons.
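As an illustration only, the routine comparison described in point (c) might be sketched as follows. The sketch compares each interviewer's mean reported trip rate with the mean recorded by all other interviewers, using a simple two-sample z-score. The function name, the choice of trip rate as the survey parameter, the minimum of 30 interviews and the threshold of 3 are all assumptions made for this example and are not taken from the text; a fuller check would also look at distributions and at correlations between parameters.

```python
from math import sqrt
from statistics import mean, stdev


def flag_interviewers(trips_by_interviewer, min_interviews=30, z_threshold=3.0):
    """Return {interviewer: z} where the interviewer's mean trip rate
    differs markedly from the mean of all other interviewers.

    trips_by_interviewer maps an interviewer id to a list of trips per
    person recorded in that interviewer's completed interviews.
    """
    flagged = {}
    for name, own in trips_by_interviewer.items():
        # Pool the responses recorded by every other interviewer.
        rest = [x for other, xs in trips_by_interviewer.items()
                if other != name for x in xs]
        # Skip interviewers with too few interviews for a reliable comparison.
        if len(own) < min_interviews or len(rest) < 2:
            continue
        se = sqrt(stdev(own) ** 2 / len(own) + stdev(rest) ** 2 / len(rest))
        if se == 0:
            continue
        z = (mean(own) - mean(rest)) / se
        if abs(z) > z_threshold:
            flagged[name] = z
    return flagged
```

A flagged interviewer is not necessarily falsifying data (the assigned area may simply be atypical), so a result of this kind would only indicate where validation effort should be concentrated.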

While the comparisons described above are invaluable to assist both the survey administrators and the interviewers, relying on this method to identify problem interviewers is not ideal; a strict regime of quality control is always recommended.

7.3.4 Satisfactory Response Rate

In personal interviews, the response rate (that is, the ratio of completed interviews to assigned interviews, minus any sample loss) should rarely be less than 80%. If the response rate for any interviewer is consistently lower than that obtained by other interviewers in similar circumstances then this is an indication that the interviewer may either be in need of retraining in the art of making initial contact, or else is incapable of making such contacts and should be dismissed.
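The response rate definition can be written out as a short calculation. This sketch assumes that sample loss is subtracted from the assigned interviews before the ratio is taken, which is one reading of the definition above; the function name and the figures in the usage note are hypothetical.

```python
def response_rate(completed, assigned, sample_loss):
    """Completed interviews divided by assigned interviews net of sample loss."""
    eligible = assigned - sample_loss
    if eligible <= 0:
        raise ValueError("no eligible interviews after removing sample loss")
    return completed / eligible
```

For example, an interviewer assigned 520 addresses, of which 20 turn out to be sample loss, and who completes 410 interviews has a response rate of 410 / 500 = 0.82, which is above the 80% guideline.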

The same test can be applied to individual questions within the questionnaire. If too many "don't knows" or refusals are being encountered then it is a sign that either the question is being asked incorrectly, that not enough probing is being used, or that the interviewer's motivation or attitudes are inappropriate. For example, it has been shown that interviewers who, themselves, do not understand the importance of asking income questions have a higher refusal rate on income questions. On the other hand, those who understand its significance rarely encounter a refusal for this question. Once again, retraining or detailed explanations of individual questions might be the appropriate remedy for recurring problems.

7.3.5 Correct Asking of Questions

Poor asking of questions is perhaps the most difficult point to check and remedy by follow-up procedures. The most effective steps in preventing it occur prior to the actual execution of the survey.

(a) At the design, planning and pilot stage the testing should include a check of the "comfort" which interviewers have with the flow of the question. When this is well controlled, there is usually little difficulty in the field.

(b) During training, a great deal of stress placed on the importance of verbatim question asking, plus some hours of practice, also serve to minimise the problem.

Once again, an indication of problems during the survey itself may be gleaned from statistical analysis of the data or from examination of the completed questionnaires. The task of definitely identifying and remedying the situation is, however, not straightforward. Several methods are available, each of which has distinct deficiencies. Re-interviewing of the respondent by a high-level interviewer/supervisor may pick up gross errors. However, there will always be differences between any two interviewers (even if they are both of a high standard) and it must also be assumed that the respondent stays constant for both interviews. This assumption may not be true because of lapses of memory by the respondent and possible genuine changes of attitude.

The other methods of validation all involve observation of the interviewer while an interview is in progress. This interview may be a real one in the field or a test interview in the office, while the method of observation may be by a supervisor or by using a tape recorder. While these methods may pick up errors which the interviewer is genuinely unaware of, it is difficult to detect the interviewer who is just getting lazy or careless. The mere fact that the interview is being observed will generally result in these tendencies disappearing (for a while). However, generally, a series of well-announced field supervisions serves to keep the standards high.

7.3.6 Number of Interviewer Call-Backs

An important decision in the administration of personal interviews is the number of call-backs which interviewers are required to make before determining that the household or person is not able to be contacted - often termed "a non-contact".

Theoretically, given time, the interviewer would eventually be able to reach all members of a household. This is generally not cost-efficient.

On the other hand, there is ample evidence to suggest that there is a significant difference in travel behaviour between those people who are interviewed on the first visit and those interviewed at second and subsequent attempts (e.g. Brög and Ampt, 1983), and it is therefore advisable to make up to 3-4 attempts at different times of the day and week. Interviewers who understand the principles behind high response rates have been shown to be ingenious in finding elusive respondents (particularly shift-workers) at home.

7.4 ADMINISTRATION OF TELEPHONE INTERVIEW SURVEYS

If telephone interview surveys are being conducted, four specific areas of administration need to be considered. They are sampling, dealing with non-response, maximising data quality, and several operational aspects.

7.4.1 Sampling

As the use of telephone interviewing has grown in the last 25 years, telephone sampling methods have increased in diversity. Early telephone sample designs used telephone directories as the sampling frame because they were readily available and they were thought to contain a "representative" selection of the telephone household population. Given that telephone interviewing is only dealing with those people who have phones, and that problems associated with using this sampling frame have been discussed in Section 3.2.4, the problem of non-phone ownership will not be considered here.

Hence, when the proportion of unlisted (ex-directory) phone numbers is small, the representativeness argument is fairly persuasive, particularly considering the trade-offs in convenience and cost over other methods. However, in many countries the frequency of unlisted numbers has increased to levels that raise concern about the accuracy of estimates based on directory samples. Three other approaches are therefore possible in most countries:

(a) Random digit dialling. This provides coverage of both listed and unlisted telephone households by generating telephone numbers at random from the frame of all possible telephone numbers. In some areas, local telephone agencies can assist in providing ranges and constraints to these numbers, which greatly increases the efficiency of the task (e.g. Stopher 1985a).

(b) List-assisted designs. These designs use information in telephone directories to generate telephone number samples that include both listed and unlisted telephone households.

(c) Multiple frame sampling. This method basically combines the directory and random digit dialling sampling frames into a single design.
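A minimal sketch of approach (a) is given below, assuming that the telephone agency has supplied a list of valid number prefixes and that every prefix covers an equally sized block of numbers. The function name, the prefixes in the usage note and the four-digit suffix length are hypothetical; real numbering plans differ between countries.

```python
import random


def random_digit_sample(prefixes, n, suffix_digits=4, seed=None):
    """Draw n distinct telephone numbers at random from the given prefixes.

    Each number is a prefix followed by a random suffix, so listed and
    unlisted numbers within the prefix ranges are equally likely.
    """
    prefixes = sorted(set(prefixes))
    capacity = len(prefixes) * 10 ** suffix_digits
    if n > capacity:
        raise ValueError("more numbers requested than the frame contains")
    rng = random.Random(seed)
    numbers = set()
    while len(numbers) < n:
        prefix = rng.choice(prefixes)
        suffix = rng.randrange(10 ** suffix_digits)
        numbers.add(f"{prefix}{suffix:0{suffix_digits}d}")
    return sorted(numbers)
```

A call such as random_digit_sample(["0398", "0399"], 50, seed=1) would return 50 distinct eight-digit numbers. Many of the numbers generated in practice will be businesses or non-working numbers, which is why the treatment of sample loss in Section 7.4.2.1 matters for this method.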

An excellent review of probability sampling methods for telephone household surveys in the United States is given in Lepkowski (1988). There is, in fact, a whole body of literature dealing with sample designs, comparing efficiency, cost factors and quality of the samples, as well as the relationship of the sample design to non-response rate (e.g. Groves, et al., 1988) which is worth pursuing if telephone sampling is to be an important part of the transport survey process.

7.4.2 Dealing with Non-Response

As noted in Section 3.2.4, telephone survey non-response has become increasingly troublesome and threatens to eliminate the unique property of sample surveys - statistical inference to a known population. This section focuses on non-response in cold household telephone surveys. It will omit discussion of telephone surveys of non-household populations such as businesses which were briefly mentioned in Chapter 3 as generally having higher response rates, in any case.



7.4.2.1 Sample Loss

It is important at the outset to realise that phone calls to households result in sample loss - non-response which does not affect the quality of the sample. There are a number of reasons for this:

- a person may answer the phone and report that the number is a business,

- a message may report that the phone number does not exist, or has been changed,

- in random digit dialling in some countries, it is possible to gain a ringing tone from a non-valid number.

These cases can be treated as "sample loss" since they do not affect the quality of the sample being surveyed.

7.4.2.2 Factors Affecting Non-Response

Part of the design of a telephone survey involves understanding factors which may affect non-response. They are, of course, many and varied, and as we mentioned earlier, the important ones are those which affect the values of the parameters being measured (usually travel behaviour of some type). Included among these factors are some which relate to respondents ((a) and (b)) and some which relate to the characteristics of the interviewers ((c) to (e)).

(a) Education. There is some evidence to suggest that there is a higher non-response rate among lower education groups (Cannell et al., 1987).

(b) Urban-rural differences. While there is evidence that there is a higher non-response in large urban areas than in other areas for personal interviews, this seems to be diminished in phone surveys where urban-rural differences are minimal.

(c) Interviewer voice quality may affect levels of cooperation. Overall, interviewers rated as speaking loudly, with standard local pronunciation, and perceived as sounding competent and confident, have lower refusal rates than those with the opposite patterns. With regard to intonation patterns, interviewers using a falling tone on key words early in the introduction have lower refusal rates than those using a rising tone (Oksenberg and Cannell, 1988).

(d) Interviewer gender and experience. In general females seem to gain higher response rates than males, and more experience is also associated with better response rates.

(e) Interview length. Surprisingly, there is very little evidence to support the strong intuitive belief that shorter interviews are always more successful. In the U.S., researchers generally use shorter interviews for telephone surveys than for personal interviews, although Swedish researchers encounter less resistance to long telephone interviews. In the U.K. there is evidence of a 5% lower response rate to a 40-minute telephone interview as compared to a 20-minute interview (Collins et al., 1988).

7.4.2.3 Methods of Dealing with Non-Response

Several methods can be used to deal with some problems of refusals in telephone interviews.

(a) Timing of calls. When phone calls interrupt specific events like meals they can lead to rapid refusals. For this reason, method (d) below is often a useful adjunct in the survey process.

(b) There is some evidence that the strategy of making appointments to defer the initial resistance to the phone call can reduce initial non-response.

(c) Advance letters (similar to the letters advocated for personal interview and self-completion surveys) sent to respondents to explain the survey have been shown to reduce non-response since they reduce the element of surprise (Clarke, et al., 1987).

(d) Attempts by a second interviewer to convert those who initially refuse to take part, and

(e) The selection and training of interviewers to take into account the types of differences mentioned above.

7.4.3 Improving Data Quality

There are several things which may affect data quality in phone interviews which are different from those found in other modes of data collection.

The first relates to the way in which people respond to sensitive or threatening questions. There are two lines of argument about this effect. On the one hand, it is argued that because respondents cannot see the facial cues which are evident in face-to-face personal interviews, they experience some anxiety that they might not be giving the right answers, and may be less likely to respond. However, on some topics, this very fact may encourage people to give this type of information more freely. Thus, for items or topics particularly susceptible to such effects, the telephone may hold certain advantages over personal interviews. Pilot testing will obviously help to understand each particular case better.



Furthermore, it has been shown that there is a tendency to obtain truncated or shorter responses to open questions administered over the phone, than to the same questions asked in person. Care needs to be taken to consider this during the question design stage.

Finally, there is the issue of response scales where, in a personal interview, people would normally be presented with a show card. One way to deal with this is to read out the scale points, although another method is to ask the question in two stages. For example, a five point "satisfaction-dissatisfaction" scale would first be presented with three response options (satisfied, dissatisfied and neither satisfied nor dissatisfied). Those answering "satisfied" would then be asked: "Is that very satisfied or just satisfied?", and similarly for dissatisfied.
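The two-stage question can be recombined into a single five-point score at the coding stage. The sketch below is illustrative only: the function name, the answer codes ("satisfied", "dissatisfied", "neither", "very", "just") and the convention that 5 means very satisfied are assumptions made for the example.

```python
def five_point_score(first_answer, intensity=None):
    """Combine the two-stage telephone answers into a 1-5 score.

    first_answer: "satisfied", "dissatisfied" or "neither"
    intensity:    "very" or "just" (not asked when first_answer is "neither")
    """
    if first_answer == "neither":
        return 3
    if intensity not in ("very", "just"):
        raise ValueError("second-stage answer must be 'very' or 'just'")
    if first_answer == "satisfied":
        return 5 if intensity == "very" else 4
    if first_answer == "dissatisfied":
        return 1 if intensity == "very" else 2
    raise ValueError("unknown first-stage answer")
```

Coding the answers this way keeps the telephone results comparable with those from a show card presenting all five points at once.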

7.4.4 Other Aspects of Administration

One of the key components of telephone interviewing is the use of a computer-assisted telephone interviewing (CATI) system to assist interviewers and their supervisors in performing the basic data collection tasks of the telephone surveys. In typical applications the interviewer is seated at a computer terminal, wearing a telephone headset. As survey questions are displayed on the screen, the interviewer reads them to the respondent and enters responses via the keyboard. In most systems, question wording and sequencing between items is computer controlled based on prior entries, and answers can be checked for logic and range errors as soon as they are input - meaning that inconsistencies can be checked on the spot. For further information on these systems refer to Baker and Lefes (1988).
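The kind of range and logic checking that a CATI system applies at the moment of entry can be illustrated with a small sketch. It does not represent any particular CATI package; the field names (person_no, depart_min, arrive_min), the use of minutes after midnight, and the assumption that a trip does not run past midnight are all hypothetical choices made for the example.

```python
def check_trip_entry(entry, household_size):
    """Return a list of problems found in one trip record (empty if none).

    entry is a dict with keys person_no, depart_min and arrive_min,
    where times are minutes after midnight on the travel day.
    """
    errors = []
    # Range checks: each value must lie within its permitted bounds.
    if not 1 <= entry["person_no"] <= household_size:
        errors.append("person number outside household")
    for key in ("depart_min", "arrive_min"):
        if not 0 <= entry[key] < 24 * 60:
            errors.append(f"{key} outside travel day")
    # Logic check: answers must be consistent with one another.
    if entry["arrive_min"] <= entry["depart_min"]:
        errors.append("arrival is not after departure")
    return errors
```

If the list returned is not empty, the interviewer can query the inconsistency while the respondent is still on the line, rather than leaving it to be found during later editing.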

The questionnaire design using this system requires careful management. For example, it needs to cater for the need to go back to verify information and to correct interviewer and respondent error. Without extensive testing, this can cause significant problems.

Finally, many supervisors complain that with a CATI survey that uses a sample-management schedule, they lose the "feel" of the progress of the survey. Instead of piles of cards to sort, they must deal with more abstract queues and reports. It is important that these people work together with the programmers to be sure that their decisions are implemented correctly and have the desired effects. As systems are becoming more sophisticated, there are also many cases where computer graphics can assist in the alleviation of this problem.

7.5 ADMINISTRATION OF INTERCEPT SURVEYS

The administration of intercept surveys - those which are carried out with people that have made a choice of a certain mode of transport - also requires special care if high quality data is to be obtained. As mentioned earlier, these surveys can either be self-completion or personal interview. The discussion in this section is relevant to both of these types.

7.5.1 Sampling

Although the theory of sampling has been discussed extensively in Chapter 4, some of the specific issues relating to intercept surveys are worth reiterating at this point.

7.5.1.1 On-Board Surveys

For surveys on-board a mode of transport (such as a bus, train or plane), there are two specific issues. The first relates to the choice of the vehicle itself. Which buses/trains/planes are to be selected from the many which operate over a given time period if the intercept survey is to be representative of that time? One option is to survey on all vehicles. While this may be practical if the study period is brief, it is generally both more time consuming and more costly than necessary for most surveys. Unfortunately, the large quantity of data obtained is often assumed to make up for any biases caused by a lower than desired response. Furthermore it is also often assumed that the survey on each vehicle in the sample must cover the entire operating day, or as much of it as is important for the collection. This also usually constitutes an unnecessarily large sample size and survey cost and, in general, these methods are not supported by a thorough understanding of sampling issues and procedures.

An excellent method of selecting vehicles is described in Stopher (1985b). Two levels of understanding are necessary. The first is a clear recognition that the sampling unit is the passenger not the vehicle; and the second is to know and understand the implication of the operational scheme of the transport system being surveyed. Knowing these two characteristics of the population makes it possible to develop a system in which specific routes of a transit service are selected using a system of stratification of routes or runs. A multi-stage sampling technique is then used to select specific bus/train/tram runs or services. Using this method, the process of expansion to the entire system is not as straightforward as it would be if a simple random sample of vehicles had been selected. However, provided that the way in which the sample was selected is known, expansion can readily be done and is described in more detail in Kaiser Transit Group (1982).

7.5.1.2 Roadside Surveys

Another specific sampling problem occurs with surveys in which vehicles are stopped at the side of roads or at signalised intersections. This is much more difficult than the case of on-board surveys since the vehicles in question in this case do not behave according to a set timetable.



Theoretically, either all vehicles or a random sample of vehicles should be surveyed, because stratification of any type is usually not possible while standing at the roadside. Since both of these can prove difficult it is important that three controls are always practised.

(a) There should be no systematic omission of any specific vehicles. This means that if vehicles are being stopped at the roadside, all vehicles in a given period should be stopped, and then a group missed, rather than allowing the traffic control person to select what they may consider to be "friendly" vehicles.

(b) There should be classification counts done in parallel with the surveys to determine whether the surveyed vehicles are representative of the entire fleet of vehicles passing the point during the survey period.

(c) As in all surveys, any refusals should be monitored.
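One way of using the classification counts in control (b) is to compare the mix of vehicle classes among the surveyed vehicles with the mix in the count. The sketch below computes a Pearson chi-square statistic for that comparison, treating the count proportions as the expected distribution. The function name and the vehicle classes in the usage note are hypothetical, and the choice of a chi-square test is an assumption of this example rather than a method prescribed in the text.

```python
def chi_square_vs_counts(surveyed, counted):
    """Pearson chi-square statistic comparing surveyed vehicles by class
    with the class proportions observed in the classification count.

    Both arguments map a vehicle class to a number of vehicles; every
    class in `counted` must have a count greater than zero.
    """
    n_surveyed = sum(surveyed.values())
    n_counted = sum(counted.values())
    statistic = 0.0
    for vehicle_class, count in counted.items():
        # Number of surveyed vehicles expected in this class if the
        # survey mirrored the classification count exactly.
        expected = n_surveyed * count / n_counted
        observed = surveyed.get(vehicle_class, 0)
        statistic += (observed - expected) ** 2 / expected
    return statistic
```

With a count of 800 cars, 100 trucks and 100 buses, and a survey of 90 cars, 5 trucks and 5 buses, the statistic is 6.25. With three classes there are two degrees of freedom, for which the 5% critical value is 5.99, so this sample would be judged to under-represent trucks and buses.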

Furthermore, as part of the survey sampling design, it needs to be clear whether the sampling unit is the driver or all persons in the vehicle, and the survey method (including the instrument design) needs to take this into account. This depends to a large extent on the objectives of the survey. Route choice, for example, may be largely decided by the driver, but other decisions about the travel (e.g. destination and time of day) may be a decision of all the passengers, not just the driver.

7.5.1.3 At Activity Points

Intercept surveys which take place at activity points such as shopping centres or airports are probably the most difficult from the point of view of sampling. In almost all cases the sampling unit here is the individual, and while finding people to interview is rarely a problem, keeping control of the population from which this sample is selected is much more difficult.

Because of this difficulty there has been very little research done on ways to improve the sampling method, and many reports simply make do with an over-representation of the people who are easy to find (like meeters and greeters at an airport) and under-representation of those who are difficult (like business people). This is not serious when the total population of each sub-group is known, since the data can be expanded appropriately. However, these cases are not common.

A method which can sometimes be useful to gain this information is to find a point at the activity centre where all persons need to pass before entering or leaving. While it may not be possible to carry out the interviews at the level of detail required at these points, it is often possible to ask one or two screening questions to gain information on the population of those people in the centre, thereby getting a base point for expansion.
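Once the screening questions have given an estimate of the population in each sub-group, expansion factors follow directly. The sketch below assumes that each sub-group's factor is simply its estimated population divided by the number of completed interviews in that group; the function name and the figures in the usage note are invented for illustration.

```python
def expansion_factors(population_by_group, interviews_by_group):
    """Return {group: population / interviews} for each sub-group."""
    factors = {}
    for group, population in population_by_group.items():
        interviews = interviews_by_group.get(group, 0)
        if interviews == 0:
            # A group with no interviews cannot be expanded at all.
            raise ValueError(f"no interviews obtained for group {group!r}")
        factors[group] = population / interviews
    return factors
```

If screening at an airport entrance suggests 600 meeters and greeters and 900 business travellers, and 120 and 45 interviews respectively were completed, the factors are 5 and 20: each business traveller interviewed represents four times as many people as each meeter or greeter, which corrects for the over-representation of the group that is easy to find.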

Although these methods seem time-consuming, they need to be considered if the data collected at these centres is not to be disregarded for its lack of representativeness.

7.5.2 Training of Staff

The staff training for intercept surveys is as important to the success of the survey as for all other survey methods. It is, however, frequently relegated to a quick gathering of personnel half an hour prior to commencement of the survey. This results in poor performance by the staff and frequent "no-shows" with all their attendant problems.

Training sessions are used, as in other surveys, to give the staff information on the background to the survey. A knowledge of sampling methods is also often very important. In surveys at public transport hubs, unreliability of timetables often means that the staff will have to take back-up measures under some circumstances and understanding how and why a sample is chosen can be crucial to their making the correct decision.

Furthermore, in cases where a self-completion questionnaire is being distributed, particularly where this is happening on a moving vehicle, it is important that the training sessions are used to actually practise the distribution of forms. Get all the "passengers" to file past an interviewer at the speed people board a bus and check that the interviewer can keep up! The art of giving out forms (and pencils) to all people boarding a bus in peak hour - with the appropriate smile of encouragement - is not for the faint hearted, but can be easily mastered with a little practice!

7.5.3 Other Administrative Details

There are several other details which need to be considered in the administration of intercept surveys.

(a) The surveyors need to have writing implements that work under the weather conditions being experienced at the intercept site (felt-tipped pens smudge in the rain, and biros do not work on wet paper or in temperatures less than about 3°C).

(b) If you are distributing pencils to respondents on a moving vehicle, consider square rather than round ones to avoid their rolling when dropped.

(c) If weather will be a problem, make sure there is protection for the surveyors.

(d) In all cases the surveyors will need somewhere to put their personal belongings during their work periods.

(e) Finally, do not forget to record the non-response. Without these details, expansion to the population will not be meaningful.

7.6 ADMINISTRATION OF IN-DEPTH INTERVIEW SURVEYS

In-depth interview surveys, while always incorporating personal interview surveys of one form or another, require special care from an administrative perspective. A comprehensive description of the conduct of these surveys, not exclusively in the field of transport, is given by Gordon and Langmaid (1988).

Figure 7.2 shows the various stages involved in a typical in-depth interview procedure and highlights the complexity of the process. Correspondingly, the administrative procedures need to include the following components.

7.6.1 Organisation of "Pre-Interviews"

This phase begins with the selection of the sample of people to be interviewed. Depending on the survey objectives, and bearing in mind that one of the key uses of in-depth interviews is exploratory (Chapter 3), the sample is often not random and consists of people with known travel characteristics (who are likely to be most affected by changes to the status quo).

The pre-interview phases can often include the gathering of information on current travel behaviour which requires the distribution of self-completion diaries, or the organisation of personal interviews to collect travel data if needed. Given the length of the main in-depth interview, it is generally advisable to keep the pre-interviews completely separate from the main survey.

7.6.2 Preparation for the Main Interview

This will involve preparation of any interview aids, such as the display boards in the HATS example (Figure 7.2), or of any other aids which will be needed in the interview. If a simulation approach is used the data from the pre-interviews can, in some cases, already be set up ready for immediate use. Other in-depth methods may include the use of physical models to help understand certain changes which are likely to occur (e.g. an architectural model of a High Street before and after the introduction of clearways). All of these items need to be prepared in advance, and often in a new configuration for each interview. Another important part of the preparation for this interview is the availability of highly sensitive (yet portable) tape recorders.



[Figure 7.2: flow chart of the Pre-Interview, Main HATS Interview, Post-Interview and Analysis stages, with the topics investigated at each stage]

Figure 7.2 Typical HATS Interview Procedure (Source: Jones, 1985)

7.6.3 Training

In comparison with structured personal interviews, the training of the interviewers for in-depth surveys needs to be much more extensive. Frequently a detailed understanding of the political context of the topic under exploration, as well as of the survey components, is a necessary prerequisite for an in-depth interviewer. For this reason, and also given the relatively small sample sizes usually used in in-depth studies, it is common that the investigator is the only interviewer on a given project.

7.6.4 Preparation of Questionnaires/Data

After the interview, interviewers need to be given plenty of time to record any important details which emerged from the interview and which may not be recorded on the tape. For example, if people's changes in behaviour are being recorded on board games it might be necessary to record in writing any revisions which respondents make.

7.6.5 Time and Resources for Transcriptions

Both time and resources need to be set aside for the transcriptions of the tapes after the interview. As a rule of thumb, it takes just over twice the actual interview time to make the transcription. If appropriate, this has been found to be effectively done by the interviewers themselves, since tasks such as attributing voices to the correct people are more easily done by them than by other people.

7.6.6 Report Writing

Report writing is a component of the in-depth interview which is critical to the quality of the survey. On the one hand, reports need to record sensitively the feelings and other qualitative information presented by respondents. On the other hand, there are many quantitative components of the method, and these should not be disregarded. For example, the changes in behaviour which may occur after the introduction of a new measure during the interview (e.g. reactions to a doubling of travel time) can often be quantified quite easily. The output of methods like the situational approach (Figure 3.3) gives an excellent example of this type of quantification.

7.7 COMPUTERS AND THE CONDUCT OF TRAVEL SURVEYS

Whereas, as recently as 15 years ago, computers were not used at all extensively in the administration and conduct of surveys for transport planning, they are now used increasingly in all phases, from survey design and management to data collection. An excellent description of the uses up until 1990 is given in Jones and Polak (1992). Computers are, of course, also used for data entry, but this is described in more detail in Chapter 8.

7.7.1 Survey Design and Management

The sampling frame for most surveys - particularly those of a larger scale - is almost always made available in computer-readable format. Whether the samples are from voting or utility lists, postal address lists or telephone records, it would be rare that they are not provided on disk or tape. One of the chief (and often frustrating) tasks of survey administrators is to ensure that the data is comprehensible.

Another important area for computer applications is survey management and control, which covers the allocation of work and monitoring of returned questionnaires. This process has been described in Section 7.1.



7.7.2 Data Collection

As Jones and Polak (1992) point out, many types of transport survey already make use of computers for rapid data capture and entry and their role is well-established. Among these are:

(a) Telephone interviews. Computer assisted telephone interviewing (CATI) has already been discussed and is now the norm in many countries, although its application in transport is more limited than in other areas. The ability to combine the data collection and management functions is a particular strength. Non-answering numbers can automatically be stored, quality control through listen-in procedures is very straightforward, sequencing can be automatic, and data entry can occur as the interview progresses.

(b) On-board surveys. In the U.K. and many other countries, simple on-board surveys to collect passenger profiles on bus and rail are often carried out using light pens for data capture linked to a small portable computer. Hamilton (1984) gives a good example of using a computer-based bus passenger monitoring system in the north east of England.

(c) Traffic surveys, while not the specific subject of this book, often use hand-held computers. These can be used for such diverse surveys as traffic classifications or counts, turning movements, or parking surveys.

The use of computers in the home and for personal interviews in general is a relatively recent phenomenon. In some fields, however, it has had remarkable results. For example, the Dutch Labour Force Survey has had a fully computerised household interviewing system since 1987 in which about 25,000 people are interviewed each month. Data is down-loaded each night via modems from the 400 laptop computers kept by the interviewers, and each month's data is ready for use nine days after the end of each month (Keller and Metz, 1989).

The computer can be used in two main ways in travel surveys.

(a) It can be used as though it were an electronic questionnaire. In this type of survey the questions and interview style are kept the same as they would be in a conventional paper survey, with the interviewer reading from the computer screen rather than from the questionnaire. This was the earliest approach and it was sometimes used to help design experiments in stated preference exercises (e.g. Ampt et al., 1987; Jones et al., 1989).

(b) Both the interviewer and the respondent can use the screen, enabling, among other things, the use of graphics and video scenes. Examples of this type of use can be found in Polak et al. (1989) and Polak and Axhausen (1990).

Both methods can add quality to the data being collected because they eliminate any potential sequencing errors made by respondents or interviewers and because they also do away with the need for data entry, thereby saving cost and reducing one phase where errors can be introduced.

Another very specific use of computers in travel surveys exploits their ability to adapt the questions to people's responses. For example, in stated preference surveys (Section 5.5.4) respondents' answers to questions on facts about a specific trip (such as travel time, wait time, length or cost) can be used to develop hypothetical scenarios for future changes to the system which are based on people's current travel. For example, if people are currently travelling 15 minutes on a bus line and the operator is trying to check how people would respond to a more circuitous route (with more helpful drivers), the new travel times offered them would be based on the current 15 minutes (perhaps +10% or +20%). In the past, without computers, all respondents tended to get presented with the same absolute changes in travel time, meaning that they were often very unrealistic. Much more reliable data is possible with the aid of this facet of computer interviews.
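The pivoting of scenarios on a respondent's reported travel time can be sketched in a few lines of Python. This is only an illustration: the function name pivot_travel_times and the rounding to one decimal place are assumptions, and the +10% and +20% levels are simply the ones mentioned in the text.

```python
def pivot_travel_times(current_minutes, pct_changes=(0.10, 0.20)):
    """Return hypothetical travel times based on the respondent's current time.

    pct_changes are proportional increases (0.10 means +10%); the values
    used here are the illustrative ones from the text, not a recommendation.
    """
    return [round(current_minutes * (1 + p), 1) for p in pct_changes]


# A respondent who currently travels 15 minutes is offered 16.5 and 18.0 minutes;
# a respondent who travels 40 minutes is offered 44.0 and 48.0 minutes.
short_trip = pivot_travel_times(15)
long_trip = pivot_travel_times(40)
print(short_trip, long_trip)
```

A fixed absolute change (say, 10 extra minutes for everybody) would be a 67% increase for the first respondent but only 25% for the second, which is the kind of unrealistic scenario the proportional approach is meant to avoid.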



8. Data Processing

Once the field survey has been conducted and the completed interviews or questionnaires begin to flow into the survey office, it is time to begin the often tedious task of editing, coding and analysing the results. Although the physical component of this task begins now, it should be realised that the planning phase should have been completed much earlier in the survey process. It is too late to start designing the coding and analysis procedures once the completed questionnaires begin to arrive. Rather, these tasks should have been largely finalised when the design of the questionnaire and the sample was being considered. Indeed, attention given to these tasks at those earlier stages will greatly facilitate the smooth completion of these tasks now.

The task of transforming completed questionnaires into useable results is composed of several discrete tasks including initial editing of questionnaires, coding, data entry, computer editing, data correction, analysis, interpretation of results and preparation of reports. In addition, to enable use of the data for secondary analysis at a later date, it will be necessary to arrange for satisfactory storage of the data. This chapter will concentrate on the coding and editing of data in preparation for analysis. Later chapters will examine the tasks involved in the latter stages of data analysis.


8.1 INITIAL QUESTIONNAIRE EDITING

Before the completed questionnaire from a personal interview survey reaches the survey office, it should already have been subjected to two forms of editing: field editing and supervisor editing. Field editing is performed by the interviewer and is used to check on the completeness of responses, to expand on notes jotted on the questionnaire by the interviewer and to check for legibility. Such field editing should be done as soon as possible after completion of the interview, so that problems can be cleared up while the interview is still fresh in the interviewer's mind. In many cases, the data can be checked after the interviewer returns to their vehicle and before they drive to the next interview (assuming they are driving). In this way, if any information is found to be missing or unclear, the interviewer can return immediately to the household and obtain the missing data or clarify the uncertainty. To avoid introducing interviewer bias at this stage, it is essential that in supplying missing information the interviewer only enters responses which were given in the interview but which, in the heat of the moment, were not recorded. No attempt should be made to infer what the answers "should" have been.

Supervisor editing should be carried out as a quality control check, and to ensure that the completed questionnaire sent on to the survey office for coding is legible, complete and consistent. If the supervisor detects persistent errors by the same interviewer then it is possible for this error to be drawn to the attention of that interviewer before too many more interviews have been conducted. It may be necessary to request that the interviewer attend a retraining course if the error rate is high. On the other hand, if a similar error or inconsistency is detected over a number of interviewers, this may point to an inadequacy in the question wording or in the instructions for asking or recording answers to that question. This would call either for rewriting of the question and/or instructions, or for retraining of all interviewers with respect to this question.

In these days when most survey data analysis is performed by computer, it is usual to suspend editing activities at this stage after these two types of editing. More complete editing is performed after the data has been transferred to the computer, because the computer is far quicker and less error-prone in picking up errors and inconsistencies than an individual human editor could ever hope to be. For this reason, we now turn attention to the coding of the data.


8.2 CODING

Coding is the translation of data into labelled categories suitable for computer processing. In the overwhelming majority of cases, this means numerical labelling. It should be realised, however, that in many cases the labels used, whilst numerical, are purely nominal; they do not necessarily imply an underlying numerical relationship for the categories of the variable in question - for example it would be ridiculous to compute a mean value for marital status from the numerical codes assigned to each state. Similarly, the conventional coding of sex as male (1) and female (2) does not imply a value ordering of the sexes!

In many cases, however, the codes do possess more than a nominal property. Three other types of code are possible: ordinal, interval and ratio codes. Whilst nominal codes merely attach labels to data items, ordinal, interval and ratio codes carry more information about the data item.

In addition to naming the data items, ordinal codes rank the data items in some order and hence allow the mathematical operations of equality, inequality, greater than and less than to be applied to the codes. An example of an ordinal code in travel surveys is the use of a priority ranking of modes which can then be used in the "trip-linking" process. Thus, in addition to giving a numerical label to each mode, the codes also order the modes in the order of priority to be used in the linking process, e.g. train=1, bus=2, car=3, bicycle=4, walk=5.
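As a small illustration of how such an ordinal code might be used, the following Python sketch picks the main mode of a linked trip from the modes of its stages. The mode codes are those given above; the function name main_mode and the rule that the lowest code takes priority are assumptions made for this example.

```python
# Ordinal mode codes from the text: a lower code means a higher priority.
MODE_PRIORITY = {"train": 1, "bus": 2, "car": 3, "bicycle": 4, "walk": 5}


def main_mode(stage_modes):
    """Return the highest-priority mode among the stages of one linked trip."""
    # Only the ordering of the codes is used here, never their differences.
    return min(stage_modes, key=MODE_PRIORITY.__getitem__)


# A trip made up of walk, bus, train and walk stages is classified as a train trip.
linked = main_mode(["walk", "bus", "train", "walk"])
print(linked)
```

Note that the sketch compares codes but never adds or subtracts them, which is all that an ordinal code permits.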

Interval (or cardinal) codes go one step further in that, in addition to ranking the data items, they also impute measures of separation to the numerical differences in the code values. The mathematical operations of addition and subtraction are therefore possible with interval codes. An example of an interval code is the time of day attached to the start and end of trips. These time codes order the sequence of start and end times and can also be subtracted to obtain the travel time on the trip. Similarly, the year of birth of a respondent is an interval code, which when subtracted from the year in which the survey is performed gives the age of the respondent at the time of the survey.

The most general form of code is a ratio code which imputes meaning to the order, differences and ratios of code values. Thus all four mathematical operations (+, -, ×, ÷) are possible on ratio code values. Many codes used in travel surveys are ratio codes since they represent the quantities of physical attributes, e.g. distances of trips, number of cars in household, age of respondents. Note that ratio codes are often obtained by the subtraction of interval codes, e.g. obtaining age from the reported year of birth.

In devising a coding procedure it is therefore important to know how the codes will be used in analysis, so that an appropriate choice of code type can be made at this stage. This is particularly the case for scales used in attitudinal questions. If it is desired to carry out a range of mathematical operations on the answers to attitudinal questions, then it is necessary that these answers are in the form of ratio codes. This can be done by suitable selection of the type of attitudinal scale questions to be used in the survey.
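The four code types and the operations each supports can be summarised in a short Python sketch. The table PERMITTED restates the operations listed in the preceding paragraphs; the function safe_mean, the survey year of 1994 and the sample years of birth are hypothetical and serve only as an illustration.

```python
from statistics import mean

# Operations that are meaningful for each type of code, as described in the text.
PERMITTED = {
    "nominal":  {"=", "!="},
    "ordinal":  {"=", "!=", "<", ">"},
    "interval": {"=", "!=", "<", ">", "+", "-"},
    "ratio":    {"=", "!=", "<", ">", "+", "-", "*", "/"},
}


def safe_mean(codes, level):
    """Compute a mean only where addition is meaningful for the code type."""
    if "+" not in PERMITTED[level]:
        raise ValueError(f"a mean is not meaningful for {level} codes")
    return mean(codes)


# Year of birth is an interval code; subtracting it from the survey year
# (1994 is assumed here) gives age, which is a ratio code.
SURVEY_YEAR = 1994
ages = [SURVEY_YEAR - year_of_birth for year_of_birth in (1950, 1972, 1988)]
print(ages, safe_mean(ages, "ratio"))
```

Calling safe_mean on marital status codes with level "nominal" raises an error rather than returning a meaningless average, which is the kind of check that is easy to build in once the code type of each variable has been recorded.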

8.2.1 Coding Method

In devising a coding procedure, it is important to first decide on the general method which will be used for coding and data entry. Several options are open to the survey designer, each of which will entail a different amount of effort at the coding stage:

(a) Mark-sense or optical-scan questionnaire forms
For questionnaires which consist entirely of closed or field-coded questions, it may be possible to use questionnaire forms on which all answers are recorded directly in pre-specified locations. Each question would have a limited number of categories and the recording of answers would be done by filling in the square or circle corresponding to that category for that question. The questionnaire forms would then be fed directly into an optical scanner attached to the computer whereupon the code categories selected would be transferred directly to a data file ready for editing and analysis. The use of this method is possible only for highly structured questionnaires and is dependent upon the availability of a scanner. Where applicable, however, it does provide a very rapid turnaround between data collection and calculation of results.

(b) Self-coded questionnaires
The next step down in automation of the coding procedure is the use of self-coded questionnaires where the respondent, or interviewer, records answers directly into boxes printed on the questionnaire. These boxes are often identified by small numbers indicating the columns into which the data will be recorded onto "card images" within a data file (the idea of card-images is a hangover from the days of mainframe computers, where data was often punched onto 80-character computer cards, which were then grouped into decks of cards representing the entire dataset). The advantage of self-coded questionnaires is that data entry personnel can type information into the computer directly from the questionnaire, thus minimising the need for the use of coders (the distinction between coders and data entry personnel is also a hangover from mainframe days, where the computer cards were typed out by datapunch operators who took no part in the coding of the data from the questionnaires; these days, coding and data entry are often done by the same person). Once again, however, this method is only applicable for highly structured questionnaires. In addition, the "computerised" appearance of the survey form may be confusing and/or intimidating to respondents when used on a self-completion questionnaire form. Obviously, for the mark-sense and self-coded questionnaire methods, the coding procedure must be completely specified before the questionnaire is printed because both rely on the use of pre-specified codes on the questionnaire form.

(c) Coding sheets
In the past, the most commonly used coding procedure relied on the use of specially designed coding sheets onto which data was manually transcribed from the questionnaire. A typical coding sheet consisted of 24 lines of data each containing 80 columns (to conform with the standard computer card). All data, from both open and closed questions, were transferred to these sheets by coders, and then given to data entry personnel for typing into the computer. Coding sheets were useful when the questionnaire contained a large number of open-questions, but were rather wasteful for closed questions where the task simply involved rewriting all the information from the questionnaire form onto the coding sheet. A trade-off was necessary between the extra time and cost for coders to perform this task against the reduced time and cost for data entry (which was much quicker and less error-prone than typing directly from the questionnaire). These days, however, very little coding and data entry is performed using coding sheets, and the process has largely been replaced by the use of interactive computer programs which perform both the coding and data entry tasks.

(d) Interactive computer programs
With the recent rapid advances in computer technology, it is now possible to combine the tasks of coding and data entry into one by having the coder enter data directly into the computer from the questionnaire form. Thus, instead of writing the data onto coding sheets, the coder types it directly into the computer, each coder being provided with a separate terminal or microcomputer for data entry. This procedure provides for much quicker coding and data entry and has a number of other advantages as will become apparent later. Since this procedure has become the dominant mode of data entry, especially with the widespread and increasing use of microcomputer spreadsheets and databases, particular attention will be focussed on it in the following discussion of coding, although the general principles are equally applicable to the other methods of coding and data entry.

(e) In-field lap-top and hand-held computers
Even more recently, the advent of powerful lap-top and hand-held computers has meant that a greater amount of data entry can be performed directly in the field. This applies both to observational surveys (such as passenger counts and running time surveys on public transport vehicles) and to various types of interview surveys (such as intercept surveys and stated preference surveys, e.g. Jones and Polak (1992)). The in-field use of computers can drastically reduce the amount of time and effort involved in the coding and entry of data. With appropriate software, it can also increase the accuracy of recordings (especially recording of times in observational surveys) and can greatly reduce the number of errors in the data.

8.2.2 Code Format

Irrespective of the coding procedure adopted, the data that is entered into the computer must be stored in a pre-determined format, i.e. the user of the data must know how to get access to individual pieces of the data set in order to perform the various analyses. In specifying a coding format, a basic question which must be answered is whether data is to be recorded and stored in "fixed" or "free" format. The difference between the two methods is that whereas fixed format specifies exactly where each item of data is to lie within a long string of data (previously, an 80-column card image field, but now more likely to be a 256-character word), free format simply allows for the sequential entry of data separated by some form of delimiter (such as a comma, a space, a tab or a carriage return). Thus, for example, the sequence of numbers 121, 6, 37, 0, 9, 12 might appear in fixed format as:

121 6  37 0 912

where the number of columns reserved for each number is 3, 2, 4, 2, 2, 2 respectively.

In free format the numbers look like:

121,6,37,0,9,12

When using fixed format it is necessary to specify that all numbers should be "right-justified" in their field, i.e. any blanks must appear to the left of a number. For example in the above example, the number 37 when written in its four-digit field must be written as ∆∆37 rather than ∆37∆ or 37∆∆ (where ∆ represents a blank space). This is because most computers do not distinguish between blanks and zeros when dealing with fixed-length records, and hence ∆37∆ would be interpreted as 370 and 37∆∆ as 3700.
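The two formats, and the reason for right-justification, can be illustrated with a short Python sketch using the numbers and column widths from the example above. The function names are hypothetical, and blanks_as_zeros merely emulates a reader that treats blanks as zeros; Python's own int() does not behave that way.

```python
WIDTHS = (3, 2, 4, 2, 2, 2)        # columns reserved for each item, as in the text
values = [121, 6, 37, 0, 9, 12]


def to_fixed(values, widths=WIDTHS):
    """Right-justify each value in its field and join the fields into one record."""
    return "".join(f"{v:>{w}}" for v, w in zip(values, widths))


def from_fixed(record, widths=WIDTHS):
    """Slice a fixed-format record back into integers by column position."""
    items, start = [], 0
    for w in widths:
        items.append(int(record[start:start + w]))
        start += w
    return items


def blanks_as_zeros(field):
    """Emulate a reader that does not distinguish blanks from zeros."""
    return int(field.replace(" ", "0"))


fixed = to_fixed(values)                    # '121 6  37 0 912'
free = ",".join(str(v) for v in values)     # '121,6,37,0,9,12'
print(repr(fixed), repr(free))
print(blanks_as_zeros("  37"), blanks_as_zeros(" 37 "), blanks_as_zeros("37  "))
```

The last line prints 37, 370 and 3700, reproducing the misreadings described above when a number is not right-justified in its field.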

The choice between the two formats will often depend on the computer language and/or programs to be used. Some commercially available statistical programs require that data be in a specified format. Given a choice, however, each format has its own strengths and weaknesses. Free format is often easier to enter since there is no need to ensure that each data item is typed exactly in the correct columns. On the other hand, fixed format is more efficient to use in programming because it is possible to specify exactly where a data item will be on every questionnaire's computer record. One can therefore access that data item directly, without having to read all the preceding data items as one would have to with free format. Fixed format is also preferable when using relational databases (such as household, person, and trip files) where frequent cross-referencing is required.

The relative advantages of each type of format can be used most effectively when interactive terminals or microcomputers are used for data entry. In these situations, data is entered by the coder/data enterer in free format and the computer can then assign the data item to a fixed format data file for later analysis. The free format data entry can be taken one step further with interactive terminals by having the computer ask for the next item of data required, by means of a "prompt message" or input field written on the terminal screen. The coder responds to this prompt by entering the required code and then hitting a carriage return. The computer then asks for the next item of data. This technique is particularly powerful when a questionnaire contains numerous filter questions. Depending on the coder's response to a filter question, the computer can then decide which is the next appropriate question to ask. All questions (and answers) skipped can be supplied automatically with missing values by the computer. This avoids the tedious and error-prone task of typing long rows of blanks for fixed format computer data entry, as often occurs in complex questionnaires.
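The handling of filter questions described above might be sketched as follows. Everything in this example is hypothetical: the three questions, the variable names and the missing-value code of -2 for a skipped question are assumptions chosen for illustration. To keep the sketch testable, the coder's replies are passed in as a list rather than read from the keyboard.

```python
NOT_APPLICABLE = -2   # assumed missing-value code for questions that are skipped

# Hypothetical questionnaire: (variable name, prompt, rule deciding whether the
# question applies, given the answers recorded so far).
QUESTIONS = [
    ("has_licence", "Driver's licence? (1=yes, 2=no)", lambda a: True),
    ("cars_driven", "Number of cars driven last week", lambda a: a["has_licence"] == 1),
    ("trips_made",  "Number of trips made yesterday",  lambda a: True),
]


def enter_record(replies, questions=QUESTIONS):
    """Consume one reply per applicable question and fill skipped ones automatically."""
    replies = iter(replies)
    record = {}
    for name, prompt, applies in questions:
        if applies(record):
            record[name] = int(next(replies))   # coder's reply to the prompt
        else:
            record[name] = NOT_APPLICABLE       # supplied by the computer
    return record


# A respondent without a licence: the coder types only two replies.
no_licence = enter_record(["2", "4"])
with_licence = enter_record(["1", "1", "3"])
print(no_licence)
print(with_licence)
```

In an interactive setting each reply would come from a call such as input(prompt); the resulting record could then be written out in fixed format for later analysis.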

8.2.3 Coding Frames

A coding frame describes the way in which answers are to be allocated codes. The collection of coding frames for an entire questionnaire is often referred to as a code book. The preparation of a code book involves two distinct processes: the definition of coding categories to be used, and the methods to be used in allocating individual answers to these categories.

Coding frames are required for two different types of questions: closed and field-coded questions, where the codes have already been decided by the time the questionnaire gets to the coder; and open-questions, where the coder must decide on the choice of code corresponding to each particular answer. The techniques for each type of question will be described below.

Before describing the specifics of coding-frame construction, three general points need to be made. First, for closed and field-coded questions, the choice of coding frame had to have been made at the time of questionnaire design so that appropriate codes could either be printed on the questionnaire or supplied to the interviewer. Open-question coding frames, on the other hand, generally cannot be determined until after the completed questionnaires are received. Second, in selecting the code categories for both open and closed-questions it is wise to choose categories, wherever possible, which have already been used in other surveys, such as the Census of Population or other transport surveys, with which comparison might later be made. The Census is particularly useful for categories dealing with classification variables, such as occupation and dwelling type. Third, in addition to codes for questionnaire answers it is also necessary to have coding details describing the questionnaire itself. A unique identification number, date of interview (or date of questionnaire return by post), and interviewer's name are all useful data for control of the analysis procedure.

8.2.3.1 Open-question Coding Frames

The definition of coding frames for open-questions can be a difficult process. For each question, the frame should consist of mutually exclusive and exhaustive categories into which responses may be classified. The process of developing a coding frame for an open-question is usually a mixture of theory and empirical observation. Thus, while the investigator may have some theoretical idea of what the possible codes should be, these can only be verified by reference to empirical data. Such data might come from previous studies, pilot surveys or from the main survey itself.

Using responses from the main survey to define the coding frame entails the selection of a sample of completed questionnaires and then systematic recording of the responses received to the open-question. From this list of responses (or content analysis), a coding frame is designed which satisfies the criteria of being exhaustive and mutually exclusive. Thus every recorded response in the sample should fit into one, and only one, category in the selected coding frame.

The design of coding frames for open-questions is as much an art as a science. A delicate balance is required between over-generalisation and semantic niceties. One problem arises when handling the ubiquitous category of "other". One recommendation is that any response that occurs in the initial sample, however infrequent, should be allocated a category code just in case it turns out to be more significant than expected in the rest of the complete sample. A point worth considering is that it is easy to collapse categories in later analysis but you cannot expand general categories to obtain data that were never coded. On the other hand, if there is a long list of categories with minimal differentiation, the use of such categories for coding is likely to be unreliable and subject to considerable coder bias. One further point to remember is that if, in the analysis, it is desirable to find out how few responses of a given kind are observed, then it is necessary to include such a category code even though it may rarely be used.
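The tallying step of such a content analysis can be sketched in Python. This is a deliberately simplified illustration: responses are treated as the same category only if they match after trimming and lower-casing, whereas in practice the grouping of similar answers is a matter of judgement. The function name draft_coding_frame and the sample responses are invented for the example.

```python
from collections import Counter


def draft_coding_frame(sample_responses):
    """Tally the responses in a sample and give every distinct response a code."""
    tally = Counter(r.strip().lower() for r in sample_responses)
    # The most frequent response gets code 1; even a response seen only once
    # receives its own code, so that no category is lost at this stage.
    frame = {resp: code for code, (resp, _) in enumerate(tally.most_common(), start=1)}
    return frame, tally


sample = [
    "Too expensive", "no bus nearby", "too expensive ",
    "Takes too long", "no bus nearby", "too expensive",
]
frame, tally = draft_coding_frame(sample)
print(frame)
```

Because each response maps to exactly one code, the draft frame is mutually exclusive and exhaustive for the sample; whether it remains so for the full survey can only be checked as further questionnaires are coded.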


8.2.3.2 Closed-question Coding Frames

The coding of closed-questions (or the entry of codes derived for open-questions) is relatively straightforward although the consistent use of certain conventions and techniques can increase the efficiency of the coding process. Some factors which need to be borne in mind when designing a coding frame for closed-questions include:

(a) A decision needs to be made as to what the coding unit will be for each question. Will each question be coded separately or will certain combinations of questions be used to give an overall code with respect to responses to the set of questions? For example, each person in a household may be asked their relationship to the oldest person in the household, and the set of these answers may then be used to derive a variable which describes the overall structure of the household relationships (e.g. nuclear family, single-parent household etc.). As will be described later, it is generally better to keep the coding tasks as simple as possible. Therefore it is generally advisable to code individual questions, and then let the computer derive any combinations at a later date.

(b) Several different types of code are possible. These include the field code in which numbers representing actual amounts are coded, more or less as they were given by the respondent, e.g. income recorded to the nearest dollar. If field codes are used, it is necessary that sufficient space be left in the coding frame to accommodate the largest number of digits expected in any answer. Bracket codes involve the assignment of a code to cover a range of numbers. Generally bracket codes would not be used unless the questionnaire itself had used categories to simplify the question for the respondent. The only reason why bracket codes would be used in preference to field codes is if it was essential that space be saved in the coding frame. Pattern codes involve the assignment of a single code to a pattern of responses to either a single question or to a series of questions, e.g. patterns of licence holding/car ownership in a household. Pattern codes are often used to save space in the coding frame although the coding task is made more difficult because the coder has to decide on the pattern before selecting the code.

(c) Many computers do not distinguish between blanks and zeros. This means that zero codes with a substantive meaning (e.g. zero car ownership) should not be used in a coding frame where a blank represents something else (such as "not applicable"). This point reinforces the statement made in Chapter 7 that interviewers should not leave any questions blank. It is good practice to differentiate between "not applicable", "don't know", "refused to answer", and simple omission on the part of the interviewer, because all of these items may be needed for logical edits and quality control checks. Many computer packages provide facilities for several kinds of missing data, and for omitting these missing values from statistical calculations as necessary.

(d) If provision is made for missing data, it is convenient to apply consistent coding conventions for recording such data. For example, a "-3" code could always represent "don't know", a "-2" code could represent "not applicable", a "-1" could represent "no answer provided when one was expected", and so forth. Another situation where a consistent convention is required is in the coding of YES/NO answers. A convention often used is to code YES as "1" and NO as "2". On no account should these answers be coded as "1" and "0" because of the problem with zeros and blanks mentioned earlier.
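A consistent missing-data convention of this kind is straightforward to honour at the analysis stage. The following is a minimal sketch in Python, assuming the three negative codes suggested above; the variable `trips_per_day` and its values are invented purely for illustration.

```python
from statistics import mean

# Reserved codes following the convention suggested in the text.
MISSING_CODES = {-3: "don't know", -2: "not applicable", -1: "no answer"}

def valid_values(codes):
    """Return only substantive answers, dropping the reserved missing codes."""
    return [c for c in codes if c not in MISSING_CODES]

def missing_summary(codes):
    """Count how often each kind of missing data occurs."""
    counts = {label: 0 for label in MISSING_CODES.values()}
    for c in codes:
        if c in MISSING_CODES:
            counts[MISSING_CODES[c]] += 1
    return counts

trips_per_day = [4, 2, -3, 6, -1, -2, 3]  # invented example data
print(mean(valid_values(trips_per_day)))  # statistics use valid answers only
print(missing_summary(trips_per_day))     # counts are kept for quality control
```

Keeping the separate counts, rather than discarding all missing values together, preserves the information needed for the logical edits and quality control checks mentioned in item (c).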

(e) There are several ways in which space may be saved in coding frames. Sometimes it is useful to make use of embedded codes whereby codes are contained within codes. For example, in response to a filter question a respondent may be asked a different follow-up question depending upon their initial reply. The responses to both the filter and follow-up question may be recorded by one code if, for example, the responses to the first follow-up question were coded as 1 to 4 whereas the responses to the second follow-up question were coded as 5 to 8. In this case the response to the filter question could be ascertained by checking whether the code was less than or equal to 4. Another use of embedded codes is in the construction of identification numbers for respondents where, in addition to uniquely identifying the respondent, the ID number can also provide some basic information about the respondent. A third use of embedded codes is where the data may need to be used at different levels of refinement. An example of this is in the use of occupation codes, such as the Australian Standard Classification of Occupations (ASCO) used by the Australian Bureau of Statistics, where a one-digit code gives a fairly general split such as:

1 Managers & Administrators
2 Professionals
3 Para-professionals
4 Tradespersons
5 etc., etc.

Within each of these codes, however, a second digit may be used to refine the occupation codes, such that, for example:

21 - Natural Scientists
22 - Building Professionals & Engineers
23 - Health Diagnosis & Treatment Practitioners
24 - School Teachers
25 - etc., etc.

Third and fourth digits could be added to further stratify the classification of occupations. Each occupation would then be coded as a four-digit number, but in using the data the analyst could choose only that number of digits which is required to give a level of definition commensurate with the purpose of the analysis of the survey data.
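The choice of refinement level can be left entirely to the analysis stage. As a hedged illustration in Python, the sketch below tabulates the same four-digit codes at one-digit and two-digit level simply by truncating them; the four-digit values are invented placeholders and are not actual ASCO codes.

```python
def truncate_code(code, digits):
    """Keep only the leading `digits` characters of a hierarchical code."""
    text = str(code)
    if not 1 <= digits <= len(text):
        raise ValueError("digits must be between 1 and the code length")
    return text[:digits]

def tabulate(codes, digits):
    """Count respondents per category at the chosen level of refinement."""
    counts = {}
    for code in codes:
        key = truncate_code(code, digits)
        counts[key] = counts.get(key, 0) + 1
    return counts

coded_occupations = ["2101", "2205", "2307", "1102", "2203"]  # invented codes
print(tabulate(coded_occupations, 1))  # broad groups, e.g. "2" = Professionals
print(tabulate(coded_occupations, 2))  # refined groups, e.g. "22"
```

Storing the codes as text rather than numbers avoids losing any leading zeros, which would otherwise shift the digit positions.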

(f) When coding responses to menu items, such as the selection of modes used for a given trip from a list of modes of transport, two options are available. First, one could use a listing code wherein each mode is allocated a column in the coding frame and YES/NO responses are coded for each mode. Thus, for example, if six options are listed then a possible coded response might look like 122112 indicating that the first, fourth and fifth modes were used by that respondent. The trouble with listing codes is that, while simple to use, they do take up a lot of computer storage space. An alternative method is the use of a geometric code whereby the same information can be packed into a smaller number of columns. In the above example, the nth mode would be given the code $2^{n-1}$. Thus the first mode has the code $2^0 = 1$, the second $2^1 = 2$, the third $2^2 = 4$, the fourth $2^3 = 8$ etc. The combination of modes used would then be coded as the sum of the codes for each mode. In the example given above, the code would be $2^0 + 2^3 + 2^4 = 25$. This two digit code, which represents one and only one possible combination of modes, contains the same information as the six digit listing code used earlier. The problem with geometric codes is that when used manually by coders there is a large risk of errors in coding. A better method, if coders are using computers, is to enter the data in listing code format and then let the computer pack it into geometric code. With geometric code, a two digit code is equivalent to a six digit listing code, a three digit code is equivalent to a nine digit listing code, while a four digit code is equivalent to a thirteen digit listing code.
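The packing of a listing code into a geometric code, and the reverse operation, can be sketched in a few lines of Python. The function names are illustrative only; the listing code uses the YES = 1, NO = 2 convention described in item (d).

```python
def listing_to_geometric(listing):
    """Pack a listing code such as '122112' (1 = YES, 2 = NO) into a geometric code."""
    total = 0
    for n, answer in enumerate(listing, start=1):
        if answer not in ("1", "2"):
            raise ValueError(f"unexpected listing value {answer!r} in column {n}")
        if answer == "1":
            total += 2 ** (n - 1)  # the nth mode carries the code 2^(n-1)
    return total

def geometric_to_listing(code, n_modes):
    """Unpack a geometric code back into a listing code of n_modes columns."""
    return "".join("1" if code & (1 << (n - 1)) else "2"
                   for n in range(1, n_modes + 1))

print(listing_to_geometric("122112"))  # modes 1, 4 and 5: 1 + 8 + 16
print(geometric_to_listing(25, 6))

# Largest geometric code for 6, 9 and 13 modes (2, 3 and 4 digits respectively).
print([2 ** n - 1 for n in (6, 9, 13)])
```

Because each mode corresponds to a distinct power of two, every sum maps back to exactly one combination of modes, which is why the unpacking step is unambiguous.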

(g) While geometric codes are good for representing combinations, they are not suitable for representing permutations (or sequences). In this case a listing code is more suitable, where each mode is assigned a specific code number and these numbers are then listed in the appropriate sequence. One possible alternative is to use a pattern code where each sequence of modes is assigned a unique code. In this case there is a slight reduction in the space required (e.g. a nine digit listing code is reduced to a 6 digit pattern code) but there is also a substantial increase in coding effort if performed manually.


(h) In selecting the number and definition of code categories, attention should be paid to achieving compatibility with other sources, to being exhaustive and mutually exclusive (if required), to minimising the number of columns used in the coding frame, and to achieving a compromise between accuracy and simplicity.

(i) One factor which should not be forgotten when defining code categories is to specify the units in which the answer is required. In transport surveys, the major decisions regarding units will be concerned with clock time (ask in a.m./p.m. format but code in 24 hour format), distance (km or miles), money (dollars or cents), time (minutes or hours) and income (per week or per annum).

8.2.3.3 Location Coding

One particular type of open-question coding frame which is of special significance to transport surveys is the coding of geographical locations e.g. home, work and destination locations. Very often, the coding of locations is the most time-consuming component of transport survey coding, especially when complete trip records for a survey period are involved. The information on the questionnaire generally consists of a street name and suburb - the coding procedure requires this to be transformed into a numerical code. Several approaches have been adopted to attempt this transformation.

The most widespread location coding method, in the past, has been the allocation of locations to zones. Thus, each street/suburb combination lies within a particular traffic zone. This zone is normally determined by the use of look-up tables - a very time consuming process. The use of zones also has the disadvantage that while they may be aggregated to form villages, towns, cities and counties, they cannot be broken down into finer divisions and the zone boundaries cannot be changed.

In some circumstances, an alternative to the use of specially defined traffic zones is the use of postcodes as a proxy zonal system. The use of postcodes, whilst very convenient in surveys where minimal resources are available for coding, has a number of severe disadvantages for transport surveys. First, they possess all the disadvantages of inflexibility mentioned for zones. Second, the boundaries are often not compatible with other zoning systems, such as town and county boundaries, and hence comparison with other data sources is difficult. Third, the postcode zones are defined for a completely different purpose to that which is required for transport survey analysis. There is therefore little regularity in zone size or shape, which causes problems for transport analysis. For urban travel surveys, postcodes are often too large for meaningful analysis of origin-destination patterns. The major advantage of using postcodes is that, in many cases, this information can be supplied directly by the respondent on the questionnaire, and thus the tedious task of coding locations is largely avoided except in cases where the respondent does not know the location's postcode.

In an attempt to avoid some of the problems associated with rigid zonal definitions, a number of surveys (Richardson, et al., 1980; Young, et al., 1978) adopted the use of an x-y coordinate system based on a city street directory. Using this system, which was based on a grid element size of 400m x 400m, coders looked up the street name and suburb in the street directory index and recorded the map number and page coordinates. These two pieces of information were later converted to an overall x-y coordinate system, which covered the entire study area, by a computer program subroutine. The use of such grid coordinates allowed for the amalgamation of grid elements into any larger zoning system such as postcodes or Census districts. It also allowed considerable detail to be retained in the location coding and enabled detailed "scattergrams" of location to be plotted using conventional computer printers. Such scattergrams were most useful for graphical presentation of location results, and were also found to be useful in the editing of location information. The major disadvantage with this system is the considerable time involved for coders in looking up the map number and page coordinates in the street directory index.

To alleviate some of the work involved in using the street directory system of coordinates, a self-completion survey of student travel patterns at Monash University in Melbourne, Australia (Jennings, et al., 1983) asked respondents to include the street directory location reference for their home location from the most popular of the commercially available street directories - the "Melways" (Melway, 1992). A total of 60% of the completed questionnaires did include this information, considerably reducing the coding effort required for this survey. As the use of such grid reference systems becomes more widespread in the commercial and retail world (to indicate locations of shops etc. to potential customers), the self-reporting of location codes may become a technique which is worthy of further investigation.

Self-reporting of location codes may also be attempted by asking respondents to indicate the x-y coordinates of trip origins and destinations on a specially prepared map which is supplied to all respondents. The success of this method depends on the size of the study area and the precision with which geographic locations are to be recorded. Given a specific study area, there needs to be a trade-off between the size of the map supplied to respondents and the amount of detail included on the map. Naturally, self-reporting methods assume that the respondents can read a map and that they can, and will, report their trip origins and destinations in terms of x-y coordinates. An example of the use of such a location coding system is shown by the map depicted in Figure 8.1(a) and (b), which was used by the authors in a county-wide travel survey in Upstate New York. Over 80% of the population in this study was able to correctly identify the location of their homes and their destinations for all trips during a day.

The final method of location coding, which is certainly the most efficient for large transport surveys, is the direct geocoding of street names and suburbs (and other principal locations). This method involves the use of a computer program whereby the user enters an address, the computer checks through a list of such addresses (in much the same way as a coder would look through a street directory index), and then returns an x-y coordinate for that location.

The computer program should be able to match coordinates with incorrect addresses (caused by misspelling of street, or the use of adjacent suburb name) and should take account of house numbers, especially on long streets. The coordinate system adopted will depend on the data base used. With the increasing availability of computer technology, the geocoding system is likely to replace existing systems of location coding since it gives a more accurate result with lower coder effort required. The use of Geographic Information Systems (GIS) for the conversion of geographic information about home addresses and trip destinations into machine-readable format (geocodes) was one of the more innovative aspects of the South East Queensland Household Travel Survey (SEQHTS) survey (Richardson and Ampt, 1993a). In past travel surveys, destination locations have often been coded directly to rather aggregate traffic zones, at suburb level of detail, with the result that considerable information has been lost about the location of destinations. However, coding survey data to the level of the Census Collectors District (CCD) is extremely useful for the plotting of trip information, for more accurate calculation of distances between destinations, and allows greater flexibility for the design of more specific zoning systems (e.g. for the analysis of public transport corridors).

Figure 8.1(a) Map used in Self-Coding of Location Coordinates (front)

Figure 8.1(b) Map used in Self-Coding of Location Coordinates (back)

The general procedure adopted for the geocoding of locations in the SEQHTS project is shown in Figure 8.2.


[Figure 8.2: flow diagram. Boxes: Travel Data File; Address File; Geocoding Methods (Full Street Addresses, Cross-Street Addresses, Landmarks, By Sampling). Flows: reported addresses; corrected addresses; x-y coordinates; CCD location.]

Figure 8.2 General Procedure for Geocoding of Locations

Locational information was obtained from the travel data files in the form of reported addresses. These addresses may be from the sample frame database of residential addresses obtained from the South-East Queensland Electricity Board (SEQEB), in the case of the household file addresses, or from the respondents, in the case of the stop file destination locations. These addresses were transferred to an address file which contained only the address and an identifier which enabled the geocoded CCD location to be transferred back to the travel files at the end of the geocoding process. The locational information, especially from respondents, was of varying degrees of completeness and accuracy. Therefore, before attempting to geocode the address information, the addresses had to be corrected to put them in a format which was compatible with the GIS database of address coordinates. These corrected addresses were then geocoded by one of various methods of geocoding, as described below. The x-y coordinates of the addresses were then transferred back to the address file. By comparing these coordinates with the CCD boundary files, the CCD in which the address was located was obtained, and this CCD number was then transferred back to the travel data files.

The geocoding procedure used in the SEQHTS survey consisted of a series of geocoding methods applied in a hierarchical fashion to obtain a likely geocode for an address. The accuracy of the geocode is dependent on the geocoding method used. Therefore, the more reliable methods were attempted first.
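The hierarchical application of methods amounts to trying each method in order of reliability and stopping at the first success. The Python sketch below shows only that control flow; the two stand-in methods, their names and the coordinates they return are hypothetical and are not the SEQHTS program modules.

```python
def geocode_with_fallback(address, methods):
    """Try each (name, function) pair in order and return the first success.

    Each function takes an address and returns an (x, y) tuple, or None if it
    cannot geocode that address. The method name is returned with the result so
    that the reliability of each geocode can be recorded.
    """
    for name, method in methods:
        result = method(address)
        if result is not None:
            return name, result
    return None, None

# Hypothetical stand-ins for real geocoding modules, most reliable first.
def by_full_address(address):
    return (10.0, 20.0) if address.get("number") else None

def by_street_only(address):
    return (11.0, 21.0) if address.get("street") else None

METHODS = [("full street address", by_full_address),
           ("street only", by_street_only)]

print(geocode_with_fallback({"number": 12, "street": "Smith Street"}, METHODS))
print(geocode_with_fallback({"street": "Smith Street"}, METHODS))
print(geocode_with_fallback({}, METHODS))
```

Recording which method produced each geocode allows later analysis to distinguish the more accurate locations from the less accurate ones.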


The degree of accuracy of the geocoding depends on two factors: the accuracy with which the respondent can supply the locational information, and the accuracy with which the GIS program (in this case, MapInfo©) can use that information to generate a set of coordinates. For example, a respondent might know that they went shopping at the Coles supermarket in Chermside. From their point of view, this is the most accurate description of their destination. However, whether MapInfo© can geocode this location correctly will depend on what information it has about the location of Coles supermarkets. If all Coles supermarkets are in the landmarks datafile, then this should provide a very accurate geocode. However, if they are not in the landmarks file, then the very accurate locational information provided by the respondent will be of little use, unless an alternative method of locating Coles supermarkets can be found. For example, it is possible to look up the Yellow Pages (either the paper version or the electronic version on CD-ROM database) and find that the Coles supermarket in Chermside is on the corner of Gympie and Webster Roads. This information, in that form, is still not very useful since MapInfo© needs a street name and number to find a geocode. However, as will be described later, it is possible to write a special module which finds geocodes based on the specification of cross-streets. Therefore, the accurate locational information supplied by the respondent can eventually be converted into an accurate geocode. On the other hand, the information that MapInfo© is most accurate in working with (i.e. full street name, number and suburb) is often not easily supplied by the respondent. For example, very few people would know the street number of the Coles supermarket in Chermside, even if they knew what street it was on. If they provided only the street name, then we would be forced to select a random position along the street within the suburb - providing a less accurate geocode than that provided by use of the shop name.

In the actual computer implementation of the geocoding methods, four program modules were developed for the SEQHTS project. These are:

• geocoding using MapInfo©;
• geocoding using a cross-street database;
• geocoding with the assistance of a street directory; and
• geocoding by sampling.

In addition, an interactive spelling checker program was developed to automate the correction of spelling errors/mismatches of street names and suburb names. Spellings were checked against a dictionary created from the electronic reference maps provided with MapInfo©.

The next few sections will discuss how addresses are prepared to make them suitable for geocoding and then details of the four geocoding program modules mentioned above will be provided.


Preparation of the address data

A crucial factor in geocoding is the success of matching the address information (i.e. street name and suburb name) provided by the respondents to that used in the electronic reference maps. Slight differences in spellings result in a mismatch and consequently a geocoding failure.

Steps were taken to minimise spelling mismatches in the SEQHTS data by providing a pop-up dictionary of street names and suburb names in the data entry program for the travel data. The pop-up dictionary even went as far as displaying only those streets which belong to a specified suburb. However, as the dictionary was not complete, some addresses still had to be entered manually. Also, a few entries in the dictionary were discovered to be misspelt but nonetheless they were assumed correct for matching purposes.

The more common causes of spelling mismatches are variations in abbreviations such as Ter & Tce for Terrace, and Mnt & Mt for Mount, and reversals of combinations of names such as Wynnum West and West Wynnum.
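Both kinds of mismatch can be reduced by comparing names on a normalised key rather than on the raw spelling. The Python sketch below is one possible approach and is not the SEQHTS procedure; the abbreviation table covers only the examples given above, and "Hill" is an invented street name.

```python
# Abbreviation table limited to the examples in the text; a working table
# would be much longer.
ABBREVIATIONS = {"ter": "terrace", "tce": "terrace",
                 "mnt": "mount", "mt": "mount"}

def match_key(name):
    """Build a comparison key that ignores case, abbreviations and word order."""
    words = name.lower().replace(".", "").split()
    expanded = [ABBREVIATIONS.get(word, word) for word in words]
    return " ".join(sorted(expanded))

print(match_key("Hill Tce") == match_key("Hill Ter"))        # abbreviation variants
print(match_key("Wynnum West") == match_key("West Wynnum"))  # reversed names
```

Sorting the words treats any reordering as a match, so a key of this kind is better used to propose candidates for a coder to confirm than to accept matches automatically.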

Considering that there is so much address information to check for mismatches, a rudimentary interactive program was developed for the purpose. The program starts off by extracting the address records from the travel data files and saving them into an address file. This latter database saves the spelling changes, with the original address information provided by the respondents left unmodified in the former database. A way of relating the information in the address file to that in the travel data file must be maintained to be able to transfer the geocoding information in the address file to the travel data file. This was done via the unique household, person or stop identification numbers.

The interactive spelling checker program was implemented using FoxBASE+/Mac and has the basic features of a word processing spelling checker. It finds an item that is not in the dictionary and displays candidate dictionary items using the "soundex" function of FoxBASE+/Mac. Soundex is used to determine if two words are phonetically similar, that is, if they sound alike.
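To illustrate how a Soundex comparison proposes candidates, the following Python sketch implements the commonly documented American Soundex rules. It is an assumption that the FoxBASE+/Mac function behaves in the same way in every detail; the misspellings are invented, and the dictionary is limited to place and street names mentioned in this chapter.

```python
# Consonant groups used by the standard American Soundex scheme.
SOUNDEX_CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for letter in letters:
        SOUNDEX_CODES[letter] = digit

def soundex(word):
    """Return the four-character Soundex code of a word ('' if no letters)."""
    letters = [ch for ch in word.lower() if ch.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    previous = SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = SOUNDEX_CODES.get(ch, "")
        if code and code != previous:
            result += code
        previous = code  # a vowel resets the run, so repeats are coded again
    return (result + "000")[:4]

def candidates(query, dictionary):
    """List dictionary entries that sound like the query."""
    return [entry for entry in dictionary if soundex(entry) == soundex(query)]

street_dictionary = ["Gympie", "Webster", "Chermside", "Wynnum"]
print(candidates("Gimpy", street_dictionary))
print(candidates("Webstar", street_dictionary))
```

Soundex keeps the first letter unchanged, so it cannot recover from an error in the initial letter of a name; such cases would still need manual correction.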

It was expected that only a few addresses would turn up as mismatches owing to the use of the dictionary pop-up during data entry. However, in the case of suburb or locality names there were quite a number of mismatches. This is because the suburb boundary maps for several areas were not complete at the time of the study. Postcode boundary maps were provided, however, so that mismatches in suburb names were resolved by entering postcode numbers.

To speed up the process in most of the geocoding methods, identification numbers are used instead of the actual names of streets and suburbs. A table of unique identification numbers for each street name and suburb name was created along with postcodes for each suburb. The identification numbers and postcodes are attached to the address file after the spelling changes have been made.

Geocoding full street addresses

The initial task in this procedure is to extract a unique listing of full street address records from the address file. A full street address is one whose street number, street name, and suburb name are given.

Geocoding of full street addresses is done using MapInfo©. MapInfo© basically needs two inputs for geocoding: a street address (which consists of a street number and a street name); and a bounded area (known as a boundary file) such as a suburb or a postcode to refine the search. When provided with this information, MapInfo© finds the street segment in the given suburb containing the specified house number. This street segment is defined in terms of its start point and end point, which are located by means of latitudes and longitudes. In Figure 8.3, this is shown as (X1,Y1) for the start point and (X2,Y2) for the end point. The data file also contains the range of house numbers on either side of the street segment. Thus in going from the start point to the end point of the segment shown in Figure 8.3, house numbers N1 through N2 are on the right while house numbers N3 through N4 are on the left. MapInfo therefore knows what side of the street the specified address is on. For an address on the left side of the street, it then divides the left side of the street segment into 1 + (N4-N3)/2 equal lengths and finds the position along the centre line at which the specified address is located. For example, the house with number N5 is located at a position which is on the left and (N5-N3+1)/(2+N4-N3) of the way along the link from the starting point. MapInfo then proceeds to automatically position the geocoded location (the latitude and longitude) 10 metres off the centre line of the street on that side of the street, as shown in Figure 8.3. For example, if N3=1 and N4=7, then house number 5 would be located at coordinates (X,Y) five-eighths of the way along the link and 10 metres to the left of the centre line of the link.

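[Figure 8.3: a street segment running from (X1,Y1) to (X2,Y2), with house numbers N1 to N2 on one side and N3 to N4 on the other; house N5 is plotted 10 metres off the centre line.]

Figure 8.3 The Location of Full Street Addresses using MapInfo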
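The interpolation described above can be written out directly. The Python sketch below is an illustration of the stated arithmetic and not MapInfo's own code; for simplicity it assumes planar coordinates in metres, rather than latitudes and longitudes, so that the 10 metre offset can be applied without any map projection.

```python
import math

def interpolate_left_address(start, end, n3, n4, house_no, offset=10.0):
    """Locate a house on the left side of a street segment.

    start, end : (x, y) of the segment ends, in planar metres (an assumption).
    n3, n4     : first and last house numbers on the left side of the segment.
    house_no   : the house number to locate (same parity as n3).
    """
    if not n3 <= house_no <= n4 or (house_no - n3) % 2:
        raise ValueError("house number is not on this side of the segment")
    fraction = (house_no - n3 + 1) / (2 + n4 - n3)
    (x1, y1), (x2, y2) = start, end
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)
    # Unit vector pointing to the left of the direction of travel.
    left_x, left_y = -dy / length, dx / length
    return (x1 + fraction * dx + offset * left_x,
            y1 + fraction * dy + offset * left_y)

# The worked example from the text: N3 = 1, N4 = 7, house number 5.
print(interpolate_left_address((0.0, 0.0), (80.0, 0.0), 1, 7, 5))
```

On an 80 metre segment running due east, the house is placed five-eighths of the way along (50 metres) and 10 metres to the north, which is the left-hand side when travelling east.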

It is quite common that respondents give incorrect suburb information and so the address cannot be geocoded. This, however, is often circumvented by assuming that respondents are likely to give a suburb not far from the correct suburb. Respondents often upgrade their suburb to a nearby, more socially distinguished, suburb. By using this assumption, success in geocoding can be improved by re-attempting to geocode using an increasingly larger boundary file.

Postcode boundaries are generally larger than suburb boundaries and so they are used in the geocoding process after the suburb boundary. Larger boundaries are further defined using the nearest eight suburb boundaries and the nearest eight postcode boundaries. The number "eight" is chosen with the idea that if a suburb boundary is roughly square, then there will be four adjacent suburbs, one on each side of the square, and another four on its corners. The nearest eight suburb and postcode boundaries were determined by comparing distances between boundary centroids. This was done only once with the result saved in a database file for use by the appropriate geocoding methods.
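The one-off calculation of the nearest eight boundaries can be sketched as follows in Python. The centroid coordinates and suburb names are invented, and straight-line distance between centroids in a planar coordinate system is assumed.

```python
import math

def nearest_neighbours(centroids, k=8):
    """For each boundary, list the k other boundaries with the closest centroids.

    centroids : dict mapping a boundary name to its (x, y) centroid.
    """
    table = {}
    for name, centre in centroids.items():
        others = [other for other in centroids if other != name]
        others.sort(key=lambda other: math.dist(centre, centroids[other]))
        table[name] = others[:k]
    return table

# An invented 3 x 3 grid of roughly square suburbs, named by grid position.
grid = {f"S{row}{col}": (float(col), float(row))
        for row in range(3) for col in range(3)}
lookup = nearest_neighbours(grid)
print(lookup["S11"])  # the central suburb has all eight others as neighbours
```

Because the table is computed once and stored, each later geocoding attempt only needs a look-up to widen its search area.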

It is expected that the probability of correctly geocoding an address diminishes as the boundary used becomes larger.

When geocoding a small file of full street addresses, all methods may be attempted in MapInfo before attaching the geocodes onto the address file. However, when the full street address file is large, it saved time if geocodes were attached to the address file after each method was applied and then the full street address file was compressed by removing geocoded records.

Once an x-y coordinate had been attached to an address, the CCD in which this coordinate was located was found by overlaying the boundaries of the CCDs on the geocoded coordinates. The region (CCD) in which the geocoded point was located was then transferred back to the travel data files as the most disaggregate description of the location of that address.
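The overlay of boundaries on geocoded points is, in essence, a point-in-polygon test. A GIS package performs this internally; the Python sketch below uses the standard ray-casting test to show the idea, with invented square boundaries and CCD identifiers.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: is (x, y) strictly inside the polygon?

    polygon is a list of (x, y) vertices; points exactly on an edge may fall
    either way, which is why boundary cases need separate treatment.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # the edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def find_ccd(x, y, ccd_boundaries):
    """Return the id of the first boundary containing the point, else None."""
    for ccd_id, polygon in ccd_boundaries.items():
        if point_in_polygon(x, y, polygon):
            return ccd_id
    return None

# Two invented, adjacent square districts.
boundaries = {"CCD 1": [(0, 0), (10, 0), (10, 10), (0, 10)],
              "CCD 2": [(10, 0), (20, 0), (20, 10), (10, 10)]}
print(find_ccd(4.0, 5.0, boundaries))
print(find_ccd(15.0, 5.0, boundaries))
print(find_ccd(25.0, 5.0, boundaries))
```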

Geocoding cross-street addresses

As in the geocoding of full street addresses, a list of unique cross-street addresses supplied by respondents was extracted from the address file to avoid unnecessary repetitions in geocoding. A cross-street address consists of two street names and a boundary (e.g. suburb or postcode).

As mentioned earlier, MapInfo did not have the capability to geocode cross-streets, at least as a standard function. A program was therefore written to fill this gap using a fairly straightforward procedure. A database of cross-streets with their coordinates was set up from the reference maps provided with MapInfo with each record having the following fields:

street_one - id number of the first street
street_two - id number of the second street
x_coord - longitude of intersecting point
y_coord - latitude of intersecting point
subb_bdry - id number of the suburb boundary
subb_mult - number of multiples within the suburb boundary
pcod_bdry - id number of the postcode boundary
pcod_mult - number of multiples within the postcode boundary

Geocoding a cross-street address was then a matter of searching this cross-street database to find a match between the first and second streets within the appropriate suburb boundary. The latitude and longitude obtained from a cross-street matching correspond to the centre of the intersection of the two streets, i.e. it lies on the intersection of the centre lines, as shown in Figure 8.4. Thus, if the respondent nominates Smith Street as the address, with the nearest intersection being Brown Street (or vice versa), then that location was initially given the coordinates (X,Y).

[Figure 8.4: the intersection of Smith Street and Brown Street, with the point (X,Y) at the crossing of the two centre lines.]

Figure 8.4 The Location of Cross Street Addresses

The last four fields of the cross-street database listed above are necessary because multiple occurrences of a cross-street in various locations are possible. To be able to identify which cross-street is pertinent, the cross-street database has to have a boundary field that qualifies each record. Searching a cross-street in turn must also have boundary information as part of the input. Thus, Smith and Brown Streets may have intersections in several suburbs (because they are different Smith and Brown Streets). So long as the suburb information is provided along with the cross street names, the overlaying of the suburb boundary file will identify which is the correct location.

However, this only partially solves the problem of multiple cross street matches, as multiples may also exist within a single boundary. A good example is a "court" type street where it intersects another street twice, with both intersections likely to be in the same suburb or postcode boundary, as shown in Figure 8.5. Knowing the number of multiples allows for a randomised approach to selecting a pair of X and Y coordinates among the multiples. It should be clear that multiple occurrences of a cross-street which are in different boundaries should not be considered as multiples.

[Figure 8.5: a "Court" street meeting Main Street at two separate intersections.]

Figure 8.5 The Occurrence of Multiple Cross Street Locations

The geocoding of cross-streets, as in geocoding of full addresses, is also done in successive stages with the next stage using a larger boundary than the previous. Once again, the probability of a correct geocode decreases as a larger boundary is used. For cross-streets, this is aggravated by the random process of selecting a cross-street from its set of multiples, if any.
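A single stage of the cross-street search, including the random selection among multiples, might look like the Python sketch below. The field names follow the record layout listed earlier, but the records themselves, the identification numbers and the coordinates are invented; the count of multiples is derived from the matches rather than read from a stored field.

```python
import random

# Invented records: streets 101 and 202 cross once in suburb 7, while court 303
# meets street 101 twice in suburb 7 and once in suburb 9.
CROSS_STREETS = [
    {"street_one": 101, "street_two": 202, "x_coord": 1.0, "y_coord": 1.0, "subb_bdry": 7},
    {"street_one": 101, "street_two": 303, "x_coord": 2.0, "y_coord": 1.0, "subb_bdry": 7},
    {"street_one": 101, "street_two": 303, "x_coord": 3.0, "y_coord": 1.0, "subb_bdry": 7},
    {"street_one": 101, "street_two": 303, "x_coord": 9.0, "y_coord": 9.0, "subb_bdry": 9},
]

def geocode_cross_street(street_a, street_b, suburb_id, records, rng=random):
    """Return (x, y) for a cross-street within one suburb, or None if no match.

    The two streets may be given in either order. If the pair intersects more
    than once inside the boundary, one intersection is chosen at random.
    """
    wanted = {street_a, street_b}
    matches = [r for r in records
               if {r["street_one"], r["street_two"]} == wanted
               and r["subb_bdry"] == suburb_id]
    if not matches:
        return None
    chosen = rng.choice(matches)
    return chosen["x_coord"], chosen["y_coord"]

print(geocode_cross_street(202, 101, 7, CROSS_STREETS))  # unique match
print(geocode_cross_street(101, 303, 7, CROSS_STREETS))  # one of two multiples
print(geocode_cross_street(101, 303, 8, CROSS_STREETS))  # no match in suburb 8
```

Later stages would repeat the search with the postcode boundary, and then with the neighbouring boundaries, in place of the single suburb identifier.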

A further problem with cross street locations is the difficulty of allocating this location to a specific CCD. Because CCD boundaries often run along the centre lines of streets, a cross street often coincides with the junction of several CCDs, as shown in Figure 8.6. Thus each quadrant of the cross street belongs to a different CCD. The question is which CCD should be used to represent the location of the cross street, and hence the location of the destination described in terms of the cross street. Unless further information is provided about the destination, any of the CCDs might be correct. If the destination is on Smith Street, then it could be on either side of the street on either side of Brown Street. Thus any of the CCDs might be the correct one.


[Figure 8.6: the intersection of Smith Street and Brown Street at (X,Y), with CCD 1, CCD 2, CCD 3 and CCD 4 occupying the four quadrants.]

Figure 8.6 Cross Streets with Multiple Bordering CCDs

Under these circumstances, the (X,Y) coordinate was moved slightly so that it lay within one of the surrounding CCDs. Since there were several thousand destinations in the SEQHTS survey where this problem occurred, it was clearly impractical to move these points manually. Therefore, an automated procedure was devised to accomplish this task. Around each cross street location where there were multiple possible CCD allocations, four candidate sets of coordinates were generated by adding ± 0.00001 to both the longitude and latitude of the cross street coordinate. This had the effect of generating four points roughly at 45° to the cross street location, as shown by the small points in Figure 8.7. One of these points was then randomly selected and the CCD in which the selected point was located was then found.

[Figure 8.7: the Smith Street and Brown Street intersection with four small candidate points, one in each of CCD 1, CCD 2, CCD 3 and CCD 4.]

Figure 8.7 The Allocation of a Cross Street Address to a CCD


By examining Figure 8.7, it is clear that the above procedure will fail when the cross streets themselves are at 45° to the lines of latitude, since the four new points will lie on the centre lines of the four arms of the cross street. In these cases, the above procedure was rerun, with the four candidate locations being generated at angles of 0°, 90°, 180°, and 270°. This resulted in the selected point generally falling in one of the neighbouring CCDs.
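The two-pass procedure can be sketched as follows in Python. The function `find_ccd` is a hypothetical callable that returns a CCD identifier for a point, or None when the point lies on a boundary; the offset of 0.00001 degrees is the value quoted above, and the stand-in overlay used in the example is invented.

```python
import random

def diagonal_candidates(lon, lat, step=0.00001):
    """Four points at roughly 45 degrees around the cross street location."""
    return [(lon + sx * step, lat + sy * step)
            for sx in (1, -1) for sy in (1, -1)]

def axial_candidates(lon, lat, step=0.00001):
    """Four points at 0, 90, 180 and 270 degrees around the location."""
    return [(lon + step, lat), (lon, lat + step),
            (lon - step, lat), (lon, lat - step)]

def allocate_ccd(lon, lat, find_ccd, rng=random):
    """Pick a random diagonal candidate; fall back to an axial one if it fails."""
    for make_candidates in (diagonal_candidates, axial_candidates):
        point = rng.choice(make_candidates(lon, lat))
        ccd = find_ccd(*point)
        if ccd is not None:
            return ccd
    return None

# Invented stand-in overlay: streets along the axes, one CCD per quadrant.
def quadrant_ccd(lon, lat):
    if lon == 0 or lat == 0:
        return None  # the point lies on a street centre line
    return {(True, True): 1, (False, True): 2,
            (False, False): 3, (True, False): 4}[(lon > 0, lat > 0)]

print(allocate_ccd(0.0, 0.0, quadrant_ccd))  # one of 1, 2, 3 or 4, at random
```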

Geocoding landmarks

It was allowable in the survey for respondents to nominate a landmark as a destination address. Examples of landmarks include the name of a restaurant, a school, a bank, a government office, a shopping centre, a park, a beach, etc. To be effective as a valid address, a landmark has to be qualified to identify it uniquely from all others with a similar name. A bank, for example, needs to have the branch (usually a suburb) appended to its name.

The geocoding of landmarks was done by one of several means. Firstly, it was anticipated that many landmarks would be able to be geocoded by searching a database of landmark names with geocodes, supplied with MapInfo. Unfortunately, the original landmarks database supplied with MapInfo only contained railway stations as landmarks. Because of the limited scope of the original landmarks file, it was necessary to create a new landmarks file containing a wider variety of landmark types. This was done by compiling information from various sources such as telephone books and street directories on a variety of landmarks, such as:

- schools, pre-schools and childcare centres
- universities and colleges
- shopping centres
- food outlets
- sporting centres
- places of interest
- parks, ovals and reserves
- caravan parks
- hospitals
- ambulance stations
- police stations
- fire stations
- churches
- Guide and Scout halls
- Masonic Centres
- bus and airline offices
- ferry terminals
- post offices
- public libraries
- council offices
- bays and beaches
- boat ramps
- theatres and cinemas
- hotels and motels
- commercial buildings
- racecourses
- TAB agencies
- golf courses
- bowling centres
- swimming pools

For each of these landmarks, an equivalent full street address or cross-street address was determined manually from these printed sources, and then the geocoding methods for full street addresses and cross-street addresses (described earlier) were used to generate the geocodes.

Not all landmarks were easily identified by a street address. Finding an address for a landmark posed a problem in cases where one was not available and/or the area covered by the landmark was large (e.g. beaches and parks). For such large areas, the area centroid may represent a more appropriate definition of the location to be used as a geocode. Centroids of areas could be marked and geocoded in a MapInfo map, but this process proved to be laborious. An alternative geocoding method was, therefore, developed.

The alternative method involved the development of a computer program that generates a geocode given a map reference from a street directory. An example of a map reference is "A 4 15" where "A" and "4" are row and column references respectively while "15" is a map number in the street directory. A map reference may also be specified as a fraction for a more precise specification, as in "B.2 6.3 48A", where "B.2" refers to a point which is 20% of the way between rows B and C, "6.3" refers to a point which is 30% of the way between columns 6 and 7, and "48A" refers to map 48A.
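The conversion from a map reference to a coordinate can be sketched as follows. This is a minimal illustration only, not the program used in SEQHTS: the `MapSheet` record, its field names, and the assumption that each map is a regular grid whose row A and column 1 positions are known in projected coordinates are all hypothetical.

```python
import re
from dataclasses import dataclass


def parse_map_reference(ref: str) -> tuple[float, float, str]:
    """Split a reference such as "A 4 15" or "B.2 6.3 48A" into
    (row, column, map_id), with rows counted from A = 0."""
    row_part, col_part, map_id = ref.split()
    match = re.fullmatch(r"([A-Za-z])(?:\.(\d+))?", row_part)
    if match is None:
        raise ValueError(f"unrecognised row reference: {row_part!r}")
    letter, fraction = match.groups()
    row = float(ord(letter.upper()) - ord("A"))
    if fraction:
        row += float("0." + fraction)  # "B.2" becomes 1.2
    return row, float(col_part), map_id


@dataclass
class MapSheet:
    """Hypothetical digitised description of one street directory map."""
    west: float         # x coordinate of column 1
    north: float        # y coordinate of row A
    cell_width: float   # spacing between adjacent columns
    cell_height: float  # spacing between adjacent rows


def reference_to_xy(ref: str, sheets: dict[str, MapSheet]) -> tuple[float, float]:
    """Interpolate a coordinate within the map named in the reference."""
    row, column, map_id = parse_map_reference(ref)
    sheet = sheets[map_id]
    x = sheet.west + (column - 1.0) * sheet.cell_width
    y = sheet.north - row * sheet.cell_height  # rows run down the page
    return x, y
```

Whether an unqualified reference such as "A 4" should denote the corner or the centre of a grid cell is not stated in the text; in this sketch that choice is absorbed into the `west` and `north` values supplied for each map.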

The task of assigning map references to landmarks was made less taxing by having somebody who was knowledgeable about the study area do the work. In addition, some data entry personnel entered map references as part of the destination address on a number of occasions.

The street directory map reference method was used extensively in the SEQHTS survey to geocode full street addresses and cross-street addresses that failed to obtain a geocode in their respective methods, primarily because the street networks in that area were missing from the MapInfo electronic reference maps.


This method worked well where the address could be positively located in the street directory maps. However, even for a full street address, the task of identifying the exact location using the street number was sometimes difficult, especially when the street directory maps did not show street numbers.

The accuracy of the geocodes obtained using this method, however, depends greatly on the accuracy of the street directory maps and the accuracy with which the maps had been digitised into the computer files. Accuracy may be verified by mapping, in MapInfo, a sample of geocodes obtained using this method for each street directory map. This is important to determine the real position of each geocoding method in the hierarchy of geocoding methods.

Geocoding by sampling

Addresses provided by respondents were not always complete. Some respondents intentionally omitted street numbers or just indicated their suburb or locality - probably for privacy reasons. The approach that was used to geocode these cases in the SEQHTS survey was to sample a point along the length of the street, if a street name was given, or to sample a point within a suburb, if a suburb was all that was available.

A long or winding street in a MapInfo map is divided into short segments, usually at street intersections and where it bends or changes direction. Sampling a point along a street therefore consists of gathering all the segments belonging to the given street within a given area (suburb or postcode), then randomly selecting which segment to use (segments may be assigned relative weights based on their lengths), and then sampling a point along the selected street segment. Sampling a point within an area (suburb or postcode) also followed this procedure, with the added step of first randomly selecting a street from among the streets within the area.

In addition, the selection of the side of the street was also randomised, and the sampled point was then offset transversely from the street by about 10 metres. This was felt to be necessary as the lines defining the streets on a MapInfo map represent the centre lines of the streets and thus an adjustment had to be made to account for the street width. This adjustment was required because CCD boundaries also follow the centre lines of streets, and this method minimised the incidence of locations falling on the boundaries between adjacent CCDs. The offset of 10 metres is consistent with the way in which MapInfo geocodes full street addresses, as shown in Figure 8.3. This practice, however, resulted in some geocodes "spilling out" of boundary files or onto water areas when the street segment was near a river bank or beach. These occurrences were corrected manually, after visually examining a plot of the geocoded points.


As in geocoding of full street addresses and cross-street addresses, progressively larger boundaries were used when the given street could not be found within the given suburb boundary.

The methods belonging to this last category of geocoding were all implemented outside of MapInfo using specially written program modules, but using the reference maps provided with MapInfo. Locations in other parts of Queensland, in other States, and overseas were not geocoded, but were assigned a pseudo-CCD code to assist in identifying their location.

The methods described above gave rise to a range of geocoding methods, which were recorded in the respective data files using the following codes:

Code  Geocoding Method
10    "full address, exact match on suburb"
11    "full address, exact match on postcode"
12    "full address, exact match on nearest 8 suburbs"
13    "full address, exact match on nearest 8 postcodes"
14    "interactive matching"
20    "cross-streets, exact match on suburb"
21    "cross-streets, multiple exact matches on suburb"
22    "cross-streets, exact match on postcode"
23    "cross-streets, multiple exact matches on postcode"
24    "cross-streets, exact match on nearest 8 suburbs"
25    "cross-streets, multiple exact matches on nearest 8 suburbs"
26    "cross-streets, exact match on nearest 8 postcodes"
27    "cross-streets, multiple exact matches on nearest 8 postcodes"
30    "landmark, exact match using MapInfo landmarks"
31    "landmark, with equivalent full address"
32    "landmark, with equivalent cross-streets"
33    "landmark, exact match using UBD Refidex landmarks"
40    "sampling along a street, within a suburb"
41    "sampling along a street, within a postcode"
42    "sampling along a street, within nearest 8 suburbs"
43    "sampling along a street, within nearest 8 postcodes"
50    "sampling of street, within suburb"
51    "sampling of street, within postcode"
60    "not geocoded, but pseudo-SLA coded"

8.2.4 Coding Administration

The process of coding needs to be carefully supervised if high quality data transmission is to be obtained from questionnaire to computer. The process really consists of two phases, coding and data entry, although in many surveys these will be performed simultaneously by the same person. In other surveys, these tasks will be deliberately separated, with a team of coders deciding on and allocating codes to the questions, and a team of high-speed typists entering data into the computer. The following comments apply more to the coders than the typists.

A decision which needs to be made before coding begins is whether data entry is to be independently verified. This verification is designed to detect errors which are purely typing errors and which might otherwise be undetectable. Verification is performed by having a second typist type in every form which has already been typed in and then having the computer check the second typed entry against the first typed entry. If any discrepancies are noted, an error message is printed so that a check can be made against the coding form or, if necessary, the original questionnaire.

The advantage of verification is that it reduces typing errors to virtually zero. The error rate for experienced typists is in the order of 2%. If verification is used, the error rate can be reduced to 0.04%. The disadvantage of verification is that it obviously doubles the cost of data entry. It should also be realised that many of the typing errors will be picked up by other editing checks at a later stage. The only errors which will not be detected are those which do not go out of range or cause a logical error. The decision as to whether to verify will depend on how serious these types of errors are seen to be to the overall survey analysis.
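The computer check of the second typed entry against the first can be sketched as a field-by-field comparison. The record layout (a dictionary of fields keyed by questionnaire identification number) and the function name are assumptions made for illustration. The final lines show one reading that is consistent with the quoted figures: if two typists each err on 2% of items and their errors are independent (an assumption not stated in the text), both will be wrong on the same item about 0.02 x 0.02 = 0.04% of the time.

```python
def find_discrepancies(first_entry: dict, second_entry: dict) -> list[tuple]:
    """Compare two independent keyings of the same forms.

    Each argument maps a questionnaire ID to a dict of field values.
    Returns (questionnaire_id, field, first_value, second_value) for
    every item on which the two keyings differ.
    """
    problems = []
    for form_id in sorted(set(first_entry) | set(second_entry)):
        a = first_entry.get(form_id, {})
        b = second_entry.get(form_id, {})
        for field in sorted(set(a) | set(b)):
            if a.get(field) != b.get(field):
                problems.append((form_id, field, a.get(field), b.get(field)))
    return problems


# Chance that both typists are wrong on the same item, assuming
# independent errors at the 2% rate quoted in the text.
single_error_rate = 0.02
both_wrong_rate = single_error_rate ** 2   # 0.0004, i.e. 0.04%
```

Each discrepancy reported by such a check would then be resolved against the coding form or the original questionnaire, as described above.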

The recruitment and training of coders follows a similar procedure to that outlined earlier for interviewers. The main difference is in the skills required. The principal requirement for coders is that they possess good clerical skills, clear handwriting and plenty of patience. Note that these skills are not the same as those required for interviewers. Therefore good interviewers will not necessarily be good coders, and vice versa. As with interviewers, however, the major training for coders should consist of practice coding. A useful technique is to have all coders code the same set of questionnaires which have been chosen, or artificially constructed, to illustrate many of the different types of problems which the coders are likely to encounter.

In production coding, a decision must be made as to the order in which questions are to be coded from the questionnaire. With simple questionnaires it is easiest if all questions are coded in the order in which they appear on the questionnaire, letting the computer rearrange the items into logical units of information. For more complex questionnaires with several difficult questions to code, it is worthwhile employing some special coders whose job it is to code the more difficult questions such as location coding and the coding of open questions. These coders may mark their codes directly on the questionnaire forms and leave it to the general coders to transfer these codes to the coding sheet for typing. The advantage of special coders is that the variability between coders is reduced and the productivity is increased, since the special coders quickly develop a feeling for their task and are able to devise short-cuts to help in the task. With the use of computer terminals for data entry, special coders can also enter their codes directly into the computer by accessing the respondent's record by means of the unique identification number. They then leave it to the general coders to access the record at a later date to enter the remainder of the codes.

It is useful in production coding for the supervisor to perform some check-coding on a sample of the coded questionnaires. This quality control procedure enables errors to be detected in an individual coder's work, or enables detection of systematic errors by many coders which may indicate a deficiency in the coding frame or the coding instructions. Typically, the sampling rate for check-coding should be quite high at the beginning of the coding process, and should reduce to random spot-checks when most problems in coding appear to have been rectified. If check-coding detects deficiencies in the coding frame (such as not enough categories for a particular question), then it is essential that the supervisor be the one who issues the amendments to the coding frame. Individual coders should not be allowed to add extra codes as they see fit; certainly they should be encouraged to suggest changes to the supervisor, but it is the supervisor's job to issue changes.

Despite all the best intentions and preparations, coding is rarely error-free. Two principal sources of error are coder variability and coder unreliability. Coder variability is a measure of the discrepancy in codes assigned to the same response by two different coders, whereas coder reliability is a measure of the discrepancy in codes assigned by the same coder at two different points in time. In a study reported by Durbin and Stuart (1954) it was shown that although coder variability was greater than coder unreliability, both were a cause for some concern. Significantly, it was shown that both were more of a problem with more complex questions. The implication of this is that the coding task should be kept as simple as possible, with all editing and data transformations being performed by the computer after data have been entered.
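Both quantities can be monitored during check-coding with a simple disagreement rate. The text does not say how Durbin and Stuart measured discrepancy, so the proportion of differing codes used below is an assumed, illustrative measure, and the function name is hypothetical.

```python
def disagreement_rate(codes_a: list, codes_b: list) -> float:
    """Proportion of responses given different codes in two coding passes.

    For coder variability, pass the codes assigned by two different coders
    to the same responses. For coder (un)reliability, pass the codes
    assigned by one coder to the same responses on two occasions.
    """
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both passes must code the same non-empty set of responses")
    differing = sum(a != b for a, b in zip(codes_a, codes_b))
    return differing / len(codes_a)
```

For example, `disagreement_rate([1, 2, 3, 4], [1, 2, 4, 4])` gives 0.25, since one of the four responses was coded differently.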

8.3 COMPUTER-BASED DATA ENTRY AND EDITING

8.3.1 Data Entry

An increasingly popular method of data entry is the use of commercially available spreadsheet and database programs (such as Microsoft Excel, Lotus 1-2-3, dBase, and FoxBASE/FoxPro). These generic programs provide most of the facilities for data entry, editing and basic analysis. The concepts underlying spreadsheets and database programs are, however, subtly different.

A spreadsheet, such as Excel, subdivides the computer screen (and beyond) into a series of intersecting rows and columns. Each cell so created can hold an item of information, where that item may be text, a number, or an equation which operates on the contents of other cells. Each cell is uniquely identified by the name of the intersecting row and column. In using a spreadsheet for data entry from a questionnaire, two options are available. First, the columns can represent data items for each questionnaire, while the rows represent different respondents to the questionnaire. Second, the screen can be set up as a replicate of the actual questionnaire form, and the coder then simply copies the data from the actual form to the replicate on the screen.

An example of the use of spreadsheets is given below. Suppose that a sample questionnaire such as that shown in Figure 8.8 is to be coded. Using the first spreadsheet method, the screen might appear as shown in Figure 8.9, whereas the second method would result in a screen such as that shown in Figure 8.10. Note that it is possible to eliminate the row and column gridlines and the row and column headings from the screen display, so that the data entry form looks like a normal sheet of paper.

HOUSEHOLD FORM

A household consists of all persons who live together and share the same mailing address.

How many of the following vehicles are in this household? (Please include all vehicles usually kept here overnight)
    Cars & Station Wagons
    Vans & Pickups
    Bicycles
    Motor Cycles
    Others (please specify type ........................)

Do you have a telephone in this household?
    Yes
    No
    Phone No.

How many persons (including yourself) live in this household?
    Of these people: How many are less than 10 years old?
                     How many are 10 years or older?

What type of dwelling is occupied by this household?
    Single-family Detached
    Apartment
    Condominium / Townhouse
    Mobile Home
    Other (please specify type .....................................)

Figure 8.8 Sample Questionnaire to be Coded


Figure 8.9 A Simple Spreadsheet Layout

The use of a database for data entry has some similar features in that the screen can be set up with either row and column entries, or with a replicate of the questionnaire. The main difference between a database and a spreadsheet is that a spreadsheet identifies a data item by its position within the spreadsheet (in terms of row and column numbers), while a database identifies a data item by means of a field name and a record number. Another difference lies in the way that data items are normally displayed; a spreadsheet displays data for several respondents in the form of a matrix, whereas a database often displays the contents of each field for one record at a time (although this is not necessary).


Figure 8.10 A Replicate Spreadsheet Layout

Whether you choose to use a spreadsheet or a database for data entry and editing will depend on the availability of the two types of program and on the structure and size of the survey data being coded. There are two essentially different types of data structure encountered in typical transport surveys: flat-file databases and relational databases. A flat-file database is often used with relatively simple surveys where the data from each respondent to the survey is stored in one record within one data file. A relational database is used when there is a natural hierarchical nesting to the data obtained from one respondent, or group of respondents. For example, in many transport surveys, travel data are collected from households. There is a set of data describing the household itself, some data describing each member of the household, and some more data describing each trip made by each member of the household. While it would be possible to store the data describing the person and the household with the data for each trip, this would be very wasteful of storage space since the household and person data would be repeated many times. A more efficient method is to store the household data in one file, the person data in a second file, and the trip data in a third file, and then to establish "linkages" between the three files. These linkages or "relationships" form the basis of a "relational database". The concept of a relational database for travel survey data is depicted graphically in Figure 8.11.

[Figure 8.11 shows three linked files: HOUSEHOLD FILE, PERSON FILE and TRIP FILE]

Figure 8.11 A Relational Database for Travel Survey Data.


The household file is always the shortest of these files while the trip file is always the longest (in terms of number of records). In Figure 8.11, the relationships between the files are shown by the connecting lines. Thus the information about the people in the first household is stored in the 3rd and 4th records of the person file, while the information about the people in the third household is stored in the 8th, 9th and 10th records of the person file. For these three people, the trips for the first person are stored in the 11th, 12th and 13th records of the trip file, the second of these people made no trips, while the trips for the third person are stored in the 14th, 15th, 16th and 17th records of the trip file. In order to establish these relationships, both the household and person files must contain a field which uniquely identifies that household or person in the appropriate data file. Each record in the trip file then contains a "pointer" which uniquely links each trip with one and only one person in the person file. In turn, each record in the person file contains a pointer back to the household file. By storing only a pointer back to the person and/or household in the trip file, rather than the full details of the person or household, substantial savings in storage space can be achieved.
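The pointer arrangement can be illustrated with three small tables held as Python lists. The field names (`hh_id`, `person_id`, and so on) and the data values are hypothetical; the point of the sketch is only that each trip stores a person identifier and each person stores a household identifier, rather than repeating the full person and household details.

```python
households = [
    {"hh_id": 1, "hh_size": 2, "vehicles": 1},
    {"hh_id": 2, "hh_size": 1, "vehicles": 0},
]
persons = [
    {"person_id": 11, "hh_id": 1, "birth_year": 60},
    {"person_id": 12, "hh_id": 1, "birth_year": 62},  # made no trips
    {"person_id": 21, "hh_id": 2, "birth_year": 45},
]
trips = [
    {"trip_id": 1, "person_id": 11, "mode": "car driver"},
    {"trip_id": 2, "person_id": 11, "mode": "walk"},
    {"trip_id": 3, "person_id": 21, "mode": "bus"},
]

# Lookup tables keyed on the unique identifying field of each file.
household_by_id = {h["hh_id"]: h for h in households}
person_by_id = {p["person_id"]: p for p in persons}


def person_of_trip(trip: dict) -> dict:
    """Follow the trip's pointer to its one and only person record."""
    return person_by_id[trip["person_id"]]


def household_of_trip(trip: dict) -> dict:
    """Follow trip -> person -> household."""
    return household_by_id[person_of_trip(trip)["hh_id"]]
```

Here the trip file is the longest and the household file the shortest, as in the text, and household details are stored once no matter how many trips its members make.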

As an example of a relational database for travel surveys, consider the data illustrated in Figure 8.12. Because of the structure of the data, and the size of the database, it makes sense to store it in a relational database, rather than a flatfile database. In FoxBASE, an individual database is set up in a work area as represented by the circles in Figure 8.12. Within each of the ten work areas, a different database can be opened. In the case shown in Figure 8.12, the household data file has been opened in work area A, the person file in work area I, and the trip file in work area G.

The databases in the work areas can be related as shown by the arrows connecting the work area circles. Thus the trip file is related to the person file (i.e. extra information describing each trip is contained in the person file), and the person file is related to the household file. By inference, therefore, the trip file is also related to the household file. To create these relationships, a pointer variable must be common between the two files being related. In this case, the trip file contains a variable identifying the person making these trips, while the person file contains a variable identifying the household to which the person belongs. In addition, the file to which the relationship arrow points must be "indexed" on the pointer variable. "Indexing" a file is similar to sorting the file on the basis of that variable; however, whereas sorting rearranges the records in the file, indexing creates another index file which stores the order the records would have if they had been sorted on that variable. Indexing essentially provides a fast way of finding records based on the indexing variable. It can be seen that the person and household files have been indexed in Figure 8.12, by means of the "index finger" pointing to those work area circles.
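The distinction between sorting and indexing can be sketched as follows. This is a generic illustration of the idea, not a description of FoxBASE's index file format; the function names are hypothetical. The index is a separate structure holding record positions in key order, so the data file itself is never rearranged, and a binary search on the index finds a record quickly.

```python
import bisect


def build_index(records: list[dict], key: str) -> tuple[list, list[int]]:
    """Return (sorted key values, record positions in that order).

    The records themselves are left in their original order.
    """
    order = sorted(range(len(records)), key=lambda i: records[i][key])
    keys = [records[i][key] for i in order]
    return keys, order


def lookup(records: list[dict], index: tuple[list, list[int]], value):
    """Binary-search the index and return the first matching record."""
    keys, order = index
    pos = bisect.bisect_left(keys, value)
    if pos < len(keys) and keys[pos] == value:
        return records[order[pos]]
    return None
```

A person file held in entry order could be indexed once on its person identifier; each trip record's pointer is then resolved with `lookup` without scanning the whole person file.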


Figure 8.12 A Relational Database Created in FoxBASE.

One of the advantages of relating database files is that one can now identify to which person and household a particular trip belongs. For example, by clicking on the fourth row of the trip file (as shown in Figure 8.13 by the highlighted cell) and then clicking anywhere in the person file window, the person to whom that trip belongs will immediately be highlighted (as shown by the highlighted cell for person 3 in Figure 8.14).

Figure 8.13 Selection of a Trip Record in the Trip File

Figure 8.14 Selection of Corresponding Person Record in the Person File


Figure 8.15 Selection of the Corresponding Household Record.

By clicking anywhere in the household file window, the household to which that trip and person belong will also be immediately highlighted (as shown by the highlighted cell for household 8 in Figure 8.15).

The ability to quickly find related trips, persons and households is very useful when coding and editing the data. However, a relational database is not the ideal format for performing statistical analysis when you wish to test for relationships between variables in different files (e.g. trying to construct a modal choice model based on trip, personal and household characteristics). For these types of activities, we need to convert the relational database structure back into a flatfile database (probably for use with a different statistical analysis program, since database programs are not particularly well-suited for multivariate statistical analysis).

To make this conversion, it will be necessary to copy some of the variables from the trip, person and household files to another file (which is a flatfile). This is done simply by specifying the name of the new file, and by selecting the variables from the three files that you want to copy across to the new file. Because of the relational database structure, FoxBASE will know which person and household record to go to when you select a variable from either of those files. Figure 8.16 shows a flatfile which has been created for the purpose of statistical analysis (to be described in Chapter 10), and which has subsequently been assigned to a new work area in FoxBASE. Whilst this file contains as many records as there are persons (in this case, 904), it contains only eleven variables (including household and person identifiers) instead of the entire set of 38 variables for all three databases. It is therefore possible to work more quickly and with lower memory requirements when analysing this data set.
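A conversion of this kind can be sketched as a join that writes one flat record per person. The field names and sample data are hypothetical, and the way trip information is condensed to one value per person (here, a simple trip count) is an assumption, since the text does not say which trip variables were carried into the flatfile.

```python
from collections import Counter


def build_flatfile(households, persons, trips, household_fields, person_fields):
    """Return one flat record per person, copying only the selected
    household and person variables and adding a per-person trip count."""
    household_by_id = {h["hh_id"]: h for h in households}
    trips_per_person = Counter(t["person_id"] for t in trips)

    flat = []
    for p in persons:
        h = household_by_id[p["hh_id"]]          # follow the pointer
        row = {"hh_id": p["hh_id"], "person_id": p["person_id"]}
        row.update({f: p[f] for f in person_fields})
        row.update({f: h[f] for f in household_fields})
        row["n_trips"] = trips_per_person.get(p["person_id"], 0)
        flat.append(row)
    return flat


households = [{"hh_id": 1, "hh_size": 2, "vehicles": 1, "dwelling": "house"}]
persons = [
    {"person_id": 11, "hh_id": 1, "birth_year": 60, "licence": True},
    {"person_id": 12, "hh_id": 1, "birth_year": 62, "licence": False},
]
trips = [
    {"trip_id": 1, "person_id": 11, "mode": "car driver"},
    {"trip_id": 2, "person_id": 11, "mode": "walk"},
]

flatfile = build_flatfile(households, persons, trips,
                          household_fields=["vehicles"],
                          person_fields=["birth_year", "licence"])
```

The resulting file has as many records as there are persons and carries only the variables selected for analysis, which is what keeps it compact.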


Figure 8.16 A Flatfile Created for Statistical Analysis

8.3.2 Data Editing

Once the data have been coded and entered into the computer, the major task of editing can begin. The editing phase of the survey process is perhaps the most boring, but it is also one of the most important tasks. Most survey designers would admit that more time and effort goes into the editing task than almost any of the other tasks; and such effort is worthwhile. It is useless to proceed straight into analysis hoping that the data are free from error; there will always be errors in the data as initially coded. This error arises from several sources: the respondent, the interviewer, the coder and the typist. The errors may be genuine errors in judgement or reporting, or may arise from problems in legibility. Some of these errors may be detectable during editing and some of them may be able to be corrected.

The main editing technique for the detection of errors is the simple process of tabulation and cross-tabulation. The construction of frequency distributions, the calculation of means, standard deviations and Z-statistics, and the plotting of the data by computer may also assist in detecting outliers in the data. The three major problems which may be detected during editing are permissible range errors, logical inconsistencies and missing data.
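A frequency tabulation and a Z-statistic screen of the kind mentioned above can be sketched with the standard library. The threshold of 3 standard deviations is an assumed, conventional choice rather than a value given in the text, and the function names are hypothetical.

```python
from collections import Counter
from statistics import mean, stdev


def frequency_table(values: list) -> dict:
    """Count how often each code occurs, in sorted code order."""
    return dict(sorted(Counter(values).items()))


def flag_outliers(values: list[float], threshold: float = 3.0) -> list[tuple]:
    """Return (position, value, z) for values whose Z-statistic,
    (value - mean) / standard deviation, exceeds the threshold."""
    m, s = mean(values), stdev(values)
    if s == 0:
        return []
    flagged = []
    for i, v in enumerate(values):
        z = (v - m) / s
        if abs(z) > threshold:
            flagged.append((i, v, z))
    return flagged
```

A tabulation of a YES/NO question coded 1 and 2 that returns a count for code 4 points directly to a permissible range error, while `flag_outliers` applied to, say, reported household sizes picks out entries such as 40 that deserve a check against the questionnaire.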

Permissible range errors:

Typing and recording errors may result in obvious errors where the code value is outside the range of codes permissible for that response, e.g. a code of 4 for a YES/NO question. In many cases, this type of error can be corrected by referring to the original questionnaire or coding sheet where, very often, a mistake has been made in transcription because of poor legibility. In other cases, a misunderstanding as to the units to be used in the response will cause the answer to be outside the allowable range.

Permissible range error checks are all within a single data file. Examples of range error checks performed in the 1992 SEQHTS survey (Richardson and Ampt, 1993a) include:

Household Form

• Household size cannot be 0 when number of visitors is 0 (error)
• Household size is usually not more than 10 (warning)
• Household size cannot be negative (error)
• Number of visitors is usually not more than 10 (warning)
• Number of visitors cannot be negative (error)
• Number of vehicles should equal sum of all vehicle types (error)
• Number of vehicles is usually not more than 10 (warning)

Person Form

• Number of person records cannot be more than household size (incl. visitors) on household form (error)

• Birth year should not be more than 92 (warning)

Vehicle Form

• Number of vehicle records cannot be more than number of registered vehicles on household form (error)

Stop Form

• Arrival time cannot be less than 0400 (error)
• Departure time cannot be less than arrival time for stop (error)
• Arrival time cannot be less than departure time of previous stop (error)
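Checks such as those listed above are straightforward to automate. The sketch below implements the Household Form and Stop Form checks; the record layout, field names and message wording are hypothetical. Times are assumed to be held as HHMM integers that increase through the travel day, with times after midnight coded above 2400, which is an assumption not stated in the text.

```python
def check_household(rec: dict) -> list[tuple[str, str]]:
    """Return (severity, message) pairs for one household record."""
    issues = []
    if rec["hh_size"] < 0:
        issues.append(("error", "household size is negative"))
    if rec["visitors"] < 0:
        issues.append(("error", "number of visitors is negative"))
    if rec["hh_size"] == 0 and rec["visitors"] == 0:
        issues.append(("error", "household size and visitors are both 0"))
    if rec["hh_size"] > 10:
        issues.append(("warning", "household size is more than 10"))
    if rec["visitors"] > 10:
        issues.append(("warning", "number of visitors is more than 10"))
    if rec["vehicles"] != sum(rec["vehicles_by_type"].values()):
        issues.append(("error", "vehicles does not equal sum of vehicle types"))
    if rec["vehicles"] > 10:
        issues.append(("warning", "number of vehicles is more than 10"))
    return issues


def check_stops(stops: list[tuple[int, int]]) -> list[tuple[int, str]]:
    """Check one person's stops, given as (arrival, departure) HHMM pairs
    in travel order. Returns (stop position, message) for each error."""
    issues = []
    previous_departure = None
    for i, (arrival, departure) in enumerate(stops):
        if arrival < 400:
            issues.append((i, "arrival before 0400"))
        if departure < arrival:
            issues.append((i, "departure before arrival"))
        if previous_departure is not None and arrival < previous_departure:
            issues.append((i, "arrival before departure from previous stop"))
        previous_departure = departure
    return issues
```

Separating errors from warnings, as the list does, lets records with errors be held back for correction while records with warnings are merely queued for a manual look.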

Logic checks:

Cross-tabulations will often reveal logical inconsistencies in the data, e.g. a household with three people having four driving licence holders. Often, checking with the original questionnaire will show that one of the responses has been transcribed incorrectly and can be easily corrected. If both responses have been coded as they appear on the questionnaire then it is obvious that an error has been made in recording one of the responses. To determine which one is in error it is necessary to check other responses for that respondent to see which of the two responses is most likely to have been correct. For example, closer checking of the questionnaire, in the above example, may reveal that only two of the household members were of licence holding age and two cars were owned by the household. It would therefore appear reasonable to recode the number of licence holders as two. In other cases, a logical inconsistency, while not being impossible as in the above example, may be highly improbable, e.g. a low-income two-person household owning five cars. Again the questionnaire should be checked to determine whether responses to other questions indicate that one of the responses is in error. In both cases, if no evidence can be found that one of the responses is in error, it is best to leave the responses as they stand.

Logic checks are cross-tabulation checks, sometimes within one file and sometimes across more than one file. Examples of logic checks performed in the SEQHTS survey (Richardson and Ampt, 1993a) include:

Within Stop File

• The last trip of the day should normally be to home. Check other destinations.

• Trips with home as the origin and destination should be checked. Usually indicates a missing stop record. The same applies to any other location (not purpose or place) when it appears as both the origin and destination (e.g. "my workplace" to "my workplace" without an intermediate destination).

• Check for combinations of destination place and purpose (see section 2.5.2 for a full description)

• If the mode is public transport, then the destination place should normally be a terminal for that mode (e.g. a bus trip to a bus stop).

• Trips with very high speeds (for the mode concerned) should be checked.

• All trips of more than 2 hours duration should be checked.

• Walk trips of more than 1 hour duration should be checked.

Within Person File

• Check year of birth against driver's licence (should not be less than 17)

• Check year of birth against full-time employment (investigate if less than 15)

• Check year of birth against retired/old age pension (investigate if less than 55)

• Check year of birth against preschool/childcare (investigate if more than 6 years old)

• Check year of birth against primary school (should normally be between 4 and 13 years old)

• Check year of birth against secondary school (should normally be between 12 and 20 years old)

• Check year of birth against university/college (should normally be greater than 16 years old)

• Compare entries in employment, studying and other activities fields. Should normally be at least one valid entry in one of the fields; if so, other fields should be coded as "not applicable"; if not, all three fields should be coded as "missing".

Within Vehicle File

• Check spelling of vehicle makes and models

• Switch make and model name, if necessary (e.g. Falcon Ford should become Ford Falcon)

• Check number of cylinders against make and model.

Between Admin and Household Files

• Those households on the Administration file with a response code corresponding to a valid response should appear on the Household file.

• Conversely, those households on the Administration file without a response code corresponding to a valid response should not appear on the Household file.

Between Household and Person Files

• The number of records in the Person file for a household should correspond to the number of residents and visitors specified on the Household file.

Between Household and Vehicle Files

• The number of registered vehicles on the Household file should agree with the number of vehicles for which information is supplied on the Vehicle file. If vehicle details are missing from the Vehicle file, missing values should be entered.

Between Person and Stop Files

• People without licences in the Person file should not appear as car drivers in the Stop file.

Between Vehicle and Stop Files

• The vehicle number specified for any particular Stop record should correspond to an actual vehicle record on the Vehicle file.
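Several of the between-file checks listed above can be sketched as set and count comparisons on the linking fields. The file layouts and field names below (`residents`, `person_no`, `has_licence`, `vehicle_no` and the mode label "car driver") are hypothetical stand-ins for whatever the survey's coding frame actually uses.

```python
from collections import Counter


def cross_file_checks(households, persons, vehicles, stops) -> list[str]:
    """Return a message for every cross-file inconsistency found."""
    issues = []

    # Household vs Person: person records should match residents + visitors.
    persons_per_hh = Counter(p["hh_id"] for p in persons)
    for h in households:
        expected = h["residents"] + h["visitors"]
        found = persons_per_hh.get(h["hh_id"], 0)
        if found != expected:
            issues.append(f"household {h['hh_id']}: {found} person records, expected {expected}")

    licensed = {(p["hh_id"], p["person_no"]) for p in persons if p["has_licence"]}
    known_vehicles = {(v["hh_id"], v["vehicle_no"]) for v in vehicles}

    for n, s in enumerate(stops):
        # Person vs Stop: unlicensed people should not appear as car drivers.
        if s["mode"] == "car driver" and (s["hh_id"], s["person_no"]) not in licensed:
            issues.append(f"stop {n}: car driver without a licence")
        # Vehicle vs Stop: the vehicle number must exist on the Vehicle file.
        vehicle_no = s.get("vehicle_no")
        if vehicle_no is not None and (s["hh_id"], vehicle_no) not in known_vehicles:
            issues.append(f"stop {n}: vehicle {vehicle_no} not on Vehicle file")
    return issues
```

As the text advises for logic checks in general, each message is a prompt to consult the questionnaire rather than an instruction to change the data automatically.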

Missing data:

Editing checks may reveal two types of missing data: firstly where complete interviews are missing, and secondly where responses to particular items are missing. With missing interviews (i.e. non-response) it is advisable to apply differential weighting to the survey results, if something is known about the non-respondents, so that the sample results more closely represent the population results. Various methods of using adjustments for non-response are described by Kish (1965) for general surveys, and are covered in greater detail in the next chapter.

A more common problem of missing data relates to responses missing from particular questions. In this case, three options are available:

(i) Ignore the missing values and report the distribution of responses in terms of the total number of responses received;

(ii) Report the percentage missing for each question so that results are based on the total number of completed questionnaires; or

(iii) Attempt to estimate a probable value of the missing datum, using information contained in the responses to other questions.

As an example of the treatment of missing data, consider responses to a question on income. This question typically has a higher-than-average number of missing data values. If these missing values were distributed evenly across the range of income values, no particular problems would arise in the calculation of average income. However it has been observed in many surveys that higher non-reporting of income occurs with higher income respondents (e.g. Ministry of Transport Victoria, 1981). Therefore estimates of the average income based only on the data obtained from respondents will systematically underestimate the average income. More importantly, the difference in income between different areas will appear to be less than it really is, if options (i) or (ii) are used for treating the missing data.
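A small numerical illustration shows both effects. All figures are invented for the purpose: two areas of ten households each, incomes of either 30 or 80 (in arbitrary units), full reporting by low-income households and 50% reporting by high-income households.

```python
def true_mean(n_low: int, n_high: int, low: float = 30.0, high: float = 80.0) -> float:
    """Average income of all households in the area."""
    return (n_low * low + n_high * high) / (n_low + n_high)


def observed_mean(n_low: int, n_high: int, low: float = 30.0, high: float = 80.0,
                  high_response: float = 0.5) -> float:
    """Average income among those who report it, when every low-income
    household answers but only a share of high-income households do."""
    reporting_high = n_high * high_response
    return (n_low * low + reporting_high * high) / (n_low + reporting_high)


# Hypothetical areas: (low-income households, high-income households).
area_a = (8, 2)   # mostly low income
area_b = (2, 8)   # mostly high income

true_gap = true_mean(*area_b) - true_mean(*area_a)
observed_gap = observed_mean(*area_b) - observed_mean(*area_a)
```

With these invented figures the true means are 40 and 70, a gap of 30, whereas the means computed from reported incomes alone are about 35.6 and 63.3, a gap of about 27.8. Both areas are underestimated, and the difference between them is compressed.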

It is possible, however, to estimate what the missing value of income might be for any one respondent by utilising information from responses to other questions (e.g. Bhat, 1994; Loeis and Richardson, 1994). For example, it can be shown that income is correlated with other variables such as car ownership, employment status, occupation, gender and age. Each of these variables is less likely to be missing in the survey responses than income and therefore in most cases an estimate can be made of income. Whilst the resultant estimate is not intended to represent what the income really is, the use of such estimates can markedly improve the validity of parameter estimates. It is better to make such estimates than to perform calculations where there is a high proportion of missing data.
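As a rough illustration of option (iii), the sketch below fills a missing income with the mean income of respondents who share the same values on other questions. This cell-mean approach is only one simple possibility and is not necessarily the method used in the studies cited above; the field names (`cars`, `employed`, `income`) and the data are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def impute_income(records, keys=("cars", "employed")):
    """Fill missing incomes with the mean income of respondents in the same cell.

    A cell is defined by the values of `keys` (hypothetical field names).
    Falls back to the overall mean when a cell has no reported incomes.
    """
    cells = defaultdict(list)
    for rec in records:
        if rec["income"] is not None:
            cells[tuple(rec[k] for k in keys)].append(rec["income"])
    reported = [v for values in cells.values() for v in values]
    overall = mean(reported)  # raises StatisticsError if no income was reported at all

    completed = []
    for rec in records:
        new = dict(rec)
        new["imputed"] = rec["income"] is None  # flag estimates so they can be identified later
        if new["imputed"]:
            cell = cells.get(tuple(rec[k] for k in keys))
            new["income"] = mean(cell) if cell else overall
        completed.append(new)
    return completed


# Hypothetical respondents: income in $'000, None marks a missing answer.
survey = [
    {"cars": 2, "employed": True, "income": 60},
    {"cars": 2, "employed": True, "income": 80},
    {"cars": 2, "employed": True, "income": None},
    {"cars": 0, "employed": False, "income": 20},
    {"cars": 0, "employed": False, "income": None},
]
completed = impute_income(survey)
```

Keeping an `imputed` flag on each record means that analyses can still be run with and without the estimated values.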



9. Weighting & Expansion of Data

This section addresses the issues that arise from the fact that a number of factors will have interfered with obtaining the exact information, both in quality and quantity, from the survey data in spite of the analyst's best efforts to choose the appropriate survey method, to develop the best instrument possible, and to administer and execute the survey meticulously. Why then do we suspect that we will not quite get the information we want, and why should any corrections, adjustments and weightings be necessary?

The answers to these questions fall into a number of categories. After the survey instrument was distributed, many of the analyst's conceptual, theoretical, and logical considerations were up against a test in the real world; namely the behavioural characteristics of the human beings from whom the survey information was to be obtained. And these human beings do not necessarily respond to our request in line with our wishes, expectations, and theories. Some of them were not able to respond to our request, others did not want to cooperate, others responded only partially, others again misunderstood some questions on the survey instrument.

Yet in spite of the less than perfect response that is likely to have occurred, the investigator still wants to, and has to, use the data to obtain information that is relevant for the survey population and not just for the subsample of people that responded "perfectly". It should be remembered here from the discussion of sampling theory in Chapter 4 that the original intent, based on this theory, was to develop population estimates on the basis of a carefully selected sample of that population. Unfortunately, in virtually all surveys, the population estimates have to be derived on the basis of a response of less than one hundred percent, in most instances from substantially less than this ideal target.

The purpose of this chapter then is to make the analyst aware of both the likely reasons for, and the consequences of, having to deal with only a subset of the desired sample. An awareness of these reasons and, particularly, of the effects of an imperfect response rate, can go a long way towards understanding the limitations of the survey results, the likely magnitude, direction, and implications of any biases resulting from them, and towards the development of any adjustments and compensating measures that might be possible.

The research literature is full of examples where researchers have concentrated their efforts exclusively on the intellectually more challenging and satisfying exercise of developing sophisticated mathematical models without proper attention to the quality of the data that they use to validate these models. However, as will be explained in Chapter 10, there is a tradeoff between data quality and sophistication of the modelling process. Without a proper knowledge of the characteristics of the dataset used, it is almost impossible to draw proper conclusions about the quality of such models, since the source of the problem could lie either in the data base or the model itself.

The other area where the improper use of survey subsample information can lead to disastrous results is in the area of simple statistical information and conclusions derived from sample information. Given the multitude of surveys conducted every day, this area is probably the more serious one, since such statistical information is used daily at all levels of government and in the private sector for short-term and long-term decision-making, investments, and projections.

There are three major sources of systematic error (bias, distortion) in a typical sample survey dataset, namely:

(a) Non-response

(b) Non-reporting

(c) Inaccurate reporting

Non-response pertains to the situation where a household or individual did not provide a response at all, i.e. no survey form was filled out. Non-reporting refers to survey responses where the analyst is in receipt of a survey form on which certain questions have not been answered. Inaccurate reporting describes the cases where the analyst has determined that some of the responses provided on the survey instrument are objectively incorrect, inaccurate, or incomplete.

In order to compensate for these deficiencies in the typical sample survey dataset, a number of "repair" strategies can be pursued:

(a) Editing.

This standard process of "repairing" the survey responses simply eliminates obvious omissions, errors, etc. that can be rectified by objective judgment on the part of the survey administrator and his/her staff. Editing obviously will not address the problem of non-response, and in most cases will not contribute to overcoming the non-reporting issue.

(b) Coding.

The coding process eliminates additional errors and omissions in addition to identifying inconsistencies among the answers given by the respondent. This process does not address the non-response problem (obviously, coding and editing can only take place if the questionnaire was returned).

(c) Weighting Factors.

Socio-economic and statistical adjustments to account for non-observed information.

9.1 POPULATION EXPANSION FACTORS

As stated several times already, the eventual purpose of a sample survey is to be able to draw conclusions about the characteristics and behaviour of the population from which the sample was drawn. If the sample has been selected according to the simple random sampling method described in Chapter 4, then theoretically the results of the sample survey can be expanded back up to the population by multiplying by the inverse of the sampling fraction. For example, if a sample of 100 people has been randomly selected from a population of 1000, and if it has been found that this sample makes a total of 287 trips per day, then the total number of trips made by the population can be inferred to be 2870. However, while the concept of sample expansion is quite simple, the process is rarely as simple as described above, for the following reasons:

(a) Even with a simple random sample, there is no guarantee that the sample is truly representative of the population. Chance random errors will result in some groups within the population being over-represented, while others are under-represented (remember the example of the males and the females in the simple random sample in Chapter 4). If the variable in question (e.g. the number of trips per day) varies systematically across these groups, then simple expansion of the sample results will not necessarily provide good population estimates.

(b) In many situations, we will have used a more complex sampling procedure, some of which (such as variable fraction stratified random sampling) will never produce a sample which is representative of the population, because we have deliberately under- and over-sampled the strata. To obtain population parameter estimates, we need to take explicit account of the manner in which the sample was drawn, and then work backwards to reconstruct the population estimates.

(c) Even if we have accounted for the manner in which the sample has been drawn from the population, and if a perfectly representative sample had been drawn, there is still no guarantee that what we obtain from respondents is what we expected to obtain. For example, not all people will respond to the survey; furthermore, this non-response is unlikely to be evenly distributed across the groups within the population. Thus the distribution of respondents, across various characteristics, is unlikely to be the same as the distribution of the sample across those parameters.

For the above reasons, it is usually necessary to explicitly account for the composition of the respondents before expanding the results to represent the population to which the respondents belong. This explicit recognition is performed by means of population expansion factors, which relate the composition of the respondent group to a known composition of the population. In order to calculate these expansion factors, however, it is necessary to have a secondary source of data describing the population in terms which can also be related to the sample. The most common source of secondary data is a national Census of Population, which provides complete information about the population with respect to key socio-economic variables. Provided that your survey asks these same questions of your respondents (in the same way and using the same response categories), then you can calculate population expansion factors to obtain population parameter estimates.

To give an example of the calculation and use of population expansion factors, consider a survey in which a sample of 1000 people over the age of 15, from a total population of 10000, are surveyed by postal questionnaire about their trip making behaviour. As part of the survey, assume that each individual is asked their age, their sex, and the number of trips they made on a specified day. Assume that the sample was randomly drawn from the population, and that the overall response rate was 40%. The number of responses in each age/sex category is shown in Table 9.1, and the average number of trips per day for each category is shown in Table 9.2. Based on this information we can calculate, by a weighted average, that the average trip-rate in the sample was 3.17 trips per day.

Table 9.1 Responses by Age and Sex

Age-->     15-24   25-34   35-44   45-54   55-64     65+   TOTAL
Male          56      30      14      13      19      20     152
Female        83      65      28      20      21      31     248
TOTAL        139      95      42      33      40      51     400

Table 9.2 Trip-Rates in Responses by Age and Sex

Age-->     15-24   25-34   35-44   45-54   55-64     65+    AVE.
Male        2.14    3.10    4.79    5.85    4.74    3.25    3.36
Female      2.53    1.91    3.50    6.70    4.90    2.81    3.05
AVE.        2.37    2.28    3.93    6.36    4.83    2.98    3.17
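The weighted average quoted in the text can be checked with a short calculation. The sketch below simply weights each category trip rate in Table 9.2 by the number of responses in Table 9.1; the variable and function names are illustrative only.

```python
# Responses (Table 9.1) and trip rates (Table 9.2), ordered by age group:
# 15-24, 25-34, 35-44, 45-54, 55-64, 65+
responses = {
    "Male": [56, 30, 14, 13, 19, 20],
    "Female": [83, 65, 28, 20, 21, 31],
}
trip_rates = {
    "Male": [2.14, 3.10, 4.79, 5.85, 4.74, 3.25],
    "Female": [2.53, 1.91, 3.50, 6.70, 4.90, 2.81],
}


def weighted_trip_rate(responses, trip_rates):
    """Average trips per respondent, weighting each cell by its number of responses."""
    total_trips = sum(
        n * rate
        for sex in responses
        for n, rate in zip(responses[sex], trip_rates[sex])
    )
    total_respondents = sum(sum(counts) for counts in responses.values())
    return total_trips / total_respondents


sample_rate = weighted_trip_rate(responses, trip_rates)
print(round(sample_rate, 2))  # 3.17
```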

While it is known that the overall response rate was 40% (because we received 400 replies to the 1000 questionnaires distributed), we do not know the response rates in the individual categories. To calculate these response rates, we need to know the total number in the population in each category. Suppose that we have a secondary data source which provides the number in the population in each of these categories, as shown in Table 9.3.

Knowing this information, we can now calculate the response rates in each category (assuming that the sample of 1000 was randomly distributed across these categories) by dividing the number of responses in each category by the number in the sample in each category, to obtain the response rates shown in Table 9.4.

Table 9.3 Population Breakdown by Age and Sex

Age-->     15-24   25-34   35-44   45-54   55-64     65+   TOTAL
Male        2040    1130     560     430     380     390    4930
Female      1880    1090     570     470     450     610    5070
TOTAL       3920    2220    1130     900     830    1000   10000


Table 9.4 Response Rates by Age and Sex

Age-->     15-24   25-34   35-44   45-54   55-64     65+   TOTAL
Male       27.5%   26.5%   25.0%   30.2%   50.0%   51.3%   30.8%
Female     44.1%   59.6%   49.1%   42.6%   46.7%   50.8%   48.9%
TOTAL      35.5%   42.8%   37.2%   36.7%   48.2%   51.0%   40.0%
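The response rates in Table 9.4 follow from Tables 9.1 and 9.3 once the sample in each category is assumed to be proportional to the population in that category, as stated in the text. A minimal sketch of that calculation, with illustrative names:

```python
# Age groups in order: 15-24, 25-34, 35-44, 45-54, 55-64, 65+
population = {
    "Male": [2040, 1130, 560, 430, 380, 390],
    "Female": [1880, 1090, 570, 470, 450, 610],
}
responses = {
    "Male": [56, 30, 14, 13, 19, 20],
    "Female": [83, 65, 28, 20, 21, 31],
}
SAMPLE_SIZE = 1000
POPULATION_SIZE = 10000


def response_rates(responses, population, sampling_fraction):
    """Responses divided by the expected number sampled in each category.

    Assumes the sample is spread across categories in proportion to the population.
    """
    return {
        sex: [n / (pop * sampling_fraction) for n, pop in zip(counts, population[sex])]
        for sex, counts in responses.items()
    }


rates = response_rates(responses, population, SAMPLE_SIZE / POPULATION_SIZE)
for sex, values in rates.items():
    print(sex, [f"{100 * r:.1f}%" for r in values])
```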

The population expansion factors are now calculated as the ratio of the number in the population to the number of respondents in each category, as shown in Table 9.5.

Table 9.5 Population Expansion Factors by Age and Sex

Age-->     15-24   25-34   35-44   45-54   55-64     65+    AVE.
Male        36.4    37.7    40.0    33.1    20.0    19.5    32.5
Female      22.7    16.8    20.4    23.5    21.4    19.7    20.4
AVE.        28.2    23.4    26.9    27.2    20.7    19.6    25.0

Thus whereas the average expansion factor for the entire sample is 25.0, the individual category expansion factors range from 16.8 to 40.0. When these expansion factors are applied to the number of trips made by respondents in each category (obtained by multiplying the number of respondents by the trip rate in each category), the total number of trips made in the population is found as shown in Table 9.6, yielding an average trip rate in the population of 3.21 trips per day (the difference between the sample trip rate and population trip rate is not substantial in this case, but this would obviously depend on the data set being used).

Table 9.6 Total Trips in Population by Age and Sex

Age-->     15-24   25-34   35-44   45-54   55-64     65+   TOTAL
Male        4370    3500    2680    2520    1800    1270   16140
Female      4760    2080    2000    3150    2210    1710   15910
TOTAL       9130    5580    4680    5670    4010    2980   32050
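The calculation behind Tables 9.5 and 9.6 can be sketched as follows: each expansion factor is the population count divided by the respondent count, and the expanded trips in a category are the factor times the respondents times the trip rate. The names used are illustrative.

```python
# Age groups in order: 15-24, 25-34, 35-44, 45-54, 55-64, 65+
population = {
    "Male": [2040, 1130, 560, 430, 380, 390],
    "Female": [1880, 1090, 570, 470, 450, 610],
}
responses = {
    "Male": [56, 30, 14, 13, 19, 20],
    "Female": [83, 65, 28, 20, 21, 31],
}
trip_rates = {
    "Male": [2.14, 3.10, 4.79, 5.85, 4.74, 3.25],
    "Female": [2.53, 1.91, 3.50, 6.70, 4.90, 2.81],
}


def expansion_factors(population, responses):
    """Population count divided by respondent count, category by category."""
    return {
        sex: [pop / n for pop, n in zip(population[sex], responses[sex])]
        for sex in population
    }


def expanded_trips(factors, responses, trip_rates):
    """Trips reported in each category, scaled up by that category's factor."""
    return {
        sex: [f * n * r for f, n, r in zip(factors[sex], responses[sex], trip_rates[sex])]
        for sex in factors
    }


factors = expansion_factors(population, responses)
trips = expanded_trips(factors, responses, trip_rates)
total_trips = sum(sum(values) for values in trips.values())
total_population = sum(sum(values) for values in population.values())
population_trip_rate = total_trips / total_population
print(round(total_trips), round(population_trip_rate, 3))
```

With unrounded cell values this gives roughly 32,040 trips and a rate of about 3.20 trips per day. Table 9.6 rounds each cell to the nearest 10, which accounts for the small difference from the 32050 and 3.21 quoted in the text.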

The above procedure would be used when the data in the secondary source is of a comparable level of detail to that obtained in the survey. However, this is often not the case, and frequently the secondary source data can only be obtained at a more aggregate level. While we would like to know the number in the population in each of the categories, often all we can get is the "marginals"; that is, the total number in each of the rows and columns. Thus in the above example, all we may be able to get is a breakdown by age and a separate breakdown by sex, but not a breakdown by age and sex together. This secondary data may be represented as shown in Table 9.7.

Table 9.7 Marginal Population Totals by Age and Sex

Age-->     15-24   25-34   35-44   45-54   55-64     65+   TOTAL
Male           ?       ?       ?       ?       ?       ?    4930
Female         ?       ?       ?       ?       ?       ?    5070
TOTAL       3920    2220    1130     900     830    1000   10000

In such cases, it is still possible to calculate population expansion factors, but since we are working with less information, the reliability of these expansion factors will depend on how much extra information is contained in the body of Table 9.7. To calculate expansion factors under these conditions, we need to adopt an iterative procedure where first we obtain agreement with respect to one of the marginals. For example, we can expand the values in Table 9.1, by multiplying each value by the ratio of the marginal total in Table 9.7 to the marginal total in Table 9.1, such that the correct number of males and females are obtained in the expanded total, as shown in Table 9.8.

Table 9.8 Expanded Population Totals after First Iteration

Age-->     15-24   25-34   35-44   45-54   55-64     65+   TOTAL
Male        1820     970     450     420     620     650    4930
Female      1700    1330     570     410     430     630    5070
TOTAL       3520    2300    1020     830    1050    1280   10000

At this point, while the total number of males and females is correct, the number in each age group is incorrect. It is therefore necessary to perform a second iteration by adjusting the values in the matrix such that the column totals agree with the number in each age group, as shown in Table 9.9.

Table 9.9 Expanded Population Totals after Second Iteration

Age-->     15-24   25-34   35-44   45-54   55-64     65+   TOTAL
Male        2030     940     500     460     490     510    4930
Female      1890    1280     630     440     340     490    5070
TOTAL       3920    2220    1130     900     830    1000   10000
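The two adjustment steps above (rows first, then columns) can be sketched in code. The function below is a generic version of this kind of iterative adjustment, often called iterative proportional fitting; the function name, the stopping rule and the tolerance are assumptions made for illustration, and the sketch assumes every row and column of the starting table has a positive total.

```python
def fit_to_marginals(seed, row_totals, col_totals, max_iter=100, tol=1e-6):
    """Alternately scale rows and columns of `seed` until both sets of totals agree."""
    table = [list(row) for row in seed]
    for _ in range(max_iter):
        # Row step: make each row sum to its known total (e.g. sex totals).
        for i, target in enumerate(row_totals):
            row_sum = sum(table[i])
            table[i] = [value * target / row_sum for value in table[i]]
        # Column step: make each column sum to its known total (e.g. age totals).
        for j, target in enumerate(col_totals):
            col_sum = sum(row[j] for row in table)
            for row in table:
                row[j] *= target / col_sum
        # Stop once the row totals are still (almost) correct after the column step.
        if all(abs(sum(table[i]) - t) <= tol * t for i, t in enumerate(row_totals)):
            break
    return table


responses = [
    [56, 30, 14, 13, 19, 20],   # Male (Table 9.1)
    [83, 65, 28, 20, 21, 31],   # Female (Table 9.1)
]
sex_totals = [4930, 5070]                        # row marginals (Table 9.7)
age_totals = [3920, 2220, 1130, 900, 830, 1000]  # column marginals (Table 9.7)

one_pass = fit_to_marginals(responses, sex_totals, age_totals, max_iter=1)
fitted = fit_to_marginals(responses, sex_totals, age_totals)
print([round(v, -1) for v in one_pass[0]])  # compare with the Male row of Table 9.9
```

A single pass (one row step and one column step) corresponds to the first and second iterations in the text. Further passes change the values only slightly in this example.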

At this point, in this example, the age and sex totals are correct and the iterations can cease. In other situations, however, especially where there are a larger number of control variables on which the sample is being expanded, it may be necessary to iterate several times before a stable condition is achieved. Heathcote (1983) describes the iteration process in some detail. However, even though stability has been achieved with respect to the marginal totals, there is no guarantee that the values within the matrix in fact agree with the real values in the population. For example, by comparing Tables 9.9 and 9.3, it can be seen that males between the ages of 25 and 34 are under-represented in our expanded population while females in this age group are correspondingly over-represented. This occurs because of the correlation between age and sex in the population (females tend to be older) which is not accounted for in the iterative process based on the marginal totals. At the end of the iterative process, population expansion factors may be calculated as the ratios of the estimated totals in the population to the number of respondents in each category, as shown in Table 9.10.

Table 9.10 Estimated Population Expansion Factors by Age and Sex

Age-->     15-24   25-34   35-44   45-54   55-64     65+    AVE.
Male        36.3    31.3    35.7    35.4    25.8    25.5    32.5
Female      22.8    19.7    22.5    22.0    16.2    15.8    20.4
AVE.        28.2    23.4    26.9    27.2    20.7    19.6    25.0

These estimated expansion factors may then be applied to the number of trips made by respondents in each category to find the total number of trips made in the population as shown in Table 9.11, yielding an average trip rate in the population of 3.18 trips per day. In this case, the population expansion factors have moved the total number of trips closer to the real number (as given in Table 9.6), but have not provided the correct answer because of the information which was missing from the marginal totals.

Table 9.11 Total Trips in Population by Age and Sex

Age-->     15-24   25-34   35-44   45-54   55-64     65+   TOTAL
Male        4340    2910    2400    2690    2320    1660   16320
Female      4780    2440    2210    2950    1670    1380   15430
TOTAL       9120    5350    4610    5640    3990    3040   31750

In addition to the mathematical problems involved in using marginal totals for the estimation of expansion factors, there are a number of other practical issues which need to be resolved. First, one has to find a good source of secondary data which hopefully will provide the control variables in a cross-tabulated fashion (and not just in marginal total fashion). Second, the data in the secondary source should have been collected in a similar manner to the survey currently being conducted. In particular, the coding categories used should be similar between the two data sets. Common definitions of items such as occupation, employment status and housing type should be used (this may involve a compromise on your part if the secondary data, such as the Census, has already been collected). Third, there may be problems with the timeliness of the secondary data becoming available. For example, the Census typically takes about two years from the time of data collection before the first results are available. Even then, these results are generally very aggregate in nature and may not be suitable for the purposes of calculating population expansion factors.

As a general rule, the design of the procedures for expansion of the data should be performed very early in the design process, since the availability of secondary data may often affect the choice and wording of questions on the survey.

9.2 CORRECTIONS FOR NON-REPORTED DATA

Non-reporting refers to the incompleteness of information in questionnaires that were returned. This incompleteness can refer either to questions or parts of questions which were answered incorrectly or incompletely, or to information which was not supplied at all. In the context of travel surveys, this non-reporting phenomenon is of particular importance in the non-reporting of trips and trip characteristics since conclusions about trip volumes (by mode) and general trip making behaviour and characteristics are the focus of travel surveys.

A reason for non-reporting of trips and trip characteristics can be simple memory lapses, especially when the respondent is asked to recall trips made over a significant period in the past. But even in short-term recollection, trips are frequently forgotten or misrepresented. Another reason for non-reporting can lie in the conviction by the respondent that a trip was not "important", or it was too short, or it was performed on foot or by bicycle. Proper instructions about trip definitions and reporting requirements can reduce this source of non-reporting. In certain situations it is possible that a respondent is unwilling to disclose all trips because of an embarrassing trip purpose or destination. Very little, however, can be done to overcome this latter problem.

The problem of incomplete information has been studied within the context of the "KONTIV" (Kontinuierliche Erhebung des Verkehrsverhaltens - A Continuous Survey of Travel Behaviour) survey design in West Germany by Brög, Erl, Meyburg and Wermuth (1982) and Wermuth (1985a). The KONTIV design is a highly refined self-administered survey developed by Werner Brög (reported in Brög et al., 1983) with a format similar to that shown in Figure 5.2. As such, it would be expected that the problem of incomplete information would be at a minimum compared to other less well-designed surveys. Nonetheless, the patterns of incomplete information are useful diagnostic information for the design of other surveys. Table 9.12 presents data on the percentage of responses for various questions for which there was incomplete information. These results are presented in two ways: the raw percentage of incomplete information on the survey form and the percentage incomplete after the coder had made any possible corrections.

Table 9.12 Incomplete Information for Various Question Types

                                    Incomplete Information
Characteristic                On Survey Form  After Corrections
Sex                                     3.0%               2.5%
Age                                     4.5%               4.5%
Marital Status                          3.5%               3.0%
Education Level                         9.4%               3.1%
Employment Status                       7.0%               0.0%
Occupation                              9.5%               2.0%
Drivers Licence                         9.5%               4.5%
Trip Details:                           4.3%               3.5%
  Destination Address                  15.0%              11.0%
  Trip Purpose                         10.0%               6.0%
  Travel Mode                          54.7%              51.3%
  Travel Time                          20.3%              12.3%

It can be seen that, initially, the extent of incomplete information on demographic and trip questions is in the range of 5 to 10%, but after coding and office editing this can be reduced to less than 5%. With respect to incomplete trip details, the major type of omission was with respect to travel mode. Wermuth (1985a) also shows that the extent of incomplete information for trip details varies with the trip purpose and travel mode, as shown in Table 9.13. It can be seen that shopping trips and recreational trips are most likely to have incomplete information, both before and after coder corrections, while non-motorised trips are more likely to be incompletely specified.

Table 9.13 Incomplete Information for Various Question Types


                              On Survey Form  After Corrections
Trip Purpose:
  Work                                 21.8%               9.8%
  School                               37.0%               6.8%
  Shopping                             60.3%              31.8%
  Other Discretionary trips            30.8%               9.6%
  Recreation                           40.4%              13.7%
  Return Home Trips                     8.9%               0.7%
Travel Mode:
  Non-Motorised                        28.6%              10.8%
  Motorised                            24.2%               9.6%
  Public Transport                     30.3%               7.9%

It is also possible to relate the extent of incomplete information to the type of respondent supplying the information, as can be seen in Table 9.14. Thus, the incomplete information increases as the respondent gets older, and tends to decrease as the level of education of the respondent increases.

Table 9.14 Incomplete Information for Various Types of Respondent


                              On Survey Form  After Corrections
Age (years):
  10-15                                31.9%               5.5%
  16-25                                22.4%               4.5%
  26-45                                25.1%              11.4%
  46-64                                26.3%              11.0%
  >65                                  49.3%              22.3%
Education Level:
  Elementary School                    31.5%              12.2%
  High School                          22.0%              10.1%
  University                           12.5%               4.6%

While the problem of incomplete information is an inconvenience, especially to the coder who has to try to supply the missing information, a more serious problem is the complete non-reporting of trips (this may be seen as an extreme case of incomplete information). Brög et al. (1982) and Wermuth (1985a) have shown that the extent of non-reported trips can be related to personal characteristics of the respondent and to various characteristics of the missing trips.

Table 9.15 Non-Reported Trips for Various Types of Respondent

                              % Non-Reported Trips
Age (years):
  10-15                                      27.4%
  16-25                                       7.4%
  26-45                                      15.5%
  46-64                                      16.3%
  >65                                        20.2%
Education Level:
  Elementary School                          23.2%
  High School                                 9.6%
  University                                 17.8%
Licence Holding:
  Driver's Licence                           11.6%
  No Licence                                 21.0%

The extent of non-reporting of trips as a function of respondent characteristics is shown in Table 9.15. With the exception of young teenagers, who tend to not report many trips made by car (as a passenger), there is again a tendency for older people to have more non-reported trips. Whether this is a function of memory lapses or is a result of the types of trips they tend to make will be explored later. There appears to be no clear tendency for non-reporting of trips to be associated with any education level, but respondents without a driver's licence tend to make more unreported trips.

Table 9.16 Trip Characteristics of Non-Reported Trips

                              % Non-Reported Trips
Trip length (km):
  0 - 0.5                                    26.5%
  0.5 - 1.0                                  23.5%
  1.0 - 3.0                                  13.8%
  3.0 - 5.0                                   9.5%
  5.0 - 10.0                                  7.5%
  10.0 - 20.0                                 7.5%
  >20.0                                       5.4%
Travel Mode:
  Moped, Motorcycle                          25.0%
  Walk                                       22.9%
  Bicycle                                    14.4%
  Car Passenger                              12.3%
  Car Driver                                  8.9%
  Public Transport                            6.7%
  Train                                       0.0%
Trip Purpose:
  Shopping                                   18.4%
  Recreation                                 17.2%
  Other Discretionary Trips                  14.5%
  School                                      8.1%
  Work                                        5.8%

In addition, the extent of non-reporting of trips appears to be a function of the characteristics of the trips themselves. As shown in Table 9.16, non-reported trips tend to be shorter than average, tend to be by non-motorised means of transportation and also tend to be of a more discretionary nature. As a result of the characteristics of the non-reported trips, the increase in mobility after accounting for these trips varies depending on the measure of mobility used. Thus the proportion of mobiles increases least, the trip rate per mobile increases more, and the trip rate across all people increases most, as shown in Table 9.17.

Table 9.17 Increases in Mobility after Allowing for Non-Reported Trips

                              % Increase in Mobility
Measure of Mobility:
  % Mobiles                                     4.8%
  Trip Rate per Mobile                         10.4%
  Trip Rate per Person                         14.2%

The results reported in this section have been confirmed by other studies (e.g. Clarke, Dix and Jones, 1981; Barnard, 1985) and lend credence to the need to at least be aware of, if not explicitly correct for, the effects of non-reported trips when presenting the findings of travel surveys.



Methodological research conducted as part of the South-East Queensland Household Travel Survey (SEQHTS) (Richardson and Ampt, 1993a) has offered further insights into the issue of non-reporting of trips and has suggested a way of correcting for this non-reporting in the expanded data.

In the SEQHTS survey, validation interviews were performed with a sample of the responding households (as described in Chapter 7). The information for the estimation of non-reporting correction factors was obtained by means of identifying all additions made to the stop data as a result of the validation interviews. These added stops were also classified as to whether they were expected or unexpected. Expected extra stops were those where, during data entry (prior to validation), it had been identified that it was likely that an extra stop should have been reported - e.g. a person went to a shop and did not return home. Unexpected stops were those which had not been identified in this way, but which respondents reported during the validation interview checking.

As a result of experience gained in previous pilot surveys, it was decided to examine the characteristics of these added stops in terms of their mode, their purpose, and whether they were the last stop of the day. As in those pilot surveys, it was found that the added stops differed from the originally-reported stops most significantly in terms of their purpose and position in the day. The non-reporting correction factors were calculated by dividing the sum of the original stops, plus the expected added stops, plus the unexpected added stops by the original stops, i.e.

\[
\text{Non-Reporting Correction Factor} = \frac{\text{original stops} + \text{expected added stops} + \text{unexpected added stops}}{\text{original stops}}
\]
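The ratio above is straightforward to compute for any group of stops (for example, one purpose and position-in-day cell). The sketch below is illustrative only: the function name and the stop counts are hypothetical, and passing zero for one kind of added stop gives a factor for the other kind alone, which is one way separate expected and unexpected factors might be obtained.

```python
def non_reporting_correction_factor(original, expected_added=0, unexpected_added=0):
    """(original + expected added + unexpected added) / original stops."""
    if original <= 0:
        raise ValueError("a correction factor needs at least one originally reported stop")
    return (original + expected_added + unexpected_added) / original


# Hypothetical counts for one category of stop.
original_stops = 1000
expected_added = 21
unexpected_added = 9

expected_factor = non_reporting_correction_factor(original_stops, expected_added)
unexpected_factor = non_reporting_correction_factor(
    original_stops, unexpected_added=unexpected_added
)
combined_factor = non_reporting_correction_factor(
    original_stops, expected_added, unexpected_added
)
print(expected_factor, unexpected_factor, combined_factor)
```

Note that the combined factor equals the expected factor plus the unexpected factor minus one, because both factors share the same denominator.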

The resultant non-reporting correction factors for expected and unexpected stops are shown in Tables 9.18 and 9.19.


Table 9.18 Non-Reporting Correction Factors for Expected Added Stops

                            Last Stop of Day?
Destination Purpose           NO     YES   Total
Change Mode                1.015   1.000   1.015
Pick Someone Up            1.012   1.000   1.012
Drop Someone Off           1.000   1.000   1.000
Accompany Someone          1.022   1.000   1.022
Buy Something              1.000   1.000   1.000
Education                  1.058   1.000   1.058
Work-Related               1.004   1.000   1.004
Go Home                    1.021   1.071   1.052
Any Other                  1.000   1.000   1.000
Personal Business          1.016   1.000   1.000
Social/Recreational        1.000   1.000   1.000
Social/Welfare             1.000   1.000   1.000
Medical/Dental             1.000   1.000   1.000
Childcare                  1.000   1.000   1.000
Park/Unpark                1.200   1.000   1.200
Total                      1.014   1.070   1.024

It can be seen from Table 9.18 that the major impact of the non-reported stop correction factors for expected additions will be on trips home at the end of the day, which are frequently forgotten but often easy to detect. From Table 9.19, it can be seen that the major impact of the non-reported stop correction factors for unexpected additions will be on "change-mode" stops made during the day and trips home at the end of the day. These trips are primarily by walk or public transport modes. The fact that stop purpose and mode are correlated means that the application of these non-reported stop correction factors based on stop purpose will also result in an (upward) adjustment for stops made by walk and public transport during the day.



Table 9.19 Non-Reporting Correction Factors for Unexpected Added Stops


NO YES Total


The non-reported stop weights are then applied in the following fashion:

• any household/person/stop which was phoned or validation-interviewed does not need to have the expected or unexpected non-reported stop weights applied (because they would already have been found during the phone or validation interview),

• any household for which the data was judged to be perfect, and hence would not have been phoned, needed to have unexpected non-reported stop weights applied (because had they been interviewed, there was a chance that an unexpected stop might have been found); and

• any household which had expected errors but which was neither on the list to be validated, nor could it be phoned (because no number was given), would need to have both the expected and unexpected weights added.

The procedure, therefore, for application of the non-reported stop weights was:

- if the household had been phone-edited, or was a participant in the validation or non-response interviews, then no non-reported stop weights were applied (this means a value of 1.00 was adopted)

- if the household had not been edited at all, then if they stated that they did not have a phone or they did not say whether they had a phone (either way they definitely could not be phoned) then the expected and unexpected non-reported stop weights were applied to all stops made by that household

- if the household had not been edited at all, and if they stated that they did have a phone and they provided the phone number, then all stops made by members of that household would receive only the unexpected non-reported stop weights.
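The three rules above can be written as a small decision function. This is a sketch only: the function and argument names are hypothetical, and the way the expected and unexpected weights are combined (their sum minus one) is an assumption chosen to agree with the correction factor ratio given earlier, since the text does not state the combination rule explicitly.

```python
def non_reported_stop_weight(edited_or_interviewed, phone_number_given,
                             expected_weight, unexpected_weight):
    """Select the non-reported stop weight to apply to a household's stops.

    edited_or_interviewed: household was phone-edited, or took part in the
        validation or non-response interviews.
    phone_number_given: household stated it had a phone and supplied the number.
    """
    if edited_or_interviewed:
        # Missing stops would already have been found, so no adjustment.
        return 1.0
    if phone_number_given:
        # Not phoned, so data was judged complete: only unexpected stops may be missing.
        return unexpected_weight
    # Could not be phoned: both kinds of missing stop may remain.
    # Assumed combination: (orig + exp + unexp) / orig = expected + unexpected - 1.
    return expected_weight + unexpected_weight - 1.0


# Expected weight for "Go Home" as last stop of the day (Table 9.18);
# the unexpected weight of 1.020 is a made-up value for illustration.
w_edited = non_reported_stop_weight(True, True, 1.071, 1.020)
w_phone = non_reported_stop_weight(False, True, 1.071, 1.020)
w_no_phone = non_reported_stop_weight(False, False, 1.071, 1.020)
print(w_edited, w_phone, round(w_no_phone, 3))
```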

The final sets of non-reported stop weights for households with and without phones are shown in Tables 9.20 and 9.21.

Table 9.20 Non-Reported Stop Weights (phone connected)


NO YES Total


As with the application of all correction weights, a major conceptual limitation must be acknowledged in the use of non-reporting correction factors. The reason for the application of the non-reporting weights is that some people did not tell us about some of the trips they made. By way of the validation interviews, we determine which are the most likely types of trips not to have been reported. We then multiply those trips of this type, which have been reported, by a correction factor to compensate for the missing trips. In this way, the total number of trips in the population should be more accurately estimated. However, from an individual person viewpoint, we are adding trips to those people who have already told us about their trips, and not adding them to the people who have not told us about all their trips (because multiplying zero by any number still leaves us with zero trips). Therefore, while the total number of trips should be more accurately estimated, the distribution of trips per person will be pushed further away from the real situation. Statistically, we have improved the estimation of the mean number of trips per person, but artificially increased the variance of the number of trips per person. This occurs because of the use of multiplicative correction factors. To overcome this problem, we would need to develop additive correction factors which add the non-reported trips onto those people who have not told us about all their trips; this, however, is logically difficult to implement. Therefore, multiplicative correction factors must be used in the realisation that they improve estimates of the mean, but worsen estimates of the variance. However, since estimates of the mean are generally more important, it is better to use some form of multiplicative correction factor than to not use any at all.
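The effect of a multiplicative correction on the mean and the variance can be illustrated with a very small numerical sketch. The four people and their trip counts below are entirely hypothetical (they are not SEQHTS data), and a single correction factor is used for all trips purely to keep the example short.

```python
from statistics import mean, pvariance

# Hypothetical data: four people who each truly made 4 trips.
true_trips = [4, 4, 4, 4]
# Two of them fail to report one trip each.
reported_trips = [4, 4, 3, 3]

# One multiplicative factor chosen so that the total number of trips is restored.
factor = sum(true_trips) / sum(reported_trips)
corrected_trips = [factor * trips for trips in reported_trips]

for label, data in (("true", true_trips),
                    ("reported", reported_trips),
                    ("corrected", corrected_trips)):
    print(f"{label:10s} mean = {mean(data):.3f}  variance = {pvariance(data):.3f}")
```

In this sketch the corrected mean matches the true mean, but the people who reported everything are pushed above their true trip count while the under-reporters remain below it, so the variance of the corrected data is larger than that of either the true or the reported data.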

Table 9.21 Non-Reported Stop Weights (phone not connected)


NO YES Total


9.3 CORRECTIONS FOR NON-RESPONSE

Having corrected for non-reported trips from those people who respond to the survey, it is now necessary to turn attention to people in the sample who do not respond to the questionnaire at all. It is quite easy to think of a number of reasons why a non-response to a survey might occur. In this context it is important to recognise that we can only speak of a true or genuine non-response in a situation in which a response was indeed possible, e.g. the addressee simply did not want to respond or was out of town at the time of the survey. Quite a different situation exists where a response was not even possible, e.g. the addressee was deceased or the survey was sent to a non-existent address. In this case we have what is often called "sample loss". Wermuth (1985b) provides data indicating the reasons for non-response to two self-administered, mail-back questionnaire surveys conducted in West Germany in 1981. He calls sample loss "non-genuine non-response" to distinguish it from "genuine non-response". Table 9.22 shows the results of these analyses of non-response.

Table 9.22 Reasons for Non-Response in Self-Administered Surveys

                                              Number of Households
                                              Survey #1   Survey #2
GROSS SAMPLE                                     5039        7688
SAMPLE LOSS                                       370         603
Reasons:
    addressee deceased                             24          40
    household moved                               150         359
    addressee unknown                             172         ---
    other                                          24         204
NET SAMPLE                                       4669        7085
Genuine non-response                             1710        2677
Reasons (as far as known):
    Objective non-responses                       147         279
      - too old                                    71         ---
      - ill                                        36         206
      - out of town                                40          73
    Subjective non-responses                      249         437
      - non-acceptance of questionnaire            57         ---
      - answer refused                            183         247
      - lack of time, other                         9         190
Genuine non-responses (with known reasons)        396         716
Respondents                                      2959        4408
Household response rate                         63.4%       62.2%

It is important to understand the way in which response rate is calculated. This has been shown in Section 7.1.7. Those members of the sample from whom a response could not possibly be obtained are subtracted from the gross sample size. These forms of sample loss (i.e. invalid households, such as vacant or demolished dwellings) do not affect the quality of the sample, and are sometimes said to be quality neutral. The resultant number is the net sample size. The response rate is then calculated as the number of respondents expressed as a percentage of this net sample size.
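The calculation can be checked against the figures in Table 9.22 with a short sketch. The function name and argument names are illustrative only; the numbers are those reported for the two West German surveys.

```python
def response_rate(gross_sample, sample_loss, respondents):
    """Return the response rate (%) based on the net sample size."""
    net_sample = gross_sample - sample_loss  # sample loss is "quality neutral"
    return 100.0 * respondents / net_sample

# Figures taken from Table 9.22.
survey_1 = response_rate(gross_sample=5039, sample_loss=370, respondents=2959)
survey_2 = response_rate(gross_sample=7688, sample_loss=603, respondents=4408)

print(f"Survey #1: {survey_1:.1f}%")
print(f"Survey #2: {survey_2:.1f}%")
```

Dividing by the gross sample instead of the net sample would understate the response rate, because it would count households from whom no response was ever possible.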

While the exact composition of non-responses will vary with the survey and the population, it is clear that the sample loss is directly related to the quality of the sampling frame from which the potential respondents were sampled. The more dated and inadequate the sampling frame, the more likely that there will be an undesirably large number of sample losses.

The two basic concerns with respect to non-response that need to be stressed are the importance of recognising the existence of non-response and the need to find ways of assessing its impact on the quality, representativeness and reliability of the information derived from the survey. The analyst has to answer satisfactorily the question of whether the results of the survey would have been the same even if a one hundred percent response rate had been achieved. This question translates into the recommendation that the analyst try to establish some information about the non-respondents that will permit judgment about whether the information that could have been obtained from the non-respondents would have been statistically different from that actually collected.

Ideally, it would be desirable to have available a series of adjustment factors that could be applied for different surveys and population groups in order to account for the information lost through non-response. Unfortunately, these adjustment factors can only be obtained through significant survey research efforts into the characteristics of "typical" non-respondents. As the reader can well imagine, follow-up surveys to investigate the reasons for non-response and to establish appropriate adjustment factors are costly and time-consuming. Since survey budgets generally tend to be very tight, it is virtually impossible to advance the state-of-the-art of adjustments for non-response through regular survey activities. Separately funded and carefully staffed research efforts are necessary to achieve significant and analytically sound advancements in this area. On the other hand, it has been shown through the limited research efforts that exist in this area (e.g., Brög and Meyburg, 1980, 1981, 1982; Wermuth, 1985b; Richardson and Ampt, 1993a, 1994) that an understanding of non-response effects can lead to significantly more accurate and representative survey results.

Moser and Kalton (1971) identify five general sources of non-response:

(a) No longer at available address ("movers")
(b) Physical (health-related) inability to respond
(c) Refusals
(d) Away from home during survey period
(e) Out at time of call

Several strategies have been proposed to compensate for people whose addresses have changed since the sampling frame was prepared. One approach is to substitute for the moved household the new household that has moved to that address (if it is the household address and not the specific residents that are the sampling unit). Another strategy could be to try to "pursue" the household to its new address and to obtain a response at that location (if the identity of the specific residents is important to maintain). A third strategy is to determine the number of households that have moved out of the survey area during the m months preceding the survey and to double the weight of an equal number of respondents who have moved into the area during that same time period. In this way, the movers-in are included in the sample on their own behalf and also in place of the movers-out (Gray, et al., 1950).


Only the last four reasons for non-response are of major interest to the analyst because the first reason could be considered as falling into the category of sample loss, i.e. it is out of the analyst's control once the survey sample has been drawn. It is the segment of the non-respondents that legitimately belongs in the sample that is of particular interest to the analyst because, under these conditions, carefully designed survey procedures can help reduce the problem. Very little, if anything, can be done about correcting for the non-response in the second category, either in mail-back self-administered surveys or in home interview surveys. However, we ought to keep in mind with respect to the other reasons that non-response is a relative term. It depends very much on the surveyor's level of perseverance, quite aside from the quality of the overall survey design and administration. For example, in mail-back surveys, it would be very unwise to omit follow-up reminders and to be satisfied with whatever is returned in the first wave (i.e. after the questionnaire has first been distributed). The use of reminders can significantly increase the number of respondents, as shown in Chapter 7, and as demonstrated in Figure 9.1.

The results in Figure 9.1 are based on surveys conducted in West Germany (Wermuth, 1985b). It can be seen that in all three sets of survey data, the response rate increased significantly with the use of reminders. In the surveys carried out in three West German cities, a very extensive system of reminders was used, consisting of the following steps:

(a) First announcement of survey by postcard
(b) First mailing of questionnaires (two weeks later)
(c) First reminder (postcard, one week later)
(d) Second reminder (postcard, one week later)
(e) Second mailing of questionnaires (one week later)
(f) Third reminder (postcard, one week later)
(g) Third mailing of questionnaires (one week later)
(h) Fourth reminder (postcard, one week later)
(i) Fifth reminder (postcard, one week later)



Figure 9.1 The Effect of Reminders on Questionnaire Response Rates (Source: Wermuth, 1985b)

In the survey covering nine cities, only steps (a), (b), (c), (d) and (e) were implemented, while in the Munich survey only steps (a), (b), (c) and (e) were implemented. Several points arise from consideration of Figure 9.1. First, if each of the surveys had omitted all reminders then a response rate of only 30 to 35% would have been obtained. This, coincidentally, is the response rate often quoted for self-administered surveys. The use of the reminders, however, increased the response rates to over 60% for all surveys. Second, it appears that only two mailings and reminders are needed. While further reminders do increase the response rate, they do so only marginally and are probably not very cost effective. Third, the results are remarkably consistent over all of the surveys.

A similar result has been obtained in the SEQHTS survey (Richardson and Ampt, 1993a) with a variation on this program of reminders, as shown in Figure 9.2.

Figure 9.2 The Speed of Response by Response Type (Source: Richardson and Ampt, 1993a)
[Horizontal axis: Response Time (= Receipt Date - Initial Travel Date), -7 to 91 days; vertical axis: Percentage of Responses Obtained, 0% to 100%; series: Valid Responses, Sample Loss, Non-Responses]

In addition to the responses being stimulated by the use of reminders, common sense alone tells us that the early respondents are likely to be different "from the rest of us" in that they might have a particular interest in the topic of the survey, or they might have plenty of time available to sit down to respond to a survey very promptly. It is conceivable, for example, that a disproportionate percentage of retired people are among the respondents of the "first wave". Wermuth (1985b) investigated the socio-economic status of respondents in the various response groups, and found the results which are summarised in Figures 9.3 through 9.5.



Figure 9.3 The Effect of Household Size on Level and Speed of Response (Source: Wermuth, 1985b)


Figure 9.4 The Effect of Age on Level and Speed of Response

Figure 9.5 The Effect of Employment and Sex on Speed of Response



In both the Munich survey and the "three cities" surveys, larger households were more likely to respond and to respond earlier, probably because of the increased chance of finding someone in the household willing to complete the survey. Older people are more likely to respond, probably because of their greater amounts of free time. Employed people are more likely to respond, probably because of their greater extent of trip making and hence the greater perceived relevance of the travel survey. There appears, however, to be no difference in response between males and females.

Similar results were found in the SEQHTS survey (Richardson and Ampt, 1993a) as shown in Tables 9.23 and 9.24.

Table 9.23 Household Characteristics of SEQHTS Respondents by Wave

                         RESPONSE WAVE
HOUSEHOLD SIZE      1     2     3     4     5     6
1                  61%   17%    6%   10%    4%    2%
2                  64%   17%    6%    9%    3%    2%
3                  59%   20%    7%    7%    4%    2%
4                  63%   18%    6%    8%    3%    1%
5                  56%   21%    8%   10%    4%    2%
6                  53%   24%    5%   12%    4%    2%
7                  54%   21%   11%   11%    4%    0%
8                  63%   25%    0%    0%   13%    0%

Since household size, employment status, age and time availability are likely to have an impact on trip-making characteristics, it is reasonable to assume that the travel characteristics and data for non-respondents will be different from those for the respondents. Brög and Meyburg (1980, 1981, 1982) have demonstrated that trip-making characteristics do change substantially as additional response waves due to reminders are recorded. For example, for the nine cities survey, Figure 9.6 shows that the trip frequency and the proportion of mobile persons in the population (i.e. people who make at least one trip) both decrease as the time to respond increases. Thus mail-back questionnaire surveys which do not include follow-up reminders would tend to over-estimate trip-making because of the higher mobility of the early respondents. This should be contrasted with home interview or telephone interview surveys where the early respondents tend to be the "stay-at-homes" who generally have lower than average trip rates. Thus personal interview surveys without call-backs would tend to under-estimate trip rates.


Table 9.24 Personal Characteristics of SEQHTS Respondents by Wave

                              RESPONSE WAVE
                         1     2     3     4     5     6
AGE GROUP
0 -> 4                  58%   19%    7%   10%    4%    2%
5 -> 9                  60%   18%    6%    9%    4%    2%
10 -> 14                61%   19%    6%    9%    3%    1%
15 -> 19                60%   21%    7%    8%    3%    2%
20 -> 24                52%   22%    8%   10%    4%    3%
25 -> 29                54%   21%    9%    9%    5%    3%
30 -> 34                60%   19%    7%    9%    4%    2%
35 -> 39                61%   19%    6%    8%    4%    1%
40 -> 44                61%   18%    7%    8%    3%    2%
45 -> 49                64%   17%    6%    9%    2%    1%
50 -> 54                61%   21%    7%    8%    3%    1%
55 -> 59                71%   17%    5%    5%    1%    1%
60 -> 64                75%   14%    4%    5%    1%    1%
65 -> 69                76%   11%    4%    5%    2%    1%
70 -> 74                74%   14%    2%    7%    2%    1%
75+                     70%   16%    6%    6%    1%    1%

SEX
Male                    60%   19%    7%    8%    3%    2%
Female                  61%   18%    6%    9%    4%    2%

ACTIVITY STATUS
Full-Time Employment    58%   21%    7%    9%    4%    2%
Part-Time Employment    62%   18%    6%    8%    3%    2%
Primary School          60%   18%    6%    9%    4%    2%
Secondary School        60%   20%    8%    8%    3%    1%
Tertiary College        59%   24%    6%    7%    2%    1%
Not yet at School       59%   19%    7%   10%    4%    2%
Pre-School              58%   20%    5%    8%    5%    3%
Childcare               53%   22%    6%   10%    7%    2%
Keeping House           63%   17%    7%    9%    3%    1%
Currently Unemployed    59%   19%    6%    9%    4%    3%
Retired or Pensioner    73%   14%    4%    6%    2%    1%
Other Pensioner         67%   12%    8%    6%    4%    3%
Other                   77%   17%    2%    3%    2%    0%

The observation that trip rate declines with increasing time to respond to the survey has been interpreted as meaning that the trip-making characteristics of late respondents are different from those of earlier respondents. However, before this interpretation can be accepted, we need to account for two other possible explanations. First, it could be that late respondents simply belong to different socio-demographic groups from early respondents and that, while they do make fewer trips, they make no fewer trips than early respondents in the same socio-demographic group. It has been shown above that the socio-demographic characteristics of early and late respondents are indeed different, and therefore socio-demographic expansion will tend to partially correct for the non-response problem.

Figure 9.6 Travel Characteristics as a Function of Response Speed

Second, while the observed (i.e. reported) trip rates are lower for late respondents, it may be that they don't make fewer trips but simply report fewer trips (i.e. they have a higher non-reporting rate than early respondents). For the above reasons, it is first necessary to correct reported trip rates in each response wave for socio-demographic and non-reporting differences as described in the previous two sections of this Chapter.

Ideally, it would then be desirable to have available a series of adjustment factors that could be applied for different surveys and population groups in order to account for the information lost through non-response. These adjustment factors can only be obtained through significant research efforts into the characteristics of “typical” non-respondents. This was attempted in the SEQHTS survey by the follow-up non-response surveys in that study, and the use of these surveys will be described later in this section.

In the SEQHTS survey, responses were classified into six groups according to the time taken for them to respond. Those who respond within 7 days (1 week) of their original travel day are classified as Wave 1 respondents. Those who respond within 2 weeks of their original travel day are classified as Wave 2 respondents, and so on up to Wave 4 respondents. Those responding of their own volition after 4 weeks are classified as Wave 5 respondents. Those who respond as a result of the non-response interviews are classified as Wave 6 respondents. These waves of respondents are often called "Brögian Response Waves", after Werner Brög who did much of the early research on the calculation of corrections for non-response effects in postal travel surveys.
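The wave classification described above can be sketched as a small function. The function and argument names are hypothetical, and the treatment of the boundaries (a response received exactly 7 days after the travel day counts as Wave 1, and responses received before the travel day also count as Wave 1) is an assumption, since the text does not state how boundary cases were handled.

```python
def response_wave(days_to_respond, via_non_response_interview=False):
    """Classify a response into one of six response waves.

    days_to_respond: receipt date minus original travel date, in days.
    via_non_response_interview: True if the response was only obtained
        through a follow-up non-response interview.
    """
    if via_non_response_interview:
        return 6
    # Waves 1 to 4 cover successive weeks after the original travel day.
    for wave, upper_limit_days in enumerate((7, 14, 21, 28), start=1):
        if upper_limit_days >= days_to_respond:
            return wave
    # Unprompted responses received after four weeks.
    return 5

print([response_wave(d) for d in (3, 7, 8, 20, 28, 40)])
print(response_wave(40, via_non_response_interview=True))
```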

Calculation of the average, and upper and lower percentiles, of the number of stops per person per day for respondents in each of the response waves gave rise to the curve shown in Figure 9.7.

Figure 9.7 Average Stop Rate as a Function of Response Wave
[Horizontal axis: Response Wave, 1 to 6; vertical axis: Stops/person/day, 0.00 to 5.00; series: Average, Upper 95th%ile, Lower 5th%ile]

It can be seen from Figure 9.7 that the average number of stops per day decreases as the time taken to respond increases. Thus the respondents in the first wave have the highest stop rate and those in the last waves have the lowest stop rates. This trend is consistent with that found in previous work (Wermuth, 1985b).

A better picture of this trend can be obtained by considering the number of responding households in each of these waves, and the cumulative percentage of responses. Thus in the SEQHTS survey, 60% of the total respondents (weighted for demographic characteristics) responded within one week of the initial travel date, another 19% responded in the week after this and so on. By the end of week two, a cumulative total of 79% of the total respondents had responded. The 6th response wave consists of those respondents who were obtained from the non-response interviews. They represent more than themselves, however, since only a sample of non-responding households were included in the non-response interview sample. The non-response sample was found to consist of three sub-groups: those who agreed to respond, those who refused to respond to the survey, and those who did not respond for other (perhaps travel-related) reasons. The second group have been referred to as "stubborn" non-respondents; Wermuth (1985b) has noted that approximately 10% of the net sample are stubborn non-respondents. The SEQHTS survey found that approximately 11% of the net Brisbane sample fell into this category. Therefore, it would never be possible to get more than 89% response from the net sample, and this value has been used as the upper limit of cumulative response for response wave 6. The relationship between stop rate (stops/person/day) and cumulative percentage of net sample is shown in Figure 9.8.

Figure 9.8 Stop Rate as a Function of Cumulative Response
[Horizontal axis: Cumulative Percent of Net Sample, 0% to 100%; vertical axis: Average Stops/Person/Day, 0.00 to 5.00; series: Average, Upper 95th%ile, Lower 5th%ile]

This relationship (for the first five waves) is very similar to that obtained by Wermuth (1985a) in that the stop rate falls relatively uniformly after the second response wave. Using such a relationship, Brög and Meyburg (1982) have postulated that non-respondents are more likely to have trip-making characteristics like those who respond late to travel surveys than those who respond early to travel surveys. They have assumed a linear decrease in stop rate after the second response wave up till the last respondents to the mailed questionnaire. They then project forward to estimate the likely stop rate of the non-respondents.

In the case of Figure 9.8, a linear relationship is postulated as given by the dashed line overlaid on the response curve. This would give an estimate of approximately 2.65 stops/person/day for the non-respondents. As it happens, in the SEQHTS survey, there was the unusual situation of actually having an empirical measurement of the stops/person/day for the non-respondents from the non-response interviews. This actual value was 2.61 stops/person/day. This confirms the overall validity of the approach adopted by Brög and Meyburg (1982) and Wermuth (1985).
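The projection described above amounts to fitting a straight line to the stop rates of the later response waves and reading off the value at a cumulative response level that represents the non-respondents. The sketch below shows one way this could be done with an ordinary least-squares line. The wave values are purely hypothetical (the SEQHTS values behind Figure 9.8 are not tabulated here), and the choice of the point at which to read off the line is likewise an assumption.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def projected_stop_rate(cumulative_pct, stop_rates, target_pct):
    """Project the fitted line forward to the given cumulative response."""
    slope, intercept = fit_line(cumulative_pct, stop_rates)
    return intercept + slope * target_pct

# Hypothetical values for waves 2 to 5: cumulative percent of net sample
# and average stops/person/day for each wave.
late_wave_cumulative_pct = [58.0, 63.0, 69.0, 73.0]
late_wave_stop_rates = [4.20, 4.00, 3.76, 3.60]

# Assumed target: a point part-way between the last mailed response (73%)
# and the upper limit of cumulative response (89%).
estimate = projected_stop_rate(late_wave_cumulative_pct,
                               late_wave_stop_rates,
                               target_pct=81.0)
print(f"Projected stop rate for non-respondents: {estimate:.2f}")
```

Only the waves after the second are included in the fit, following the assumption of a linear decrease after the second response wave.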

Given that these non-respondents have a lower stop rate than the respondents, it is necessary to apply a correction factor to all observed stops to reduce the estimated population stop rate to account for the lower stop rate of the non-respondents. While it is possible that the reductions in stop rate apply non-uniformly to various types of stop, such differentiation has not yet been attempted; the non-response correction factor is applied equally to all stops. Later research should investigate variations in non-response correction factors by stop purpose, mode of travel etc.

The non-response correction factor is calculated by considering the three major groups in the net sample and the stop rates associated with each group. These three groups are the respondents, the non-respondents and the stubborn non-respondents. In the SEQHTS survey, these groups make up approximately 73%, 16% and 11% of the net sample. The stop rates associated with the first two groups can be found from the data for the waves of respondents and the wave of non-respondents. Thus the average stop rate of the respondents is 4.38 (for respondents in the first 5 waves), and the average stop rate of non-respondents is 2.61. The average stop rate for stubborn non-respondents is assumed to be 4.38 (the same as the respondents, on the assumption that their unwillingness to participate in the survey has nothing to do with their travel behaviour). Thus, the weighted average stop rate of the entire sample is 0.73*4.38 + 0.16*2.61 + 0.11*4.38 = 4.10. Since the average stop rate for the respondents would have been calculated as 4.38, a correction factor of 0.935 (=4.10/4.38) was applied to all stops reported by respondents in order to obtain the correct stop rate for the entire net sample. This weighting factor was applied to all records in the stop file.
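The arithmetic in the preceding paragraph can be reproduced with a few lines of code. The variable names are illustrative; the shares and stop rates are those quoted for the SEQHTS survey.

```python
# Share of the net sample and average stops/person/day for each group.
groups = {
    "respondents":              (0.73, 4.38),
    "non-respondents":          (0.16, 2.61),
    # Assumed in the text to travel like the respondents.
    "stubborn non-respondents": (0.11, 4.38),
}

weighted_stop_rate = sum(share * rate for share, rate in groups.values())
respondent_stop_rate = groups["respondents"][1]
correction_factor = weighted_stop_rate / respondent_stop_rate

print(f"Weighted average stop rate: {weighted_stop_rate:.2f}")
print(f"Non-response correction factor: {correction_factor:.3f}")
```

Note that the factor of 0.935 is obtained from the unrounded weighted average; dividing the rounded value of 4.10 by 4.38 would give 0.936.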

To minimise the effect of possible non-response bias, it is therefore good survey practice to send out at least one combination reminder/thank you postcard, followed a week later by another reminder postcard with a new survey form, and, if funds and time permit, followed by a third postcard after another week. This procedure will generate several response waves and it will both reduce the non-response rate and increase the quality and representativeness of the survey results, and also provide information on those respondents to the later reminders who might otherwise have been non-respondents. This information can then be used to investigate any trends in travel characteristics as a function of response speed, which can then be used to infer the travel characteristics of those who remain as non-respondents.



For personal interview surveys the refusal rate is largely a function of the skill and experience of the interviewer. Of course the subject matter of the survey also plays a significant role in the respondents' willingness to answer questions on a specific topic. Conflicting results have been reported on the desirability of making interview appointments by prior telephone call or by postcard announcement. Sudman (1967) reported that such prior contact reduced the number of calls required from 2.3 to 1.7 calls per completed interview. On the other hand Brunner and Carroll (1967) found that prior telephone appointments had the undesirable effect of reducing the response rate. People will find it easier to refuse cooperation through the relative anonymity of a telephone contact than when confronted in a face-to-face situation at their home.

An interesting approach to dealing with the not-at-home problem in personal interview surveys was developed by Politz and Simmons (1949). The survey population is grouped into strata according to the probability of the interviewer finding people at home on the first call. All calls are assumed to be made during the same period of the day, e.g. in the evening. On the basis of the respondent's answer to the question of how many of the previous five evenings he/she spent at home, the probability of each respondent being at home on any random evening could be calculated. In order to derive population estimates, the interview results for each stratum should be weighted with the reciprocal of the probability of being at home. Of course, a slight bias is introduced due to the fact that those people who are never at home will not be considered in this procedure. While the Politz-Simmons method makes one-call interviews a palatable survey procedure, it is not clear whether this re-weighting procedure is more efficient in terms of cost and quality of results than the more conventional approach of multiple calls until a successful contact and interview is made. Also, it is not clear whether people are necessarily willing to disclose to the interviewer their typical behaviour about presence or absence at home (because of fears about security).
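A minimal sketch of this re-weighting is given below. The text above does not state how the at-home probability is computed; the sketch assumes the form in which the method is usually described, where the evening of the interview is counted together with the previous five, so that a respondent at home on k of the previous five evenings is given a probability of (k + 1)/6. The respondent data and function names are hypothetical.

```python
def politz_simmons_weight(evenings_at_home, evenings_asked=5):
    """Reciprocal of the estimated probability of being found at home.

    Assumption: the interview evening itself is counted, so the estimated
    probability is (evenings_at_home + 1) / (evenings_asked + 1).
    """
    if evenings_at_home not in range(evenings_asked + 1):
        raise ValueError("evenings_at_home must lie between 0 and evenings_asked")
    return (evenings_asked + 1) / (evenings_at_home + 1)

def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical one-call sample: (evenings at home out of previous five, trips/day).
respondents = [(5, 2), (2, 4), (0, 6)]

weights = [politz_simmons_weight(k) for k, _ in respondents]
trips = [t for _, t in respondents]

unweighted_estimate = sum(trips) / len(trips)
weighted_estimate = weighted_mean(trips, weights)
print(f"Unweighted mean trips: {unweighted_estimate:.2f}")
print(f"Politz-Simmons weighted mean trips: {weighted_estimate:.2f}")
```

In this hypothetical sample the people who are rarely at home make more trips, so giving them larger weights raises the estimated trip rate, which is the direction of correction the method is intended to provide for one-call surveys.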

Finally, it ought to be noted that an "unrepaired" non-response bias might be so serious that it might be worth considering selecting a smaller initial sample and placing all resources on a concentrated effort to obtain a higher response rate. Cochran (1977) and Deming (1953) have indeed taken this position. It is clear that ignoring the effect of non-response is a highly unprofessional and unscientific approach to survey sampling.



10. Data Analysis

Having coded and edited the data, we should now have a "clean" dataset ready for analysis. The analysis phase is often a welcome relief after the tedium of editing the data, in that it allows the investigator to show some creative flair.

Two types of analysis are possible. First, there may be a simple enumerative or exploratory analysis which seeks to explore the contents of a dataset and to describe the dataset in a number of ways (e.g. response rates, means and standard deviations of responses, frequency distributions and cross-classifications). Second, one may proceed to a more complex analysis which seeks to confirm statistical hypotheses and find causal relationships among the variables. This model-building phase is frequently the purpose of many transport-related surveys. In this Chapter, we shall first concentrate on simple exploratory analysis and then describe some of the basic methods of building causal relationships from the data. A more complete description of multivariate analysis and model building may be found in texts such as Stopher and Meyburg (1979).

For the analysis of sample data, two alternatives present themselves. You can write your own computer programs for the analysis of data or else use a commercially available computer package for analysis. Writing your own program is usually applicable for special survey methods where many calculations and transformations are required. However, for most sample surveys it is now possible to make use of one of the many statistical analysis programs commonly available. They save a lot of time, they present results in neat easily-readable format and they don't require a great deal of computer expertise (although knowledge of the underlying statistical techniques is essential). The one danger with using some statistical packages is that it is often tempting to run all sorts of statistical analyses just to see what eventuates. Such "fishing trips" should generally be avoided since, apart from being inconsistent with the objectives of scientific social surveys, they can turn out to be very expensive and can lead to simplistic and misleading conclusions. As mentioned much earlier, the type of analysis to be performed should be known before the survey is designed.

The choice of which package to use will often depend on the computer facilities available. In the past, much of the analysis of travel survey data was performed on mainframe or minicomputers (because of the needs for substantial data storage and rapid computation speed). Under these conditions, the SPSS (Statistical Package for the Social Sciences) package was probably the most widely used package. It is relatively easy to use (even though it is rather inefficient, computationally, in its operation). No computer programming expertise is required as it is only necessary to prepare a few commands describing the dataset and then tell SPSS what analysis to perform. SPSS can be used both to edit the data and to carry out final statistical analysis of the data. Full details of the SPSS package can be found in Nie et al. (1975). Apart from SPSS, a number of other statistical packages are available for mainframe computers and minicomputers. The most well-known packages include SAS and IMSL.

In recent years, however, the trend has been away from the use of mainframe and minicomputers and towards the use of microcomputers and workstations (as exemplified by the IBM and Apple Macintosh range of microcomputers). The use of micros makes available a vast array of analytical software previously not available, and enables small agencies to conduct and analyse their own surveys. While mainframes and minicomputers may still be desirable (because of computational speed) for very large survey databases, the availability, ease of use and graphics capabilities of modern microcomputers make them the system of choice for most survey tasks. As noted in Chapter 5, the micro can also be used in other aspects of survey design such as the graphic design of survey forms.

In using microcomputers for the analysis of survey data, there are three major types of program which are useful, depending on the size of the survey and the type of analysis to be performed. For data entry and editing, spreadsheets and database manager programs are most useful, as described in Chapter 8. These programs can also be useful for some forms of data analysis which do not require the use of complex statistical procedures. When attempting to perform statistical analysis on the data, however, it is better to use a dedicated statistical package (many of which have some spreadsheet and database capabilities). There are essentially two types of statistical package which are useful in data analysis: "exploratory" statistical packages and "confirmatory" statistical packages. Exploratory statistical analysis is most useful when you are trying to understand the data and to get an intuitive feel for what the data are trying to tell you. Confirmatory statistical analysis is most useful when you are trying to test whether the data support a preconceived hypothesis (which may have been obtained from the exploratory analysis, or which you may have had before conducting the survey). Confirmatory statistical analysis can take the form of statistical hypothesis testing or the construction of multivariate causal models. Both types of analysis will be discussed in this Chapter.

10.1 EXPLORATORY DATA ANALYSIS

The concept of exploratory data analysis is best exemplified by the works of Tukey (1977) who coined the term "exploratory data analysis" and who drew the analogy between data analysis and "detective work". To quote Tukey:

"A detective investigating a crime needs both tools and understanding. If he has no fingerprint powder, he will fail to find fingerprints on most surfaces. If he does not understand where the criminal is likely to have put his fingers, he will not look in the right places. Equally, the analyst of data needs both tools and understanding."

Tukey's tools were largely derived for manual use; he proposed a number of innovative graphical techniques for representing data which could be done manually on relatively small datasets, using "graph paper and tracing paper when you can get it, backs of envelopes if necessary". Tukey was openly sceptical of these tasks being replaced by computerised methods, mainly because of the inaccessibility of computers to the average analyst (in the mid-1970's). He could not have foreseen the emergence and widespread adoption of microcomputers in the 1980's, although he did foresee one development in this area; "But there is hope. Paul Velleman has been putting together facilities adapted to the use of such techniques in research...". Ten years later, Velleman is the author of an innovative data analysis program for microcomputers which utilises many of Tukey's concepts. This program, Data Desk, utilises the interactive graphics capabilities of the Macintosh to enable analysts to be "detectives" in true Tukey fashion.

Within this section, we shall describe several of Tukey's methods of exploring data, and we shall use Data Desk to show how these ideas can be used in practice. In particular, we will perform analysis on the flat data file described in Chapter 8. Remember that we had used FoxBASE to extract a subset of data from the trip and person data files and combine them in a single flat file, as shown in Figure 10.1.

Figure 10.1 A Flat File Created for Statistical Analysis

To use these data for statistical analysis, it is necessary to transfer them to a statistical analysis program (in this case, Data Desk). Generally, each program stores the data in a form which is unique to that program. However, one can almost always store data from any of these programs in a text file, and one can almost always import data from a text file into any one of these programs. This, in fact, is how the transfer was made from FoxBASE to Data Desk. When imported into Data Desk and supplied with the names of the variables, the data appear in Data Desk as shown in Figure 10.2.

Figure 10.2 Data Desk's Representation of Variables as Icons
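The text-file transfer described above can be illustrated with a minimal sketch in Python. It is not the FoxBASE or Data Desk procedure itself; the tab delimiter, the column names (person_id, trips, vehicles) and the data values are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical tab-delimited export; the column names and values are assumed
# for illustration and are not the survey data described in the text.
exported_text = "person_id\ttrips\tvehicles\n1\t4\t2\n2\t0\t1\n3\t7\t6\n"

def read_flat_file(handle):
    """Read a tab-delimited flat file into a list of dicts with integer values."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [{name: int(value) for name, value in row.items()} for row in reader]

# io.StringIO stands in for a file on disk so that the sketch is self-contained.
records = read_flat_file(io.StringIO(exported_text))
print(records[2])
```

Each record keeps its identifier alongside the analysis variables, so that any value can later be traced back to the person it belongs to.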

Each variable appears on-screen as a separate column of data. When these icons are opened (by double-clicking on them), the data values appear as shown in Figure 10.3 for the "trips" and "vehicles" variables.

Although the data values still appear in different windows, they are in fact part of the same larger data file. When a record is selected in one of the variable windows, the corresponding value in the other variable window is also selected, as shown in Figure 10.3. Thus the seventh person in the data file has made 7 trips and belongs to a household which owns 6 vehicles.

The variable icons may be re-ordered on the screen simply by dragging them with the mouse to the position required in the group of variable icons.

Figure 10.3 The Contents of Data Desk Variable Icons

Many options are available for examination of the raw data, and only a few of them can be covered here. For a more complete coverage, see Tukey (1977), Velleman and Hoaglin (1981), and Velleman and Velleman (1988). The simplest way to examine data is to look at them, and to calculate some simple statistics. The most conventional way to look at the data is by means of a histogram as shown in Figure 10.4.

Figure 10.4 A Histogram of the Number of Trips per Person per Day

This histogram may also be summarised statistically by means of a number of summary statistics such as the mean, standard deviation, mode and median (50th percentile), as shown in Figure 10.5.

Figure 10.5 Summary Statistics of the Number of Trips per Person per Day
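As a rough illustration of the summary statistics mentioned above, the following sketch computes them with Python's standard library. The trip values are hypothetical and are not the survey data shown in Figure 10.5.

```python
import statistics

# Hypothetical trips-per-person values, assumed for illustration only.
trips = [0, 2, 2, 3, 4, 4, 4, 5, 6, 7, 12]

summary = {
    "n": len(trips),
    "mean": statistics.mean(trips),
    "std_dev": statistics.stdev(trips),  # sample standard deviation (n - 1 divisor)
    "median": statistics.median(trips),  # the 50th percentile
    "mode": statistics.mode(trips),      # the most frequently occurring value
}

for name, value in summary.items():
    print(f"{name}: {value}")
```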

Several of these statistics may also be summarised graphically by means of a boxplot (also known as a "box and whiskers" plot) as shown in Figure 10.6. Examining the data in this way is very useful for detecting outliers in the data, and for determining whether these data are genuine or whether they are mistakes. By searching in the variable list for the outlier, one can then identify the person and household to which it belongs (this is the reason such identifiers are always attached to a dataset for statistical analysis), and then examine the questionnaire or interview schedule to determine the validity of the data point.

In this boxplot, the box represents the middle half of the data between the 25th and 75th percentiles. The line in the middle of the box is the median. The whiskers extend beyond the box to the data points which are not further away from the box than 1.5 times the depth of the box. Points between here and 3 times the depth are circled, while points further away are marked with a "starburst".

Figure 10.6 A Boxplot of the Number of Trips per Person per Day
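The boxplot rules just described can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Data Desk's algorithm: the quartiles are computed with the "inclusive" interpolation method of the standard library (packages differ in how they define quartiles), and the data values are hypothetical.

```python
import statistics

def boxplot_summary(values):
    """Classify values using the 1.5x and 3x box-depth rules described in the text."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    depth = q3 - q1  # depth of the box (the interquartile range)
    inner_low, inner_high = q1 - 1.5 * depth, q3 + 1.5 * depth
    outer_low, outer_high = q1 - 3.0 * depth, q3 + 3.0 * depth

    inside = [v for v in values if inner_low <= v <= inner_high]
    circled = [v for v in values
               if outer_low <= v < inner_low or inner_high < v <= outer_high]
    starburst = [v for v in values if v < outer_low or v > outer_high]

    return {
        "box": (q1, median, q3),
        "whiskers": (min(inside), max(inside)),  # most extreme points within 1.5 depths
        "circled": circled,                      # between 1.5 and 3 depths from the box
        "starburst": starburst,                  # more than 3 depths from the box
    }

# Hypothetical trips-per-person values, assumed for illustration only.
trips = [0, 2, 2, 3, 4, 4, 4, 5, 6, 7, 12, 20]
result = boxplot_summary(trips)
print(result)
```

Any value flagged in this way would then be traced back to its questionnaire, as described above, before deciding whether it is genuine.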

Once this one-way analysis has been conducted for each variable, it is then useful to examine the variables two-at-a-time to see, first, if there are any logical inconsistencies that have not been discovered by the previous editing routines, and secondly, whether any causal or correlative relationships appear to exist between the two variables. For example, it might be expected that there may be a relationship between the number of trips per person per day and the number of vehicles owned by that person's household. To test this, both variable icons can be selected on the screen, and then a menu item selected to create a scatterplot as shown in Figure 10.7.

At first glance, this plot may appear to suggest that there is not likely to be a relationship between the two variables. However, it should be remembered that there are 904 data values in the data file and only 94 points appearing on the scatterplot. Therefore, on average, each plotted point represents about 10 data values. It is also unlikely that the data values will be evenly distributed across all the visible data points. There may well be a relationship "lurking" beneath the surface which only further exploration will uncover. One way of trying to uncover this relationship is to place the data values into groups based on vehicle ownership and then see whether there is any systematic variation in trips per day between the groups. This can be done by selecting both the variable icons, and then selecting a menu item which will perform this grouping, as shown in Figure 10.8.

Figure 10.7 A Scatterplot of Trip Rate vs Household Vehicles

Having selected the option shown in Figure 10.8, Data Desk will now split the data values for the first variable (no. of trips) into groups based on the second variable (vehicles) and will create a new variable icon for each of these groups. These data groups can then be analysed individually or in concert. For example, if all the data groups are selected and the Boxplot menu item is then selected, a combined set of boxplots for each of the groups will be produced as shown in Figure 10.9. When presented in this fashion, it can be seen that there may be a relationship between the two variables because of the way that the "boxes" and the medians move upward as the number of vehicles increases.

Figure 10.8 Splitting the Data into Groups Based on Vehicle Ownership

Figure 10.9 Boxplots of Trips per Day for the Vehicle Ownership Groups

This trend can be even more clearly established if the mean trip rate within each of these groups is plotted against the number of vehicles, as shown in Figure 10.10.

Figure 10.10 Scatterplot of Person trips vs Household Vehicles
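The grouping step behind Figures 10.8 to 10.10 can be sketched as follows. The records are hypothetical (trips, vehicles) pairs invented for illustration; the function name is likewise an assumption and does not correspond to any Data Desk command.

```python
from collections import defaultdict

# Hypothetical (trips per day, household vehicles) records for eight persons.
records = [(2, 0), (3, 0), (3, 1), (4, 1), (5, 1), (4, 2), (6, 2), (7, 3)]

def mean_trips_by_vehicles(pairs):
    """Group trip counts by vehicle ownership and return the mean of each group."""
    groups = defaultdict(list)
    for trips, vehicles in pairs:
        groups[vehicles].append(trips)
    return {vehicles: sum(values) / len(values)
            for vehicles, values in sorted(groups.items())}

group_means = mean_trips_by_vehicles(records)
for vehicles, mean_trips in group_means.items():
    print(f"{vehicles} vehicles: mean trips = {mean_trips:.2f}")
```

Plotting these group means against the number of vehicles removes the overplotting that hides the trend in the raw scatterplot.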

Figure 10.10 shows a relatively clear relationship between the average number of trips per person per day and the number of vehicles owned by that person's household.

In going beyond relationships between two variables, it has traditionally been difficult to investigate such relationships visually. While it is at least theoretically possible to represent the relationship between three variables in three-dimensional space, the difficulties of preparing graphs to represent this have been considerable. However, several microcomputer statistical programs, including Data Desk, now have the ability not only to graph the variables in three-dimensional space, but also to rotate them at the request of the user, thus enabling you to truly look at the data from many different perspectives. While this feature provides an important tool for exploring data, it is, unfortunately, not possible to present its capabilities on the printed page.

10.2 CONFIRMATORY DATA ANALYSIS

The preceding section has dealt with the exploratory analysis of data to obtain an intuitive feel for the data and perhaps to develop some additional hypotheses. This exploratory analysis is usually followed by a more formal confirmatory data analysis, which firstly summarises the data in conventional statistical terms, and may then attempt to relate elements of the data to each other in terms of statistical and causal models. The intention in this section is to provide some of the background knowledge necessary for the development of statistical models, where the use of the term statistical implies the use of sample data to produce some form of mathematical relationship. This section will not provide complete details about the statistical model building procedures discussed; rather, it aims to provide sufficient material at an introductory level to enable you to decide whether such an analysis may be useful to you, and to interpret the output obtained from statistical packages which perform these analyses.

The first thing that should be emphasised is that statistical modelling techniques should not be regarded as a means of pulling some relationship out of thin air. This is a frequent and serious misuse of confirmatory statistical analysis techniques. Rather, the use to which confirmatory statistical modelling techniques should be put is that of attempting to determine the parameters of a theorised relationship (you should use exploratory data analysis techniques if you are just trying to get some ideas about the data).

In other words, a hypothesis (or hypotheses) must be advanced, expressing a relationship that might be expected between certain phenomena (variables). The statistical techniques described in the remaining sections of this chapter represent one means by which certain types of hypotheses may be tested. It is extremely important that the reader understands this idea of hypothesis testing, since it underlies and explains much of the procedure of statistical model building and testing. This point has been outlined in Chapter 4 of this book, and will be reiterated in specific contexts of certain statistical methods in the succeeding sections of this chapter.

Before detailing the various statistical techniques, it is appropriate to review the reasons for statistical modelling and the general aims of model building. The word "model" is used in the purely statistical sense as invoked in many areas of engineering and applied science. A model may be defined as an abstraction of reality. Specifically, this means that a model is intended to be a simplification of reality, not a replica of reality. In this sense, one may distinguish between a model and a physical law. A physical law is reality, without approximation. Thus Newton's law of gravitation (Newton, 1713) is an exact phenomenological statement (if one accepts the precepts of Newtonian physics). It is the reality.

$$F = \frac{G M_1 M_2}{d^2} \tag{10.1}$$

where F = the gravitational attraction between two bodies
G = Newton's gravitational constant
M1, M2 = the masses of bodies 1 and 2
and d = the distance between the two bodies.

The relationship of equation 10.1 (Newton's law of gravitation) is not a model within this definition. It is an accurate, precise, and complete statement of a relationship that always holds. To illustrate more clearly just what a model is, one might consider a relationship between the yield of wheat per acre (W), the rate of fertiliser application (F), the number of years since wheat was last grown on that land (Y), and the number of inches of rain since seeding (R). A simple linear relationship might be:

$$W = a_0 + a_1 F + a_2 Y + a_3 R \tag{10.2}$$

Indeed, this equation may represent the hypothesis to be tested. The questions to be answered by the analyst are: can non-zero values be found for a1, a2, and a3, and how well does the right side of equation 10.2 predict the observed values of the left side? There is no known law that relates yields of wheat to these three other variables precisely and completely. However, it seems reasonable to expect that some relationship might exist between them. Herein lies the essence of a model. There may be many other variables that will affect the relationship, for example, the quality of the seed, the number of hours of sunshine, the amount of rain and sun at particular growth periods, and the type of soil. Equation 10.2 is, however, an abstraction of reality. It is incomplete, not very precise, and subject to error. But it provides insights into the relationship and may represent a useful relationship for predicting yields of wheat from limited available data.

It should be clear from this illustration that a number of demands must be made of the process for testing the hypothesis represented by equation 10.2. First, it is necessary to be able to test the null hypothesis that the relationship adds nothing to our knowledge and understanding. This is equivalent to saying that values of W in equation 10.2 could be predicted by guesswork (i.e., a random process) as accurately as by use of the model. If this hypothesis can be rejected, it is appropriate to determine whether all the variables in the model are necessary for reasonable prediction of W, whether some variables add no new information, and whether other variables are necessary to obtain good predictions. As will be seen, these concerns are raised for each statistical model-building process treated in the remainder of this chapter.

More needs to be said about the basic properties of a model. It has been emphasised that a model is an abstraction of reality. It is also true that a model should be as accurate as possible, while still retaining its simplicity. It may be used as a predictive or forecasting tool, or it may be used to gain understanding of a process. For each of these uses, the probably conflicting goals of accuracy and simplicity are desired. For the purpose of subsequent discussions, simplicity may be interpreted as implying parsimony of variables and the use of the least complex of functional forms, consistent with the hypothesis to be tested. In this latter respect, a linear relationship (such as that shown in equation 10.2 and Figure 10.11) is the simplest. A monotonic, continuous non-linear function is next (as shown in Figure 10.12) and various types of discontinuous non-linear functions and continuous non-monotonic functions are next in their degree of complexity.

[Figure 10.11: plot of Y against X showing the linear relationship Y = 1 + 0.8X]

Figure 10.11 A Simple, Bivariate Linear Relationship

[Figure 10.12: plot of Y against X showing a non-linear relationship]

Figure 10.12 A Simple, Bivariate, Monotonic Non-linear Relationship

While retaining these properties of accuracy and simplicity, a model must also be useful for its purpose; that is, if predictive, it must be capable of predicting; if explanatory, it must provide explanation and understanding of the phenomenon being modelled. The model must also be economical to use. It should not need the use of data that are extremely difficult and costly to obtain, and it should be cheaper by far to use than real-world experimentation. A model must also be valid. Many interpretations can be given to validity, although principally a dichotomy of meaning may be made between descriptive and predictive models. If a model is intended to be descriptive only, validity may imply logic in the model structure and transferability to other geographic locations or other modelling situations (depending upon the discipline of study). Validity in a predictive model must also imply causality. In other words, the direction of "effect" must be correctly specified in the model. There are a number of other properties that a model should possess, but these are among the most important and serve adequately to set the stage for the description of the techniques themselves.

This discussion of models should also have pointed out another property possessed by any statistical model: error. Clearly, the preceding statements about accuracy, simplicity, and abstraction of reality all imply the existence of error. Furthermore, models in transport planning are generally built with the use of sample data. As discussed in Chapter 4, sample data possess sampling errors. In fact, there are three primary sources of error in statistical modelling (Alonso, 1968). These are specification error, measurement error, and calibration error. Specification error is generally unknowable and derives from the inclusion of unnecessary variables or the exclusion of necessary variables. While exhaustive testing may eliminate the former, it is never possible to determine the extent of the latter. Measurement error derives from the sampling error and from imprecision and inaccuracy of the actual measurements made to generate the data. The measurement error that arises from sampling can be computed, as described in Chapter 4, but that arising from imprecision of actual measurement often cannot be determined, particularly when humans are the subjects of the data collection and model building. Calibration error occurs in the modelling process as a result of the use of data containing measurement errors and the building of a model with specification error.

Measurement error is a property of the data and cannot be ameliorated substantially in the modelling process, with the exception of error propagation (i.e., the means by which a model magnifies or diminishes errors in different variables). One of the goals of statistical model building is to minimise error. Specification error is minimised by careful selection of variables in the model through prior exploratory statistical analysis and visual examination of the data, together with a carefully reasoned hypothesis of structure and variable content. Calibration error is minimised in most statistical techniques as the major principle of model building. The degree to which it can be minimised, however, depends upon a wide range of properties of the data and the model-building technique.

The trade-off between specification and measurement error provides an interesting perspective on the relative effort which should be spent on data collection and modelling in any transport study. Alonso (1968) has postulated that, although specification error cannot be measured exactly, it is likely that specification error will drop rapidly as the complexity of a model increases, as shown in Figure 10.13 (complexity is defined as being measured by the number of relevant explanatory variables included in the model).

However, each variable included in the model will have a degree of measurement error associated with it, so that the inclusion of more variables will mean that there is a greater total amount of measurement error in the model, as shown in Figure 10.14.

[Figure 10.13: specification error (vertical axis: Error) plotted against model complexity]

Figure 10.13 Specification Error and Model Complexity

[Figure 10.14: measurement error (vertical axis: Error) plotted against model complexity]

Figure 10.14 Measurement Error and Model Complexity

Since the total error in the model includes both specification and measurement error (ignoring calibration error for the moment), the total error can be obtained by:

$$E_{total} = \sqrt{e_{spec}^2 + e_{meas}^2} \tag{10.3}$$

This total error is shown in Figure 10.15, wherein it can be seen that, because of the countervailing effects of each source of error, there is in fact an optimum degree of model complexity in order to obtain the minimum total error in the model. The implications of Figure 10.15 are important enough to bear repeating. The best model is not necessarily the most complex model. This is due to the fact that error arises from both specification and measurement error. While a more complex model will reduce the specification error, it will also increase the measurement error. At some point, the inclusion of more variables into the model will increase the measurement error more than it will reduce the specification error.

This trade-off between specification error and measurement error can be further demonstrated by considering the use of a dataset which has a higher degree of measurement error (perhaps because we spent less time and effort on quality control in the data collection process). Under these conditions, the measurement error will be higher at all levels of model complexity, as will be the total error, as shown in Figure 10.16.

[Figure 10.15: curves of espec, emeas and Etotal (vertical axis: Error) plotted against model complexity]

Figure 10.15 Relationship between Total Error and Model Complexity

[Figure 10.16: curves of espec, emeas, e'meas, Etotal and E'total (vertical axis: Error) plotted against model complexity]

Figure 10.16 The Effect of Bad Data on Total Model Error

The really important feature of Figure 10.16, however, is that apart from simply being higher, the total error curve is minimised at a lower level of model complexity (i.e. the valley in this total error curve is shifted to the left). The implication of this is that if you have worse data, then you should also use simpler models. This finding runs counter to actual modelling practice in many cases, where modellers believe that they can overcome the effects of bad data by using more complex models. As shown in Figure 10.16, using more complex models with bad data simply increases the total error in the model.
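The argument of Figures 10.15 and 10.16 can be illustrated numerically. The sketch below assumes that the total error combines the two sources as the square root of the sum of their squares (equation 10.3), and it assumes purely hypothetical curve shapes: a specification error that falls as 1/k and a measurement error that grows as the square root of k, where k is the number of variables. Alonso (1968) does not prescribe these particular forms; they are chosen only to reproduce the qualitative behaviour.

```python
import math

def total_error(k, spec_scale, meas_scale):
    """Total error for a model with k variables under assumed error curves."""
    e_spec = spec_scale / k             # assumed: falls as variables are added
    e_meas = meas_scale * math.sqrt(k)  # assumed: grows as variables are added
    return math.sqrt(e_spec ** 2 + e_meas ** 2)  # combination as in equation 10.3

def best_complexity(spec_scale, meas_scale, max_vars=20):
    """Return the number of variables (1..max_vars) that minimises total error."""
    return min(range(1, max_vars + 1),
               key=lambda k: total_error(k, spec_scale, meas_scale))

good_data = best_complexity(spec_scale=10.0, meas_scale=1.0)
poor_data = best_complexity(spec_scale=10.0, meas_scale=2.0)  # doubled measurement error

print("optimum complexity with good data:", good_data)
print("optimum complexity with poor data:", poor_data)
```

Under these assumptions, doubling the measurement error both raises the minimum total error and moves the optimum to a model with fewer variables, which is the shift to the left described above.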

The policy lesson that comes from consideration of the above arguments is that there must be a balance between the time and effort spent on data collection and data analysis (model building). You cannot compensate for poor quality data by doing better analysis (remember: GIGO, garbage-in/garbage-out). On the other hand, having collected high quality data, you should use more complex methods of analysis to obtain the most information out of this dataset if you wish to minimise the total error.

To complete this discussion of errors, it is necessary to consider some specialised terminology. In statistical model building, there are generally two types of variables: dependent and independent. (As discussed later, this taxonomy of variables is not always adequate, particularly in the case of more complex and specialised model-building techniques.) A dependent variable is the variable to be explained or predicted by the model. It therefore depends upon the other variables. Dependency implies a strict one-way causality in almost all cases. A dependent variable is caused to change by changes in the independent variables. The reverse causality, however, must not occur. Changes in the dependent variable cannot cause changes in the independent variables.

The example of equation 10.2 may be used to illustrate this: W, the yield of wheat, is the dependent variable. Its value is changed by the other variables (fertiliser application, number of years since wheat was last grown, inches of rainfall). These latter three variables are independent variables. If the rate of fertiliser application is changed, the yield of wheat may be expected to change as a result. The yield of wheat, however, will not change the rate of application of fertiliser (in the year) or the amount of rainfall. The unidirectional causality is upheld by this model specification and the meanings of dependent and independent variables should be clear. It is important to note, however, that independence, as applied to variables, does not necessarily imply that the independent variables do not cause changes in each other or change together. This may be a required property in some instances and a preferred property in all instances, but it is not part of the definition of independence. Thus the rate of fertiliser application may depend to some extent on rainfall amounts or the period since wheat was last grown on the land, without violating the independence property of these variables.

To calibrate a model, observations are needed on a sample of values of both dependent and independent variables for the units of concern. The resulting calibrated model would then be used to explain or predict the dependent variable. These sample values will have errors associated with them that will affect the estimation of model parameters in calibration. Various mathematical operations on the independent variables may serve to propagate (magnify) individual errors to a much larger error in the dependent variable, while other operations may reduce errors or at least leave them unchanged. A number of rules can be put forward to reduce error propagation in models (Stopher and Meyburg, 1975). These may be summarised as follows: the preferred mathematical operation is addition, followed by multiplication and division, while the least desirable operations are subtraction and raising variables to powers.
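The ranking of operations can be illustrated with the standard first-order formulas for propagating independent errors. These formulas are an assumption brought in for illustration; they are not necessarily the form in which Stopher and Meyburg (1975) state their rules, and the input values and errors are hypothetical.

```python
import math

def rel_error_sum(x, ex, y, ey):
    """Relative error of x + y, assuming independent errors ex and ey."""
    return math.hypot(ex, ey) / abs(x + y)

def rel_error_difference(x, ex, y, ey):
    """Relative error of x - y: same absolute error, but a smaller result."""
    return math.hypot(ex, ey) / abs(x - y)

def rel_error_product(x, ex, y, ey):
    """Relative error of x * y (the same formula applies to x / y)."""
    return math.hypot(ex / x, ey / y)

def rel_error_power(x, ex, exponent):
    """Relative error of x ** exponent."""
    return abs(exponent) * ex / abs(x)

# Hypothetical variables of similar size, each measured with an error of 5 units.
x, ex = 100.0, 5.0
y, ey = 90.0, 5.0

errors = {
    "addition": rel_error_sum(x, ex, y, ey),
    "multiplication": rel_error_product(x, ex, y, ey),
    "squaring": rel_error_power(x, ex, 2),
    "subtraction": rel_error_difference(x, ex, y, ey),
}
for operation, value in errors.items():
    print(f"{operation}: relative error = {value:.3f}")
```

With these values, subtraction of two similar quantities is by far the worst case, because the absolute error is unchanged while the result itself becomes small.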

10.2.1 Bivariate regression

The simplest form of relationship that can be hypothesised is a linear one, as typified by equation 10.2. Furthermore, the simplest linear relationship is one involving one dependent and one independent variable. The statistical procedure for developing a linear relationship from a set of observations is known as linear regression. The simple case of one dependent and one independent variable is called bivariate linear regression, while the use of multiple independent variables is called multivariate linear regression. The procedures for estimating linear relationships can be described most economically and understandably by considering the simple bivariate case.

A relationship is hypothesised of the form:

$$Y_i = \alpha_0 + \alpha_1 X_i \tag{10.4}$$

where Yi = ith value of the dependent variable
Xi = associated ith value of the independent variable
and α0, α1 = true values of the linear regression parameters

The parameter values, α0 and α1, are those sought by the estimation procedure. The true values are those that can be obtained by estimating the model from the entire population of values of Y and X. In other words, the connotation "true" refers to population values as opposed to sample values, but does not refer to the existence or non-existence of a linear relationship of the form of equation 10.4. Thus the true value of α1 may be zero, indicating that there is no relationship between X and Y. This means that equation 10.4 is the equation sought, but the equation that can be estimated is one based on sample estimates, such as:

$$Y_i = a_0 + a_1 X_i + \varepsilon_i \tag{10.5}$$

where Yi, Xi are as defined before,
a0, a1 = sample estimates of α0 and α1
εi = the error associated with the ith observation

The addition of the error term, εi, is significant in several respects. First, it is a clear indication of the acceptance of sample error in the calibration process. This error term, however, does not account for measurement or specification errors. Secondly, it is significant that the error is represented as being additive in the model. This indicates that the error is seen as being effectively independent of the independent variable, Xi, since it does not interact with it. In fact, the calibration procedure of linear regression expressly assumes that the independent variables are known without error for all observations and that the error resides only in the dependent variable, Yi. Thirdly, the value εi is assumed to exist, but cannot be measured until the values of a0 and a1 are obtained. The existence of εi is, however, the mathematical basis for obtaining estimates of a0 and a1.

Before attempting to build a linear relationship, such as that presented in equation 10.5, it is necessary to determine whether a linear relationship is appropriate to describe the data. The various measures, discussed in later sections of this chapter, do not indicate whether a linear relationship is the best one to describe a phenomenon. They indicate only how good a fit is obtained by the linear function.

The first step in the analytical process, as described in Chapter 10.1, is to conduct exploratory data analysis. With respect to linear regression analysis, this first step must always be to construct a scatter diagram. Having selected a variable to be the dependent variable, a plot should be constructed of each independent variable against the dependent variable. These plots will generally show fairly clearly how appropriate a linear relationship is, or whether any relationship may be expected.

Some typical, hypothetical scatter diagrams are shown in Figures 10.17 through 10.20. Figure 10.17 shows evidence of a positive linear relationship between Y and X. The relationship is positive because increases in X give rise to increases in Y. Figure 10.18 shows the reverse relationship, but still a linear one. Figure 10.19 shows a more-or-less horizontal band of values, indicating no relationship between Y and X. Figure 10.20 indicates a probable non-linear relationship between Y and X for which a linear regression would be inappropriate. Having constructed a scatter diagram and found prima facie evidence of a linear relationship, the analyst may then proceed to construct a regression relationship.

Figure 10.17 Scatter Diagram of a Positive Linear Relationship

Figure 10.18 Scatter Diagram of a Negative Linear Relationship

Figure 10.19 Scatter Diagram showing No Relationship

Figure 10.20 Scatter Diagram showing Non-Linear Relationship

If either of the plots of Figures 10.19 or 10.20 were obtained, then no linear-regression procedure should be applied to the data.

Given equation 10.5, the problem is now to determine a solution method for a0 and a1. Clearly, any solution should seek to minimise the error of calibration (measurement and specification errors must be considered as given for an assumed model and dataset). By definition, the sum of the error terms for the population is zero,

$$\sum_{i=1}^{N} \varepsilon_i = 0 \tag{10.6}$$

where N = population size.

This is so because the true model is defined as the one that would be obtained for the population (equation 10.4) in which εi is absent. Hence, both the sum and average of the error terms over the population must be zero. In turn, this implies that in any sample there must be a range of values of εi that are distributed across both negative and positive values.

These properties lead to the conclusion that minimising the sum of the errors for the sample data is not an effective procedure. The absolute sum of the errors (if the sample is indeed drawn correctly) must approach zero as sample size is increased. In fact, in any large sample (where large may be defined as being in excess of 100 observations), the deviation of the sum of the error values from zero will be very small. It may also be noted that a strict minimisation of the sum of the error values would lead to a search for values of a0 and a1 that would produce extremely large negative values of εi, since the sum has no lower bound. Such a procedure would lead to estimates of a0 and a1 that would be as far from the true values as possible. Indeed, the calibration would generate a value of plus infinity for a0 and zero for a1, thus generating all values of εi as minus infinity.

To avoid trivial or absurd calibration results, it would appear that the best procedure would be to minimise the sum of the squares of the error terms. By squaring the terms, all values to be summed become positive. Hence, the absolute minimum value becomes zero, occurring only when all values of εi are also zero. Thus minimising the sample sum of squared error terms leads to a calibration that is consistent with the true population model. The least-squares approach, as this procedure is termed, also has another important property that is discussed in Chapter 10.2.5: that of providing maximum-likelihood estimates of the regression coefficients that are unbiased for large samples.
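The contrast between minimising the sum of the errors and minimising the sum of their squares can be checked with a small numerical sketch. The four observations are hypothetical, and the slope is held at zero so that only the intercept a0 varies.

```python
# Hypothetical observations, assumed for illustration only.
xs = [1, 2, 3, 4]
ys = [2, 4, 5, 8]

def sum_of_errors(a0, a1):
    """Plain sum of the errors Yi - a0 - a1*Xi."""
    return sum(y - a0 - a1 * x for x, y in zip(xs, ys))

def sum_of_squares(a0, a1):
    """Sum of the squared errors."""
    return sum((y - a0 - a1 * x) ** 2 for x, y in zip(xs, ys))

# The plain sum keeps falling as a0 grows, so it has no minimum.
for a0 in (10, 100, 1000):
    print(f"a0 = {a0:>4}: sum of errors = {sum_of_errors(a0, 0)}")

# With the slope fixed at zero, the sum of squares is smallest at the mean of Y.
y_mean = sum(ys) / len(ys)
print("sum of squares at a0 = mean of Y:", sum_of_squares(y_mean, 0))
print("sum of squares at a0 = 10:", sum_of_squares(10, 0))
```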

Suppose there is a set of n observations of the values of Yi and Xi. The hypothesised relationship between Xi and Yi is that of equation 10.5. This equation may be rewritten to express εi in terms of the observed values of Yi and Xi and the unknowns, a0 and a1,

$$\varepsilon_i = Y_i - a_0 - a_1 X_i \tag{10.7}$$

The sum of the squares of the deviations, S, from the true line is:

$$S = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} (Y_i - a_0 - a_1 X_i)^2 \tag{10.8}$$

where n = sample size.

The least squares method then states that the linear-regression line is that which minimises the sum of the squares of the deviations, that is, the true line is that for which Σεi² is a minimum. Obtaining estimated values of a0 and a1 is a mathematical operation of no great complexity.

To find the minimum value of S, partial derivatives of S are taken with respect to each coefficient in the equation, as shown in equations 10.9 and 10.10.

$$\frac{\partial S}{\partial a_0} = -2 \sum (Y_i - a_0 - a_1 X_i) \tag{10.9}$$

$$\frac{\partial S}{\partial a_1} = -2 \sum X_i (Y_i - a_0 - a_1 X_i) \tag{10.10}$$

Minimisation of S occurs when the partial derivatives of S are zero and the second derivatives are positive. Hence, a0 and a1 may be found by setting equations 10.9 and 10.10 to zero.

$$\sum_i (Y_i - a_0 - a_1 X_i) = 0 \tag{10.11}$$

$$\sum_i X_i (Y_i - a_0 - a_1 X_i) = 0 \tag{10.12}$$

With the necessary mathematical manipulations, the solutions for a0 and a1 may be obtained.

$$a_1 = \frac{\sum_i X_i Y_i - n \bar{Y} \bar{X}}{\sum_i X_i^2 - n \bar{X}^2} \tag{10.13}$$

$$a_0 = \bar{Y} - a_1 \bar{X} \tag{10.14}$$
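As an illustration, equations 10.13 and 10.14 can be evaluated directly from the raw sums. The following is a minimal sketch only: the function name `fit_line` and the data are hypothetical, and the data have been chosen to lie exactly on the line Y = 1 + 2X so that the result can be checked by hand.

```python
def fit_line(xs, ys):
    """Return (a0, a1) for Y = a0 + a1*X using equations 10.13 and 10.14."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    denominator = sum_xx - n * x_bar ** 2
    if denominator == 0:
        raise ValueError("all X values are identical; the slope is undefined")
    a1 = (sum_xy - n * y_bar * x_bar) / denominator   # equation 10.13
    a0 = y_bar - a1 * x_bar                           # equation 10.14
    return a0, a1


# Hypothetical observations lying exactly on Y = 1 + 2X
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
a0, a1 = fit_line(xs, ys)
print(a0, a1)  # intercept 1.0 and slope 2.0
```

Because the data contain no scatter, the fitted line reproduces the generating line; with real survey data the same arithmetic yields the least-squares estimates.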

It is often helpful to visualise the linear-regression procedure as a geometric problem. The hypothesised relationship of equation 10.5 is shown in Figure 10.21. From this, it can be seen that a0 is the intercept on the Y-axis and a1 is the slope of the line. By virtue of equation 10.14, the point of means (X̄, Ȳ) must be on the line, as shown in Figure 10.21.

[Figure: the line Yi = a0 + a1Xi plotted on X and Y axes, with intercept a0, slope a1, and the point of means marked.]

Figure 10.21 Geometric Representation of the Regression Hypothesis

The linear regression problem is shown in Figure 10.22 as the problem of finding the best straight line that fits the scatter of data points, representing the n observations of Xi and Yi. The assumption that the errors, εi, are in the Yi values only and that the Xi values are known without error indicates that the εi are the distances in the Y-dimension between the observed point (Xi, Yi) and the vertical projection of that point on the regression line, (Xi, Ŷi), as shown in Figure 10.22. Thus, the least-squares solution is the one that minimises the sum of the squares of these vertical displacements.

[Figure: scatter of data points with the fitted regression line, showing an observed point (Xi, Yi) and its vertical projection (Xi, Ŷi) on the line.]

Figure 10.22 Scatter Diagram of n Data Points

One reason for assuming the errors to reside in the Yi values and not in the Xi values relates to the use of the resulting model for prediction of Yi values. Since the errors are not known, it is necessary to assume that any new values of Xi, for which values of Yi are to be predicted, are known exactly. If the reverse assumption were made, that all of the error were in the Xi values, it would be necessary to know the error to predict Yi values. This is shown in Figure 10.23. One estimate of Yi would be obtained if the error were assumed to be zero. The best estimate of Yi, however, requires a knowledge of the error, ηi. As shown by the comparative positions of these two estimates in Figure 10.23, this knowledge of the error is critical. Clearly, the geometry of this shows that the independent variable must be assumed to be known without error.

This discussion leads to further insights into the definition or meaning of dependent and independent variables. Clearly, the independent variables are those to be used for prediction, while the dependent variable is the one to be predicted. Hence, the direction of causality is imposed on the regression model.

[Figure: fitted regression line on X and Y axes, showing the two alternative estimates of Yi for an observed Xi and the error ηi measured in the X-dimension.]

Figure 10.23 Prediction if Error Is Assumed to Be in Xi Measures

A property of regression models of particular interest concerns the sum of the residuals or error terms for the regression model. The residuals for the calibration data are the differences between the observed value of the dependent variable, Yi, and the regression estimate, Ŷi. This difference, as shown previously, is the error term, εi:

$$\varepsilon_i = Y_i - \hat{Y}_i \tag{10.15}$$

The sum of the residuals is:

$$\sum_i \varepsilon_i = \sum_i (Y_i - \hat{Y}_i) \tag{10.16}$$

To determine this value, it is necessary to substitute for Ŷi, using the regression equation with a0 replaced by its solution from equation 10.14:

$$\hat{Y}_i = \bar{Y} - a_1 \bar{X} + a_1 X_i \tag{10.17}$$

Substituting equation 10.17 into equation 10.16 yields:

$$\sum_i \varepsilon_i = \sum_i (Y_i - \bar{Y} + a_1 \bar{X} - a_1 X_i) \tag{10.18}$$

Summing the terms on the right side of equation 10.18 yields:

$$\sum_i \varepsilon_i = \sum_i Y_i - n \bar{Y} + a_1 n \bar{X} - a_1 \sum_i X_i \tag{10.19}$$

However, ΣYi is nȲ and ΣXi is nX̄. Hence, it is clear that the right side of equation 10.19 is zero.

$$\sum_i \varepsilon_i = 0 \tag{10.20}$$

This property is important primarily because it indicates that what might appear to be a potential test of a regression model is not a very good test. Specifically, re-substituting the values, Xi, used to calibrate the model for a test of predictive power is a very weak test. Clearly, each individual prediction, Ŷi, can be examined against the observed value of Yi. However, summation of the Ŷi values will yield exactly the summation of the Yi values. This will occur regardless of how good or how poor the model fit may be. Thus summed predictions from the calibration data provide no test of goodness-of-fit.
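The zero-sum property can be checked numerically. The sketch below is illustrative only: both data sets are hypothetical, one scattering closely about a line and one scattering widely, and the slope is computed in the centred form that is algebraically equivalent to equation 10.13.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope (centred form of eqs 10.13, 10.14)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    a1 = sxy / sxx
    return y_bar - a1 * x_bar, a1


def residual_sum(xs, ys):
    """Sum of the residuals Yi - (a0 + a1*Xi) over the calibration data."""
    a0, a1 = fit_line(xs, ys)
    return sum(y - (a0 + a1 * x) for x, y in zip(xs, ys))


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
good_fit = [2.0, 2.5, 4.5, 4.0, 6.0]   # hypothetical, close to a straight line
poor_fit = [5.0, 1.0, 6.0, 0.5, 4.0]   # hypothetical, widely scattered

sum_good = residual_sum(xs, good_fit)
sum_poor = residual_sum(xs, poor_fit)
print(sum_good, sum_poor)  # both zero apart from floating-point rounding
```

In both cases the residuals cancel, so comparing total predicted with total observed values on the calibration data says nothing about the quality of the fit.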

What, then, can be used to measure the goodness-of-fit of a regression equation? Remember that the basis of the linear regression procedure is to minimise the error sum of squares. Using concepts based on Analysis-of-Variance (ANOVA), Stopher and Meyburg (1979) show that it is possible to derive a measure of goodness-of-fit termed the coefficient of determination (R²) which may be expressed as:

$$R^2 = \frac{\sum_i (\hat{Y}_i - \bar{Y})^2}{\sum_i (Y_i - \bar{Y})^2} \tag{10.21}$$

Values of R² must lie between 0 and 1. The closer they are to unity, the better is the fit, while the closer they are to zero, the poorer is the fit. The coefficient of determination also signifies the proportion of the variance in Y which is "explained" by the regression model.

The square root of the coefficient of determination is called the correlation coefficient and is denoted as R (or r). While the coefficient of determination can never be negative, the correlation coefficient has a sign which indicates the direction of the relationship between the observed values of Xi and Yi. A positive value of R indicates that as Xi increases, so too does Yi, whereas a negative value of R indicates that as Xi increases, Yi decreases.
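Equation 10.21 and the signed correlation coefficient can be sketched as follows. The function name `goodness_of_fit` and the data are hypothetical; the data were chosen so that Y falls as X rises, which gives a negative correlation coefficient alongside a positive coefficient of determination.

```python
import math


def goodness_of_fit(xs, ys):
    """Return (R squared, r) for a bivariate least-squares regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    a1 = sxy / sxx
    a0 = y_bar - a1 * x_bar
    # Numerator and denominator of equation 10.21
    explained = sum((a0 + a1 * x - y_bar) ** 2 for x in xs)
    total = sum((y - y_bar) ** 2 for y in ys)
    if total == 0:
        raise ValueError("Y does not vary; R squared is undefined")
    r_squared = explained / total
    # The correlation coefficient takes the sign of the slope
    r = math.copysign(math.sqrt(r_squared), a1)
    return r_squared, r


# Hypothetical data in which Y decreases as X increases
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [6.0, 4.0, 4.5, 2.5, 2.0]
r_squared, r = goodness_of_fit(xs, ys)
print(round(r_squared, 3), round(r, 3))  # about 0.876 and -0.936
```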

While the correlation coefficient provides a measure of the goodness-of-fit, it is by itself insufficient to assess the worth of the regression model. A simple, if somewhat extreme, example suffices to demonstrate this. Suppose a dataset comprises just two observations from a much larger population of data points. The correlation coefficient for a regression estimated from this dataset will obviously be unity, since the regression line must pass through all (two) data points. Suppose now that extra data points from the population are included in the sample. Since it is unlikely that all the new data points will lie on the line between the two original data points, the correlation coefficient must fall below unity. Thus the coefficients in the regression model and the measure of goodness-of-fit are both functions of the number of data points used to estimate the regression model. What is needed is some way of estimating the reliability of these estimates as the sample size used in estimation is changed.

To assess the statistical reliability of the entire linear regression relationship, it is necessary to determine the probability that the correlation coefficient could have been obtained from random uncorrelated data. As noted earlier, any two random data points will always give a correlation coefficient of unity; the probability of random data giving high values of the correlation coefficient will decrease as the sample size increases. Again referring to the concepts of ANOVA, it is possible to show that the F-statistic can be used to test the probability that the correlation coefficient could have been obtained by chance, given the size of the sample used to estimate the regression coefficients. The calculated value of F is compared against the tabulated value of F for the required level of confidence of avoiding a Type-I error (a value of α at the 5% level is often used), and for degrees of freedom of ν1 = 1 and ν2 = n - 2 (where n is the sample size). If the calculated value of F is greater than the tabulated value, then the overall regression can be accepted as being statistically significant (i.e. not likely to have been generated by random data).
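The text does not reproduce the ANOVA formula for F. A minimal sketch, assuming the standard form for bivariate regression (regression sum of squares on 1 degree of freedom divided by the error sum of squares on n - 2 degrees of freedom), is given below. The function name `regression_f` and the data are hypothetical.

```python
def regression_f(xs, ys):
    """Return (F, (v1, v2)) for a bivariate least-squares regression."""
    n = len(xs)
    if n < 3:
        # With two points the fit is perfect and n - 2 = 0, so F is undefined
        raise ValueError("at least three observations are needed")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    a1 = sxy / sxx
    a0 = y_bar - a1 * x_bar
    ss_total = sum((y - y_bar) ** 2 for y in ys)
    ss_error = sum((y - a0 - a1 * x) ** 2 for x, y in zip(xs, ys))
    ss_regression = ss_total - ss_error
    if ss_error == 0:
        return float("inf"), (1, n - 2)
    f_value = (ss_regression / 1) / (ss_error / (n - 2))
    return f_value, (1, n - 2)


# Hypothetical sample of five observations
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0, 2.5, 4.5, 4.0, 6.0]
f_value, dof = regression_f(xs, ys)
print(round(f_value, 2), dof)  # about 21.24 with (1, 3) degrees of freedom
```

The calculated value would then be compared with the tabulated F for (1, 3) degrees of freedom at the chosen level; standard tables give 10.13 at the 5% level, so this small hypothetical regression would be judged significant at that level.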

To assess the reliability of the coefficients in the regression equation, use is made of the t-statistic. If independent random samples had been drawn from the population, then the resulting regression equation coefficients would have varied somewhat. If the data in the population are actually described by some underlying relationship between the independent and the dependent variables, then we would expect that essentially the same regression equation would be reproduced on these repeated samplings. However, we know that random fluctuations in the data will cause the coefficients to vary. What we would like to estimate is the extent to which these random fluctuations contribute to the values obtained for the coefficients. The t-statistic measures the relative magnitude of these fluctuations by taking the ratio of the average value of the coefficient to the standard error of the coefficient (i.e. the standard deviation of the coefficient if repeated sampling had been conducted). Fortunately, it is generally unnecessary to conduct repeated sampling, and the standard error of the estimate of the coefficient can be obtained by ANOVA techniques from a single sample regression. The calculated value of the t-statistic can then be compared with tabulated values of t to test the significance of the coefficient. If the calculated value of t is greater than the tabulated value of t at the 100(1 - α/2) percentage level with (n - 2) degrees of freedom, then the hypothesis that the true value of the coefficient is equal to zero (i.e. no correlation) can be rejected with a confidence of 100(1 - α)%. At a 95% level of confidence, and with a sample size of greater than about 50, the critical value of the t-statistic is approximately 2.00. The same t-test can be applied to the constant in the regression model, but the interpretation of the result is somewhat different. While a low value of t for a regression coefficient implies that the dependent variable is unlikely to be related to the independent variable (at least in a linear fashion), a low value of t for the constant implies that the regression (if one exists) is likely to pass through the origin.
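The standard errors themselves are not given in the text. The sketch below assumes the usual textbook expressions for a bivariate regression, with s² the error sum of squares divided by (n - 2): the standard error of a1 is sqrt(s²/Sxx) and that of a0 is sqrt(s²(1/n + X̄²/Sxx)), where Sxx is the sum of squared deviations of Xi about its mean. The function name and the data are hypothetical.

```python
import math


def coefficient_t_statistics(xs, ys):
    """Return (t for the constant a0, t for the coefficient a1)."""
    n = len(xs)
    if n < 3:
        raise ValueError("at least three observations are needed")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    a1 = sxy / sxx
    a0 = y_bar - a1 * x_bar
    ss_error = sum((y - a0 - a1 * x) ** 2 for x, y in zip(xs, ys))
    s_squared = ss_error / (n - 2)            # residual variance
    se_a1 = math.sqrt(s_squared / sxx)
    se_a0 = math.sqrt(s_squared * (1.0 / n + x_bar ** 2 / sxx))
    return a0 / se_a0, a1 / se_a1


# Hypothetical sample of five observations
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0, 2.5, 4.5, 4.0, 6.0]
t_constant, t_coefficient = coefficient_t_statistics(xs, ys)
print(round(t_constant, 2), round(t_coefficient, 2))  # about 3.76 and 4.61
```

Each value is compared with the tabulated t for (n - 2) degrees of freedom. For a bivariate regression the square of the t-statistic on the coefficient equals the F-statistic of the overall regression.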

To illustrate the potential application of linear regression, consider the finding in Chapter 10.1 that there appeared to be a linear relationship between the number of trips per person per day and the number of vehicles owned by that person's household (see Figure 10.10). This hypothesis can be tested by confirmatory analysis by seeing whether a statistically significant linear regression model can be fitted to the data.

Within Data Desk, this regression can be performed by clicking on the triangular arrow at the foot of the scatterplot (a so-called "HyperView" button), and then selecting the regression option as shown in Figure 10.24. The regression is then performed with the variable on the horizontal axis being assumed to be the independent variable and the variable on the vertical axis assumed to be the dependent variable. The results are then presented in tabular fashion in a window on top of the scatterplot, as in Figure 10.25.

In terms of the previous discussion, the following interpretation can be applied to the results of the regression. First, it appears from the coefficient of determination (R²) that a good regression may exist because of the high value of 80.5%. However, as described earlier, this high value may be misleading because of the small number of data points used (only 9 values of mean trip rate were used). To test this, we need to examine the value of the F-statistic. With 7 degrees of freedom, an F value of 28.9 would indicate a significant regression at almost the 0.1% level of significance (where the critical value of F is 29.25). That is, we could accept this regression and only be proved wrong (i.e. that no such regression exists) 1 in 1000 times. We can also see that the constant and the coefficient are both significant at better than the 5% level of significance (i.e. the values of t are both greater than 2.00). The t-statistic on the independent variable merely confirms what we have already seen from the F-statistic (because in bivariate linear regression F is the square of t), while the t-statistic on the constant indicates that the regression does not pass through the origin (i.e. people in zero-vehicle households still make some trips; in fact they make an average of 2.698 trips!).
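The Data Desk results can be approximately reproduced outside that package. The sketch below assumes that the nine grouped points are the mean trip rates listed, to two decimal places, in Table 10.1; because those tabulated means are rounded, the results are expected to agree with Figure 10.25 only approximately.

```python
# Group means as tabulated in Table 10.1
vehicles = [0, 1, 2, 3, 4, 5, 6, 7, 8]
trip_rate = [1.95, 3.42, 4.01, 4.39, 3.85, 4.36, 4.60, 6.00, 5.44]

n = len(vehicles)
x_bar = sum(vehicles) / n
y_bar = sum(trip_rate) / n
sxx = sum((x - x_bar) ** 2 for x in vehicles)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(vehicles, trip_rate))

a1 = sxy / sxx                 # slope: trips per additional vehicle
a0 = y_bar - a1 * x_bar        # intercept: trips in zero-vehicle households

ss_total = sum((y - y_bar) ** 2 for y in trip_rate)
ss_error = sum((y - a0 - a1 * x) ** 2 for x, y in zip(vehicles, trip_rate))
r_squared = 1.0 - ss_error / ss_total
f_stat = (ss_total - ss_error) / (ss_error / (n - 2))

print(round(a0, 2), round(a1, 3), round(r_squared, 3), round(f_stat, 1))
```

Under these assumptions the intercept is about 2.70, R² about 0.80 and F about 28.7 on (1, 7) degrees of freedom, close to the reported 2.698, 80.5% and 28.9, and just below the critical value of 29.25 quoted for the 0.1% level.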

Figure 10.24 Selection of Linear Regression Option in Data Desk

Figure 10.25 Results of Linear Regression Analysis in Data Desk

To further check the validity of this linear regression, we can plot the residuals from the regression against the predicted values of the dependent variable. The residuals are the differences between the predicted values of the dependent variable and the actual values. Obviously, if a perfect regression had been obtained (i.e. R² = 1), then all the residuals would have been equal to zero. However, in real regressions this never occurs and what we seek to check is whether there is any relationship between the residuals and the predicted values. To plot the residuals scatterplot, click on the HyperView button and select that option as shown in Figure 10.26. This will produce a scatterplot as shown in Figure 10.27.

Figure 10.26 Selecting a Residuals Scatterplot in Data Desk

Figure 10.27 Results of Residuals Scatterplot in Data Desk

It can be seen that the residuals are fairly evenly distributed around the zero line, with values within plus or minus 0.5 trips per person per day. Importantly, we can see that there is no relationship between the residuals and the predicted values and that the variance of the residuals does not vary with the size of the predicted values (this could be verified by performing a further regression on this scatterplot using the HyperView button at the bottom of this scatterplot). This lack of a relationship is important because it signifies a lack of heteroscedasticity (i.e. non-uniform variances) which would invalidate the use of simple linear regression. If such a relationship is observed, then you would be advised to consult a regression textbook (e.g. Draper and Smith, 1968) to determine the seriousness of this effect.

Given the results described above, it would appear that a linear relationship does exist between the two variables. This is not surprising given the scatterplot of Figure 10.10, but it appears somewhat surprising given the scatterplot of Figure 10.7, where the ungrouped data points are plotted. However, if a true relationship exists, then it should exist at all levels of aggregation. Therefore, consider performing a linear regression on the 904 data points shown in Figure 10.7. The results of this regression are shown in Figure 10.28.

Figure 10.28 Results of the Ungrouped Linear Regression

It appears, from the low value of the coefficient of determination, that this regression is not as good as that obtained from the grouped data. That is, the regression is not explaining as much of the variance in the dependent variable. However, this should not be too surprising since we eliminated most of the variance from the dependent variable when we performed the aggregation. To check on the real significance of the regression, we need to consider the F-statistic. With 985 degrees of freedom, an F value of 24.3 would indicate a significant regression at well beyond the 0.1% level of significance (where the critical value of F is 10.83). In addition, both the variable coefficient and the constant term are highly significant as indicated by the high values of the t-statistic. Thus the regression on the ungrouped data appears to be even stronger than the regression on the grouped data, even though the proportion of the variance explained is far less. This example clearly demonstrates that the reporting of the correlation coefficient is meaningless without also reporting the F-statistic.

A further test on the validity of the regression can be obtained by plotting the distribution of the residuals, as shown in Figure 10.29. If the assumptions underlying the regression are upheld, then the distribution of the residuals should be approximately normal. This would appear to be the case for the ungrouped data used in the above analysis, although the distribution is skewed slightly to the right.

Figure 10.29 Distribution of the Residuals

One major difference between the two regressions is that the estimated coefficients and constants are different, with the regression on the ungrouped data having a higher intercept and a lower slope. This difference serves to illustrate a major misuse of regression on survey data. As described earlier, when looking at the data, it always appears that the grouped data will give a better regression model because of the reduction in scatter of the plotted points. For this reason, regressions are often erroneously performed on the grouped data. In the regression on the grouped data described above, it is implicitly assumed that each of the group means has equal weight in the determination of the regression line. In the calculation of the group means, however, different numbers of data values have been used. One would expect, intuitively, that those group means which have been calculated from the larger number of data values are more reliable and hence should be given more weight when estimating the regression equation. This can also be shown to be the case statistically, and gives rise to a method known as weighted regression.

The essence of weighted regression is to weight each data point by the inverse of the variance within that data point (i.e. give more reliable data points more weight). This can also be shown to be equivalent to weighting each data point by the number of data values contributing to each of the data points. Thus if there are mi data values grouped into the ith group, then the independent and dependent variables for each group are entered into the regression mi times. Many regression packages have the facility to automatically carry out this procedure simply by specifying the number in each group (Data Desk lacks this facility, but SYSTAT is one package which can be used in this way). The input data and results from a weighted regression using SYSTAT are shown in Table 10.1.

Table 10.1 Input Data and Results for Weighted Regression

Trip Rate   Vehicles   Group Size   Regression Results
1.95        0          40           R2            0.622
3.42        1          267          a0            3.158
4.01        2          267          t(constant)   142.4
4.39        3          126          a1            0.291
3.85        4          86           t(coeff.)     38.4
4.36        5          45           F             1473.7
4.60        6          50           d.o.f.        895
6.00        7          7
5.44        8          9

It can be seen that the regression coefficients are now in agreement with those of the ungrouped regression shown in Figure 10.28. However, the t-statistics are larger and the regression appears more significant than it really is (by means of the very high F-statistic) because an implicit assumption has been made that all of the data values within a group are identical, whereas in reality there is still a considerable degree of variability within each group as shown in Figure 10.7. Nonetheless, weighted regression at least overcomes the biasing effect on the estimated coefficients and hence should always be used with grouped data if the original data values are not available.
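The weighted coefficients of Table 10.1 can be checked with a short calculation. The sketch below is not the SYSTAT procedure itself; it simply weights each group mean by its group size, which is equivalent to entering each group mi times, and the function name `weighted_fit` is hypothetical.

```python
def weighted_fit(xs, ys, weights):
    """Weighted least-squares intercept and slope for Y = a0 + a1*X."""
    w_total = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, xs)) / w_total
    y_bar = sum(w * y for w, y in zip(weights, ys)) / w_total
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - x_bar) * (y - y_bar)
              for w, x, y in zip(weights, xs, ys))
    a1 = sxy / sxx
    return y_bar - a1 * x_bar, a1


# Input data from Table 10.1
vehicles = [0, 1, 2, 3, 4, 5, 6, 7, 8]
trip_rate = [1.95, 3.42, 4.01, 4.39, 3.85, 4.36, 4.60, 6.00, 5.44]
group_size = [40, 267, 267, 126, 86, 45, 50, 7, 9]

a0_w, a1_w = weighted_fit(vehicles, trip_rate, group_size)
a0_u, a1_u = weighted_fit(vehicles, trip_rate, [1] * len(vehicles))

print(round(a0_w, 3), round(a1_w, 3))  # close to 3.158 and 0.291
print(round(a0_u, 3), round(a1_u, 3))  # unweighted: lower intercept, steeper slope
```

Weighting by group size raises the intercept and lowers the slope relative to the unweighted fit on the group means, which is the direction of difference noted in the text between the grouped and ungrouped regressions.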

10.2.2 Multivariate regression

The last section provided a detailed treatment of bivariate regression, that is, a linear relationship between only two variables. However, it is relatively unlikely that a useful model can be obtained from a bivariate relationship, because this would generally suggest a relationship between the variables that is much too close to be plausible. For example, in the case of the ungrouped regression model of trips per person as a function of vehicles per household, only a relatively small portion of the variance in the observed trips per person was captured by the variable vehicles per household. It is clearly not very satisfactory to be able to account for such a small proportion of the variance in the model. Thus it appears extremely probable that real modelling efforts will require a more complex relationship to be established.

Additional complexity can involve two basic alternatives: a more complex relationship between the variables, that is, a nonlinear relationship; or the use of more than one independent variable. The second alternative is both simple and the subject of this section. The extension of bivariate linear regression to multivariate linear regression is principally a matter of simple mathematical generalisations of the relationships developed in Chapter 10.2.1.

Before undertaking these generalisations, it is useful to consider the concepts and visualisation of the multivariate linear relationship. Bivariate regression involved fitting a straight line to data in a two dimensional space. Multivariate regression involves the fitting of an (n-1) dimensional surface to data in n dimensional space. It should be noted that the surface is a plane, since the relationship is linear. Illustrations of higher dimensionality problems cannot, of course, be provided. However, the extension is conceptually relatively simple.

The three-dimensional problem may be examined in another light. The relationship postulated is:

$$Y_i = a_0 + a_1 X_{1i} + a_2 X_{2i} + \varepsilon_i \tag{10.22}$$

If one of the two independent variables, X1 or X2, were to be held constant, the relationship could be written:

$$Y_i = a_0' + a_1 X_{1i} + \varepsilon_i \qquad (X_{2i} = \text{fixed value}) \tag{10.23}$$

Alternatively, equation 10.24 could be used:

$$Y_i = a_0'' + a_2 X_{2i} + \varepsilon_i \qquad (X_{1i} = \text{fixed value}) \tag{10.24}$$

In these two equations, a0' and a0" represent adjusted constant terms that take account of the effect of the variable whose value is fixed. Thus in equation 10.23, assuming that the fixed value of X2i is some arbitrary value g, then a0' is:

$$a_0' = a_0 + a_2 g \tag{10.25}$$

Similarly, if the fixed value of X1i in equation 10.24 is 0, then a0" is:

$$a_0'' = a_0 + a_1 \cdot 0 \tag{10.26}$$

The relationships of equations 10.23 and 10.24 could be represented then as a family of straight lines in a two dimensional space. Note that the lines are parallel, since at each value of X2 that variable has no further effect on the value of Yi as X1i changes in value. In engineering terms, the lines represent a series of projections of the surface onto the two-dimensional surface formed by the Y and X1 axes.

It is important at this point to reconsider the concepts of dependence and independence. As stated in Chapter 10.2.1, the basic concept of dependence and independence is one of causality. Thus changes in an independent variable cause changes in the dependent variable, while changes in the dependent variable do not cause changes in the independent variable. Thus, in the weather-forecasting example, humidity may not be an appropriate independent variable, since precipitation will cause changes in humidity (although it may not be strictly possible to say that the likelihood of precipitation will cause changes in humidity). In the context of multivariate regression, independence does not mean that the independent variables are unrelated to each other. Hence it is not a statement that the variables are independent of each other. However, if any two or more independent variables are strongly related to each other, problems will arise. First, one may consider that if there are two or more highly related independent variables, the use of them in a model creates a redundancy. Conceptually, such a redundancy is undesirable and may contribute to some problem in understanding how the phenomenon being modelled behaves in reality. Statistically (discussed later in the chapter), the presence of such redundancy will lead to incorrect estimates of the coefficients of all such interrelated variables. Hence it may be stated that high intercorrelations are conceptually and statistically undesirable, but the principle of independence by itself does not exclude such intercorrelations. Such variables should therefore be checked for and, if found, excluded from the proposed linear relationship.

Given the basic concepts of multivariate linear regression, attention can now be given to the estimation of the coefficients of the regression equation. In general, a model of the form of equation 10.27 may be postulated as representing the multivariate linear-regression equation.

$$Y_i = a_0 + a_1 X_{1i} + a_2 X_{2i} + \dots + a_m X_{mi} + \varepsilon_i \tag{10.27}$$

To calibrate such a model, the procedure is to minimise the sum of the squared error terms, εi, to find the values of the coefficients and the constant. Rearranging equation 10.27 to isolate εi on one side produces:

$$\varepsilon_i = Y_i - a_0 - a_1 X_{1i} - a_2 X_{2i} - \dots - a_m X_{mi} \tag{10.28}$$

The process is then completely analogous to that described for bivariate linear regression. Equation 10.28 is squared and summed over all observations, i, and partial derivatives are taken with respect to each of the unknown values, ak. The result of this procedure is a set of normal equations of the following form:

$$
\begin{aligned}
\sum_i (Y_i - a_0 - a_1 X_{1i} - a_2 X_{2i} - \dots - a_m X_{mi}) &= 0 \\
\sum_i X_{1i} (Y_i - a_0 - a_1 X_{1i} - a_2 X_{2i} - \dots - a_m X_{mi}) &= 0 \\
&\;\;\vdots \\
\sum_i X_{mi} (Y_i - a_0 - a_1 X_{1i} - a_2 X_{2i} - \dots - a_m X_{mi}) &= 0
\end{aligned}
\tag{10.29}
$$

The set has (m + 1) equations, all of which are linear in the unknown values, a0 through am. Thus a unique solution may be found by solving this set of linear equations as was done before. As an example, consider the case in which m equals 2. The normal equations are:

$$\sum_i (Y_i - a_0 - a_1 X_{1i} - a_2 X_{2i}) = 0 \tag{10.30}$$

$$\sum_i X_{1i} (Y_i - a_0 - a_1 X_{1i} - a_2 X_{2i}) = 0 \tag{10.31}$$

$$\sum_i X_{2i} (Y_i - a_0 - a_1 X_{1i} - a_2 X_{2i}) = 0 \tag{10.32}$$

As before, the first normal equation provides an identity in terms of the means:

$$\bar{Y} - a_0 - a_1 \bar{X}_1 - a_2 \bar{X}_2 = 0 \tag{10.33}$$

Rearranging this equation to yield a definition of the constant, a0, produces:

$$a_0 = \bar{Y} - a_1 \bar{X}_1 - a_2 \bar{X}_2 \tag{10.34}$$

The remainder of the solution becomes algebraically tedious and is best solved by matrix algebra. Defining the vector of observations of Yi as Y, the matrix of observations of X1i and X2i as X, the vector of coefficients as a, and the vector of error terms, εi, as ε, the solution for the coefficients is:

$$\mathbf{a} = (\mathbf{X}'\mathbf{X})^{-1} \mathbf{X}'\mathbf{Y} \tag{10.35}$$

This is the general solution of the multivariate regression.
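A small numerical sketch of equation 10.35 follows. It assumes, although the text does not say so explicitly, that X carries a leading column of ones so that the constant a0 is estimated along with the other coefficients. Rather than forming the inverse, the sketch solves the equivalent system (X'X)a = X'Y by Gaussian elimination; the function names and data are hypothetical, and in practice a linear algebra library would normally be used.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Augment the matrix so row operations act on both sides at once
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular (perfectly collinear variables?)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution


def multivariate_fit(rows, ys):
    """Return [a0, a1, ..., am] satisfying the normal equations (10.29)."""
    design = [[1.0] + list(row) for row in rows]   # column of ones for a0
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical observations generated exactly from Y = 1 + 2*X1 + 3*X2
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (1, 2)]
ys = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]
coefficients = multivariate_fit(rows, ys)
residuals = [y - (coefficients[0] + coefficients[1] * x1 + coefficients[2] * x2)
             for (x1, x2), y in zip(rows, ys)]
print([round(c, 6) for c in coefficients])  # recovers 1, 2 and 3
```

The singularity check in the solver corresponds to the multicollinearity problem discussed earlier: if one independent variable is an exact linear combination of the others, no unique solution exists.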

In Chapter 10.2.1, a number of properties of the bivariate linear-regression model were described and discussed. Without exception, the multivariate linear-regression model shares these properties.

Thus, the sum of the residuals is zero, regardless of the number of independent variables used. This follows from the first of the normal equations. The sum of the residuals is given by equation 10.36, which is a rearrangement of equation 10.27, summed over all observations, i.

$$\sum_i \varepsilon_i = \sum_i (Y_i - a_0 - a_1 X_{1i} - a_2 X_{2i} - \dots - a_m X_{mi}) \tag{10.36}$$

However, the first normal equation is:

$$\sum_i (Y_i - a_0 - a_1 X_{1i} - a_2 X_{2i} - \dots - a_m X_{mi}) = 0 \tag{10.37}$$

Hence it follows that the sum of the errors (residuals) is zero.

$$\sum_i \varepsilon_i = \sum_i (Y_i - a_0 - a_1 X_{1i} - a_2 X_{2i} - \dots - a_m X_{mi}) = 0 \tag{10.38}$$

The implications of this property are again the same as those discussed for bivariate linear regression.

To test the significance of each individual variable in the regression equation, t-tests can again be conducted on each coefficient, as described for the bivariate regression, and the computation follows the same principle. The constant may be tested by the same procedure as before. The degrees of freedom of the t tests are all (n - m - 1), where m is the number of independent variables; this reduces to the (n - 2) of the bivariate case when m = 1.

10.2.3 Factor analysis

Factor analysis is a procedure with roots in the basic principles of multivariate linear-regression analysis. One of the problems noted in Chapter 10.2.2, with respect to multivariate linear regression, is that multicollinearity of two or more independent variables causes mis-estimation of the coefficients of such variables. Frequently, exclusion of one of the multicollinear variables will result in a less than satisfactory relationship, while inclusion leads to counter-intuitive values for the coefficients. The analyst is thus placed on the horns of a dilemma. Factor analysis and, more particularly, principal components or principal factors (a special case of factor analysis) provide a means to resolve this dilemma.

In very broad terms, factor analysis is a method for reformulating a set of natural or observed independent variables into a new set (usually fewer in number, but necessarily not more in number) of independent variables, such that the latter set has certain desired properties specified by the analyst. Kendall (1965) has suggested a difference between factor analysis and principal-components analysis. He suggests that principal-components analysis is the search through data to try to find factors or components that "may reduce the dimensions of variation" and may "be given a possible meaning". On the other hand, he suggests that factor analysis starts with the hypothesis of a model and tests it against the data to see if it fits. This distinction seems useful and is pursued in the latter part of this section. (It is notable that other authors, e.g., Harman (1960) and Tintner (1952), do not appear to make use of this distinction.) This section concentrates on principal-components analysis, where among other things the latter set of variables has the property of zero correlations between all new variables. In all versions of factor analysis, the variables (factors) in the new set are formed as linear combinations of all variables in the original set.
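The zero-correlation property can be illustrated with the simplest possible case of two correlated variables. For two standardised variables with correlation r, the principal components are proportional to their sum and their difference, with variances 1 + r and 1 - r. The sketch below checks this numerically; the ratings are hypothetical and the helper names are illustrative only.

```python
import math


def standardise(values):
    """Rescale values to zero mean and unit (population) variance."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


# Hypothetical ratings of two related attributes by six respondents
heating = [60.0, 72.0, 55.0, 80.0, 68.0, 75.0]
air_con = [58.0, 70.0, 60.0, 85.0, 65.0, 78.0]

z1 = standardise(heating)
z2 = standardise(air_con)
n = len(z1)
r = sum(a * b for a, b in zip(z1, z2)) / n   # correlation of the originals

# Principal components of two standardised variables: scaled sum and difference
pc1 = [(a + b) / math.sqrt(2) for a, b in zip(z1, z2)]
pc2 = [(a - b) / math.sqrt(2) for a, b in zip(z1, z2)]

var_pc1 = sum(p * p for p in pc1) / n        # equals 1 + r
var_pc2 = sum(p * p for p in pc2) / n        # equals 1 - r
cov_pc = sum(p * q for p, q in zip(pc1, pc2)) / n

print(round(r, 3), round(var_pc1, 3), round(var_pc2, 3), round(cov_pc, 6))
```

The two new variables are linear combinations of both original variables and are uncorrelated with each other, while the first carries most of the total variance when r is high. With more than two variables the components are obtained from the eigenvectors of the correlation matrix rather than by inspection.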

The method was originally developed in mathematical psychology as a means to reduce a large number of measures (probably collinear) obtained in psychological measurement to a small number of largely uncorrelated salient characteristics, capable of describing character traits or other concepts in psychology. For readers familiar with psychological measurement, factor analysis was originally developed as the principal means of constructing multidimensional scales (Harman, 1960). Although the method has been criticised by many psychologists, it still remains one of the most tractable and understandable methods of undertaking multidimensional scaling (Koppelman and Hauser, 1977).

Some examples may be useful to illustrate the purpose and procedure of factor analysis. Consider, first, a situation in marketing cars. A sample of people has been asked to rate some of the aspects of the finish and comfort of a number of car models. The ratings were made by allocating points on each attribute to each model, where the points may be selected from zero to 100. Fourteen attributes are used, as shown in Table 10.2.

Table 10.2 Attributes Used to Rate Various Car Models

Colour                              Air conditioning
Seat width                          Heating
Seat support                        Interior noise
Interior finish (panels, fascia)    Carpeting
Exterior finish                     Entry/exit space
Radio/tape player                   Safety belts
Window controls                     Seat adjustment

After obtaining the scores of each person, one could obtain the means for each model on each attribute (see Figure 10.30). There is a great deal of useful information in Figure 10.30. However, it is very hard to absorb the information and even harder to gain an impression of which model is superior. Furthermore, many of the attributes may be related, either in the minds of the raters or in the survey design. For example, the efficiency and effectiveness of the air conditioning and heating are likely to be related, while these same items are probably quite unrelated to the exterior finish.

Applying factor analysis to these ratings will lead to greater insights into the images of the various models and will also show how the various attributes link together. The grouping of attributes is shown by the coefficients of the original variables in the composition of the factors. Those with large (negative or positive) coefficients make a major contribution to the factor, while those with small (near zero) coefficients make little contribution to the factor. Suppose a factor analysis of the ratings has yielded the groupings of attributes shown in Table 10.3, where the three factors account for more than 95% of the variance of the original attribute ratings. Factor 1 might be termed exterior design, factor 2 interior comfort, and factor 3 environmental controls.

[Figure: mean scores, on a scale from 0 to 100, of four car models on each of the fourteen attributes of Table 10.2.]

Figure 10.30 Raw Attribute Scores for Four Car Models

For each factor, an average score for each car model could be obtained and plotted, as shown in Figure 10.31. The information provided by Figure 10.31 is at once much easier to comprehend, and has also shown how certain items are correlated to produce a major effect, that is, the three factors.

Table 10.3 Factor Groupings for Attributes of Table 10.2

Factor   Attributes
1        Colour, Exterior finish, Entry/exit space
2        Seat width, Seat support, Interior finish, Interior noise, Carpeting, Seat adjustment
3        Radio-tape player, Window controls, Air conditioning, Heating, Safety belts
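The step from attribute ratings to factor-level scores can be sketched in a few lines of Python. This is a minimal illustration only: the groupings come from Table 10.3, but the mean ratings are hypothetical values invented for the example, and the simple unweighted average used here is an assumption standing in for the factor coefficients that a full factor analysis would supply.

```python
# Attribute groupings taken from Table 10.3.
FACTOR_GROUPS = {
    "Exterior design": ["Colour", "Exterior finish", "Entry/exit space"],
    "Interior comfort": ["Seat width", "Seat support", "Interior finish",
                         "Interior noise", "Carpeting", "Seat adjustment"],
    "Environmental controls": ["Radio-tape player", "Window controls",
                               "Air conditioning", "Heating", "Safety belts"],
}

# HYPOTHETICAL mean ratings (0-100) for one car model; not survey data.
mean_ratings = {
    "Colour": 60, "Exterior finish": 70, "Entry/exit space": 80,
    "Seat width": 50, "Seat support": 60, "Interior finish": 70,
    "Interior noise": 40, "Carpeting": 50, "Seat adjustment": 60,
    "Radio-tape player": 30, "Window controls": 40,
    "Air conditioning": 50, "Heating": 60, "Safety belts": 70,
}


def factor_scores(ratings, groups):
    """Average the attribute ratings within each factor grouping."""
    return {
        factor: sum(ratings[attr] for attr in attrs) / len(attrs)
        for factor, attrs in groups.items()
    }


scores = factor_scores(mean_ratings, FACTOR_GROUPS)
for factor, score in scores.items():
    print(f"{factor}: {score:.1f}")
```

Repeating the calculation for each car model gives the three values per model that a plot such as Figure 10.31 would display.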

It should be noted that factor analysis is not a modelling technique. Rather, it is a procedure for manipulating data prior to developing models. The results of factor analysis can be used directly in some forms of decision making or appraisal, but such use does not conform to the usual definitions of modelling.

[Figure: scores from 0 to 100 for four car models on three factors: Exterior Design, Interior Design, Environmental Design]

Figure 10.31 Factor Scores for Four Car Models

10.2.4 Discriminant analysis

Discriminant analysis was originally developed for use in the biological sciences by Fisher (1936). The basic hypothesis of discriminant analysis runs as follows: it is assumed that a population is made up of two distinct subpopulations. It is further assumed that it is possible to find a linear function of certain measures or attributes of the population that will allow an observer to discriminate between the two subpopulations. Originally, the technique was devised to assist biologists in identifying subspecies. In this context, suppose a researcher has two subspecies of a particular plant species, say a daffodil. In general appearance the two subspecies are so alike that one cannot with any certainty state which is which. However, the length, width, and thickness of the leaves and the maximum diameter of the bulb can be measured. It can be hypothesised that a linear combination of these measures can be devised which will allow the analyst to discriminate between the two subspecies with the least possible likelihood of error.

This technique appears to have many possible applications in transportation planning. An application frequently used pertains to the choice of transport mode, where the two populations are considered as being car users and transit users, and a linear combination of system and user characteristics is sought as a basis of discriminating between the two populations. Using this as an example, the frequency distributions of the two populations (car users and transit users) can be plotted against some function $z$. Now over the range $z_1$ to $z_2$ (see Figure 10.32), it is not certain whether an individual with a $z$ value in that range is a car or a transit user. Suppose it has to be decided whether each person in this total population is a car or a transit user. How can one proceed so as to make as few mistakes as possible? To put it in a slightly different way, how can one minimise the number of mis-classified people?

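[Figure: frequency distributions of Auto Users and Transit Users plotted against Z, with the points Z1 and Z2 marked on the Z axis]

Figure 10.32 Frequency Distribution of Car versus Transit Users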

Two important things should be noted here, namely, that each member has to be assigned to one or the other subpopulation, and that the decision as to how to divide the population has already been made, that is, into auto and transit users. In other words, discriminant analysis is not designed as a procedure for seeking population groupings, like cluster analysis.

It should be clear from this that discriminant analysis has many potential uses beyond the original biological ones. Whenever it is desired to find relationships that would permit classifying human populations into groups, discriminant analysis may be an applicable method. Likewise, transport engineering often calls for classification of various physical items into groups having distinct properties. Whenever such grouping cannot be made on obvious grounds, discriminant analysis may be used to find relevant compound measures to permit appropriate classification.

The problem is one of determining the function $z$ that will best permit the discrimination between members of the two populations. Let $z$ be defined as a function of a number of factors $x_1, x_2, x_3, \dots, x_k$, and the subpopulation be designated by subscript $i$ and the members of the subpopulations by subscript $j$. Then,

$$z_{ij} = a_1 x_{1ij} + a_2 x_{2ij} + \dots + a_k x_{kij} \qquad (10.39)$$

For convenience, equation 10.39 may be abbreviated to:

$$z_{ij} = \sum_{p=1}^{k} a_p x_{pij} \qquad (10.40)$$

where $i = 1, 2$; $j = 1, 2, \dots, n$

and $a_p$ = the weighting coefficient of the $p$th factor, $x_{pij}$

The task is to determine the values of the set of weighting coefficients, $[a_p]$, such that one can best discriminate between the two subpopulations. To set about determining these coefficients, it is necessary to define what is meant by discrimination between two subpopulations.

Two alternative definitions could be postulated. Consider Figure 10.33. One may postulate that the rule sought is to state that all members of the population with a value of $z$ less than $z'$ are to be classified as being in subpopulation 1, while those greater than $z'$ are classified in subpopulation 2. Clearly, those members of population 2 who fall in the shaded area $M_2$ will be mis-classified by this rule, as will those of population 1 who fall in $M_1$.

Figure 10.33 Definition of Discrimination between Two Subpopulations

Neyman and Pearson (1933) suggested that discrimination be treated as an attempt to minimise the total number of mis-classifications, that is, the sum of $M_1$ and $M_2$. The task is to state this mathematically so as to define the coefficients of the discriminant function, $z$.

Fisher (1936) suggested, alternatively, that discrimination could be defined as achieving the maximum separation of the two subpopulations. This is, of course, equivalent in effect to minimising mis-classifications, but it generates a different mathematical statement of the problem and is the derivation used here. Of course, with Fisher's definition of discrimination, some care is needed in selecting the measure of separation of the two subpopulations. If one were to measure this as the distance, $D$, between the two means, $\bar{z}_1$ and $\bar{z}_2$, one could clearly increase $D$ simply by multiplying the $z$ function by some factor greater than one. This, however, would not increase the separation in any meaningful way. Therefore, Fisher proposed that separation be measured as the distance between the means, $D$, relative to the within-subpopulation variances. This would make it clear that scaling the discriminant function has no effect upon the separation of the two subpopulations. Finally, the simple distance between the two means may not be the best measure, since one may experience sign problems. (A priori, one may not know which subpopulation has the small $z$ values. Hence, by measuring from the second subpopulation mean, say, from Figure 10.33, the distance would be negative and the largest negative distance would be desired.) This can be overcome by maximising the square of the distance between the two populations, where this distance is also the between-population variance.
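Fisher's measure of separation can be illustrated with a small Python sketch. The two groups of observations below are hypothetical, each described by two factors, and the closed-form weights (proportional to the inverse of the pooled within-group scatter matrix multiplied by the difference in group means) are the standard textbook solution rather than a procedure taken from this chapter. The sketch shows that rescaling the weights leaves the separation measure unchanged.

```python
def mean(values):
    return sum(values) / len(values)


def project(points, weights):
    """Discriminant score z = a1*x1 + a2*x2 for each (x1, x2) point."""
    return [weights[0] * x1 + weights[1] * x2 for x1, x2 in points]


def separation(group1, group2, weights):
    """Squared distance between the group means of z, relative to the
    pooled within-group sum of squares of z."""
    z1, z2 = project(group1, weights), project(group2, weights)
    m1, m2 = mean(z1), mean(z2)
    within = sum((z - m1) ** 2 for z in z1) + sum((z - m2) ** 2 for z in z2)
    return (m1 - m2) ** 2 / within


def fisher_weights(group1, group2):
    """Weights proportional to W^-1 (m1 - m2), where W is the pooled
    within-group scatter matrix (two-factor case, inverted by hand)."""
    centre1 = (mean([p[0] for p in group1]), mean([p[1] for p in group1]))
    centre2 = (mean([p[0] for p in group2]), mean([p[1] for p in group2]))
    s11 = s12 = s22 = 0.0
    for points, centre in ((group1, centre1), (group2, centre2)):
        for x1, x2 in points:
            d1, d2 = x1 - centre[0], x2 - centre[1]
            s11 += d1 * d1
            s12 += d1 * d2
            s22 += d2 * d2
    det = s11 * s22 - s12 * s12
    dm1, dm2 = centre1[0] - centre2[0], centre1[1] - centre2[1]
    return ((s22 * dm1 - s12 * dm2) / det, (-s12 * dm1 + s11 * dm2) / det)


# HYPOTHETICAL observations (x1, x2) for the two subpopulations.
car = [(2.0, 3.0), (3.0, 5.0), (4.0, 4.0), (5.0, 6.0)]
transit = [(6.0, 2.0), (7.0, 4.0), (8.0, 3.0), (9.0, 5.0)]

w = fisher_weights(car, transit)
scaled_w = (3 * w[0], 3 * w[1])
print("weights:", w)
print("separation:", separation(car, transit, w))
print("separation after scaling by 3:", separation(car, transit, scaled_w))
```

Because the measure is a ratio of squared quantities, it is also unaffected by the sign of the weights, which addresses the sign problem noted above.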

Stopher and Meyburg (1979) describe the procedures involved in estimating a set of coefficients for use in the discriminant function (eqn. 10.40). The next step is to determine whether the discriminator is significant. There are three factors to consider in this determination. First, there may be a real difference between the populations, but they are so close together that a discriminator is not very effective. This is measured by the errors of misclassification which, though the minimum obtainable, may still be large. Second, there may be a real difference between the populations but the sample is not large enough to produce a very reliable discriminator; this is really a matter of setting confidence limits on the interpretation of the sample results as described in Chapter 4. Third, it may be that the parent populations are identical and that a discriminant function is illusory. For the purposes of the use of discriminant analysis in this context, the latter is unlikely to occur, since this technique is only suggested when separate populations can in fact be readily identified. However, the first two questions of significance are very relevant. Considering the first point, the populations may be interpreted as being too close together in two ways. Either the wrong factors were chosen or else not all the significant factors were used to build the discriminant function. Alternatively, the assumptions of rationality and consistency implicit in the modelling technique are so tenuous that no set of factors can discriminate effectively between the two populations. In either case, the significance of the observed discrimination can be measured in terms of the probability of the observed discrimination occurring at random. This can be done using a variance ratio (Kendall, 1965). Because the variance ratio involves measures of both the distance between the populations and the numbers in the sample, it can also be used as an indicator of the reliability of the discriminator due to sample size. One may also wish to test the significance of each factor in the discriminant function, and this can be done using the t-statistic described in Chapter 10.2.2.

As an example of the use of discriminant analysis, consider the following study taken from Lisco (1967) relating to travel to work in Chicago. For 159 people, information was obtained on the costs of their travel to work, the time it took, the mode of travel used, income, age and sex of the respondent, and information on the availability of a car for the trip to work. The problem was to see if a discriminant function could be used to separate transit and auto users. A total of 61 respondents used autos and 98 used transit.

Car availability was entered as a dummy variable (i.e. a Yes/No response), while income, sex and age were entered as values. Time and cost were entered as differences between transit and auto. The results of the discriminant function analysis are shown in Table 10.4. For this model, the variance ratio was significant beyond the 99% confidence level, indicating that a significant discriminator appeared to exist. The total number of misclassifications was 37 out of the 159 respondents, where 11 car users were assigned to transit and 26 transit users were assigned to auto.

Table 10.4 Results of Discriminant Analysis on Chicago Data

Variable             Coefficient   t-score
Travel Time              0.0330      2.48
Travel Cost              0.0050      3.15
Car Availability 1       0.0631      0.36
Car Availability 2      -0.5039      2.98
Income                  -0.8167      3.36
Sex                     -0.2713      1.81
Age                     -0.0203      2.39
Constant                 1.1790      ----

Examining the results of Table 10.4, it may be seen that all but two of the coefficients are significantly different from zero at the 95% confidence level (i.e. only two t-scores are less than 2.00). Car users are found to be associated with positive values of the discriminant function and transit users with negative values. It appears therefore that car users are associated with positive travel time and travel cost differences. For classification purposes, since these differences are expressed as transit minus auto times and costs, auto drivers are more likely, ceteris paribus, to be associated with shorter travel times and costs than for the transit trip. Similarly, transit users are likely, ceteris paribus, to be older and have higher incomes than car users and are more likely to be female (sex was entered as a 1, 2 variable, such that 2 indicated female). Transit users are also more likely to have more favourable time and costs by their chosen mode than by car.

The interpretation of this example illustrates the correct use of discriminant analysis as a classification tool. Thus, given data on a new individual, one could classify that person as a transit or auto user by evaluating their value of the discriminant function. Suppose, for example, that data are obtained on three additional people in the study area, as shown in Table 10.5 (note that age and income were coded as normalised variables, i.e. unit mean and standard deviation).

Table 10.5 Additional Chicago Commuters to be Classified

Person   Time Diff.   Cost Diff.   Car Avail. 1   Car Avail. 2   Income   Sex   Age      Z
1            10            5            0              1          0.50     1    0.5    0.340
2           -20          -10            0              0          0.95     2    3.2   -0.914
3           -15           25            1              0          0.76     2    2.2   -0.336

The value of the discriminant function, Z, for each individual is also shown in Table 10.5. The discriminant function was defined in Table 10.4 such that zero represents the boundary between the two populations, with negative values of Z indicating a transit user. Hence the discriminant function suggests that individual 1 is likely to be a car user, while individuals 2 and 3 are likely to be transit users. These classifications are intuitively satisfying. The time and cost differences for individual 1 show that for this person the car is quicker and cheaper. Hence the car is a logical choice and the classification is reasonable. Similar reasoning applies to individual 2. Individual 3 has a time advantage by transit and a cost disadvantage. While the cost disadvantage is larger, the relative sizes of the coefficients in Table 10.4 mean that the travel time advantage has far more effect on the classification. This is reinforced by the fact that this individual is female and older than average, and hence the transit classification is intuitively correct.
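The Z values in Table 10.5 can be checked by applying the coefficients of Table 10.4 to each person's data. The following Python sketch does this; the dictionary keys and function names are illustrative choices, not names used in the original study.

```python
# Coefficients and constant from Table 10.4.
COEFFICIENTS = {
    "time_diff": 0.0330,
    "cost_diff": 0.0050,
    "car_avail_1": 0.0631,
    "car_avail_2": -0.5039,
    "income": -0.8167,
    "sex": -0.2713,
    "age": -0.0203,
}
CONSTANT = 1.1790

# Data for the three additional commuters, from Table 10.5.
PEOPLE = {
    1: {"time_diff": 10, "cost_diff": 5, "car_avail_1": 0, "car_avail_2": 1,
        "income": 0.50, "sex": 1, "age": 0.5},
    2: {"time_diff": -20, "cost_diff": -10, "car_avail_1": 0, "car_avail_2": 0,
        "income": 0.95, "sex": 2, "age": 3.2},
    3: {"time_diff": -15, "cost_diff": 25, "car_avail_1": 1, "car_avail_2": 0,
        "income": 0.76, "sex": 2, "age": 2.2},
}


def discriminant_score(person):
    """Evaluate the linear discriminant function for one person."""
    return CONSTANT + sum(
        COEFFICIENTS[name] * person[name] for name in COEFFICIENTS
    )


def classify(z):
    """Positive Z indicates a car user, negative Z a transit user."""
    return "car" if z > 0 else "transit"


results = {pid: discriminant_score(data) for pid, data in PEOPLE.items()}
for pid, z in results.items():
    print(f"Person {pid}: Z = {z:.3f} -> {classify(z)}")
```

The scores agree with the final column of Table 10.5 to three decimal places, giving one car user and two transit users.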

10.2.5 Maximum likelihood estimation

The regression techniques described in Chapters 10.2.1 and 10.2.2 have both used least-squares regression to obtain their estimates of the coefficients of the regression equation. However, the use of least-squares is based on assumptions of normality and independence, and these conditions may not always prevail. In such circumstances, it has been found that the use of Maximum Likelihood Estimation (MLE) techniques proves to be a valuable methodology. MLE can also be used in more general situations related to the likelihood of outcomes.

The concept of likelihood is derived from the ideas of probability and experimental outcomes (Ehrenfeld and Littauer, 1964). Consider an experiment to measure the level of carbon monoxide in an urban area. Because of local variations in concentration, dispersion, proximity of sources, and so forth, one will obtain a range of different values from these different points. Before or after undertaking the experiment, one may postulate a general level of carbon monoxide in the urban area. For example, the experiment may have generated the data of Table 10.6.

One may postulate that the average level of carbon monoxide in the area is 20 parts per million (ppm). Now one would like to know how likely it is that these values could have been obtained if the average urban level is 20 ppm. This can be determined by calculating the likelihood of the values of Table 10.6, given the assumption the average level is 20 ppm. One may also calculate the likelihood of these values, given some other assumed average level, say, 25 ppm. As is seen later, additional assumptions must be made to estimate the value of the likelihood. For the moment, however, it is appropriate to consider further the meaning of the likelihood.

Table 10.6 Observations on Carbon Monoxide Level (parts per million)

Location   Carbon Monoxide Level
1                   26
2                   13
3                   18
4                   32
5                   23
6                   25
7                   17

For any assumed average level of carbon monoxide, it is possible to define a probability that each of the values in Table 10.6 could have occurred. The likelihood is then defined as the joint probability under the assumed average value. Suppose the probability of measuring 26 ppm at location 1 under the assumed value of 20 ppm is denoted p(1/20), the value of 13 ppm at location 2 denoted p(2/20), and so forth. Then the likelihood of the values of Table 10.6 is:

$$L(CO = 20) = p(1/20) \cdot p(2/20) \cdot p(3/20) \cdots p(7/20) \qquad (10.41)$$

Similarly, equation 10.42 defines the likelihood for the assumption of a mean level of 25 ppm.

$$L(CO = 25) = p(1/25) \cdot p(2/25) \cdot p(3/25) \cdots p(7/25) \qquad (10.42)$$

One might wish now to consider which of the assumptions is more likely: an average level of 20 ppm or 25 ppm. The more likely value is clearly going to be the one that yields the higher value of the likelihood. This notion leads to the extension of the concept of likelihood to hypothesis testing.

To test a hypothesis, you may recall from Chapter 4.6.2 that one may calculate the likelihood under the hypothesis and under some alternative hypothesis. Clearly, if the likelihood under the first hypothesis is larger, one would be inclined to accept that hypothesis in preference to the alternative hypothesis. This notion leads to several very important considerations.

First, it is clearly of considerable importance to construct two hypotheses for such tests to determine which hypothesis is the more likely and hence the more acceptable. Second, the choice of the alternative hypothesis is as important as that of the principal hypothesis. It is clear that if, in the example, one chose the alternative hypothesis as 200 ppm against the principal hypothesis of 20 ppm, the principal hypothesis would be accepted. However, if one chose 21 or 19 ppm, it is no longer clear that the principal hypothesis would be accepted. The test is now more stringent.

Third, it is not sufficient simply to determine which of the two likelihoods is larger. One may raise the question of how much larger one likelihood must be to consider that the difference could not have been caused by chance. This is, of course, the principle underlying most statistical hypothesis testing, much of which has been utilised in earlier portions of the book. It is worthwhile exploring these considerations in more detail here since they led to the notion of maximum-likelihood estimation and its accompanying statistical tests, which are the concern of subsequent sections in this chapter.

Before developing the notions of likelihood and hypothesis testing further, it is worthwhile to see how one may calculate likelihoods. Return to the data of Table 10.6. It was noted that additional assumptions must be made to compute the likelihood. Essentially, it is necessary to assume there is some underlying distribution of the values of carbon monoxide, from which the samples of Table 10.6 were drawn. Suppose the distribution is assumed to be normal with a standard deviation of 6 ppm. Then the probability of obtaining one measurement, $y_1$, of carbon monoxide, given an assumed mean level $y$ and standard deviation $s$, is:

$$p(1, y) = \frac{e^{-(y_1 - y)^2 / 2s^2}}{s\sqrt{2\pi}} \qquad (10.43)$$

Assuming the value of $y$ to be 20 ppm, $s$ to be 6 ppm, and the values of Table 10.6, the likelihood can be estimated from:

$$L(y_1, y_2, y_3, \dots, y_7 \mid y = 20) = \frac{e^{-(y_1 - y)^2 / 2s^2}}{s\sqrt{2\pi}} \cdot \frac{e^{-(y_2 - y)^2 / 2s^2}}{s\sqrt{2\pi}} \cdots \frac{e^{-(y_7 - y)^2 / 2s^2}}{s\sqrt{2\pi}} \qquad (10.44)$$

By gathering the terms, this may be simplified to:

$$L(y_1, y_2, y_3, \dots, y_7 \mid y = 20) = \frac{e^{-\sum_{i=1}^{7} (y_i - y)^2 / 2s^2}}{\left[s\sqrt{2\pi}\right]^7} \qquad (10.45)$$

Using the values listed above, the value of equation 10.45 is found to be $1.2435 \times 10^{-10}$. In a similar manner, one can calculate the likelihood under the assumption that $y$ is 25 ppm, rather than 20 ppm. This yields an estimate of $7.2322 \times 10^{-11}$ which is smaller than the previous estimate. So one may conclude that it is less likely that the values of Table 10.6 would have been obtained if the mean value was 25 ppm than if it was 20 ppm. The unresolved question here is whether the difference in the two values is significant. It is also important to note that, in this example, if one rejects the hypothesis of a mean value of 25 ppm, one accepts (or does not reject) the hypothesis of 20 ppm. However, this hypothesis may still be wrong, simply because both hypotheses chosen were wrong.
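The likelihood calculation of equation 10.45 can be sketched in Python as follows. The sketch assumes, as in the text, a normal distribution with a standard deviation of 6 ppm; the function and variable names are illustrative. It evaluates the likelihood of the Table 10.6 observations under each of the two assumed means and compares them.

```python
import math

CO_PPM = [26, 13, 18, 32, 23, 25, 17]  # observations from Table 10.6
S = 6.0                                # assumed standard deviation (ppm)


def normal_density(y_obs, mean, s):
    """Normal density of one observation (equation 10.43)."""
    return math.exp(-((y_obs - mean) ** 2) / (2 * s ** 2)) / (
        s * math.sqrt(2 * math.pi)
    )


def likelihood(observations, mean, s):
    """Joint likelihood as the product of the individual densities
    (equations 10.44 and 10.45)."""
    return math.prod(normal_density(y, mean, s) for y in observations)


like_20 = likelihood(CO_PPM, 20.0, S)
like_25 = likelihood(CO_PPM, 25.0, S)
print(f"L(mean = 20 ppm) = {like_20:.4e}")
print(f"L(mean = 25 ppm) = {like_25:.4e}")
print("More likely mean:", 20 if like_20 > like_25 else 25)
```

For larger samples the product of many small densities can underflow, which is one practical reason for working with the logarithm of the likelihood instead.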

Two further points are worth noting here. First, it has been shown that one additional assumption is needed to compute the likelihood, that is, the distribution of the phenomenon being measured. Second, the likelihood is shown to be a very small value. This follows since the likelihood is a joint probability obtained by multiplying together all the individual probabilities for each observation. Since probabilities must lie between zero and one, it follows that the likelihood will be very small, decreasing in size as the sample size increases.

As noted earlier, maximum likelihood methods can be used to estimate the values of coefficients in an equation. The use of the methods in this way is known as Maximum Likelihood Estimation (MLE). The basic concept underlying MLE is to find the set of coefficients for the independent variables in the equation which is most likely to have given rise to the observed values $(y_1, y_2, y_3, \dots, y_n)$ of the dependent variable. Thus we wish to find:

$$\max L[(y_1, y_2, y_3, \dots, y_n) \mid \Omega] \qquad (10.46)$$

where $\Omega$ defines a region of possible sets of coefficients.

To find the maximum value of the likelihood function, $L$, the standard procedure would be to differentiate the likelihood function with respect to $\Omega$ and then equate this to zero:

$$\frac{dL[(y_1, y_2, y_3, \dots, y_n) \mid \Omega]}{d\Omega} = 0 \qquad (10.47)$$

Naturally, it is important to ascertain that the solution of this differential equation is a maximum and not a minimum or a point of inflection.
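As a simple illustration of this procedure, consider estimating the mean carbon monoxide level from the observations of Table 10.6, again assuming a normal distribution with a fixed standard deviation of 6 ppm. The Python sketch below searches a grid of candidate means for the one with the highest log-likelihood. The grid limits and step are arbitrary choices made for the example. Under these assumptions, setting the derivative to zero gives the sample mean as the solution, and the search should therefore recover it.

```python
import math

CO_PPM = [26, 13, 18, 32, 23, 25, 17]  # observations from Table 10.6
S = 6.0                                # assumed standard deviation (ppm)


def log_likelihood(mean):
    """Log of the joint normal likelihood for an assumed mean."""
    return sum(
        -math.log(S * math.sqrt(2 * math.pi))
        - (y - mean) ** 2 / (2 * S ** 2)
        for y in CO_PPM
    )


# Candidate means from 10.0 to 35.0 ppm in steps of 0.1 ppm.
candidates = [c / 10 for c in range(100, 351)]
best_mean = max(candidates, key=log_likelihood)
sample_mean = sum(CO_PPM) / len(CO_PPM)

# The second derivative of the log-likelihood with respect to the mean
# is -n / s^2, which is negative, so the stationary point is a maximum.
second_derivative = -len(CO_PPM) / S ** 2

print("Grid-search maximum:", best_mean)
print("Sample mean:", sample_mean)
print("Second derivative:", second_derivative)
```

Both 20 ppm and 25 ppm are therefore less likely than the maximum likelihood value, which illustrates the earlier point that both hypotheses in a pairwise comparison may be wrong.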

It is reasonable at this point to ask why MLE estimates of coefficients are useful or desirable. To understand this, it is necessary to consider the desired properties of parameter estimates. In general, estimates of parameters should have four properties: consistency, lack of bias, efficiency, and sufficiency. A consistent estimator has an accuracy which increases with the size of the sample used to compute the estimate. Thus an infinite sample would yield the exact value of the parameter. An unbiased estimator has a sampling distribution with a mean of the true value of the estimator. In other words, if several samples are taken and estimates made of the parameter from each separate sample, the mean of the estimates will be the true value, or at least not statistically significantly different from the true value, of the parameter if the estimates are all unbiased. An efficient estimator exhibits a small variance for the sampling distribution of the unbiased estimator. In fact, the efficient estimator has the smallest variance of any estimator. The idea of a sufficient estimator is a complex concept, but essentially it means that it uses all the available information from a sample for the estimation of the parameter. It has been shown that MLE estimates are consistent, efficient, and sufficient, but not always unbiased (Ehrenfeld and Littauer, 1964); however, the bias decreases rapidly with increasing sample size. In addition, it is known that the distribution of the MLE estimate approaches that of a normal distribution with a known variance as the sample size increases, thus permitting the use of significance tests on the coefficient estimates obtained.

To illustrate the application of MLE to the estimation of a set of coefficients, consider a linear regression as described earlier in Chapter 10.2.1. Suppose, as hypothesised, that the error terms $\varepsilon_i$ are independent and drawn from a normal distribution. The likelihood function for the $y_i$'s $(y_1, y_2, y_3, \dots, y_n)$ is therefore:

$$L[(y_1, y_2, y_3, \dots, y_n) \mid \varepsilon_i] = \left[\frac{1}{s\sqrt{2\pi}}\right]^n \exp\left[-\frac{1}{2s^2} \sum \varepsilon_i^2\right] \qquad (10.48)$$

This follows because $\varepsilon_i$ is assumed to have a zero mean. Taking logs of both sides, the log likelihood is:

$$\ln L = n \ln\left[\frac{1}{s\sqrt{2\pi}}\right] - \frac{1}{2s^2} \sum \varepsilon_i^2 \qquad (10.49)$$

For a bivariate regression between $y$ and $x$, $\sum \varepsilon_i^2$ is:

$$\sum \varepsilon_i^2 = \sum (y_i - a_1 x_i - a_0)^2 \qquad (10.50)$$

Substituting eqn. 10.50 into eqn. 10.49 yields:

$$\ln L = n \ln\left[\frac{1}{s\sqrt{2\pi}}\right] - \frac{1}{2s^2} \sum (y_i - a_1 x_i - a_0)^2 \qquad (10.51)$$

To obtain MLE estimates of $a_0$ and $a_1$, the log-likelihood is differentiated with respect to each parameter, and the differentials set to zero.

$$\frac{d \ln L}{d a_0} = -\frac{1}{2s^2} \, 2 \sum \left[-(y_i - a_1 x_i - a_0)\right] = 0 \qquad (10.52)$$

$$\frac{d \ln L}{d a_1} = -\frac{1}{2s^2} \, 2 \sum \left[(-x_i)(y_i - a_1 x_i - a_0)\right] = 0 \qquad (10.53)$$

Assuming that $s^2$ is not equal to zero, these can be simplified to yield:

$$\sum (y_i - a_1 x_i - a_0) = 0 \qquad (10.54)$$

$$\sum x_i (y_i - a_1 x_i - a_0) = 0 \qquad (10.55)$$

These are the identical estimating equations to those derived on the basis of least squares (see eqns. 10.11 and 10.12). Hence provided that $\varepsilon_i$ is a random normal variable with mean zero and variance $s^2$, the least-squares estimators are also the MLE estimators. In the above example, the second derivatives should also have been found to ensure that a maximum and not a minimum likelihood had been reached. In this example, since the regression function was linear, it was relatively easy to compute an analytical derivative. However, MLE is not bound to simple linear functions. It can be used with highly non-linear functions (as will be seen in the next section) provided that an iterative search procedure is used to find the maximum. In such situations, it is particularly important to check the second partial derivatives to see if one is climbing towards a maximum, away from it, or has landed at a point of inflection.
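The equivalence of the least-squares and MLE estimators under normal errors can be checked numerically. The Python sketch below uses a small hypothetical data set, solves the two estimating equations (10.54 and 10.55) in closed form, and then confirms that moving either coefficient away from the solution lowers the log-likelihood of equation 10.51. The value of $s$ is set to one purely for convenience, since it does not affect the location of the maximum.

```python
import math

# HYPOTHETICAL observations for a bivariate regression of y on x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]


def solve_estimating_equations(x, y):
    """Solve equations 10.54 and 10.55 for a0 and a1."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xx = sum(xi * xi for xi in x)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    a1 = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    a0 = (sum_y - a1 * sum_x) / n
    return a0, a1


def log_likelihood(a0, a1, s=1.0):
    """Log-likelihood of equation 10.51 for given coefficients."""
    n = len(x)
    sse = sum((yi - a1 * xi - a0) ** 2 for xi, yi in zip(x, y))
    return n * math.log(1 / (s * math.sqrt(2 * math.pi))) - sse / (2 * s ** 2)


a0, a1 = solve_estimating_equations(x, y)
residuals = [yi - a1 * xi - a0 for xi, yi in zip(x, y)]

print(f"a0 = {a0:.4f}, a1 = {a1:.4f}")
print("Sum of residuals:", sum(residuals))
print("Sum of x * residual:", sum(xi * r for xi, r in zip(x, residuals)))
print("ln L at solution:", log_likelihood(a0, a1))
print("ln L with a0 + 0.1:", log_likelihood(a0 + 0.1, a1))
print("ln L with a1 - 0.1:", log_likelihood(a0, a1 - 0.1))
```

Checking neighbouring points in this way is a crude substitute for examining the second derivatives, but it conveys the same idea.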

10.2.6 Logit analysis

There are many situations in transport data analysis and modelling where a simple linear model is not appropriate. This is particularly the case when trying to construct a probability model of choice (e.g. mode choice, route choice etc.). In such cases, the dependent variable (i.e. probability) must be bounded between zero and one, and this necessitates the use of a non-linear model structure. Two major types of model have been proposed for this purpose: the probit model and the logit model.

The logit model appears to have been put forward originally by Berkson (1944) as a simplified version, in essence, of the probit model (which describes the response to a stimulus in terms of a cumulative normal distribution). At the time of Berkson's work, probit models could be calibrated only by the tedious graphical methods described at length by Finney (1965). No further justification of the model form was given at the time, it being noted that the logit and probit models produced very similar symmetrical ogives.

In recent years, however, there has been an upsurge of interest in the logit model, particularly in consumer economics and transport planning. In this new work, theoretical developments of the logit model have been put forward, showing the model to be derivable from certain micro-economic assumptions of behaviour and from specific distributional assumptions on an error term. Since these derivations are well-documented, they are not repeated in detail here (see, for example, McFadden and Domencich, 1975; Stopher and Meyburg, 1975). Suffice it to say that the model is developed in a utility context, in which it is assumed that people maximise their own utility in making a choice from a set of possible purchases or courses of action, and the utility is expressed by:

$$U(X_j, S_i) = U'(X'_j, S_i) + \varepsilon_{ji} \qquad (10.56)$$

where $U(X_j, S_i)$ = the utility of alternative $j$ to individual $i$ as a function of characteristics, $x_j$, of the alternative and characteristics, $s_i$, of the individual

$U'(X'_j, S_i)$ = the observable utility for alternative $j$ to individual $i$

$\varepsilon_{ji}$ = the unobservable utility (or error)

The error term, $\varepsilon_{ji}$, is assumed to be identically and independently distributed as a Weibull distribution (Weibull, 1951). Based on this assumption, the general form of the multinomial logit model can be derived as:

$$p_{ji} = \frac{e^{U'(X'_j, S_i)}}{\sum_k e^{U'(X'_k, S_i)}} \qquad (10.57)$$

where $p_{ji}$ = the probability that individual $i$ chooses alternative $j$

and $k$ = subscript denoting any alternative from the set $K$

So far, the derivation does not specify the functional form of the observable utility, $U'(X'_j, S_i)$. In fact, the analyst may choose any functional form that seems useful or applicable. In the remainder of the chapter, however, the simplest functional form will be used, that is, a linear form.

$$U'(X'_j, S_i) = a_{0j} + \sum_s a_s X'_{sj} + \sum_t a_{tj} S_{ti} \qquad (10.58)$$

Thus the observable utility is assumed to consist of an alternative-specific dummy variable (coefficient $a_{0j}$), a set of alternative-specific characteristics with generic coefficients ($a_s X'_{sj}$), and a set of individual-specific characteristics with alternative-specific coefficients ($a_{tj} S_{ti}$) (see Stopher and Meyburg, 1975).

The model developed above is known as the multinomial logit (MNL) model. Much of the early work in these models used a simpler form of the multinomial logit model. The MNL model is a general model that holds for any number (finite) of alternatives, $K$. A simple form that is also useful for exploring many of the properties of the model is the binary logit model, shown for choice of alternative 1 as:

$$p_{1i} = \frac{e^{U'(X'_1, S_i)}}{e^{U'(X'_1, S_i)} + e^{U'(X'_2, S_i)}} \qquad (10.59)$$

Similarly, the binary logit model for the choice of alternative 2 is:

$$p_{2i} = \frac{e^{U'(X'_2, S_i)}}{e^{U'(X'_1, S_i)} + e^{U'(X'_2, S_i)}} \qquad (10.60)$$

This model form is used in much of the discussion in the remainder of the chapter because of its simple form.

Before proceeding further on the statistical properties of the model, there are certain other properties that should be explored since they affect the conceptual value of the logit model. First, both the multinomial and binary forms of the model (see equations 10.57 and 10.59) are symmetrical. That is, the probability of an alternative being chosen is directly proportional to a function of the utility of the alternative and is inversely proportional to a sum of functions of all the utilities. This latter term will be the same for all alternatives. Hence the logit model is symmetrical.

However, the model is frequently expressed in an asymmetrical form. Suppose, in the binary case, one were to divide the numerator and the denominator of equation 10.59 by $e^{U'(X'_2, S_i)}$. This would produce equation 10.61 which is still identical in fact to equation 10.59.

$$p_{1i} = \frac{e^{U'(X'_1, S_i) - U'(X'_2, S_i)}}{e^{U'(X'_1, S_i) - U'(X'_2, S_i)} + 1} \qquad (10.61)$$

A similar manipulation of equation 10.60 yields:

$$p_{2i} = \frac{1}{e^{U'(X'_1, S_i) - U'(X'_2, S_i)} + 1} \qquad (10.62)$$

While equations 10.61 and 10.62 are in fact identical to equations 10.59 and 10.60, respectively, the appearance of symmetry has been lost. Given the equivalence of the models, it should also be noted that one could equally well have chosen to divide numerator and denominator by $e^{U'(X'_1, S_i)}$. For the multinomial case, any alternative may be selected as the base (i.e., the one used to divide numerator and denominator) and a general form produced as equations 10.63 and 10.64 where $t$ is the arbitrary base alternative.
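The equivalence of the symmetrical and asymmetrical binary forms is easy to confirm numerically. The following Python sketch evaluates equations 10.59 and 10.60 alongside equations 10.61 and 10.62 for a pair of hypothetical utility values chosen only for illustration.

```python
import math


def binary_logit_symmetric(u1, u2):
    """Choice probabilities from equations 10.59 and 10.60."""
    e1, e2 = math.exp(u1), math.exp(u2)
    return e1 / (e1 + e2), e2 / (e1 + e2)


def binary_logit_difference(u1, u2):
    """Choice probabilities from equations 10.61 and 10.62, which depend
    only on the difference in utility between the two alternatives."""
    d = math.exp(u1 - u2)
    return d / (d + 1), 1 / (d + 1)


# HYPOTHETICAL observable utilities for alternatives 1 and 2.
u1, u2 = -0.4, 0.3

p_sym = binary_logit_symmetric(u1, u2)
p_diff = binary_logit_difference(u1, u2)
print("Symmetrical form: ", p_sym)
print("Difference form:  ", p_diff)
```

The two forms return the same pair of probabilities, which sum to one in each case.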

$$p_{ji} = \frac{e^{U'(X'_j, S_i) - U'(X'_t, S_i)}}{1 + \sum_{k \neq t} e^{U'(X'_k, S_i) - U'(X'_t, S_i)}}, \quad j \neq t \qquad (10.63)$$

$$p_{ti} = \frac{1}{1 + \sum_{k \neq t} e^{U'(X'_k, S_i) - U'(X'_t, S_i)}} \qquad (10.64)$$

Several important concepts emerge from this process of manipulation. First, it reveals that the logit model is structured around the concept that choice (or probability of occurrence of an event) is determined by the difference between the respective utilities. Thus the logit model is based on an implicit decision rule that is revealed by this manipulation. Second, given a linear utility function, it should now be clear why generic coefficients on the $S_i$ characteristics would have no effect on the probabilities, since all would cancel out in the difference formulation. The third concept derives from the model form used for calibration. As discussed later in the chapter, the model form for calibration is always that of equations 10.63 and 10.64. From this, it can be seen that all the alternative-specific dummy variables will be in difference form, that is, $(a_{0k} - a_{0t})$. Hence only $K - 1$ alternative-specific dummy variables can be estimated when there are $K$ alternatives in the dataset.
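The dependence of the multinomial logit model on utility differences alone can be demonstrated with a short Python sketch. The utilities below are hypothetical values for three illustrative alternatives. The sketch evaluates equation 10.57 directly, evaluates the base-alternative form of equations 10.63 and 10.64, and then adds the same constant to every utility to show that the probabilities do not change.

```python
import math


def mnl_probabilities(utilities):
    """Multinomial logit probabilities from equation 10.57."""
    denominator = sum(math.exp(u) for u in utilities.values())
    return {alt: math.exp(u) / denominator for alt, u in utilities.items()}


def mnl_probabilities_base(utilities, base):
    """The same probabilities in the difference form of equations
    10.63 and 10.64, with `base` as the arbitrary base alternative."""
    diffs = {alt: u - utilities[base] for alt, u in utilities.items()}
    denominator = 1.0 + sum(
        math.exp(d) for alt, d in diffs.items() if alt != base
    )
    return {
        alt: (1.0 if alt == base else math.exp(d)) / denominator
        for alt, d in diffs.items()
    }


# HYPOTHETICAL observable utilities for three alternatives.
utilities = {"car": 0.8, "transit": 0.2, "walk": -1.0}

p_direct = mnl_probabilities(utilities)
p_base = mnl_probabilities_base(utilities, base="walk")

# Adding the same constant to every utility leaves all differences,
# and hence all probabilities, unchanged.
shifted = {alt: u + 5.0 for alt, u in utilities.items()}
p_shifted = mnl_probabilities(shifted)

for alt in utilities:
    print(alt, round(p_direct[alt], 4), round(p_base[alt], 4),
          round(p_shifted[alt], 4))
```

Because a common constant cannot be detected in the probabilities, the alternative-specific constants can only be identified relative to one another.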

The second property of interest for the logit model concerns the shape of the curve that it yields. For the binary case, this curve is shown in Figure 10.34, where the model is:

$$p_{ji} = \frac{e^{U_{ji}}}{e^{U_{ji}} + e^{U_{ki}}} \qquad (10.65)$$

[Figure: Probit Model and Logit Model curves; vertical axis from 0.00 to 1.00; horizontal axis: Difference in Utility in Favour of Alternative A, with 0 marked]

Figure 10.34 Logit and Probit Curves for Equations 10.65 and 10.66

If a probit curve is drawn for equation (10.66), this is found to produce a very similar curve, as shown in Figure 10.34.

$$p_{ji} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{u_j} e^{-\frac{1}{2}u^2}\,du \tag{10.66}$$

In fact, calibrating probit and logit curves on the same data produces curves that are generally statistically indistinguishable from each other (Stopher and Lavender, 1972). It must be noted, however, that the assumptions and concepts behind these two models are significantly different and these differences may have far-reaching effects upon the performance and use of the models. A more complete description of the calibration of the logit model using MLE and the use and interpretation of this type of model is beyond the scope of this book. The interested reader is referred to texts such as Stopher and Meyburg (1979), Hensher and Johnson (1980), and Ben-Akiva and Lerman (1985).
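The closeness of the two curve shapes can be illustrated with a short calculation. The sketch below, in Python, is a comparison of the two functional forms only, not a calibration on data. It assumes that the logit curve is rescaled by a constant of 1.702 so that the two curves have a similar spread; this constant is an assumption of the illustration and does not come from the models described above.

```python
import math


def logit_curve(x):
    """Binary logit probability for a utility difference x."""
    return 1.0 / (1.0 + math.exp(-x))


def probit_curve(x):
    """Standard normal cumulative probability, computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


SCALE = 1.702  # assumed rescaling constant, chosen so the curves have similar spread

# Evaluate both curves on a grid of utility differences from -4 to +4.
grid = [i / 100.0 for i in range(-400, 401)]
max_gap = max(abs(logit_curve(SCALE * x) - probit_curve(x)) for x in grid)
```

Over this grid the largest vertical gap between the two curves, `max_gap`, is below 0.01, that is, less than one percentage point of probability.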



11. Finalising the Survey

Once the data has been collected and the analysis has been completed, there is a natural tendency for you to think that the project is nearly finished, and that it is time to move on to something new. However, there are two major tasks still to be completed. First, the results of the survey must be communicated effectively to the sponsors and other interested parties. Obviously there is little point to conducting surveys if this task is not handled well. Second, there is a need for you to make sure that the data are stored in a manner which ensures easy retrieval by others. While you may not want to make further use of the data right now, it is most likely that either you, or more likely someone else, will want to use the data in the future; therefore it has to be easily accessible. This ease of accessibility covers both the physical media on which the data is stored (tapes, disks) and the documentation which will allow someone else to make sense of the contents of the data set.

This chapter will address the issues and methods involved in the presentation of the results of the survey, in the storage of the data, and in the documentation of the survey method.

Chapter 11

396

11.1 PRESENTATION OF RESULTS

It is a well-worn phrase that data is not the same as information. Data must be analysed, interpreted and presented properly before it can be classified as information. The analysis of data and the interpretation of results have been covered in the previous chapter; this chapter describes some of the techniques by which the information content of data may be effectively presented. In doing so, it draws upon two of the most readable and interesting books on the topic of graphical presentation of quantitative data that it has been my pleasure to read (Tufte, 1983, 1990), and which should be on the bookshelf of anyone involved in data collection and analysis. Tufte (1983) describes "data graphics" as devices which "visually display measured quantities by means of the combined use of points, lines, a coordinate system, numbers, symbols, words, shading, and color". Although data graphics are a relatively recent invention (about 200 years old), the recent explosion in multi-media computer graphics capabilities has given many people the illusion that they can create attractive and meaningful graphics. If you are one of these people, then you should buy and read Tufte's books - quickly.

As an underpinning to data graphics, Tufte (1983) lists several principles of graphical excellence which help to communicate complex ideas with clarity, precision and efficiency. These principles are:

• show the data

• induce the viewer to think about the substance rather than about methodology, graphic design, the technology of graphic production, or something else

• avoid distorting what the data have to say

• present many numbers in a small space

• make large data sets coherent

• encourage the eye to compare different pieces of data

• reveal the data at several levels of detail, from a broad overview to the fine structure

• serve a reasonably clear purpose: description, exploration, tabulation, or decoration

• be closely integrated with the statistical and verbal descriptions of a data set.


Tufte states that "graphics reveal data", and in this he is closely aligned with the ideas of Tukey (1977) and the role of graphics in exploratory data analysis; the first rule of any data analysis is "plot the data". Thus, good graphics can be useful both in analysis and presentation; indeed, good graphical presentations encourage the reader (viewer) to conduct exploratory analysis beyond what is reported by the original analyst.

11.1.1 Avoid Distorted Graphics

One of Tufte's primary rules is to avoid the use of graphics which distort the data they are trying to represent. There are three classic distortions which unfortunately appear far too often in practice: the three-dimensional (3-D) pie-chart, the use of non-zero origins, and the use of multi-dimensional objects to represent uni-dimensional data values.

The 3-D pie-chart is often used to add "glamour" to a presentation which would normally only use a 2-D pie-chart. A comparison of 3-D and 2-D pie-charts which use the same data is shown in Figure 11.1. It is clear from the 2-D pie-chart that the values at the top, the right and the bottom are all the same size (they are each 8% of the total). However, in the 3-D pie-chart, it looks as though the 8% segment at the front (bottom) is bigger than the 8% segment at the back (top) which is, in turn, much bigger than the 8% segment on the right. The perspective view used by the 3-D pie-chart has distorted the data in the pie-chart (which is exactly what perspective views are supposed to do!). Unfortunately, many analysts use 3-D pie-charts without realising the distortion involved, because they know the values that they put into the chart and they don't see the distortion with an unbiased set of eyes. The distortion is particularly severe when value labels are not printed next to the pie segments, as in Figure 11.1.

Figure 11.1 Comparison of 3-D and 2-D Pie-Charts


The second major type of distortion is the use of a non-zero value as the origin of the vertical axis of charts. The non-zero origin distortion is used so frequently that one would think that it is an accepted part of graphical presentations. An example is shown in Figure 11.2. By having the vertical axis start at a non-zero value (in this case, 396), the trend in the data is much more pronounced in the chart on the left than when it is plotted at natural scale on the right. While close inspection will show that the data in the two charts are the same, that is not the impression given at first glance. When such charts are used in audio-visual presentations, where the viewer does not have much time to read the axis labels in detail, a very false impression can be conveyed. Unfortunately, the chart on the left is the one that was produced automatically by a popular spreadsheet package; the chart on the right was only obtained by manual intervention.

[Figure 11.2 charts: left chart, vertical axis from 396 to 410; right chart, vertical axis from 0 to 500]

Figure 11.2 Comparison of Non-Zero and Zero Origin Charts
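The size of the non-zero origin distortion can be checked with simple arithmetic. The following sketch, in Python, assumes a column chart in which each value is drawn as a height above the axis origin. The two data values are hypothetical and are not those plotted in Figure 11.2; only the origin of 396 is taken from the example above.

```python
def apparent_ratio(first, last, axis_origin):
    """Ratio of the plotted heights of two values when the axis starts at axis_origin."""
    return (last - axis_origin) / (first - axis_origin)


first_value, last_value = 398.0, 408.0     # hypothetical data values

true_ratio = last_value / first_value      # what the data actually say
truncated = apparent_ratio(first_value, last_value, 396.0)   # axis starts at 396
zero_based = apparent_ratio(first_value, last_value, 0.0)    # axis starts at zero
```

With these values the data grow by about 2.5%, and the zero-origin chart shows exactly that, whereas the chart with an origin of 396 draws the second column six times as tall as the first.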

The third major type of distortion is the use of multi-dimensional objects, such as circles, cylinders and spheres, to represent a one-dimensional value. Tufte (1983, pp 69-73) gives some lovely examples, but a simple example will suffice. In transport, one often tries to represent populations or trip destinations in a graphical format. A method that is often used is to draw circles centred on the location, the size of which is proportional to the number of trips (or the population of an area). An example is shown in Figure 11.3.

In the diagram on the left, the diameter of the circles is proportional to the number of trips, whereas in the diagram on the right the area of the circles is proportional to the number of trips. In both cases, the biggest circle is five times the smallest circle (either in diameter or area). In neither case, however, does the biggest circle look five times bigger. When diameters are used, it looks much more than five times bigger, and when areas are used it does not look as though it is five times bigger. Neither representation is correct, however, because we are trying to use a two-dimensional object (a circle) to represent a one-dimensional quantity (the number of trips). This problem is becoming more widespread as spreadsheet and graphing packages now allow the user to substitute multi-dimensional graphical objects for simple shading in histograms and other graphs.

Figure 11.3 Representing One-Dimensional Quantities with 2-D Objects
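The geometry behind the two diagrams in Figure 11.3 can be worked through directly. The sketch below, in Python, takes the five-to-one ratio from the example above and computes how the largest and smallest circles compare under each scaling rule; the function name and the dictionary layout are introduced here only for illustration.

```python
import math


def circle_ratios(value_ratio):
    """Diameter and area ratios of the largest to smallest circle under each rule."""
    # Rule 1: diameter proportional to the value, so area grows with the square.
    diameter_scaled = {"diameter": value_ratio, "area": value_ratio ** 2}
    # Rule 2: area proportional to the value, so diameter grows with the square root.
    area_scaled = {"diameter": math.sqrt(value_ratio), "area": value_ratio}
    return diameter_scaled, area_scaled


diameter_scaled, area_scaled = circle_ratios(5.0)
```

When diameters are scaled, the largest circle has 25 times the area of the smallest; when areas are scaled, it is only about 2.2 times as wide. In neither case does the viewer see a plain five-to-one comparison.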

11.1.2 Maximise the Data-Ink Ratio

Tufte (1983) invented the concept of the data-ink ratio as a means of measuring whether a graphic was concentrating on the data content or on decoration. He defined "data-ink" as the "non-erasable core of a graphic, the non-redundant ink arranged in response to variation in the numbers represented". He then defined the data-ink ratio as:

$$\text{Data-ink Ratio} = \frac{\text{data-ink}}{\text{total ink used to print the graphic}}$$

Two principles which then follow are to:

• maximise the data-ink ratio, within reason

• erase non-data-ink, within reason

Awareness of the concept of the data-ink ratio leads to the generation of graphics which are elegant in their simplicity, and which eschew the use of decoration for decoration's sake. It leads one to question whether or not certain lines, shadings, borders and text need to be on a graphic or not. If they can be removed without removing meaning from the graphic, then remove them. Compare the high data-ink ratio graph on the right of Figure 11.4 with the low data-ink ratio graph on the left. Both convey the same amount of information, but the graph on the right does so with much less ink. Its appearance is clean and distinctive.


[Figure 11.4 graphs: both graphs have a vertical axis from 0 to 25 and a horizontal axis from 1 to 6]

Figure 11.4 Comparison of Low Data-ink and High Data-ink Ratio Graphs

11.1.3 Minimise Chart Junk

One of the major causes of a low data-ink ratio is the proliferation of "chart junk" on graphs and charts. Chart junk, or graphical decoration, has proliferated since the widespread availability of computer graphics packages. Major sources of chart junk are shading patterns, gridlines, and superfluous decorations (what Tufte terms "ducks"). Poor choice of shadings can result in severe Moiré vibration effects, where the patterns appear to be continually moving. Some examples of poor shading are shown in Figure 11.5.

Figure 11.5 Poor Choices of Shading Patterns

Figure 11.6 shows a combination of superfluous gridlines and a 3-D "duck" in the figure on the left. The gridlines on the side, rear and floor of the three-dimensional space do little to improve the clarity of the graph. This is compounded by the use of a 3-D ribbon graph which adds nothing to the presentation of the data. Firstly, it is difficult without close inspection to determine which of the ribbons is at the front of the graph (i.e. which data belongs to series S1 and which belongs to S2). Secondly, it looks like the data series move in sympathy with each other in that they both appear to dip in the middle of the graph. However, examination of the simpler graph on the right shows that at the fourth data point along the horizontal axis, the two data sets are moving in opposite directions (series 1 is dipping while series 2 is peaking). A more accurate and easier to understand picture of the data is obtained by use of the simpler graphic.

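[Figure 11.6 graphs: left graph, 3-D ribbon graph of series S1 and S2, horizontal axis from 1 to 6, vertical axis from 0 to 30; right graph, Series1 and Series2, horizontal axis from 1 to 6, vertical axis from 0 to 25]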

Figure 11.6 Superfluous Gridlines and 3-D Effects

11.1.4 Shape and Orientation of Graphs

Tufte (1983) notes that graphics should tend towards the horizontal, greater in length than in height. If there is a cause-effect relationship being displayed, then the cause should be plotted on the horizontal axis and the effect on the vertical axis (unlike an economist's demand curve). Labels should read from left to right, rather than having words stacked vertically, or having the words run from top to bottom or vice versa. While some prefer the ratio of the horizontal width to the vertical height to be that of the "Golden Section" (viz. 1.618), which is such that the ratio of the height to the width is the same as the width to the sum of the height and width, Tufte merely suggests that the width should be about 50% greater than the height.
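The value of 1.618 follows directly from the definition just given. As a short worked derivation, let $w$ be the width and $h$ the height of the graphic, and write $r = w/h$ (these symbols are introduced here for the derivation only):

$$
\begin{aligned}
\frac{h}{w} &= \frac{w}{h+w} \\
\frac{1}{r} &= \frac{r}{1+r} \\
r^2 - r - 1 &= 0 \\
r &= \frac{1+\sqrt{5}}{2} \approx 1.618
\end{aligned}
$$

Only the positive root is meaningful for a ratio of lengths. Tufte's suggestion of a width about 50% greater than the height corresponds to $r \approx 1.5$, a slightly less elongated shape.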

11.1.5 The Friendly Data Graphic

Tufte (1983) encapsulates many of his principles of graphic design in what he terms the "friendly data graphic"; that is, one that looks as though the designer had the viewer in mind at every point while constructing the graphic. The features of the "friendly data graphic" are summarised in Table 11.1 by way of contrast with the unfriendly data graphic.


Table 11.1 Characteristics of the Friendly Data Graphic

| Friendly | Unfriendly |
|---|---|
| words are spelled out, mysterious codes are avoided | abbreviations abound, requiring the reader to find the explanations in the text |
| words run from left to right | words run vertically, particularly along the Y-axis; words run in several directions |
| messages are attached to explain the data | graphic is overly cryptic, requiring repeated references to the text |
| elaborate shadings, cross-hatchings and colours are avoided; labels are placed directly on the graphic, removing the need for a legend | obscure codings and shadings require repeated cross-referencing between the legend and the graphic |
| graphic attracts the viewer, promotes curiosity | graphic is overpowering, filled with chartjunk |
| if colours are used, they are chosen with the colour blind in mind; blue can generally be distinguished by most colour-deficient people | design is insensitive to colour-deficient viewers; red and green are used for essential contrasts |
| type is clear, precise and modest, with minimum number of fonts used | type is all bold, large and in many fonts |
| type is upper and lower case, with serifs | type is all capitals, sans serif |

11.2 DOCUMENTATION OF SURVEY

The preparation and distribution of reports is the investigator's primary means of conveying the results and methodology of the survey to interested parties. In preparing the survey report, four principles must be kept in mind. First, it is up to the investigator to explain fully the purpose and scope of the survey. It may be completely obvious to the investigator why the survey was carried out. However, it is almost certain that it is not as obvious to the reader of the report who is generally not as deeply immersed in the subject as the investigator. Without this background, it will be difficult for the reader to place the remainder of the report in its proper context.

Second, the investigator must keep in mind the type of reader(s) for whom the report is being written, the extent of their knowledge, the type of problem that is likely to be of interest to them and the kind of language to which they are accustomed. In many cases, a question-and-answer format for the reports will assist readers to find the information they seek from the report.



Third, and related to the above, it is the responsibility of the investigator to translate statistical technicalities into language which will be understood by the reader who is primarily interested in the substantive results of the survey. Masses of standard errors, significance levels and the like are of no help if they are not understood in the context of the subject matter of the survey.

Fourth, non-quantified verbal descriptions of results are very useful in helping to develop understanding of the survey results. To many readers, statistical tables are dull and difficult to comprehend. A certain amount of verbatim quotation (obtained directly from open-ended questions), as well as verbal summaries of tables, helps to make the report easier to digest.

As regards the format and content of survey reports, there are a number of different formats that can be adopted. The United Nations recommendations for the preparation of sample survey reports serve as a useful guide. They recommend three types of report - a preliminary report, a general report and a technical report. Full details of these reports are described in Moser and Kalton (1979), but the major features of each are summarised below.

The preliminary report is often required to make available data of current interest as soon as possible. It should contain a brief statement concerning the survey methods and the limitations of the data. As a minimum requirement, information should be given concerning the size of the sample, the method of selecting the sample and any discrepancies observed between the sample and external data sources.

The general report should contain a general description of the survey for the use of those who are primarily interested in the results of the survey rather than in the technical aspects of the sample design, survey execution and analysis. Nonetheless, there must be sufficient description of the survey methodology to ensure that the survey results are not taken out of context or misinterpreted. The general report should include information on the following aspects of the survey:

(a) Statement of purposes of survey;
(b) Description of the sample coverage;
(c) Method of collection of information;
(d) Repetition details (if a continuing survey);
(e) Numerical results;
(f) Date and duration of survey;
(g) Description of accuracy;
(h) Cost of survey;
(i) Assessment of survey success;
(j) Responsibility for survey;
(k) References.


The technical report should be issued for surveys of particular importance and those using new techniques and procedures of special interest. The report should deal in detail with technical statistical aspects of the sampling design, survey execution and analysis. It should include information on the following points:

(a) Specification of the sampling frame;
(b) Design of the sample;
(c) Personnel and equipment;
(d) Statistical analyses and computational procedures;
(e) Accuracy of results including discussion of random sampling errors and sampling bias;
(f) Adequacy of sampling frame;
(g) Comparisons with other sources of information;
(h) Costing analysis;
(i) Efficiency of survey method;
(j) Observations on survey staff.

Although the above procedures provide a useful way in which to document the survey, the authors believe that the framework adopted in this book provides an equally effective means of documenting a survey. Since the survey should have been designed using the procedures outlined in the preceding chapters, the following outline should provide a useful means of writing up the survey documentation.

Chapter 1 Preliminary Planning

• Administrative Details of the Survey
  - the name of the survey?
  - who sponsored the survey?
  - who designed the survey?
  - who collected the survey data?
  - who analysed the survey data?
  - was there an Advisory Committee or Panel?
  - dates and duration of the survey?

• Overall Study Objectives
  - what were the objectives of the project to which this survey contributed?
  - why was a survey needed?

• Specific Survey Objectives
  - what were the specific objectives of this survey?

• Review of Existing Information
  - what prior information was available?
  - what secondary information was available for sample expansion?

• Formation of Hypotheses
  - what specific hypotheses, if any, were to be tested?

• Definition of Terms
  - what definitions are being used by the survey team for key items such as trip, household, mode, income etc. (as relevant to the specific survey)?

• Determination of Survey Resources
  - what time was available for completion of the survey?
  - how much money was available for the survey?
  - what people were available to work on the survey?

Chapter 2 Selection of Survey Method

• Selection of Survey Time Frame
  - was the survey cross-sectional or time-series (and why)?

• Selection of Survey Technique
  - what methods were considered for the survey technique?
  - what testing was performed on the different methods?
  - what method was finally selected (and why)?

Chapter 3 Sample Design

• Definition of Target Population
  - what was the population for the survey?
  - how was this population defined and identified?

• Sampling Units
  - what unit was used for sampling?

• Sampling Frame
  - what sampling frame was used?
  - where was the sampling frame obtained from?
  - how was the sampling frame obtained?
  - why was the sampling frame first compiled?
  - how did the sampling frame perform in terms of:
      - accuracy
      - completeness
      - duplication
      - adequacy
      - up-to-dateness


• Sampling Method
  - what sampling methods were considered?
  - what sampling method was finally chosen (and why)?
  - was the selected sample representative of the population?
      - if not, how will this be corrected later?
  - what was the specific sampling procedure (full details)?

• Consideration of Sampling Bias
  - what sources of sampling bias were considered?
  - how serious were these biases considered to be?
  - what steps were taken to overcome these sources of bias?

• Sample Size and Composition
  - what was the final sample size?
  - what stratifications were used in the sample design?
  - how was the sample size calculated?
      - what were the key variables considered?
      - what was the variability of these variables?
      - what confidence limits were used?
      - what levels of confidence were used?

• Estimation of Parameter Variances
  - how are parameter variances to be estimated in the data analysis?

• Conduct of Sampling
  - what procedure was used in selecting the sample?
  - was random sampling used at all stages of sampling?

Chapter 4 Survey Instrument Design

• Question Content
  - what types of information are being sought in the survey?

• Trip Recording Techniques
  - how are trips and activities being sought from respondents?

• Physical Nature of Forms
  - what is the physical nature of the survey forms?
      - what paper size and weight was used?
      - what colours and printing methods were used?

• Question Types
  - what classification questions were asked?
      - where did the classification categories come from?
  - what attitude questions were asked?
      - what testing was performed on the attitude scales?

• Question Format
  - which questions were asked as open questions (and why)?
  - which questions were asked as closed questions (and why)?
      - where did the closed question categories come from?

• Question Wording
  - how has the question wording been tested for:
      - simple vocabulary
      - words appropriate to the audience
      - length of questions
      - ambiguous questions (get someone else to read them)
      - double-barrelled questions
      - vague words
      - loaded questions
      - leading questions
      - double negatives
      - stressful questions
      - grossly hypothetical questions
      - the effect of response styles
      - periodicity questions

• Question Ordering
  - what reasons are there for the question ordering?

• Question Instructions
  - what instructions were provided for respondents/interviewers?

Chapter 5 Pilot Survey(s)

• Description of Pilot Surveys
  - what pilot testing was performed?
  - if no pilot testing was done, why not?

• Size of the Pilot Survey

• Lessons from the Pilot Survey
  - how adequate was the sampling frame?
  - what was the variability within the survey population?
  - what response rate was achieved?
  - how suitable was the survey method?
  - how well did the questionnaire perform?
  - how effective was the interviewer training?
  - did the coding, data entry, editing and analysis procedures work satisfactorily?

• Cost and Duration of Pilot Surveys


Chapter 6 Administration of the Survey

• Survey Procedures
  - Self-Completion Questionnaires
      - pre-contact procedures
      - mail-out procedures
      - response receipt procedures
      - phone enquiry procedures
      - postal reminder regime
      - telephone follow-ups
      - validation interviews
      - non-response interviews
  - Personal Interviews
      - pre-contact procedures
      - call-back procedures
      - maintenance of survey logs
      - interviewer payment methods
      - field supervisor tasks
      - work distribution procedures
  - Telephone Interviews
      - sampling procedures
      - dealing with non-response
      - use of CATI systems
  - Intercept Surveys
      - procedures for obtaining respondents
      - distribution of surveys
      - collection of surveys
  - In-depth Interview Surveys
      - pre-contact procedures
      - call-back procedures
      - maintenance of survey logs
      - recording methods
      - transcription methods
      - interpretation of responses

Chapter 7 Data Processing

• Selection of Coding Method
  - what physical method was used for data coding?

• Preparation of Code Format
  - what coding frame was used?
      (provide full coding frame in Appendix)
  - what location-coding method was used?

• Development of Data Entry Programs
  - what special data entry programs were developed?
      (provide screen-shots of data entry screens in Appendix)

• Coder and Data Entry Training
  - what training was provided for coders and data enterers?
      (provide training manual in Appendix)

• Coding Administration
  - how was the coding administered?
  - what quality control procedures were implemented?
  - how were changes made to coding frames?

Chapter 8 Data Editing

• Initial Questionnaire Editing
  - what in-field checking was done by interviewer/supervisor?
  - what checking was done on receipt in survey office?

• Verification of Data Entry
  - was data entry verified for accuracy?

• Development of Editing Computer Programs
  - were special data editing programs developed?

• Consistency and Range Checks
  - what permissible range checks were applied?
      (provide full list of checks in Appendix)
  - what logic checks were applied?
      (provide full list of checks in Appendix)

• Missing Data
  - how was missing data coded?
  - were estimates made of missing values?

Chapter 9 Data Correction and Expansion

• Editing Check Corrections
  - what procedures were used for office edits?

• Secondary Data Comparisons
  - what secondary data was used for sample expansion?
  - what variables were used for expansion purposes?
  - was expansion based on cross-tabulations or marginal totals?
  - what were the final expansion factors?
  - how are they to be applied when using the data?


• Corrections for Internal Biases
  - what recognition was there of non-reported data?
  - were non-reporting factors calculated?
  - if so, how are they to be applied to the data?
  - what recognition was there of non-response?
  - were non-response factors calculated?
  - if so, how are they to be applied to the data?

Chapter 10 Data Analysis and Management

• Exploratory Data Analysis
  - what EDA methods were used?

• Model Building
  - is the data to be used to build specific models?

• Interpretation of Results
  - are any limitations on the data clearly stated?
  - how is the sampling error expressed?

• Database Management
  - is the structure of the datafiles clearly described?
  - are the relationships between data files clear?

• Provision of Data Support Services
  - what support is available for users of the data?
  - is it clear where such support can be obtained?

Chapter 11 Presentation of Results

• Presentation of Results of Analysis
  - are the major descriptive results presented:
      - in a clear visual manner?
      - with accompanying written explanations?
      - with appropriate interpretations?
      - and with clear statement of any qualifications?

• Publication of Results
  - are the results of the survey or the survey methodology written up in concise form, and available in the general literature?

Chapter 12 Tidying-Up

• Storage and Archival of Data
  - where is the data stored?
  - who is the contact person?
  - are telephone, fax and e-mail numbers provided?
  - is this documentation stored electronically with the data?
  - has the data been lodged with any other archival service?

• Completion of Administrative Duties
  - have all survey staff been fully paid?
  - have all outstanding bills been paid?
  - what arrangements have been made for destroying original questionnaires?

These same headings are reproduced, using a question and answer format, in Appendix 3 in the form of a Survey Design Checklist document that can be copied and used to assist in the planning and execution of travel and activity surveys.

In addition to the preparation of reports by conventional means, consideration should also be given to the preparation of verbal, visual and electronic presentations describing the survey and its results. Recent advances in "desktop presentation" techniques on widely available microcomputers promise a much more effective means of communication than conventional written reports. Programs such as HyperCard on the Apple Macintosh and presentation software such as Microsoft PowerPoint enable the compilation of audio-visual presentations which can effectively and dramatically convey the results of the survey to decision-makers in a readily understandable and flexible fashion. Wigan (1988) has taken steps in this direction with a prototype system based on Australian transportation surveys; further work is currently being performed by the authors. It is expected that in the next few years, graphics-based microcomputers will be playing a major role in the design, analysis and presentation of results from most transportation surveys.


11.3 TIDYING-UP

The final task in the survey process is the often thankless task of tidying-up. The effective tidying-up of the data is the only way in which the data will be available for secondary analysis at a later date (Hyman 1972; Wigan 1985).

In discussing the requirements of data for secondary analysis, Wigan (1985) raises, among many other issues, the following points:

(a) The distribution of data for secondary analysis has been greatly improved by the establishment of the Inter-University Consortium for Political and Social Research (ICPSR) at the Institute for Social Research in Ann Arbor, Michigan. This institute gathers data sets and arranges for their storage, maintenance and distribution according to pre-set standards.

(b) The availability of data for secondary analysis is often restricted because of confidentiality requirements imposed on the data by the original sponsor of the survey. Key data items may be omitted from the data set after primary analysis, or the data may be aggregated in such a way that prevents meaningful secondary analysis.

(c) As mentioned earlier, considerable effort can be expended in the estimation of weighting factors which enable the sample data to more closely represent the population from which it was drawn. In many cases, these weighting factors are not stored with the original data, with the result that the person doing secondary analysis must repeat this time-consuming process.

(d) Because of the many adjustments which may be made to a data file during primary analysis, it is usual for several versions of a data file to exist at any one time. It is essential to know which version has been archived, and to ensure that the documentation which has been archived matches the version of the data file which has been archived.

(e) It is essential that the code books and other documentation be archived with the data tape, so that definitions of variables and the codes used are clearly defined.
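Points (d) and (e) can be supported by a simple archive manifest that records exactly which version of each file was lodged. The following is a minimal sketch in Python, not a description of any archive's actual procedure. The file names, version label and file contents are hypothetical and are held in memory purely for illustration, and the use of a SHA-256 checksum to identify a file version is an assumption of this example.

```python
import hashlib
import json


def sha256_of(content):
    """Hex digest that identifies one exact version of a file's contents."""
    return hashlib.sha256(content).hexdigest()


def build_manifest(version, files):
    """Record a version label and a checksum for every archived file.

    files maps a file name to its contents as bytes.
    """
    return {
        "version": version,
        "files": {name: sha256_of(content) for name, content in files.items()},
    }


def matches_archive(manifest, name, content):
    """True if this content is the version recorded in the manifest."""
    return manifest["files"].get(name) == sha256_of(content)


# Hypothetical final data file (with weighting factors) and its code book.
data_v3 = b"hh_id,persons,trips,weight\n1001,3,11,42.7\n"
code_book = b"trips: number of trips on travel day\nweight: expansion factor\n"

manifest = build_manifest("v3-final", {"travel.csv": data_v3, "codebook.txt": code_book})
manifest_text = json.dumps(manifest, indent=2, sort_keys=True)  # stored with the archive

# An earlier working version, without weights, should not match the manifest.
data_v2 = b"hh_id,persons,trips\n1001,3,11\n"
```

A later user who holds a copy of the data can call `matches_archive` to confirm that it is the archived version before relying on the archived code book.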

The importance of tidying-up is only really appreciated when you attempt to use someone else's data and realise how much you wish they had documented their survey technique, arranged for storage of the data with enough information to permit easy access, and let someone know where the data was stored and who had continuing responsibility for its upkeep. The golden rule in tidying-up is:

"Do unto others as you would have them do unto you".


List of References

Abbess C, Jarrett D & Wright CC (1981). 'Accidents at Blackspots: Estimating the Effectiveness of Remedial Treatment, with Special Reference to the 'Regression-to-the-Mean' Effect'. Traffic Engineering and Control, 22, 10, 535-542.

Abrams MA (1972). A Short Guide to Social Survey Methods. Bedford Square Press of the National Council of Social Science: London.

Ackoff RL (1965). Individual Preferences for Various Means of Transportation. Management Science Center, University of Pennsylvania, Pennsylvania.

Akcelik R (1983). Progress in Fuel Consumption Modelling for Urban Traffic Management. ARR Report 124, Australian Road Research Board, Melbourne.

Alonso W (1968). 'The Quality of Data and Choice and Design of Predictive Model'. Highway Research Board, Special Report 97, 178-192.

Ampt Applied Research & Transport Studies Unit Macquarie University (1989). Intercity Travel Survey 1987-88. Report for the Very Fast Train Consortium, Canberra.

Ampt Applied Research (1988). Very Fast Train Feasibility Study. Report for the Very Fast Train Consortium, Melbourne.

Ampt Applied Research (1990). 1989/90 New Zealand Travel Survey. Final Report for the New Zealand Ministry of Transport, Wellington, NZ.

Ampt E, Bradley M & Jones PM (1987). Development of an Interactive, Computer-Assisted Stated Preference Technique to Study Bus Passenger Preferences. Presented at the 66th Annual Meeting of the Transportation Research Board, Washington, DC.

Ampt ES & Jones PM (1992). Attitudes and Responses to Traffic Congestion and Possible Future Counter-measures: An Exploratory Study of Household Travel in Bristol. Reference 683, Transport Studies Unit, Oxford University.

Ampt ES & Richardson AJ (1994). The Validity of Self-Completion Surveys for Collecting Travel Behaviour Data. PTRC European Transport Forum, Transportation Planning Methods 2.

Ampt ES & Waters F (1993). Melton-Bacchus Marsh Public Transport Study. TRC Working Paper TWP93/1, Transport Research Centre, Melbourne.

Ampt ES & West L (1985). The Role of the Pilot Survey in Travel Studies. In ES Ampt, AJ Richardson & W Brög (Eds), New Survey Methods in Transport. VNU Science Press: Utrecht, The Netherlands.

Ampt ES (1981). Some Recent Advances in Large Scale Travel Surveys. Presented at the PTRC 9th Summer Annual Meeting, University of Warwick, UK.

Ampt ES (1989). Comparison of Self-Administered and Personal Interview Methods for the Collection of 24-Hour Travel Diaries. Selected Proceedings of the Fifth World Conference on Transport Research, Vol 4, Contemporary Development in Transport Modelling, D195-D206.

Ampt ES (1992). Quality and Cost-Effectiveness in Recording a 24-Hour Travel Diary. Forum Papers, 8th Australian Transport Research Forum, Part 1, 77-88.

Ampt ES (1992). Werribee Travel Survey Report Final Report. TRC Working Paper TWP92/14, Transport Research Centre, Melbourne.

Ampt ES (1993). The Victorian Activities and Travel Survey (VATS) Pilot Survey Objectives. VITAL Working Paper VWP93/2, Transport Research Centre, Melbourne.

Ampt ES (1994). The Victorian Activities and Travel Survey (VATS) Manual. VITAL Working Paper VWP94/1, Transport Research Centre, Melbourne.

Ampt ES, Blechinger W, Brög W & Messelink H (1983). Understanding Travel Survey Data Validation. Forum Papers, 17th Australian Transport Research Forum, Part 1, 344-359.

Ampt ES, Richardson AJ & Brög W (1985). New Survey Methods in Transport. VNU Science Press: Utrecht, The Netherlands.

Anderson J & Berdie D (1975). 'Effects on Response Rates of Formal and Informal Questionnaire Follow-up Techniques'. Journal of Applied Psychology, 60, 2, 255-257.

Anderson JF, Niebuhr MA, Braden A & Alderson SR (1986). 'Telephone Interviews: Cost-effective Methods for Accurate Travel Surveys'. Transportation Research Record, 1097, 4-7.

Anderson NH (1970). 'Functional Measurement and Psychophysical Judgment'. Psychological Review, 77, 153-170.

Anderson NH (1971). 'Integration Theory and Attitude Change'. Psychological Review, 78, 171-206.

Anderson NH (1972). Algebraic Models in Perception. Report No 30, Center for Human Information Processing, Department of Psychology, University of California, San Diego.

Anderson NH (1974). Information Integration Theory: A Brief Survey. In DH Krantz, RC Atkinson, RD Luce & P Suppes (Eds), Contemporary Developments in Mathematical Psychology (Vol. 2). Freeman: San Francisco.

Anderson NH (1976). 'How Functional Measurement Can Yield Validated Interval Scales of Mental Quantities'. Journal of Applied Psychology, 61, 6, 677-692.

Aplin WN & Flaherty HM (1976). Sampling Processes for the National Travel Survey. Occasional Paper 5, Bureau of Transport Economics, Australian Government Printing Service, Canberra.

Aplin WN & Hirsch NA (1978). National Travel Survey 1977-78: Geographic Zoning and Coding System. Occasional Paper 21, Bureau of Transport Economics, Australian Government Printing Service, Canberra.

APTA (1985). Transit Fact Book. American Public Transit Association: Washington, DC.

Armstrong JS & Overton TS (1977). 'Estimating Non-response Bias in Mail Surveys'. Journal of Marketing Research, 14, 3, 396-402.

Armstrong JS (1978). Long-range Forecasting: From Crystal Ball to Computer. John Wiley & Sons: New York.

Attanucci J, Burns I & Wilson N (1981). Bus Transit Monitoring Manual 1. Report No UMTA-IT-09-9008-81-1, Urban Mass Transportation Administration, Washington DC.

Axhausen KW (1994). Travel Diaries: An Annotated Catalogue. Working Paper, Centre for Transport Studies, University of London.

Baass KG (1981). 'Design of Zonal Systems for Aggregate Transportation Planning Models'. Transportation Research Record, 807, 1-7.

Babbie ER (1973). Survey Research Methods. Wadsworth: Belmont, California.

Baker RP & Lefes WL (1988). The Design of CATI Systems: A Review of Current Practice. In RM Groves, PP Biemer, LE Lyberg, JT Massey, WL Nicholls & J Waksberg (Eds), Telephone Surveying Methodology. John Wiley & Sons: New York.

Barnard PO (1981). Adelaide Travel Demand and Time Allocation Study: Questionnaire Forms Interview and Coding Manuals. Internal Report AIR 352-2, Australian Road Research Board, Melbourne.

Barnard PO (1985). Evidence of Trip Under-Reporting in Australian Transportation Study Home Interview Surveys and its Implications for Data Utilisation. In ES Ampt, AJ Richardson & W Brög (Eds), New Survey Methods in Transport. VNU Science Press: Utrecht, The Netherlands.

Bates JJ (1979). 'Sample Size and Grouping in the Estimation of Disaggregate Models - A Simple Case'. Transportation, 8, 347-369.

Baughan CJ (1979). Public Attitudes to Alternative Sizes of Lorry - a Study in a Residential Area. Supplementary Report 509, Transport & Road Research Laboratory, Crowthorne UK.

Bayley CA (1983). Vehicle Classification Algorithm. Internal Report AIR 376-2, Australian Road Research Board, Melbourne.

Beadle A & Paulley NJ (1979). A Survey of Long-Distance Journeys Made by Manchester Residents in 1974. Supplementary Report 487, Transport & Road Research Laboratory, Crowthorne UK.

Beardwood JE (1981). Sampling Errors in Grossed-up or Manipulated Survey Data. Presented at the Conference on Transport Surveys - Design for Efficiency, The University of Leeds, UK.

Ben-Akiva ME & Lerman S (1985). Discrete Choice Analysis. MIT Press: Cambridge, Mass.

Ben-Akiva ME, Macke PP & Hsu PS (1986). 'Alternative Methods to Estimate Route-Level Trip Tables and Expand On-Board Surveys'. Transportation Research Record, 1037, 1-11.

Berkson J (1944). 'Application of the Logistic Function to Bio-Assay'. Journal American Statistical Association, 39, 357-365.

Bethlehem JG & Hundepool AJ (1992). Integrated Survey Processing on Microcomputers. Presented at the Survey and Statistical Computing Conference, Bristol, UK.

Bhat C (1994). Estimation of Travel Demand Models with Grouped and Missing Income Data. Presented at the 7th International Conference on Travel Behaviour Research, Santiago, Chile.

Blankenship AB (1977). Professional Telephone Surveys. McGraw-Hill: New York.

Blechinger W, Brög W & Messelink HWB (1985). 'Comments on Airport Survey Methods using Schiphol Airport in Amsterdam as an Example'. Transportation Research Record, 1025, 22-26.

Bock RD & Jones LV (1968). The Measurement and Prediction of Judgement and Choices. Holden Day: San Francisco.

Bonsall PW & McKimm J (1993). 'Non-Response Bias in Roadside Mailback Surveys'. Traffic Engineering and Control, December, 582-591.

Bonsall PW & Parry T (1991). 'Using an Interactive Route-Choice Simulator to Investigate Drivers' Compliance with Route Guidance Advice'. Transportation Research Record, 1036, 59-68.

Bonsall PW (1980). A Survey of Attitudes to Car Sharing: A Data Base for Microsimulation. Supplementary Report 563, Transport & Road Research Laboratory, Crowthorne UK.

Bonsall PW (1985). Transfer Price Data - Its Definition, Collection and Use. In ES Ampt, AJ Richardson & W Brög (Eds), New Survey Methods in Transport. VNU Science Press: Utrecht, The Netherlands.

Bottom CG & Jones PM (1982). Don't Forget the Driver - A Pilot Study of Attitudes to Schedules and Working Conditions. Reference (187/CP), Transport Studies Unit, Oxford University.

Bovy PHL & Bradley MA (1986). 'Route Choice Analyzed with Stated-Preference Approaches'. Transportation Research Record, 1037, 11-21.

Bowyer DP (1980). User Group Identification. In Paratransit, Changing Perceptions of Public Transport. Australian Government Printing Service: Melbourne.

Bradburn NM & Sudman S (1979). Improving Interview Methods and Questionnaire Design. Jossey-Bass: San Francisco.

Bradley M & Daly A (1994). 'Use of the Logit Scaling Approach to Test for Rank-Order and Fatigue Effects in Stated Preference Data'. Transportation, 21, 2, 167-184.

Brillinger DR (1966). 'The Application of the Jackknife to the Analysis of Sample Surveys'. Journal of the Market Research Society, 8, 74-80.

Brög W & Ampt ES (1983). 'State of the Art in the Collection of Travel Behavior Data'. Transportation Research Board, Special Report 201, 48-62.

Brög W & Erl E (1980). 'Interactive Measurement Methods: Theoretical Bases and Practical Applications'. Transportation Research Record, 775, 1-6.

Brög W & Erl E (1981). Development of a Model of Individual Behaviour to Explain and Forecast Daily Activity Patterns in Urban Areas (Situational Approach). Presented at the PTRC 9th Summer Annual Meeting, University of Warwick, UK.

Brög W & Meyburg AH (1980). 'Nonresponse Problem in Travel Surveys: an Empirical Investigation'. Transportation Research Record, 775, 34-38.

Brög W & Meyburg AH (1981). 'Consideration of Non-Response Effects in Large-Scale Mobility Surveys'. Transportation Research Record, 807, 39-46.

Brög W & Meyburg AH (1982). Influence of Survey Methods on the Results of Representative Travel Surveys. Presented at the 61st Annual Meeting of the Transportation Research Board, Washington, DC.

Brög W & Neuman K (1977). The Interviewee as a Human Being. Presented at the XLIV ESOMAR Seminar on "Ways and New Ways of Data Collection", Jouy-en-Josas, France.

Brög W & Otto K (1981). Potential Cyclists and Policies to Attain this Potential. Presented at the PTRC 9th Summer Annual Meeting, University of Warwick, UK.

Brög W & Zumkeller D (1983). An Individual Behavioural Model Based on the Situational Approach (SINDIVIDUAL). Presented at the World Conference on Transport Research, Hamburg, West Germany.

Brög W (1982). 'Application of the Situational Approach to Depict a Model of Personal Long-Distance Travel'. Transportation Research Record, 890, 24-33.

Brög W, Erl E, Meyburg AH & Wermuth MJ (1982). 'Problems of Non-Reported Trips in Surveys of Nonhome Activity Patterns'. Transportation Research Record, 891, 1-5.

Brög W, Erl E, Meyburg AH & Wermuth MJ (1983). 'Development of Survey Instruments Suitable for Determining Nonhome Activity Patterns'. Transportation Research Record, 944, 1-12.

Brög W, Fallast K, Katteler H, Sammer G & Schwertner B (1985). Selected Results of a Standardised Survey Instrument for Large-Scale Travel Surveys in Several European Countries. In ES Ampt, AJ Richardson & W Brög (Eds), New Survey Methods in Transport. VNU Science Press: Utrecht, The Netherlands.

Brög W, Meyburg AH, Stopher PR & Wermuth MJ (1985). Collection of Household Travel and Activity Data: Development of an Instrument. In ES Ampt, AJ Richardson & W Brög (Eds), New Survey Methods in Transport. VNU Science Press: Utrecht, The Netherlands.

Brown HP (1977). Attitudinal Measures in Models of Mode Choice. Forum Papers, 3rd Australian Transport Research Forum, (un-numbered).

Brownlee KA (1957). 'A Note of the Effects of Nonresponse on Surveys'. American Statistical Association Journal, 29-32.

Brunner GA & Carroll SJ (1967). 'The Effect of Prior Telephone Appointments on Completion Rates and Response Content'. Public Opinion Quarterly, 31, 652-654.

Brunso JM & Hartgen DT (1984). 'An Update on Household-Reported Trip-Generation Rates'. Transportation Research Record, 987, 67-75.

Bureau of Transport & Communications Economics (1994). Transport and Communications Indicators. Quarterly, Bureau of Transport and Communications Economics, Canberra.

Bureau of Transport Economics (1981). National Travel Survey 1977-78: Statistical Adjustments and Final Results. Bureau of Transport Economics, Australian Government Printing Service, Canberra.

Burnett KP & Hanson S (1982). 'The Analysis of Travel as an Example of Complex Human Behaviour in Spatially-Constrained Situations: Definition and Measurement Issue'. Transportation Research A, 16, 87-102.

Burton TL & Cherry GE (1970). Survey Research Techniques for Planners. Allen and Unwin: London.

Cannell C, Groves R, Magilavy L, Mathiowetz N & Miller P (1987). An Experimental Comparison of Telephone and Personal Health Surveys. Presented at the Annual Meeting of the American Association for Public Opinion Research, St Petersburg, Florida.

Carpenter S & Jones PM (Eds) (1983). Recent Advances in Travel Demand Analysis. Gower: Aldershot, England.

Carroll JD & Chang JJ (1970). 'Analysis of Individual Differences in Multidimensional Scaling via an N-way Generalization of Eckart-Young Decomposition'. Psychometrika, 35, 3, 283-319.

Carterette EC & Friedman MP (Eds) (1974). Handbook of Perception, Volume II, Psychophysical Judgment and Measurement. Academic Press: New York.

Chamberlain G (1978). 'Omitted Variable Bias in Panel Data: Estimating the Returns to Schooling'. Annales De L'Insee, 30, 31, 49-63.

Chipman ML, Lee-Gosselin M, MacGregor C & Clifford L (1992). The Effects of Incentives and Survey Length on Response Rates in a Driver Exposure Survey. In ES Ampt, AJ Richardson & AH Meyburg (Eds), Selected Readings in Transport Survey Methodology. Eucalyptus Press: Melbourne.

Church J (Ed) (1994). Social Trends 24. Central Statistical Office, HMSO, London.

Clark AC & Goldstucker C (1986). 'Mail-Out/Mail-Back Travel Survey in Houston Texas'. Transportation Research Record, 1097, 13-20.

Clarke L, Phibbs M, Klepez A & Griffiths D (1987). General Household Advance Letter Experiment. Survey Methodology Bulletin, UK Office of Population Censuses and Surveys, London.

Clarke MI (1984). An Activity Scheduling Model of Travel Behaviour. TSU Ref 263, Transport Studies Unit, Oxford University.

Clarke MI, Dix MC & Jones PM (1981). 'Error and Uncertainty in Travel Surveys'. Transportation, 10, 2, 105-126.

Clarke MI, Dix MC, Jones PM & Heggie IG (1981). 'Some Recent Developments in Activity-Travel Analysis and Modelling'. Transportation Research Record, 794, 1-8.

Cochran WG (1977). Sampling Techniques, (3rd Edn). John Wiley & Sons: New York.

Cohen GS, McEvoy F & Hartgen DT (1981). 'Who Reads the Transportation Planning Literature?'. Transportation Research Record, 793, 33-40.

Collins M (1978). 'Interviewer Variability: The North Yorkshire Experiment'. Journal of Market Research Society, 20, 59-72.

Collins M, Sykes W, Wilson P & Blackshaw N (1988). Non-Response: The U.K. Experience. In RM Groves, PP Biemer, LE Lyberg, JT Massey, WL Nicholls & J Waksberg (Eds), Telephone Surveying Methodology. John Wiley & Sons: New York.

Comrey AL (1950). 'A Proposed Method for Absolute Ratio Scaling'. Psychometrika, 317-325.

Consterdine G (1988). Readership Research and the Planning of Press Schedules. Gower: Aldershot, England.

Converse JM & Schuman H (1974). Conversations at Random: Survey Research as Interviewers See It. John Wiley & Sons: New York.

Cooper BR & Layfield RE (1978). Nottingham Zones and Collar Study - Results of the "after" Surveys. Supplementary Report 365, Transport & Road Research Laboratory, Crowthorne UK.

Couch A & Keneston K (1960). 'Yeasayers and Naysayers: Agreement Response Set as a Personality Variable'. Journal of Abnormal and Social Psychology, 60, 2, 151-174.

Cuddon A (1993). The VATS Administration and Data Entry Programs. Presented at the 15th Conference of Australian Institutes of Transport Research, Melbourne.

Cundill MA (1979). A Comparative Analysis of Goods Vehicle Survey Data. Supplementary Report 465, Transport & Road Research Laboratory, Crowthorne UK.

Curtin RT (1981). The University of Michigan Survey of Consumers. Survey Research Center: University of Michigan.

Dalkey N & Helmer O (1963). 'An Experimental Application of the Delphi Method to the Use of Experts'. Management Science, 458-467.

Damm D (1980). 'Interdependencies in Activity Behavior'. Transportation Research Record, 750, 33-40.

Das M (1978). Collection and Dissemination of Transport Information in a Local Authority. Laboratory Report 815, Transport & Road Research Laboratory, Crowthorne UK.

Dasgupta M (1980). Manchester Travel-to-Work Survey: Survey Method and Preliminary Results. Supplementary Report 538, Transport & Road Research Laboratory, Crowthorne UK.

Davies P & Salter DR (1983). 'Reliability of Classified Traffic Count Data'. Transportation Research Record, 905, 17-27.

Davis JA (1971). Elementary Survey Analysis. Prentice-Hall: Englewood Cliffs, NJ.

Dawson RFF (1980). Before and After Studies in Roads and the Environment. Supplementary Report 537, Transport & Road Research Laboratory, Crowthorne UK.

deBono E (1967). The Use of Lateral Thinking. Penguin Books: Middlesex, UK.

deBono E (1972). Po: Beyond Yes and No. Penguin Books: Middlesex, UK.

deBono E (1988). Tactics: The Art and Science of Success. Fontana/Collins: London.

Deming WE (1953). 'On the Probability Mechanism to Attain an Economic Balance Between the Resultant Error of Non-Response and the Bias of Non-Response'. Journal of the American Statistical Association, 48, 743-772.

Deming WE (1960). Sample Design in Business Research. John Wiley & Sons: New York.

Di Renzo JF, Ferlis RA & Hazen PI (1977). 'Sampling Procedures for Designing Household Travel Surveys for Statewide Transportation Planning'. Transportation Research Record, 639, 37-43.

Dickey JW, Stuart RC, Walker RD, Cunningham MC, Winslow AG, Diewald WJ & Ding GD (1975). Metropolitan Transportation Planning. McGraw Hill: New York.

Dickinson KW & Waterfall RC (1984). 'Image Processing Applied to Traffic'. Traffic Engineering and Control, 25, 1, 6-67.

Difiglio C & Disbrow JA (1981). 'Impact of Travel Survey Sampling Error on Travel Demand Forecasting'. Transportation Research Record, 815, 31-40.

Dillman DA (1978). Mail and Telephone Surveys, The Total Design Method. John Wiley & Sons: New York.

Divey ST & Walmsley DA (1981). The Variability of Bus Passenger Flows and the Accuracy of Roadside Estimation. Presented at the Conference on Transport Surveys - Design for Efficiency, The University of Leeds, UK.

Dix MC (1975). Application of In-Depth Interviewing Techniques to the Study of Travel Behaviour: Some Preliminary Results. Working Paper 9, Transport Studies Unit, Oxford University.

Dix MC (1976). Unstructured Household Interviews as a means of Investigating the Travel Decision Process. Presented at the 8th University Transport Studies Group Conference, Aston, UK.

Dix MC (1981). Structuring our Understanding of Travel Choices: The Use of Psychometric and Social Research Techniques. In PR Stopher, AH Meyburg & W Brög (Eds), New Horizons Travel-Behaviour Research. Lexington Books, D.C. Heath & Co.: Lexington, Massachusetts.

Dixon CJ & Leach B (1979). Questionnaires and Interviews in Geographic Research. Concepts and Techniques in Modern Geography No 18, Study Group in Quantitative Methods, Institute of British Geographers, University of East Anglia, Norwich UK.

Dobson R (1976). Uses and Limitations of Attitudinal Modeling. In PR Stopher & AH Meyburg (Eds), Behavioral Travel-Demand Models. Lexington Books, D.C. Heath & Co.: Lexington, Massachusetts.

Donald MN (1960). 'Implications of Nonresponse for the Interpretation of Mail Questionnaire Data'. Public Opinion Quarterly, 24, 99-114.

Downes JD (1980). Life Cycle Changes in Household Structure and Travel Characteristics. Laboratory Report 930, Transport & Road Research Laboratory, Crowthorne UK.

Doxsey LB (1983). 'Respondent Trip Frequency Bias in On-Board Surveys'. Transportation Research Record, 944, 54-57.

Draper NR & Smith H (1968). Applied Regression Analysis. John Wiley & Sons: New York.

Dreisbach AF (1980). 'Small-scale, Ongoing Home-interview Survey in Pennsylvania'. Transportation Research Record, 779, 10-16.

Dumble PL (1978). An Appraisal of Australian Urban Transport Study Data - Home Interview Survey Data. Internal Report AIR 289-8, Australian Road Research Board, Melbourne.

Dumble PL (1978). An Initial Appraisal of Australian Urban Transport Study Data - Home Interview Survey Data. Proc. 9th Australian Road Research Board Conference, 6, 89-111.

Dumble PL (1980). Improving Home Interview Travel Survey Techniques. Proc. 10th Australian Road Research Board Conference, 5, 252-266.

Dunbar FC (1979). 'Use of Before-and-After Data to Improve Travel Forecasting Methods'. Transportation Research Record, 723, 39-46.

Dunphy RT (1979). 'Travel Data from the US Census: A New Foundation for Transportation Planning'. Transportation Research Record, 701, 22-25.

Dunphy RT (1979). 'Workplace Interviews as an Efficient Source of Travel Survey Data'. Transportation Research Record, 701, 26-28.

Durbin J & Stuart A (1954). 'An Experimental Comparison Between Coders'. Journal of Marketing, 19, 54-66.

Eash R (1987). 'Management of a Small Home Interview Travel Survey Using a Microcomputer'. Transportation Research Record, 1134, 49-56.

Effron B (1981). 'Nonparametric Estimates of Standard Error: The Jackknife, the Bootstrap and other Methods'. Biometrika, 68, 3, 589-599.

Effron B (1983). 'A Leisurely Look at the Bootstrap, the Jackknife and Cross-Validation'. American Statistician, 37, 1, 36-48.

Ehrendfeld S & Littauer S (1964). Introduction to Statistical Method. McGraw-Hill: New York.

Englisher LS (1986). 'Job Satisfaction and Transit Operator Recognition Programs: Results of a Survey of Mini Operators'. Transportation Research Record, 1078, 23-31.

Fahey RM & Chuck BB (1992). License Plate Videotaping as an Origin and Destination Survey Method Along Highway Corridors. In ES Ampt, AJ Richardson & AH Meyburg (Eds), Selected Readings in Transport Survey Methodology. Eucalyptus Press: Melbourne.

Faulkner HW (1982). Simulation Games as a Technique for Information Collection: A Case Study of Disabled Transport Use in Canberra. In T Beed & R Stimson (Eds), The Theory and Techniques of Interviewing. Hale and Iremonger: Sydney.

Ferlis RA (1979). 'Field Data Collection and Sampling Procedures for Measuring Regional Vehicle Classification and Occupancy'. Transportation Research Record, 701, 1-6.

Festinger L (1957). A Theory of Cognitive Dissonance. Stanford University Press: Stanford.

Fielding P (1985). Obtaining Bus Transit Service Consumption Information through Sampling Procedures. In ES Ampt, AJ Richardson & W Brög (Eds), New Survey Methods in Transport. VNU Science Press: Utrecht, The Netherlands.

Filion FL (1975). 'Estimating Bias due to Nonresponse in Mail Surveys'. Public Opinion Quarterly, 39, 4, 482-492.

Finney DJ (1965). Probit Analysis. Cambridge University Press: Cambridge, England.

Fishbein M (Ed) (1967). Readings in Attitude Theory and Measurement. John Wiley & Sons: New York.

Fisher GW (1986). 'Trip-Monitoring Survey: Downtown Travel Data Obtained by Following a Small Sample of Vehicles'. Transportation Research Record, 1050, 35-46.

Fisher RA (1936). 'The Use of Multiple Measurements in Taxonomic Problems'. Annals of Eugenics, 7, 2, 179-188.

Fishman GS (1973). Concepts and Methods in Discrete Event Digital Simulation. John Wiley & Sons: New York.

Fouracre PR & Sayer IA (1976). An Origin-Destination Survey in Malawi. Supplementary Report 232, Transport & Road Research Laboratory, Crowthorne UK.

Fouracre PR, Maunder DAC, Pathek MG & Rao CH (1981). Studies of Bus Operations in Delhi India. Supplementary Report 710, Transport & Road Research Laboratory, Crowthorne UK.

Frankel MR & Frankel LR (1979). 'Some Recent Developments in Sample Survey Design'. Journal of Marketing Research, 14, 3, 280-293.

Frankel MR (1989). 'Current Research Practices; General Population Sampling Including Geodemographics'. Journal of the Market Research Society, 31, 4.

Frey JH (1983). Survey Research by Telephone. Sage Publications: Beverley Hills, California.

Fuller CH (1974). 'Weighting to Adjust for Survey Nonresponse'. Public Opinion Quarterly, 38, 239-246.

Fulton PN (1984). 'Allocating Incomplete Place-of-Work Responses in the 1980 Census Urban Transportation Planning Package'. Transportation Research Record, 981, 15-21.

Galer M (1981). A Survey among Lorry Drivers about the Striking of Low Bridges. Supplementary Report 633, Transport & Road Research Laboratory, Crowthorne UK.

Galin D (1975). 'How Reliable are Mail-Surveys and Response Rates: A Literature Review'. Journal of Marketing Research, 12, 440-453.

Gallup GH (1947). 'The Quintamensional Plan of Question Design'. Public Opinion Quarterly, 11, 85-393.

Garber NJ & Bayat-Mokhtari F (1986). 'A Computerized Highway Link Classification System for Traffic Volume Counts'. Transportation Research Record, 1090, 1-8.

Gardner GJ (1976). Social Surveys for Social Planners. Holt, Rinehart and Winston: Sydney.

Gilbert D & Jessop A (1979). Error and Uncertainty in Transport Models. PTRC Summer Annual Meeting, Transportation Models Seminar, 122-157.

Glock VY (1967). Survey Research in the Social Sciences. Russell Sage Foundation: New York.

Golob TF & Dobson RP (1974). 'Assessment of Preferences and Perceptions Towards Attributes of Transportation Alternatives'. Transportation Research Board, Special Report 149, 58-81.

Golob TF & Golob JM (1989). Practical Consideration in the Development of a Transit Users Panel. Presented at the International Conference on Dynamic Travel Behaviour Analysis, Kyoto University, Japan.

Golob TF (1970). 'The Survey of User Choice of Alternative Transportation Modes'. High Speed Ground Transportation Journal, 4, 103-116.

Golob TF (1973). 'The Development of Attitudinal Models of Travel Behavior'. Highway Research Board, Special Report 143.

Golob TF, Canty ET, Gustafson RL & Vitt JE (1972). 'An Analysis of Consumer Preferences for a Public Transportation System'. Transportation Research, 6, 81-102.

Golob TF, Dobson R & Sheth JN (1973). Perceived Attribute Importance in Public and Private Transportation Proceedings. Presented at the 5th Annual Meeting of the American Institute for Decision Sciences, Washington, DC.

Golob TF, Horowitz AD & Wachs M (1979). Attitude-Behavior Relationships in Travel Demand Modeling. In DA Hensher & PR Stopher (Eds), Behavioural Travel Modelling. Croom Helm: London.

Goodwin PB (1989). 'Family Changes and Public Transport Use 1984-87: A Dynamic Analysis Using Panel Data'. Transportation, 16, 2, 121-154.

Gordon W & Langmaid R (1988). Qualitative Market Research: A Practitioner's and Buyer's Guide. Gower: Aldershot, England.

Goulias KG, Pendyala RM & Kitamura R (1992). Updating a Panel Survey Questionnaire. In ES Ampt, AJ Richardson & AH Meyburg (Eds), Selected Readings in Transport Survey Methodology. Eucalyptus Press: Melbourne.

Gray PG, Corlett T & Frankland P (1950). The Register of Electors as a Sampling Frame. Government Social Survey No M59, HMSO, London.

Greening PAK & Smith PG (1977). Conversion of Mapped Route Information into Digitized Data on Magnetic Tape. Supplementary Report 272, Transport & Road Research Laboratory, Crowthorne UK.

Greening PAK & Smith PG (1980). A Survey of Recreational Traffic in the Yorkshire Dales. Supplementary Report 539, Transport & Road Research Laboratory, Crowthorne UK.

Grigg AO & Huddart L (1978). Three Surveys of the Visual Intrusion of Roads in Rural Landscapes. Laboratory Report 861, Transport & Road Research Laboratory, Crowthorne UK.

Grigg AO & Huddart L (1979). An Opinion Survey of the Yorkshire Dales Rail Service in 1977. Laboratory Report 906, Transport & Road Research Laboratory, Crowthorne UK.

Grigg AO (1978). A Review of Techniques for Scaling Subjective Judgements. Supplementary Report 379, Transport & Road Research Laboratory, Crowthorne UK.

Grigg AO (1980). 'Some Problems concerning the Use of Rating Scales for Visual Assessment'. Journal of the Marketing Research Society, 22, 1, 29-43.

Griliches Z, Hall BH & Hausman JA (1978). 'Missing Data and Self-Selection in Large Panels'. Annales De L'Insee, 30, 31, 137-176.

Groves RM, Biemer PP, Lyberg LE, Massey JT, Nicholls WL & Waksberg J (Eds) (1988). Telephone Surveying Methodology. John Wiley & Sons: New York.

Guilford JP (1956). Psychometric Methods. McGraw-Hill: New York.

Gulliksen H (1956). 'A Least Squares Solution for Paired Comparisons with Incomplete Data'. Psychometrika, 21, 125-134.

Gunn HF & Whittaker JC (1981). Survey Design for Partial Matrix Data Sets. Presented at the Conference on Transport Surveys - Design for Efficiency, The University of Leeds, UK.

Gur YJ (1983). 'Estimating Trip Tables from Traffic Counts: Comparative Evaluation of Available Techniques'. Transportation Research Record, 944, 113-118.

Gurin DB (1976). 'Methods for Collecting Information about Traveler Subgroups'. Transportation Research Record, 617, 1-7.

Gustafson RL & Navin FPD (1973). 'User Preference for Dial-A-Bus'. Highways Research Board Special Report, 136, 85-93.

Guttman L (1950). The Basis for Scalogram Analysis. In AS Stouffer (Ed), Measurement and Prediction. Princeton University Press: Princeton, NJ.

Hamilton TD (1984). Public Transport Surveys and the Role of Micro Computers. Presented at the PTRC 12th Summer Annual Meeting, Brighton, UK.

Hanes RM (1949). 'The Construction of Subjective Brightness Scales from Fractional Data: A Validation'. Journal of Experimental Psychology, 39, 719-728.

Hansen MJ, Hurwitz WN & Madow WG (1966). Sample Survey Methods and Theory. John Wiley & Sons: New York.

Hardman EJ & Walker MJ (1979). Anglo-Scottish Car Travel from Surveys on the A74 Road near Lockerbie. Laboratory Report 856, Transport & Road Research Laboratory, Crowthorne UK.

Harman HH (1960). Modern Factor Analysis. University of Chicago Press: Chicago.

Hartgen DT & Keck CA (1974). Forecasting Dial-A-Bus Ridership in Small Urban Areas. Preliminary Research Report 60, New York State Department of Transportation, Albany.

Hartgen DT & Lemmerman JH (1983). 'Streamlining Collection and Processing of TrafficCount Statistics'. Transportation Research Record , 928, 11-20.

Hartgen DT & Tanner GH (1971). 'Investigation of the Effect of Traveller Attitudes in aModel of Mode-Choice Behavior'. Highway Research Record , 369, .

Hartgen DT (1970). Mode Choice and Attitudes : A Literature Review. Preliminary ResearchReport 21, New York State Department of Transportation, Albany.

Hauer E & Persuad B (1983). 'Common Bias in Before-and-After Accident Comparisons'.Transportation Research Record , 905, 164-174.

Hausman JA & Wise DA (1979). 'Attrition Bias in Experimental and Panel Data: the GaryIncome Maintenance Experiment'. Econometrica , 42, 7, 455-473.

Haywood PJ & Blackledge DA (1980). The West Midlands Passenger Transport ExecutiveContinuous On-bus Survey: How the Data is Used. Presented at the PTRC 8th SummerAnnual Meeting, University of Warwick, UK.

Heanue KE, Hamner LB & Hall RM (1965). 'Adequacy of Clustered Home InterviewSampling for Calibrating a Gravity Model Trip Distribution Formula'. HighwayResearch Record , 88, 116-136.

Heathcote E (1983). Survey Data Correction and Expansion: A Case Study. Presented at theSecond International Conference on Survey Methods in Transport, Hungerford Hill,Australia.

Heberlain TA & Baumgartner R (1978). 'Factors Affecting Response Rates to MailedQuestionnaires: A Quantitative Analysis of the Published Literature'. AmericanSociological Review , 43, 4, .

Hedges B & Hopkin JM (1981). Transport and the Search for Work: A Study in Greater Manchester. Supplementary Report 639, Transport & Road Research Laboratory, Crowthorne UK.

Hensher DA & Johnson LW (1980). Applied Discrete Choice Modelling. Croom Helm: London.

Hensher DA & Louviere JJ (1979). 'Behavioural Intentions as Indicators of Very Specific Behaviour'. Transportation, 8, 167-182.

Hensher DA & McLeod PB (1977). 'Towards an Integrated Approach to the Identification and Evaluation of the Transport Determinants of Travel Choice'. Transportation Research, 11, 77-93.

Hensher DA (1972). The Consumer's Choice Function: A Study of Traveller Behaviour and Values. PhD Thesis (unpublished), School of Economics, University of New South Wales.

Hensher DA (1982). The Automobile and the Future: Some Issues. Forum Papers, 7th Australian Transport Research Forum, 2, 727-772.

Hensher DA (1985). Longitudinal Surveys in Transport: An Assessment. In ES Ampt, AJ Richardson & W Brög (Eds), New Survey Methods in Transport. VNU Science Press: Utrecht, The Netherlands.

Hensher DA (1986). 'Dimensions of Automobile Demand: An Overview of an Australian Research Project'. Environment and Planning A, 18, 1336-1374.

Hensher DA (1991). 'Hierarchical Stated Response Designs and Estimation in the Context of Bus User Preferences: A Case Study'. Logistics and Transportation Reviews, 26, 4, 299-323.

Hirsch NA & Aplin WN (1978). National Travel Survey 1977-78: Preliminary statistical summary December Quarter 1977. Occasional Paper 22, Bureau of Transport Economics Australian Government Printing Service, Canberra.

Hirsch NA (1979). National Travel Survey 1977-78: Preliminary statistical summary June Quarter 1978. Occasional Paper 31, Bureau of Transport Economics Australian Government Printing Service, Canberra.

Hitlin RA, Spielberg F, Barber E & Andrle SJ (1987). A Comparison of Telephone and Door-to-Door Survey Results for Transit Market Research. Presented at the 66th Annual Meeting of the Transportation Research Board, Washington, DC.

Hoang LT & Poteat VP (1980). 'Estimating Vehicle Miles of Travel by Using Random Sampling Techniques'. Transportation Research Record, 779, 6-10.

Hoinville G & Jowell R (1978). Survey Research Practice. Gower: Aldershot, England.

Holcomb MC & Kosky S (1984). Transportation Energy Data Book (7th Edn). Oak Ridge National Laboratory: Tennessee.

Horowitz JL (1981). Sources of Error and Uncertainty in Behavioral Travel-Demand Models. In PR Stopher, AH Meyburg & W Brög (Eds), New Horizons in Travel-Behavior Research. Lexington Books, D.C. Heath & Co.: Lexington, Massachusetts.

Huddart L (1979). A Survey of Transport on Pleasure Trips from Newport Gwent. Supplementary Report 504, Transport & Road Research Laboratory, Crowthorne UK.

Huddart L (1981). Response to a Bus Service for Countryside Recreation: a Home Interview Survey. Laboratory Report 976, Transport & Road Research Laboratory, Crowthorne UK.

Hutchinson BG (1974). Principles of Urban Transport Systems Planning. Scripta Book Co: Washington, DC.

Hyman HH (1963). Survey Design and Analysis: Principles, Cases and Procedures. Free Press: New York.

Hyman HH (1962). Interviewing in Social Research. University of Chicago Press: Chicago.

Hyman HH (1972). Secondary Analysis of Sample Surveys: Principles, Procedures and Potentialities. John Wiley & Sons: New York.

Jenkins JJ, Russell WA & Suci GJ (1958). 'An Atlas of Semantic Profiles for 360 Words'. The American Journal of Psychology, 71, 688-699.

Jennings V, Richardson AJ, Gannon CA, Maher CJ & Kok J (1983). Analysis of Student Travel Patterns at Monash University. Report to Council, Monash University, Melbourne.

Jessop A & Gilbert D (1981). Sample Determination: A Note on some Problems of Specifying Precision Requirements. Presented at the Conference on Transport Surveys - Design for Efficiency, The University of Leeds, UK.

Johnson NL & Smith H (1969). New Developments in Survey Sampling. John Wiley & Sons: New York.

Jones C, Burich MC & Campbell B (1986). Motivating Interviewers and Respondents in Longitudinal Research Designs. Presented at the International Symposium on Panel Studies at American Statistical Association, Washington, DC.

Jones FN (1974). Overview of Psychophysical Scaling Methods. In EC Carterette & MP Friedman (Eds), Handbook of Perception, Vol. II, Psychophysical Judgment and Measurement. Academic Press: New York.

Jones PM & Polak JW (1992). Collecting Complex Household Travel Data by Computer. In ES Ampt, AJ Richardson & AH Meyburg (Eds), Selected Readings in Transport Survey Methodology. Eucalyptus Press: Melbourne.

Jones PM (1977). Assessing Policy Impacts using the Household Activity-Travel Simulator. Working Paper 18, Transport Studies Unit, Oxford University.

Jones PM (1979). 'HATS': A Technique for Investigating Household Decisions'. Environment and Planning A, 11, 1, 59-70.

Jones PM (1979). 'Methodology for Assessing Transportation Policy Impacts'. Transportation Research Record, 723, 52-58.

Jones PM (1980). 'Experience with Household Activity-Travel Simulator (HATS)'. Transportation Research Record, 765, 6-12.

Jones PM (1985). Interactive Travel Survey Methods: The State-of-the-Art. In ES Ampt, AJ Richardson & W Brög (Eds), New Survey Methods in Transport. VNU Science Press: Utrecht, The Netherlands.

Jones PM, Bradley M & Ampt E (1989). Forecasting Household Response to Policy Measures Using Computerised, Activity-Based Stated Preference Techniques. In International Association for Travel Behaviour (Ed), Travel Behaviour Research. Gower: Aldershot, England.

Jones PM, Dix MC, Clarke MI & Heggie IG (1983). Understanding Travel Behaviour. Gower: Aldershot, England.

Jones TSM (1977). Young Children and their School Journey: a Survey in Oxfordshire. Supplementary Report 342, Transport & Road Research Laboratory, Crowthorne UK.

Kahn RL & Cannell CF (1957). The Dynamics of Interviewing - Theory, Technique and Cases. John Wiley & Sons: New York.

Kaiser Transit Group (1982). The Dade County On-Board Transit Survey. Final Report to Dade County Transportation Administration, Miami, Florida.

Kanuk L & Berenson C (1975). 'Mail Surveys and Response Rates: A Literature Review'. Journal of Marketing Research, 12, 454-473.

Keller WJ & Metz KJ (1989). On the Impact of New Data Processing Techniques at the Netherlands Central Bureau of Statistics. BPA No 14707-88-M3, Central Bureau of Statistics, Netherlands.

Kendall MG & Smith BB (1939). Tables of Random Sampling Numbers. Cambridge University Press: Cambridge, England.

Kendall MG (1965). A Course in Multivariate Analysis. Charles Griffin & Co: London.

Kinsey AC, Pomeroy WB & Martin CE (1948). Sexual Behavior in the Human Male. Saunders: Philadelphia.

Kinsey AC, Pomeroy WB & Martin CE (1953). Sexual Behavior in the Human Female. Saunders: Philadelphia.

Kirby HR & Leese MN (1978). 'Trip-Distribution Calculations and Sampling Error: Some Theoretical Aspects'. Environment and Planning A, 10, 837-851.

Kish L (1965). Survey Sampling. John Wiley & Sons: New York.

Kish L & Frankel MR (1968). Balanced Repeated Replications for Analytical Statistics. Proceedings, American Statistical Association, Social Statistics Section, 2-10.

Kish L & Frankel MR (1970). 'Balanced Repeated Replications for Standard Errors'. Journal of the American Statistical Association, 65, 1071-94.

Knuth DE (1969). The Art of Computer Programming. Addison-Wesley: Reading, Massachusetts.

Koppelman FS & Chu C (1983). 'Effect of Sample Size on Disaggregate Choice Model Estimation'. Transportation Research Record, 944, 60-70.

Koppelman FS (1981). Uncertainty in Methods and Measurements for Travel-Behavior Models. In PR Stopher, AH Meyburg & W Brög (Eds), New Horizons in Travel-Behavior Research. Lexington Books, D.C. Heath & Co.: Lexington, Massachusetts.

Krantz DH & Tversky A (1971). 'Conjoint Measurement Analysis of Composition Rules in Psychology'. Psychological Review, 78, 151-169.

Kroes E & Sheldon R (1988). Are there any Limits to the Amount Consumers are Prepared to Pay for Product Improvements? Presented at the 15th PTRC Summer Annual Meeting, The University of Bath, UK.

Kurth DL (1986). 'A Small Sample Mail-Out/Telephone-Collection Travel Survey'. Transportation Research Record, 1097, 7-13.

Kuzmyak JR & Prensky S (1979). 'Use of Travel Diaries in Collection of Travel Data on the Elderly and Handicapped'. Transportation Research Record, 701, 36-38.

Lange J & Richardson C (1984). 'Identification of Transportation Systems Problems: Public Involvement through a Telephone Survey'. Transportation Research Record, 991, 9-15.

Layfield RE & Bardsley MD (1977). Nottingham Zones and Collar Study - Results of the Before Surveys. Supplementary Report 343, Transport & Road Research Laboratory, Crowthorne UK.

Lenntorp B (1978). A Time Geographic Simulation Model of Individual Activity Programmes. In Carlstein, Parkes & Thrift (Eds), Human Activity and Time Geography (Vol. 2). Edward Arnold: London.

Lepkowski JM (1988). Telephone Sampling Methods in the United States. In RM Groves, PP Biemer, LE Lyberg, JT Massey, WL Nicholls & J Waksberg (Eds), Telephone Surveying Methodology. John Wiley & Sons: New York.

Lerman SR & Manski CF (1979). 'Sample Designs for Discrete Choice Analysis of Travel Behavior: The State of the Art'. Transportation Research A, 13, 1, 29-44.

Levin IP (1979). Application of Attitude Measurement and Attitudinal Modeling Techniques in Transportation Research. Presented at the 4th International Conference on Behavioural Travel Modelling, Grainau, Germany.

Levin IP (1979). The Development of Attitudinal Modeling Approaches in Transportation Research. In DA Hensher & PR Stopher (Eds), Behavioural Travel Modelling. Croom Helm: London.

Levin IP, Louviere JJ, Meyer RJ & Henley DH (1979). Perceived versus Actual Modal Travel Times and Costs for The Work Trip. Technical Report 120, Institute of Urban and Regional Research, The University of Iowa.

Linsten D (1980). The NAASRA Data Bank System Study. Proc. 9th Australian Road Research Board Conference, 6, 3-16.

Lisco TE (1967). The Value of Commuter's Travel Time: A Study in Urban Transportation. PhD Thesis (unpublished), Department of Economics, University of Chicago.

Liss S (1984). 'Standard Census Products Related to Transportation Planning'. Transportation Research Record, 981, 5-11.

Liss S (1986). 'Nationwide Personal Transportation Study: Experiences with Previous Surveys and Options for the Future'. Transportation Research Record, 1097, 31-33.

Loeis M & Richardson AJ (1994). Estimation of Missing Incomes in Household Travel Surveys. TRC Working Paper TWP94/10, Transport Research Centre, Melbourne.

Lomax DE & Downes JD (1977). Patterns of Travel to School and Work in Reading in 1971. Laboratory Report 808, Transport & Road Research Laboratory, Crowthorne UK.

London County Council (1963). 1961 London Travel Survey Report. London County Council, London.

Louviere JJ, Wilson EM & Piccolo JM (1979). Psychological Modeling and Measurement in Travel Demand: A State-of-the-Art with Applications. In DA Hensher & PR Stopher (Eds), Behavioural Travel Modelling. Croom Helm: London.

Louviere JE, Henley D, Woodworth G, Meyer R, Levin I, Stoner J, Curry D & Anderson R (1981). 'Laboratory-Simulation versus Revealed-Preference Methods for Estimating Travel Demand Models'. Transportation Research Record, 794, 42-51.

Luce RD & Tukey JW (1964). 'Simultaneous Conjoint Measurement: A New Type of Fundamental Measurement'. Journal of Mathematical Psychology, 1, 1-27.

Lynn J & Jay A (1989). Yes Prime Minister. BBC Books: London.

Mackie AM & Griffin LJ (1977). Before and After Study of the Environmental Effects of Tring Bypass. Laboratory Report 746, Transport & Road Research Laboratory, Crowthorne UK.

MacKinder IH & Evans SE (1981). The Predictive Accuracy of British Transport Studies in Urban Areas. Supplementary Report 699, Transport & Road Research Laboratory, Crowthorne UK.

Maunder DAC, Fouracre PR, Pathak MG & Rao CH (1981). Household and Travel Characteristics in Two Residential Areas of Delhi. Supplementary Report 673, Transport & Road Research Laboratory, Crowthorne UK.

McCarthy PJ (1969). 'Pseudo-Replication: Half Sample'. Review of International Statistical Institute, 37, 239-264.

McDonnell JJ (1984). 'Transportation-Related Questions on the Decennial Census'. Transportation Research Record, 981, 3-5.

McFadden D & Domencich T (1975). Urban Travel Demand. North Holland Press: Amsterdam.

McGrath WR & Guinn C (1963). 'Simulated Home Interview by Television'. Highway Research Record, 41, 1-6.

McPherson LW, Heimbach CL & Goode LR (1983). 'Computerized Method for Updating Planning Data bases used in Travel Demand Forecasting'. Transportation Research Record, 928, 27-35.

Melway (1982). Greater Melbourne Street Directory. Melway Publishing Pty Ltd: Melbourne.

Memmott FW (1963). 'Home-Interview Survey and Data-Collection Procedures'. Highway Research Record, 41, 7-12.

Menneer P (1978). 'Retrospective Data in Survey Research'. Journal of the Market Research Society, 20, 3, 182-195.

Meurs HJ, van Wissen L & Visser J (1989). 'Measurement Biases in Panel Data'. Transportation, 16, 2, 175-194.

Meyburg AH & Brög W (1981). 'Validity Problems in Empirical Analyses of Non-home Activity Pattern'. Transportation Research Record, 807, 46-51.

Meyer RJ, Levin IP & Louviere JJ (1978). Functional Analysis of Mode Choice. Presented at the 57th Annual Meeting of the Transportation Research Board, Washington, DC.

Michaels RM (1974). 'Behavioral Measurement: An Approach to Predicting Travel Demand'. Transportation Research Board, Special Report 149, 51-57.

Miles JC & Hammond JM (1977). A Survey of Routes Taken by Motor Vehicles in the Lake District. Supplementary Report 264, Transport & Road Research Laboratory, Crowthorne UK.

Miles JC, Mitchell CGB & Perrett K (1981). Monitoring the Effects of the Tyne and Wear Metro. Supplementary Report 680, Transport & Road Research Laboratory, Crowthorne UK.

Miller GA (1956). 'The Magical Number Seven, Plus or Minus Two: Some Limits on our Capacity for Processing Information'. Psychological Review, 63, 2, 81-97.

Ministry Of Transport Victoria (1981). Melbourne Home Interview Travel Survey 1978-1979. F.D. Atkinson, Government Printer: Melbourne, Australia.

Moll JW & Russell DA (1978). National Travel Survey 1977-78 Determination of Regional Sample Sizes. Occasional Paper 18, Bureau of Transport Economics Australian Government Printing Service, Canberra.

Moll JW (1978). National Travel Survey 1977-78: Objectives and Strategies. Occasional Paper 10, Bureau of Transport Economics Australian Government Printing Service, Canberra.

Morlok EK (1978). Introduction to Transportation Engineering and Planning. McGraw-Hill: New York.

Morrell D (1978). Survey Sample Size and Planning Data Requirements for Trip Generation Models. Laboratory Report 821, Transport & Road Research Laboratory, Crowthorne UK.

Morris JM & Wigan MR (1979). 'A Family Expenditure Perspective on Transport Planning: Australian Evidence in Context'. Transportation Research A, 13, 4, 249-286.

Morton-Williams J (1993). Interviewer Approaches, SCPR, Social and Community Planning Research. Dartmouth Publishing Co: Aldershot, England.

Moser CA & Kalton G (1979). Survey Methods in Social Investigation (2nd Edn). Heinemann Educational Books: London.

Mostyn BJ & Sheppard D (1980). A National Survey of Drivers' Attitudes and Knowledge about Speed Limits. Supplementary Report 548, Transport & Road Research Laboratory, Crowthorne UK.

Mulligan PM & Horowitz AJ (1986). 'Expert Panel Method of Forecasting Land Use Impacts of Highway Projects'. Transportation Research Record, 1079, 9-16.

Murakami E & Pethick DR (1986). 'Puget Sound Council of Governments Origin-Destination Travel Survey, 1985'. Transportation Research Record, 1097, 23-31.

Murakami E & Watterson WT (1992). 'The Puget Sound Transportation Panel after Two Waves'. Transportation, 19, 2, 141-158.

Neffendorf H (1981). Advances in Computing for Survey Management and Processing. Presented at the Conference on Transport Surveys - Design for Efficiency, The University of Leeds, UK.

Newton I (1713). Philosophiae Naturalis Principia Mathematica (2nd Edn). London.

Neyman J & Pearson ES (1933). 'On the Problem of the Most Efficient Test in Statistical Hypothesis'. Philosophical Transactions Acta, 231, 289-337.

Nicholson FJ (1979). Cycle Routes in Portsmouth II - Traffic Studies. Laboratory Report 874, Transport & Road Research Laboratory, Crowthorne UK.

Nicolaidis GC (1977). 'Psychometric Techniques in Transportation Planning: Two Examples'. Environment and Behavior, 9, 4, 459-486.

Nie NH, Hull CH, Jenkins JG, Steinbrenner K & Bent DH (1975). SPSS: Statistical Package for the Social Sciences (2nd Edn). McGraw-Hill: New York.

Norris BB & Shunk GA (1986). 'Special-Purpose Travel Surveys'. Transportation Research Record, 1097, 1-23.

Ochoa DL & Ramsey GM (1986). 'Estimating Sampling Error for Cluster Sample Travel Surveys by Replicated Subsampling'. Transportation Research Record, 1090, 36-43.

Ohstrom EG, Ohstrom JB & Stopher PR (1984). 'Successful Administration of a Mailed 24-hour Travel Diary: A Case Study'. Transportation Research Record, 987, 14-20.

Oksenberg L & Cannell C (1988). Effects of Interviewer Vocal Characteristics on Non-Response. In RM Groves, PP Biemer, LE Lyberg, JT Massey, WL Nicholls & J Waksberg (Eds), Telephone Surveying Methodology. John Wiley & Sons: New York.

Olleman T, Howe S, Kloeber K & Cohen G (1978). 'Marginal Weighting of Transportation Survey Data'. Transportation Research Record, 677, 73-76.

Olssen ML & Kemp MA (1976). 'Urban Transportation and the Press: A Survey of Editorial Opinion'. Transportation, 5, 407-418.

Olzewski P, Wong Y-D, Polak JW & Jones PM (1994). Analysis of Travel Behaviour in Singapore. Centre of Transportation Studies, Nanyang Technical University, Singapore.

O'Neil MJ (1979). 'Estimating the Non-Response Bias due to Refusals in Telephone Surveys'. Public Opinion Quarterly, Summer, 218-232.

Openshaw S (1977). 'Optimal Zoning Systems for Spatial Interaction Models'. Environment and Planning A, 9, 169-184.

Oppenheim AN (1992). Questionnaire Design, Interviewing and Attitude Measurement. Pinter Publishers: London.

Ortuzar J deD & Willumsen LG (1994). Modelling Transport (2nd Edn). John Wiley & Sons: Chichester, UK.

Osgood CE, Suci GJ & Tannenbaum PH (1957). The Measurement of Meaning. University of Illinois Press: Urbana.

Owen DB (1962). Handbook of Statistical Tables. Addison-Wesley: Reading, Massachusetts.

Paine FT, Nash AN, Hille SJ & Brunner GA (1969). 'Consumer Attitudes towards Auto versus Public Transport Alternatives'. Journal of Applied Psychology, 53.

Pant PD & Bullen AGR (1980). 'Urban Activities, Travel and Time: Relationships from National Time-Use Survey'. Transportation Research Record, 750, 1-6.

Paravataneni R, Stopher PR & Brown C (1982). 'Origin-Destination Travel Survey for Southeast Michigan'. Transportation Research Record, 886, 1-8.

Parten MB (1965). Surveys, Polls and Samples: Practical Procedures. Harper and Row: New York.

Payne SL (1951). The Art of Asking Questions. Princeton University Press: Princeton, NJ.

Pearmain D, Swanson J, Kroes E & Bradley M (1991). Stated Preference Techniques: A Guide to Practice. Steer Davies and Gleave and Hague Consulting Group: London.

Peterson EB & Hamburg JR (1986). 'Travel Surveys: Current Options'. Transportation Research Record, 1097, 1-4.

Phifer SP (1982). 'Use of Sampling in Bus Line Data Collection'. Transportation Research Record, 877, 41-44.

Phifer SPM, Neveu AJ & Hartgen DT (1980). 'Family Reactions to Energy Constraints'. Transportation Research Record, 765, 12-16.

Phillips G & Blake P (1980). 'Estimating Total Annual Traffic Flow from Short Period Counts'. Transportation Planning and Technology, 6, 3, 69-174.

Pile B (1991). Setting up and Managing a Large-Scale Telephone Interviewing Facility. In Joint Centre for Survey Methods Newsletter, Telephone Surveys: The Current State of the Art. SCPR: London.

Plackett RL & Burmann JP (1943). 'The Design of Optimum Multifactorial Experiments'. Biometrika, 43, 353-360.

Polak J & Axhausen KW (1990). The Birmingham CLAMP Stated Preference Survey. Working Paper 502, Transport Studies Unit, Oxford University.

Polak J (1994). Towards an Experimental Paradigm for Travel Behaviour Research. Presented at the 26th University Transport Studies Group Conference, The University of Leeds, UK.

Polak J, Jones P, Stokes G, Payne G & Strachan J (1989). Computer-based Personal Interviews: A Practical Tool for Complex Travel Surveys. Presented at the PTRC 17th Summer Annual Meeting, Brighton, UK.

Politz A & Simmons W (1949). 'An Attempt to get the Not-at-homes into the Sample without Callbacks'. Journal of American Statistical Association, 44, 245, 9-32.

Poorman JP & Stacey DJ (1984). 'Self-Administered Mailback Household Travel Survey'. Transportation Research Record, 955, 10-17.

Pratkanis AR, Breckler SJ & Greenwald AG (Eds) (1989). Attitude Structure and Function. Lawrence Erlbaum Associates: Hove, East Sussex.

Prescott-Clarke P (1980). People and Roads in the Lake District: A Study of the A66 Road Improvement Scheme. Supplementary Report 606, Transport & Road Research Laboratory, Crowthorne UK.

Quenault SW (1979). Cycle Routes in Portsmouth III - Attitude Surveys. Laboratory Report 875, Transport & Road Research Laboratory, Crowthorne UK.

Ramsey B & Openshaw S (1980). A Method for Assessing the Risks in Making Transportation Investments due to Sampling and Modelling Errors. Presented at the PTRC 8th Summer Annual Meeting, University of Warwick, UK.

Rand Corporation The (1955). A Million Random Digits with 1,000,000 Normal Deviates. Free Press: New York.

Richardson AJ & Ampt ES (1993). South East Queensland Household Travel Survey - Final Report 4: All Study Areas. TRC Working Paper TWP93/6, Transport Research Centre, Melbourne.

Richardson AJ & Ampt ES (1993). The Victorian Integrated Travel Activities and Land-Use Toolkit. VITAL Working Paper VWP93/1, Transport Research Centre, Melbourne.

Richardson AJ & Ampt ES (1994). Non-Response Effects in Mail-Back Travel Surveys. Presented at the 7th International Conference on Travel Behaviour Research, Santiago, Chile.

Richardson AJ & Cuddon A (1994). Sample Selection for the Victorian Activities and Travel Survey. VITAL Working Paper VWP94/2, Transport Research Centre, Melbourne.

Richardson AJ & Graham NR (1982). 'Validation of a Signalised Intersection Survey Method'. Transportation Research Record, 841, 41-47.

Richardson AJ & Young W (1981). 'Spatial Relationships between Carpool Members' Trip Ends'. Transportation Research Record, 823, 1-7.

Richardson AJ (1974). A Multi-Modal Marginal Disutility Model for Journey to Work Travel Mode Choice. In DA Hensher (Ed), Urban Travel Choice and Demand Modelling. Australian Road Research Board Special Report No. 12: Melbourne.

Richardson AJ (1980). A Survey of Travel Patterns of the Elderly. Forum Papers, 6th Australian Transport Research Forum, 123-142.

Richardson AJ (1986). The Correction of Sample Bias in Telephone Interview Travel Surveys. Presented at the 65th Annual Meeting of the Transportation Research Board, Washington, DC.

Richardson AJ, Young W & Bowyer DP (1980). Carpooling and Geographic Structure: Survey Design and Administration. Civil Engineering Working Paper 80/6, Monash University, Melbourne.

Richardson LF (1929). 'Imagery, Conation and Cerebral Conductance'. Journal of General Psychology, 2, 324-352.

Rigby JP & Hyde PJ (1977). Journeys to School: a Survey of Secondary Schools in Berkshire and Surrey. Laboratory Report 766, Transport & Road Research Laboratory, Crowthorne UK.

Rigby JP (1977). An Analysis of Travel Patterns using the 1972/73 National Travel Survey. Laboratory Report 790, Transport & Road Research Laboratory, Crowthorne UK.

Rigby JP (1979). A Review of Research on School Travel Patterns and Problems. Supplementary Report 460, Transport & Road Research Laboratory, Crowthorne UK.

Ritchie SG (1986). 'A Statistical Approach to Statewide Traffic Counting'. Transportation Research Record, 1090, 14-22.

Rowe BC & Scheer M (1976). Computer Software for Social Science Data. Social Science Research Council: London.

Rule SJ (1971). 'Discriminability Scales of Number for Multiple and Fractional Estimates'. Acta Psychologica, 35, 328-333.

Sammer G & Fallast K (1985). Effects of Various Population Groups and of Distribution and Return Methods on the Return of Questionnaires and the Quality of Answers in Large-scale Travel Surveys. In ES Ampt, AJ Richardson & W Brög (Eds), New Survey Methods in Transport. VNU Science Press: Utrecht, The Netherlands.

Scott C (1961). 'Research on Mail Surveys'. Journal of Royal Statistical Society Series A, 124, 2, 149-151.

Segal MN (1982). 'Reliability of Conjoint Analysis: Contrasting Data Collection Procedures'. Journal of Marketing Research, 19, 139-144.

Shaw RN & Richardson AJ (1987). The Importance of Scanning in Supermarket Selection: a Comparison of Methods of Assessment. Presented at the 3rd Bi-Annual World Marketing Congress, Barcelona, Spain.

Sheffi Y & Tarem Z (1983). 'Experiments with Optimal Sampling for Multinomial Logit Model'. Transportation Research Record, 944, 141-148.

Sherret A (1971). Structuring an Econometric Model of Mode Choice. PhD Thesis (unpublished), Department of Environmental Engineering, Cornell University.

Sheskin IM & Stopher PR (1980). The Dual Survey Mechanism as a Device for Gauging the Non-Response Bias. Presented at the Annual Meeting of the Southeastern Division of the Association of American Geographers, Blacksburg, Virginia.

Sheskin IM & Stopher PR (1982). Pilot Testing of Alternative Administrative Procedures and Survey Instruments. Presented at the 61st Annual Meeting of the Transportation Research Board, Washington, DC.

Sheskin IM & Stopher PR (1982). 'Surveillance and Monitoring of a Bus System'. Transportation Research Record, 862, 9-15.

Sheskin IM, Spivack GS & Stopher PR (1981). 'The Dade County On-board Survey'. Transit Journal, Spring, 15-28.

Silvey J (1975). Deciphering Data: the Analysis of Social Surveys. Longman: London.

Simon J & Furth PG (1985). 'Generating a Bus Route O-D Matrix from On-Off Data'. Journal of Transportation Engineering Division ASCE, 111, 6, 583-593.

Singer B & Spillerman S (1976). 'Some Methodological Issues in the Analysis of Longitudinal Surveys'. Annals of Economic and Social Measurement, 5, 4, 447-474.

Skelton N (1982). 'A Method for Determining Minimum Sample Sizes when Two Means are to be Compared'. Traffic Engineering and Control, 23, 1, 29-37.

Slavik MM (1986). 'Errors in Origin-Destination Surveys done by Number-Plate Technique'. Transportation Research Record, 1050, 46-53.

Slonim MJ (1960). Sampling in a Nutshell. Simon and Schuster: New York.

Smith ME (1979). 'Design of Small-Sample Home-Interview Travel Surveys'. Transportation Research Record, 701, 29-35.

Smith RS & Wood JEA (1977). Memory - its Reliability in the Recall of Long Distance Business Travel. Supplementary Report 322, Transport & Road Research Laboratory, Crowthorne UK.

SOCIALDATA Australia (1987). A Data Base for the Evaluation of Road User Risk in Australia. CR51, Federal Department of Transport Federal Office of Road Safety, Canberra.

Sonquist JA (1977). Survey and Opinion Research: Procedures for Processing and Analysis. Prentice-Hall: Englewood Cliffs, NJ.

Sosslau AB & Brokke GE (1960). 'Appraisal of O-D Sample Size'. Public Roads, 31, 5, 114-119.

Spear BD (1977). Application of New Travel Demand Forecasting Techniques to Transportation Planning - A Study of Individual Choice Models. U.S. Department of Transportation: Washington, DC.

Spendlove J (1981). An Enquiry into a Short Term Road Safety Campaign in Schools. Supplementary Report 640, Transport & Road Research Laboratory, Crowthorne UK.

Stevens SS & Galanter EH (1957). 'Ratio and Category Scales for a Dozen Perceptual Continua'. Journal of Experimental Psychology, 54, 377-411.

Stevens SS (1956). 'The Direct Estimation of Sensory Magnitudes Loudness'. American Journal of Psychology, 69, 1-25.

Stevens SS (1967). 'On the Psychophysical Law'. Psychological Review, 64, 153-181.

Stevens SS (1974). Perceptual Magnitude and its Measurement. In EC Carterette & MP Friedman (Eds), Handbook of Perception, Vol. II, Psychophysical Judgment and Measurement. Academic Press: New York.

Stipak B (1973). 'An Analysis of the 1968 Rapid Transit Vote in Los Angeles'. Transportation, 2, 71-86.

Stopher PR & Banister D (1985). Total Design Concepts. In ES Ampt, AJ Richardson & W Brög (Eds), New Survey Methods in Transport. VNU Science Press: Utrecht, The Netherlands.

Stopher PR & Lavender JO (1972). Disaggregate Behavioral Demand Models: Empirical Tests of Three Hypotheses. 13th Transportation Research Forum Proceedings, 1, 321-336.

Stopher PR & Meyburg AH (1975). Urban Transportation Modeling and Planning. Lexington Books, D.C. Heath & Co: Lexington, Massachusetts.

Stopher PR & Meyburg AH (1979). Survey Sampling and Multivariate Analysis for Social Scientists and Engineers. Lexington Books, D.C. Heath & Co: Lexington, Massachusetts.

Stopher PR & Sheskin IM (1982). 'A Method for Determining and Reducing Non-Response Bias'. Transportation Research Record, 886, 35-41.

Stopher PR & Sheskin IM (1982). 'Toward Improved Collection of 24-Hour Travel Records'. Transportation Research Record, 891, 10-17.

Stopher PR (1982). 'Small Sample Home-interview Travel Surveys: Applications and Suggested Modifications'. Transportation Research Record, 886, 41-47.

Stopher PR (1983). 'Data Needs and Data Collection - State of the Practice'. Transportation Research Board, Special Report 201, 63-71.

Stopher PR (1985). The State-of-the-Art in Cross-Sectional Surveys in Transportation. In ES Ampt, AJ Richardson & W Brög (Eds), New Survey Methods in Transport. VNU Science Press: Utrecht, The Netherlands.

Stopher PR (1985). The Design and Execution of On-Board Bus Surveys: Some Case Studies. In ES Ampt, AJ Richardson & W Brög (Eds), New Survey Methods in Transport. VNU Science Press: Utrecht, The Netherlands.

Stopher PR (1992). 'Use of an Activity-Based Diary to Collect Household Travel Data'. Transportation, 19, 2, 159-176.

Stopher PR, Shillito L, Grober DT & Stopher HMA (1986). 'On-Board Bus Surveys: No Questions Asked'. Transportation Research Record, 1085, 50-57.

Stuart A (1976). Basic Ideas of Scientific Sampling (2nd Edn). Griffin: London.

Stuart RC (1980). 'Commercial Data Sources for Urban Transportation Planning'. Transportation Research Record, 779, 1-6.

Sudman S & Bradburn NM (1974). Response Effects in Surveys: a Review and Synthesis. Aldine: Chicago.

Sudman S (1967). Reducing the Cost of Surveys. Aldine: Chicago.

Sudman S (1976). Applied Sampling. Academic Press: New York.

Talvitie AP & Kirschner D (1978). 'Specification, Transferability, and the Effect of Data Outliers in Modeling the Choice of Mode in Urban Travel'. Transportation, 7, 3, 311-331.

Talvitie AP (1981). Inaccurate or Incomplete Data as a Source of Uncertainty in Econometric or Attitudinal Models of Travel Behavior. In PR Stopher, AH Meyburg & W Brög (Eds), New Horizons in Travel-Behavior Research. Lexington Books, D.C. Heath & Co.: Lexington, Massachusetts.

Tanner JC (1981). Methods of Forecasting Kilometres per Car. Laboratory Report 968, Transport & Road Research Laboratory, Crowthorne UK.

Taylor MAP & Young W (1988). Traffic Analysis: New Technologies and New Solutions. Hargreen Publishing Co: North Melbourne, Australia.

Thomas R & Eastman C (1981). A National Survey of Commercial Vehicle Movements: Sample Size and Reliability. Presented at the Conference on Transport Surveys - Design for Efficiency, The University of Leeds, UK.

Thoresen T (1983). Australian Road Statistics. Special Report 26, Australian Road Research Board, Melbourne.

Thurstone LL (1927). 'Psychophysical Analysis'. American Journal of Psychology, 38, 368-389.

Thurstone LL (1959). The Measurement of Values. University of Chicago Press: Chicago.

Tinter G (1952). Econometrica. John Wiley & Sons: New York.

Tischer ML (1981). Attitude Measurement: Psychometric Modeling. In PR Stopher, AH Meyburg & W Brög (Eds), New Horizons in Travel-Behavior Research. Lexington Books, D.C. Heath & Co.: Lexington, Massachusetts.

Torene R & Cannon J (1980). 'The 1977 Census of Transportation: an Update'. Transportation Research Record, 779, 16-21.

Torgerson WS (1958). Theory and Methods of Scaling. John Wiley & Sons: New York.

Transport Australia (1982). Transport Indicators. March Quarter 1982, Bureau of Transport Economics, Australian Government Printing Service, Canberra.

Trent RB & Pollard CR (1983). 'Individual Responses to Rising Gasoline Prices: A Panel Approach'. Transportation Research Record, 935, 33-45.

Tufte ER (1983). The Visual Display of Quantitative Information. Graphics Press: Cheshire, Connecticut.

Tufte ER (1990). Envisioning Information. Graphics Press: Cheshire, Connecticut.

Tukey JW (1977). Exploratory Data Analysis. Addison-Wesley: Reading, Massachusetts.

UK Department of Transport (1993). National Travel Survey 1989/91. HMSO: London.

UK Department of Transport (1993). Transport Statistics Great Britain 1993. HMSO: London.

University of Michigan (1976). Survey Research Center Interviewers Manual, (2nd Edn). The Institute for Social Research, University of Michigan: Ann Arbor, Michigan.

USDOT (1982). National Urban Mass Transportation Statistics Second Annual Report Section 15 Reporting System. Report No UMTA-MA-06-0107-82-1, Urban Mass Transportation Administration, Washington DC.

USDOT (1985). Aircraft Operating Cost and Performance Report. Vol XIX, US Dept of Transportation, Washington DC.

USDOT (1986). National Transportation Statistics Annual Report. Report No DOT-TSC-RSPA-86-3, Transportation Systems Center, Cambridge Mass.

van de Pol F (1987). Panel Sampling Designs. Proceedings of the Round Table Conference on the Longitudinal Travel Study, The Hague, The Netherlands, 51-81.

van Wissen LJG & Meurs HJ (1989). 'The Dutch Mobility Panel: Experiences and Evaluation'. Transportation, 16, 2, 99-120.

Velleman PF & Hoaglin DC (1981). Applications, Basics, and Computing of Exploratory Data Analysis. Duxbury Press: Boston.

Velleman PF & Velleman AY (1988). Data Desk Handbook. Odesta Corporation: Northbrook, Illinois.

Vollozo D & Attanucci J (1982). An Assessment of Automatic Passenger Counters - Interim Report. Report No DOT-I-82-43, Urban Mass Transportation Administration, Washington DC.

Waksberg J (1978). 'Sampling Methods for Random Digit Dialing'. Journal of the American Statistical Association, 73, 40-46.

Wall WD & Williams HL (1970). Longitudinal Studies and the Social Sciences. Heinemann Educational Books: London.

Walpole RE & Myers RH (1978). Probability and Statistics for Engineers and Scientists. Collier Macmillan: London.

Waltz EW & Grecco WL (1973). 'Evaluation of a Mailed Planning Survey'. Highway Research Record, 472, 92-107.

Warwick DP & Lininger CA (1975). The Sample Survey: Theory and Practice. McGraw-Hill: New York.

Wayne I (1975). 'Non-Response, Sample Size and the Allocation of Resources'. Public Opinion Quarterly, 39, 557-562.

Webb EJ, Campbell DT, Schwartz RD & Sechrest L (1966). Unobtrusive Measures: Nonreactive Research in the Social Sciences. Rand McNally: Chicago.

Webber JR (1980). Commercial Vehicle Surveys. Proc. 9th Australian Road Research Board Conference, 6, 312-319.

Webster SP (1989). 'If I Survey You Again Today, Will You Still Love Me Tomorrow'. Academic Computing, 3, 6, 14-18, 46-48.

Wermuth MJ (1981). Effects of Survey Methods and Measurement Techniques on the Accuracy of Household Travel-Behavior Surveys. In PR Stopher, AH Meyburg & W Brög (Eds), New Horizons in Travel-Behavior Research. Lexington Books, D.C. Heath & Co.: Lexington, Massachusetts.

Wermuth MJ (1985). Errors Arising from Incorrect and Incomplete Information in Surveys of Non-home Activity Patterns. In ES Ampt, AJ Richardson & W Brög (Eds), New Survey Methods in Transport. VNU Science Press: Utrecht, The Netherlands.

Wermuth MJ (1985). Non-sampling Errors due to Non-response in Written Household Travel Surveys. In ES Ampt, AJ Richardson & W Brög (Eds), New Survey Methods in Transport. VNU Science Press: Utrecht, The Netherlands.

Wickstrom GV (1984). 'Supplementing Census Data for Transportation Planning'. Transportation Research Record, 981, 82-86.

Wigan MR & Cullinan M (1984). Machine Vision and Road Research: New Tasks, Old Problems. Proc. 12th Australian Road Research Board Conference, 4, 76-86.

Wigan MR (1985). The Secondary Use of Transport Survey Data. In ES Ampt, AJ Richardson & W Brög (Eds), New Survey Methods in Transport. VNU Science Press: Utrecht, The Netherlands.

Wigan MR (1988). Australian Personal Travel Characteristics. Special Report 38, Australian Road Research Board, Melbourne.

Wilbur Smith & Associates & P-E Consulting (1977). Hull Freight Study: Collection of Data and Construction of Computer Model. Supplementary Report 315, Transport & Road Research Laboratory, Crowthorne UK.

Wright CC (1980). A Traffic Routing Survey: Results and Evaluation of Technique. Supplementary Report 568, Transport & Road Research Laboratory, Crowthorne UK.

Yates FS (1971). Sampling Methods for Censuses and Surveys. Charles Griffin & Co: London.

Yelich BJ, Erlbaum NS & Koeppel KWP (1983). 'Transportation Energy and Related Data Collection at State and Substate Level'. Transportation Research Record, 928, 20-27.

Young PV (1966). Scientific Social Surveys and Research, (4th Edn). Prentice-Hall: Englewood Cliffs, NJ.

Young W, Morris JM & Ogden KW (1980). Developing and Administering a Home Interview Survey. Internal Report AIR301-1, Australian Road Research Board, Melbourne.

Appendix A

Survey Design Checklist

This document should be used as a checklist to ensure that all aspects of the survey design process have been addressed.

In most cases, a short answer will suffice to demonstrate that the particular point has, indeed, been addressed. In other cases, more detailed information may be provided, or attached, and this checklist may then serve as a useful shorthand form of survey documentation. It may also be used as a guideline for writing more comprehensive documentation for the survey.

Where appropriate, page references are given to a text on Travel Survey Methods (Richardson, Ampt and Meyburg (1995), Survey Methods for Transport Planning, Eucalyptus Press: Melbourne) to help you understand what is required to be answered in this Checklist.


Section 1 Preliminary Planning

Administrative Details of the Survey

The name of the survey?

Who sponsored the survey?

Who designed the survey?

Who collected the survey data?

Who analysed the survey data?

Was there an advisory committee? YES NO

If so, who was on the Committee?

Dates and duration of the survey?


Overall Study Objectives (p17)

What were the objectives of the project to which this survey contributed?

Why was a survey needed?

Specific Survey Objectives (p24)

What were the specific objectives of this survey?


Review of Existing Information (p26)

What prior information was available?

What secondary information was available for sample expansion?

Formation of Hypotheses (p28)

What specific hypotheses, if any, were to be tested?


Definition of Terms (p28)

What definitions are being used by the survey team for key items such as:

trip, household, mode, income etc. (as relevant to the specific survey)?

Determination of Survey Resources (p30)

What time was available for completion of the survey?

How much money was available for the survey?

What people were available to work on the survey?


Section 2 Selection of Survey Method

Selection of Survey Time Frame (p34)

Was the survey cross-sectional or time-series (and why)?

Cross-sectional Time Series

Why?

Selection of Survey Technique (p42)

What methods were considered for the survey technique?

What testing was performed on the different methods?

What method was finally selected (and why)?


Section 3 Sample Design

Definition of Target Population (p75)

What was the population for the survey?

How was this population defined and identified?

Sampling Units (p76)

What unit was used for sampling?

Sampling Frame (p77)

What sampling frame was used?

Where was the sampling frame obtained from?

How was the sampling frame obtained?


Why was the sampling frame first compiled?

How did the sampling frame perform in terms of:

accuracy

completeness

duplication

adequacy

up-to-dateness

Sampling Method (p80)

What sampling methods were considered?

What sampling method was finally chosen (and why)?


Was the selected sample representative of the population?

If not, how will this be corrected later?

What was the specific sampling procedure (full details)?


Consideration of Sampling Bias (p96)

What sources of sampling bias were considered?

How serious were these biases considered to be?

What steps were taken to overcome these sources of bias?


Sample Size and Composition (p101)

What was the final sample size?

What stratifications were used in the sample design?

How was the sample size calculated?

what were the key variables considered?

what was the variability of these variables?

what confidence limits were used?

what levels of confidence were used?
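
As an illustration of how the answers to these sub-questions combine, the following minimal Python sketch computes the sample size needed to estimate a mean under simple random sampling. It is an example under stated assumptions rather than the procedure prescribed by the text: the function name `sample_size_for_mean`, the input values, and the use of a normal approximation with an optional finite population correction are all hypothetical choices made for illustration.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(std_dev, half_width, confidence=0.95, population=None):
    """Sample size so that the mean is estimated within +/- half_width.

    Assumes simple random sampling and a normal approximation.
    std_dev    -- estimated standard deviation of the key variable
    half_width -- acceptable confidence limit (same units as std_dev)
    confidence -- level of confidence, e.g. 0.95
    population -- population size for the finite population correction (optional)
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)  # two-sided z value
    n = (z * std_dev / half_width) ** 2
    if population is not None:
        n = n / (1.0 + n / population)  # finite population correction
    return math.ceil(n)


# Hypothetical key variable: trips per person per day, standard deviation 2.0,
# to be estimated within +/- 0.2 trips at the 95% level of confidence.
n_large_population = sample_size_for_mean(2.0, 0.2)
n_small_population = sample_size_for_mean(2.0, 0.2, population=5000)
```

Recording the inputs alongside the result (key variable, its variability, the confidence limit and the level of confidence) answers each of the four sub-questions directly.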


Estimation of Parameter Variances (p126)

How are parameter variances to be estimated in the data analysis?

Conduct of Sampling (p142)

What procedure was used in selecting the sample?

Was random sampling used at all stages of sampling?
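
One way to document the selection procedure is to record the sampling frame, the sample size and the random seed, so that the draw can be repeated. The sketch below is hypothetical: `draw_simple_random_sample`, the frame of household identifiers and the seed value are illustrative assumptions, and it shows only simple random sampling without replacement, not the stratified or multi-stage designs that a particular survey may use.

```python
import random


def draw_simple_random_sample(frame, sample_size, seed):
    """Draw a reproducible simple random sample without replacement."""
    units = list(frame)
    if sample_size > len(units):
        raise ValueError("sample size exceeds the sampling frame")
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(units, sample_size)


# Hypothetical sampling frame of 1000 household identifiers.
frame = [f"HH{i:04d}" for i in range(1, 1001)]
sample = draw_simple_random_sample(frame, 50, seed=1995)
```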


Section 4 Survey Instrument Design

Question Content (p151)

What types of information are being sought in the survey?

Trip Recording Techniques (p155)

How are trips and activities being sought from respondents?

Physical Nature of Forms (p159)

What is the physical nature of the survey forms?

what paper size and weight was used?

what colours and printing methods were used?


Question Types (p166)

What classification questions were asked?

where did the classification categories come from?

What attitude questions were asked?

what testing was performed on the attitude scales?


Question Format (p187)

Which questions were asked as open questions (and why)?

Which questions were asked as closed questions (and why)?

where did the closed question categories come from?


Question Wording (p194)

How has the question wording been tested for:

simple vocabulary YES

words appropriate to the audience YES

length of questions YES

ambiguous questions (get someone else to read them) YES

double-barrelled questions YES

vague words YES

loaded questions YES

leading questions YES

double negatives YES

stressful questions YES

grossly hypothetical questions YES

the effect of response styles YES

periodicity questions YES

Question Ordering (p205)

What reasons are there for the question ordering?


Question Instructions (p207)

What instructions were provided for respondents/interviewers?


Section 5 Pilot Survey(s)

Description of Pilot Surveys (p214)

What pilot testing was performed?

If no pilot testing was done, why not?

Size of the Pilot Survey (p222)


Lessons from the Pilot Survey (p216)

How adequate was the sampling frame?

What was the variability within the survey population?

What response rate was achieved?

How suitable was the survey method?

How well did the questionnaire perform?

How effective was the interviewer training?

Did the coding, data entry, editing and analysis procedures work satisfactorily?

Cost and Duration of Pilot Surveys (p221)

Cost:

Duration:


Section 6 Administration of the Survey

Survey Procedures

Self-Completion Questionnaires (p239)

- pre-contact procedures

- mail-out procedures

- response receipt procedures

- phone enquiry procedures

- postal reminder regime

- telephone follow-ups

- validation interviews

- non-response interviews


Personal Interviews (p246)


- call-back procedures

- maintenance of survey logs

- interviewer payment methods

- field supervisor tasks

- work distribution procedures


Telephone Interviews (p251)

- sampling procedures

- dealing with non-response

- use of CATI systems


Intercept Surveys (p255)

- procedures for obtaining respondents

- distribution of surveys

- collection of surveys


In-depth Interview Surveys (p258)


- call-back procedures

- maintenance of survey logs

- recording methods

- transcription methods

- interpretation of responses


Section 7 Data Coding

Selection of Coding Method (p266)

What physical method was used for data coding?

Preparation of Code Format (p268)

What coding frame was used?

(provide full coding frame in Appendix)

What location-coding method was used?

Development of Data Entry Programs (p292)

What special data entry programs were developed?

(provide screen-shots of data entry screens in Appendix)


Coder and Data Entry Training (p290)

What training was provided for coders and data enterers?

(provide training manual in Appendix)

Coding Administration (p290)

How was the coding administered?

What quality control procedures were implemented?

How were changes made to coding frames?


Section 8 Data Editing

Initial Questionnaire Editing (p299)

What in-field checking was done by interviewer/supervisor?

What checking was done on receipt in survey office?

Verification of Data Entry (p299)

Was data entry verified for accuracy? YES NO

Development of Editing Computer Programs (p299)

Were special data editing programs developed? YES NO

If so, in what language were they written?

Consistency and Range Checks (p299)

What permissible range checks were applied?


What logic checks were applied?
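
To make the distinction between the two kinds of check concrete, the following Python sketch applies a few range checks and logic checks to a single trip record. The field names, the permissible ranges and the minimum driving age are hypothetical assumptions chosen for illustration, and the sketch assumes a diary day in which trips do not cross midnight; an actual editing program would take its checks from the survey's own coding frame.

```python
def check_trip_record(record):
    """Return a list of edit-check messages for one trip record.

    The field names and permissible ranges are hypothetical.
    """
    problems = []

    # Range checks: each value must lie within its permissible range.
    if not 0 <= record["start_minute"] < 1440:
        problems.append("start time outside 00:00-23:59")
    if not 0 <= record["end_minute"] < 1440:
        problems.append("end time outside 00:00-23:59")
    if not 0 <= record["age"] <= 110:
        problems.append("age outside 0-110")

    # Logic checks: values must be consistent with one another.
    if record["end_minute"] < record["start_minute"]:
        problems.append("trip ends before it starts")
    if record["mode"] == "car driver" and record["age"] < 16:
        problems.append("car driver below assumed minimum driving age")

    return problems


# A consistent record and a record that fails two logic checks.
clean = {"start_minute": 480, "end_minute": 505, "age": 34, "mode": "car driver"}
suspect = {"start_minute": 600, "end_minute": 590, "age": 9, "mode": "car driver"}
clean_problems = check_trip_record(clean)
suspect_problems = check_trip_record(suspect)
```

Listing the checks in this form also gives a ready answer to the two questions above, since each rule can be copied into the survey documentation.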



Missing Data (p299)

How was missing data coded?

Were estimates made of missing values?


Section 9 Data Correction and Expansion

Editing Check Corrections (p299)

What procedures were used for office edits?

Secondary Data Comparisons (p307)

What secondary data was used for sample expansion?

What variables were used for expansion purposes?

Was expansion based on cross-tabulations or marginal totals?

Cross-tabs Marginals

What were the final expansion factors?

(provide full list of expansion factors in Appendix)

How are they to be applied when using the data?
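
As a simple illustration of these questions, the Python sketch below computes an expansion factor for each stratum as the ratio of the population count in the secondary data to the number of responses in the sample, and then applies the factors as weights. The function name `expansion_factors`, the household-size strata and all of the counts are hypothetical; the sketch shows expansion on a single variable and is not a substitute for the full procedure described in the text.

```python
def expansion_factors(population_counts, sample_counts):
    """Expansion factor per stratum = population count / sample count."""
    factors = {}
    for stratum, population in population_counts.items():
        sampled = sample_counts.get(stratum, 0)
        if sampled == 0:
            raise ValueError(f"no sample records in stratum {stratum!r}")
        factors[stratum] = population / sampled
    return factors


# Hypothetical secondary (e.g. census) counts and survey responses by household size.
population = {"1 person": 12000, "2 persons": 15000, "3+ persons": 18000}
sample = {"1 person": 100, "2 persons": 150, "3+ persons": 120}
factors = expansion_factors(population, sample)

# Applying the factors: each responding household is weighted by its factor,
# so the weighted sample reproduces the population total.
expanded_total = sum(factors[s] * sample[s] for s in sample)
```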


Corrections for Internal Biases

What recognition was there of non-reported data? (p313)

Were non-reporting factors calculated? YES NO

If so, how are they to be applied to the data?

What recognition was there of non-response? (p321)

Were non-response factors calculated? YES NO

If so, how are they to be applied to the data?


Section 10 Data Analysis and Management

Exploratory Data Analysis (p339)

What EDA methods were used?

Model Building (p346)

Is the data to be used to build specific models? YES NO

If so, what type?

Interpretation of Results (p395)

Are any limitations on the data clearly stated? YES NO

How?

How is the sampling error expressed?


Database Management (p268)

Is the structure of the datafiles clearly described? YES NO

(provide full list of datafiles in Appendix)

Are the relationships between data files clear? YES NO

(provide full set of relationships in Appendix)

Provision of Data Support Services (p402)

What support is available for users of the data?

Is it clear where such support can be obtained? YES NO


Section 11 Presentation of Results

Presentation of Results of Analysis (p396)

Are the major descriptive results presented:

- in a clear visual manner? YES NO

- with accompanying written explanations? YES NO

- with appropriate interpretations? YES NO

- and with clear statement of any qualifications? YES NO

Publication of Results (p402)

Are the results of the survey or the survey methodology written up in concise form, and available in the general literature?

YES NO

If so, where?


Section 12 Tidying-Up

Storage and Archival of Data (p412)

Where is the data stored?

Who is the contact person?

Name:

Position:

Organisation:

Address:

Are telephone, fax and e-mail numbers provided?

Telephone:

Fax:

e-mail:

Is this documentation stored electronically with the data?

YES NO

Has the data been lodged with any other archival service?

YES NO

Have all survey staff been fully paid? YES NO

Have all outstanding bills been paid? YES NO

What arrangements have been made for destroying original questionnaires?
