A Review of (Total) Survey Error Models

William D. Kalsbeek
Survey Research Unit
University of North Carolina

Purpose

To review the following for existing total survey error (TSE) models:

• Composition and Structure
• Presentation
• Utility

Presentations of TSE Models

• TSE Model (a Definition): *
  – A postulation to understand or predict, by theory or simulation, the properties or behavior of the survey process
• Presentations of TSE:
  – Practical:
    • Process origins; plus statistical nature, impact, measurement and/or control of error
  – Theoretical:
    • A formulary (usually MSE-based)

* Based on Kotz, et al. (1981-89).

Thesis

TSE Models

• Have organized our thinking on the statistical effects of error sources

But

• Translation of this understanding into practical improvement has been limited and largely marginalized to individual sources of error


Thesis

For the Future:

• Greater research emphasis on TSE components and application of TSE findings for a broader array of data systems?

• Model re-direction needed?

Sources of Error *

• Sampling
• Frame
• Measurement
• Nonresponse (Unit/Item)

* One might also view the underlying stochastic model responsible for the data array in model-based inference as a source of error

A Review of TSE Presentations

• Tracking presentations for 2+ sources
• Structural basis
  – Various decompositions of MSE
• Grouping by number of sources and:
  – Type of presentation (practical/theoretical)
  – Source interrelationship (separate/integrated)
• Question:
  – Which parts of the survey process have TSE models accommodated?
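As a hedged illustration of what an MSE-based decomposition can look like, the sketch below writes the mean squared error of an estimator as variance plus squared bias, and then splits the bias by error source. The notation is assumed for illustration and is not taken from any of the reviewed models: \hat{\theta} is a survey estimator of a population quantity \theta, and B_S, B_F, B_M, and B_{NR} are the biases attributed to sampling, frame, measurement, and nonresponse, under the simplifying assumption that biases are additive across sources.

```latex
\begin{align*}
\mathrm{MSE}(\hat{\theta})
  &= E\big[(\hat{\theta} - \theta)^2\big]
   = \mathrm{Var}(\hat{\theta}) + \big[B(\hat{\theta})\big]^2, \\
B(\hat{\theta})
  &= E(\hat{\theta}) - \theta
   = B_{S} + B_{F} + B_{M} + B_{NR}.
\end{align*}
```

Because the total bias is squared as a whole, biases of opposite sign from different sources can partly cancel, which is one reason a source-by-source treatment can misstate total error.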

Sources of Error

1. Sampling
2. Frame
3. Measurement
4. Nonresponse (Unit/Item)


Washington Nationals: Season starts: 4/4/05 (at Phillies) Home opener: 4/14/05 (Diamondbacks)

[Figure sequence: "Around the Horn" diagram of Total Survey Error (TSE), built up across successive slides by adding the error sources Sampling (S), Measurement (M), Frame (F), and Nonresponse (NR), the last split into Unit (U) and Item (I). Later builds are labeled Variances, Interfaces, and Biases (additive); the final build is titled "A Home Run".]

Two-Source Theoretical (Integrated):

• Nonresponse Bias
  – Hansen and Hurwitz (1946)
  – Several extensions to more complex sample designs
    • El-Badry (1956)
    • Rao (1968, 1973)
    • Rao and Hughes (1983)
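To make the sampling-plus-nonresponse case concrete, here is a minimal sketch under stated assumptions. It uses the standard deterministic view in which the population splits into fixed respondent and nonrespondent groups, so the bias of the unadjusted respondent mean is the nonrespondent share times the difference in group means. The second function is a double-sampling style estimator in the spirit of Hansen and Hurwitz (1946), which weights the respondent mean and the mean of a followed-up subsample of nonrespondents by their shares of the initial sample. Function names and the example figures are hypothetical, and this is not claimed to be the exact formulation of any cited paper.

```python
def respondent_mean_bias(mean_resp: float, mean_nonresp: float, resp_rate: float) -> float:
    """Bias of the unadjusted respondent mean for the full-population mean.

    Assumes a fixed split of the population into respondents and
    nonrespondents (deterministic response model).
    """
    if not 0.0 < resp_rate <= 1.0:
        raise ValueError("resp_rate must be in (0, 1]")
    return (1.0 - resp_rate) * (mean_resp - mean_nonresp)


def followup_adjusted_mean(n_resp: int, mean_resp: float,
                           n_nonresp: int, mean_followup: float) -> float:
    """Combine respondents with a followed-up subsample of nonrespondents.

    Each group's mean is weighted by its share of the initial sample.
    """
    n = n_resp + n_nonresp
    if n <= 0:
        raise ValueError("initial sample must be non-empty")
    return (n_resp * mean_resp + n_nonresp * mean_followup) / n


# Hypothetical figures: 70% respond with mean 52; nonrespondents average 40.
bias = respondent_mean_bias(mean_resp=52.0, mean_nonresp=40.0, resp_rate=0.7)
adjusted = followup_adjusted_mean(n_resp=700, mean_resp=52.0,
                                  n_nonresp=300, mean_followup=40.0)
print(round(bias, 6), round(adjusted, 6))
```

With these illustrative numbers the respondent mean overstates the population mean of 48.4 by about 3.6, and the follow-up weighting recovers 48.4 when the subsample mean matches the nonrespondent mean.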

• Measurement Error Model
  – Hansen, et al. (1951a, 1951b, 1961, and 1964)
  – Subsequent work by others at the Census Bureau
  – Forsman (1989) review

• Multiplicity Estimators:
  – Birnbaum and Sirken (1965)
  – Several subsequent papers by Sirken, et al.
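The general idea behind multiplicity estimation can be sketched as follows, under stated assumptions: each population unit may be reportable by more than one sampled reporting unit, so each report is down-weighted by the unit's multiplicity (the number of reporting units linked to it) before the usual inverse-probability weighting. The function name, data layout, and example links below are hypothetical, and the sketch is a generic weight-by-1/multiplicity estimator rather than the exact form in Birnbaum and Sirken (1965).

```python
from typing import Dict, Hashable, Iterable, Optional, Sequence, Tuple


def multiplicity_total(
    sample: Iterable[Tuple[float, Sequence[Hashable]]],
    multiplicity: Dict[Hashable, int],
    y: Optional[Dict[Hashable, float]] = None,
) -> float:
    """Estimate a population total from multiply-linked reports.

    sample:       (inclusion probability, reported unit ids) per sampled reporter
    multiplicity: number of reporters linked to each population unit
    y:            value per population unit; defaults to 1.0 (a count)
    """
    total = 0.0
    for incl_prob, reported in sample:
        if not 0.0 < incl_prob <= 1.0:
            raise ValueError("inclusion probability must be in (0, 1]")
        share = 0.0
        for unit in reported:
            value = 1.0 if y is None else y[unit]
            # Each report carries only 1/multiplicity of the unit's value.
            share += value / multiplicity[unit]
        total += share / incl_prob
    return total


# Hypothetical links: person p2 is reportable by two households.
links = {"p1": 1, "p2": 2, "p3": 1}
census = [(1.0, ["p1", "p2"]), (1.0, ["p2"]), (1.0, ["p3"])]
census_count = multiplicity_total(census, links)
print(census_count)
```

In the example, enumerating every reporter with probability 1 returns the true count of three persons, even though one person is reported twice.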

• Model-Based Inference with Missing Data
  – Little (1995)
  – Little and Rubin (2002)

Three-Source Theoretical (Integrated):

• Platek, et al. (1977, 1983)
• Lessler (1983)

All-Source Practical (Separate):

• Following Kish (1965)
  – Anderson, et al. (1979)
  – Groves (1989)
  – Groves, et al. (2004)
• Federal Committee on Statistical Methodology
  – FCSM (2001)
  – Kasprzyk & Giesbrecht (2003)
  – Other error profiles by Bailar and colleagues for Census statistics

All-Source Theoretical (Separate):

• Lessler and Kalsbeek (1992)
• Särndal, Swensson, and Wretman (1992)

All-Source Theoretical (Integrated):

• A general model appended to Lessler and Kalsbeek (1992)

Utility of Existing Models

• Provides a theoretical basis in survey practice to:
  – Structure our thinking
  – Motivate preventive strategies
  – Suggest process quality indicators
  – Suggest measurement approaches
  – Catalog empirical findings

Limitations of Existing Models *

• Compartments and smokestacks
  – Marginalized treatment of error sources
• Plausibility and complexity
  – Inverse relationship between proximity to reality and complexity
• Context and comparability
  – Breadth of model utility
• Lack of Attention
  – Priorities and cost

* Inspiration and insight from Platek and Särndal (2001)

Questions for the Future

• More emphasis on studying and minimizing TSE?
  – For the major and minor leagues
• Greater integration of TSE and practice?
  – Cataloging and lessons learned
• New directions in TSE model structure?
  – All sources jointly TSE
  – Action-directed models TQM?
  – More process indicators
