New Experiments on the Design of Complex Survey Questions. Paul Beatty, National Center for Health Statistics. Collaborators: Jack Fowler and Carol Cosenza, Center for Survey Research, University of Massachusetts-Boston.
![Page 1: New Experiments on the Design of Complex Survey Questions](https://reader036.vdocument.in/reader036/viewer/2022062323/56815bda550346895dc9cb0c/html5/thumbnails/1.jpg)
New Experiments on the Design of Complex Survey Questions
Paul Beatty, National Center for Health Statistics
Collaborators: Jack Fowler and Carol Cosenza, Center for Survey Research, University of Massachusetts-Boston
![Page 2: New Experiments on the Design of Complex Survey Questions](https://reader036.vdocument.in/reader036/viewer/2022062323/56815bda550346895dc9cb0c/html5/thumbnails/2.jpg)
Optimal structure and presentation of explanatory material in survey questions

- Many survey questions are complex, particularly on behavioral surveys
- This complexity is driven by:
  - The desire for very specific data points
  - The need to collect data as efficiently as possible (i.e., single questions if possible)
- A few common practices:
  - Presentation of material that follows the question mark
  - The use of examples to illustrate complex concepts
  - Detailed wording to capture relatively rare events
- What alternatives do we have? Are they better?
![Page 3: New Experiments on the Design of Complex Survey Questions](https://reader036.vdocument.in/reader036/viewer/2022062323/56815bda550346895dc9cb0c/html5/thumbnails/3.jpg)
Methods

- Split-ballot experimentation in an RDD (random-digit-dial) survey (n=425)
  - Original questions drawn from federal health surveys; we constructed alternative questions
  - Do responses differ across versions?
  - If so, can we judge which distribution is more plausible?
- Behavior coding of a random subset of tape-recorded interviews (n=313)
  - How often were initial responses inadequate?
  - How often did respondents interrupt the question?
  - How often did the interviewer do something more than just read the question to get a response?
  - How often did respondents ask for repeats, clarifications, and so on?
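
The two-stage design above (random assignment to a question version, then a random subset drawn for behavior coding) can be sketched as follows. This is only an illustration: the slides do not describe the actual randomization procedure, so the even split, the seeds, and the function names `assign_split_ballot` and `sample_for_behavior_coding` are all assumptions.

```python
import random


def assign_split_ballot(respondent_ids, seed=0):
    """Randomly assign each respondent to question version 'V1' or 'V2'.

    Assumption: a simple shuffle-and-halve design; the real study may have
    randomized differently (for example, per question or per interviewer).
    """
    rng = random.Random(seed)
    ids = list(respondent_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    assignment = {rid: "V1" for rid in ids[:half]}
    assignment.update({rid: "V2" for rid in ids[half:]})
    return assignment


def sample_for_behavior_coding(respondent_ids, k, seed=1):
    """Draw a simple random sample of k interviews to behavior-code."""
    rng = random.Random(seed)
    return sorted(rng.sample(list(respondent_ids), k))


# Sample sizes taken from the slide: 425 respondents, 313 coded interviews.
respondents = range(1, 426)
assignment = assign_split_ballot(respondents)
coded = sample_for_behavior_coding(respondents, 313)

v1_count = sum(1 for version in assignment.values() if version == "V1")
v2_count = len(assignment) - v1_count
```

Per-question sample sizes in the results slides differ slightly from an even split, which is what one would expect once item nonresponse is taken into account.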
![Page 7: New Experiments on the Design of Complex Survey Questions](https://reader036.vdocument.in/reader036/viewer/2022062323/56815bda550346895dc9cb0c/html5/thumbnails/7.jpg)
Issue #1: Info after the question mark

- It is common for questions to apparently end but then add some more material:
  - In the past 12 months, how many times have you talked to any health professional about your own health? Include in-person visits, telephone calls, or times you were a patient in a hospital.
- Concern: Do respondents pay adequate attention to this material? Failure to consider it could lead to under-reports.
- Alternative:
  - People talk to health professionals in person, over the phone, or as a patient in a hospital. Including any of those, in the past 12 months how many times have you talked to a health professional about your own health?
![Page 8: New Experiments on the Design of Complex Survey Questions](https://reader036.vdocument.in/reader036/viewer/2022062323/56815bda550346895dc9cb0c/html5/thumbnails/8.jpg)
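Results: Experiment 1

| Measure | V1 (qualifier after question) | V2 (qualifier at beginning of question) | Significance |
|---|---|---|---|
| Contacts with health professional in 12 months | 6.6 | 5.9 | n.s. |
| n (survey responses) | 214 | 206 | |
| Initial response inadequate | 32.5% | 25.5% | n.s. |
| Respondent requested help | 20.0% | 13.1% | p<.1 |
| n (behavior-coded interviews) | 160 | 153 | |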
![Page 10: New Experiments on the Design of Complex Survey Questions](https://reader036.vdocument.in/reader036/viewer/2022062323/56815bda550346895dc9cb0c/html5/thumbnails/10.jpg)
Issue #2: Related experiment: definition after the question mark

- Definitions are sometimes presented after the question mark as well. For example:
  - V1: Have any of your immediate blood relatives ever been told by a doctor that they have diabetes? By "immediate blood relatives", we mean your parents, your children, and your brothers and sisters, whether or not they are still living.
  - V2: The next question is about immediate blood relatives; by that, we mean your parents, your children, and your brothers and sisters, whether or not they are still living. Have any of your immediate blood relatives ever been told by a doctor that they have diabetes?
- If the definition is easier to ignore in V1, respondents might interpret “blood relatives” more broadly than intended, leading to (erroneously) higher reports in V1.
![Page 11: New Experiments on the Design of Complex Survey Questions](https://reader036.vdocument.in/reader036/viewer/2022062323/56815bda550346895dc9cb0c/html5/thumbnails/11.jpg)
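Results: Experiment 2

| Measure | V1 (definition after question) | V2 (definition at beginning of question) | Significance |
|---|---|---|---|
| Relative with diabetes | 42.6% | 34.4% | p<.1 |
| n (survey responses) | 209 | 215 | |
| Initial response inadequate | 7.2% | 2.5% | p<.1 |
| Interrupted | 16.5% | 0.6% | p<.01 |
| Interviewer intervention | 9.2% | 3.1% | p<.05 |
| n (behavior-coded interviews) | 152 | 159 | |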
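
The slides do not say which significance test produced these p-values. As a rough illustration of how a split-ballot difference in proportions might be checked, here is a minimal sketch of a pooled two-proportion z-test applied to the "relative with diabetes" row. The choice of test and the function name `two_proportion_z_test` are assumptions, and the inputs are the rounded percentages from the slide, so the result is only approximate.

```python
import math


def two_proportion_z_test(p1, n1, p2, n2):
    """Two-sided z-test for a difference between two independent proportions.

    Uses the pooled standard error under the null hypothesis p1 == p2.
    Returns (z statistic, two-sided p-value).
    """
    successes = p1 * n1 + p2 * n2
    pooled = successes / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Experiment 2, "relative with diabetes": V1 42.6% (n=209) vs V2 34.4% (n=215).
z, p = two_proportion_z_test(0.426, 209, 0.344, 215)
print(f"z = {z:.2f}, two-sided p = {p:.3f}")
```

With these rounded inputs the statistic lands between the conventional .10 and .05 cutoffs, which is consistent with the p<.1 reported on the slide.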
![Page 13: New Experiments on the Design of Complex Survey Questions](https://reader036.vdocument.in/reader036/viewer/2022062323/56815bda550346895dc9cb0c/html5/thumbnails/13.jpg)
Issue #3: Administration of response categories

- Conventional wisdom dictates that you administer the question before offering response categories:
  - V1: The last time you went to see a doctor, which of the following best describes the main reason for your visit?
    - Medical treatment for a new condition
    - Follow-up care for an existing condition
    - Or, a routine checkup
- But what if this design encourages respondents to gravitate toward the first seemingly acceptable response rather than considering the whole list?
  - V2: People schedule doctor visits for a variety of reasons, including getting medical treatment for a new condition, follow-up care for an existing condition, or a routine checkup. Which of those best describes the main reason for your visit the last time you went to see a doctor?
![Page 14: New Experiments on the Design of Complex Survey Questions](https://reader036.vdocument.in/reader036/viewer/2022062323/56815bda550346895dc9cb0c/html5/thumbnails/14.jpg)
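Results: Experiment 3

| Measure | V1 (response categories after question) | V2 (response categories before question) | Significance |
|---|---|---|---|
| New condition | 21.5% | 23.6% | n.s. |
| Follow-up | 41.0% | 34.6% | |
| Routine exam | 37.4% | 41.9% | |
| n (survey responses) | 195 | 191 | |
| Initial response inadequate | 10.6% | 23.2% | p<.01 |
| n (behavior-coded interviews) | 141 | 142 | |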
![Page 15: New Experiments on the Design of Complex Survey Questions](https://reader036.vdocument.in/reader036/viewer/2022062323/56815bda550346895dc9cb0c/html5/thumbnails/15.jpg)
Issue #4: Examples vs. definitions to illustrate complex concepts

- Complex concepts such as “strenuous activity” are often illustrated through examples:
  - The next question is about strenuous tasks done around your home. By "strenuous tasks," we mean things like shoveling soil in a garden, chopping wood, major carpentry projects, cleaning the garage, scrubbing floors, or moving furniture. In the past 30 days, on how many days did you do strenuous tasks in or around your home?
- Although examples are designed to express a range of possibilities, we hypothesize that they have the opposite effect, focusing attention on a few specifics that might not be well chosen.
- We expect that a good definition will create higher reports and be easier to administer.
- However, previous attempts were not successful, presumably because our definition was too complex.
![Page 16: New Experiments on the Design of Complex Survey Questions](https://reader036.vdocument.in/reader036/viewer/2022062323/56815bda550346895dc9cb0c/html5/thumbnails/16.jpg)
Examples vs. definitions

- V1: The next question is about strenuous tasks done around your home. By "strenuous tasks," we mean things like shoveling soil in a garden, chopping wood, major carpentry projects, cleaning the garage, scrubbing floors, or moving furniture. In the past 30 days, on how many days did you do strenuous tasks in or around your home?
- V2: The next question is about strenuous tasks done around your home. By "strenuous tasks", we mean any chores or projects that made you feel very tired by the time you finished them. In the past 30 days, on how many days did you do strenuous tasks in or around your home?
![Page 17: New Experiments on the Design of Complex Survey Questions](https://reader036.vdocument.in/reader036/viewer/2022062323/56815bda550346895dc9cb0c/html5/thumbnails/17.jpg)
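Results: Experiment 4

| Measure | V1 (examples) | V2 (definition) | Significance |
|---|---|---|---|
| Strenuous activity per month | 4.9 | 3.9 | n.s. |
| Reported “zero times” | 29.3% | 37.7% | p<.1 |
| n (survey responses) | 208 | 215 | |
| Initial response inadequate | 27.0% | 25.1% | n.s. |
| n (behavior-coded interviews) | 153 | 159 | |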
![Page 18: New Experiments on the Design of Complex Survey Questions](https://reader036.vdocument.in/reader036/viewer/2022062323/56815bda550346895dc9cb0c/html5/thumbnails/18.jpg)
Issue #5: Question wording to capture rare events

- One reason questions are very complex is that their authors want to prompt respondents to think of a broadly inclusive range of situations:
  - In the past 12 months, how many times have you seen or talked on the telephone about your physical or mental health with a family doctor or general practitioner?
- The practice has a downside: respondents may lose track of the forest for the trees.
- Cognitive interview evaluation of the question above suggested that respondents thought it was exclusively about telephone contact with doctors.
- If true, the question would generate significant undercounts.
![Page 19: New Experiments on the Design of Complex Survey Questions](https://reader036.vdocument.in/reader036/viewer/2022062323/56815bda550346895dc9cb0c/html5/thumbnails/19.jpg)
A simplified comparison

- “The next question is specifically about primary care doctors….”
  - V1: In the past 12 months, how many times have you seen or talked on the telephone with a primary care doctor about your health?
  - V2: In the past 12 months, how many times have you seen or talked with a primary care doctor about your health?
- The only difference between these two questions is the inclusion of “on the telephone.”
![Page 20: New Experiments on the Design of Complex Survey Questions](https://reader036.vdocument.in/reader036/viewer/2022062323/56815bda550346895dc9cb0c/html5/thumbnails/20.jpg)
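Results: Experiment 5

| Measure | V1 (telephone) | V2 (no phone) | Significance |
|---|---|---|---|
| Mean contacts | 3.4 | 3.6 | n.s. |
| “Zero” responses | 24.7% | 9.2% | p<.01 |
| n (survey responses) | 194 | 195 | |
| Initial response inadequate | 14.9% | 21.1% | n.s. |
| Respondent requested help | 5.7% | 11.3% | p<.1 |
| n (behavior-coded interviews) | 120 | 121 | |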
![Page 22: New Experiments on the Design of Complex Survey Questions](https://reader036.vdocument.in/reader036/viewer/2022062323/56815bda550346895dc9cb0c/html5/thumbnails/22.jpg)
Issue #6: Question decomposition

- Food consumption example:
  - “During the last 30 days, how many times did you eat cheese, including cheese as snacks, and cheese in sandwiches, burgers, lasagna, pizza, or casseroles? Do NOT count cream cheese.”
- Clearly a challenging response task in general; we had little confidence in the accuracy of reports.
- Cognitive testing: when probed about details…
  - “Did you include cheese in other dishes/sandwiches/etc.?” (If no:) “Would that have changed your overall answer?”
- …some participants increased their reports.
![Page 23: New Experiments on the Design of Complex Survey Questions](https://reader036.vdocument.in/reader036/viewer/2022062323/56815bda550346895dc9cb0c/html5/thumbnails/23.jpg)
Question decomposition (2)

- Alternative: multiple response tasks, divided into reasonable sub-components:
  - The next questions are about cheese you have eaten in the last 30 days. Please do NOT include any cream cheese you may have eaten.
  - During the last 30 days, how many times have you eaten cheese on a sandwich, including burgers?
  - During the last 30 days, how many times have you eaten cheese in lasagna, pizza, casseroles, or mixed in with other dishes?
  - During the last 30 days, how many times have you eaten cheese as a snack or appetizer?
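
To compare a decomposed series against the single global question, the three sub-answers have to be combined into one figure per respondent. The slides do not spell out how this was done; the sketch below assumes a plain sum, and the class name `CheeseReport`, its field names, and the sample answers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CheeseReport:
    """One respondent's answers to the three decomposed questions."""
    sandwich: int       # cheese on a sandwich, including burgers
    mixed_dishes: int   # lasagna, pizza, casseroles, other mixed dishes
    snack: int          # cheese as a snack or appetizer

    def total(self) -> int:
        # Assumption: the overall count is the simple sum of the components.
        return self.sandwich + self.mixed_dishes + self.snack


# Hypothetical answers, not data from the study.
reports = [CheeseReport(4, 6, 2), CheeseReport(0, 3, 1)]
totals = [r.total() for r in reports]
mean_total = sum(totals) / len(totals)
```

A plain sum treats the three categories as non-overlapping, which depends on respondents not counting the same eating occasion under more than one question.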
![Page 24: New Experiments on the Design of Complex Survey Questions](https://reader036.vdocument.in/reader036/viewer/2022062323/56815bda550346895dc9cb0c/html5/thumbnails/24.jpg)
Results: Experiment 6 (responses)

| Measure | V1 (single question) | V2 (multiple questions) | Significance |
|---|---|---|---|
| Mean times | 13.9 | 19.0 | p<.01 |
| n (survey responses) | 218 | 228 | |
![Page 25: New Experiments on the Design of Complex Survey Questions](https://reader036.vdocument.in/reader036/viewer/2022062323/56815bda550346895dc9cb0c/html5/thumbnails/25.jpg)
Results: Experiment 6 (behavior coding)

- The individual “decomposed” questions consistently outperform the single-item version on virtually all measures

| Measure | Orig | Alt1 | Alt2 | Alt3 |
|---|---|---|---|---|
| Inadequate initial response | 15.9 | 9.9 | 8.3 | 3.1 |
| Probes used | 13.7 | 7.8 | 6.3 | 2.1 |
| Requested help/repeat | 19.1 | 15.1 | 3.1 | 2.1 |

(all expressed as %; most significant at p<.05)
![Page 26: New Experiments on the Design of Complex Survey Questions](https://reader036.vdocument.in/reader036/viewer/2022062323/56815bda550346895dc9cb0c/html5/thumbnails/26.jpg)
Some other considerations

- Mean time to administer the original was 28 seconds; mean for the alternative was 51 seconds
- If we actually compare the amount of probing, inadequate responses, etc. needed to reach our desired data points (i.e., through three questions), the behavior-coding rates become very similar
  - For example: 13.5% of original questions were probed; 15.1% of the alternative series was ever probed
- Some research suggests that responses to decomposed questions are less accurate (but…)
- Next steps: split-ballot experiment on various food and exercise questions (global vs. decomposed) with diary validation
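
The distinction between per-question rates and the rate at which a series is "ever probed" can be made concrete with a small sketch. It assumes behavior codes are stored as one tuple of booleans per interview (one flag per question in the series); the function names and the coded data below are hypothetical and are not taken from the study.

```python
def per_question_rates(coded_interviews):
    """Share of interviews in which each individual question was probed."""
    n = len(coded_interviews)
    n_questions = len(coded_interviews[0])
    return [
        sum(flags[i] for flags in coded_interviews) / n
        for i in range(n_questions)
    ]


def ever_probed_rate(coded_interviews):
    """Share of interviews in which at least one question in the series was probed."""
    flagged = sum(1 for flags in coded_interviews if any(flags))
    return flagged / len(coded_interviews)


# Hypothetical coding of 10 interviews for a three-question series.
coded_interviews = [
    (True, False, False),
    (False, True, False),
    (True, True, False),
    (False, False, True),
] + [(False, False, False)] * 6

question_rates = per_question_rates(coded_interviews)
series_rate = ever_probed_rate(coded_interviews)
```

In this made-up data each question looks clean on its own (20%, 20%, and 10% probed), yet 40% of interviews needed a probe somewhere in the series, which is the comparison that matters when setting the series against a single global question.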
![Page 27: New Experiments on the Design of Complex Survey Questions](https://reader036.vdocument.in/reader036/viewer/2022062323/56815bda550346895dc9cb0c/html5/thumbnails/27.jpg)
Conclusions

- Qualifiers and definitions that dangle after the question mark should be avoided, provided there is a reasonable way to do so.
- Conventional wisdom about response categories after the question seems to stand.
- In spite of our reservations about examples, we have failed to find evidence that they limit frame of reference. They don’t perform wonderfully, but alternatives don’t do better.
- Details in questions have the potential to distract respondents from overall meaning. Additional words may help a few respondents, but simpler wording may have a more profound impact.
![Page 28: New Experiments on the Design of Complex Survey Questions](https://reader036.vdocument.in/reader036/viewer/2022062323/56815bda550346895dc9cb0c/html5/thumbnails/28.jpg)
Conclusions (2)

- Experiments presented here involve single, interviewer-administered questions.
- Complexity can often be reduced by asking multiple, smaller questions.
- However, the pressure to ask fewer questions is real. Hopefully these results provide some guidance for how to structure questions given such constraints.