#ARFAxS
Calibrating Bias in Online Samples for High Quality Surveys at Scale
Steven Millman
Chief Scientist
MRI-Simmons
Calibrating for Bias in Online Samples
Steven Millman – Chief Scientist, MRI-Simmons
Contributing Author: Hu Yang – Lead Data Scientist, MRI-Simmons
• Online survey panelists are not generally drawn randomly from the population of interest
• Online survey panelists are individuals that have made a conscious decision to monetize their use of the Internet
• This research sought to identify the ways in which online panelists are different, and how those differences can be corrected
• A key finding of this research is that non-probability online panelists differ from probability samples in significant ways that cannot be corrected with simple demographic weighting
Online panelists are not representative (in other news, the sun is hot)
“Two important findings are that nonresponse [or selection] bias is generally unrelated to the overall
response rate and that… bias tends to be item-specific. Some or even many items may exhibit no bias while others have
substantial bias.” *
-John L. Czajka & Amy Beyler
Sample bias tends to be narrow, not broad
* "Declining Response Rates in Federal Surveys: Trends and Implications (Background Paper)," June, 2016. Mathematica Policy Research Reports prepared for the Office of the Assistant Secretary for Planning and Evaluation, U.S. Department of Health & Human Services. “[or selection]” was added by the author of this presentation and not Czajka & Beyler.
• The National Consumer Study (NCS) is a high-quality random probability sample of 25,000 American adults each year
• Sample is collected by mail from a list of residential addresses
• The NCS contains over 60,000 data elements, including:
► Demographics
► Product purchase & usage
► Brand preferences
► Lifestyles, attitudes, and opinions
► Media usage & preferences
► Shopping behavior
► Custom variables
The National Consumer Study
• Approximately 60% of the NCS was converted to an online version of the study
• Online sample (address verified) was stratified by age, gender, & top 14 DMAs
• The online study was fielded to 13,905 online panelists and compared to 20,953 respondents from the 2018 Summer NCS study (online adults only)
• New universe estimates were generated for the online US population, adding time spent online as a weight
• Compared weighted average responses between mail probability and online non-probability samples
• Identified significant and substantive deviations between properly weighted probability and online panel samples
Research Design
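The comparison step in the design above (weighted average responses in the online sample minus those in the probability sample) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the toy responses and weights are hypothetical, and a yes/no item is assumed to be coded 1/0.

```python
from typing import Sequence


def weighted_share(responses: Sequence[int], weights: Sequence[float]) -> float:
    """Weighted percentage of respondents who answered 1 (yes)."""
    if len(responses) != len(weights):
        raise ValueError("responses and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return 100.0 * sum(r * w for r, w in zip(responses, weights)) / total


def point_diff(online_resp, online_w, prob_resp, prob_w) -> float:
    """Percentage-point difference: % online minus % probability."""
    return weighted_share(online_resp, online_w) - weighted_share(prob_resp, prob_w)


# Toy data (hypothetical, not taken from the study)
online_resp = [1, 1, 0, 1]
online_w = [1.0, 2.0, 1.0, 1.0]
prob_resp = [1, 0, 0, 1]
prob_w = [1.0, 1.0, 2.0, 1.0]

diff = point_diff(online_resp, online_w, prob_resp, prob_w)
print(f"{diff:+.1f}")  # +40.0
```

Setting every weight to 1.0 gives the raw (unweighted) comparison, so the same function covers each weighting scenario.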
• Comparisons were made using raw (unweighted) data, demographically weighted data, and data weighted on both demographics and time spent online
• We subsequently developed a naïve process to identify additional weights from among all questions in the digital version of the survey to find the best possible additional calibrators
• Remaining substantive and significant differences were likely to be driven by selection bias. Other potential biases could result from:
◦ Modal differences
◦ Questionnaire differences
◦ Timing of the survey (October-December v. full year)
Identifying the bias in non-probability online panels
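The slides describe weighting on demographics and then adding time spent online, but they do not say which weighting algorithm was used. One common way to weight a sample to several margins at once is raking (iterative proportional fitting). The sketch below assumes that technique; the variable names, categories, and target shares are all hypothetical.

```python
from typing import Dict, List


def rake(rows: List[Dict[str, str]],
         targets: Dict[str, Dict[str, float]],
         iterations: int = 100,
         tol: float = 1e-10) -> List[float]:
    """Adjust weights until each variable's weighted shares match its targets.

    Assumes every category that appears in `rows` has a target share and
    that the target shares for each variable sum to 1.
    """
    weights = [1.0] * len(rows)
    for _ in range(iterations):
        max_change = 0.0
        for var, target in targets.items():
            total = sum(weights)
            # Current weighted share of each category of this variable
            current: Dict[str, float] = {}
            for row, w in zip(rows, weights):
                current[row[var]] = current.get(row[var], 0.0) + w / total
            # Scale each respondent's weight toward the target share
            for i, row in enumerate(rows):
                factor = target[row[var]] / current[row[var]]
                max_change = max(max_change, abs(factor - 1.0))
                weights[i] *= factor
        if max_change < tol:
            break
    return weights


# Toy sample (hypothetical): age group and hours spent online
rows = [
    {"age": "young", "hours_online": "heavy"},
    {"age": "young", "hours_online": "heavy"},
    {"age": "young", "hours_online": "light"},
    {"age": "old", "hours_online": "heavy"},
    {"age": "old", "hours_online": "light"},
    {"age": "old", "hours_online": "light"},
]
# Hypothetical universe estimates for the two margins
targets = {
    "age": {"young": 0.5, "old": 0.5},
    "hours_online": {"heavy": 0.4, "light": 0.6},
}
weights = rake(rows, targets)
```

Any additional calibrator would enter the same way, as one more entry in `targets`, provided a trustworthy benchmark share exists for it.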
Results of weighting strategies
Variables with significant differences between probability and non-probability samples
Variables (~40,000 total) | Unweighted | Demo Weighted Only | Weighted by Demos & Internet Use | Weighted & Calibrated
Variables with more than five percentage point deviations | 1,854 (4.52%) | 1,769 (4.31%) | 1,579 (3.85%) | 1,391 (3.39%)
Variables with more than ten percentage point deviations | 354 (0.09%) | 312 (0.08%) | 215 (0.05%) | 198 (0.04%)
• Most variables were relatively unbiased compared to probability sample
• Weighting on demographics had very little impact (see Appendix 1)
• Even after weighting for demos, Internet use and naïve calibrators, substantial biases remained
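The counts in the results table above are tallies of variables whose online-minus-probability difference exceeds a threshold. A minimal sketch of that tally, assuming absolute deviations are counted and that the ten-point group is a subset of the five-point group; the function names are hypothetical, and only the first three values in the toy list come from the slides (+28.2, +18.2, -12.2), the rest are invented.

```python
from typing import Sequence


def count_exceeding(point_diffs: Sequence[float], threshold: float) -> int:
    """Number of variables whose absolute point difference exceeds threshold."""
    return sum(1 for d in point_diffs if abs(d) > threshold)


def share_exceeding(point_diffs: Sequence[float], threshold: float) -> float:
    """Percentage of all variables whose absolute difference exceeds threshold."""
    if not point_diffs:
        raise ValueError("point_diffs must not be empty")
    return 100.0 * count_exceeding(point_diffs, threshold) / len(point_diffs)


# First three values appear in the slides; the remaining five are hypothetical
diffs = [28.2, 18.2, -12.2, 6.0, 4.9, -0.3, 1.1, -5.5]

over_five = count_exceeding(diffs, 5.0)
over_ten = count_exceeding(diffs, 10.0)
print(over_five, over_ten)  # 5 3
```

Running the same tally on differences computed under each weighting scheme would give one column of the table per scheme.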
• Online shopping behavior
• Communication
• Information seeking
• Video streaming
• Use of technology
The most strongly biased variables fell into a small set of question categories
Online shopping behavior (% online - % probability, all significant at p<0.00001)
Self-reported behavior of online panelist    Point Diff
Paypal.com, last 30 days +28.2
Amazon.com, last 30 days +18.2
Find/Print Coupons from Websites, last 30 days +17.6
Made a purchase online +15.4
Online banking, last 30 days +15.4
Often I can be swayed by coupons to try new food products +13.5
Gathered information for shopping online, last 30 days +12.5
Ebay.com, last 30 days +12.5
Because of a coupon, I’d be drawn to a store I normally don’t shop at +11.6
Groupon.com, last 30 days +10.8
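The table header reports every difference as significant at p<0.00001, but the slides do not name the test. One standard choice for comparing two percentages is a pooled two-proportion z-test, sketched below under that assumption. The sample sizes (13,905 online, 20,953 probability) come from the research design; the base rates of 60.0% and 31.8% are hypothetical, chosen only so that they differ by the +28.2 points shown for PayPal. The sketch ignores survey weights and design effects, which would reduce the effective sample sizes.

```python
import math
from typing import Tuple


def two_proportion_z_test(p1: float, n1: int, p2: float, n2: int) -> Tuple[float, float]:
    """Pooled two-proportion z-test; returns (z, two-sided p-value).

    p1 and p2 are proportions in [0, 1]; n1 and n2 are sample sizes.
    """
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    if se == 0.0:
        raise ValueError("standard error is zero; proportions are degenerate")
    z = (p1 - p2) / se
    # Two-sided tail probability of a standard normal
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Sample sizes from the slides; base rates are hypothetical
z, p_value = two_proportion_z_test(0.600, 13905, 0.318, 20953)
significant = p_value < 0.00001
```

With samples this large, even differences of a point or two can clear conventional significance thresholds, which is presumably why the slides report size thresholds (five and ten points) alongside significance.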
Communication (% online - % probability, all significant at p<0.00001)
Self-reported behavior of online panelist    Point Diff
Email, highest use +23.7
Used email on mobile/handheld device, last 30 days +22.3
Visited social networking website, last 30 days +15.7
Visited social networking website on mobile/handheld, last 30 days +15.2
Internet has changed the way I spend my free time +15.1
Visited social networking website, highest use +12.3
Twitter.com, last 30 days +11.6
Facebook, Instagram, and Pinterest, last 30 days ~+5.0
Information seeking (% online - % probability, all significant at p<0.00001)
Self-reported behavior of online panelist    Point Diff
Check the weather on mobile/handheld, last 30 days +18.4
Use Internet in stores +13.8
Use search engines +12.5
Bing +16.0
Yahoo +15.1
Google +11.6
Look for recipes online, highest use +11.6
Video streaming (% online - % probability, all significant at p<0.00001)
Self-reported behavior of online panelist    Point Diff
Amazon Prime Instant Video, last 30 days +20.0
Netflix, last 30 days +15.6
Hulu (limited commercials), last 30 days +11.1
Hulu (no commercials), last 30 days +10.1
Download or stream TV programs, last 30 days +12.6
Download or stream Movies, last 30 days +10.3
Watch Video Content online, last 30 days +9.7
My computer is a primary source of fun and entertainment +12.9
The Internet has become a primary source of entertainment for me +10.3
Attended the movies in a theater in the last 6 months? -12.2
Use of technology (% online - % probability, all significant at p<0.00001)
Self-reported behavior of online panelist    Point Diff
Use the Internet on a desktop computer at home +20.9
Use the Internet on a laptop computer at home +15.4
Use the Internet on a gaming system +15.1
Use the Internet on a tablet +14.7
Use the Internet on an iPod/MP3 player +10.6
Own or play video games +11.6
Use Chrome most often as Internet browser +14.4
Use Internet Explorer most often as Internet browser -10.6
Have a family plan for cellular phone -17.5
• I stick with clothing styles that have stood the test of time
• Many similarly priced clothing brands look alike
• I don't like the idea of being in debt
• There is nothing wrong with indulging in eating fattening foods from time to time
• I try to include plenty of fiber in my diet these days
• I usually look for the freshest ingredients when I cook
• Non-vegetarian
Other psychographics where the online sample is >10 points higher
• Interest in MLB, NFL, MLS (much less)
• Moisturizers/Creams/Lotions (much less often)
• Bought an automobile last 12 months (much less)
• Thirst Quenchers and Sports/activity drinks (much less)
• Sneakers Athletic Shoes (much less)
• Use eye shadow (much more)
• Often eat frozen dinner (much more)
• Hershey's Milk Chocolate (much more)
Selected brand use with over +/-10 point differences compared to probability
• Non-probability online panelists can be used to provide a reasonably representative snapshot of populations of interest, but are wildly inaccurate for an important subset of topics
• In particular, attitudes and use of the internet and technology are severely biased in ways that cannot be corrected with demographic weights
• To use online survey panelists to investigate these topics, a calibration set derived either from a properly representative random sample of respondents or from census-level data would be required
• Rule of thumb: Try to avoid asking online survey panelists about online activities and beliefs
Conclusion
• Gender
• Age
• Personal Income (Includes not-employed)
• Marital Status
• County Size
• Number of Adults in HH (Non-Hispanic)
• Number of HISPANIC Adults in HH (Hispanics)
• Race
• Education
• Homeowner type
• Presence of Children
• Area (DMA or Region)
• Born in US (Hispanic only)
• Heritage by Region (Hispanic only)
• Language Spoken at Home (Hispanic only)
• Hours Spent Online, Work+Home (Online sample weights only)
Appendix 1: Sample Weight Variables