www.RoperCenter.uconn.edu 1
Upgrading ABC News/Washington Post Data Collections Using DDI
and Legacy Databases
Marc Maynard
The Roper Center for Public Opinion Research
University of Connecticut
IASSIST Conference 2005, Edinburgh, Scotland
Upgrading Data Collections
• Introduction
• Background
• Scope
• Challenges & Opportunities
• Prototype System
• Summary
The Roper Center Archives
• Public opinion data archive established in 1946
• Commercial and academic surveys from 1936-present
• The Archives house ~8,000 US and ~7,000 non-US surveys, including data files & documentation
• ABC News/Washington Post Survey Collection
– Over 850 surveys, 1979-present
Background: Metadata Integration
• Catalog of Holdings
– study level
– 15,000 records
– only studies for which raw data is housed at the Center
• iPOLL Databank
– variable level
– nearly 500,000 records
– includes studies for which data is not housed at the Center
Background: Metadata Integration
• External Review (2001)
• Overall Integrated Vision (IASSIST 2002)
• DDI – Archive Catalog Mapping (IASSIST 2003)
– Study and File Level Integration (Sections 2 & 3)
• iPOLL Archive Catalog Links (2003-2004)
• Enhance Question/Variable Metadata (IASSIST 2005)
Background: Prototype Project
• ABC and the Post want to easily access and analyze all their survey data
• SPSS system files for post-1997 surveys exist
• Pre-1998 studies are a hodge-podge of available ASCII data, documentation and survey reports
• ABC experimented with various alternative strategies
• Determined that the major cost factor would be variable and response labeling
Scope
• >600 ABC/WP surveys, 1979-1997
– More than 16,000 questions in the iPOLL system
– Fairly consistent documentation and data structure
– All ASCII data files
• Average about 35 variables per study
– Not including standard socio-demographic variables
• Employ a prioritized phased approach
• Focus on joint monthly surveys (216 studies)
Challenges
1. iPOLL includes only surveys of US adult population
2. iPOLL does not store standard socio-demographic variables
3. Published results are source for many items
4. iPOLL does not store enough metadata on the variable level
Opportunities
• Enhance metadata available in iPOLL
• Repurpose iPOLL’s store of question text and response categories
• Capitalize on:
– the fact that response categories are stored as individual items
– linkages between question-level information and existing data files
Addressing the Challenges
1. iPOLL includes only surveys of US adult population
• State/Local surveys are lower priorities
2. iPOLL does not store standard socio-demographic variables
• Add standard demogs menu to system
3. Published results are source for many items
• Must allow for modifications to the variables
4. iPOLL does not store enough metadata on the variable level
• Extend iPOLL DataBank with DDI elements
Mapping Scheme - Sec. 4
Question/Variable
DDI Element                  Database Field
4.3 Name                     VarName
4.3.8.2 QstnLit              Qstn_txt
4.3.1 RecSegNo               RecSegNo
4.3.1 StartPos - EndPos      Location
4.3.1 Width                  varWidth
4.3.23 varFormat             varFormat

Response Categories
DDI Element                  Database Field
4.3.18.2 labl                Resp_Txt
4.3.18.1 catValu             Resp_Code
4.3.18 Missing               Missing
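
The mapping above can be sketched as a simple lookup. The Python below is a minimal illustration under stated assumptions, not the Center's implementation: the element and field names are transcribed from the table, while the dictionary names, the `to_ddi` helper, and the sample row are hypothetical.

```python
# Hypothetical sketch: DDI Section 4 element -> legacy database field,
# transcribed from the mapping table. All Python names are illustrative.
VARIABLE_MAP = {
    "4.3 Name": "VarName",
    "4.3.8.2 QstnLit": "Qstn_txt",
    "4.3.1 RecSegNo": "RecSegNo",
    "4.3.1 StartPos - EndPos": "Location",
    "4.3.1 Width": "varWidth",
    "4.3.23 varFormat": "varFormat",
}

CATEGORY_MAP = {
    "4.3.18.2 labl": "Resp_Txt",
    "4.3.18.1 catValu": "Resp_Code",
    "4.3.18 Missing": "Missing",
}


def to_ddi(record, mapping):
    """Re-key a database row (a dict) by DDI element, skipping absent fields."""
    return {ddi: record[field] for ddi, field in mapping.items() if field in record}


# Made-up row for illustration; not taken from iPOLL.
row = {"VarName": "Q1", "Qstn_txt": "Sample question text", "varWidth": 1}
ddi_view = to_ddi(row, VARIABLE_MAP)
```

Keeping the mapping as data rather than code means a new DDI element can be supported by adding one entry, which fits the stated goal of extending the iPOLL DataBank incrementally.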
File Preparation
[Diagram; labels only: iPOLL (q/v), Archive Catalog, Standard Demogs, ASCII Data File, Project Application, SPSS Syntax File, SPSS Portable File, iPOLL SPSS: Enhanced variable-level metadata]
Application Requirements
• Edit and add missing metadata to each variable
– Variable names, location, type
• Review and edit response category coding
• Select and add standard socio-demographic variables
• Specify any recodes within variables or to new variables
• Handle string, as well as numeric, value labeling and recoding
• Generate SPSS syntax file to include study metadata, creation date and data file path and structure
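
The last requirement, generating an SPSS syntax file from the stored metadata, might look roughly like the Python sketch below. It is an assumption-laden illustration, not the prototype's code: the function names, the dictionary keys, and the sample study are all hypothetical. The emitted commands (DATA LIST, VARIABLE LABELS, VALUE LABELS, MISSING VALUES) are standard SPSS syntax.

```python
from datetime import date


def quote(text):
    """Single-quote a value for SPSS, doubling any embedded single quotes."""
    return "'" + str(text).replace("'", "''") + "'"


def code_repr(code, var):
    """String variables need quoted codes; numeric codes are written bare."""
    return quote(code) if var.get("string") else str(code)


def spss_syntax(study, data_path, variables, created=None):
    """Build SPSS syntax for a fixed-width ASCII data file.

    Each entry in `variables` is a dict with hypothetical keys: name, record,
    start, end, label, and optionally string, categories, missing.
    """
    created = created or date.today().isoformat()
    out = [f"* Study: {study}.", f"* Created: {created}."]

    # File path and record structure.
    n_rec = max(v["record"] for v in variables)
    out.append(f"DATA LIST FILE={quote(data_path)} FIXED RECORDS={n_rec}")
    for rec in sorted({v["record"] for v in variables}):
        specs = []
        for v in variables:
            if v["record"] == rec:
                cols = str(v["start"]) if v["start"] == v["end"] else f"{v['start']}-{v['end']}"
                specs.append(f"{v['name']} {cols}" + (" (A)" if v.get("string") else ""))
        out.append(f"  /{rec} " + " ".join(specs))
    out[-1] += "."  # command terminator

    # Variable labels (question text or a short description).
    labels = "\n  /".join(f"{v['name']} {quote(v['label'])}" for v in variables)
    out.append(f"VARIABLE LABELS\n  {labels}.")

    # Value labels from response categories stored as individual items.
    blocks = []
    for v in variables:
        if v.get("categories"):
            pairs = " ".join(f"{code_repr(c, v)} {quote(t)}" for c, t in v["categories"])
            blocks.append(f"{v['name']} {pairs}")
    if blocks:
        out.append("VALUE LABELS\n  " + "\n  /".join(blocks) + ".")

    # Discrete missing values, one command per variable.
    for v in variables:
        if v.get("missing"):
            codes = ", ".join(code_repr(c, v) for c in v["missing"])
            out.append(f"MISSING VALUES {v['name']} ({codes}).")
    return "\n".join(out)


# Made-up study and variables for illustration only.
variables = [
    {"name": "Q1", "record": 1, "start": 5, "end": 5, "label": "Presidential approval",
     "categories": [(1, "Approve"), (2, "Disapprove"), (8, "Don't know")], "missing": [8]},
    {"name": "STATE", "record": 1, "start": 6, "end": 7, "label": "State of residence",
     "string": True},
]
syntax = spss_syntax("Hypothetical monthly survey", "survey.dat", variables, created="2005-05-01")
```

Recodes are left out of this sketch to keep it short. Generating text from metadata, rather than writing syntax by hand, is what would let one application cover several hundred studies with consistent labeling.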
Prototype System
Summary
• Continuation of metadata enhancement and integration efforts begun in 2001
• Will provide practical feedback and suggestions for extending the capabilities of iPOLL
• Promising beginning for expanding coverage to other data collections
iPOLL Databank can be found at:
http://www.ropercenter.uconn.edu/ipoll.html
Email: [email protected]