OLA Conference, February 2008
Session 1022
Jeff Moon
Head, Maps, Data, & Government Information Centre (MADGIC)
Queen’s University
An Introduction to
Flowchart: ‘Do I want to use statistics?’ (NO: no statistics)
What we’ll cover:
• What is survey data, and what’s the big deal?
• What’s happening in Ontario on the ‘data front’?
• Show me the goods…
• Why is this important at my library?
What is Survey Data and what’s the big deal?
Data continuum…
A ‘number’
Tables, Charts, Graphs (in Books, CD-ROM, the WWW)
Survey Data (machine-readable) (Microdata)
             Age  Sex  MarStat  Children  Income  Occ  Educ
Person 1      24   M      1        1        5      1     7
Person 2      34   F      1        0        3      5     3
Person 3      52   F      2        2        4      3     3
Person 4      64   F      1        3        6      4     4
Person 5      23   M      3        1        7      2     6
Person 6      63   F      4        1        5      6     3
………
Person "n"    29   M      1        0        5      2     2
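To make the idea of microdata concrete, the sketch below loads the six numbered sample rows from the table above as one record per person and derives two different summaries from them. The column keys and the choice of summaries are illustrative assumptions; the coded values (MarStat, Income, and so on) are left uninterpreted because the slide gives no codebook.

```python
from collections import Counter

# Column keys are lower-cased versions of the slide's headings (an assumption).
COLUMNS = ("age", "sex", "marstat", "children", "income", "occ", "educ")

# Rows copied from the slide; coded values are kept as opaque codes.
ROWS = [
    (24, "M", 1, 1, 5, 1, 7),
    (34, "F", 1, 0, 3, 5, 3),
    (52, "F", 2, 2, 4, 3, 3),
    (64, "F", 1, 3, 6, 4, 4),
    (23, "M", 3, 1, 7, 2, 6),
    (63, "F", 4, 1, 5, 6, 3),
]

# One dictionary per respondent: this is what "microdata" looks like.
people = [dict(zip(COLUMNS, row)) for row in ROWS]

# Summary 1: how many respondents of each sex?
sex_counts = Counter(person["sex"] for person in people)

# Summary 2: mean age within each sex, built from the same records.
mean_age_by_sex = {
    sex: sum(p["age"] for p in people if p["sex"] == sex) / count
    for sex, count in sex_counts.items()
}

print(dict(sex_counts))
print(mean_age_by_sex)
```

Because the individual records are kept, any other summary (for example, children by marital status) could be produced from the same list without going back to the data producer.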
What is Survey Data and what’s the big deal?
Percentages
Counts
Standard Deviations
Cross-tabs
Means
More advanced Analysis
Statistical Analysis continuum…
Descriptive Statistics Inferential Statistics
What is Survey Data and what’s the big deal?
Tables, Charts, Graphs
(in Books, CD-ROM, the WWW)
A ‘number’
Survey Data (machine-readable)
Statistics…
Percentages
Counts
Standard Deviations
Cross-tabs
Means
More advanced Analysis
Statistical Analysis…
(Microdata)
Aggregate Data = Postcard: “Fixed”
Survey Data = Camera: “Flexible”
What is Survey Data and what’s the big deal?
We’ll look at the flexibility of survey data a bit later on…
In the meantime, let’s look at the situation in Ontario right now…
1990s: Home-grown survey data systems
- Guelph, Western, Queen’s
- No ‘cataloguing’ standard
- Varying features/capabilities
- Served a purpose at the time
2000s: Emerging data cataloguing standards
Data Documentation Initiative (DDI): an international standard for describing survey data. Like ‘MARC’, only for data.
Mature commercial software solutions
Software such as Nesstar, SDA, and others
In 2005, the Data IN Ontario (DINO) working group of OCUL (Ontario Council of University Libraries) started thinking about moving beyond ‘home-grown’ data solutions, adopting the DDI standard, and building a province-wide data solution. A discussion paper followed…
In 2007, with funding from OCUL and “Ontario Buys”, a Project Director was hired, and hardware/software purchased through Scholars Portal.
OCUL & Ontario Buys
Commercial Software
Scholars Portal
DDI Standard
Ontario Data Documentation, Extraction Service and Infrastructure Initiative
Lead institutions in <ODESI> are Carleton and Guelph, with in-kind assistance from Queen’s University.
First step was developing a Canadian ‘best practices’ document for cataloguing data files using DDI – analogous to AACR2 for MARC.
Next, survey files were ‘marked up’ (catalogued) and loaded onto a test server at Guelph.
The team at Scholars Portal is working with <ODESI> to establish a data server and load data files.
SOFTWARE CHOSEN: NESSTAR (Networked Social Science Tools and Resources)
Developed by the “Norwegian Social Science Data Services”
• In use internationally (Europe, UK, US, Canada)
• In Ontario: Queen’s, Guelph, Carleton, Windsor, Ottawa, U. of T. and Statistics Canada use Nesstar
• DDI compliant
• Search by keyword for surveys and survey questions
• Do basic data exploration and analysis on the web
• Download full datasets or subsets in popular formats
• Export tables and charts
Nesstar Publisher produces DDI-compliant metadata using a set of structured tags, grouped into ‘tabs’ in Publisher.
Document Description Tab
Study Description Tab
Other Study Materials Tab
File Description Tab
Variables Tab
Variable Groups Tab
Data Entry Tab
Other Materials Tab
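For readers who have not seen DDI markup, here is a rough sketch of what ‘structured tags’ can look like, built with Python’s standard XML library rather than Nesstar Publisher. The element names (codeBook, docDscr, stdyDscr, fileDscr, dataDscr, var, labl) follow the DDI Codebook vocabulary, but the document is heavily simplified and not validated against the DDI schema; the study title, file name, variable labels, and the build_codebook helper are all hypothetical, and the mapping of sections to Publisher tabs noted in the comments is an assumption.

```python
import xml.etree.ElementTree as ET


def build_codebook(study_title, file_name, variables):
    """Return a simplified, DDI-Codebook-style element tree.

    variables is a list of (name, label) pairs. Hypothetical helper,
    not part of Nesstar or any DDI toolkit.
    """
    codebook = ET.Element("codeBook")

    # Document description: describes the metadata record itself (left empty here).
    ET.SubElement(codebook, "docDscr")

    # Study description: citation information for the survey.
    study = ET.SubElement(codebook, "stdyDscr")
    citation = ET.SubElement(study, "citation")
    title_statement = ET.SubElement(citation, "titlStmt")
    ET.SubElement(title_statement, "titl").text = study_title

    # File description: the physical data file.
    file_description = ET.SubElement(codebook, "fileDscr")
    file_text = ET.SubElement(file_description, "fileTxt")
    ET.SubElement(file_text, "fileName").text = file_name

    # Data description: one <var> element per survey variable.
    data_description = ET.SubElement(codebook, "dataDscr")
    for name, label in variables:
        var = ET.SubElement(data_description, "var", name=name)
        ET.SubElement(var, "labl").text = label

    return codebook


root = build_codebook(
    "Example Survey",                      # hypothetical title
    "example_survey.dat",                  # hypothetical file name
    [("AGE", "Age of respondent"), ("SEX", "Sex of respondent")],
)
print(ET.tostring(root, encoding="unicode"))
```

Because every survey is described with the same tags, a search tool can look inside the variable labels of many surveys at once, which is what makes question-level searching possible.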
Once ready, a ‘marked up’ survey file is ‘published’ to the
Nesstar Server where it becomes available through
Nesstar Webview.
Let’s take a look at how <ODESI> can be used to answer a research question.
How do men and women differ in perceptions of their health (using weight as an example)?
Concepts?
Health
Weight
Body Mass Index (BMI)
Males/Females
Starting point: A simple search on the Statistics Canada web site…
“Fixed”
“Flexible”
Variable ‘groups’ Variables
Basic ‘frequencies’ or ‘marginals’ for categorical variables…
Descriptive statistics for ‘continuous’ variables…
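A minimal sketch of the two kinds of summary just described, using only Python’s standard library rather than Nesstar itself. The values are the MarStat codes and ages of Persons 1 to 6 from the sample microdata table earlier in the deck; treating MarStat as categorical and Age as continuous is an assumption made for illustration, and the function names are hypothetical.

```python
from collections import Counter
import statistics

marstat = [1, 1, 2, 1, 3, 4]      # categorical codes (meaning not given on the slide)
ages = [24, 34, 52, 64, 23, 63]   # continuous variable


def frequencies(values):
    """Return {category: (count, percent)} for a categorical variable."""
    counts = Counter(values)
    total = len(values)
    return {cat: (n, 100.0 * n / total) for cat, n in sorted(counts.items())}


def describe(values):
    """Return basic descriptive statistics for a continuous variable."""
    return {
        "n": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


marstat_freq = frequencies(marstat)
age_summary = describe(ages)

for category, (count, percent) in marstat_freq.items():
    print(f"MarStat {category}: {count} ({percent:.1f}%)")
print(age_summary)
```

A mean of the MarStat codes would be meaningless, which is why categorical variables get counts and percentages while continuous ones get means and standard deviations.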
But what if we want to look at more than one variable at a time?
Say, for instance,
the issue of weight and
gender?
Before proceeding, you must log into the Nesstar System
OK… now we want to add gender as a variable.
Opinion of own weight, by sex
Proportionally, more women than men had the opinion that they were “Overweight”.
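The word ‘proportionally’ matters here: the comparison is between percentages within each sex, not raw counts. The sketch below shows one way such column percentages could be computed. The rows are made up for illustration and are NOT results from any survey; the function name is hypothetical.

```python
from collections import Counter, defaultdict

# Made-up (sex, opinion) pairs for illustration only; not survey results.
RESPONSES = [
    ("Female", "Overweight"), ("Female", "Overweight"), ("Female", "Overweight"),
    ("Female", "Just about right"), ("Female", "Underweight"),
    ("Male", "Overweight"), ("Male", "Just about right"),
    ("Male", "Just about right"), ("Male", "Underweight"),
]


def column_percentages(pairs):
    """Cross-tabulate (column, row) pairs and return {column: {row: percent}}.

    Percentages are computed within each column, so each column sums to 100.
    """
    counts = defaultdict(Counter)
    for column, row in pairs:
        counts[column][row] += 1

    table = {}
    for column, row_counts in counts.items():
        total = sum(row_counts.values())
        table[column] = {row: 100.0 * n / total for row, n in row_counts.items()}
    return table


table = column_percentages(RESPONSES)
for sex, rows in table.items():
    for opinion, percent in rows.items():
        print(f"{sex:<8}{opinion:<18}{percent:5.1f}%")
```

With unequal group sizes (five women and four men in this toy data), comparing raw counts would mislead, whereas percentages within each sex are directly comparable.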
OK, but how does this change if we add an ‘objective’ measure of
weight, such as ‘Body Mass Index’ (BMI)?
Start where we left off… ‘opinion of own weight’, by sex
But add another variable as a ‘layer’…
Add ‘BMI class’ as a layer…
Of respondents who were ‘objectively’ underweight, proportionally more women than men had the ‘subjective’ opinion that they were “Just About Right”.
Layer = those with a BMI indicating ‘underweight’
Of respondents who were ‘objectively’ normal weight, proportionally more women than men had the ‘subjective’ opinion that they were “Overweight”.
Layer = those with a BMI indicating ‘normal weight’
Layer = those with a BMI indicating ‘overweight’
Of respondents who were ‘objectively’ overweight, proportionally more MEN than women had the ‘subjective’ opinion that they were “Just About Right”.
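Adding a ‘layer’ amounts to repeating the same two-way table once for each value of a third variable. A minimal sketch of that idea follows; the rows are made up for illustration and do not reproduce the survey results on the slides, and the function name is hypothetical.

```python
from collections import Counter

# Made-up (bmi_class, sex, opinion) rows for illustration only.
ROWS = [
    ("Underweight", "Female", "Just about right"),
    ("Underweight", "Female", "Underweight"),
    ("Underweight", "Male", "Underweight"),
    ("Underweight", "Male", "Underweight"),
    ("Normal weight", "Female", "Overweight"),
    ("Normal weight", "Female", "Just about right"),
    ("Normal weight", "Male", "Just about right"),
    ("Normal weight", "Male", "Just about right"),
]


def layered_percentages(rows, layer):
    """Percent of each opinion within each sex, restricted to one BMI layer."""
    selected = [(sex, opinion) for bmi, sex, opinion in rows if bmi == layer]
    cell_counts = Counter(selected)
    sex_totals = Counter(sex for sex, _ in selected)
    return {
        (sex, opinion): 100.0 * n / sex_totals[sex]
        for (sex, opinion), n in cell_counts.items()
    }


for layer in ("Underweight", "Normal weight"):
    print(f"Layer = {layer}")
    for (sex, opinion), percent in sorted(layered_percentages(ROWS, layer).items()):
        print(f"  {sex:<8}{opinion:<18}{percent:5.1f}%")
```

Each layer has its own denominators, so a pattern that holds in the overall table can shrink, vanish, or reverse inside a particular layer.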
OK, I have a confession to make…
Statistical Weight…
All the previous slides ignored an important concept… that of weight.
Not ‘weight in kilograms’ but rather ‘statistical weight’.
We don’t want to describe the sample… we want to describe the population at large (in this case, Canadians 18+).
Statistical weights are assigned by statisticians, not surprisingly, to each individual in a sample, based on a variety of demographic and sampling considerations. These weights reflect how many people a given respondent ‘represents’ in the population being studied.
Sample count
Statistical weight
Population Estimate
Weight ‘off’: Note the sample sizes
Weight ‘on’: Note the sample sizes
But also note the differences in percentages…
In general, you must apply the Statistical Weight in order to get valid results.
It is easy to turn weight ‘on’ in Nesstar, or in other statistical packages (e.g. SPSS, SAS, Stata).
BUT READ THE DOCUMENTATION
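To see why the percentages shift when weight is turned ‘on’, consider a toy sample in which each respondent carries a statistical weight. The rows and weights below are made up for illustration, and the function is a hypothetical sketch, not how Nesstar or any statistical package implements weighting.

```python
from collections import defaultdict

# Made-up (opinion, statistical_weight) rows; the weights are hypothetical.
SAMPLE = [
    ("Overweight", 500.0),
    ("Overweight", 500.0),
    ("Just about right", 2000.0),
    ("Just about right", 1000.0),
]


def percentages(rows, use_weight):
    """Return ({opinion: percent}, total).

    With use_weight=False each respondent counts once, so the total is the
    sample count. With use_weight=True each respondent counts as many people
    as their weight says they represent, so the total is a population estimate.
    """
    totals = defaultdict(float)
    for opinion, weight in rows:
        totals[opinion] += weight if use_weight else 1.0
    grand_total = sum(totals.values())
    shares = {opinion: 100.0 * t / grand_total for opinion, t in totals.items()}
    return shares, grand_total


unweighted, sample_count = percentages(SAMPLE, use_weight=False)
weighted, population_estimate = percentages(SAMPLE, use_weight=True)

print("Weight off:", unweighted, "n =", sample_count)
print("Weight on: ", weighted, "N =", population_estimate)
```

In this toy sample half of the respondents answered “Overweight”, but they represent only a quarter of the estimated population, so the unweighted figure describes the sample rather than the population.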
They say a picture is worth a thousand words…
If this is true, then a good chart has to be worth at least a couple of hundred…
Let’s revisit our data visually using the ‘bar chart’ feature of Nesstar.
Weight is on
Barcharts showing weighted results:
Proportionally, of those who are objectively underweight, more women than men think they are ‘just about right’
Weight is on
Barcharts showing weighted results:
Proportionally, of those who are objectively normal weight, more women than men think they are overweight
Weight is on
Barcharts showing weighted results:
Proportionally, of those who are objectively overweight, more men than women think they are ‘just about right’
Searching for ‘questions’ in Nesstar: Simple Search
Search results – Simple search
You get all the surveys that have the ‘keyword’ you searched for… but specific questions (variables) are NOT highlighted.
Searching for ‘questions’ in Nesstar: Advanced Search
Advanced Search
Advanced Search Screen
Search results – Advanced search
Here, specific variables that meet the search criteria are shown, with the option of “opening in context”
Menu options:
Barchart
Table
Time series graph
Map
Clear
Weight
Subset
Export to spreadsheet
Download
Export PDF
Create bookmark
Help
OK, so what kind of data can I expect to find using ODESI?
1. Statistics Canada survey files released through the Data Liberation Initiative (Census PUMFs, Special Surveys, General Social Surveys, and more)
2. Public Opinion Polls (e.g. Gallup)
3. Survey files from other sources (academics)
These surveys and polls include questions on all manner of topics (politics, health, work, leisure, education, drug use, aging, spending, internet use, and many more)…
Let’s take a look at some Gallup questions…
Dataset: Canadian Gallup Poll, August 1951, #212
In some cities in Canada, horsemeat is now being sold, because of the high price of other meats. If horsemeat were available here, would you be willing to try it?
35.9% of respondents said “Yes” they’d be willing.
Of course, this question begs for a ‘yea’ or ‘neigh’ answer…
Dataset: Canadian Gallup Poll, September 1956, #251
WOULD YOU FAVOR REQUIRING EVERY ABLE-BODIED YOUNG MAN IN THIS COUNTRY, WHEN HE REACHES THE AGE OF 18, TO SPEND ONE YEAR IN MILITARY TRAINING AND THEN JOIN THE RESERVES OR MILITIA?
65.7% favoured this.
Dataset: Canadian Gallup Poll, August 1953, #231
HOW MUCH DO YOU THINK A YOUNG MAN SHOULD BE EARNING PER WEEK BEFORE HE GETS MARRIED?
Response categories (chart labels): UP TO $40; $41-50; $51-60; $61-70; $71-80; $81-100; OVER $100
$41 - $50 per week equals roughly $2100 - $2600 annually.
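The weekly-to-annual conversion can be checked with a couple of lines. The sketch assumes 52 paid weeks per year, which the slide does not state; the function name is hypothetical.

```python
WEEKS_PER_YEAR = 52  # assumption: paid for every week of the year


def annual_range(weekly_low, weekly_high):
    """Convert a weekly earnings range (dollars) to an annual range."""
    return weekly_low * WEEKS_PER_YEAR, weekly_high * WEEKS_PER_YEAR


low, high = annual_range(41, 50)
print(f"${low} - ${high} per year")
```

The lower bound works out to slightly more than $2100, which is consistent with the slide’s ‘roughly’.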
Dataset: Canadian Gallup Poll, August 1953, #231
THERE'S AN ATTEMPT BEING MADE BY SOME FASHION LEADERS TO SHORTEN WOMEN'S SKIRTS. DO YOU THINK THAT WOMEN SHOULD FOLLOW THIS LEAD - AND WEAR SKIRTS SHORTER THAN THEY ARE NOW?
13% Shorter
82% About the same
5% Longer
Tracking Opinions over time
DO YOU APPROVE OF THE USE OF BIRTH CONTROL?
Question                         Year   % in Favour
Approve of Birth Control?        1960   66.4%
                                 1964   82.1%
                                 1965   78.7%
Approve of Male Sterilization?   1971   48.6%
Key points about survey data in <ODESI>
1. Researchers can search across all surveys in a collection.
2. Researchers can explore surveys in more detail (e.g. looking at questions by gender, province, age group, income, etc.).
3. Tables can be saved in Excel or Adobe PDF format.
4. Researchers can download data for use in more powerful statistical packages (SPSS, SAS, etc.).
In conclusion, ODESI will:
1. Provide a more level ‘data’ playing field for Ontario Universities.
2. Provide students and researchers with access to a substantial and growing body of survey and polling data, both current and historical.
3. Provide an easy, yet powerful, search and exploration tool (Nesstar) that will serve both beginners and ‘power users’.
4. Encourage cooperation and sharing of data and metadata in Ontario.
5. Serve as a potential model for other jurisdictions.
<odesi.ca>