

2006 Public Use Microdata File (PUMF)

Outline

1. Change factors
2. Scenarios: characteristics
3. Analytic content: additions and losses

DLI Ontario Training, Ryerson University, Dec. 13, 2007

Martine Grenier, Mokili Mbuluyo, Jean René Boudreau, Statistics Canada


1. Change Factors

Improvement of the three files’ analytic content for greater use at the national and international levels

Greater accessibility of census data

Data confidentiality constraints
• File size
• Limited geography
• Age “variable”
• Income “variable”

Late release of PUMFs
• Delay due to the heavy workload of selecting, certifying and deriving variables, and of quality control on the files


2. Scenario #1: Status Quo (2001)

Content

1. Sample size
• Individuals: 800,000 records
• Families: 310,000 records
• Households and dwellings: 350,000 records

2. Geography
• Provinces, territories, CMAs

3. Variables
• Variables extracted from the dissemination database
• Large number of derived variables
• Less detailed variables for the Maritime provinces and Northern territories
• Variables repeated in the 3 files

Reduction of disclosure risks
• Substantial disclosure control by the microdata file review committee
• Confidentiality rules applied separately to each file

Production time
• Considerable amount of work for SM analysts to certify derived variables
• 3 years, expected release in 2010?


2. Scenario #2: Single File

Content

1. File size
• Single file: 800,000 records (individuals)
• Some persons will represent a family or a household

2. Geography
• Canada, 5 regions, 5 CMAs with a population of at least one million

3. Variables
• Variables extracted from the dissemination database
• Derived variables of complexity level 4 or which require the use of limited data

Reduction of disclosure risks
• Eliminate values with a Canada frequency of less than 100,000
• Collapse some or all of the age groups
• Round off or generate noise in income components

Production time
• Reduced certification
• Projected release: Summer 2009
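The three disclosure-risk reductions named for this scenario can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration, not the procedure Statistics Canada uses: the field names, the choice to recode rare values to "Other" rather than drop them, the 5-year age bands, the $1,000 rounding base and the 5% noise range are all hypothetical. Only the 100,000 frequency threshold comes from the slide, and it is assumed here to mean a weighted (population-level) frequency.

```python
import random
from collections import Counter

# Threshold from the slide; assumed to apply to weighted frequencies.
MIN_CANADA_FREQ = 100_000
# Hypothetical parameters, chosen only for illustration.
AGE_BAND_WIDTH = 5
INCOME_ROUNDING_BASE = 1_000


def suppress_rare_values(records, field, weight_field="weight", other="Other"):
    """Recode values whose weighted national frequency is below the threshold."""
    totals = Counter()
    for rec in records:
        totals[rec[field]] += rec[weight_field]
    return [
        {**rec, field: rec[field] if totals[rec[field]] >= MIN_CANADA_FREQ else other}
        for rec in records
    ]


def collapse_age(age, width=AGE_BAND_WIDTH):
    """Replace a single year of age with the label of its age group."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def round_income(income, base=INCOME_ROUNDING_BASE):
    """Round an income component to the nearest multiple of `base`."""
    return base * round(income / base)


def perturb_income(income, rng, max_rel_noise=0.05):
    """Add uniform relative noise of at most +/- max_rel_noise."""
    return income * (1 + rng.uniform(-max_rel_noise, max_rel_noise))


# Usage with made-up records: value "A" is common, value "B" is rare.
sample = [{"group": "A", "weight": 37.0} for _ in range(3000)]
sample += [{"group": "B", "weight": 37.0} for _ in range(10)]
protected = suppress_rare_values(sample, "group")
noisy = perturb_income(50_000, random.Random(0))
```

If weights were uniform, a 2.7% sample would give each record a weight of roughly 37, so a Canada frequency of 100,000 would correspond to about 2,700 unweighted records. That uniform weight is an assumption used only to make the example concrete.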


2. Scenario #3: Hierarchical File

Content

1. File size
• Hierarchical file: 350,000 records on households
• All families and persons are included and identified in the household (about 800,000 persons)

2. Geography
• Canada, regions with a population of at least 2 million

3. Variables
• 2B variables from the dissemination database
• Derived variables of complexity level 4 or which require the use of limited data

Reduction of disclosure risks
• Eliminate values with a Canada frequency of less than 100,000
• Collapse age groups
• Round off or generate noise in income components

Production time
• Reduced certification
• Projected release: Summer 2009
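To show what "families and persons identified in the household" could look like in practice, here is a minimal Python sketch of a hierarchical layout. The record structure and the field names (household_id, family_id, person_id, age) are hypothetical; the slides do not describe the actual file layout. The sketch assumes each person record carries the identifiers of its household and, where applicable, its family, so that all three universes can be rebuilt from one file.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PersonRecord:
    household_id: int
    family_id: Optional[int]  # None for persons outside a family (assumption)
    person_id: int
    age: int


def group_by_household(persons):
    """Household universe: all persons sharing a household identifier."""
    households = defaultdict(list)
    for p in persons:
        households[p.household_id].append(p)
    return dict(households)


def group_by_family(persons):
    """Family universe: persons sharing a (household, family) identifier pair."""
    families = defaultdict(list)
    for p in persons:
        if p.family_id is not None:
            families[(p.household_id, p.family_id)].append(p)
    return dict(families)


# Made-up example: one household with a three-person family and a lodger,
# and one single-person household.
persons = [
    PersonRecord(household_id=1, family_id=1, person_id=1, age=41),
    PersonRecord(household_id=1, family_id=1, person_id=2, age=39),
    PersonRecord(household_id=1, family_id=1, person_id=3, age=8),
    PersonRecord(household_id=1, family_id=None, person_id=4, age=25),
    PersonRecord(household_id=2, family_id=None, person_id=1, age=67),
]
households = group_by_household(persons)
families = group_by_family(persons)
household_sizes = {hid: len(members) for hid, members in households.items()}
```

With a layout of this kind, users could derive their own household-level or family-level variables (household size in the example) from person records, instead of relying on variables pre-derived separately for each file.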


3. Analytic Content: additions and losses

| | PUMF-2006 (Status Quo) | PUMF-2006 (Single File) | PUMF-2006 (Hierarchical File) |
|---|---|---|---|
| Content | Size: 2.7% of the population | Size: 2.7% of the population | Size: 2.7% of the population |
| | Independent samples of the three universes | Some people represent a family or a household | All families and persons in households sampled are included |
| | Diverse geographies at the province and CMA levels | Geography limited to provinces and major CMAs (pop. 1 million) | Geography more limited to regions |
| | Families and households well represented | Loss of information about families and households | File representative of households; more varied content including all data |
| | Repetition of variables between the 3 universes; complex derived variables | Variables taken from the questionnaire so that users can create their own derived variables | Variables taken from the questionnaire so that users can create their own derived variables |
| | Analytic content limited to one universe at a time | Analytic content extended to the three universes | Analytic content extended to the three universes; greater potential for analysis and international comparison |
| Production requirements | Certification and production projected for summer 2010 | Production projected for summer 2009 | Production projected for summer 2009 |
| Confidentiality | Suppression level higher than in 2001 | Suppression level lower than in 2001 (fewer geographies) | Same suppression level as in 2001 (fewer geographies) |


Thank you!