Some ACS Data Issues and Statistical Significance (MOEs)


Upload: jacqui

Post on 05-Jan-2016



Page 1: Some ACS Data Issues and Statistical Significance (MOEs)

Some ACS Data Issues and Statistical Significance (MOEs)

Table Release Rules

Statistical Filtering & Collapsing

Disclosure Review Board

Statistical Significance Testing & Margins of Error (MOEs)

Page 2: Some ACS Data Issues and Statistical Significance (MOEs)

Table Release Rules

February 28, 2007

Page 3: Some ACS Data Issues and Statistical Significance (MOEs)

“B” and “C” Tables

Page 4: Some ACS Data Issues and Statistical Significance (MOEs)

Full Table – PASSED FILTERING

Statistically too Small

Page 5: Some ACS Data Issues and Statistical Significance (MOEs)

Collapsed Table

Page 6: Some ACS Data Issues and Statistical Significance (MOEs)

The Census Bureau Story

Why did we collect all this data if we were not going to release it?

Page 7: Some ACS Data Issues and Statistical Significance (MOEs)

ACS Data Release Rules

Doug Hillmer

Data Products Area

American Community Survey Office

U.S. Census Bureau

October 11, 2006

Page 8: Some ACS Data Issues and Statistical Significance (MOEs)

Limitation of Disclosure Risk

– The Census Bureau’s Disclosure Review Board (DRB) must clear all data products prior to their release to the public.

Assurance of Statistical Reliability

– Data users need to be able to use ACS estimates as official Census Bureau data. Thus, some rules must be in place to ensure minimum reliability of estimates.

– Statistical reliability is assured by:

• Population size thresholds below which estimates are not released

• Data release testing and collapsing of tables that fail

The Census Bureau Will Not Release All Available Estimates to the Public

Page 9: Some ACS Data Issues and Statistical Significance (MOEs)

The ACS “Identity Crisis” on Reliability

• Ultimately, the 5-year estimates, with no “data release rules,” act as a long-form replacement

• The single-year ACS sample is more like a current demographic survey – although much larger in size

• Question to answer for single-year estimates: Do we accept less detail in our measures of characteristics, or do we allow more detail but with data release rules in place? Less detail punishes those areas with the diversity to support the detail.

Page 10: Some ACS Data Issues and Statistical Significance (MOEs)

Choices for displaying estimates in ACS data products

No suppression

1. Publish full detail with no suppression but a higher population threshold (e.g., 500,000)

2. Publish a limited set of estimates for all areas with 65,000+ population

3. Publish more detailed estimates for a higher population threshold and a limited set for a lower threshold

With suppression or warnings

4. Define a very detailed set of estimates for all geographic areas with 65,000+ population and suppress estimates that fail the reliability test

5. Define a very detailed set of estimates for all geographic areas with 65,000+ population and flag estimates that fail the reliability test

Page 11: Some ACS Data Issues and Statistical Significance (MOEs)

Filtering <<Data Release Rules>>

• Goal: to identify “weak” tables

• Some tables have many zero or “near zero” cells and relatively large standard errors

• Filtering <<data release>> rule used during 2000-2004 ACS: drop tables if…

– Universe is less than 500 (weighted)

– Average cell size is less than 2 cases (unweighted)

• Filtering <<data release>> rule used now:

– Accept if median coefficient of variation is less than or equal to 61%

– Otherwise, collapse and review again

Page 12: Some ACS Data Issues and Statistical Significance (MOEs)
Page 13: Some ACS Data Issues and Statistical Significance (MOEs)

Why not just use cell suppression as is done for the Economic products?

Advantages

• Gets rid of the “bad” estimates

• Keeps the “good” estimates (depends on complementary suppression)

Disadvantages

• Creates “holes” in distributions

• Makes new problems for combined estimates (e.g., in derived products, such as data profiles)

• Produces a new set of problems for year-to-year comparisons

Page 14: Some ACS Data Issues and Statistical Significance (MOEs)

Data Release Testing – Step by Step

• Compute coefficients of variation

– Coefficient of variation = standard error / estimate

– Standard error = (upper bound – estimate) / 1.65

– If the estimate = 0, set coefficient of variation = 100%

• Ignore total and sub-total lines in base table

• Sort coefficients of variation in descending order

• Find the middle value (the median)

• If the median is greater than 61%, the table FAILS (median > 61% means more than half of the cells have a lower bound of 0; i.e., these cells are not statistically different from 0)

• If the median is 61% or less, the table PASSES
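The data release test on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the (estimate, upper bound) cell layout are hypothetical, not the Census Bureau's implementation, and the caller is assumed to have already dropped total and sub-total lines.

```python
from statistics import median

# Divisor for a 90% confidence bound, as written on the slide.
Z_90 = 1.65


def coefficient_of_variation(estimate, upper_bound):
    """CV = standard error / estimate; a zero estimate is assigned 100%."""
    if estimate == 0:
        return 1.0
    standard_error = (upper_bound - estimate) / Z_90
    return standard_error / estimate


def table_passes(cells, threshold=0.61):
    """cells: list of (estimate, upper_bound) pairs, totals already excluded.

    The table passes when the median CV is 61% or less.
    """
    cvs = [coefficient_of_variation(est, ub) for est, ub in cells]
    return median(cvs) <= threshold


# Hypothetical cells, invented for illustration only.
reliable_table = [(1000, 1165), (500, 665), (0, 120)]
weak_table = [(100, 265), (0, 50), (0, 30)]
print(table_passes(reliable_table), table_passes(weak_table))
```

Using the median rather than the mean keeps a handful of extreme cells from deciding the outcome for the whole table.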

Page 15: Some ACS Data Issues and Statistical Significance (MOEs)

Collapsing

• Goal: release a simplified version of a base table for a geographic area that otherwise would get nothing

• Decisions on design of collapsed tables are made by subject-matter experts at the Census Bureau

• For operational reasons, only one collapsed version of each base table will be available regardless of geographic area

Page 16: Some ACS Data Issues and Statistical Significance (MOEs)

How the Data Release Rules will Work with Collapsed Versions of Base Tables

Page 17: Some ACS Data Issues and Statistical Significance (MOEs)

More About Collapsing

• Collapsed Tables are designed to assure that derived products (profiles, ranking tables, subject tables,…) can still be sourced from the base tables

• 2005 Tables: if a table passes filtering and a collapsed version exists, publish both the original version and the collapsed version for that geographic area

Page 18: Some ACS Data Issues and Statistical Significance (MOEs)

Problems to fix in the current implementation of the data release rules

• Collapsed versions missing in some cases

• Collapsed versions that aren’t working

• Poor choices in “sourcing” for derived products (e.g., profiles)

Page 19: Some ACS Data Issues and Statistical Significance (MOEs)

Statistical Significance Testing

Why should I do it?

When should I do it?

How do I do it?

Page 20: Some ACS Data Issues and Statistical Significance (MOEs)

Testing is Important

Page 21: Some ACS Data Issues and Statistical Significance (MOEs)

Statements you might want to make

• Estimate X is bigger than Y

• Estimate X this year is larger than X last year

• Estimate X is smaller than Census 2000 value

• State Z has the highest value

Page 22: Some ACS Data Issues and Statistical Significance (MOEs)

How do I do a significance test?

1. Get the Margin of Error (MOE) from ACS

2. Calculate the Standard Error (SE) [SE = MOE / 1.645]

3. Solve for Z where A and B are the two estimates

$$Z = \frac{A - B}{\sqrt{[SE(A)]^2 + [SE(B)]^2}}$$

4. If Z < -1.645 or Z > 1.645, the difference is significant at 90% confidence
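The four steps of the significance test on this slide can be sketched in Python as follows. The function names and the sample numbers are made up for illustration; only the 1.645 divisor and the Z formula come from the slide.

```python
import math

# Critical value for 90% confidence, as used on the slide.
Z_CRIT_90 = 1.645


def se_from_moe(moe):
    """Step 2: convert a published 90% margin of error to a standard error."""
    return moe / Z_CRIT_90


def z_score(a, moe_a, b, moe_b):
    """Step 3: Z = (A - B) / sqrt(SE(A)^2 + SE(B)^2)."""
    se_a = se_from_moe(moe_a)
    se_b = se_from_moe(moe_b)
    return (a - b) / math.sqrt(se_a ** 2 + se_b ** 2)


def is_significant(a, moe_a, b, moe_b):
    """Step 4: significant at 90% confidence if Z < -1.645 or Z > 1.645."""
    return abs(z_score(a, moe_a, b, moe_b)) > Z_CRIT_90


# Hypothetical estimates and MOEs, not taken from any ACS table.
print(is_significant(5000, 300, 4500, 350))  # larger gap between estimates
print(is_significant(5000, 300, 4900, 350))  # smaller gap, same MOEs
```

The sketch assumes both MOEs are 90% margins and that at least one of them is non-zero; otherwise the denominator would be zero.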

Page 23: Some ACS Data Issues and Statistical Significance (MOEs)

Obtaining Standard Errors is the Key

Simple Formulas

• Sum or Difference of Estimates

$$SE(A \pm B) = \sqrt{[SE(A)]^2 + [SE(B)]^2}$$

• Proportions and Percents

$$SE(P) = \frac{1}{B}\sqrt{[SE(A)]^2 - P^2\,[SE(B)]^2}$$

Where $P = \frac{A}{B}$

• Means and Other Ratios
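A short Python sketch of the two simple formulas on this slide: the standard error of a sum or difference, and the standard error of a proportion P = A/B. It assumes the proportion form subtracts the P² term under the square root, which matches the Census Bureau's published ACS guidance. The fallback to the ratio form (a plus sign) when that quantity turns negative also follows that guidance rather than this slide. Function names are illustrative.

```python
import math


def se_sum_or_difference(se_a, se_b):
    """SE(A +/- B) = sqrt(SE(A)^2 + SE(B)^2)."""
    return math.sqrt(se_a ** 2 + se_b ** 2)


def se_proportion(a, se_a, b, se_b):
    """SE(P) = (1/B) * sqrt(SE(A)^2 - P^2 * SE(B)^2), where P = A / B.

    Assumption: when the value under the root is negative, switch to the
    ratio form, which adds the P^2 term instead of subtracting it.
    """
    p = a / b
    under_root = se_a ** 2 - (p ** 2) * (se_b ** 2)
    if under_root < 0:
        under_root = se_a ** 2 + (p ** 2) * (se_b ** 2)
    return math.sqrt(under_root) / b


# Hypothetical inputs chosen so the arithmetic is easy to follow by hand.
print(se_sum_or_difference(3, 4))
print(se_proportion(50, 5, 100, 6))
```

Once a standard error has been derived this way, multiplying it by 1.645 gives the 90% margin of error for the derived estimate.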

Page 24: Some ACS Data Issues and Statistical Significance (MOEs)

There is HELP off in the wings

Page 25: Some ACS Data Issues and Statistical Significance (MOEs)

But what if I am using 2000 non-ACS data?

Where are my MOEs?

Page 26: Some ACS Data Issues and Statistical Significance (MOEs)
Page 27: Some ACS Data Issues and Statistical Significance (MOEs)

Let's get to work on the Standard Error

$$SE(Y) = \sqrt{5Y\left(1 - \frac{Y}{N}\right)} \times \text{Survey Design Factor}$$

N = Size of publication area (population)

Y = Estimate of characteristic

Page 28: Some ACS Data Issues and Statistical Significance (MOEs)

Survey Design Factor

www.census.gov/prod/cen2000/doc/tablec-xx.pdf (xx = fl)

Mode to Work 1.4 1.2 0.9 0.7

Page 29: Some ACS Data Issues and Statistical Significance (MOEs)

$$SE(Y) = \sqrt{5Y\left(1 - \frac{Y}{N}\right)}$$

N = Size of publication area (population = 362,563)

Y = Estimate of characteristic (126,540)

5Y = 5 × 126,540 = 632,700

1 − (Y/N) = 1 − (126,540 / 362,563) = 1 − 0.3490152 = 0.6509848

SE = √(632,700 × 0.6509848) = 641.7772

Page 30: Some ACS Data Issues and Statistical Significance (MOEs)

$$SE(Y) = \sqrt{5Y\left(1 - \frac{Y}{N}\right)} \times \text{Survey Design Factor}$$

SE = 641.777

126,540 / 362,563 = 35%

Survey Design Factor = 0.7

Final Adjusted SE = 641.777 × 0.7 ≈ 450
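The worked Census 2000 example on these slides can be reproduced with a small Python sketch. The function names are illustrative. The inputs (Y = 126,540, N = 362,563, design factor 0.7) are the ones shown on the slides, and the design factor is assumed to have been looked up by hand in the state's table.

```python
import math


def census2000_unadjusted_se(y, n):
    """Unadjusted standard error: sqrt(5 * Y * (1 - Y / N))."""
    return math.sqrt(5 * y * (1 - y / n))


def census2000_adjusted_se(y, n, design_factor):
    """Unadjusted standard error multiplied by the survey design factor."""
    return census2000_unadjusted_se(y, n) * design_factor


Y = 126_540   # estimate of the characteristic (from the slide)
N = 362_563   # population of the publication area (from the slide)

unadjusted = census2000_unadjusted_se(Y, N)
adjusted = census2000_adjusted_se(Y, N, design_factor=0.7)

# Roughly 641.78 unadjusted; roughly 449 adjusted (shown as 450 on the slide).
print(round(unadjusted, 2), round(adjusted, 1))
```

With a standard error in hand for the 2000 value, the same Z test used for two ACS estimates applies to an ACS versus Census 2000 comparison.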

Page 31: Some ACS Data Issues and Statistical Significance (MOEs)
Page 32: Some ACS Data Issues and Statistical Significance (MOEs)
Page 33: Some ACS Data Issues and Statistical Significance (MOEs)
Page 34: Some ACS Data Issues and Statistical Significance (MOEs)

Tempting

Green is OK

This is NOT

Page 35: Some ACS Data Issues and Statistical Significance (MOEs)

Want to do an exercise on your own?