
Michelle Simard

Joint UNECE/Eurostat Work Session on Statistical Data Confidentiality

Tarragona, Spain,

November 23rd, 2011

Progress on Real Time Remote Access

23-04-20 Statistics Canada • Statistique Canada

Since 2009

Developed a Prototype offering tabulated counts

Developed Statistical Disclosure Control (SDC)

Continued development on different fronts

The Prototype

[Diagram: RTRA system architecture. Labelled components: Researcher, EFT, E-FT Net A inbox, E-FT Net B inbox, Air Gap, RTRA Server, SAS Server, Data, CRMS Metadata, Authorization, Confidentiality.]

The Prototype

Spring 2010

• Tabular (counts) outputs only, SAS only
• Modified PROC FREQ, DATA steps
• Limited to particular household survey data sets
• Confidentiality automated, no manual intervention
• Limited to some Canadian federal departments only
• Ability to query RTRA microdata at any time
• Access from any computer with internet access, using a secure username and password
• No travel to Research Data Centres

The Prototype

• Minimum 4 minutes plus process time
• Maximum 3 hours plus process time
• Email notification for outputs, with 7-day retention
• Formatted table in HTML or in SAS

Current SDC

Additive and Controlled Rounding (ACROUND)

• Create a rounded additive table close to the original table, with a controlled grand total → semi-controlled rounding
• Use an iterative process to improve the semi-controlled result → controlled rounding
• Protects against possible matching of information with the PUMF (Public Use Microdata File), with a small impact on precision
• Maximum: 5 dimensions
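The idea of an additive rounded table can be illustrated with a much simpler scheme than ACROUND. The sketch below is not the ACROUND algorithm: it applies plain unbiased random rounding to the interior cells and rebuilds the margins from the rounded cells, so the table is additive but its grand total is not controlled. The base of 5 and all function names are assumptions made for illustration; the slides do not state them.

```python
import random

BASE = 5  # assumed rounding base; not stated in the slides


def random_round(count, base=BASE, rng=random):
    """Unbiased random rounding of a non-negative count to a multiple of base."""
    remainder = count % base
    if remainder == 0:
        return count
    lower = count - remainder
    # Round up with probability remainder/base so the expected value equals count.
    return lower + base if rng.random() < remainder / base else lower


def additive_rounded_table(cells, base=BASE, seed=None):
    """Round interior cells, then rebuild all margins from the rounded cells."""
    rng = random.Random(seed)
    rounded = [[random_round(c, base, rng) for c in row] for row in cells]
    row_totals = [sum(row) for row in rounded]
    col_totals = [sum(col) for col in zip(*rounded)]
    grand_total = sum(row_totals)
    return rounded, row_totals, col_totals, grand_total
```

Because the margins are sums of rounded cells, the grand total here can drift away from the original total. Keeping it close to the original is what the semi-controlled and controlled steps described on the slide add on top.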

Recent Development

PROC PERCENTILE

Release the percentile only if

1. there are at least n1 observations ≥ the percentile value and at least n2 observations ≤ the percentile value

2. it is ≠ the minimum or maximum value

3. the total number of unweighted observations is ≥ m

4. the rounded frequency (from ACROUND) associated with the percentile is ≠ 0
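The four percentile release rules can be read as a single boolean check. The following is a minimal sketch under stated assumptions: the function name and argument layout are hypothetical, `values` is taken to be the unweighted observations in the domain, and the thresholds n1, n2 and m must be supplied by the caller because the slides do not give their values.

```python
def percentile_releasable(values, percentile_value, rounded_freq, n1, n2, m):
    """Return True only if all four release rules from the slide hold.

    values           -- unweighted observations in the domain (assumed input)
    percentile_value -- the percentile being considered for release
    rounded_freq     -- ACROUND-rounded frequency associated with the percentile
    n1, n2, m        -- thresholds; their actual values are not in the slides
    """
    if not values:
        return False
    n_at_or_above = sum(1 for v in values if v >= percentile_value)
    n_at_or_below = sum(1 for v in values if v <= percentile_value)
    rule1 = n_at_or_above >= n1 and n_at_or_below >= n2
    rule2 = percentile_value != min(values) and percentile_value != max(values)
    rule3 = len(values) >= m
    rule4 = rounded_freq != 0
    return rule1 and rule2 and rule3 and rule4
```

For example, with twenty observations 1 to 20 and illustrative thresholds n1 = n2 = 5 and m = 10, the value 10 passes all four rules, while the value 20 fails rule 2 because it is the maximum.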

Recent Development

PROC MEAN

Release the mean only if

1. there are at least n3 observations present in the domain

2. the rounded frequency associated with the mean (from ACROUND) is ≠ 0

For both procedures (percentile and mean), “magnitude” rounding will be applied to the statistics to balance precision and noise
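The two release rules for the mean can be sketched as a function that returns the mean when it is releasable and `None` when it should be suppressed. The function name, the use of `None` for suppression and the example threshold are assumptions; n3 is not given in the slides, and the “magnitude” rounding step is left out because the slides do not define it.

```python
def release_mean(values, rounded_freq, n3):
    """Return the mean if both release rules hold, otherwise None.

    values       -- observations in the domain (assumed input)
    rounded_freq -- ACROUND-rounded frequency associated with the mean
    n3           -- minimum number of observations; value not in the slides
    """
    enough_observations = len(values) >= n3   # rule 1
    nonzero_rounded_freq = rounded_freq != 0  # rule 2
    if enough_observations and nonzero_rounded_freq:
        return sum(values) / len(values)
    return None  # caller decides how to display a suppressed value
```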

Challenges and Issues

• Not only balancing confidentiality and precision, but quality measures as well
• Evaluating the risk
• Displaying information (what and how): statistics, standard error (SE), variance, coefficient of variation (CV), confidence interval (CI), quality indicator (QI), weighted counts, unweighted counts, ACROUND outputs?

Quality Measures

Release the estimate with its SE and a Quality Indicator (QI). If the estimate is not releasable, put ‘X’s or another symbol; otherwise release the SE and QI as follows:

Value of CV*          Quality Indicator    Guideline
0 ≤ C.V. ≤ 0.10       (a)                  very good
0.10 < C.V. ≤ 0.20    (b)                  acceptable
0.20 < C.V. ≤ 0.35    (c)                  marginal
C.V. > 0.35           (d)                  very poor

*Note: CV is calculated from the original non-rounded S.E. and percentile
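The CV thresholds in the table map directly onto a small lookup. This is a sketch, not the RTRA implementation: the function names are hypothetical, and the CV is computed with the usual definition (standard error divided by the absolute value of the estimate), using the non-rounded values as the note requires.

```python
def cv_from_se(estimate, standard_error):
    """Coefficient of variation from the non-rounded estimate and SE."""
    if estimate == 0:
        raise ValueError("CV is undefined for a zero estimate")
    return standard_error / abs(estimate)


def quality_indicator(cv):
    """Map a CV to the (letter, guideline) pair from the slide's table."""
    if cv < 0:
        raise ValueError("CV must be non-negative")
    if cv <= 0.10:
        return "a", "very good"
    if cv <= 0.20:
        return "b", "acceptable"
    if cv <= 0.35:
        return "c", "marginal"
    return "d", "very poor"
```

For instance, an estimate of 200 with a standard error of 50 has a CV of 0.25, which falls in band (c), marginal.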


Next steps

• Used output control techniques rather than input control techniques
• Next step: proportions, ratios, totals, models
• May need input control techniques when going into modeling
• Expansion to the academic community
• Expansion to Censuses, then administrative data
• Streamlining the approval processes
• Developing a “fee” structure and “penalty” processes


THANK YOU

For more information, please contact:

Michelle Simard

[email protected]