Statistical Disclosure Control Philip Johnston, Information Services Division, NHSNSS ScotPHO training course, 1 April 2011

Page 1: Statistical Disclosure Control

Statistical Disclosure Control

Philip Johnston, Information Services Division, NHSNSS

ScotPHO training course, 1 April 2011

Page 2: Statistical Disclosure Control

Statistical Disclosure Control

• background / definitions

• ISD practice

• a high profile case

• some examples

Page 3: Statistical Disclosure Control

Legislation & standards

• Data Protection Act

• Caldicott recommendations

• NHS code of confidentiality

• statistical disclosure guidance (eg ONS)

• Official Statistics code(s) of practice

• public trust

• case law

Page 4: Statistical Disclosure Control

What is ‘disclosure’?

• When confidential information about a person/body is released, either directly or indirectly, in breach of public trust or legal obligations

• [ Even aggregate tables risk revealing confidential information ]

Page 5: Statistical Disclosure Control

What is ‘statistical disclosure control’ (SDC)?

• The practice of reducing the risk of disclosure by suppressing, aggregating or modifying data before release

Page 6: Statistical Disclosure Control

Types of disclosure

• Identification – ‘that’s you’

• Self-identification – ‘that’s me’

• Attribute disclosure – learn something new

• Motivated intruder – might target an individual

• ‘Differencing’ – combining different sources
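Differencing can be illustrated with a small sketch. The counts, area names and the threshold of 5 below are hypothetical and are not taken from any ISD release; the point is only that two tables which each look safe can imply a small count when one is subtracted from the other.

```python
# Hypothetical counts from two separately released tables (illustrative only).
release_a = {"Area 1": 25, "Area 2": 40}   # patients aged 0-15
release_b = {"Area 1": 24, "Area 2": 31}   # patients aged 0-14

def difference_tables(wider, narrower):
    """Subtract the narrower table from the wider one, area by area."""
    return {area: wider[area] - narrower[area] for area in wider}

# Implied counts for patients aged exactly 15.
implied = difference_tables(release_a, release_b)

# The threshold of 5 is an assumption made for this sketch.
small = {area: n for area, n in implied.items() if 0 < n < 5}

print(implied)
print(small)
```

Neither release contains a small number, yet together they single out one 15 year old in Area 1.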

Page 7: Statistical Disclosure Control

‘Individual Attribute’ Disclosure

Health Board X, Treatment Y

                         Age group
Treatment Type   31-35   36-40   41-45   46-50   Total
Type 1               1       0       7       1       9
Type 2               0       0      18      19      37
Type 3               0      12       5       0      17
Total                1      12      30      20      63

Page 8: Statistical Disclosure Control

‘Group Attribute’ Disclosure

Health Board X, Treatment Y

                         Age group
Treatment Type   31-35   36-40   41-45   46-50   Total
Type 1               1       0       7       0       8
Type 2               0       0      18      20      38
Type 3               0      12       5       0      17
Total                1      12      30      20      63
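One way to spot group attribute disclosure is to look for age groups whose whole total sits in a single treatment type, since anyone known to be in that age group is then known to have had that treatment. The sketch below applies that check to the table on this slide; the function name is an illustrative choice, not part of any ISD tool.

```python
age_groups = ["31-35", "36-40", "41-45", "46-50"]

# Counts from the 'Group Attribute' Disclosure table (rows are treatment types).
table = {
    "Type 1": [1, 0, 7, 0],
    "Type 2": [0, 0, 18, 20],
    "Type 3": [0, 12, 5, 0],
}

def single_category_columns(table, columns):
    """Return {column: (row, count)} for columns whose whole total is in one row."""
    flagged = {}
    for j, col in enumerate(columns):
        nonzero = [(row, counts[j]) for row, counts in table.items() if counts[j] > 0]
        if len(nonzero) == 1:
            flagged[col] = nonzero[0]
    return flagged

flagged = single_category_columns(table, age_groups)
for col, (row, count) in flagged.items():
    print(f"Age group {col}: all {count} patient(s) received {row}")
```

Only the 41-45 age group is spread across more than one treatment type; the other three age groups each reveal the treatment of every patient in them.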

Page 9: Statistical Disclosure Control

Self Identification

Health Board X, Treatment Y

                    Age group
           <16   16-30   31-60   >60   Total
Drug 1       6      20      10     1      37
Drug 2       7      25      18    10      60
Drug 3       9      40      25     8      82
Total       22      85      53    19     179

Page 10: Statistical Disclosure Control

When is SDC needed?

The level of control needed depends on the risk of disclosure

The risk of disclosure depends on:

• who wants the data

• the data in question

• what they are going to do with it

We need only take account of what is reasonably likely to happen, not what is hypothetically possible.

Page 11: Statistical Disclosure Control

aspects of ISD

• 500 staff, varied grades/types/roles

• 100 data sets

• per year:
  - 100 publications
  - 3000 information requests (incl. FoIs)
  - 500 Parliamentary Questions

• a range of users, eg: SG, NHS Boards, Local Authorities, researchers, media/public

Page 12: Statistical Disclosure Control

ISD practice

• ISD guidance/practice has evolved in recent years

• Informed by ONS guidance on abortions (2005), health (2006)

• ISD SDC Protocol, first issued in March 2009

• Applies to publications, information requests, FOIs, PQs, ‘management information’

Page 13: Statistical Disclosure Control
Page 14: Statistical Disclosure Control

ISD practice - risk assessment

• based on:
  - likelihood of an attempt at disclosure
  - impact of disclosure

• consider:
  - cell values and table design
  - is the subject ‘sensitive’?
  - size of ‘population at risk’
  - geography / institutions / practitioners
  - judgement – no magic formula

Page 15: Statistical Disclosure Control

ISD practice – SDC methods

preferred current ISD methods:

• table re-design (eg aggregation)

• cell suppression (primary / secondary)

then consider:

• rounding

other methods (discuss first with Head of Stats team):

• adjusting cell values (e.g. Barnardisation), database modification
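The difference between primary and secondary suppression can be sketched as follows. This is a simplified heuristic under stated assumptions, not the ISD protocol: the table is hypothetical, the threshold of 5 is an assumption, and row and column totals are assumed to be published. Primary suppression hides small non-zero cells; secondary suppression then hides further cells so that no row or column is left with exactly one hidden cell that could be recovered by subtraction from its total.

```python
def suppress(table, threshold=5):
    """Return a copy of `table` with suppressed cells shown as '*'."""
    n_rows, n_cols = len(table), len(table[0])
    # Primary suppression: non-zero counts below the (assumed) threshold.
    hidden = [[0 < v < threshold for v in row] for row in table]
    rows = [[(i, j) for j in range(n_cols)] for i in range(n_rows)]
    cols = [[(i, j) for i in range(n_rows)] for j in range(n_cols)]

    def is_hidden(cell):
        return hidden[cell[0]][cell[1]]

    changed = True
    while changed:
        changed = False
        # Second element: 1 if the crossing lines are columns, 0 if they are rows.
        lines = [(line, 1) for line in rows] + [(line, 0) for line in cols]
        for line, cross_axis in lines:
            masked = [c for c in line if is_hidden(c)]
            visible = [c for c in line if not is_hidden(c)]
            if len(masked) != 1 or not visible:
                continue  # nothing recoverable by subtraction in this line

            def cost(cell):
                # Prefer a cell whose crossing line already has a hidden cell,
                # then the smallest value, to limit the loss of information.
                crossing = cols[cell[1]] if cross_axis == 1 else rows[cell[0]]
                helps_twice = any(is_hidden(c) for c in crossing)
                return (not helps_twice, table[cell[0]][cell[1]])

            i, j = min(visible, key=cost)
            hidden[i][j] = True  # secondary suppression
            changed = True

    return [["*" if hidden[i][j] else table[i][j] for j in range(n_cols)]
            for i in range(n_rows)]

# Hypothetical counts; only the 3 is below the threshold.
counts = [
    [3, 12, 20],
    [8, 15, 9],
    [11, 7, 30],
]
for row in suppress(counts):
    print(row)
```

Both kinds of suppression are printed with the same symbol, so a reader cannot tell which cell was the small one. The heuristic only guards against exact recovery by subtraction within a single row or column; it does not check how narrow the feasible range for a hidden cell is, which a fuller audit would consider.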

Page 16: Statistical Disclosure Control

ISD practice – a few points

• where possible, discuss options with customer

• if SDC is used then provide some explanation for users

• primary suppression should not be distinguishable from secondary suppression

• document the rationale – shows that thought has been applied

• ‘management information’ – risks are usually lower

• wider question? – what do ‘small numbers’ mean?

Page 17: Statistical Disclosure Control

Getting the balance right

‘Risk’- keep data safe

‘Utility’- exploit the data

‘maximise utility while reducing risk to acceptable level’

Page 18: Statistical Disclosure Control

Other organisations?

• … have different procedures

• eg Scottish Government, GROS, England NHS Information Centre

• NHS Boards?

Page 19: Statistical Disclosure Control

An FoI example - childhood leukaemia (1)

• on behalf of an MSP [11/01/05]:

“Please supply me with details of all incidents of leukaemia for both sexes in the age range 0-14 by year from 1990-2003 for all the DG [Dumfries and Galloway] postal area by census ward.”

Page 20: Statistical Disclosure Control

An FoI example - childhood leukaemia (2)

14 separate years

18 cases (aged 0-14)

47 census wards
  - average area ~50 sq. miles
  - average population ~550
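The figures on this slide imply a very sparse table. The short calculation below works through the arithmetic, assuming one cell per ward per year (a further split by sex, which the request may have intended, would only make the table sparser).

```python
years = 14
wards = 47
cases = 18

cells = years * wards              # number of ward-by-year cells requested
mean_per_cell = cases / cells      # average number of cases per cell
min_zero_cells = cells - cases     # at most 18 cells can be non-zero

print(f"{cells} cells, {mean_per_cell:.3f} cases per cell on average")
print(f"at least {min_zero_cells} cells are zero")
```

With 658 cells and only 18 cases, at least 640 cells are zero and every non-zero cell holds a very small count for a ward of about 550 people.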

Page 21: Statistical Disclosure Control

An FoI example - childhood leukaemia (3)

• ISD refused to release the information: a risk of identification of patients and therefore in breach of the Data Protection Act

• Customer appealed to Scottish Information Commissioner (SIC): SIC said that data should be released

• ISD appealed to Court of Session: Court said that data should be released

• ISD appealed to House of Lords

Page 22: Statistical Disclosure Control

An FoI example - childhood leukaemia (4)

• House of Lords judged that SIC should reconsider the case.

• Dr Adam Bryson, (then) Medical Director of NHS NSS, said [June 2008]:

“Our motive (…) has been to secure clarity on the legal position regarding a serious issue that potentially impacts on the rights to privacy of each of the 60 million people in the UK. (…)”

• SIC published decision notice in 2010

Page 23: Statistical Disclosure Control

Example A:

• extract from a publication table

• safe to publish?

Number of inpatient episodes, length of stay and average length of stay; year ending 31 March 2010

All Admission Types

NHS Board of Treatment            Episodes   Stay (days)   Average length of stay
Scotland                           962,859     5,088,990                      5.3
NHS Ayrshire and Arran              66,845       338,374                      5.1
NHS Borders                         18,225       110,723                      6.1
NHS Dumfries and Galloway           25,156       153,906                      6.1
NHS Fife                            50,121       230,126                      4.6
NHS Forth Valley                    39,365       191,880                      4.9
NHS Greater Glasgow and Clyde      270,149     1,488,123                      5.5
NHS Grampian                        89,931       511,314                      5.7
NHS Golden Jubilee                  16,048        47,820                      3.0
NHS Highland                        48,181       272,603                      5.7
NHS Lanarkshire                    100,408       465,607                      4.6
NHS Lothian                        150,347       797,572                      5.3
NHS Orkney                           2,007        13,317                      6.6
NHS Shetland                         2,264        12,340                      5.5
NHS Tayside                         79,649       418,098                      5.2
NHS Western Isles                    4,132        37,090                      9.0

(… notes …)

Page 24: Statistical Disclosure Control

Example B: ILLUSTRATIVE DATA

Teenage Pregnancies - Scotland, 2009

Age (conception) <16

Area of Residence          Delivery   Aborted
All areas                       293       420
Ayrshire & Arran                 36        37
Borders                           3         4
Dumfries & Galloway               6         5
Fife                             22        35
Forth Valley                     20        20
Grampian                         20        34
Greater Glasgow & Clyde          57        92
Highland                         15        17
Lanarkshire                      38        51
Lothian                          37        71
Tayside                          32        41
Orkney                            0         1
Shetland                          1         1
Western Isles                     1         2
Other/NK                          5         9

• extract from a draft publication table

• safe to publish?
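A first pass over a draft table like this one is simply to list the small non-zero cells. The sketch below does that for the illustrative Example B figures; the thresholds of 5 and 10 are assumptions chosen for comparison, not values stated in the slides, and the final decision still rests on the judgement described under risk assessment.

```python
# Illustrative Example B figures: area -> (deliveries, abortions), under 16s.
example_b = {
    "Ayrshire & Arran": (36, 37), "Borders": (3, 4),
    "Dumfries & Galloway": (6, 5), "Fife": (22, 35),
    "Forth Valley": (20, 20), "Grampian": (20, 34),
    "Greater Glasgow & Clyde": (57, 92), "Highland": (15, 17),
    "Lanarkshire": (38, 51), "Lothian": (37, 71),
    "Tayside": (32, 41), "Orkney": (0, 1), "Shetland": (1, 1),
    "Western Isles": (1, 2), "Other/NK": (5, 9),
}

def small_cells(table, threshold):
    """List (area, column, count) for non-zero counts below the threshold."""
    columns = ("Delivery", "Aborted")
    return [(area, col, n)
            for area, counts in table.items()
            for col, n in zip(columns, counts)
            if 0 < n < threshold]

for threshold in (5, 10):  # assumed thresholds, for comparison only
    flagged = small_cells(example_b, threshold)
    print(f"threshold {threshold}: {len(flagged)} small cells")
```

The stricter the threshold, the more cells need attention, which is one reason sensitive topics may be handled differently from routine activity tables such as Example A.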

Page 25: Statistical Disclosure Control

Example C:

• Parliamentary Question on ‘stab wounds’ (extract)

• ISD draft answer:

Table 1: Number of emergency hospital admissions due to assault by sharp object¹ in 0-17 and 18+ year olds, by council area of residence; discharged during financial years 2002/2003 to 2006/2007

Age Group   Council Area of residence   2002/2003   2003/2004   2004/2005   2005/2006   2006/2007
0-17        Council 1                           1           1           1           1           1
            Council 2                           -           1           2           1           -
            Council 3                           3           -           -           -           -
            Council 4                           1           3           -           2           1
            Council 5                          10           5           5          10           7
            Council 6                           1           -           -           -           -

Page 26: Statistical Disclosure Control

Example C:

• differing views

• SG final answer, incl secondary suppression:

Table 1: Number of emergency hospital admissions due to assault by sharp object¹ in 0-17 and 18+ year olds, by council area of residence; discharged during financial years 2002/2003 to 2006/2007

Age Group   Council Area of residence   2002/2003   2003/2004   2004/2005   2005/2006   2006/2007
0-17        Council 1                           *           *           *           *           *
            Council 2                           *           *           *           *           *
            Council 3                           *           *           *           *           *
            Council 4                           *           *           *           *           *
            Council 5                          10           5           5          10           7
            Council 6                           *           *           *           *           *

Page 27: Statistical Disclosure Control

Example D:

• Freedom of Information request:

• ‘The number of registrations of cancers (all cancers combined, ICD-10 C00-C96 excluding C44) for each of the last 10 years for which you have published registrations - 1997-2006, for the postcode XX99 9XX.’

• safe to release the information?

Page 28: Statistical Disclosure Control

Example D:

• Freedom of Information request:

• ‘The number of registrations of cancers (all cancers combined, ICD-10 C00-C96 excluding C44) for each of the last 10 years for which you have published registrations - 1997-2006, for the postcode XX99 9XX.’

• there were 0 cases in each year

• safe to release the information?

Page 29: Statistical Disclosure Control

Example D:

• ISD answer (part):

• ‘We are unable to provide this as to do so would breach the Data Protection Act (DPA) 1998. We consider the information requested to be personal data, as defined in section 1(1) of the act, and by the first Data Protection Principle. This is because of the small numbers of health events recorded at unit postcode level. We believe that there would be risk of disclosing personal information if we were to provide the information you seek in the form requested. The Freedom of Information (Scotland) Act 2002 allows an exemption from release for personal data under section 38.’

Page 30: Statistical Disclosure Control

Example E:

• Freedom of Information request (part):

• ‘a breakdown, by NHS Board, by age, of every girl below 16 who had an abortion in the last 12 months, including a breakdown of how many were miscarriages or induced by surgical procedure’

Page 31: Statistical Disclosure Control

Example E:

• ISD answer (part):

Deliveries and abortions in under 16 year olds; by NHS Board of Residence and age at conception; Year of conception 2007

                                          Deliveries                        Abortions
                                          Age at conception                 Age at conception
NHS Board of Residence                    Under 14    14    15   Total      Under 14    14    15   Total
Scotland                                         6    56   247     309            15   103   324     442
NHS Ayrshire & Arran                             *     *    25      34             *     *    26      37
NHS Borders                                      *     *     *       *             *     *     *       *
NHS Dumfries & Galloway                          *     *     *       *             *     *     *       *
NHS Fife                                         *     *    18      24             *     *     *      34
NHS Forth Valley                                 *     *     *      18             *     *     *      20
NHS Grampian                                     *     *     *      16             *     *    31      46
NHS Greater Glasgow & Clyde                      *     *    68      84             *     *    68      96
NHS Highland                                     *     *     *      10             *     *     *      24
NHS Lanarkshire                                  *     *    30      40             *     *    36      52
NHS Lothian                                      *     *     *      32             *     *     *      62
NHS Orkney                                       *     *     *       *             *     *     *       *
NHS Shetland                                     *     *     *       *             *     *     *       *
NHS Tayside                                      *     *    29      34             *     *    36      51
NHS Western Isles                                *     *     *       *             *     *     *       *
NHS Board not known / outwith Scotland           *     *     *       *             *     *     *       *

(...notes...)
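A released table with suppressed cells can be checked against its published totals. The sketch below takes the board-level Total columns from the Example E answer, with None standing for a suppressed cell, and works out what an outsider could learn by subtraction. The helper name is an illustrative choice.

```python
# Board-level totals from the Example E answer, in table order.
# None marks a suppressed cell; the Scotland totals are published.
deliveries = [34, None, None, 24, 18, 16, 84, 10, 40, 32, None, None, 34, None, None]
abortions = [37, None, None, 34, 20, 46, 96, 24, 52, 62, None, None, 51, None, None]

def residual(grand_total, parts):
    """Return (sum hidden behind suppressed cells, number of suppressed cells)."""
    published = sum(p for p in parts if p is not None)
    n_hidden = sum(1 for p in parts if p is None)
    return grand_total - published, n_hidden

deliveries_check = residual(309, deliveries)
abortions_check = residual(442, abortions)
print(deliveries_check)
print(abortions_check)
```

In each column the six suppressed boards share a known combined total, but no single board's figure can be recovered exactly. Had only one board been suppressed, its value would follow directly from the Scotland total, which is why secondary suppression is needed.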

Page 32: Statistical Disclosure Control

Summary

• ‘disclosure’ is important … but there is a balance with utility – we want the data to be used, wherever appropriate

• now part of ISD daily work

• ISD outputs now more consistent, more secure