![Page 1: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/1.jpg)
Preserving privacy in data sharing
Darren Toh, ScD
Department of Population Medicine
Harvard Medical School & Harvard Pilgrim Health Care Institute
December 7, 2017
![Page 2: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/2.jpg)
Disclosures
• The work presented here is/was supported by
  • Patient-Centered Outcomes Research Institute (ME-1403-11305)
  • Office of the Assistant Secretary for Planning and Evaluation
  • Food and Drug Administration (HHSF223200910006I)
  • National Institutes of Health (U01EB023683)
  • Agency for Healthcare Research and Quality (R01HS019912)
• All statements in this presentation, including its findings and conclusions, are solely those of the authors and do not necessarily represent the views of AHRQ, ASPE, FDA, NIH, PCORI, or PCORI’s Board of Governors or Methodology Committee
![Page 3: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/3.jpg)
Overview
• Data sharing in multi-center studies
  • Benefits and challenges
• Ways to facilitate data sharing while protecting privacy
  • Stakeholders’ views on data sharing
  • Use of distributed data networks
  • Use of privacy-protecting analytic and data-sharing methods
• Discussion
![Page 5: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/5.jpg)
Multi-database studies
• Many studies are now done in multi-database settings
![Page 6: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/6.jpg)
Benefits of multi-database studies
• Larger sample sizes
  • Allow studies of rare treatments or rare outcomes
  • Allow studies in specific subpopulations
  • Allow studies to be done more quickly
• More diverse populations
  • Allow more generalizable findings
  • Allow assessment of treatment effect heterogeneity
![Page 7: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/7.jpg)
Types of data shared
• Insurance claims
• Electronic health records (inpatient & outpatient)
• Registries (e.g., birth, immunization, disease, treatment)
• Genomic data
• Patient-generated data
![Page 8: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/8.jpg)
Multi-database studies
Analysis center
Site 1
Site 2
Site 3
![Page 10: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/10.jpg)
Multi-database studies
Analysis center
Site 1
Site 2
Site 3
Pooling the entire databases
![Page 11: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/11.jpg)
Concerns about data sharing
• Patient privacy and confidentiality
• Data security
• Unauthorized use of data
• Inaccurate analysis or interpretation of data
• Disclosure of sensitive institutional or corporate info
• Contractual restrictions
![Page 12: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/12.jpg)
Data sharing in multi-database studies
Data we need to conduct the desired analysis
What data partners are willing or able to share
![Page 13: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/13.jpg)
Overview
• Data sharing in multi-center studies
  • Benefits and challenges
• Ways to facilitate data sharing while protecting privacy
  • Stakeholders’ views on data sharing
  • Use of distributed data networks
  • Use of privacy-protecting analytic and data-sharing methods
• Discussion
![Page 14: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/14.jpg)
Understand factors that influence data sharing
• Semi-structured interviews with key stakeholders
• Identify factors that facilitate data sharing
• Identify concerns that discourage data sharing
![Page 15: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/15.jpg)
Stakeholders interviewed
Mazor et al, J Comp Eff Res, 2017;6(6):537-547
![Page 16: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/16.jpg)
Stakeholder interview domains
Mazor et al, J Comp Eff Res, 2017;6(6):537-547
![Page 17: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/17.jpg)
Findings from stakeholder interviews
Mazor et al, J Comp Eff Res, 2017;6(6):537-547
![Page 18: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/18.jpg)
Overview
• Data sharing in multi-center studies
  • Benefits and challenges
• Ways to facilitate data sharing while protecting privacy
  • Stakeholders’ views on data sharing
  • Use of distributed data networks
  • Use of privacy-protecting analytic and data-sharing methods
• Discussion
![Page 19: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/19.jpg)
Distributed data network (DDN) architecture
• No pooling of the entire databases from all sites
• Data partners maintain physical control of their data
• Data partners have ability to opt out of any request
• Only transfer minimal necessary information
![Page 20: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/20.jpg)
Distributed data networks – Vanilla version
Analysis center
Site 1
Site 2
Site 3
![Page 21: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/21.jpg)
Distributed data networks – Vanilla version
Analysis center
Site 1
Site 2
Site 3
Pooling study-specific individual-level datasets
![Page 22: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/22.jpg)
Examples of distributed data networks
![Page 23: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/23.jpg)
Typical analytic datasets shared in DDNs
PatID Exposure Outcome Time X1 X2 X3 X4 X5 …
001 1 0 312 0 1 0 1 1 …
002 0 0 40 1 1 0 1 0 …
003 0 0 365 1 0 0 0 0 …
004 0 0 200 2 0 1 0 0 …
005 0 1 2 3 0 0 1 0 …
006 1 1 15 3 1 0 0 1 …
007 1 0 4 1 1 1 0 1 …
008 1 0 145 0 0 1 0 0 …
009 0 1 33 2 1 0 0 0 …
010 0 0 98 1 1 0 0 0 …
011 0 0 34 1 0 0 0 0 …
… … … … … … … … … …
Site 1
Each row represents an individual
Each column represents a variable
![Page 26: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/26.jpg)
Standardizing databases
Adapted from: http://www.hcsrn.org/asset/b9efb268-eb86-400e-8c74-2d42ac57fa4F/VDW.Infographic031511.jpg
Individual data partners
Site 1 Site 2
Site 3 Site 4
Data standardization (common data model)
Site 1
Site 2
Site 3
Site 4
Data accessible to research projects
• Research projects
• Programs written against common data model
Data quality improvement feedback loop
![Page 27: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/27.jpg)
Distributed analysis
1- User creates and submits query
2- Data Partners retrieve query
3- Data Partners review and run query against their local data
4- Data Partners review results
5- Data Partners return results via secure network
6- Results are aggregated and returned
Analysis Center
Secure Network Portal
Data Partner 1
Review & Run Query
Review & Return Output
Enrollment, Demographics, Utilization, Pharmacy, Etc
Data Partner 2
Review & Run Query
Review & Return Output
Enrollment, Demographics, Utilization, Pharmacy, Etc
https://www.sentinelinitiative.org/privacy-and-security
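The six-step workflow can be sketched in a few lines of Python. This is a minimal illustration only, not the Sentinel software: the class `DataPartner`, the functions `count_exposed_events` and `run_distributed`, and the toy rows are all hypothetical names and data introduced here. The point it shows is that the query travels to the data, each partner may decline, and only aggregate outputs come back.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Row = Dict[str, int]
Query = Callable[[List[Row]], Dict[str, int]]


@dataclass
class DataPartner:
    """Hypothetical data partner; local_data never leaves the site."""
    name: str
    local_data: List[Row]
    approves: bool = True  # partners can opt out of any request

    def review_and_run(self, query: Query) -> Optional[Dict[str, int]]:
        # Steps 2-4: retrieve the query, review it, run it locally, review the output.
        if not self.approves:
            return None
        return query(self.local_data)


def count_exposed_events(rows: List[Row]) -> Dict[str, int]:
    """Example query that returns only aggregate counts."""
    exposed = [r for r in rows if r["exposure"] == 1]
    return {"n_exposed": len(exposed),
            "events_exposed": sum(r["outcome"] for r in exposed)}


def run_distributed(query: Query, partners: List[DataPartner]) -> Dict[str, int]:
    # Step 1: submit the query. Step 5: partners return outputs. Step 6: aggregate.
    totals: Dict[str, int] = {}
    for partner in partners:
        output = partner.review_and_run(query)
        if output is None:  # partner declined this request
            continue
        for key, value in output.items():
            totals[key] = totals.get(key, 0) + value
    return totals


partners = [
    DataPartner("Data Partner 1", [{"exposure": 1, "outcome": 0},
                                   {"exposure": 1, "outcome": 1},
                                   {"exposure": 0, "outcome": 1}]),
    DataPartner("Data Partner 2", [{"exposure": 1, "outcome": 1}]),
    DataPartner("Data Partner 3", [{"exposure": 1, "outcome": 1}], approves=False),
]
totals = run_distributed(count_exposed_events, partners)
print(totals)
```

In this sketch the third partner opts out, so its rows contribute nothing to the totals.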
![Page 33: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/33.jpg)
Sharing patient-level datasets in DDNs
• Patient-level info can generally be de-identified to avoid sharing of sensitive patient info
• But even so, several concerns may still persist
• Sometimes it is not possible to share patient-level info due to these concerns or other reasons
![Page 34: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/34.jpg)
Challenges in de-identifying patient information
![Page 35: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/35.jpg)
Question 1
• Do we have other ways to share data?
![Page 36: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/36.jpg)
Question 2
• Can we perform the analysis we want without sharing potentially identifiable patient-level data?
![Page 37: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/37.jpg)
Question 3
• Better yet, can we perform the analysis we want without sharing patient-level data at all?
![Page 38: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/38.jpg)
Overview
• Data sharing in multi-center studies
  • Benefits and challenges
• Ways to facilitate data sharing while protecting privacy
  • Stakeholders’ views on data sharing
  • Use of distributed data networks
  • Use of privacy-protecting analytic and data-sharing methods
• Discussion
![Page 39: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/39.jpg)
Not using privacy-protecting analytic methods
Analysis center
Site 1
Site 2
Site 3
![Page 40: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/40.jpg)
Using privacy-protecting analytic methods
Analysis center
Site 1
Site 2
Site 3
![Page 41: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/41.jpg)
Using privacy-protecting analytic methods
Analysis center
Site 1
Site 2
Site 3
Pooling study-specific summary-level datasets
![Page 42: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/42.jpg)
Confounder summary scores
Confounders: Age, Sex, Race, Dx, Px, Tx, …
Confounder summary score: Propensity Score (PS) or Disease Risk Score (DRS)
[Diagram: Confounders, PS, DRS, Treatment, Outcome]
![Page 43: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/43.jpg)
Typical analytic datasets shared in DDNs
PatID Exposure Outcome Time X1 X2 X3 X4 X5 …
001 1 0 312 0 1 0 1 1 …
002 0 0 40 1 1 0 1 0 …
003 0 0 365 1 0 0 0 0 …
004 0 0 200 2 0 1 0 0 …
005 0 1 2 3 0 0 1 0 …
006 1 1 15 3 1 0 0 1 …
007 1 0 4 1 1 1 0 1 …
008 1 0 145 0 0 1 0 0 …
009 0 1 33 2 1 0 0 0 …
010 0 0 98 1 1 0 0 0 …
011 0 0 34 1 0 0 0 0 …
… … … … … … … … … …
Site 1
![Page 44: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/44.jpg)
Using confounder summary scores
PatID Exposure Outcome Time PS
001 1 0 312 0.33
002 0 0 40 0.04
003 0 0 365 0.05
004 0 0 200 0.54
005 0 1 2 0.22
006 1 1 15 0.45
007 1 0 4 0.09
008 1 0 145 0.79
009 0 1 33 0.21
010 0 0 98 0.01
011 0 0 34 0.38
… … … … …
Site 1
Toh et al, Med Care, 2013;51:S4-S10
![Page 45: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/45.jpg)
Summary score-matched analysis
PatID Exposure Outcome Time PS
001 1 0 312 0.33
002 0 0 40 0.04
003 0 0 365 0.05
004 0 0 200 0.54
005 0 1 2 0.22
006 1 1 15 0.45
007 1 0 4 0.09
008 1 0 145 0.79
009 0 1 33 0.21
010 0 0 98 0.01
011 0 0 34 0.38
… … … … …
Site 1
PT in Exposed | PT in Unexposed | Event in Exposed | Event in Unexposed
355.6 | 233.4 | 40 | 35
• Only four numbers are needed (in 1:1 matching)
• Lead team uses data from all sites to obtain overall results
Toh et al, Med Care, 2013;51:S4-S10
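To illustrate why four numbers are enough, the sketch below computes an incidence rate ratio and an approximate 95% CI from the Site 1 record on this slide. The names `MatchedSummary`, `rate_ratio`, and `combine` are hypothetical, and the second site's record is invented for illustration. Summing the four numbers across sites is one simple way to combine them, assumed here; the slide does not say how the lead team combines sites.

```python
import math
from typing import Iterable, NamedTuple, Tuple


class MatchedSummary(NamedTuple):
    """The four numbers a site returns after 1:1 summary-score matching."""
    pt_exposed: float        # person-time in exposed
    pt_unexposed: float      # person-time in unexposed
    events_exposed: int
    events_unexposed: int


def rate_ratio(s: MatchedSummary, z: float = 1.96) -> Tuple[float, float, float]:
    """Incidence rate ratio with a Wald CI on the log scale."""
    rr = (s.events_exposed / s.pt_exposed) / (s.events_unexposed / s.pt_unexposed)
    se_log = math.sqrt(1 / s.events_exposed + 1 / s.events_unexposed)
    return rr, rr * math.exp(-z * se_log), rr * math.exp(z * se_log)


def combine(sites: Iterable[MatchedSummary]) -> MatchedSummary:
    """Assumed combination rule: add each of the four numbers across sites."""
    return MatchedSummary(*(sum(column) for column in zip(*list(sites))))


site_1 = MatchedSummary(355.6, 233.4, 40, 35)   # record shown on the slide
site_2 = MatchedSummary(300.0, 310.0, 30, 42)   # hypothetical second site

rr_1, lower_1, upper_1 = rate_ratio(site_1)
overall = combine([site_1, site_2])
rr_all, lower_all, upper_all = rate_ratio(overall)
print(round(rr_1, 3), round(rr_all, 3))
```

The estimate is a ratio of rates, so the unit of person-time cancels as long as all sites report it in the same unit.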
![Page 46: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/46.jpg)
Summary score-stratified analysis
PatID Exposure Outcome Time PS
001 1 0 312 0.33
002 0 0 40 0.04
003 0 0 365 0.05
004 0 0 200 0.54
005 0 1 2 0.22
006 1 1 15 0.45
007 1 0 4 0.09
008 1 0 145 0.79
009 0 1 33 0.21
010 0 0 98 0.01
011 0 0 34 0.38
… … … … …
Site 1
• Each record is a summary score-based stratum
• Lead team uses methods, e.g., the Mantel-Haenszel method, to obtain overall results
PS stratum | PT in Exposed | PT in Unexposed | Event in Exposed | Event in Unexposed
1 | 34.5 | 70.1 | 10 | 8
2 | 32.4 | 32.6 | 7 | 21
3 | 56.2 | 44.2 | 9 | 10
4 | 12.8 | 56.2 | 12 | 6
Toh et al, Med Care, 2013;51:S4-S10
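As a rough illustration of the stratified approach, the sketch below applies the standard Mantel-Haenszel estimator for person-time data to the four strata shown for Site 1. The function name `mantel_haenszel_rate_ratio` is hypothetical, and treating each site-by-stratum record as one stratum when several sites contribute is an assumption made here, not something the slide specifies.

```python
from typing import List, Tuple

# (PT in exposed, PT in unexposed, events in exposed, events in unexposed)
Stratum = Tuple[float, float, int, int]


def mantel_haenszel_rate_ratio(strata: List[Stratum]) -> float:
    """Mantel-Haenszel incidence rate ratio across strata."""
    numerator = 0.0
    denominator = 0.0
    for pt_exp, pt_unexp, ev_exp, ev_unexp in strata:
        total_pt = pt_exp + pt_unexp
        if total_pt == 0:  # an empty stratum carries no information
            continue
        numerator += ev_exp * pt_unexp / total_pt
        denominator += ev_unexp * pt_exp / total_pt
    return numerator / denominator


# Stratum-level records shown on the slide for Site 1.
site_1 = [
    (34.5, 70.1, 10, 8),
    (32.4, 32.6, 7, 21),
    (56.2, 44.2, 9, 10),
    (12.8, 56.2, 12, 6),
]

rr_mh = mantel_haenszel_rate_ratio(site_1)
print(round(rr_mh, 3))

# With several sites, the lead team could concatenate the sites' strata
# (assumed here) so that the analysis is stratified by site and PS stratum:
# rr_overall = mantel_haenszel_rate_ratio(site_1 + site_2 + site_3)
```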
![Page 47: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/47.jpg)
Meta-analysis
• Each record is an effect estimate and its 95% CI
• There is only 1 record per site
• Lead team uses meta-analytic approach to obtain overall results
Site 1
HR | Lower 95% CI | Upper 95% CI
2.97 | 1.95 | 4.52
Toh et al, Med Care, 2013;51:S4-S10
PatID Exposure Outcome Time X1 X2 X3 X4 X5 …
001 1 0 312 0 1 0 1 1 …
002 0 0 40 1 1 0 1 0 …
003 0 0 365 1 0 0 0 0 …
004 0 0 200 2 0 1 0 0 …
005 0 1 2 3 0 0 1 0 …
006 1 1 15 3 1 0 0 1 …
007 1 0 4 1 1 1 0 1 …
008 1 0 145 0 0 1 0 0 …
009 0 1 33 2 1 0 0 0 …
010 0 0 98 1 1 0 0 0 …
011 0 0 34 1 0 0 0 0 …
… … … … … … … … … …
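One common meta-analytic approach is inverse-variance fixed-effect pooling on the log scale, sketched below. The slide does not say which approach the lead team uses, so the fixed-effect choice is an assumption (a random-effects model is another option). `pool_fixed_effect` is a hypothetical helper, and the Site 2 and Site 3 records are invented for illustration; only the Site 1 record comes from the slide.

```python
import math
from typing import List, Tuple

Estimate = Tuple[float, float, float]  # (HR, lower 95% CI, upper 95% CI)


def pool_fixed_effect(estimates: List[Estimate], z: float = 1.96) -> Estimate:
    """Inverse-variance weighted average of log hazard ratios."""
    weighted_sum = 0.0
    total_weight = 0.0
    for hr, lower, upper in estimates:
        # Recover the standard error of log(HR) from the reported CI width.
        se = (math.log(upper) - math.log(lower)) / (2 * z)
        weight = 1 / se ** 2
        weighted_sum += weight * math.log(hr)
        total_weight += weight
    log_hr = weighted_sum / total_weight
    se_pooled = math.sqrt(1 / total_weight)
    return (math.exp(log_hr),
            math.exp(log_hr - z * se_pooled),
            math.exp(log_hr + z * se_pooled))


site_records = [
    (2.97, 1.95, 4.52),  # Site 1, as shown on the slide
    (1.80, 1.10, 2.95),  # hypothetical Site 2
    (2.40, 1.20, 4.80),  # hypothetical Site 3
]

pooled_hr, pooled_lower, pooled_upper = pool_fixed_effect(site_records)
print(round(pooled_hr, 2), round(pooled_lower, 2), round(pooled_upper, 2))
```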
![Page 48: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/48.jpg)
Distributed regression
PatID Exposure Outcome Time X1 X2 X3 X4 X5 …
001 1 0 312 0 1 0 1 1 …
002 0 0 40 1 1 0 1 0 …
003 0 0 365 1 0 0 0 0 …
004 0 0 200 2 0 1 0 0 …
005 0 1 2 3 0 0 1 0 …
006 1 1 15 3 1 0 0 1 …
007 1 0 4 1 1 1 0 1 …
008 1 0 145 0 0 1 0 0 …
009 0 1 33 2 1 0 0 0 …
010 0 0 98 1 1 0 0 0 …
011 0 0 34 1 0 0 0 0 …
… … … … … … … … … …
Site 1
Type Name INT X1 X2
SSCP INT 152.45 56.74 121.65
SSCP X1 56.74 342.45 88.55
SSCP X2 121.65 88.55 422.32
Mean 1.00 3.45 65.78
STD 0.00 4.65 22.34
N 500 500 500
Karr et al, J Comput Graph Stat, 2005;14:263-279
• Each record is a summary statistic
• Lead team uses the summary statistics to perform regression analysis
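For ordinary linear regression, the idea can be sketched as follows: each site computes its sums of squares and cross products (X'X and X'y), and the lead team adds them across sites and solves the normal equations. This is a minimal sketch under that assumption. The helper names and the toy, noise-free data are hypothetical, and models such as logistic or Cox regression generally need iterative exchanges that are not shown.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]
Summary = Tuple[Matrix, List[float]]  # (X'X, X'y)


def site_summary(X: Sequence[Sequence[float]], y: Sequence[float]) -> Summary:
    """Run at a site: only X'X and X'y (with an intercept) are returned."""
    rows = [[1.0, *x] for x in X]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return xtx, xty


def solve(a: Matrix, b: List[float]) -> List[float]:
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_from_summaries(summaries: List[Summary]) -> List[float]:
    """Run at the analysis center: add the summaries and solve (X'X) beta = X'y."""
    p = len(summaries[0][1])
    xtx = [[sum(s[0][i][j] for s in summaries) for j in range(p)] for i in range(p)]
    xty = [sum(s[1][i] for s in summaries) for i in range(p)]
    return solve(xtx, xty)


# Hypothetical data generated from y = 2 + 3*x1 - 1*x2 with no noise.
X_a, y_a = [[0, 1], [1, 0], [2, 2]], [1, 5, 6]
X_b, y_b = [[3, 1], [1, 4], [4, 0]], [10, 1, 14]

beta_distributed = fit_from_summaries([site_summary(X_a, y_a), site_summary(X_b, y_b)])
beta_pooled = fit_from_summaries([site_summary(X_a + X_b, y_a + y_b)])
print([round(b, 6) for b in beta_distributed])
```

Because X'X and X'y are plain sums over rows, adding the site-level summaries gives the same coefficients as fitting the pooled patient-level data.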
![Page 49: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/49.jpg)
Distributed analysis
1- User creates and submits query
2- Data Partners retrieve query
3- Data Partners review and run query against their local data
4- Data Partners review output
5- Data Partners return outputs via secure network
6- Outputs are aggregated and analyzed
Analysis Center
Secure Network Portal
Data Partner 1
Review & Run Query
Review & Return Output
Enrollment, Demographics, Utilization, Pharmacy, Etc
Data Partner 2
Review & Run Query
Review & Return Output
Enrollment, Demographics, Utilization, Pharmacy, Etc
https://www.sentinelinitiative.org/privacy-and-security
![Page 50: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/50.jpg)
Example: A comparative effectiveness study
• A Scalable Partnering Network for CER (SPAN) project
• Risk of long-term re-hospitalization with lap band vs. bypass procedure
• Included 7 of 11 data partners
Toh et al, Med Care, 2014;52:664-668
![Page 51: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/51.jpg)
Study setting
http://www.hopkinsmedicine.org/healthlibrary/test_procedures/gastroenterology/laparoscopic_adjustable_gastric_banding_135,63/
http://www.hopkinsmedicine.org/healthlibrary/test_procedures/gastroenterology/roux-en-y_gastric_bypass_weight-loss_surgery_135,65/
![Page 52: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/52.jpg)
Study design
Index bariatric hospitalization
• ≥21 years at time of bariatric surgery
• ≥1 BMI of 35 kg/m² or greater
• Continuous enrollment w/ benefits
• No prior bariatric surgery
• No prior diagnosis of study outcome
Baseline period: 365 days
Study period: 1/1/2005 to 12/31/2010
Start of follow up (discharge date)
Contributing person-time until
• Re-hospitalization
• Death
• Health plan disenrollment
• 12/31/2010
• 730 days of follow-up
Toh et al, Med Care, 2014;52:664-668
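The follow-up rules can be read as "person-time accrues from discharge until the earliest of the listed events". The sketch below encodes that reading, which is an assumption about the design rather than the study's actual code. The function `follow_up`, its argument names, and the example dates are hypothetical.

```python
from datetime import date, timedelta
from typing import Optional, Tuple

STUDY_END = date(2010, 12, 31)
MAX_FOLLOW_UP = timedelta(days=730)


def follow_up(discharge: date,
              rehospitalization: Optional[date] = None,
              death: Optional[date] = None,
              disenrollment: Optional[date] = None) -> Tuple[int, int, str]:
    """Return (days of follow-up, outcome indicator, reason follow-up ended)."""
    candidates = {
        # Listed first so that the outcome wins ties with a censoring date.
        "rehospitalization": rehospitalization,
        "death": death,
        "disenrollment": disenrollment,
        "study_end": STUDY_END,
        "max_follow_up": discharge + MAX_FOLLOW_UP,
    }
    observed = {reason: d for reason, d in candidates.items() if d is not None}
    reason = min(observed, key=observed.get)  # earliest date ends follow-up
    days = (observed[reason] - discharge).days
    event = 1 if reason == "rehospitalization" else 0
    return days, event, reason


print(follow_up(date(2009, 1, 1), rehospitalization=date(2009, 3, 1)))
print(follow_up(date(2010, 6, 1)))
print(follow_up(date(2006, 1, 1)))
```

The resulting days and outcome indicator correspond to the Time and Outcome columns of the analytic dataset.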
![Page 53: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/53.jpg)
Confounders
• Age
• Sex
• Race/ethnicity
• Diabetes*
• Baseline BMI*
• Year of procedure
• Charlson comorbidity score*
• Atrial fibrillation*
• GERD*
• Hypertension*
• Sleep Apnea*
• Asthma*
• Deep vein thrombosis*
• Pulmonary embolism*
• Congestive heart failure*
• Hyperlipidemia*
• Coronary artery disease*
• Oxygen use*
• Assistive walking device*
• Smoking status*
• Blood pressure*
• Length of stay assoc. with procedure
*Identified during the 365-day baseline period prior to the index bariatric hospitalization
Toh et al, Med Care, 2014;52:664-668
![Page 54: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/54.jpg)
Statistical analysis
• Propensity score stratification
• Analysis• Pooled patient-level data analysis (benchmark)• Risk set-based analysis• PS-stratified analysis (by quintile)• Meta-analysis of site-specific effect estimates
Toh et al, Med Care, 2014;52:664-668 54
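To make the PS-stratified option concrete, here is a minimal Python sketch of how a site might reduce its patient-level records to stratum-specific counts before sharing anything. It is an illustration under assumptions, not the study's code: the function names, the record layout `(ps, exposed, event, person_days)`, and the choice to form quintiles within each site by rank are all hypothetical.

```python
from collections import defaultdict


def quintile_of(rank, n):
    """Map a 0-based rank among n patients to a quintile label 1..5."""
    return rank * 5 // n + 1


def stratum_counts(patients):
    """Run at a site (hypothetical helper).

    patients: list of (ps, exposed, event, person_days) tuples, where ps is
    the estimated propensity score and exposed/event are 0 or 1.
    Returns counts keyed by (quintile, exposed); no patient-level rows leave
    the site.
    """
    # Rank patients by propensity score so quintiles can be assigned by rank.
    order = sorted(range(len(patients)), key=lambda i: patients[i][0])
    table = defaultdict(lambda: {"n": 0, "events": 0, "person_days": 0})
    for rank, idx in enumerate(order):
        _ps, exposed, event, days = patients[idx]
        cell = table[(quintile_of(rank, len(patients)), exposed)]
        cell["n"] += 1
        cell["events"] += event
        cell["person_days"] += days
    return dict(table)


if __name__ == "__main__":
    # Made-up records for illustration only.
    demo = [(i / 10.0, i % 2, 1 if i in (3, 8) else 0, 100 + i) for i in range(10)]
    for key, cell in sorted(stratum_counts(demo).items()):
        print(key, cell)
```

The coordinating center would then receive at most ten rows per site (five quintiles by two exposure groups) rather than one row per patient.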
![Page 55: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/55.jpg)
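Selected baseline patient characteristics

| Characteristics | Adjustable gastric band (n=1,550), N | Adjustable gastric band, %* | Roux-en-Y gastric bypass (n=5,792), N | Roux-en-Y gastric bypass, %* |
|---|---|---|---|---|
| Mean age (SD) | 46.7 | 11.2 | 45.7 | 10.7 |
| Age > 65 years | 76 | 4.9 | 141 | 2.4 |
| Female sex | 1,266 | 81.7 | 4,823 | 83.3 |
| Race/ethnicity | | | | |
| Black or African American | 137 | 8.8 | 522 | 9.0 |
| White | 1,130 | 72.9 | 3,840 | 66.3 |
| Hispanic | 142 | 9.2 | 769 | 13.3 |
| Other | 62 | 4.0 | 280 | 4.8 |
| Unknown | 79 | 5.1 | 381 | 6.6 |
| Baseline BMI | | | | |
| 30-34.9 | 96 | 6.2 | 174 | 3.0 |
| 35-39.9 | 480 | 31.0 | 1,410 | 24.3 |
| 40-49.9 | 813 | 52.4 | 3,126 | 54.0 |
| ≥50 | 161 | 10.4 | 1,082 | 18.7 |
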
Toh et al, Med Care, 2014;52:664-668 55
![Page 56: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/56.jpg)
Patient-level data analysis, by site

| Site | Adjusted HR | 95% CI |
|---|---|---|
| Site 1 | 0.68 | 0.45, 1.02 |
| Site 2 | 0.65 | 0.37, 1.15 |
| Site 3 | 0.52 | 0.26, 1.04 |
| Site 4 | 0.72 | 0.35, 1.50 |
| Site 5 | 0.82 | 0.46, 1.48 |
| Site 6 | 0.32 | 0.13, 0.75 |
| Site 7 | 0.79 | 0.62, 1.01 |

Toh et al, Med Care, 2014;52:664-668 56
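As an illustration of the meta-analysis option, the site-specific estimates above can be pooled with an inverse-variance (fixed-effect) weighting on the log hazard ratio scale. The sketch below is an assumption about the pooling method, since the slides do not say which one the study used. The standard errors are backed out of the rounded confidence limits, so the result is approximate; it should land close to the meta-analysis row reported later in the talk (0.71; 0.60, 0.84).

```python
import math

# Site-specific adjusted hazard ratios and 95% CIs, copied from the slide.
SITE_ESTIMATES = [
    ("Site 1", 0.68, 0.45, 1.02),
    ("Site 2", 0.65, 0.37, 1.15),
    ("Site 3", 0.52, 0.26, 1.04),
    ("Site 4", 0.72, 0.35, 1.50),
    ("Site 5", 0.82, 0.46, 1.48),
    ("Site 6", 0.32, 0.13, 0.75),
    ("Site 7", 0.79, 0.62, 1.01),
]

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def pool_fixed_effect(estimates):
    """Inverse-variance (fixed-effect) pooling on the log hazard ratio scale.

    Assumption: each CI is a symmetric Wald interval on the log scale, so the
    standard error can be recovered from the width of the interval.
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for _site, hr, lower, upper in estimates:
        log_hr = math.log(hr)
        se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
        weight = 1.0 / se ** 2
        total_weight += weight
        weighted_sum += weight * log_hr
    pooled_log_hr = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return (
        math.exp(pooled_log_hr),
        math.exp(pooled_log_hr - Z_95 * pooled_se),
        math.exp(pooled_log_hr + Z_95 * pooled_se),
    )


if __name__ == "__main__":
    hr, lo, hi = pool_fixed_effect(SITE_ESTIMATES)
    print(f"Pooled HR {hr:.2f} (95% CI {lo:.2f}, {hi:.2f})")
```

Only three numbers per site are needed as input, which is why this approach sits at the most privacy-protecting end of the data-sharing options.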
![Page 57: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/57.jpg)
Overall results by method
Toh et al, Med Care, 2014;52:664-668 57
![Page 58: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/58.jpg)
Results by method

| Method | Adjusted HR | 95% CI |
|---|---|---|
| Benchmark | 0.71 | 0.59, 0.84 |
| Risk set analysis | 0.71 | 0.59, 0.84 |
| PS stratification | 0.70 | 0.59, 0.83 |
| Meta-analysis | 0.71 | 0.60, 0.84 |

Toh et al, Med Care, 2014;52:664-668 58
![Page 59: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/59.jpg)
Pooled patient-level linear regression (from PROC REG)
Distributed linear regression
59
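One common way to get pooled-equivalent linear regression estimates without moving patient-level data is for each site to share only its cross-product matrices X'X and X'y, which the coordinating center sums and solves. The Python sketch below illustrates that idea. It is not the software shown on the slide; the function names and the toy data are made up for illustration.

```python
def site_summaries(rows):
    """Run at a site: return X'X and X'y for rows of (covariates, outcome)."""
    p = len(rows[0][0]) + 1  # +1 for the intercept column
    xtx = [[0.0] * p for _ in range(p)]
    xty = [0.0] * p
    for covariates, y in rows:
        x = [1.0] + list(covariates)
        for i in range(p):
            xty[i] += x[i] * y
            for j in range(p):
                xtx[i][j] += x[i] * x[j]
    return xtx, xty


def solve(a, b):
    """Solve a * beta = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    beta = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * beta[j] for j in range(i + 1, n))
        beta[i] = (m[i][n] - tail) / m[i][i]
    return beta


def combine(summaries):
    """Run at the coordinating center: sum the site matrices, then solve."""
    p = len(summaries[0][1])
    xtx = [[sum(s[0][i][j] for s in summaries) for j in range(p)] for i in range(p)]
    xty = [sum(s[1][i] for s in summaries) for i in range(p)]
    return solve(xtx, xty)


# Made-up toy data: (covariates, outcome) per patient at two hypothetical sites.
SITE_A = [((1.0, 0.0), 3.1), ((2.0, 1.0), 5.9), ((3.0, 0.0), 7.2), ((4.0, 1.0), 9.8)]
SITE_B = [((0.5, 1.0), 2.6), ((1.5, 0.0), 4.0), ((2.5, 1.0), 6.9), ((5.0, 0.0), 11.1)]

distributed = combine([site_summaries(SITE_A), site_summaries(SITE_B)])
pooled = combine([site_summaries(SITE_A + SITE_B)])  # benchmark: all rows together

if __name__ == "__main__":
    print("distributed:", distributed)
    print("pooled:     ", pooled)
```

Because the sum of the site-level cross-products equals the cross-products of the stacked data, the two coefficient vectors agree up to floating-point error. Standard errors would additionally need each site's y'y and sample size, which this sketch omits.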
![Page 60: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/60.jpg)
Pooled patient-level logistic regression (from PROC LOGISTIC)
Distributed logistic regression
60
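Logistic regression has no closed-form solution, so a distributed fit typically iterates: the coordinating center sends the current coefficients to every site, each site returns its score vector and information matrix, and the center sums them and takes a Newton-Raphson step. The Python sketch below illustrates that loop under those assumptions. It is not the software shown on the slide, and the function names and toy data are hypothetical.

```python
import math


def site_score_and_information(rows, beta):
    """Run at a site: score vector and information matrix at the given beta."""
    p = len(beta)
    score = [0.0] * p
    info = [[0.0] * p for _ in range(p)]
    for covariates, y in rows:
        x = [1.0] + list(covariates)  # leading 1.0 is the intercept
        eta = sum(b * xi for b, xi in zip(beta, x))
        mu = 1.0 / (1.0 + math.exp(-eta))
        for i in range(p):
            score[i] += x[i] * (y - mu)
            for j in range(p):
                info[i][j] += mu * (1.0 - mu) * x[i] * x[j]
    return score, info


def solve(a, b):
    """Solve a * step = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    out = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * out[j] for j in range(i + 1, n))
        out[i] = (m[i][n] - tail) / m[i][i]
    return out


def fit_distributed(sites, n_params, max_iter=25, tol=1e-10):
    """Run at the coordinating center: Newton-Raphson on summed site statistics."""
    beta = [0.0] * n_params
    for _ in range(max_iter):
        parts = [site_score_and_information(rows, beta) for rows in sites]
        score = [sum(s[i] for s, _ in parts) for i in range(n_params)]
        info = [
            [sum(m[i][j] for _, m in parts) for j in range(n_params)]
            for i in range(n_params)
        ]
        step = solve(info, score)
        beta = [b + d for b, d in zip(beta, step)]
        if max(abs(d) for d in step) < tol:
            break
    return beta


# Made-up toy data: ((exposure,), outcome) per patient at two hypothetical sites.
SITE_A = [((0,), 1), ((0,), 0), ((0,), 0), ((0,), 1), ((0,), 0),
          ((1,), 0), ((1,), 0), ((1,), 1), ((1,), 0), ((1,), 0)]
SITE_B = [((0,), 1), ((0,), 1), ((0,), 0), ((0,), 0), ((0,), 0),
          ((1,), 0), ((1,), 1), ((1,), 0), ((1,), 0), ((1,), 0)]

distributed = fit_distributed([SITE_A, SITE_B], n_params=2)
pooled = fit_distributed([SITE_A + SITE_B], n_params=2)  # benchmark: one big site

if __name__ == "__main__":
    print("distributed:", distributed)
    print("pooled:     ", pooled)
```

Each iteration costs one round trip to every site, which is the practical price of matching the pooled fit; the statistics exchanged are sums over patients rather than individual records.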
![Page 61: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/61.jpg)
Pooled patient-level Cox PH regression (from PROC PHREG)
Distributed Cox PH regression
61
![Page 62: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/62.jpg)
Data-sharing methods in multi-database studies
Data shared across sites
• Patient-level data
  • Individual covariates
  • Confounder summary scores
  • A hybrid of above
• Summary-level data
  • Stratum-specific counts
  • Risk-set data
  • Intermediate statistics
  • Database-specific effect estimates
62
![Page 64: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/64.jpg)
Analytic flexibility vs. privacy protection
[Figure: data-sharing approaches positioned by analytic flexibility (vertical axis) and privacy protection (horizontal axis)]
• Patient-level info with individual covariates
• Database-specific effect estimates
• Patient-level info with summary scores
• Stratum-specific counts
• Risk-set data
• Summary statistics
* Approximation
64
![Page 65: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/65.jpg)
Overview
• Data sharing in multi-center studies
  • Benefits and challenges
• Ways to facilitate data sharing while protecting privacy
  • Stakeholders’ views on data sharing
  • Use of distributed data networks
  • Use of privacy-protecting analytic and data-sharing methods
• Discussion
65
![Page 66: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/66.jpg)
Discussion
• Stakeholders are willing to share data if:
  • Benefits of research outweigh the risks
  • Risks are minimized
  • Cost is reasonable
• Although we did not spend much time on it here, proper governance of data sharing is critical
66
![Page 67: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/67.jpg)
Discussion
• Use of a distributed data network structure and privacy-protecting analytic methods allows analysis of multiple databases while protecting patient privacy
67
![Page 68: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/68.jpg)
A national DDN infrastructure for evidence generation
[Figure: network diagram of participating organizations: Health Plans 1-9, Hospitals 1-6, Outpatient clinics 1-6]
• Each organization can participate in multiple networks
• Each network controls its governance and coordination
• Networks share infrastructure, analytics, lessons, security, software
68
![Page 69: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/69.jpg)
Summary
• Sending analysis to the data
• Sharing information, not data
• Getting more by asking for less
69
![Page 70: Preserving privacy in data sharing - AAMC · Data sharing in multi-center studies • Benefits and challenges • Ways to facilitate data sharing while protecting privacy • Stakeholders’](https://reader030.vdocument.in/reader030/viewer/2022040607/5ec023fb2224fc24ca3610d6/html5/thumbnails/70.jpg)
Acknowledgments
• HPHCI: Jeffrey Brown, Mia Gallagher, Qoua Her, Xiaojuan Li, Sarah Malek, Jessica Malenfant, Richard Platt, Yury Vilk, Jessica Young, Zilu Zhang
• Others: Susan Gruber, Bruce Fireman, Lingling Li, Kazuki Yoshida
• Penn State University: Aleksandra Slavković, Yuji Samizo
• PCORnet: David Arterburn, Jason Block, Jane Anau, Yates Coley, Casie Horgan, Kathleen McTigue, Erick Moyneur, Roy Pardee, Juliane Reynolds, Sheryl Rifas-Shiman, Jessica Sturtevant, Robert Wellman, many others
70