
Post on 17-Jul-2020


Testing Data Completeness with DQe-c-v2

OHDSI Symposium 2019: Data Quality Workshop

09/17/19

Tim Bergquist, Graduate Research Assistant

Biomedical Informatics & Medical Education

University of Washington

WWAMI region Practice & Research Network

• 60+ primary care WWAMI clinics
• ~20 data-connected clinics
• CHCs and RHCs
• Underserved populations
• Many serving rural populations
• Collaboration with national network of practice-based research networks
• DataQUEST represents over 250,000 patients

https://dataquest.iths.org/

DataQUEST

• 20 data-connected clinics in the WPRN
• Represents over 250,000 patients

An electronic health data-sharing architecture across community-based primary care practices in the WPRN

Measuring Data Quality Framework

Completeness • Are the data present?

Conformance • Are the data standardized and formatted?

Plausibility • Are the data believable?

Kahn et al. (2016). A harmonized data quality assessment terminology and framework for the secondary use of electronic health record data. eGEMS, 4, 1244. https://www.ncbi.nlm.nih.gov/pubmed/27713905

Operationalizing the framework into 5 conceptual tests and 17 discrete tests across:

Data Quality Tests

DQ Framework category: TEST

COMPLETENESS: Gender, Visit, Observation completeness (denominator and proportion with valid data)

COMPLETENESS: Key clinical status completeness (denominator and proportion with valid data): Smoking status, alcohol consumption

COMPLETENESS: Measurement completeness (denominator and proportion with valid data): Height, Weight, SBP, DBP

COMPLETENESS: Cross-reference tables that are present in the current dataset to the expected tables in the standard OMOP CDM

COMPLETENESS: Looks for NULL and invalid variable values in each column and visualizes percent missingness

CONFORMANCE: Check that primary and foreign keys relate properly; High Priority: Person_ID, Visit_Occurrence_ID

CONFORMANCE: Checks that orphan keys don't exist (a foreign key is present in a table but no primary key exists in the reference table)

PLAUSIBILITY: Comparison of new load to old load (number of observations, number of unique patients, number of tables with rows)

PLAUSIBILITY: Size of tables and rows across the OMOP CDM

Original DQe-c Tool

Modular tool developed in R for assessing completeness in EHR data repositories.

• Customization and configuration were difficult
• Hard to add new modules
• Difficult to add new CDMs (or new versions of CDMs)

DQe-c-v2 Tool

Modular tool developed in Python for assessing completeness in EHR data repositories.

Takes in the database credentials, CDM version, and configurations.

Simply enter your credentials and configurations into the config.json file.

Run: python DQe-c.py -c /path/to/config.json
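As an illustration of the run command above, here is a minimal sketch of how a script might accept a -c option and load a JSON config. The key names "credentials" and "cdm_version" and both function names are assumptions made for this example; the actual config.json schema is defined in the DQe-c-v2 repository and may differ.

```python
import argparse
import json
from pathlib import Path

# Hypothetical required keys; the real config.json schema may differ.
REQUIRED_KEYS = ("credentials", "cdm_version")


def parse_args(argv=None):
    """Parse a -c option like the one shown in the run command."""
    parser = argparse.ArgumentParser(description="Config loading sketch")
    parser.add_argument("-c", dest="config", required=True,
                        help="path to config.json")
    return parser.parse_args(argv)


def load_config(path):
    """Read the JSON config and fail early if an assumed key is missing."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError(f"config is missing keys: {missing}")
    return config
```

Checking for required keys up front means a typo in the config surfaces before any database connection is attempted.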

DQe-c-v2 Tool

Sets up the database connection, manages report output, and initiates the CDM files.

DQe-c-v2 Tool

Assesses conformance to a Common Data Model. Checks for missing tables and calculates size of tables.

Quickly check that the new data is growing as expected.
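The missing-table and table-size check could look roughly like the sketch below. It uses an in-memory SQLite database as a stand-in for a real warehouse, and a four-table list as a stand-in for the full set of OMOP CDM tables; the function name table_report is an assumption for this example, not part of the tool.

```python
import sqlite3

# A few OMOP CDM table names, standing in for the full expected list.
EXPECTED_TABLES = ("person", "visit_occurrence", "measurement", "observation")


def table_report(conn, expected=EXPECTED_TABLES):
    """Return {table: row_count}, with None for tables that are missing."""
    present = {
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        )
    }
    report = {}
    for table in expected:
        if table in present:
            # Table names come from the fixed tuple above, not user input.
            count = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
            report[table] = count
        else:
            report[table] = None
    return report


# Toy database: two of the four expected tables exist.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (person_id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE measurement (measurement_id INTEGER, person_id INTEGER)")
conn.executemany("INSERT INTO person VALUES (?)", [(1,), (2,), (3,)])
report = table_report(conn)
```

Saving one such report per load makes it possible to compare row counts between loads and spot tables that shrank or stopped growing.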

DQe-c-v2 Tool

Assesses completeness of all columns in the available tables in the database. Checks for null and nonsense values.

Identify empty or useful columns in each of your OMOP tables.
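A column-level missingness check can be sketched as follows, again against an in-memory SQLite database. The list of "nonsense" placeholder strings and the function name percent_missing are assumptions for illustration; the values the tool actually treats as invalid are not listed in these slides.

```python
import sqlite3

# Hypothetical placeholder strings treated as "nonsense" values.
NONSENSE = ("", "NULL", "N/A", "?")


def percent_missing(conn, table, column):
    """Percent of rows where the column is NULL or a placeholder string.

    Returns None for an empty table, where the percentage is undefined.
    """
    marks = ",".join("?" for _ in NONSENSE)
    total, missing = conn.execute(
        f"SELECT COUNT(*), "
        f"COALESCE(SUM(CASE WHEN {column} IS NULL "
        f"OR TRIM({column}) IN ({marks}) THEN 1 ELSE 0 END), 0) "
        f"FROM {table}",
        NONSENSE,
    ).fetchone()
    return 100.0 * missing / total if total else None


# Toy data: 3 of 5 rows have a missing or placeholder gender value.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (person_id INTEGER, gender_source_value TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [(1, "F"), (2, None), (3, ""), (4, "M"), (5, "N/A")],
)
gender_missing = percent_missing(conn, "person", "gender_source_value")
```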

DQe-c-v2 Tool

Checks for orphan keys: foreign keys that are not present in the primary table.
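One common way to find orphan keys is a LEFT JOIN from the referencing table to the primary table, keeping rows with no match. The sketch below shows that technique on a toy SQLite database; it is not taken from the tool's source, and the function name orphan_keys is hypothetical.

```python
import sqlite3


def orphan_keys(conn, child, fk, parent, pk):
    """Return distinct foreign-key values in child with no match in parent."""
    query = (
        f"SELECT DISTINCT c.{fk} FROM {child} AS c "
        f"LEFT JOIN {parent} AS p ON c.{fk} = p.{pk} "
        f"WHERE c.{fk} IS NOT NULL AND p.{pk} IS NULL "
        f"ORDER BY c.{fk}"
    )
    return [row[0] for row in conn.execute(query)]


# Toy data: visits reference persons 7 and 9, who are not in person.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (person_id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE visit_occurrence (visit_occurrence_id INTEGER, person_id INTEGER)"
)
conn.executemany("INSERT INTO person VALUES (?)", [(1,), (2,)])
conn.executemany(
    "INSERT INTO visit_occurrence VALUES (?, ?)",
    [(10, 1), (11, 2), (12, 7), (13, 7), (14, 9)],
)
orphans = orphan_keys(conn, "visit_occurrence", "person_id", "person", "person_id")
```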

DQe-c-v2 Tool

Checks for missingness in clinical indicators. (What percent of patients have a heart rate measure, blood pressure measurement, etc.)

Adding a new indicator test is straightforward!

Completion as the presence of a concept. Calculates what percentage of patients have the identified concept(s).

Completion as the presence of a non-null. Calculates what percentage of patients have a non-null value in the identified table-column.

We can add a new indicator test by just adding five new fields.

Adding testing for A1C Hemoglobin. Calculates what percentage of patients have a hemoglobin A1C measurement.
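The two kinds of indicator test described above can be sketched as two small queries. Everything named here is an assumption for illustration: the function names, the toy schema, and the concept ID 3004410, which is used as a placeholder for a hemoglobin A1C measurement concept and should be verified against your own vocabulary tables. The five configuration fields the tool expects are not listed in these slides and are not reproduced here.

```python
import sqlite3


def pct_with_concept(conn, table, concept_column, concept_ids):
    """Percent of patients with at least one row matching the concept(s)."""
    marks = ",".join("?" for _ in concept_ids)
    total = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
    have = conn.execute(
        f"SELECT COUNT(DISTINCT person_id) FROM {table} "
        f"WHERE {concept_column} IN ({marks}) "
        f"AND person_id IN (SELECT person_id FROM person)",
        list(concept_ids),
    ).fetchone()[0]
    return 100.0 * have / total if total else None


def pct_with_non_null(conn, table, column):
    """Percent of patients with at least one non-null value in table.column."""
    total = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
    have = conn.execute(
        f"SELECT COUNT(DISTINCT person_id) FROM {table} "
        f"WHERE {column} IS NOT NULL "
        f"AND person_id IN (SELECT person_id FROM person)"
    ).fetchone()[0]
    return 100.0 * have / total if total else None


A1C_CONCEPT_IDS = [3004410]  # placeholder; verify in your vocabulary

# Toy data: four patients, two of whom have an A1C row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (person_id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE measurement "
    "(person_id INTEGER, measurement_concept_id INTEGER, value_as_number REAL)"
)
conn.executemany("INSERT INTO person VALUES (?)", [(1,), (2,), (3,), (4,)])
conn.executemany(
    "INSERT INTO measurement VALUES (?, ?, ?)",
    [(1, 3004410, 6.1), (1, 3004410, None), (2, 999, 120.0),
     (3, 3004410, None), (4, 999, 80.0)],
)
a1c_pct = pct_with_concept(
    conn, "measurement", "measurement_concept_id", A1C_CONCEPT_IDS
)
value_pct = pct_with_non_null(conn, "measurement", "value_as_number")
```

Both functions use the person table as the denominator, so patients with no measurement rows at all still count against the indicator.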

DQe-c-v2 Tool

All reports are combined into a visualization dashboard.

DQe-c-v2 Tool

All these modules output CSV reports. The output folders are managed by Query.py to account for different test dates and organizations.

DQe-c-v2 Network Aggregation Tool

DQe-c-v2 Tool

Reports are visualized into an HTML file. Easy to embed into a website.

Adding New Modules

Vocabulary Summary

Temporal Plausibility

Operationalizing use of DQe tools for data quality testing

• DataQUEST
• DARTNet Institute
• CD2H

https://github.com/WWAMI-DataQuest/DQe-c_OMOPv4/tree/master/docs

Questions?

• We are looking for collaborators and contributors!

• Contact me if you need help getting the tool up and running.

• We are always looking for feedback.

Thanks to Kari Stephens, Hossein Estiri, WPRN, ITHS, and CD2H!

Contact: Tim Bergquist, trberg@uw.edu

https://dataquest.iths.org/

https://ctsa.ncats.nih.gov/cd2h/

https://github.com/data2health/DQe-c-v2

CD2H Data Quality Project

https://ctsa.ncats.nih.gov/cd2h/data-quality-methods-and-tools-to-support-ctsa-hub-data-sharing/
