DATA QUALITY AT THE SCALE OF AGGREGATION
IF WE ALL USE STANDARDS, WHY IS THE DATA SO CRAP IN THE END?
QUALITY IS CONTEXTUAL
QUALITY IS CONTEXTUAL
What is the “context” of aggregation? Specifically, DPLA’s aggregation…
• Heterogeneous
• Basic metadata
• Reliance on metadata vs. text
• Reliance on item-level metadata
DATA ISSUES IN DPLA
Content Issues
• Meaningless values
• Missing values
• Confusing values
• Incomplete values
Technical Issues
• Granularity
• Inappropriate values
• Lack of normalization
• Noisy data
• Lack of standards
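As a rough illustration of how two of the content issues above (missing and meaningless values) might be flagged automatically, here is a minimal Python sketch. It is not DPLA's tooling: the record shape (field name mapped to a list of strings), the function name `flag_content_issues`, and the `PLACEHOLDERS` set are all assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical placeholder strings treated as "meaningless" for this sketch;
# this is not an official DPLA list.
PLACEHOLDERS = {"unknown", "n/a", "none", "untitled", "-", "?"}


def flag_content_issues(record: Dict[str, List[str]], fields: List[str]) -> Dict[str, str]:
    """Return a mapping of field name to issue label for the given fields.

    A field is "missing" if it is absent or holds only empty strings, and
    "meaningless" if every non-empty value is a known placeholder.
    """
    issues: Dict[str, str] = {}
    for field in fields:
        # Drop None entries, trim whitespace, and discard empty strings.
        values = [v.strip() for v in record.get(field, []) if v is not None]
        values = [v for v in values if v]
        if not values:
            issues[field] = "missing"
        elif all(v.lower() in PLACEHOLDERS for v in values):
            issues[field] = "meaningless"
    return issues


record = {"title": ["Untitled"], "creator": [], "date": ["1923"]}
print(flag_content_issues(record, ["title", "creator", "date"]))
```

Confusing and incomplete values are harder to detect mechanically and would likely still need human review.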
SHARING METADATA
• Content
• Consistency
• Coherence
• Context
• Communication
• Conformance to standards
…but which “standard”?
DPLA & DATA QUALITY
Data is robust
Descriptive fields are present and have meaningful values
Required properties have meaningful values
Data adheres to standards
All data is normalized in terms of punctuation, presence of noise, etc.
Required properties are present and semantically correct
Technical problems
Content problems
Content quality
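One of the criteria above is that data is normalized in terms of punctuation and noise. The following Python sketch shows one way such normalization could look, under stated assumptions: the function names are hypothetical, and the choice to strip trailing `;`, `,`, `:` and `/` and to drop case-insensitive duplicates is an illustrative rule set, not DPLA's actual practice.

```python
import re
from typing import List


def normalize_value(value: str) -> str:
    """Collapse whitespace and strip trailing separator punctuation."""
    # Collapse any run of whitespace into a single space and trim the ends.
    value = re.sub(r"\s+", " ", value).strip()
    # Remove trailing separators left over from cataloging punctuation.
    # Periods are kept on purpose so abbreviations like "(Mass.)" survive.
    return re.sub(r"[\s;,:/]+$", "", value)


def normalize_values(values: List[str]) -> List[str]:
    """Normalize each value, dropping empties and case-insensitive duplicates."""
    seen = set()
    result = []
    for raw in values:
        cleaned = normalize_value(raw)
        key = cleaned.casefold()
        if cleaned and key not in seen:
            seen.add(key)
            result.append(cleaned)
    return result


print(normalize_values(["Maps ;", "maps", "", "Boston  (Mass.)"]))
```

Any real rule set would depend on the source collections, since punctuation that is noise in one provider's data can be meaningful in another's.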
DPLA DATA QUALITY WORKFLOW
• Initial Analysis
• QA in Blacklight
• Visual review in test portal site
WE NEED MORE.
WE NEED BETTER.
EUROPEANA DQC
Data Quality Committee (DQC) formed within Europeana
• Reviewing mandatory elements
• Data checking and normalization
• Evaluation of meaningful metadata values
• Quality of content
• Coordination with other quality-related initiatives
DPLA QUALITY INITIATIVES
WE NEED MORE.
WE NEED BETTER.
LET’S TALK.