I8K|DQ-BigData: I8K Architecture Extension for Data Quality in Big Data
Bibiano Rivas, Jorge Merino, Manuel Serrano, Ismael Caballero, Mario Piattini
Instituto de Tecnologías y Sistemas de Información, Universidad de Castilla-La Mancha
Master Data
<data> <id>45838589</id> <name>Vladimir</name> <surname>Putin</surname> <email>[email protected]</email> <coolnesslvl>9001</coolnesslvl> </data>
Exchange
AssetData
Master Data
<data> <id>45838589</id> <name>Vladimir</name> <surname>Putin</surname> <email>[email protected]</email> <coolnesslvl>9001</coolnesslvl> <DQlvl>75</DQlvl> </data>
Exchange
<data> <id>34953858</id> <name>Stefan</name> <surname>Löfven</surname> <email>[email protected]</email> <coolnesslvl>8000</coolnesslvl> <DQlvl>100</DQlvl> </data>
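The records on this slide differ from the earlier one only by the added `<DQlvl>` element. A minimal Python sketch of that annotation step follows, assuming the record is well-formed XML (closed with `</data>`) and that the quality level has already been computed elsewhere. The helper name `add_dq_level` is hypothetical and not part of I8K.

```python
import xml.etree.ElementTree as ET


def add_dq_level(record_xml: str, dq_level: int) -> str:
    """Return the record with a <DQlvl> child set to dq_level (hypothetical helper)."""
    root = ET.fromstring(record_xml)
    # Reuse an existing <DQlvl> element rather than appending a duplicate.
    level = root.find("DQlvl")
    if level is None:
        level = ET.SubElement(root, "DQlvl")
    level.text = str(dq_level)
    return ET.tostring(root, encoding="unicode")


# Illustrative record; only the element names are taken from the slide.
record = "<data><id>45838589</id><coolnesslvl>9001</coolnesslvl></data>"
annotated = add_dq_level(record, 75)
print(annotated)
```

Calling the helper again on an already annotated record overwrites the level instead of adding a second element, which keeps the message valid if it is re-assessed.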
I8K
ISO/TS 8000
• ISO/TS 8000-100: Describes specific aspects of Master Data
• ISO/TS 8000-102: Describes the vocabulary
• ISO/TS 8000-110: Establishes the way to translate the Master Data Messages
• ISO/TS 8000-120: Information about the Master Data life-cycle
• ISO/TS 8000-130: Adds information about the Quality of Master Data in terms of Accuracy
• ISO/TS 8000-140: Adds information about the Quality of Master Data in terms of Completeness
I8K Service Architecture
[Diagram components: I8K Manager; certification services I8K.Cer110, I8K.Cer130, I8K.Cer140; assessment services I8K.Ev130, I8K.Ev140; translating services I8K.110, I8K.Mapper; Data Base; Data Dictionary; DB_Mapping.]
I8K – Protocol
[Figure: parties A and B act as Data Provider and Data Requester and exchange Master Data Messages.]
Problem
I8K
Solution
Big Data
Proposal Architecture Extension
[Diagram components: I8K Manager; Enterprise Service Bus; Regular Data services I8K.Cer130, I8K.Cer140, I8K.Ev130, I8K.Ev140; Big Data services I8K.Cer130-BiDa, I8K.Cer140-BiDa, I8K.Ev130-BiDa, I8K.Ev140-BiDa.]
Proposal New Messages

| Type | Description |
| --- | --- |
| I8K.CR-BiDa | An application needs to encrypt a Master Data Message and add information about the level of Data Quality for Big Data |
| I8K.CR130-BiDa | An application requests the encryption and the addition of information about the level of Accuracy for Big Data |
| I8K.CR140-BiDa | An application requests the encryption and the addition of information about the level of Completeness for Big Data |
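To illustrate how the three message types listed above might be dispatched, here is a minimal Python sketch. Only the message-type and service names come from the slides. The routing table, the function `route_request`, and the idea that the generic I8K.CR-BiDa request triggers both the accuracy and the completeness services are assumptions made for illustration.

```python
# Hypothetical routing table: message type -> Big Data services to invoke.
# Only the names are taken from the slides; the mapping itself is assumed.
ROUTES = {
    "I8K.CR130-BiDa": ["I8K.Ev130-BiDa", "I8K.Cer130-BiDa"],
    "I8K.CR140-BiDa": ["I8K.Ev140-BiDa", "I8K.Cer140-BiDa"],
}
# Assumption: the generic request covers both accuracy and completeness.
ROUTES["I8K.CR-BiDa"] = ROUTES["I8K.CR130-BiDa"] + ROUTES["I8K.CR140-BiDa"]


def route_request(message_type):
    """Return the services a request would pass through, in order."""
    try:
        return list(ROUTES[message_type])
    except KeyError:
        raise ValueError("unknown message type: " + message_type) from None


print(route_request("I8K.CR140-BiDa"))
```

Keeping the routes in a plain dictionary means a new quality dimension could be added as one more entry without touching the dispatch function.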
Proposal Protocol
[Figure: parties A and B act as Data Provider and Data Requester; step 6’ carries an I8K.CR-BiDa, I8K.CR130-BiDa, or I8K.CR140-BiDa request.]
Proposal Infrastructure
• hd-Master (Name Node), I8K Manager: 24 GB RAM
• hd-slave1 (Data Node): 8 GB RAM
• hd-slave2 (Data Node): 8 GB RAM
Proposal Implementation

Big Data Evaluators 130 and 140

    import sys


    def isEmpty(value):
        # Assumed helper (not shown on the slide): blank or whitespace-only.
        return value.strip() == ""


    def mapper140(dq_rules):
        """Completeness mapper: flag each ';'-separated record read from stdin."""
        for line in sys.stdin:
            data = line.strip().split(";")
            isIndq_rules = True  # stays True while every checked field is filled
            aux = ""
            for i in range(len(data)):
                # dq_rules holds the indexes (as strings) of the fields to check.
                if str(i) in dq_rules:
                    if isEmpty(data[i]):
                        isIndq_rules = False
                    else:
                        # The flattened slide does not show which `if` this
                        # `else` pairs with; the emptiness check is assumed.
                        aux += data[i] + ";"
            print('{0};{1}'.format(isIndq_rules, aux))
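The completeness mapper on this slide can also be exercised without Hadoop. The sketch below is a testable variant that reads from any iterable instead of `sys.stdin`, followed by a hypothetical reduce step that turns the per-record flags into a completeness percentage. The names `map_completeness` and `completeness_level`, the use of integer column indexes, the emptiness test, and the percentage formula are assumptions, not the authors' implementation.

```python
from typing import Iterable, Iterator, Set, Tuple


def map_completeness(lines: Iterable[str], dq_rules: Set[int]) -> Iterator[Tuple[bool, str]]:
    """Yield (is_complete, record) for each ';'-separated input line.

    dq_rules holds the column indexes that must be non-empty (assumed meaning).
    A column that is missing altogether counts as empty.
    """
    for line in lines:
        record = line.strip()
        fields = record.split(";")
        is_complete = all(
            i < len(fields) and fields[i].strip() != "" for i in dq_rules
        )
        yield is_complete, record


def completeness_level(flags: Iterable[bool]) -> float:
    """Hypothetical reduce step: percentage of records flagged complete."""
    flags = list(flags)
    if not flags:
        return 0.0
    return 100.0 * sum(flags) / len(flags)


# Made-up sample rows: id;name;email, with columns 1 and 2 required.
rows = [
    "1;Ana;ana@example.org",
    "2;;bob@example.org",
    "3;Eve;",
    "4;Dan;dan@example.org",
]
mapped = list(map_completeness(rows, {1, 2}))
level = completeness_level(flag for flag, _ in mapped)
print(level)  # 50.0
```

Separating the per-record check from the aggregation mirrors the map and reduce phases, so each half can be tested on a handful of rows before running on the cluster.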
Proposal Advantages
• The ability to assess a huge volume of data.
• Adaptation to the Preture.
• The improvement in the performance of the assessment.
Conclusions
• Appropriate levels of quality are necessary when exchanging data.
• Using Big Data technologies to tackle the efficiency issues of the I8K architecture has improved its performance.
• Using standards as the foundation of our work will make it easier to cope with new challenges.
Future Work
• Include the assessment of new Data Quality dimensions.
• Include real-time assessment.
• Conduct a set of case studies to measure the improvement.