big data value reference modelfrom bdva (sria 4.0) … · 12/31/2019 · big data value reference...
BIG DATA VALUE REFERENCE MODEL FROM BDVA (SRIA 4.0) – WITH MACHINE LEARNING/AI
SINTEF
BDVA TF6 (Technical Priorities) leader
Agenda
BDVA driving big data standardisation and interoperability priorities: Outcomes 2017 and update on ISO JTC1 WG9 Big Data and JTC1/SC42 Artificial Intelligence (Ray Walshe and Abdellatif Tuoimi)
Big Data Value Reference model from BDVA (SRIA 4.0) – with Machine Learning/AI (Arne J. Berre)
Big Data Europe (Simon Scerri)
Big Data Standards related to Big Data PPP projects: (Arne J. Berre)
Big Data PPP - Large Scale Pilots: (DataBio, Transforming Transport) (ICT-15-2017)
Big Data PPP - IA Projects with Linked Data/Graph data: (EUBusinessGraph, EWShopp, QROWD, .. AEGIS), Projects on Cross-sectorial and cross-lingual data integration and experimentation (ICT-14-2017)
Big Data Benchmarking – DataBench project
HOBBIT Benchmarking project (Gayane Sedrakyan)
EO/Geo Big Data type standards (Codrina Ilie)
Report on the Trans-Atlantic EU-US Symposium on Public Private Partnerships for Big Data, November 20th (Ray Walshe and Melissa Cragin) – on Big Data Integration and Interoperability in various domains: Smart Cities, Transportation, Health, Environment, AgriFood, Energy, Water, …
Roadmap to coordinate European Big Data Standardisation priorities (Ray Walshe / All)
04-01-18 3www.bdva.eu
BDVA is taking care… of many different aspects of the big data technology
TF1: Programme
TF2: Impact
TF3: Community
TF4: Communication
TF5: Policy & Societal
TF6: Technical (Dr. Arne J. Berre)
TF6-SG1: Data Management
TF6-SG2: Data Processing Architectures
TF6-SG3: Data Analytics
TF6-SG4: Data Protection and Pseudonymisation Mechanisms
TF6-SG5: Advanced Visualisation and User Experience
TF6-SG6: Standardisation
TF7: Application
TF7-SG1: Emerging Application Areas
TF7-SG2: Telecom
TF7-SG3: Healthcare
TF7-SG4: Media
TF7-SG5: Earth observation & geospatial
TF7-SG6: Smart Manufacturing Industry
TF8: Business
TF8-SG1: Data entrepreneurs (SMEs and startups)
TF8-SG2: Transforming traditional business (Large Enterprise)
TF8-SG3: Observatory on Data Business Models
TF9: Skills and Education
TF9-SG1: Skill requirements from European industries
TF9-SG2: Analysis of current curricula related to data science
TF9-SG3: Liaison with existing educational projects
BDV Reference Model
Big Data Value Reference Model
Applications/Solutions: Manufacturing, Health, Energy, Transport, BioEco, Media, Telco, Finance, EO, SE, …
Big Data Priority Tech Areas:
Data Visualisation and User Interaction: 1D, 2D, 3D, 4D, VR/AR
Data Analytics: Descriptive, Diagnostic, Predictive, Prescriptive; Machine Learning and AI, Deep Learning, Statistics, HYBRID ANALYTICS (Optim/Simulation)
Data Processing Architectures and Workflows: Batch, Interactive, Streaming/Real-time
Data Protection, Anonymisation, …
Data Management: Collection, Preparation, Curation, Linking, Access, Sharing – Data Market/Data Spaces; DB types: SQL, NoSQL (Document, Key-Value, Column, Array, Graph, …)
Cloud and High Performance Computing (HPC)
Things/Assets, Sensors and Actuators (Edge, Fog, IoT, CPS)
Big data Types & semantics: Struct data/BI; Time series, IoT; Geo Spatio Temp; Media Image Audio; Text NLP. Genom; Web Graph Meta
Communication and Connectivity, incl. 5G
Cyber Security and Trust
Development - Engineering and DevOps
Standards
Data sharing platforms, Industrial/Personal
Latest ISO JTC1 WG9 Big Data Reference Architecture
Roles: Data Provider (Make Data Available); Big Data Consumer (Receive Data Output)
Layers: Application Provider Layer; Processing Layer; Platform Layer; Resource Layer
Multi-layer functions: Integration; Security & Privacy; Management
Component labels: Algorithms; Visualizations; Work Flow Engine; Access Services; Transformations; Batch Frameworks; Interactive Frameworks; Streaming Frameworks; Relational Storage; Document Storage; Key-value storage; Graph Storage; Wide-columnar storage; Column based storage; File Systems; Messaging Frameworks; Audit Frameworks; Authorization Frameworks; Authentication Frameworks; Monitoring Frameworks; Provisioning/Configuration Frameworks; Package Management Frameworks; Resource Management Frameworks; Ingest Services; Anonymization Frameworks; State Management Frameworks; Data Life Cycle Management; Resource Abstraction and Control; Physical Resources
Interface labels: DP-AP, AP-PR, PR-PL, DC-AP, PL-RL, IN-FL, SP-FL, MG-FL, IN-SP, SP-MG, IN-MG
Big Data Standards Workshop and ISO JTC1 WG9 meeting, Dublin, 15-22 August 2017 (hosted by Ray Walshe)
BDVA Reference Model vs ISO WG9 Big Data Reference Architecture
Updates/changes from the BDVA Reference Model will be submitted into the ISO process
[Figure: both diagrams shown together. The labels repeat those of the two preceding slides, except that this version of the BDVA model lists "Cyber Security, Risk, Trust, Data Privacy" and "Marketplaces, Ecosystems, and Innovation support".]
HPC, Big Data & Deep Learning Stacks
July 4th, 2017 EXDCI WP2
Columns: HPC; Big Data; Deep Learning
Infiniband & OPA fabrics
Storage & I/O nodes, NAS
GP* CPU nodes, GPUs, FPGAs
Linux OS Variant
Containers
PFS(Lustre etc.)
MPI OpenMP, threading
Accelerator APIs
Numerical libraries
Performance & debugging
Domain-specific libraries
Conventional compiled languages (C, C++, FORTRAN)
Scripting (Python, …)
IDEs & Frameworks(PETSc, …)
Compiled in-house, commercial & OSS applications
Cluster management(OpenHPC)
Batch scheduling (SLURM …)
Linux OS Variant (some Windows)
Ethernet fabrics
Local storage
GP* CPU hyper-convergent nodes
Virtualization: hypervisor or containers (Docker, Kubernetes, …)
VMM and container management
I/O libraries(HDF5, …)
Orchestration and RMS
Cloud service I/F Storage systems(DFS, Key/value, …)
Map-Reduce Processing(Hadoop, Spark)
Data stream processing (Storm, Spark, …)
Distributed coordination (Zookeeper, …)
Workflows combining many application elements
Compiled languages (C++)
Traditional ML(Mahout)
Scripting & WF languages (R, Python, Java, Scala, …)
Linux OS Variant (Windows?)
Ethernet‡ GP* CPU + GPU/FPGA, TPU
Local storage or NAS/SAN
Virtualization: hypervisor or containers (Docker, Kubernetes, …)
VMM and container management
Orchestration and RMS
Neural network frameworks (Caffe, Torch, Theano, TensorFlow, …)
Load distribution layer
Scripting languages (Python, …)
Inference engines(low precision)
Defined and instantiated/trained neural networks
Can be part of
Applications
Middleware& Mgmt.
System SW
Hardware
* GP: general purpose
User-space fabric access
Direct accelerator use
Numerical libraries (dense LA) Accelerator APIs
Cloud service I/F Storage systems(DFS, Key/value, …)
Red boxes: data components ‡ need for faster fabrics for training scale-out
Exclusive use of partitions
BDV Reference Model with AI/Machine Intelligence areas highlighted
Key: Core BDVA; In Collaboration
Data Types: Time series, IoT; Structured Data/Business Intelligence; Geo Spatial Temporal; Media Image Audio; Text, Language, Genomics; Web Graph Meta
Data Visualisation and User Interaction
Data Analytics - w/AI Cognition – learning, reasoning, creativity
Data Processing Architectures
Data Protection
Data Management
Cloud and High Performance Computing (HPC)
Things/Assets, Sensors and Actuators (Edge, IoT, CPS)
Communication and Connectivity, incl. 5G
Cyber Security and Trust
Data sharing platforms, Industrial/Personal
Standards
Development - Engineering and DevOps
Highlighted AI labels: Perception and Control; Facts, Rules; Sensor, Image, NLP, Knowledge Graph, Speech; Spatial; Cognition -> Recommendations -> Decisions
BDVA - Analytics – Taxonomy/Classifications
• Analytic types – with Applied/Industrial/Commercial Analytic types (4+2 groups):
  • Descriptive (past) (Historical analysis, …)
  • Diagnostic (now) (Anomaly detection, Fraud detection, Condition Monitoring)
  • Predictive (future) (Predictive maintenance, Forecasting, …)
  • Prescriptive (Recommendation systems, Quality improvement, ..)
  • Hybrid analytics (Combining Data-Driven analytics with First-Order Modeling – Simulation/Optimisation etc.)
  • Extreme analytics (Integration with HPC, scientific computing for extreme performance, ..)
• Machine learning, AI, Statistics, Algorithms/Deep Learning/Data Mining/Data Science (3 groups):
  • Supervised learning (Classification, Regression, AI - Deep Learning (Neural Networks), ..)
  • Unsupervised learning (Clustering, Dimension reduction (Principal Component Analysis, Multidimensional Scaling), Association rules, Causality detection, On-line learning (streaming))
  • Reinforcement learning (Collaborative filtering, …)
• Big data type analytics (6 groups + 1 – input to above machine learning and advanced analytics):
  • Structured data (Basic min/max/sum/average queries, Basic Statistics, Correlations)
  • Time Series/Sensor/IOT/Stream
  • Media/Image/Video/Audio (Image recognition, speech recognition, …)
  • (Geo) Spatial (Spatio temporal analysis, geometry/topology ...)
  • Text/NLP (Text/Speech understanding/cognition, translation, Semantic search, Sentiment analysis)
  • Graph/Network/Web/(Metadata) (Linked Data – Graph analytics, …)
  • Ontologies/Taxonomies/Vocabularies for Semantic Interoperability (for all big data types and for various domains/sectors/industries)
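The four basic analytic types and their example applications can be held in a small lookup structure. The following is a minimal sketch, not part of the BDVA material: the dictionary only transcribes the examples named on the slide, and the function name `analytic_type_of` is a hypothetical helper introduced for illustration.

```python
# Example applications per analytic type, transcribed from the taxonomy slide.
# The structure and the helper below are illustrative assumptions.
ANALYTIC_TYPES = {
    "Descriptive": ["Historical analysis"],
    "Diagnostic": ["Anomaly detection", "Fraud detection", "Condition monitoring"],
    "Predictive": ["Predictive maintenance", "Forecasting"],
    "Prescriptive": ["Recommendation systems", "Quality improvement"],
}


def analytic_type_of(application):
    """Return the analytic type listing this application, or None if absent."""
    wanted = application.strip().lower()
    for analytic_type, applications in ANALYTIC_TYPES.items():
        if wanted in (a.lower() for a in applications):
            return analytic_type
    return None


if __name__ == "__main__":
    for name in ("Fraud detection", "Forecasting", "Image recognition"):
        print(name, "->", analytic_type_of(name))
```

Applications that the slide files under a data-type group rather than an analytic type (for example image recognition) return `None` here, which reflects that the taxonomy has several independent axes.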
From https://machinelearningmastery.com/
ACM Computing Classification System https://www.acm.org/publications/class-2012
from 2012 …
The recent AI summer is based on the use of Machine learning combined with (Big) Data
BDV REFERENCE MODEL – FOR CURRENT PROJECTS – INCL. DATABIO AND TRANSFORMING TRANSPORT
SINTEF
BDVA TF6 (Technical Priorities) leader
BDV PPP projects (H2020-ICT-2016-2017)
Cross-sectorial and cross-lingual data integration and experimentation (ICT-14)
Enabling the European Business Graph for Innovative Data Products and Services: http://eubusinessgraph.eu/
Supporting Event and Weather-based Data Analytics and Marketing along the Shopper Journey: http://www.ew-shopp.eu/
Understanding Europe’s Fashion Data Universe: https://fashionbrain-project.eu/
Advanced Big Data Value Chain for Public Safety and Personal Security: http://www.aegis-bigdata.eu/
Accelerating Data to Market: https://datapitch.eu/
Because Big Data Integration is Humanly Possible: http://qrowd-project.eu/
Exploiting Oceans of Data for Maritime Applications: http://www.bigdataocean.eu/site/
Scalable Linking and Integration of Big POI data: http://www.slipo.eu/
Support, industrial skills, benchmarking and evaluation (ICT-17)
Support the implementation of the Big Data Value PPP: www.big-data-value.eu
Privacy-preserving big data technologies (ICT-18)
MyHealth MyData: http://www.myhealthmydata.eu/
Scalable Oblivious Data Analytics: https://www.soda-project.eu/
Ethical and Societal Implications of Data Sciences: http://www.e-sides.eu/
Scalable Policy-awarE linked data architecture for privacy, transparency and compliance: https://www.specialprivacy.eu/
Large scale pilot actions in sectors best benefitting from data-driven innovation (ICT-15)
Data-driven bioeconomy: https://www.databio.eu/en/
Transforming Transport: http://www.transformingtransport.eu/
Enabling responsible ICT-related research and innovation (ICT-35)
Knowledge Complexity: https://kplex-project.com/
REQUESTED INPUT FROM ALL OF THE BIG DATA PPP PROJECTS (THROUGH BDV)
- Use case template – filled in – for all pilots/use cases
- BDV Reference Model mapping – for technical components and standards used/proposed
- Input for project business results evaluation – for Business KPI
- Project needs/opportunities/requirements for business and technology benchmarks
- Input for technology results evaluation – for Technology KPIs
ISO JTC1 WG9 use case template (1/2)
ISO JTC1 WG9 use case template (1/2)
Template for showing technology areas – with project use and creation of technologies (Extended BDVA Reference Model)
Key: Created by your project; Used by your project
Use cases, Pilots, Applications, Sectors, Domains, Solutions, …
Data Visualisation and User Interaction: 1D, 2D, 3D, VR/AR; 1+1D (time), 2+1D (time), 3+1D (time); other
Data Analytics: Descriptive, Diagnostic, Predictive, Prescriptive; Classification, Clustering, Regression, Deep Learning (ANN, CNN, RNN), Various AI Methods, Optimisation, Simulation; other
Data Processing Architectures: Batch, Interactive, Streaming/Real-time, Other
Data Management: Collection/Ingestion, Preparation/Curation, Linking/Integration, Access; SQL, File-store, NoSQL Key-value, NoSQL Document Store, NoSQL Column, NoSQL Graph; other
Data sources and Big Data types: Struct Data, BI; IoT Sensor; Spatio Temp; Media Image; Text/Genome; Graph Network Metadata; Other
Data Protection & Cyber Security
Engineering & DevOps
Standards
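The template can be read as a grid of (area, category) cells in which each project marks the components it uses or creates. The sketch below is one possible encoding under stated assumptions: only four of the areas are included, the names `TEMPLATE`, `RELATIONS` and `fill_template` are hypothetical, and the two example components are placeholders rather than real project components.

```python
from collections import defaultdict

# Areas and categories transcribed from the template slide (subset only).
TEMPLATE = {
    "Data Visualisation and User Interaction": ["1D", "2D", "3D", "VR/AR", "other"],
    "Data Analytics": ["Descriptive", "Diagnostic", "Predictive", "Prescriptive", "other"],
    "Data Processing Architectures": ["Batch", "Interactive", "Streaming/Real-time", "Other"],
    "Data Management": [
        "Collection/Ingestion", "Preparation/Curation",
        "Linking/Integration", "Access", "other",
    ],
}

# The slide's key distinguishes components created by a project from those it uses.
RELATIONS = ("used", "created")


def fill_template(entries):
    """Group (component, area, category, relation) tuples by template cell."""
    grid = defaultdict(list)
    for component, area, category, relation in entries:
        if category not in TEMPLATE.get(area, []):
            raise ValueError(f"unknown template cell: {area} / {category}")
        if relation not in RELATIONS:
            raise ValueError(f"relation must be one of {RELATIONS}: {relation}")
        grid[(area, category)].append((component, relation))
    return dict(grid)


# Hypothetical entries, for illustration only.
example = [
    ("component-A", "Data Processing Architectures", "Batch", "used"),
    ("component-B", "Data Analytics", "Predictive", "created"),
]
filled = fill_template(example)

if __name__ == "__main__":
    for cell, components in filled.items():
        print(cell, components)
```

Rejecting unknown cells keeps mappings from different projects comparable, which is the stated purpose of collecting them in a common template.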
Extended BDVA Reference Model - for showing technology areas – with project use
Pilot C2 – T3.4.2 Small pelagic market predictions and traceability
Key: Created/Modified by DataBio; Used by DataBio fishery pilot C2
[Figure: the template grid of the preceding slide, with the same area and category labels, annotated with the component numbers listed below.]
1. Ratatosk (C17.01) for logging data from vessel
2. Hydroacoustic/sonar and weather data
3. Sonar, radar and EO images (C07.01-4)
4. Pelagic auction data and catch reports
5. World Bank market data (JSON)
6. GeoJSON data store – CouchDB?
7. Map feature server – WMS – GeoServer, GeoRocket or other (C04.01-4)?
8. Stim (C17.02) for data validation, and decimation/interpolation
9. SW component gathering data from 4 and 5
10. Batch processing of historic data - Hadoop?
11. Interactive and iterative development
12. On vessel processing, ref 1, 2, 3, data filtering for fish observations. Image classification (C29.01, C31.01 or other?)
13. Data collation for different data sources of high variety in sampling frequency (spatial/time) and formats – From DataBio platform
14. ML methods for predictive price models (C34.01, C16.02, C34.01 or other)
15. Correlation/Covariance analysis
16. ML optimization of factors affecting fishery revenue
17. Plot of price models/time – R, matlab or octave?
18. Interactive display of spatio-temporal catch and market information - CESIUM and/or Rasdaman (C04.01, C05.01)?
19. Gluster
20. Docker for setup/job/service run
21. DCOS w/ MESOS
[Figure: the component numbers 1-21 placed in the cells of the reference model grid; the cell positions are not recoverable from this transcript.]
This document is part of a project that has received funding from the European Union’s Horizon 2020 programme under grant agreement No 732064. It is the property of the DataBio consortium and shall not be distributed or reproduced without the formal approval of the DataBio Management Committee. See www.databio.eu.
• Horizon 2020 (ICT call, Big Data)
• 48 partners from 16 countries
• Funded between 2017 and 2020 (36 months)
• Overall budget 16 mil. € (EU contribution 13 mil. €)
DataBio project info
Project title: Data-Driven Bioeconomy
Project type: H2020 Innovation Action, in topic ICT-15-2016-2017 - Big Data PPP: Large Scale Pilot actions in sectors best benefitting from data-driven innovation
Duration: 1 Jan. 2017 – 31 Dec. 2019 (36 months)
Total budget: 16,2 M€
Partners: 48 partners, 70+ associated partners
Project objectives
• Industrial domain: bioeconomy
  • Improve utilization of best possible raw materials from agriculture, forestry and fishery
  • Improve production of food, energy and biomaterials
• The opportunity: bioeconomy can benefit from Big Data
  • Farm machines, fishing vessels, forestry machinery and remote and proximal sensors collect large quantities of data
  • Big data technologies process data and create knowledge that increases performance and productivity in a sustainable way
• Project objectives
  • Build a versatile DataBio platform suitable for different industries and user profiles
  • Ensure effective utilization of Big Data datasets and technologies in bioeconomy
  • Open the possibilities for European ICT and Earth Observation industries in the bioeconomy market
Combining drivers and assets
Sector        Variety                Volume (TB)   Velocity (TB/Year)
Agriculture   8 sources, 4 types     53            197
Forestry      8 sources, 7 types     11,39         12,12
Aerial/UAV                                         100 GB/h
Fishery       20 sources, 13 types   8,82          6,27
26 pilots, in 3 sectors x 3 thematic groups
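The per-sector figures can be summed to get a rough overall picture. The sketch below assumes the values use a decimal comma (so "11,39" means 11.39 TB) and leaves out the Aerial/UAV rate of 100 GB/h, since the slide gives no operating hours to convert it to TB/year. The names `ROWS`, `parse_decimal_comma` and `totals` are introduced here for illustration.

```python
# (sector, volume in TB, velocity in TB/year), as printed on the slide.
ROWS = [
    ("Agriculture", "53", "197"),
    ("Forestry", "11,39", "12,12"),
    ("Fishery", "8,82", "6,27"),
]


def parse_decimal_comma(text):
    """Convert a number written with a decimal comma to a float."""
    return float(text.replace(",", "."))


def totals(rows):
    """Return (total volume in TB, total velocity in TB/year), rounded to 2 places."""
    volume = sum(parse_decimal_comma(v) for _, v, _ in rows)
    velocity = sum(parse_decimal_comma(r) for _, _, r in rows)
    return round(volume, 2), round(velocity, 2)


if __name__ == "__main__":
    total_volume, total_velocity = totals(ROWS)
    print(f"Total volume:   {total_volume} TB")
    print(f"Total velocity: {total_velocity} TB/year")
    for sector, volume, velocity in ROWS:
        ratio = parse_decimal_comma(velocity) / parse_decimal_comma(volume)
        print(f"{sector}: yearly inflow is {ratio:.2f}x the current volume")
```

Under these assumptions the three sectors together hold 73.21 TB and grow by 215.39 TB per year, with agriculture accounting for most of both.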
1. HsLayerNG3D-OLU: integrates Cesium 3D
2. HsLayerNG: JS web mapping library
3. Metaphactory: graph data visualization
4. D2RQ: transforms relational DB to virtual RDF graphs
5. Virtuoso: to store the semantic datasets
6. Silk: for discovering links
7. SPARQL: for querying semantic data
8. Farming data model and ontology
9. Datasets:
   a. Yield potential
   b. Open Land Use
   c. SmartPOI
   d. Terras Gauda
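The idea behind items 4 and 7 above, exposing relational rows as graph triples and then querying them by pattern, can be illustrated with a toy example. This is a plain-Python sketch and does not use or reproduce the D2RQ, Virtuoso or SPARQL interfaces; the table name, columns, values and the functions `rows_to_triples` and `match` are all hypothetical.

```python
def rows_to_triples(table, rows, key):
    """Turn relational rows into (subject, predicate, object) triples.

    Each row becomes one subject identified by its key column; every other
    column becomes a predicate whose object is the cell value.
    """
    triples = []
    for row in rows:
        subject = f"{table}/{row[key]}"
        for column, value in row.items():
            if column != key:
                triples.append((subject, f"{table}#{column}", value))
    return triples


def match(triples, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Hypothetical relational table, for illustration only.
fields = [
    {"id": 1, "crop": "wheat", "area_ha": 12.5},
    {"id": 2, "crop": "barley", "area_ha": 7.0},
]
triples = rows_to_triples("field", fields, key="id")

if __name__ == "__main__":
    # Which fields grow wheat?
    for triple in match(triples, p="field#crop", o="wheat"):
        print(triple)
```

A real deployment would use declared mappings and URIs instead of string concatenation, but the pattern-with-wildcards query is the same basic shape as a single triple pattern in a graph query language.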
Integration of geographic and linked data into a 3D web environment
Components used in 1 (of 26 pilots)
[Figure: the Extended BDVA Reference Model template grid, with the same area and category labels as on the earlier template slide, annotated with the component numbers 1-9 listed above; D2RQ is also labelled by name.]
(Existing) EO Infrastructure, e.g. ESA, VITO, e-GEOS, …
EO Data Management
EO Data Processors
EO “data pipe”
Example from TT – Transforming Transport (EU project)
Big Data PPP - IA Projects with Linked Data/Graph data: (EUBusinessGraph, EWShopp, QROWD, .. AEGIS), Projects on Cross-sectorial and cross-lingual data integration and experimentation (ICT-14-2017)
Ref. presentations of these projects in the PPP project session on November 22nd
All projects are using various aspects of linked data technologies (W3C++)
Suitable target for benchmarking in general and linked data benchmarking (HOBBIT) in particular.
BENCHMARKING (REQUESTED INPUT FROM BIG DATA PPP PROJECTS)
- Use case template – filled in – for all pilots/use cases
- BDV Reference Model mapping – for technical components and standards used/proposed
- Input for project business results evaluation – for Business KPI
- Project needs/opportunities/requirements for business and technology benchmarks
- Input for technology results evaluation and benchmarking – for Technology KPIs
BDVA Reference Model
Sectors: Manufacturing, Health, Energy, Media, Telco, Finance, EO, ..
Big Data Priority Tech Areas: Data Visualisation and User Interaction; Data Analytics; Data Processing Architectures; Data Privacy; Data Management; Infrastructure
Big data Types & semantics: Struct data/BI; Time series, IoT; Geo Spatio Temp; Media Image Audio; Text NLP; Web Graph
Horizontal benchmarks; Vertical benchmarks
Benchmarks placed on the model (several appear in more than one cell): BigBench, BigBench 2.0, BigDataBench, ALOJA, TPC, Hobbit-I+III, Hobbit-II, Hobbit-IV, LDBC-1 Semantic Pub, LDBC-2 Social Net, LDBC-3 Graphalytics, YStreamB, DeepMark, DeepBench, RIoTBench, SenseMark, ABench, SparkBench, YCSB, StreamBench