Summary, conclusions and comments
bigdata.htw-berlin.de/17/slides/4.4_Guelzow.pdf

Volker Guelzow Summary, conclusions and comments


Post on 18-Oct-2020


Page 1:

Volker Guelzow

Summary, conclusions and comments

Page 2:

Social Media Platforms
•  10 billion Facebook messages/day
•  4.5 billion likes
•  350 million pictures/day
Source: https://www.simplilearn.com/how-facebook-is-using-big-data-article

Page 3:

Social Media used for marketing
Source: http://www.socialmediaexaminer.com/SocialMediaMarketingIndustryReport2015.pdf

Page 4:

Big Data, are we still leading?

•  Flagship projects SKA and HL-LHC with exabytes/year in about 2025
•  Various other projects like
   •  European XFEL ~ 100 PB/year before 2020
   •  CTA, …
•  Big Data is not only a matter of size
•  Big Data means open data

Page 5:

Big Data in Astronomy

•  Data-driven methods
•  Lots of information hidden in lowers data, but difficult to extract (degeneracies) individually
•  Data-driven methods can statistically tackle the problem.
•  Astronomy is a discovery science
•  In order to tap the full potential: Big data means open data!
•  Some very exciting big data projects in the making
•  Ideal playing ground (no commercial interest, (almost) no privacy concerns)
•  Data-driven algorithms key to exploit the fun

Page 6:

The SKA Data Flow

Page 7:

Fraunhofer: Data Life Cycle

1 Application generates data
2 Data recorded by sensors
3 Sensors pre-process data
4 Transmitting data
5 Storing data
6 Structuring data
7 Grouping data
8 Interpreting data
9 Forecasting data
10 Data simulation
11 Data becomes input for decision models
12 Optimizing models
13 Solutions show added values
14 Interpreting and implementing added values
15 Implementation modifies application

Legend: A = Analytics, A = Application, D = Data

Page 8:

What about Data Compression/Reduction?

From 1 PB/s to 100 MB/s
Nyriad (NZ) start-up
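As a rough sanity check, going from 1 PB/s to 100 MB/s corresponds to a reduction by a factor of ten million. The sketch below is plain arithmetic only; it assumes decimal (SI) units, which the slide does not specify, and the function name is illustrative.

```python
# Rough arithmetic only; assumes SI (decimal) units, which the slide does not specify.
PB = 10**15  # bytes in one petabyte (decimal)
MB = 10**6   # bytes in one megabyte (decimal)


def reduction_factor(rate_in_bytes_per_s: int, rate_out_bytes_per_s: int) -> float:
    """Return how many times smaller the output rate is than the input rate."""
    if rate_out_bytes_per_s <= 0:
        raise ValueError("output rate must be positive")
    return rate_in_bytes_per_s / rate_out_bytes_per_s


raw_rate = 1 * PB        # 1 PB/s, as quoted on the slide
stored_rate = 100 * MB   # 100 MB/s, as quoted on the slide

factor = reduction_factor(raw_rate, stored_rate)
print(f"reduction factor: {factor:.0e}")  # prints "reduction factor: 1e+07"
```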

Page 9:

Will the network be a problem?

CERN to Tier X: average 48 Gbps
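To put the 48 Gbps average into storage terms: sustained around the clock, it moves roughly 518 TB per day, or about 189 PB per year. The sketch below shows the conversion; it assumes the link runs at the quoted average all the time and uses decimal units, and the helper name is illustrative.

```python
# Illustrative arithmetic; assumes the link runs at the quoted average all the time
# and uses decimal units (1 Gbit = 10**9 bits, 1 PB = 10**15 bytes).
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365


def bytes_transferred(rate_gbps: float, seconds: float) -> float:
    """Bytes moved in `seconds` at a sustained rate of `rate_gbps` gigabits/s."""
    bits = rate_gbps * 1e9 * seconds
    return bits / 8


avg_rate_gbps = 48  # average quoted on the slide
per_day_tb = bytes_transferred(avg_rate_gbps, SECONDS_PER_DAY) / 1e12
per_year_pb = bytes_transferred(avg_rate_gbps, SECONDS_PER_DAY * DAYS_PER_YEAR) / 1e15

print(f"{per_day_tb:.1f} TB/day, {per_year_pb:.1f} PB/year")
```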

Page 10:

Future computing models?

•  Federation of resources
•  Federation across domains
•  Only a few large data centres
•  Many centres for compute
•  Hybrid solutions
•  Commercial clouds

Page 11:

The SKA Computing Model
A collaborative alliance
•  transparent and location-agnostic interface to SRCs (SKA Regional Centres) for users
•  no SKA user should care where their data products are
•  all SKA users should be able to access their data products, irrespective of whether their country or region hosts an SRC
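One way to picture a location-agnostic interface is a single entry point that resolves a data product by identifier and hides which centre holds it. The sketch below is purely hypothetical: the class names, method names and product identifiers are invented for illustration and do not come from any SKA software.

```python
from dataclasses import dataclass, field

# Hypothetical sketch: none of these class or method names come from SKA software.


@dataclass
class RegionalCentre:
    """Stand-in for one regional centre holding some data products."""
    name: str
    products: dict[str, bytes] = field(default_factory=dict)


class Federation:
    """Single entry point that hides which centre stores a product."""

    def __init__(self, centres: list[RegionalCentre]) -> None:
        self._centres = centres

    def fetch(self, product_id: str) -> bytes:
        # The caller passes only a product identifier, never a location.
        for centre in self._centres:
            if product_id in centre.products:
                return centre.products[product_id]
        raise KeyError(f"unknown data product: {product_id}")


federation = Federation([
    RegionalCentre("centre-a", {"obs-001": b"image cube"}),
    RegionalCentre("centre-b", {"obs-002": b"visibilities"}),
])

# A user anywhere issues the same call, whichever centre holds the data.
print(federation.fetch("obs-002"))
```

A real federation would also need authentication, replication and transfer scheduling, none of which this toy lookup attempts.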

Page 12:

Software, Algorithms, Methods?

Science needs causality; marketing from social media is focused on correlation
•  Software is a problem of all communities
•  Algorithms like machine learning, simulations (start-to-end), low signal-to-noise, visual analytics and visualization etc. need to be evolved
•  Optimization, enabling new technologies etc. is needed
We need to invest in this field
What about Data Curation?

Page 13:

Software, Algorithms, Methods

Page 14:

Cooperation with industry

Page 15:

Cooperation with industry

SAP-HANA4Pulsar
SAP for medical insights
Partner KIT in smart Innovation Lab
But it's not one-way!

Page 16:

Only Astronomers?

Page 17:

Page 18:

Education, Training, Reputation, new directions
A new type of scientist, bridging the gap between computer science and domain science, is needed
… to make efficient use of ICT and to help harvest the domain science
We have to offer reputation and career prospects
And we have to train scientists in the research fields

Page 19:

ADA-Center Structure

Further Universities
•  National (LMU, Passau, Würzburg etc.)
•  International (Georgia-Tech, Montreal etc.)

Companies
•  E-Commerce
•  Industrie 4.0
•  Agriculture
•  Public safety
•  Automotive
•  Logistics
•  Public transport
•  Insurance
•  etc.

ADA-Center: Fraunhofer IIS and FAU
•  Youth development
•  SENSORS: Signal detection and processing
•  IoT: Data transmission and networking
•  ANALYTICS: Creation of knowledge and models

Research cooperations, groups, projects
Cooperations with industry, projects

Page 20:

Helmholtz Association
HGF President Wiestler: Information & Data Management, a Key Element for Helmholtz
•  Helmholtz Data Federation
•  LSDMA
•  Helmholtz Analytics Framework
•  LHC Tier 1&2 centres
•  Helmholtz Incubator
Software developments, e.g. dCache, GGUS, HPC, …
-> Many competences!

Page 21:

Helmholtz Association - DLR
20,000 simulations for digital aircraft
Deep learning in remote sensing
Jena Institute:
•  Management/analysis of big data
•  Smart systems
•  IT security
•  Citizen science

Page 22:

Partnership Initiative Computational Sciences πCS

•  Individualized services for selected scientific groups – flagship role
   –  Dedicated point-of-contact
   –  Individual support and guidance and targeted training & education
   –  Planning dependability for use case specific optimized IT infrastructures
   –  Early access to latest IT infrastructure (hard- and software) developments and specification of future requirements
   –  Access to IT competence network and expertise at CS and Math departments
•  Partner contribution
   –  Embedding IT experts in user groups
   –  Joint research projects (including funding)
   –  Scientific partnership – equal footing – joint publications
•  LRZ benefits
   –  Understanding the (current and future) needs and requirements of the respective scientific domain
   –  Developing future services for all user groups
   –  Thematic focusing: Environmental Computing

D. Kranzlmüller, CompBioMed Workshop

Page 23:

Technology

Technology development (up to 20% increase in performance/year for constant budget) will NOT solve the problems.
We don't really see disruptive technologies coming up soon.
Develop software technology
•  Optimization and parallelization
•  GPGPUs, FPGAs, …
•  Green computing
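To see why 20% per year falls short when data rates are set to grow by orders of magnitude, compound the gain: capacity for a constant budget grows to about 2.5x after five years and about 6x after ten, and a tenfold gain takes more than twelve years. The sketch below assumes the 20% upper bound holds as a constant compound rate every year, which is a simplification; the function names are illustrative.

```python
import math

# Illustrative only: treats "up to 20% per year" as a constant compound rate,
# which is an assumption; the slide gives it as an upper bound.
ANNUAL_GAIN = 0.20


def capacity_after(years: float, gain: float = ANNUAL_GAIN) -> float:
    """Relative capacity for a constant budget after `years` of compound gains."""
    return (1 + gain) ** years


def years_to_reach(target_factor: float, gain: float = ANNUAL_GAIN) -> float:
    """Years of compound gains needed to multiply capacity by `target_factor`."""
    return math.log(target_factor) / math.log(1 + gain)


print(f"after 5 years:  x{capacity_after(5):.2f}")
print(f"after 10 years: x{capacity_after(10):.2f}")
print(f"years for x10:  {years_to_reach(10):.1f}")
print(f"years for x100: {years_to_reach(100):.1f}")
```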

Page 24:

What about Funding?

National and European funding
Coordinated and coherent actions needed

Page 25:

Will the EOSC (European Open Science Cloud) solve the problem?

Page 26:

Summary – 1st
•  Big data is not only a matter of size
•  Google, Amazon, Facebook etc. will not solve our problems
•  But we can learn and profit from them
•  We can have a direct impact on industry and should cooperate
•  But we need to cooperate across disciplines
•  There is plenty of expertise in Germany on data management and software development, and good cooperation with industry
•  But we need to improve the education of people
•  We need to bring experts across domains together
•  We need a change in the scientific reputation of "scientific software development"

Page 27:

Summary – 2nd

•  We need to invest in software development
•  We don't see pressing limits in networking
•  This allows for different computing models -> hybrid models, only a few data centres in Germany?
•  Progress in computer and storage technology will not solve our problems on a constant budget
•  Open Data and FAIR principles need policies
