thorsten papenbrock felix naumann introduction · deep learning for text mining (master)...

50
Distributed Data Management Introduction Thorsten Papenbrock Felix Naumann F-2.03/F-2.04, Campus II Hasso Plattner Institut

Upload: others

Post on 04-Jun-2020

12 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Distributed Data Management

Introduction

Thorsten Papenbrock

Felix Naumann

F-2.03/F-2.04, Campus II

Hasso Plattner Institut

Page 2: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Lost & Invalid Messages

Consensus Termination

Page 3: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Distributed Data Management

Information Systems Team

Slide 3

Introduction

Distributed Data Management

Thorsten Papenbrock

Data Fusion Service-Oriented

Systems

Prof. Felix Naumann

Information Integration

Information Quality

Data Profiling

Entity Search

Duplicate Detection

RDF Data Mining

ETL Management

project DuDe

project Stratosphere

Data as a Service

Opinion Mining

Data Scrubbing

project DataChEx

Dependency Detection Linked Open Data

Data Cleansing

Web Data

Entity Recognition

Dr. Thorsten Papenbrock

Text Mining

Dr. Ralf Krestel

Konstantina Lazariduo John Koumarelas

Michael Loster

Hazar Harmouch

Diana Stephan

Tobias Bleifuß

Tim Repke

Lan Jiang

Web Science

Data Change

project Metanome

Julian Risch

Leon Bornemann

Change Exploration Data Preparation

Page 4: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Distributed Data Management

Introduction: Audience

Slide 4

Introduction

Distributed Data Management

Thorsten Papenbrock

English? Which semester?

HPI or Guest?

Database knowledge?

Other related lectures?

ITSE, DE, DH?

Distributed experience?

Page 5: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Distributed Data Management

Courses 2018/2019

Slide 5

Introduction

Distributed Data Management

Thorsten Papenbrock

Lectures

Datenbanksysteme II (Bachelor)

Deep Learning for Text Mining (Master)

Distributed Data Management (Master)

Seminars

Methoden der Forschung (Master)

Data Preparation for Science (Master)

Recommender Systems (Master)

Social Network Analysis in Practice (Master)

Page 6: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

6

Domain

Knowledge

Data

Science

Control Flow Iterative Algorithms

Error Estimation

Active Sampling

Sketches

Curse of Dimensionality

Decoupling

Convergence

Monte Carlo

Mathematical Programming

Linear Algebra

Stochastic Gradient Descent

Statistics

Data Obfuscation

Parallelization

Query Optimization

Visual Analytics

Relational Algebra / SQL

Scalability

Data Analysis Languages

Fault Tolerance

Memory Management

Memory Hierarchy

Data Flow

Information Extraction

Indexing

RDF / SparQL

NF2 /XQuery

Data Warehouse/OLAP

Domain Expertise (e.g., Industry 4.0, Medicine, Physics, Engineering, Energy, Logistics)

Real-Time

Information Integration

Text Mining Graph Mining

Signal Processing

Business Models Legal Aspects

Privacy

Security

Regression

Machine Learning

Predictive Analytics

Page 7: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Distributed Data Management

This Lecture

Slide 7

Introduction

Distributed Data Management

Thorsten Papenbrock

Lecture

For master students

(IT-Systems Engineering,

Digital Health, Data Engineering)

6 credit points, 4 SWS

Mondays 13:30 – 15:00

Tuesdays 09:15 – 10:45

Exercises

Interleaved with lectures

Slides

On website

Website

https://hpi.de/naumann/teaching/teaching/ws-1819/distributed-data-management-vl-master.html

Prerequisites

To participate:

A little background and interest in

databases (e.g. DBS I lecture)

For exam:

Attending lectures, participation in

exercises, and completion of

exercise homework tasks

Antirequisite

• Distributed Data Analytics

Exam

Written exam

Probably first week after lectures

Page 8: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Distributed Data Management

Feedback

Slide 8

Introduction

Distributed Data Management

Thorsten Papenbrock

Question any time please!

During lectures

Visit us: Campus II, Rooms F-2.03 and F-2.04

Email:

[email protected]

[email protected]

Also: Give feedback about …

improving lectures

informational material

organization

Official evaluation

At the end of this semester

… too late for important feedback!

Page 9: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Distributed Data Management

Page 10: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Distributed Data Management

Page 11: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Motivation: “Distributed”

Paradigm Shift in Software-Writing

http://www.gotw.ca/publications/concurrency-ddj.htm

The free lunch is over!

Clock speeds stall

Transistor numbers still increase

Cores in CPUs/GPUs

CPUs/GPUs in compute nodes,

compute nodes in clusters

Paradigm Shift:

Earlier: optimize code for a single thread

Now: solve tasks in parallel

Distributed computing

“Distribution of work on (potentially)

physically isolated compute nodes”

Moor’s Law

Power wall

Page 12: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Motivation: “Distributed”

Surpassing Moor’s Law

Moore’s Law (Observation)

“The number of transistors in

integrated circuits doubles

approximately every two years”

With distributed machines, we can already build systems with any number of transistors!

(don’t even need to wait for a new processors)

Page 13: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Motivation: “Distributed”

A Rule to Acknowledge

Amdahl’s Law

“The speedup of a program using

multiple processors for parallel

computing is limited by the

sequential fraction of the program”

s: degree of parallelization (e.g. #cores)

p: percentage of the algorithm that

profits from parallelization

𝑆𝑝𝑒𝑒𝑑𝑢𝑝 𝑠 = 1

1 − 𝑝 +𝑝𝑠

Even distributed parallelization cannot work around this law!

Page 14: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Motivation: “Distributed”

New Technologies

Slide 14

Introduction

Distributed Data Management

Thorsten Papenbrock

Distributed Computing

… r

Distributed Storage

Page 15: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Slide 15

Introduction

Distributed Data Management

Thorsten Papenbrock

Page 16: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Slide 16

Introduction

Distributed Data Management

Thorsten Papenbrock

Page 17: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Motivation: “Distributed”

Small and Medium Scale

Low-cost and low energy cluster of Cubieboards running Hadoop

A cluster of commodity hardware running Hadoop

Page 18: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Motivation: “Distributed”

Large Scale

A cluster of machines running Hadoop at Yahoo!

Page 19: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Motivation: “Distributed”

Super Large Scale

Slide 19

Thorsten Papenbrock

Top 10 Super Computers 2017

All distributed systems!

https://www.top500.org/lists/2017/06/

Page 20: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Motivation: “Distributed”

Super Large Scale

Slide 20

Thorsten Papenbrock

Top 10 Super Computers 2017

All distributed systems!

https://www.top500.org/lists/2017/06/

Page 21: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Motivation: “Distributed”

Super Large Scale

Slide 21

Thorsten Papenbrock

Top 10 Super Computers 2017

All distributed systems!

https://www.top500.org/lists/2017/06/

Page 22: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Distributed Data Management

“The ability of extracting different kinds of information from data!”

Structural information

Explicit information

Implicit/derived information

Analytics

Page 23: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

https://www.idc.com/getdoc.jsp?containerId=prUS41826116 http://sigmacareer.com/big-data-what-is-it-and-what-are-the-trends

Excellent job opportunities in many companies!

A market worth $122 billion in 2016 with a growth of 11.3% per year!

For a world that created an entire zettabyte (which is exactly 1012 GB)

of data in the 2010 alone!

Page 24: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

VLDB 2017 Program

International conference “Very Large Data Bases”

Page 25: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

VLDB 2017 Program

With strong relation to data analytics and/or

distributed processing

Page 26: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Motivation: “Data Analytics”

Back to Super Computers

Slide 26

Introduction

Distributed Data Analytics

Thorsten Papenbrock

For what tasks are these computer used?

Weather forecasting

Market analysis

Crash simulation

Disaster simulation

Brute force decryption

Molecular dynamics modeling

Data Analytics tasks!

Page 27: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Distributed Data Management

“The ability to efficiently read, transform, and store

large amounts of data!”

Static (block) data

Volatile (streaming) data

Processing

Page 28: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Motivation: “Data Processing”

Data Processing in Practice

Slide 28

Introduction

Distributed Data Management

Thorsten Papenbrock Startup Everyone starts small but be prepared for the hype!

Page 29: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Motivation: “Data Processing”

Data Processing in Practice

Slide 29

Introduction

Distributed Data Management

Thorsten Papenbrock

Example: Mobile Motion GmbH

Dubsmash

An HPI-Startup of 2013

Founders:

Jonas Drüppel, Roland Grenke, Daniel Taschik

November 19, 2014: Launch of the Dubsmash app November 26, 2014: Dubsmash reached the number one

downloaded app in Germany June 1, 2015: Dubsmash had been downloaded over

50 million times in 192 countries

Page 30: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Motivation: “Data Processing”

Data Processing in Practice

Slide 30

Introduction

Distributed Data Management

Thorsten Papenbrock

Many further HPI Startups!

Page 31: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Motivation: “Data Processing”

Data Processing in Practice

Slide 31

Introduction

Distributed Data Management

Thorsten Papenbrock

Successful IT-Startups in recent years are masters of data:

1. AirBnB

2. Instagram

3. Pinterest

4. Angry Birds

5. Linkedin

6. Uber

7. Snapchat

8. WhatsApp

9. Twitter

10.Facebook

11.…

Peta- to Exabytes of … profile data (names, addresses, friends, …) content data (images, videos, messages, …) event data (logins, interactions, games, …) …

Challenged with … streaming persistence analytics load-balancing …

Page 32: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Motivation: “Data Processing”

Driving Forces

Slide 32

Introduction

Distributed Data Management

Thorsten Papenbrock

Data volumes increase:

business data, sensor data, social media data, …

Data analytics gains importance:

downtime-less, real-time, predictive

Parallelization paradigm shifts:

multi-core and network speeds increase while CPU clock speeds stall

Computation resources become more available:

IaaS, PaaS, SaaS

Free and open source software gains popularity:

setting standards, utilizing external development resources, improving

software quality, avoiding vendor locks …

Page 33: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Distributed Data Management

Related Topics

Slide 33

Introduction

Distributed Data Management

Thorsten Papenbrock

Database

Systems

Software

Architecture

Data

Mining

Parallel

Computing

Distributed

Data

Management

Page 34: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Distributed Data Management

Related Topics

Slide 34

Introduction

Distributed Data Management

Thorsten Papenbrock

Software

Architecture

Data

Mining

Parallel

Computing

Database

Systems

Distributed

Data

Management

Page 35: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Distributed Data Management

Database Systems

Slide 35

Introduction

Distributed Data Management

Thorsten Papenbrock

Touch points

Data models, query languages, and consistency guarantees

Distributed storage and retrieval of data

Index structures

Not in this lecture

Physical data storage

In-depth presentations language constructs and transactions

Core database technology, e.g., query optimizer

More focused lectures

Database Systems I + II (Prof. Naumann)

Trends and Concepts in Software Industry (Prof. Plattner)

Page 36: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Distributed Data Management

Related Topics

Slide 36

Introduction

Distributed Data Management

Thorsten Papenbrock

Database

Systems

Data

Mining

Parallel

Computing

Software

Architecture

Distributed

Data

Management

Page 37: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Distributed Data Management

Software Architectures

Slide 37

Introduction

Distributed Data Management

Thorsten Papenbrock

Touch points

Requirements, design, and architecture of distributed systems

Pros and cons of different technologies for distributed systems

Not in this lecture

Non-distributed systems

Agile software development techniques

Software patterns

More focused lectures

Software Architecture (Dr. Uflacker)

Software Technique (Dr. Uflacker)

Page 38: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Distributed Data Management

Related Topics

Slide 38

Introduction

Distributed Data Management

Thorsten Papenbrock

Database

Systems

Software

Architecture

Data

Mining

Parallel

Computing

Distributed

Data

Management

Page 39: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Distributed Data Management

Parallel Computing

Slide 39

Introduction

Distributed Data Management

Thorsten Papenbrock

Touch points

Distributed data storage concepts

Distributed programming models, e.g., actor programming and MapReduce

Not in this lecture

Parallel, non-distributed programming languages, e.g., CUDA or OpenMP

Core parallel computing concepts, e.g., scheduling or shared memory

Processor architectures, cache hierarchies, GPU programming, …

More focused lectures

Parallel Programming (Dr. Tröger)

Programmierung paralleler und verteilter Systeme (Dr. Feinbube)

Page 40: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Distributed Data Management

Related Topics

Slide 40

Introduction

Distributed Data Management

Thorsten Papenbrock

Database

Systems

Software

Architecture

Parallel

Computing

Data

Mining

Distributed

Data

Management

Page 41: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Distributed Data Management

Data Mining

Slide 41

Introduction

Distributed Data Management

Thorsten Papenbrock

Touch points

Data analytics: aggregation queries and basic data mining algorithms

Not in this lecture

Detailed introduction to machine learning, e.g., neuronal networks,

(un)supervised learning, or Bayesian classification

Statistics, linear algebra, and most sophisticated mining algorithms

More focused lectures

Big Data Analytics (Prof. Müller)

Graph Mining (Prof. Müller)

Data Profiling (Prof. Naumann)

Data Mining and Probabilistic Reasoning (Dr. Krestel)

Page 42: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Distributed Data Management

Related Topics

Slide 42

Introduction

Distributed Data Management

Thorsten Papenbrock

Database

Systems

Software

Architecture

Data

Mining

Parallel

Computing

Distributed

Data

Management

Page 43: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Distributed Data Management

Lecture Goals

Slide 43

Introduction

Distributed Data Management

Thorsten Papenbrock

Sorting the buzzwords

NoSQL, Big Data, OLAP, Web-scale, ACID, Sharding, MapReduce, …

Understanding the systems

How and why do distributed systems work?

What are their individual core technologies?

Which advantages and disadvantages do the technologies have?

How can different technologies be combined?

Exercising in data analytics

Distributed processing of data-parallel tasks

Data mining on streams

Page 44: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Slide 44

Introduction

Distributed Data Management

Thorsten Papenbrock

Distributed Data Management

Lecture Outline

1. Introduction

2. Foundations

3. OLAP and OLTP

4. Encoding and Evolution

5. Hands-On: Akka

6. Data Models and Query Languages

7. Storage and Retrieval

8. Replication

9. Partitioning

10. Batch Processing

11. Hands-On: Spark

12. Distributed Systems

13. Consistency and Consensus

14. Transactions

15. Stream Processing

16. Hands on: Flink

17. Mining Data Streams

18. Distributed Algorithms

19. Services and Containerization

20. Cloud-based Data Systems

21. Lecture Summary and

Exam Preparation

On-site and by dataArtisans!

Page 45: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Slide 45

Introduction

Distributed Data Management

Thorsten Papenbrock

Distributed Data Management

Lecture Outline – Homework

1. Introduction

2. Foundations

3. OLAP and OLTP

4. Encoding and Evolution

5. Hands-On: Akka

6. Data Models and Query Languages

7. Storage and Retrieval

8. Replication

9. Partitioning

10. Batch Processing

11. Hands-On: Spark

12. Distributed Systems

13. Consistency and Consensus

14. Transactions

15. Stream Processing

16. Hands on: Flink

17. Mining Data Streams

18. Distributed Algorithms

19. Services and Containerization

20. Cloud-based Data Systems

21. Lecture Summary and

Exam Preparation

Page 46: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Distributed Data Management

Literature: Course Book

Slide 46

Introduction

Distributed Data Management

Thorsten Papenbrock

Designing Data-Intensive Applications

Author: Martin Klappmann

Date: March 2017

Publisher: O‘Reilly Media, Inc

ISBN: 978-1-449-37332-0

References:

https://github.com/ept/ddia-references

Scope for this lecture

Distributed and parallel systems for

data analytics and data mining

Big data storage

Batch and stream processing

Page 47: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Distributed Data Management

Literature: Further Reading

Slide 47

Introduction

Distributed Data Management

Thorsten Papenbrock

Web-links are given on the slides during

the lecture.

Page 48: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Distributed Data Management

Feedback: Distributed Data Analytics

Slide 48

Introduction

Distributed Data Management

Thorsten Papenbrock

Page 49: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Distributed Data Management

Feedback: Distributed Data Analytics

Slide 49

Introduction

Distributed Data Management

Thorsten Papenbrock

Page 50: Thorsten Papenbrock Felix Naumann Introduction · Deep Learning for Text Mining (Master) Distributed Data Management (Master) Seminars Methoden der Forschung (Master) Data Preparation

Distributed Data Analytics

Introduction Thorsten Papenbrock

G-3.1.09, Campus III

Hasso Plattner Institut