data integration in the world of microservices

47
in the world of microservices DATA INTEGRATION

Upload: fabian-wollert

Post on 26-Jan-2017

401 views

Category:

Technology


2 download

TRANSCRIPT

Page 1: Data Integration in the world of Microservices

in the world of microservicesDATA INTEGRATION

Page 2: Data Integration in the world of Microservices

About us

Valentine Gogichashvili

Head of Data Engineering @ZalandoTechtwitter: @valgoggoogle+: +valgogemail: [email protected]

Fabian Wollert

Data Engineer Business Intelligencegithub: @drummerwolliemail: [email protected]

2

Page 3: Data Integration in the world of Microservices

3

Page 4: Data Integration in the world of Microservices

Zalando Technology

BERLINDORTMUNDDUBLIN

HELSINKI

ERFURT

MÖNCHENGLADBACH

HAMBURG

4

Page 5: Data Integration in the world of Microservices

Zalando Technology

1100+ TECHNOLOGISTS

Rapidly growing international team

http://tech.zalando.com

5

Page 6: Data Integration in the world of Microservices

Good old small world

Page 7: Data Integration in the world of Microservices

Once upon a time...

Started as a tiny online shop

Prototyped on Magento (PHP)

Used MySQL as a database

Web Application

Backend

Database

7

Page 8: Data Integration in the world of Microservices

REBOOT

Page 9: Data Integration in the world of Microservices

REBOOT

5½ years ago

● Java○ macro service architecture with SOAP as RPC layer

● PostgreSQL ○ Heavy usage of Stored Procedures○ 4 databases + 1 sharded database on 2 shards

● Python for tooling (i.e code deploy automation)9

Page 10: Data Integration in the world of Microservices

REBOOT

Java Web Frontend

Java Backend

PostgreSQL

Java Backend

PostgreSQL

Java Backend

PostgreSQL 9.0 RC1PostgreSQL 9.0

RC1PostgreSQL 9.0 RC1PostgreSQL

"macro" services

10

Page 11: Data Integration in the world of Microservices

Classical ETL process

REBOOT

Business Logic

Data Warehouse (DWH)

DatabaseDBA

BI

Business Logic

Database

Business Logic

Database

Business Logic

Database

Dev

11

Page 12: Data Integration in the world of Microservices

REBOOT

Classical ETL process

● Use-case specific

● Usually outputs data into a Data Warehouse○ well structured○ easy to use by the end user (SQL)

12

Page 13: Data Integration in the world of Microservices

Live long and prosper...

Very stable architecture that is still in use in the oldest (vintage) components

We implemented everything ourselves starting from warehouse and order management and finishing with Web Shop and Mobile Applications

13

Page 14: Data Integration in the world of Microservices

Live long and prosper...

"I want to code in Scala/Clojure/Haskell because it is cool and compact"

14

Page 15: Data Integration in the world of Microservices

"I want to code in Scala/Clojure/Haskell because it is cool and compact"

"But nobody will be able to support your code if you leave the company, everybody should use Java, learn SQL and write Stored Procedures"

Live long and prosper...

15

Page 16: Data Integration in the world of Microservices

"I want to code in Scala/Clojure/Haskell because it is cool and compact"

"But nobody will be able to support your code if you leave the company, everybody should use Java, learn SQL and write Stored Procedures"

"Zalando is cool but f*ck you, I am moving on to another company where I can use cool technologies!"

Live long and prosper...

16

Page 17: Data Integration in the world of Microservices

RADICAL AGILITY

Page 18: Data Integration in the world of Microservices

AUTONOMY

PURPOSE

MASTERY

Radical Agility

18

Page 19: Data Integration in the world of Microservices

Autonomy

Autonomous teams

● can choose own technology stack

● including persistence layer

● are responsible for operations

● should use isolated AWS accounts19

Page 20: Data Integration in the world of Microservices

Supporting autonomy — Microservices

Business Logic

Database

RE

ST

AP

I

Business Logic

Database

RE

ST

AP

I

Business Logic

Database

RE

ST

AP

I

Business Logic

Database

RE

ST

AP

I

Business Logic

Database

RE

ST

AP

I

Business Logic

Database

RE

ST

AP

I

Business Logic

Database

RE

ST

AP

I

20

Page 21: Data Integration in the world of Microservices

Supporting autonomy — Microservices

Business Logic

Database

Team A Business Logic

Database

Team B

RE

ST

AP

I RE

ST A

PI

● Applications communicate using REST APIs

● Databases hidden behind the walls of AWS VPC

public Internet

21

Page 22: Data Integration in the world of Microservices

Supporting autonomy — Microservices

Business Logic

Database

Team A Business Logic

Database

Team B

Classical ETL process is impossible!

RE

ST A

PI

public Internet

RE

ST

AP

I

22

Page 23: Data Integration in the world of Microservices

Supporting autonomy — Microservices

Business Logic

Database

RE

ST

AP

I

Business Logic

Database

RE

ST

AP

I

Business Logic

Database

RE

ST

AP

I

Business Logic

Database

RE

ST

AP

I

Business Logic

Database

RE

ST

AP

I

Business Logic

Database

RE

ST

AP

I

Business Logic

Database

RE

ST

AP

I

23

Page 24: Data Integration in the world of Microservices

Supporting autonomy — Microservices

REST API

Nakadi Event Bus

REST API

Business Logic

Database

RE

ST

AP

I

App

A

Business Logic

Database

RE

ST

AP

I

App

B

Business Logic

Database

RE

ST

AP

I

App

C

Business Logic

Database

RE

ST

AP

I

App

D

24

Page 25: Data Integration in the world of Microservices

Supporting autonomy — Microservices

REST API

Nakadi Event Bus

REST API

Business Intelligence

Business Logic

Database

RE

ST

AP

I

App

A

Business Logic

Database

RE

ST

AP

I

App

B

Business Logic

Database

RE

ST

AP

I

App

C

Business Logic

Database

RE

ST

AP

I

App

D

25

Page 26: Data Integration in the world of Microservices

Supporting autonomy — Microservices

REST API

Nakadi Event Bus REST API

App A App B App DApp C BI

Data Warehouse

26

Page 27: Data Integration in the world of Microservices

Supporting autonomy — Microservices

REST API

Nakadi Event Bus REST API

App A App B App DApp C BI

Data Warehouse

?

27

Page 28: Data Integration in the world of Microservices

SAIKI

Page 29: Data Integration in the world of Microservices

SAIKI

Saiki Data Platform

REST API

Nakadi Event Bus REST API

App A App B App DApp C BI

Data Warehouse

29

Page 30: Data Integration in the world of Microservices

SAIKI

Saiki Data Platform

REST API

Nakadi Event Bus REST API

App A App B App DApp C BI

Data Warehouse

Buku

30

Page 31: Data Integration in the world of Microservices

SAIKI

Saiki Data Platform

REST API

Nakadi Event Bus REST API

App A App B App DApp C BI

Data Warehouse

Buku

AWS S3

Tukang

REST API

31

Page 32: Data Integration in the world of Microservices

Saiki Data Platform

Saiki Tukang● First cleansing of events (out of order, duplicates, etc.)

● Materialize data from Kafka in AWS S3

● Provide metadata via RESTful interface

● DWH downloads data directly from cloud storage

32

Page 33: Data Integration in the world of Microservices

SAIKI

Saiki Data Platform

REST API

Nakadi Event Bus REST API

App A App B App DApp C BI

Data Warehouse

Buku Tukang

REST API

AWS S3

33

Page 34: Data Integration in the world of Microservices

SAIKI

Saiki Data Platform

REST API

Nakadi Event Bus REST API

App A App B App DApp C BI

Data Warehouse

Buku Tukang

REST API

AWS S3

34

E.g. Forecast DB

Page 35: Data Integration in the world of Microservices

Saiki Data Platform

Old Load Process New Load Process

relied on Delta Loads relies on Event Stream

JDBC Connection RESTful HTTPS Connections

data quality could be controlled by BI independently

Trust for correctness of data in the delivery teams

PostgreSQL dependent Independent of the source technology stack

N to 1 data stream N to N stream, no single data sink

35

Page 36: Data Integration in the world of Microservices

BI

Data Warehouse E.g. Forecast DB

SAIKI

Saiki Data Platform

REST API

Nakadi Event Bus REST API

App A App B App DApp C

Buku Tukang

REST API

AWS S3

36

Page 37: Data Integration in the world of Microservices

BI

Data Warehouse E.g. Forecast DB

SAIKI

Saiki Data Platform

REST API

Nakadi Event Bus REST API

App A App B App DApp C

Buku Tukang

REST API

Stream Processing via Apache Flink

AWS S3

37

Page 38: Data Integration in the world of Microservices

Saiki Data Platform

Apache Flink

● true stream processing framework● process events at a consistently high rate with relatively

low latency● scalable● support from Berlin/Europe

https://tech.zalando.com/blog/apache-showdown-flink-vs.-spark/

38

Page 39: Data Integration in the world of Microservices

Saiki Data Platform

Apache Flink● connectors

○ Kafka○ Elasticsearch○ etc.

39

Page 40: Data Integration in the world of Microservices

BI

Data Warehouse E.g. Forecast DB

SAIKI

Saiki Data Platform

REST API

Nakadi Event Bus REST API

App A App B App DApp C

Buku Tukang

REST API

Stream Processing via Apache Flink

AWS S3

40

Page 41: Data Integration in the world of Microservices

BI

Data Warehouse E.g. Forecast DB

SAIKI

Saiki Data Platform

REST API

Nakadi Event Bus REST API

App A App B App DApp C

Buku Tukang

REST API

Stream Processing via Apache Flink

Elasticsearch

AWS S3

41

Page 42: Data Integration in the world of Microservices

Saiki Data Platform

For example: Real-time Business Process Monitoring

● Check if technically the platform works● Analyze data on the fly● Visualization with Python/Flask and Chart Frameworks

42

Page 43: Data Integration in the world of Microservices

BI

Data Warehouse E.g. Forecast DB

SAIKI

Saiki Data Platform

REST API

Nakadi Event Bus REST API

App A App B App DApp C

Buku Tukang

REST API

Stream Processing via Apache Flink

Elasticsearch

AWS S3

43

Page 44: Data Integration in the world of Microservices

BI

Data Warehouse E.g. Forecast DB

SAIKI

Saiki Data Platform

REST API

Nakadi Event Bus REST API

App A App B App DApp C

Buku Tukang

REST API

Stream Processing via Apache Flink

Elasticsearch

Data Lake .

AWS S3

44

Page 45: Data Integration in the world of Microservices

Free the data from the silos!

Page 46: Data Integration in the world of Microservices

BI

Data Warehouse E.g. Forecast DB

SAIKI

Saiki Data Platform

REST API

Nakadi Event Bus REST API

App A App B App DApp C

Buku Tukang

REST API

Stream Processing via Apache Flink

Elasticsearch

Data Lake .

AWS S3

46

Page 47: Data Integration in the world of Microservices

Open source @ZalandoTech

● https://zalando.github.io/● https://tech.zalando.de/blog● https://github.com/zalando/saiki/wiki● STUPS.io for responsible organizations in AWS

● REST API on Swagger (OpenAPI)○ https://github.com/zalando/restful-api-guidelines○ https://github.com/zalando/connexion○ https://github.com/zalando/play-swagger

47