From big to fast data: how #kafka and #kafka-connect can redefine your ETL and #stream-processing


2010 → 2014

- Error handling is a first-class citizen

Schema Registry

Diagram: Your App → Producer → Serializer → Kafka topic. The serializer checks with the Schema Registry whether the format is acceptable and retrieves the schema ID; on success it writes Schema ID + Data to the topic, otherwise the producer gets an incompatible-data error.

producerProps.put("key.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
producerProps.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
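For context, a minimal sketch of the full producer configuration these two lines belong to, assuming a broker and Schema Registry on their default local ports:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;

Properties producerProps = new Properties();
// Assumed local endpoints; adjust for your cluster
producerProps.put("bootstrap.servers", "localhost:9092");
producerProps.put("schema.registry.url", "http://localhost:8081");
producerProps.put("key.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
producerProps.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");

// The serializer now talks to the registry on our behalf (see the flow above)
KafkaProducer<Object, Object> producer = new KafkaProducer<>(producerProps);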

Diagram: a generator produces data into the shipments and sales topics; Spark Streaming consumes both and writes to a low-inventory topic.

let’s see some code

Define the data contract / schema in Avro format
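As an illustration, a minimal contract for a sale event might look like the following .avsc (record and field names are assumptions, not the talk's actual schema):

{
  "type": "record",
  "name": "Sale",
  "namespace": "com.example.inventory",
  "fields": [
    { "name": "productId", "type": "string" },
    { "name": "quantity",  "type": "int" },
    { "name": "timestamp", "type": "long" }
  ]
}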

generate data

1.9M msg/sec using 1 thread
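A sketch of what such a single-threaded generator loop can look like (schema file name, topic, and fields are assumptions); throughput at this level comes mostly from fire-and-forget sends and the producer's internal batching (linger.ms, batch.size):

import java.io.File;
import java.util.Random;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

// Load the data contract defined above (hypothetical file name)
Schema schema = new Schema.Parser().parse(new File("sale.avsc"));

// producerProps configured with the Avro serializers and registry URL, as shown earlier
KafkaProducer<String, GenericRecord> producer = new KafkaProducer<>(producerProps);
Random random = new Random();

while (true) {
    GenericRecord sale = new GenericData.Record(schema);
    sale.put("productId", "product-" + random.nextInt(1000));
    sale.put("quantity", 1 + random.nextInt(5));
    sale.put("timestamp", System.currentTimeMillis());
    // Fire-and-forget send; batching happens inside the producer
    producer.send(new ProducerRecord<>("sales", (String) sale.get("productId"), sale));
}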

https://schema-registry-ui.landoop.com

Schemas registered for us :-)
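Besides the UI, the registry's REST API offers a quick check, assuming the default local port:

curl http://localhost:8081/subjects
# e.g. ["shipments-value", "sales-value"] (hypothetical output)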

Defining the typed data format

Initiate the streaming from 2 topics

The business logic
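Those three steps (the typed data format, subscribing to the two topics, and the business logic) might look roughly like this in Java with the spark-streaming-kafka-0-10 integration. Topic names, field names, the threshold, and the endpoints are assumptions; this is a sketch, not the talk's actual code:

import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaPairDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;
import scala.Tuple2;

SparkConf conf = new SparkConf().setAppName("low-inventory").setMaster("local[*]");
JavaStreamingContext ssc = new JavaStreamingContext(conf, Durations.seconds(5));

// Typed data format: the Avro deserializer hands back GenericRecords
// validated against the registered schema
Map<String, Object> kafkaParams = new HashMap<>();
kafkaParams.put("bootstrap.servers", "localhost:9092");
kafkaParams.put("schema.registry.url", "http://localhost:8081");
kafkaParams.put("key.deserializer", "io.confluent.kafka.serializers.KafkaAvroDeserializer");
kafkaParams.put("value.deserializer", "io.confluent.kafka.serializers.KafkaAvroDeserializer");
kafkaParams.put("group.id", "inventory-app");

// Initiate the streaming from the 2 topics
JavaInputDStream<ConsumerRecord<Object, Object>> stream =
    KafkaUtils.createDirectStream(
        ssc,
        LocationStrategies.PreferConsistent(),
        ConsumerStrategies.<Object, Object>Subscribe(
            Arrays.asList("sales", "shipments"), kafkaParams));

// Business logic: shipments add stock, sales remove it; flag products whose
// net change within a batch falls below a (hypothetical) threshold
int lowInventoryThreshold = -100;
JavaPairDStream<String, Integer> deltas = stream.mapToPair(record -> {
    GenericRecord value = (GenericRecord) record.value();
    int qty = (Integer) value.get("quantity");
    int signed = "sales".equals(record.topic()) ? -qty : qty;
    return new Tuple2<>(value.get("productId").toString(), signed);
});

deltas.reduceByKey(Integer::sum)
      .filter(t -> t._2() < lowInventoryThreshold)
      .print();  // the pipeline in the talk writes these to the low-inventory topic

ssc.start();
ssc.awaitTermination();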

Diagram: shipments topic + sales topic → Spark Streaming → low-inventory topic → elastic-search (re-ordering).
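The hand-off to elastic-search is where Kafka Connect fits. One common way is Confluent's Elasticsearch sink connector; a minimal sketch of its configuration, with names and endpoints as assumptions:

# Sketch: Elasticsearch sink reading the low-inventory topic (assumed names)
name=low-inventory-es-sink
connector.class=io.confluent.connect.elasticsearch.ElasticsearchSinkConnector
tasks.max=1
topics=low-inventory
connection.url=http://localhost:9200
type.name=kafka-connect
key.ignore=true
key.converter=io.confluent.connect.avro.AvroConverter
key.converter.schema.registry.url=http://localhost:8081
value.converter=io.confluent.connect.avro.AvroConverter
value.converter.schema.registry.url=http://localhost:8081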

Simple is beautiful

landoop.com/blog
github.com/landoop
