kafka short

Post on 27-Jan-2015


1

Kafka for BigData Processing

Yanai Franchi, Tikal

2

Find “Hot” Places

3

4

Gogobot Checkin Heat-Map Service

Let's Develop “Gogobot Checkins Heat-Map”

5

Key Notes

● Collector Service - collects checkins as text addresses

– We need to use a GeoLocation Service

● When each interval elapses, the locations from the last interval are displayed as a Heat-Map in the GUI.

● Web-scale service – tens of thousands of checkins per second from all over the world (imaginary, but let's do it for the exercise).
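A minimal sketch of the interval step described in the key notes: geocode each text address, then count checkins per grid cell to get heat-map intensities. The lookup table, the `fake_geocode` function, and the 0.5-degree cell size are assumptions made for illustration; the real GeoLocation Service and its API are not shown in the slides.

```python
import math
from collections import Counter
from typing import Callable, Iterable, Optional, Tuple

LatLon = Tuple[float, float]

# Hypothetical stand-in for the GeoLocation Service (illustrative values only).
FAKE_GEOCODES = {
    "Tel Aviv": (32.08, 34.78),
    "Jaffa": (32.05, 34.75),
    "New York": (40.71, -74.00),
}


def fake_geocode(address: str) -> Optional[LatLon]:
    """Return (lat, lon) for a known address, or None if it cannot be resolved."""
    return FAKE_GEOCODES.get(address)


def heat_map(
    addresses: Iterable[str],
    geocode: Callable[[str], Optional[LatLon]] = fake_geocode,
    cell_deg: float = 0.5,
) -> Counter:
    """Count the checkins of one interval per grid cell of cell_deg degrees."""
    counts: Counter = Counter()
    for address in addresses:
        location = geocode(address)
        if location is None:
            continue  # skip addresses the geocoder cannot resolve
        lat, lon = location
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        counts[cell] += 1
    return counts


if __name__ == "__main__":
    interval = ["Tel Aviv", "Jaffa", "New York", "Nowhere"]
    for cell, hits in heat_map(interval).most_common():
        print(cell, hits)
```

Cells with the highest counts are the "hot" places for that interval; a new, empty counter would be started for the next interval.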

6

Heat-Map Context

[Diagram labels: Gogobot System (three Gogobot Micro Services), Text-Address, Checkins Heat-Map Service, Geo Location Service, Get-GeoCode(Address), Last Interval Locations, Heat-Map]

7

Tons of Addresses Arriving Every Second

8

First Reaction...

9

[Diagram labels: Checkin HTTP Reactor, Checkin HTTP Firehose, Publish Checkins, Checkins Topic, Storm Heat-Map Topology, Hotzones Topic, Web App, Push via WebSocket, HDFS]

10

11

They Are All Good, But Not for All Use Cases

12

Kafka: A Little Introduction

13

14

Why?

15

LinkedIn Original Architecture

16

17

What LinkedIn Wanted...

18

Looks Familiar: Use Messaging (e.g. JMS, RabbitMQ)

19

20

21

22

23

It Didn't Scale...

24

Paradigm Change: Do NOT Track Message Consumption
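A minimal sketch of what this paradigm change means in practice, assuming a simplified in-memory model: the log only appends and serves reads by offset, and each consumer remembers its own position. The class names `PartitionLog` and `OffsetConsumer` are illustrative and are not Kafka APIs.

```python
from typing import List


class PartitionLog:
    """Append-only message log that keeps no per-consumer state."""

    def __init__(self) -> None:
        self._messages: List[str] = []

    def append(self, message: str) -> int:
        """Store a message and return its offset."""
        self._messages.append(message)
        return len(self._messages) - 1

    def read(self, offset: int, max_messages: int = 10) -> List[str]:
        """Return up to max_messages starting at offset; nothing is deleted."""
        return self._messages[offset:offset + max_messages]


class OffsetConsumer:
    """Consumer that tracks its own position in the log."""

    def __init__(self, log: PartitionLog) -> None:
        self.log = log
        self.offset = 0

    def poll(self, max_messages: int = 10) -> List[str]:
        batch = self.log.read(self.offset, max_messages)
        self.offset += len(batch)  # only the consumer's own state changes
        return batch


if __name__ == "__main__":
    log = PartitionLog()
    for i in range(3):
        log.append(f"m{i}")
    a, b = OffsetConsumer(log), OffsetConsumer(log)
    print(a.poll(2), b.poll(), a.poll())
    a.offset = 0  # rewinding is just resetting the consumer's offset
    print(a.poll(1))
```

Because the log does no acknowledgement bookkeeping, adding a consumer costs it nothing, and a consumer can replay old messages by moving its offset back.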

25

26

27

28

Stateless Broker & Doesn't Fear the File System

29

Topics

● Logical collections of partitions (the physical files).

● A broker contains some of the partitions for a topic
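A rough sketch of how a topic's partitions might be chosen for messages and spread over brokers. It assumes key-hash partitioning with CRC32 as a deterministic stand-in hash and a simple round-robin placement; neither is claimed to be Kafka's actual partitioner or placement algorithm, and the function names are hypothetical.

```python
import zlib
from typing import Dict, List


def partition_for(key: str, num_partitions: int) -> int:
    """Map a message key to a partition; equal keys always map to the same one."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def place_partitions(num_partitions: int, brokers: List[str]) -> Dict[str, List[int]]:
    """Spread partition ids over brokers round-robin, so each broker holds some of them."""
    placement: Dict[str, List[int]] = {broker: [] for broker in brokers}
    for partition in range(num_partitions):
        placement[brokers[partition % len(brokers)]].append(partition)
    return placement


if __name__ == "__main__":
    print(place_partitions(6, ["broker-1", "broker-2", "broker-3"]))
    print(partition_for("user-42", 6))
```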

30

A Partition Is Consumed by Exactly One Consumer in Each Consumer Group
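The rule can be sketched as an assignment function: within one group, every partition gets exactly one owner, and the assignment is recomputed when a consumer joins or leaves. Round-robin is an assumption chosen for brevity, and `assign_partitions` is a hypothetical helper rather than a Kafka API.

```python
from typing import Dict, List


def assign_partitions(partitions: List[int], consumers: List[str]) -> Dict[str, List[int]]:
    """Give each partition to exactly one consumer of the group (round-robin)."""
    if not consumers:
        raise ValueError("a consumer group needs at least one consumer")
    assignment: Dict[str, List[int]] = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(sorted(partitions)):
        assignment[consumers[index % len(consumers)]].append(partition)
    return assignment


if __name__ == "__main__":
    partitions = [0, 1, 2, 3]
    print(assign_partitions(partitions, ["consumer-1", "consumer-2"]))
    # If consumer-2 leaves the group, its partitions move to the survivor.
    print(assign_partitions(partitions, ["consumer-1"]))
```

A second group running the same function would receive its own full copy of the partitions, which is how independent applications can each read the whole topic.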

31

Distributed & Fault-Tolerant

32

[Diagram: Broker 1, Broker 2, Broker 3; ZooKeeper; Consumer 1, Consumer 2; Producer 1, Producer 2]

33

[Diagram: Broker 1, Broker 2, Broker 3, Broker 4; ZooKeeper; Consumer 1, Consumer 2; Producer 1, Producer 2]

39

[Diagram: Broker 1, Broker 2, Broker 3; ZooKeeper; Consumer 1, Consumer 2; Producer 1, Producer 2]

42

[Diagram: Broker 1, Broker 2, Broker 3; ZooKeeper; Consumer 1; Producer 1, Producer 2]

45

Performance Benchmark

1 Broker, 1 Producer, 1 Consumer

46

47

48

LinkedIn Kafka Performance (2012)

● 8 nodes per datacenter

– ~20 GB RAM available for Kafka

– 6 TB storage, RAID 10, basic SATA drives

● 10 billion messages/day

● Sustained peak:

– 172,000 messages/second written

– 950,000 messages/second read

● 367 topics

● 40 real-time consumers

● Many ad hoc consumers

● 9.5 TB log retained (~6 days)

● End-to-end delivery time: A few seconds
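To put the figures above in proportion, here is a small back-of-the-envelope calculation. It uses only the numbers on this slide and assumes they all describe the same deployment; the variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60

messages_per_day = 10_000_000_000   # 10 billion messages/day
peak_write_per_sec = 172_000        # sustained peak, written
peak_read_per_sec = 950_000         # sustained peak, read
retained_tb = 9.5                   # log retained
retention_days = 6                  # approximate retention window

# Average write rate implied by the daily volume (about 115,741 messages/second).
avg_rate = messages_per_day / SECONDS_PER_DAY

# How far the sustained write peak sits above that average (about 1.49x).
peak_to_avg = peak_write_per_sec / avg_rate

# Peak reads per peak write (about 5.5x).
read_to_write = peak_read_per_sec / peak_write_per_sec

# Log volume retained per day of retention (about 1.58 TB/day).
retained_per_day_tb = retained_tb / retention_days

if __name__ == "__main__":
    print(f"average write rate: {avg_rate:,.0f} messages/second")
    print(f"peak / average write: {peak_to_avg:.2f}x")
    print(f"peak read / peak write: {read_to_write:.2f}x")
    print(f"retained per day: {retained_per_day_tb:.2f} TB")
```

The read peak being several times the write peak is consistent with each message being consumed by more than one consumer.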

49

Thanks
