kafka-steaming-data

Kafka Streaming Data Platform

Upload: bryan-jacobs

Post on 11-Apr-2017

TRANSCRIPT

Page 1: kafka-steaming-data

Kafka Streaming Data Platform

Page 2: kafka-steaming-data

Traditional Messaging System
• Queue
• Topic
• Messages removed once consumed
• Out-of-order messaging

Page 3: kafka-steaming-data

What is Kafka
• Messaging system
• Polyglot Consumers / Producers
• Topics and Partitions
• Scalable
• Configurable Message Retention
• Guaranteed order (within a partition)
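The points about partitions, ordering, and retention can be illustrated with a toy in-memory model. This is only a sketch, not Kafka client code: the `Topic` class, its methods, and the keys `user-a` / `user-b` are hypothetical names invented for the illustration, and `zlib.crc32` stands in for the hash a real partitioner would use. It shows that messages with the same key land in the same partition in send order, and that reading does not remove them.

```python
import zlib


class Topic:
    """Toy in-memory topic: a fixed set of append-only partition logs."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key):
        # A stable hash sends the same key to the same partition every time.
        # crc32 is a stand-in chosen for this sketch.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def send(self, key, value):
        p = self.partition_for(key)
        self.partitions[p].append((key, value))
        return p, len(self.partitions[p]) - 1  # (partition, offset)

    def read(self, partition, offset):
        # Reading never deletes; each consumer tracks its own offset.
        return self.partitions[partition][offset:]


topic = Topic("test", num_partitions=3)
for i in range(5):
    topic.send("user-a", f"a{i}")
    topic.send("user-b", f"b{i}")

p = topic.partition_for("user-a")
first_read = [v for k, v in topic.read(p, 0) if k == "user-a"]
second_read = [v for k, v in topic.read(p, 0) if k == "user-a"]
print(first_read)   # ['a0', 'a1', 'a2', 'a3', 'a4']
print(first_read == second_read)  # True: the data is still there
```

In a traditional queue the second read would return nothing, because consumption removes the message. Here the log keeps the data until a retention policy (not modelled in this sketch) discards it.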

Page 4: kafka-steaming-data

Use Cases
• Ordered Messaging
• Log Aggregation
• Metrics
• Web Activity Tracking
• Stream Processing

Page 5: kafka-steaming-data

Kafka Brokers – Clusters and Replication
• Topics can be replicated
• Data stored across various nodes
• Each broker in a cluster needs a unique broker.id (e.g. broker.id=0 for the first broker)
• Zookeeper tracks:
  • Offsets
  • Topic names
  • Partitions
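To make "topics can be replicated" and "data stored across various nodes" concrete, here is a simplified sketch of spreading partition replicas over brokers. It is a plain round-robin layout written for illustration, not Kafka's actual assignment algorithm; `assign_replicas` and its parameters are hypothetical names. It does enforce two rules that hold for a real cluster: broker ids must be unique, and the replication factor cannot exceed the number of brokers.

```python
def assign_replicas(broker_ids, num_partitions, replication_factor):
    """Return {partition: [broker ids holding a replica]} (round-robin sketch)."""
    if len(set(broker_ids)) != len(broker_ids):
        raise ValueError("every broker needs a unique broker.id")
    if replication_factor > len(broker_ids):
        raise ValueError("replication factor cannot exceed the number of brokers")

    n = len(broker_ids)
    assignment = {}
    for partition in range(num_partitions):
        # Start each partition on a different broker, then wrap around.
        assignment[partition] = [
            broker_ids[(partition + r) % n] for r in range(replication_factor)
        ]
    return assignment


layout = assign_replicas([0, 1, 2], num_partitions=3, replication_factor=2)
print(layout)  # {0: [0, 1], 1: [1, 2], 2: [2, 0]}
```

With this layout, losing any single broker still leaves one copy of every partition on another node.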

Page 6: kafka-steaming-data

Demo – Local Kafka
• Start Zookeeper
  • bin/zookeeper-server-start.sh config/zookeeper.properties
• Start Kafka
  • bin/kafka-server-start.sh config/server.properties

Page 7: kafka-steaming-data

Demo – Command line tools
• bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
• bin/kafka-topics.sh --list --zookeeper localhost:2181
• bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
• bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning

Page 8: kafka-steaming-data

Example Producer
• <CODE>

Page 9: kafka-steaming-data

Example Consumer
• <CODE>

Page 10: kafka-steaming-data

Deployment Options
• Standalone deployment
• Confluent.io
• Hortonworks
• AWS

Page 11: kafka-steaming-data

Hortonworks Data Platform on AWS

Big data in a one-stop shop

Page 12: kafka-steaming-data

Determine Cluster Sizing
• Implement a producer and consumer
• Use your data structures
• 3 Zookeeper nodes and 3 Kafka nodes
• Java heap = 2 GB
• Network saturation (1 gigabit / 10 gigabit)
• Avro data serialization
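The network saturation point lends itself to a quick back-of-the-envelope calculation. The sketch below is a rough upper bound only: the 1000-byte message size is a made-up figure to be replaced with the size of your own serialized records, the function name is hypothetical, and the model pessimistically treats every replica copy as crossing the same link while ignoring protocol overhead, compression, and consumer traffic.

```python
def max_messages_per_second(link_gigabits, message_bytes, replication_factor=1):
    """Rough ceiling on messages/sec before the network link saturates."""
    bytes_per_second = link_gigabits * 1_000_000_000 / 8  # bits -> bytes
    # Simplification: each message is counted once per replica on one link.
    return int(bytes_per_second / (message_bytes * replication_factor))


# Hypothetical 1000-byte messages, replicated to 3 Kafka nodes.
print(max_messages_per_second(1, 1000, replication_factor=3))   # 41666
print(max_messages_per_second(10, 1000, replication_factor=3))  # 416666
```

Numbers like these only bound the problem. Measuring with a real producer, consumer, and your own data structures, as the slide recommends, is what settles the sizing.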

Page 13: kafka-steaming-data

Producer for testing throughput
• <CODE>
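The code from this slide did not survive in the transcript. As a stand-in, here is a minimal timing harness of the kind a throughput test needs. It is not the presenter's code and does not talk to Kafka: `measure_throughput` is a hypothetical helper, and `send` is any callable that accepts a bytes payload, so a real producer's send call could be passed in place of the list append used here.

```python
import time


def measure_throughput(send, payload, count):
    """Call send(payload) `count` times and report message and byte rates."""
    start = time.perf_counter()
    for _ in range(count):
        send(payload)
    # Guard against a zero interval on very fast runs.
    elapsed = max(time.perf_counter() - start, 1e-9)

    total_bytes = count * len(payload)
    return {
        "messages": count,
        "bytes": total_bytes,
        "seconds": elapsed,
        "messages_per_sec": count / elapsed,
        "mb_per_sec": total_bytes / elapsed / 1_000_000,
    }


# Stand-in for a real producer: collect payloads in a list.
sent = []
result = measure_throughput(sent.append, b"x" * 100, 10_000)
print(result["messages"], result["bytes"])
```

For a meaningful figure the payload should be a real serialized record rather than filler bytes, and the run should be long enough to smooth out warm-up effects.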

Page 14: kafka-steaming-data

Architectural Possibilities
• Streaming data platform
• Common interface
• High throughput

Page 15: kafka-steaming-data

WARNING
• Kafka 0.8.x has a major bug… deletes data
• Make sure to use 0.9.0.x

Page 16: kafka-steaming-data

Question & Answer