What's New in Confluent 3.2 and Apache Kafka 0.10.2
TRANSCRIPT
Attend the whole series!
Simplify Governance for Streaming Data in Apache Kafka
Date: Thursday, April 6, 2017
Time: 9:30 am - 10:00 am PT | 12:30 pm - 1:00 pm ET
Speaker: Gwen Shapira, Product Manager, Confluent
Using Apache Kafka to Analyze Session Windows
Date: Thursday, March 30, 2017
Time: 9:30 am - 10:00 am PT | 12:30 pm - 1:00 pm ET
Speaker: Michael Noll, Product Manager, Confluent
Monitoring and Alerting Apache Kafka with Confluent Control Center
Date: Thursday, March 16, 2017
Time: 9:30 am - 10:00 am PT | 12:30 pm - 1:00 pm ET
Speaker: Nick Dearden, Director, Engineering and Product
Data Pipelines Made Simple with Apache Kafka
Date: Thursday, March 23, 2017
Time: 9:30 am - 10:00 am PT | 12:30 pm - 1:00 pm ET
Speaker: Ewen Cheslack-Postava, Engineer, Confluent
https://www.confluent.io/online-talk/online-talk-series-five-steps-to-production-with-apache-kafka/
What’s New in Apache Kafka 0.10.2 and Confluent 3.2
Date: Thursday, March 9, 2017
Time: 9:30 am - 10:00 am PT | 12:30 pm - 1:00 pm ET
Speaker: Clarke Patterson, Senior Director, Product Marketing
Key themes for 3.2
Less Effort
Confluent Control Center brings visibility into the health of a cluster so it’s easy to surface only those trouble spots that count. Confluent makes operating Kafka a snap.
Monitoring and Alerting in Confluent Control Center
More Apps
Confluent offers the most robust set of clients and connectors, making it easy to onboard more apps in a streaming platform.
.NET client
Bridge to Cloud
Build real-time streaming pipelines directly to Amazon with the new S3 connector.
S3 Connector
Apache Kafka™ Connect API – Streaming Data Capture
[Diagram: source and sink connectors (JDBC, Mongo, MySQL, Elastic, Cassandra, HDFS) attached to a Kafka pipeline through the Kafka Connect API]
Fault tolerant
Manage hundreds of data sources and sinks
Preserves data schema
Part of Apache Kafka project
Integrated within Confluent Platform’s Control Center
Single Message Transforms for Kafka Connect
Modify events before storing in Kafka:
• Mask sensitive information
• Add identifiers
• Tag events
• Store lineage
• Remove unnecessary columns
Modify events going out of Kafka:
• Route high priority events to faster data stores
• Direct events to different Elasticsearch indexes
• Cast data types to match destination
• Remove unnecessary columns
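The idea of applying small, stateless changes to one event at a time can be sketched in plain Python. This is only an illustration of the concept, not the Kafka Connect transform API: the record is modeled as a dict, and the helper names (`mask_fields`, `insert_field`, `drop_fields`, `chain`) and the sample field names are hypothetical.

```python
from typing import Any, Callable, Dict, Iterable

Record = Dict[str, Any]
Transform = Callable[[Record], Record]


def mask_fields(*names: str) -> Transform:
    """Return a transform that blanks the named fields (hypothetical helper)."""
    def apply(record: Record) -> Record:
        return {k: (None if k in names else v) for k, v in record.items()}
    return apply


def insert_field(name: str, value: Any) -> Transform:
    """Return a transform that adds a constant field, e.g. a source tag."""
    def apply(record: Record) -> Record:
        return {**record, name: value}
    return apply


def drop_fields(*names: str) -> Transform:
    """Return a transform that removes unnecessary columns."""
    def apply(record: Record) -> Record:
        return {k: v for k, v in record.items() if k not in names}
    return apply


def chain(transforms: Iterable[Transform]) -> Transform:
    """Apply the transforms in order, one record at a time."""
    steps = list(transforms)

    def apply(record: Record) -> Record:
        for step in steps:
            record = step(record)
        return record
    return apply


# Assumed example: a row captured from a database before it is stored in Kafka.
source_chain = chain([
    mask_fields("card_number"),
    insert_field("source", "orders-db"),
    drop_fields("internal_notes"),
])

row = {"order_id": 42, "card_number": "4111111111111111", "internal_notes": "call back"}
masked = source_chain(row)
print(masked)
```

Each step returns a new dict rather than mutating its input, which keeps the transforms independent of one another and easy to reorder.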
Single Message Transforms Use Cases
• Data masking: Mask sensitive information while sending it to Kafka.
• Eg: Capture data from a relational database to Kafka, but the data includes PCI / PII information and your Kafka cluster is not certified yet. SMT allows masking that data before it is written to Kafka.
• Event routing: Modify an event destination based on the contents of the event (applies to events that need to get written to different database tables).
• Eg: Write events from Kafka to Elasticsearch, but each event needs to go to a different index, based on information in the event itself.
• Event enhancement: Add additional fields to events while replicating.
• Eg: Capture events from multiple data sources to Kafka, and include information about the source of the data in the event.
• Partitioning: Set the key for the event based on event information before it gets written to Kafka.
• Eg: When reading records from a database table, partition the records in Kafka based on customer ID.
• Timestamp conversion: Standardize time-based data when integrating different systems.
• Eg: There are many different ways to represent time. Often, Kafka events are read from logs, which use something like "[2017-01-31 05:21:00,298]", but the key-value store the events are being written into prefers dates as "milliseconds since 1970".
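Two of the use cases above, timestamp conversion and event routing, can be sketched in a few lines of Python. This is a minimal illustration rather than Kafka Connect code. It assumes the log timestamp is in UTC, and the function names, the `type` field, and the `events-` index prefix are hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def log_timestamp_to_epoch_millis(text: str) -> int:
    """Convert a log-style timestamp such as "[2017-01-31 05:21:00,298]"
    to milliseconds since 1970. Assumption: the log records UTC."""
    parsed = datetime.strptime(text.strip("[]"), "%Y-%m-%d %H:%M:%S,%f")
    parsed = parsed.replace(tzinfo=timezone.utc)
    # Integer division of timedeltas avoids floating-point rounding.
    return (parsed - EPOCH) // timedelta(milliseconds=1)


def route_to_index(event: dict, field: str = "type", prefix: str = "events-") -> str:
    """Pick a destination index name from a field inside the event itself."""
    return prefix + str(event[field]).lower()


millis = log_timestamp_to_epoch_millis("[2017-01-31 05:21:00,298]")
print(millis)
print(route_to_index({"type": "Error", "message": "disk full"}))
```

If the source logs use local time instead of UTC, the `replace(tzinfo=...)` step would need the appropriate zone.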
Architecture of Kafka Streams API, a Part of Apache Kafka
[Diagram: an application using the Kafka Streams API, with a producer and consumers, reading from and writing to topics in a Kafka cluster]
Key benefits
• No additional cluster
• Easy to run as a service
• Supports large aggregations and joins
• Security and permissions fully integrated from Kafka
Example Use Cases
• Microservices
• Continuous queries
• Continuous transformations
• Event-triggered processes
Tumbling windows answer a different type of question
[Figure: event-time timelines for Alice, Bob, and Dave, divided into fixed 5-minute windows]
Eg: How many downloads did we have per user in the last 5 minutes?
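A tumbling window assigns each event to exactly one fixed-size, non-overlapping time bucket. The sketch below counts downloads per user per 5-minute window in plain Python. It illustrates the windowing arithmetic only and does not use the Kafka Streams API; the function names and sample events are assumptions.

```python
from collections import Counter

WINDOW_MS = 5 * 60 * 1000  # 5-minute tumbling window


def window_start(timestamp_ms: int, size_ms: int = WINDOW_MS) -> int:
    """Align an event-time timestamp to the start of its window."""
    return timestamp_ms - (timestamp_ms % size_ms)


def count_per_user_per_window(events, size_ms: int = WINDOW_MS) -> dict:
    """events: iterable of (user, event_time_ms) pairs.
    Returns {(user, window_start_ms): count}."""
    counts = Counter()
    for user, timestamp_ms in events:
        counts[(user, window_start(timestamp_ms, size_ms))] += 1
    return dict(counts)


# Assumed sample data: download events keyed by user.
downloads = [
    ("alice", 10_000),
    ("alice", 200_000),
    ("bob", 290_000),
    ("alice", 310_000),  # falls into the next window, which starts at 300000
]
counts = count_per_user_per_window(downloads)
print(counts)
```

Because the window is derived from the event's own timestamp, late or out-of-order events still land in the correct bucket.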
Session windows allow us to group events based on periods of inactivity
[Figure: event-time timelines for Alice, Bob, and Dave, with each user's events grouped into sessions separated by an inactivity period]
Eg: How many shows does Alice watch on average per session?
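Session windows have no fixed size: a session for a user keeps growing until that user is inactive for longer than a chosen gap. The following Python sketch groups events this way and answers the "average per session" question. It is a simplified batch illustration, not the Kafka Streams implementation; the function names, the 30-minute gap, and the sample data are assumptions.

```python
from collections import defaultdict


def sessionize(events, gap_ms: int) -> dict:
    """events: iterable of (user, event_time_ms) pairs.
    Returns {user: [[timestamps of session 1], [timestamps of session 2], ...]}."""
    by_user = defaultdict(list)
    for user, timestamp_ms in events:
        by_user[user].append(timestamp_ms)

    sessions = {}
    for user, stamps in by_user.items():
        stamps.sort()
        user_sessions = [[stamps[0]]]
        for timestamp_ms in stamps[1:]:
            # A pause longer than the inactivity gap closes the current session.
            if timestamp_ms - user_sessions[-1][-1] > gap_ms:
                user_sessions.append([timestamp_ms])
            else:
                user_sessions[-1].append(timestamp_ms)
        sessions[user] = user_sessions
    return sessions


def average_events_per_session(user_sessions) -> float:
    """Average number of events (e.g. shows watched) per session."""
    return sum(len(session) for session in user_sessions) / len(user_sessions)


MIN = 60_000  # one minute in milliseconds
views = [
    ("alice", 0), ("alice", 10 * MIN), ("alice", 25 * MIN),
    ("alice", 120 * MIN), ("alice", 130 * MIN),
    ("bob", 5 * MIN),
]
sessions = sessionize(views, gap_ms=30 * MIN)
print(average_events_per_session(sessions["alice"]))
```

Here Alice's five views form two sessions (three views, then two after a long pause), so the same events would give a different answer under fixed 5-minute windows.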
Kafka Clients
Apache Kafka Native Clients
Confluent Native Clients
Community Supported Clients
Proxy http/REST
stdin/stdout
Confluent 3.2 – C# Client
High performance
Full support of Kafka protocol and features
Supported fully-featured native C# client
Integrates with Confluent’s Schema Registry
Works with any version of Apache Kafka
High reliability – honors Kafka ack settings and retries
Confluent 3.2 – JMS Client
Supported Kafka client, implementing the JMS interface
Secure clients with authentication, authorization and encryption
Integrates with Confluent’s Schema Registry
High reliability – Supports Kafka and JMS acknowledgments
Support for all JMS Message Types, Headers and Properties
Confluent 3.2 – Client Security
End-to-end encryption for REST Proxy
Active Directory integration for C# client
Kafka Connect API Library of Connectors
* Denotes connectors developed and distributed by Confluent. Extensive validation and testing have been performed.
[Figure: connector logos grouped as Databases, Datastore/File Store, Analytics, and Applications / Other; each group includes connectors marked with *]
CP 3.2 – New Certified & Supported Connectors
S3 Connector
• Write Avro and JSON files
• Date and time based partitions
• Exactly-once delivery
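Date and time based partitioning means that the storage location of a record is derived from a timestamp. The sketch below shows the idea in plain Python. The path layout, the function name, and the topic name are hypothetical and chosen for illustration; they are not taken from the connector's documentation, and UTC is assumed.

```python
from datetime import datetime, timezone


def partition_path(topic: str, timestamp_ms: int) -> str:
    """Build an hourly, date-based directory path for a record (hypothetical layout)."""
    ts = datetime.fromtimestamp(timestamp_ms // 1000, tz=timezone.utc)
    return f"{topic}/year={ts:%Y}/month={ts:%m}/day={ts:%d}/hour={ts:%H}"


path = partition_path("downloads", 1485840060298)
print(path)
```

Grouping objects this way lets downstream jobs read only the hours or days they need.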
Confluent 3.2 – Cluster Health & Administration
Cluster health dashboard
• Monitor the health of your Kafka clusters and get alerts if any problems occur
• Measure system load, performance, and operations
• View aggregate statistics or drill down by broker or topic
Cluster administration
• Monitor topic configurations
What’s new in Confluent 3.2?
Feature | Benefit
Single message transformations | Modify single events before storing in Kafka or as they leave Kafka
Session windows | Group events in a stream based on session windows
C# client | Supported, fully featured native C# client
Client security | Active Directory integration for C# and end-to-end encryption for REST Proxy
S3 connector | Easily write Avro and JSON files to Amazon S3
JMS client | Supported Kafka client implementing the JMS interface
Cluster health monitoring | Monitor the health of Kafka clusters and get alerts when problems occur
Cluster administration | Simplify the process of administering a Kafka cluster
[The slide's availability columns for Apache Kafka, Confluent Open Source, and Confluent Enterprise were check marks and are not reproduced here]
Confluent Completes Kafka
Feature | Benefit
Apache Kafka | High throughput, low latency, high availability, secure distributed streaming platform
Kafka Connect API | Advanced API for connecting external sources/destinations into Kafka
Kafka Streams API | Simple library that enables streaming application development within the Kafka framework
Additional Clients | Supports non-Java clients: C, C++, Python, .NET and several others
REST Proxy | Provides universal access to Kafka from any network connected device via HTTP
Schema Registry | Central registry for the format of Kafka data – guarantees all data is always consumable
Pre-Built Connectors | HDFS, JDBC, Elasticsearch, Amazon S3 and other connectors fully certified and supported by Confluent
Confluent Control Center | Enables easy connector management, monitoring and alerting for a Kafka cluster
Auto Data Balancer | Rebalancing data across cluster to remove bottlenecks
Replicator | Multi-datacenter replication simplifies and automates MDC Kafka clusters
Support | Enterprise class support to keep your Kafka environment running at top performance | Apache Kafka: Community | Confluent Open Source: Community | Confluent Enterprise: 24x7x365
[Except for the Support row, the slide's availability columns for Apache Kafka, Confluent Open Source, and Confluent Enterprise were check marks and are not reproduced here]
Why Confluent? More than just enterprise software
Confluent Platform
The only enterprise open source streaming platform based entirely on Apache Kafka
Professional Services
Best-practice consultation for future Kafka deployments, and optimization of existing ones for performance and scalability
Enterprise Support
24x7 support for the entire Apache Kafka project, not just a portion of it
Complete support across the entire adoption lifecycle
Kafka Training
Comprehensive hands-on courses for developers and operators from the Apache Kafka experts
Get Started with Apache Kafka Today!
https://www.confluent.io/downloads/
THE place to start with Apache Kafka!
Thoroughly tested and quality assured
More extensible developer experience
Easy upgrade path to Confluent Enterprise