[@NaukriEngineering] Messaging Queues
Message Queue
Message Queuing (MSMQ) technology enables applications running at different times to communicate across heterogeneous networks and systems that may be temporarily offline.
Applications send messages to queues and read messages from queues.
Messaging enables software applications to connect and scale.
Messaging is asynchronous: it decouples applications by separating the sending of data from the receiving of data.
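The send/receive decoupling described above can be sketched with Python's standard library. This is only an in-process illustration, not MSMQ or any real broker: `run_demo` is a hypothetical helper, and `queue.Queue` stands in for the broker's queue. The producer finishes sending before the consumer even starts, which is the "running at different times" property.

```python
import queue
import threading


def run_demo(messages):
    """Send all messages first, then start a consumer that reads them."""
    q = queue.Queue()   # stands in for a broker-side queue (assumption)
    received = []

    def consumer():
        while True:
            msg = q.get()        # blocks until a message is available
            if msg is None:      # sentinel: the producer has nothing more to send
                break
            received.append(msg.upper())   # pretend "processing"

    # Producer side: the consumer is not running yet, messages simply wait.
    for m in messages:
        q.put(m)
    q.put(None)

    # Consumer side: started later, drains whatever is waiting in the queue.
    worker = threading.Thread(target=consumer)
    worker.start()
    worker.join()
    return received
```

Calling `run_demo(["encode video", "send email"])` returns the processed messages in the order they were sent, even though the sender never waited for the receiver.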
Real World Scenarios
Sending emails in bulk.
Video encoding: if you're building YouTube, you are probably not going to ask users to wait until their video has been converted from AVI to Flash / x264 / WebM before telling them the file was uploaded correctly.
Pushing a tweet: when a user has authorized their account on your app (deferred tweets, API limit reached, etc.).
Gamification in referral (multiple API calls through different applications).
Components of AMQP
Message Broker - manages exchanges and queues
Channel - logical state of a connection (name, durability, persistence, delivery mode etc.)
Exchanges - where messages are sent; they route messages to queues
Queues - hold messages for consumption by clients
Bindings - relationships between queues and exchanges
Messages - consist of two parts: routing key and body
Message Headers
Routing Key - used to route messages; how it is matched depends on the type of exchange (topic exchanges allow the wildcards * and # in binding keys)
Priority - ranging from 0-9 provides priority over other messages
Delivery Mode - if message needs persistence or not (Durability)
Expiration - time for message broker to decide if message is routable. (TTL for both messages/queues)
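The header fields above can be pictured with a small in-memory sketch. Nothing here is a real broker or client API: `Message`, `PriorityQueueSketch`, `expires_at` and the explicit `now` argument are hypothetical names chosen for illustration, and the sketch assumes that a higher priority value is delivered first and that an expired message is silently dropped.

```python
import heapq
import itertools
from dataclasses import dataclass
from typing import Optional


@dataclass
class Message:
    routing_key: str
    body: bytes
    priority: int = 0                   # 0-9; higher is delivered first (assumption)
    persistent: bool = False            # delivery mode: should the broker persist it?
    expires_at: Optional[float] = None  # absolute expiry time; None means no TTL


class PriorityQueueSketch:
    """In-memory queue that honours priority and expiration."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()   # tie-breaker keeps FIFO order per priority

    def publish(self, msg: Message) -> None:
        if not 0 <= msg.priority <= 9:
            raise ValueError("priority must be in the range 0-9")
        heapq.heappush(self._heap, (-msg.priority, next(self._seq), msg))

    def get(self, now: float) -> Optional[Message]:
        """Return the next live message, discarding any that have expired."""
        while self._heap:
            _, _, msg = heapq.heappop(self._heap)
            if msg.expires_at is None or msg.expires_at > now:
                return msg
        return None
```

Passing `now` explicitly instead of reading the clock keeps the sketch deterministic; a real broker would use its own clock.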
Direct Exchanges
A message goes to the queues whose binding key exactly matches the routing key of the message.
Topic Exchanges
A message sent with a particular routing key is delivered to all the queues that are bound with a matching binding key. Binding keys may contain wildcards: * stands for exactly one word and # for zero or more words.
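The two routing rules can be compared side by side in a short sketch. It assumes the usual AMQP topic conventions (words separated by dots, `*` matching exactly one word, `#` matching zero or more words); `topic_matches` and `route` are hypothetical helper names, not part of any client library, and the queue names in the usage note are made up.

```python
def topic_matches(binding_key: str, routing_key: str) -> bool:
    """Return True if a topic-exchange binding key matches a routing key."""
    pattern = binding_key.split(".")
    words = routing_key.split(".")

    def match(p: int, w: int) -> bool:
        if p == len(pattern):
            return w == len(words)          # pattern used up: words must be too
        if pattern[p] == "#":
            # '#' may absorb zero or more of the remaining words
            return any(match(p + 1, k) for k in range(w, len(words) + 1))
        if w == len(words):
            return False                    # pattern needs a word, none left
        if pattern[p] == "*" or pattern[p] == words[w]:
            return match(p + 1, w + 1)      # '*' or a literal consumes one word
        return False

    return match(0, 0)


def route(exchange_type: str, bindings, routing_key: str):
    """bindings is a list of (queue_name, binding_key) pairs."""
    if exchange_type == "direct":
        return [q for q, key in bindings if key == routing_key]
    if exchange_type == "topic":
        return [q for q, key in bindings if topic_matches(key, routing_key)]
    raise ValueError(f"unsupported exchange type: {exchange_type}")
```

With bindings `[("errors", "log.error"), ("all_logs", "log.#")]`, a direct exchange delivers routing key `log.error` only to `errors`, while a topic exchange delivers it to both queues.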
Why Kafka? When we have…
Aren’t they Good?
*Apache ActiveMQ, JBoss HornetQ, ZeroMQ and RabbitMQ are brands of the Apache Software Foundation, JBoss Inc, iMatix Corporation and VMware Inc, respectively.
So what are the Use-cases…
● Transportation of logs
● Activity stream in real time
● Collection of performance metrics
  ◦ CPU/IO/Memory usage
  ◦ Application specific
    ⚫ Time taken to load a web page
    ⚫ Number of requests
    ⚫ Number of hits on a particular page/URL
What is Common?
● Scalable: needs to be highly scalable; there is a lot of data.
● Reliability of messages: what if I lose a small number of messages? Is that fine with me?
● Distributed: multiple producers, multiple consumers
● High-throughput (100k+/sec)
Introduction
● A high-throughput, distributed, publish-subscribe based messaging system
● A kind of data pipeline
● Does not follow the JMS standard
[Figure: Kafka architecture. Several producers push events to the Kafka brokers (Partition 1, Partition 2). Consumers 1, 2 and 3 (all in groupId1) poll events from the brokers. Zookeeper handles coordination: it stores the consumed offsets and watches for cluster events. Other labels in the diagram: Handshake.]
* Two consumers of the same consumer group can’t read from the same partition.
Topic/Partitions
● Feeds of messages are categorized into topics
● Topics are broken up into ordered commit logs called partitions
● #parallelism = #partitions
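A topic split into ordered, append-only partitions can be sketched as below. This is a toy model and not Kafka's implementation: `Topic`, `partition_for` and `append` are hypothetical names, and `zlib.crc32` is used only as a convenient deterministic hash, which is an assumption and not necessarily the hash Kafka's own partitioner uses.

```python
import zlib


class Topic:
    """A topic as a list of append-only logs (partitions)."""

    def __init__(self, name: str, num_partitions: int):
        if num_partitions < 1:
            raise ValueError("a topic needs at least one partition")
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key: str) -> int:
        # Same key -> same partition, so per-key ordering is preserved.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def append(self, key: str, value: str):
        """Append to the key's partition; return (partition, offset)."""
        p = self.partition_for(key)
        self.partitions[p].append(value)
        return p, len(self.partitions[p]) - 1
```

Ordering is guaranteed only inside one partition, which is why the number of partitions bounds the parallelism: each partition can be written and read independently of the others.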
Consumer
● Multiple consumers can read from the same topic
● Each consumer manages its own offset
● Messages stay on Kafka for a configured duration
● Max #consumers (within one consumer group) = #partitions
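The consumer rules can be illustrated with a minimal sketch. It assumes a simple round-robin assignment within a consumer group, which is a swapped-in simplification and not necessarily the strategy Kafka applies; `assign_partitions` and `Consumer` are hypothetical names, and the offset is kept in memory here purely for illustration.

```python
def assign_partitions(partitions, consumers):
    """Give each partition to exactly one consumer of the group (round-robin)."""
    if not consumers:
        raise ValueError("the group has no consumers")
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment


class Consumer:
    """Reads one partition log and tracks its own offset."""

    def __init__(self, log):
        self.log = log      # the partition's list of messages
        self.offset = 0     # position of the next message to read

    def poll(self, max_messages: int = 10):
        batch = self.log[self.offset:self.offset + max_messages]
        self.offset += len(batch)   # reading does not delete anything from the log
        return batch
```

With two partitions and three consumers in one group, the third consumer receives an empty assignment, which is the sense in which the number of partitions caps the useful consumers per group. Two `Consumer` objects from different groups can read the same log independently because each keeps its own offset.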
Design Elements
● Filesystem cache: reduces the number of disk reads
● Zero-copy transfer of messages - http://www.ibm.com/developerworks/library/j-zerocopy/
● Batching of messages: reduces network calls by a factor of the batch size
● Eventual consistency: a trade-off between consistency and durability
● The broker does not push messages to consumers; consumers poll messages from the broker
● Automatic producer load balancing
● Zookeeper helps in cluster formation and in rebalancing of brokers/consumers
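The batching point can be made concrete with a small sketch. `BatchingProducer` and its `send` callback are hypothetical: the callback stands in for one network call to a broker, and the sketch only counts how many such calls are made.

```python
class BatchingProducer:
    """Buffers messages and hands them to `send` one batch at a time."""

    def __init__(self, send, batch_size: int):
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        self._send = send            # stands in for one network call (assumption)
        self._batch_size = batch_size
        self._buffer = []

    def publish(self, message) -> None:
        self._buffer.append(message)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        """Send whatever is buffered, if anything."""
        if self._buffer:
            self._send(list(self._buffer))
            self._buffer.clear()


def count_network_calls(num_messages: int, batch_size: int) -> int:
    calls = []
    producer = BatchingProducer(calls.append, batch_size)
    for i in range(num_messages):
        producer.publish(i)
    producer.flush()                 # push out the final partial batch
    return len(calls)
```

Publishing 1000 messages with a batch size of 100 results in 10 calls instead of 1000. The cost is latency: a message may wait in the buffer until the batch fills or `flush` is called.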
Performance
[Figures: Producer Performance; Consumer Performance]
Credit: http://research.microsoft.com/en-us/UM/people/srikanth/netdb11/netdb11papers/netdb11-final12.pdf