slash n: technical session 2 - messaging as a platform - shashwat agarwal, vivek y s
Messaging as a Platform
You only worry about processing
Messaging
Messaging is a form of communication to exchange information
Messaging is the means to distribute/share/seek information in software systems
Alan Kay on Messaging
http://c2.com/cgi/wiki?AlanKayOnMessaging
Messaging
● Enables us to build distributed systems
Producer -> Consumer, Consumer
Messaging
... Massively distributed message processing systems
Type of processing
● Consider a scenario - counting ERRORs in access log
● Largely a batch execution
cat access.log | grep ERROR | sed -E 's|.* (http://.*)|\1|' | sort | uniq -c > errors.txt
cat -> grep -> sed -> uniq
Type of processing
● Replacing 'batch' with 'real time' processing
cat access.log | grep ERROR | sed -E 's|.* (http://.*)|\1|' | sort | uniq -c > errors.txt
Is this expression still valid?
tail -f access.log
Type of processing
tail -f access.log | grep ERROR | sed -E 's|.* (http://.*)|\1|' | sort | uniq -c > errors.txt
stdout -> grep -> sed -> uniq (batch)
Batch Processors
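In the `tail -f` form, `sort` cannot emit anything until its input ends, so `errors.txt` never fills up. A minimal sketch of the streaming alternative, which keeps a running count per URL as lines arrive. The function name `count_errors`, the assumed log layout (URL as the last field) and the sample lines are hypothetical, not taken from the talk.

```python
import re
from collections import Counter
from typing import Iterable, Iterator, Tuple

# Same extraction as the sed step: keep the trailing http:// URL (assumed log layout).
URL_RE = re.compile(r".* (http://.*)")

def count_errors(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Yield (url, running_count) after every ERROR line, without waiting for end of input."""
    counts: Counter = Counter()
    for line in lines:
        if "ERROR" not in line:
            continue
        match = URL_RE.match(line.rstrip("\n"))
        if match is None:
            continue
        url = match.group(1)
        counts[url] += 1
        yield url, counts[url]

# Hypothetical sample input; in practice `lines` could be any unbounded iterator.
sample = [
    "10:00 ERROR 500 http://example.com/a",
    "10:01 INFO 200 http://example.com/a",
    "10:02 ERROR 500 http://example.com/b",
    "10:03 ERROR 500 http://example.com/a",
]
updates = list(count_errors(sample))
```

Because the generator yields after every matching line, a consumer sees updated counts immediately instead of waiting for a final sorted report.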
Messaging
● What about processing logs from different systems?
● What about distributing processing on multiple systems?
● Can I tap into one of the pipes without stopping the system?
● How can I add to an already running pipeline?
Introducing...
Messaging as a Platform
● You never have to write code to exchange messages
● You only write processes (sed, grep etc.) and string them together; the platform can even provide a rich set of built-ins
● The process definition remains unchanged even when the nature of the processing changes!
MaaP - Architecture
Messaging Services as the pipe
cat -> Q -> grep -> Q -> sed -> Q -> uniq
Wrappers to enable Message Service
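One way to picture the wrappers: each process stays a plain function of one message, and a generic wrapper connects it to an input queue and an output queue. This is only a sketch using Python's in-process `queue.Queue` as a stand-in for a real message service; `wrap`, `STOP`, `grep_error` and `last_field` are hypothetical names.

```python
import queue
import threading
from typing import Callable, Optional

STOP = None  # sentinel marking end of stream (an assumption of this sketch)

def wrap(process: Callable[[str], Optional[str]],
         inbox: "queue.Queue", outbox: "queue.Queue") -> threading.Thread:
    """Run `process` on every message from inbox and forward non-None results to outbox."""
    def loop() -> None:
        while True:
            msg = inbox.get()
            if msg is STOP:
                outbox.put(STOP)  # propagate shutdown downstream
                return
            result = process(msg)
            if result is not None:
                outbox.put(result)
    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread

# The processes themselves know nothing about queues.
def grep_error(line: str) -> Optional[str]:
    return line if "ERROR" in line else None

def last_field(line: str) -> Optional[str]:
    return line.rsplit(" ", 1)[-1]

q1, q2, q3 = queue.Queue(), queue.Queue(), queue.Queue()
workers = [wrap(grep_error, q1, q2), wrap(last_field, q2, q3)]

for line in ["ERROR 500 http://example.com/a", "INFO 200 http://example.com/b", STOP]:
    q1.put(line)
for worker in workers:
    worker.join()

# Drain the final queue up to the sentinel.
results = []
while True:
    item = q3.get()
    if item is STOP:
        break
    results.append(item)
```

Swapping the queues for a networked message service would leave `grep_error` and `last_field` untouched, which is the property the slide is after.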
MaaP - Messaging Services
Provides
● Access from multiple hosts
  ○ Enables distributed processing
● Capacity buffering where there is a throughput difference
● Transactions and Durability
  ○ Enables failure handling and fault tolerance
● Ordering, Retries ...
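A small in-memory sketch of three of these guarantees: a bounded buffer (capacity buffering), acknowledgements with retries, and order-preserving redelivery. Durability is deliberately not modelled, and `RetryQueue` with its methods is a hypothetical illustration, not the API of any real message service.

```python
from collections import deque
from typing import Any, Deque, Dict, List, Optional, Tuple

class RetryQueue:
    """In-memory stand-in for a message service: bounded buffer plus ack/retry."""

    def __init__(self, capacity: int, max_retries: int = 3) -> None:
        self.capacity = capacity
        self.max_retries = max_retries
        self._ready: Deque[Tuple[int, Any, int]] = deque()  # (id, body, attempts)
        self._in_flight: Dict[int, Tuple[Any, int]] = {}
        self._next_id = 0
        self.dead: List[Any] = []  # messages that exhausted their retries

    def send(self, body: Any) -> bool:
        """Buffer a message; return False when the buffer is full (back-pressure)."""
        if len(self._ready) >= self.capacity:
            return False
        self._ready.append((self._next_id, body, 0))
        self._next_id += 1
        return True

    def receive(self) -> Optional[Tuple[int, Any]]:
        """Hand out the oldest message; it stays in flight until acked or nacked."""
        if not self._ready:
            return None
        msg_id, body, attempts = self._ready.popleft()
        self._in_flight[msg_id] = (body, attempts + 1)
        return msg_id, body

    def ack(self, msg_id: int) -> None:
        """Confirm successful processing and forget the message."""
        self._in_flight.pop(msg_id)

    def nack(self, msg_id: int) -> None:
        """Put a failed message back at the front, or park it once retries run out."""
        body, attempts = self._in_flight.pop(msg_id)
        if attempts >= self.max_retries:
            self.dead.append(body)
        else:
            self._ready.appendleft((msg_id, body, attempts))
```

Requeueing at the front keeps the original order for the next consumer; a real service would also persist `_ready` and `_in_flight` so that messages survive a crash.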
Next we will need Message Brokers to enable dynamic routing
MaaP - Architecture
Producer Process
Consumer Processes
Broker
Q Q Q Q
Wrappers to enable Message Service
MaaP - Brokers
● Manages the routes from producers to consumer queues
● Consumers can join or opt out at any time
● Can optionally let consumers read messages from the beginning, from some point in the past, or live only
● Enables consumers to move between different messaging systems
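The join, opt-out and replay behaviour can be sketched with an append-only log per topic. This is an illustrative toy under stated assumptions: the `Broker` class, its `start` modes and the topic name are hypothetical, and plain lists stand in for consumer queues.

```python
from collections import defaultdict
from typing import Any, Dict, List

class Broker:
    """Keeps an append-only log per topic and routes messages to subscriber queues."""

    def __init__(self) -> None:
        self._log: Dict[str, List[Any]] = defaultdict(list)
        self._subscribers: Dict[str, Dict[str, List[Any]]] = defaultdict(dict)

    def publish(self, topic: str, message: Any) -> None:
        """Record the message, then copy it to every current subscriber."""
        self._log[topic].append(message)
        for inbox in self._subscribers[topic].values():
            inbox.append(message)

    def subscribe(self, topic: str, name: str,
                  start: str = "live", offset: int = 0) -> List[Any]:
        """Join at any time. `start` is 'beginning', 'offset', or 'live'."""
        history = self._log[topic]
        if start == "beginning":
            inbox = list(history)
        elif start == "offset":
            inbox = list(history[offset:])
        elif start == "live":
            inbox = []
        else:
            raise ValueError(f"unknown start mode: {start}")
        self._subscribers[topic][name] = inbox
        return inbox

    def unsubscribe(self, topic: str, name: str) -> None:
        """Opt out; later messages are no longer delivered to this consumer."""
        self._subscribers[topic].pop(name, None)

broker = Broker()
broker.publish("errors", "e1")
broker.publish("errors", "e2")
replay = broker.subscribe("errors", "replayer", start="beginning")
recent = broker.subscribe("errors", "recent", start="offset", offset=1)
live = broker.subscribe("errors", "monitor")
broker.publish("errors", "e3")
broker.unsubscribe("errors", "monitor")
broker.publish("errors", "e4")
```

Producers only ever call `publish`; which consumers exist, and where each one started reading, is entirely the broker's concern.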
A message batching system that dumps messages according to rules to enable batch processing
MaaP - Architecture
Q -> Periodic Batch -> HDFS
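A rule-driven batcher might look like the sketch below. The rule here is a simple size threshold, and a Python list stands in for files written to HDFS; `Batcher`, `rule` and `sink` are hypothetical names, and a time-based rule would follow the same shape.

```python
from typing import Any, Callable, List

class Batcher:
    """Collects messages and hands them to `sink` whenever `rule` says the batch is ready."""

    def __init__(self, rule: Callable[[List[Any]], bool],
                 sink: Callable[[List[Any]], None]) -> None:
        self.rule = rule
        self.sink = sink
        self._pending: List[Any] = []

    def on_message(self, message: Any) -> None:
        """Buffer one message and dump the batch if the rule fires."""
        self._pending.append(message)
        if self.rule(self._pending):
            self.flush()

    def flush(self) -> None:
        """Dump whatever is pending as one batch (no-op when empty)."""
        if self._pending:
            self.sink(list(self._pending))
            self._pending.clear()

dumped: List[List[Any]] = []  # stand-in for batch files written to HDFS
batcher = Batcher(rule=lambda batch: len(batch) >= 3, sink=dumped.append)
for n in range(7):
    batcher.on_message(n)
batcher.flush()  # dump the final partial batch
```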
MaaP - Architecture
A mechanism to pump processed data back to event-based processing
HDFS -> (pull) Listener -> (push) Broker
Batch process complete notification
MaaP - Architecture
And finally a Process Manager
● that accepts the process binaries and distributes them to the hosts allocated to the platform
● that load-balances based on the computation capacity available
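Capacity-based placement can be illustrated with a greedy heuristic: put the most expensive process on whichever host currently has the most free capacity. The talk does not describe its actual algorithm, so this is only an assumed strategy; `place`, the cost units and the host names are hypothetical.

```python
from typing import Dict

def place(processes: Dict[str, int], hosts: Dict[str, int]) -> Dict[str, str]:
    """Assign each process (name -> cost) to the host (name -> capacity) with most free capacity."""
    free = dict(hosts)
    placement: Dict[str, str] = {}
    # Place the most expensive processes first so they get the roomiest hosts.
    for name, cost in sorted(processes.items(), key=lambda item: (-item[1], item[0])):
        # Ties go to the alphabetically first host, which keeps the result deterministic.
        host = max(sorted(free), key=lambda h: free[h])
        if free[host] < cost:
            raise RuntimeError(f"no host has capacity for {name}")
        free[host] -= cost
        placement[name] = host
    return placement

placement = place({"grep": 2, "sed": 1, "uniq": 3}, {"host-a": 4, "host-b": 3})
```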
Demo
Event processing system on a simple MaaP platform
MaaP - Uses
● Log processing
  ○ Real-time event filtering, routed to ->
  ○ Aggregation (batched) and then ->
  ○ Near real-time monitoring
● Product Feed Processing and Aggregation
  ○ Crawlers, emails and FTP supply feeds: routed to ->
  ○ Extract product info (batched) to ->
  ○ Pipeline to update the store and invalidate caches in near real time
@ Flipkart
A lot of messages and a lot of solutions
● Work Queues (RabbitMQ)
● SOA via Restbus
● Event Replication across systems
  ○ Notification (cache updates)
● Events to batch processing and back
  ○ MySQL <-> Hadoop <-> MySQL (analytics)
We need to connect them! That's all :P
Questions?