TRANSCRIPT
The WebLogic Messaging Kernel: Design Challenges
Greg Brail
Senior Staff Software Engineer
BEA Systems
Background
JMS has been part of WebLogic Server since version 5.1
New requirements came up for WLS 9.0:
Store-and-forward messaging
Asynchronous web services messaging (WS-RM)
Smaller-footprint messaging
Improved performance
Clean up our growing code base
Messaging Kernel
The Messaging Kernel provides in-process messaging services divorced from any particular API or protocol
Used in WebLogic Server 9.0:
JMS 1.1
WebServices-Reliable Messaging (WS-RM)
Kernel itself is invisible to end users
Implementation:
Design work started in mid-2003
Implementation in 2004
Shipped with WebLogic Server 9.0 in August 2005
(Credit for the kernel idea goes to “Zach” Zachwieja, now at Microsoft)
Messaging Kernel Features
Message Queuing:
Persistent and non-persistent queuing
Transactional queuing, with two-phase commit support
Pluggable message sort order
Message redirection for expiration and error handling
Message ordering and sequencing features
Publish/Subscribe:
Topic subscribers are queues
Pluggable subscription filter
Core Features:
Thresholds, quotas, statistics, pause/resume for manageability
Paging to control memory usage
Three Design Challenges
#1: Persistence Mechanism
#2: Message Ordering for User Applications
#3: Cluster-Wide Message Routing
Challenge #1: Persistence Mechanism
What are some ways we could handle message persistence?
RDBMS approach:
INSERT each new message as a row in a table
SELECT every time we want to get a message and flag it somehow
DELETE each message when it has been acknowledged
Log approach:
Use a linear transaction log
Append records for “send” and “acknowledge” and recover by reading the log
Truncate the log when it gets too big and move messages elsewhere
Classical approach:
Create persistent data structures on disk blocks, just like a database
Use ARIES or some other well-documented algorithm
Traverse the persistent data structures to get and update each message
Heap approach:
Write each message to disk in a convenient location
Remember that location and mark messages deleted when acknowledged
Recover by reading back the whole file
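As a concrete illustration, here is a minimal sketch of the heap approach in Java. This is not BEA's code; `HeapStore` and its record layout are invented for this example. The idea is simply: append records wherever convenient, flip a flag on acknowledge, and scan the whole file on recovery.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Toy heap-style message store: [flag][length][payload] records appended
// to a file. Acknowledged messages are marked deleted in place; recovery
// reads back the whole file and keeps only the live records.
class HeapStore {
    private static final byte LIVE = 1, DELETED = 0;
    private final RandomAccessFile file;

    HeapStore(String path) throws IOException {
        file = new RandomAccessFile(path, "rw");
    }

    // Append a record at the end of the file; return its offset.
    long write(byte[] payload) throws IOException {
        long offset = file.length();
        file.seek(offset);
        ByteBuffer header = ByteBuffer.allocate(5);
        header.put(LIVE).putInt(payload.length);
        file.write(header.array());
        file.write(payload);
        return offset;
    }

    // Acknowledge: flip the flag in place; space is reclaimed later.
    void delete(long offset) throws IOException {
        file.seek(offset);
        file.write(DELETED);
    }

    // Recovery: scan the entire file, keeping only live records.
    List<byte[]> recover() throws IOException {
        List<byte[]> live = new ArrayList<>();
        file.seek(0);
        while (file.getFilePointer() < file.length()) {
            byte flag = file.readByte();
            int len = file.readInt();
            byte[] payload = new byte[len];
            file.readFully(payload);
            if (flag == LIVE) live.add(payload);
        }
        return live;
    }
}
```

The trade-offs in the slides fall out directly: writes and deletes are cheap (one seek each), but recovery cost grows with the file size, and live offsets must be kept in memory.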
Persistence Pros and Cons
RDBMS:
Simple to implement, at least at first
But SELECTing to find each new message is not great
WebLogic JMS 5.1 did this and it was very slow
Low memory usage, but also low throughput and high latency
Log:
Transaction logs are well understood and relatively simple to implement
But once the log is full, performance drops dramatically
Good memory usage, throughput, and latency
Recovery time depends on the (configurable) size of the log
Classical:
Well understood, although not simple to implement
Memory usage and recovery time depend on the size of the cache and log
Throughput is lower due to the overhead of the persistence algorithms
At least for us, in Java!
Heap-Based Persistence
The Heap method works well for us:
Higher memory usage than the Classical or RDBMS approaches
Potentially longer recovery time than the Log or Classical approaches
But the best throughput
Allows for both file- and RDBMS-based implementations
File-based heap used in WLS 6.1 through 8.1:
Not very sophisticated; relied too much on the filesystem
The latest version is a lot smarter:
We make sure we do no more than one I/O per commit
Platform-specific code to access the “direct I/O” feature of most OSes
New records are placed on disk to reduce rotational latency
It runs faster than a traditional transaction log on a single disk
“A High-Performance, Transactional File Store for Application Servers”, Gallagher, Jacobs, and Langen, SIGMOD 2005
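The “no more than one I/O per commit” point is usually achieved with group commit: concurrent committers pool their records, and a single physical write makes the whole batch durable. Below is a deliberately simplified, single-threaded sketch of the idea (invented for this example; the real store batches across threads and blocks each committer until the flush that covers it completes).

```java
import java.util.ArrayList;
import java.util.List;

// Toy sketch of group commit: commit records accumulate in a batch, and one
// flush (standing in for a single sequential disk write) makes the whole
// batch durable. N commits thus cost one I/O instead of N.
class GroupCommitLog {
    private final List<byte[]> pending = new ArrayList<>();
    private final List<byte[]> durable = new ArrayList<>();
    int physicalWrites = 0; // how many disk I/Os we would have issued

    // A committer hands over its record; durability comes at the next flush.
    void commit(byte[] record) {
        pending.add(record);
    }

    // One write covers every pending commit.
    void flush() {
        if (pending.isEmpty()) return;
        durable.addAll(pending); // stands in for one sequential write
        pending.clear();
        physicalWrites++;
    }

    int durableCount() {
        return durable.size();
    }
}
```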
File-Based Persistence Performance
Performance test results: One JMS queue, with persistent 100-byte messages
Messages sent and received simultaneously
JMS clients on separate machines from the server
Result based on receive throughput
(Hardware: Dual-CPU 3.4 GHz Intel, 4GB RAM, 15,000 RPM SCSI disk, Windows Server 2003. One such machine used for WLS server, two for clients.)
[Bar chart: receive throughput in messages/second (y-axis 0–6000), comparing WLS 8.1 and 9.0 for two workloads: 1 producer/1 consumer and 32 producers/32 consumers. WLS 9.0 is far higher in both cases.]
Challenge #2: Message Ordering
Problem:
Applications require certain sets of messages to be processed in order
Queuing systems usually give you two choices:
One thread and one queue for all processing (poor throughput)
Or, lots of threads and lots of queues (poor manageability)
Solution: the “Unit of Order” feature
Controls the concurrency of message delivery based on application requirements
Messages are tagged with a “unit of order” (UOO) name
Only one message for each UOO is delivered at a time
The next message is not available until the previous message has been processed
Processing ends with transaction resolution
Result: for each UOO, messages are processed in the order they arrived on the queue
Better throughput
Less lock contention in the database
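The delivery rule above can be sketched in a few lines of Java. This is invented for illustration, not the WebLogic implementation: at most one message per UOO name is in flight, and a consumer skips messages whose UOO currently has an owner.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Queue;
import java.util.Set;

// Toy Unit-of-Order queue: messages are {uoo, body} pairs, and a UOO with a
// message in flight is "owned" until that message is acknowledged.
class UooQueue {
    private final Queue<String[]> messages = new ArrayDeque<>();
    private final Set<String> owned = new HashSet<>();

    void send(String uoo, String body) {
        messages.add(new String[] {uoo, body});
    }

    // Deliver the first message whose UOO has no owner, taking ownership.
    String[] receive() {
        Iterator<String[]> it = messages.iterator();
        while (it.hasNext()) {
            String[] m = it.next();
            if (!owned.contains(m[0])) {
                owned.add(m[0]); // this consumer now owns the UOO
                it.remove();
                return m;
            }
        }
        return null; // every pending message belongs to an owned UOO
    }

    // Acknowledge (transaction resolution): free the UOO for its next message.
    void acknowledge(String uoo) {
        owned.remove(uoo);
    }
}
```

This makes the per-UOO ordering guarantee concrete: a second “blue” message can never be delivered until the first is acknowledged, while “green” messages flow around it.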
Unit of Order Example
• When the first blue message is dequeued, the blue UOO has an “owner”
• The next consumer skips blue messages and gets the green message
• When the blue message is acknowledged, the next blue message becomes available for consumption
• Throughput is excellent when messages are well-interleaved (like above)
• In theory, throughput drops when a consumer must skip many messages because they are not well-interleaved (like below)
Unit of Order Performance
[Line chart: messages/second (y-axis 0–3000) vs. number of UOOs (1, 2, 8, 100, 0*), with one line each for 1, 8, and 16 consumer threads.]
Test receives non-persistent messages from a queue and sleeps 5 milliseconds per message to simulate actual processing.
(*With zero units of order, messages are processed out of order. So, this number is just on the chart as a baseline.)
Challenge #3: Cluster Routing
UOO is great for a single queue
What if new messages are load balanced across many queues?
Each UOO must ensure that all of its messages go to the same queue
And other problem domains have similar requirements
We implemented two solutions:
Hashing: hash on the unit-of-order name to determine where in the cluster each message should go
Hashing is based on the number of servers configured in the cluster (C)
“Path Service:” A persistent cluster-wide database
Maps keys (UOO names) to values (cluster members)
One master database for the whole cluster
Caches on all other cluster nodes
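The two schemes can be contrasted in a small sketch (class and method names invented for this example). Hashing is stateless and deterministic; the path service records the first routing decision for a UOO and pins later messages to it.

```java
import java.util.HashMap;
import java.util.Map;

// Toy illustration of the two cluster-routing schemes.
class ClusterRouter {
    private final int configuredServers;                         // C
    private final Map<String, Integer> pathDb = new HashMap<>(); // master's table

    ClusterRouter(int configuredServers) {
        this.configuredServers = configuredServers;
    }

    // Hashing: the UOO name alone determines the member, modulo C.
    // Stateless and fast, but sensitive to C changing.
    int routeByHash(String uoo) {
        return Math.floorMod(uoo.hashCode(), configuredServers);
    }

    // Path service: the first message for a UOO picks a member (here just by
    // hashing), and the persistent map pins all later messages to it, even if
    // the cluster is later resized.
    int routeByPathService(String uoo) {
        return pathDb.computeIfAbsent(uoo, this::routeByHash);
    }
}
```

In the real system the path database is persistent and cluster-wide, with a single master handling updates and caches on the other nodes; the map above only stands in for that.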
Cluster Routing Issues
Both approaches have flaws
Hashing is fast and scales well
But if any one server is down, 1/C units of order cannot be handled
C is the number of configured servers, not running servers
Queuing elsewhere decreases throughput and adds race conditions
If C changes, messages may fall out of order
This makes it difficult to grow and shrink the cluster based on load
The Path Service is much more flexible
One server in the cluster is the master and handles all updates
So if it is down, new paths cannot be created
Future: we would like to do better
Use Paxos to elect a master, with replicated persistent state?
Generate UOO names on the server rather than on the client?
Conclusion
With the messaging kernel, we have implemented some old solutions, and found a few new ones
We think “unit of order” is quite useful and bears future research
We have quite a few more problems to solve:
Continuously-available cluster routing
Continuously-available messaging
Performance, performance, performance
The messaging world needs to pay more attention to the research world to help solve these kinds of problems
The research world might have more to study in the messaging world too!