Apache Camel is a framework which sits on top of Spring and, in this case, runs on an Oracle WebLogic web server. It simplifies writing Java applications for processing messages.
I was involved in tracking down some performance problems when the round trip time for a simple application was over 1 second – coming from a z/OS background I thought 50 milliseconds was a long time – so over 1 second is an age!
The application is basically a Message Driven Bean (MDB). It does the following
- A listening thread gets a message from a queue. There can be multiple listening threads.
- It gets an application thread from a pool.
- It passes the message via an internal queue to the application running on that thread – the listening thread is blocked while the application is running.
- The application sends (puts) a message to an MQ queue for the backend server and waits for the response.
- Control returns to the listening thread.
- The application thread is freed.
As this is essentially the same as a classic MDB, we had configured the application thread pool to the same size as the listener thread pool.
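As a sketch only – the queue names and option values below are made up for illustration, not the actual application's – the flow above roughly corresponds to a Camel route like this (the `concurrentConsumers` and `requestTimeout` options are from the Camel JMS documentation):

```java
// Hypothetical sketch of the flow above in Camel's Java DSL.
// Queue names and option values are invented for illustration.
from("jms:queue:APP.REQUEST?concurrentConsumers=10")   // the listening threads
    .to(ExchangePattern.InOut,                         // request-reply
        "jms:queue:BACKEND.REQUEST"
        + "?replyTo=BACKEND.REPLY"                     // where replies arrive
        + "&requestTimeout=20000");                    // ms to wait for the reply
```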
Shortage of threads
The symptoms of the problem looked like a shortage of threads problem.
When we increased the number of threads in the application pool (we gave it 4 times the number of listener threads), the response time dropped – good news. I don't know how many threads you need – is it n+1 or 2n? I'll leave finding the right number as an exercise for the reader!
The hard-coded 1 second wait before the get
One symptom we saw was the queue depth on the replyTo queue on the server was deeper than normal.
For the reply coming back from the server, I believe there is one thread getting from the queue.
When the reply to queue is not shared
The application thread has sent the request off and is now waiting. The getting thread does an MQ destructive get with wait. When the message arrives, it looks at the content, decides which application thread is waiting for the reply, and wakes it up.
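This single-getter pattern can be sketched in plain Java. This is a toy model, not the actual implementation: a `BlockingQueue` stands in for the MQ reply queue, and correlation ids are plain strings.

```java
import java.util.Map;
import java.util.concurrent.*;

// Toy model of the unshared-reply-queue pattern: one getter thread does a
// destructive get-with-wait and wakes whichever application thread was
// waiting for that correlation id.
class ReplyDispatcher {
    private final BlockingQueue<String[]> replyQueue = new LinkedBlockingQueue<>();
    private final Map<String, CompletableFuture<String>> waiters = new ConcurrentHashMap<>();

    // Application thread: register interest in a correlation id, then
    // block on the returned future until the getter thread completes it.
    CompletableFuture<String> expectReply(String correlId) {
        CompletableFuture<String> f = new CompletableFuture<>();
        waiters.put(correlId, f);
        return f;
    }

    // The backend server putting a reply message on the queue.
    void serverReplies(String correlId, String body) {
        replyQueue.add(new String[] { correlId, body });
    }

    // Getter thread: one destructive get with wait, then wake the waiter.
    void runGetterOnce() {
        try {
            String[] msg = replyQueue.poll(1, TimeUnit.SECONDS); // like MQGET with wait
            if (msg != null) {
                CompletableFuture<String> f = waiters.remove(msg[0]);
                if (f != null) f.complete(msg[1]);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The key point is that the getter thread can take any message off the queue, because every message on the queue belongs to this instance.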
When the reply queue is shared between instances
For example, you have configured two instances for availability. The above logic cannot be used.
Instance1 cannot destructively get a message, because the message could be destined for instance2. Similarly, instance2 cannot destructively get the message, because it could be destined for instance1.
One way to solve this would be to do a get-next browse of each message and, if it is for this instance, do a destructive get of the message under the cursor. This works great for MQ, but not for other message providers which do not have this capability.
The implementation used is polling!
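A toy sketch of browse-then-get-under-cursor, using an iterator over a list in place of MQ's browse cursor (in MQ terms, MQGMO_BROWSE_NEXT followed by MQGMO_MSG_UNDER_CURSOR):

```java
import java.util.*;

// Toy model: each message is a [destinationInstance, body] pair.
class BrowseGetter {
    // Returns the bodies of the messages destined for this instance,
    // destructively removing only those; other instances' messages stay put.
    static List<String> getMine(List<String[]> queue, String instanceId) {
        List<String> mine = new ArrayList<>();
        for (Iterator<String[]> it = queue.iterator(); it.hasNext(); ) {
            String[] msg = it.next();            // browse next message
            if (msg[0].equals(instanceId)) {     // is it ours?
                mine.add(msg[1]);
                it.remove();                     // get the message under the cursor
            }
        }
        return mine;
    }
}
```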
If there are three application tasks waiting for replies – reply1, reply2, reply3 – the logic is as follows
- For each reply id, use an MQ message selector to try to get reply1, reply2, reply3. This is not a get by messageId or correlId – for which the queue manager has indexes to locate the message quickly – it is a get with a message selector, which means looking at every message on the queue.
- For any messages found – pass them to the waiting applications
- Do an MQGET with a message selector which does not exist, with a wait time of receiveTimeout (default 1 second). Every message is scanned looking for the selector string, which is never found.
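The polling cycle above can be modelled in a few lines of plain Java. Again a toy sketch, not the real implementation: the "queue" is a list of (selector, body) pairs, and a selector get is a linear scan, as it is in MQ here.

```java
import java.util.*;

// Toy model of the polling logic used for a shared reply queue.
class PollingGetter {
    final List<String[]> queue = new ArrayList<>();   // [selector, body] pairs

    void put(String selector, String body) {
        queue.add(new String[] { selector, body });
    }

    // An MQGET with a message selector: inspect every message until a match.
    String getBySelector(String selector) {
        for (Iterator<String[]> it = queue.iterator(); it.hasNext(); ) {
            String[] m = it.next();
            if (m[0].equals(selector)) { it.remove(); return m[1]; }
        }
        return null;                                  // nothing matched
    }

    // One polling cycle: try each outstanding reply id, collect what arrived.
    Map<String, String> pollOnce(List<String> outstanding) {
        Map<String, String> delivered = new LinkedHashMap<>();
        for (String id : outstanding) {
            String body = getBySelector(id);
            if (body != null) delivered.put(id, body);
        }
        // The real code now does an MQGET with a non-existent selector and a
        // receiveTimeout wait; in this toy model that would be a 1-second sleep.
        return delivered;
    }
}
```

Note that every outstanding reply id costs a full scan of the queue per cycle – which is why this approach degrades as the queue gets deeper.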
Looking at a timeline – in seconds and tenths of seconds:
0.1 send request1, wait for reply
0.2 getting task does an MQGET wait for 1 second with a non-existent selector
0.2 send request2, wait for reply
0.3 reply1 comes back
0.4 send request3, wait for reply
0.5 reply2 comes back
0.6 reply3 comes back
1.2 getting task's 1 second wait times out
1.2 getting task gets reply1 and posts task1
1.2 getting task gets reply2 and posts task2
1.3 getting task gets reply3 and posts task3
So although reply1 was back within 0.2 seconds (0.3 – 0.1), it was not got until time 1.2. The message had been waiting on the queue for 0.9 seconds.
- Total wait time for reply1 was 1.1 seconds.
- Total wait time for reply2 was 1.2 – 0.2 = 1.0 seconds
- Total wait time for reply3 was 1.3 – 0.4 = 0.9 seconds
Wow – what a response time killer!
You can tune this time by specifying receiveTimeout. If you make it smaller, the wait time will be shorter, so messages will be processed faster, but the CPU cost will go up as more empty gets are being done.
This solution does not scale.
You have had a slowdown, and there are now 1000 messages on this queue. (990 of these are unprocessed replies whose waiting tasks timed out. There is no task waiting for them – they never get processed – nor do they expire!)
- MQGET for reply1. This scans all 1000 messages – looking in each message for the message with the matching selector. This takes 0.2 seconds.
- MQGET for reply2. This scans all 1000 messages – looking in each message for the message with the matching selector. This takes 0.2 seconds.
- You have 10 threads waiting for messages, so each message has to wait for typically 10 * 0.2 seconds = 2 seconds a message!
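As a back-of-envelope check of that figure:

```java
// Rough model of the cost described above: every waiting thread's selector
// get rescans the whole queue, so a message typically waits for all of
// those scans before the one get that matches it.
class PollCost {
    static double perMessageDelaySeconds(int waitingThreads, double scanSeconds) {
        return waitingThreads * scanSeconds;
    }
    // e.g. perMessageDelaySeconds(10, 0.2) gives the ~2 seconds quoted above
}
```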
What can you do about it?
See the Camel documentation Request-Reply over JMS, parameters,
concurrentConsumers, and Request-Reply over JMS Using an Exclusive Fixed Reply Queue
- Avoid sharing the queue. Give each instance its own reply queue, and set the Exclusive flag.
- Tune the receiveTimeout – making it a shorter time interval can increase the CPU cost, as you are doing more empty gets. You might want to set it to the 95th percentile of the time between the send to the server and the reply coming back. So if the average time is 40 ms, set it to 60 ms or 80 ms.
- If you are going to share the queue, make sure you clean out old messages from the queue – for example use expiry, or have a process which periodically scans the queue and moves messages older than 1 minute.
- Did I mention: avoid sharing the queue?
- If you get into a vicious spiral where the response time gets longer and longer, and the reply queue from the server gets deeper and deeper – be brave and purge the server's reply queue.
- Avoid sharing the queue.
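Putting the first two suggestions together, here is a hedged sketch of the Camel endpoint options. The queue names are hypothetical; the option names (replyTo, replyToType, receiveTimeout) are from the Camel JMS documentation referenced above:

```java
// Each instance gets its own reply queue, marked Exclusive so Camel can use
// destructive gets instead of selector polling. receiveTimeout is tuned down
// from the 1000 ms default toward the 95th percentile round-trip time.
// Queue names are made up for illustration.
to("jms:queue:BACKEND.REQUEST"
    + "?replyTo=INSTANCE1.REPLY"       // this instance's own reply queue
    + "&replyToType=Exclusive"         // no other consumer shares it
    + "&receiveTimeout=100");          // ms; the default is 1000
```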