No, No, think before you create a naming convention

I remember doing a review of a large customer who had grown by mergers and acquisitions.  We were discussing naming conventions, and whether they had any.

“Naming conventions”, he said, “we love them.  We have hundreds of them around the place”. He said it was too hard and disruptive to try to get it down to a small number of naming conventions.

I saw someone’s MQ configuration and wished they had thought through their naming convention, or asked someone with more experience.  This is what I saw

  • The MQ libraries were called CSQ910.SCSQAUTH
    • This is OK as it tells you what level of MQ you are using
    • It would be good to have a data set alias of CSQ pointing to CSQ910.  Without this you have to change the JCL for every job (compiles, runs, etc.) which refers to CSQ910.  When you moved from CSQ810 to CSQ900 you had to change the JCL. If you then decide to go back to CSQ810 for a week, you have to change the JCL again.  With the alias it is easy: change the alias and the JCL does not need to change.  Change the alias again, and the JCL still does not need to change.
  • The MQ logs were called CSQ710.QM.LOGCOPY1.DS01, … DS02,…DS03
    • This shows the classic problem of having the queue manager release as part of the object names.  It would have been better to have names like CSQ.QM.LOGCOPY1.DS01 without the MQ version in it.
    • The name does include a queue manager name of sorts, but a queue manager name of QM is not very good.  If you need another queue manager you will end up with names like QM, QMA, QMB, which is inconsistent.
    • It is good to have the queue manager name as part of the data set name, so if the queue manager was QM01 then have CSQ.QM01.
    • This shows how the naming standard problem evolved over time.  They added more page sets, and used the MQ release as the High Level Qualifier.  The page sets are CSQ710…, CSQ810…, CSQ910…, following the naming standard.

You do not invent a naming convention in isolation.  You need to put an architect’s hat on and see the bigger picture, where you have production and test queue managers and different versions of MQ, and where MQ is just a small part of the z/OS infrastructure.

  • People often have one queue manager per LPAR, and name the queue manager after the LPAR.
  • You are likely to have multiple machines, for example to provide availability, so plan for multiple queue managers.
  • You may want different HLQs to be able to identify production queue manager data sets and test queue manager data sets.
  • The security team will need to set up profiles for queue managers. Having MQPROD and MQTEST as HLQs may make these easier to set up.
  • The storage team (what I used to call data managers) set up SMS with rules for data set placement. For example production page sets with a name like MQPROD.**.PSID* go on the newest, fastest, mirrored disks.  MQTEST.** go on older disks.
  • As part of the SMS definitions, the storage team define how often, and when, to back up data sets.   A production page set may be backed up using flash copy once an hour.   (This is done within the storage subsystem and takes seconds.   It makes the copy by copying the pointers to the records on disk.)   Non production data sets get backed up overnight.


Lessons learned

  • For the IBM provided libraries, include the VRM in the data set names.
  • Define an alias pointing to the current libraries so applications do not need to change JCL.   You could have a Unix Services alias for the files in the zFS.
  • Do not put the MQ release in the queue manager data set names.
  • Use queue manager names that are relevant and scale.
  • Talk to your security and storage managers about the naming conventions; what you want protected, and how you want your queue manager data sets to be managed.
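The lessons above can be captured in a small helper that builds queue manager data set names from an environment HLQ and a queue manager name, with no MQ release in the name. This is only a sketch: the function name `dataset_name` is hypothetical, and the MQPROD/QM01 values follow the examples used in this post.

```python
def dataset_name(env_hlq, qmgr, *qualifiers):
    """Build a data set name such as MQPROD.QM01.LOGCOPY1.DS01.

    env_hlq identifies production or test (for example MQPROD, MQTEST),
    qmgr is the queue manager name. No MQ release appears in the name.
    """
    parts = [p.upper() for p in (env_hlq, qmgr, *qualifiers)]
    for part in parts:
        # z/OS qualifiers are 1 to 8 characters and cannot start with a digit.
        if not 1 <= len(part) <= 8 or part[0].isdigit():
            raise ValueError(f"bad qualifier: {part!r}")
    name = ".".join(parts)
    # The whole data set name is limited to 44 characters.
    if len(name) > 44:
        raise ValueError("data set name longer than 44 characters")
    return name


print(dataset_name("MQPROD", "QM01", "LOGCOPY1", "DS01"))
print(dataset_name("MQTEST", "QM02", "PSID00"))
```

Because the HLQ carries the production/test split, generic security profiles and SMS rules such as MQPROD.** keep working when a new queue manager is added.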

NETTIME does not just mean net time

I saw a post which mentioned NETTIME, where people assumed it is the network time.   It is more subtle than that.

If NETTIME is small then don't worry.   If NETTIME is large it can mean

  • Slow network times
  • Problems processing messages at the remote end

Consider a well behaved channel where there are two messages on the transmission queue.

Sender end

  • MQGET first message from the queue
  • TCP Send message in buffer 1
  • MQGET second message from the queue
  • TCP Send message in buffer 2
  • MQGET – no message found
  • Do end of batch processing
    • TCP Send “End of batch” in buffer 3
    • Start timer
    • Wait
    • buffer arrives, wake up application
    • Stop timer. Interval is “Sender wait time”
  • Extract “receiver processing time” from reply buffer
  • Calculate NETTIME = “sender wait time” – “receiver processing time”

Receiver end

  • buffer 1 arrives, wake up Receiver channel application
  • buffer 2 arrives
  • TCP receive buffer 1 from network
  • MQPUT message 1 to the queue
  • buffer 3 arrives
  • TCP receive buffer 2 from network
  • MQPUT message 2 to the queue
  • TCP receive buffer 3 from the network – get “end of batch” flag
    • Start timer
    • Issue commit
    • Stop timer
    • Send “OK” + time interval back to Sender

In this case the NETTIME is the total time less the time at the receiver end.  So NETTIME is the network time.

In round numbers

  • it takes 2 milliseconds from sending the data to it being received
  • get + send takes 0 ms (the duration of these calls is measured in microseconds)
  • receive (when there is data) + a successful MQPUT takes 0 ms
  • commit takes 10 ms
  • it takes 1 ms between sending the response and it arriving.
  • “10 ms” is sent in the response message

This is a simplified picture with details omitted.

The sender channel sees 13 ms between the last send and getting the response.  13 ms – 10 ms is 3 ms, the time spent in the network.


Now introduce a queue full situation at the receiver end

Sender end

  • MQGET first message from the queue
  • TCP Send message in buffer 1
  • MQGET second message from the queue
  • TCP Send message in buffer 2
  • MQGET – no message found
  • Do end of batch processing
    • TCP Send “End of batch” in buffer 3
    • Start timer
    • Wait
    • buffer arrives, wake up application
    • Stop timer. Interval is “Sender wait time”
  • Extract “receiver processing time” from reply buffer
  • Calculate NETTIME = “sender wait time” – “receiver processing time”

Receiver end

  • buffer 1 arrives, wake up Receiver channel application
  • buffer 2 arrives
  • TCP receive buffer 1 from network
  • MQPUT message 1 to the queue – it gets queue full, it pauses
  • buffer 3 arrives.  All of the data is in the buffers in TCP at this end.
  • after 500 ms the MQPUT succeeds.
  • TCP receive buffer 2 from network
  • MQPUT message 2 to the queue
  • TCP receive buffer 3 from the network – get “end of batch” flag
    • Start timer
    • Issue commit
    • Stop timer
    • Send “OK” + time interval back to Sender

In round numbers

  • it takes 2 milliseconds from sending the data to it being received
  • get + send takes 0 ms ( it is in microseconds)
  • receive (when there is data) takes 0 ms
  • the pause and retry took 500 ms
  • the second receive and MQPUT takes 0 ms
  • commit takes 10 ms
  • it takes 1 ms between sending the response and it arriving.
  • “10 ms” is sent (as before) in the response message (the time between the channel code seeing the “end of batch” flag and the end of its processing)
  • Buffer 3 with the “end of batch” flag was sitting in the TCP buffers for 500 ms

The sender channel sees 513 ms between the last send and getting the response.  513 ms – 10 ms is 503 ms, and it reports this as “the time spent in the network” when in fact the network time was 3 ms, and 500 ms was spent waiting to put the message.
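The arithmetic for the two scenarios above can be written out as a small sketch. The function and parameter names are my own labels for the round numbers used in this post; they are not MQ field names.

```python
def sender_wait_ms(outbound_ms, queued_in_tcp_ms, commit_ms, reply_ms):
    """Time the sender sees between sending "end of batch" and the reply.

    queued_in_tcp_ms is how long the "end of batch" buffer sat unread in
    the receiver's TCP buffers (0 for a well behaved channel).
    """
    return outbound_ms + queued_in_tcp_ms + commit_ms + reply_ms


def nettime_ms(sender_wait, receiver_processing):
    """NETTIME = sender wait time - receiver processing time."""
    return sender_wait - receiver_processing


# Well behaved channel: 2 ms out, 10 ms commit, 1 ms back.
normal = sender_wait_ms(2, 0, 10, 1)
print(normal, nettime_ms(normal, 10))          # 13 3

# Queue full: the end of batch buffer waits 500 ms before it is read.
queue_full = sender_wait_ms(2, 500, 10, 1)
print(queue_full, nettime_ms(queue_full, 10))  # 513 503
```

The receiver only times the work it does after it has read the “end of batch” flag, so any delay before that point is attributed to the network.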

Regardless of the root cause of the problem, a large NETTIME should be investigated:

  • do a TCP ping to do a quick check of the network
  • check the error logs at the receiver end
  • check events etc to see if the queues are filling up at the receiver end

Using Activity Trace to show a picture of which queues and queue managers your application used.

I used the midrange MQ activity trace to show what my simple application, putting a request to a cluster server queue and getting the reply, was doing.  As a proof of concept (200 lines of Python), I  produced the following

This output is a .png format.   You can create it as an HTML image, and have the nodes and links as clickable html links.

I've ignored any SYSTEM.* queues, so the SYSTEM.CLUSTER.TRANSMIT.QUEUE does not appear.

The red arrows show the “high level” flow between queue managers at the “architectural”, hand waving level.

  • The application oemput on QMA did a put to a clustered queue CSERVER; there is an instance of the queue on QMB and on QMC.   There is a red line from QMA.oemput to the queue CSERVER on QMB and QMC.
  • The server program, server, running on QMB and QMC, put the reply message to queue CP0000 on queue manager QMA.

The blue arrows show puts to the application specified queue name – even though this may map to the S.C.T.Q.  There are two blue lines from QMA.oemput because one message went to QMC.CSERVER, and another went to QMB.CSERVER

The yellow lines show the path the message took between queue managers.  The message was put by QMA.oemput to queue CSERVER; under the covers this was put to the SCTQ.  The trace record shows the remote queue manager and queue name: the yellow line links them.

The black line is the get from the local queue.

The green line is the get from the underlying queue.  So if I had a reply queue called CP0000, with a QAlias of QAREPLY, and the application did a get from QAREPLY, there would be a black line to CP0000, and a green line to QAREPLY.

How did I get this?

I used the midrange activity trace.

On QMA I had in mqat.ini

ApplClass=USER # Application type
ApplName=oemp* # Application name (may be wildcarded)
Trace=ON # Activity trace switch for application
ActivityInterval=30 # Time interval between trace messages
ActivityCount=10 # Number of operations between trace msgs
TraceLevel=MEDIUM  # Amount of data traced for each operation
TraceMessageData=0 # Amount of message data traced

I turned on the activity trace using the runmqsc command


I ran some workload, and turned the trace off a few seconds later.

I processed the trace data into a json file using

/opt/mqm/samp/bin/amqsevt -m QMA -q SYSTEM.ADMIN.TRACE.ACTIVITY.QUEUE -w 1 -o json > aa.json

I captured the trace on QMB, then on QMC, so I had three files aa.json, bb.json, cc.json.  Although I captured these at different times, I could have collected them all at the same time.

jq is a “sed” like processor for processing json data.   I used it to process these json files and produce one output file which the Python json support can handle.

jq . --slurp aa.json bb.json cc.json  > all.json

The two small python files are zipped here. AT.

I used a Python script to process the all.json file and extract the key data in the following format:


  • server, the name of the application program
  • COLIN, the channel name, or “Local”
  •, the IP address, or “Local”
  • QMC, on this queue manager
  • Put1, the verb
  • CP0000, the name of the object used by the application
  • SYSTEM.CLUSTER.TRANSMIT.QUEUE, the queue actually used, under the covers
  • QMC, which queue manager is the SCTQ on
  • CP0000, the destination (remote) queue name
  • QMA, the destination queue manager
  • 400, the number of times this was used, so 400 puts to this queue.

I had another Python program which took this table and used Python graphviz to draw the graph of the contents.  This produces a file of DOT (graph description language) statements, and uses one of the many programs available to draw the chart.
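As an illustration of that last step, here is a minimal sketch which turns rows in the format above into DOT text using plain string handling. It is not the code in the zip file. The function name `rows_to_dot` is hypothetical, it only draws the blue “put” arrows, and the IP address in the sample row is a made-up documentation address.

```python
def rows_to_dot(rows):
    """Turn extracted trace rows into DOT text, one blue edge per put."""
    lines = ["digraph mq {"]
    for (program, channel, ip, qmgr, verb, obj,
         real_queue, real_qmgr, dest_queue, dest_qmgr, count) in rows:
        if not verb.startswith("Put"):
            continue  # gets would be drawn with other colours
        source = f"{qmgr}.{program}"
        target = f"{dest_qmgr}.{dest_queue}"
        lines.append(
            f'  "{source}" -> "{target}" [color=blue, label="{count}"];'
        )
    lines.append("}")
    return "\n".join(lines)


# Sample row in the field order listed above (IP address is hypothetical).
sample = [
    ("server", "COLIN", "192.0.2.1", "QMC", "Put1", "CP0000",
     "SYSTEM.CLUSTER.TRANSMIT.QUEUE", "QMC", "CP0000", "QMA", 400),
]
print(rows_to_dot(sample))
```

The output can be saved to a file and passed to a DOT renderer, for example `dot -Tpng`, to produce the picture.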

This shows you what can be done.  It is not a turn-key solution, but I am willing to spend a bit of time making it easier to use, so you can automate it.  If you are interested, please send me your Activity Trace data, and I’ll see what I can do.

When is mid-range accounting information produced?

I was using the mid-range accounting information to draw graphs of usage, and I found I was missing some data.

There is a “Collect Accounting” time for every queue, set every ACCTINT seconds (default 1800 seconds = 30 minutes).  After this time, any MQ activity will cause the accounting record to be produced.  This does not mean you get records every half hour as you do on z/OS; it means that for long running tasks you get records at least 30 minutes apart.


I had a server which got from its input queue and put a reply message to the reply-to-queue.

Once a minute an application started, put messages to this server, got the replies and ended.

When are the records produced?

Accounting data is produced (if collecting is enabled) when:

  • an MQDISC is issued, either explicitly or implicitly
  • for long running tasks the accounting record(s) seem to be produced when the current time is past the “Collect Accounting” time and there has been some MQ activity. For example there were accounting records for a server at the following times
    • The queue manager was started at 12:35:51, and the server started soon afterwards
    • 12:36:04 to 13:06:33.   An application put a message to the server queue and got the response back.   This is 27 seconds after the half hour
    • 13:06:33 to 13:36:42  The application had been putting messages to the server and getting the responses back.   This is 6 seconds after the half hour
    • 13:36:42 to 14:29:48.  This interval is 53 minutes.  The server did no work from 14:00 to 14:29:48 (as I was having my lunch).  At 14:29:48 a message arrived, and the accounting record was written for the server.
    • 14:29:48 to 15:00:27 during this time messages were being processed, the interval is just over the 30 minutes.
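The behaviour I observed can be modelled with a short sketch. This is my reading of what the queue manager appears to do, not a description of its actual implementation. Times are in seconds since the task started, and the function name is hypothetical.

```python
def accounting_records(start, activity_times, acctint=1800):
    """Return (interval_start, interval_end) pairs for a long running task.

    Assumed model: a record is written at the first MQ activity at or
    after the "Collect Accounting" time, and the next collect time is
    then set acctint seconds after that activity.
    """
    records = []
    interval_start = start
    collect_time = start + acctint
    for t in sorted(activity_times):
        if t >= collect_time:
            records.append((interval_start, t))
            interval_start = t
            collect_time = t + acctint
    return records


# A message every minute for an hour, a quiet period, then one message.
activity = list(range(60, 3601, 60)) + [6000]
for begin, end in accounting_records(0, activity):
    print(begin, end, (end - begin) // 60, "minutes")
```

In this model the quiet period stretches the last interval to 40 minutes, in the same way that my lunch break stretched one of the intervals above.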

What does this mean?

  • If you want accounting data with an interval “on the half hour”, you need to start your queue manager “just before the half hour”.
  • Data may not be in the time period you expect.  If you have an accounting record produced at 16:45, the data collected between 16:45 and 17:14 may not appear until the first message is processed the next day. The record would have an interval from 16:45 to 09:00:01 the next day.  You may not want to display averages if the interval is longer than 45 minutes.
  • You may want to stop and restart the servers every night to have the accounting data in the correct day.


Stackoverflow: What throughput can a standalone Java program achieve?

There was a question on the MQ section on StackOverflow

I have a standalone multi threaded java application which listen messages from IBM MQ.
Current system take around 500ms for processing of 1 message after it read from queue and till it commit.
I want to know how many messages I can consume

  • Concurrently:
  • Max number of messages can be processed? or throttle limit

A good meaty performance question I thought.  Let me break this into pieces.

Current system take around 500ms for processing of 1 message after it read from queue and till it commit.

Processing one message and a commit should take about 10 milliseconds or less (say 30 ms for a two phase commit).    There is clearly something else going on.  Fix this first.  Possible causes include:

  1. A long database call.   This could be due to database locking, or a badly designed statement, for example a query which needs to access thousands or millions of rows.
  2. A request to a server far far away
  3. A file system with the speed of writing an illuminated letter to parchment

How many messages I can consume: Concurrently:

Take the worst case of using persistent messages, which require log IO during commit.

For one thread, processing multiple messages before doing a commit means the thread can do more work.  Consider a get taking 1 millisecond, and a commit taking 10 ms. This is one message processed every 11 ms.  If you did 50 gets, taking 50 ms, and a commit taking 10 ms, this is 50 messages in 50 + 10 ms, which equates to one message every 1.2 milliseconds, almost 10 times faster.    This is how channels can send messages efficiently.   There is a “sweet spot” of messages per commit to give you maximum data processed per second.   This depends on the message size, logging rates and other factors.  For a 100MB message it is one message per commit.  For 10KB messages, this may be 1000 messages per commit.
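The calculation above, as a sketch you can play with. The 1 ms get and 10 ms commit are the round numbers from this post, not measurements, and the function name is my own.

```python
def ms_per_message(batch_size, get_ms=1.0, commit_ms=10.0):
    """Average elapsed time per message when committing once per batch."""
    return (batch_size * get_ms + commit_ms) / batch_size


one_per_commit = ms_per_message(1)     # 11 ms per message
fifty_per_commit = ms_per_message(50)  # 60 ms for 50 messages

print(one_per_commit, fifty_per_commit)
print("speed up:", round(one_per_commit / fifty_per_commit, 1))
```

As the batch size grows the cost per message approaches the cost of the get alone, so the gain from ever larger batches tails off.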

This may be selfish

This is clearly a great improvement, but possibly selfish.  If the application logic is a get followed by a database insert, followed by a commit, then doing 50 gets, 50 inserts and a commit, will work much faster.  The down side is that the database requests will keep locks until the commit.  These locks may prevent other applications from accessing data, either the recently inserted  records, page locks, or index locks. So overall MQ throughput goes up – but the business transaction suffers.    You need to understand the database and find the optimum number of requests per commit for your business transaction.

How long before the data is visible?

Rather than have one thread process 1000 messages per commit (taking 1010 ms) you may want to have multiple threads processing 10 messages per commit – taking 20 ms.  This means that the data in the database (or replies etc) are visible earlier.    This may be important to your business transaction if you have to worry about response time.
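Using the same round numbers (1 ms per get, 10 ms per commit), this sketch compares the two approaches. It assumes the threads do not delay each other, which is optimistic, and the function names are hypothetical.

```python
import math


def batch_ms(batch_size, get_ms=1.0, commit_ms=10.0):
    """Elapsed time before the work in one batch is committed and visible."""
    return batch_size * get_ms + commit_ms


def threads_to_match(big_batch, small_batch):
    """Threads of small batches needed to match one thread of big batches."""
    big_rate = big_batch / batch_ms(big_batch)        # messages per ms
    small_rate = small_batch / batch_ms(small_batch)  # messages per ms
    return math.ceil(big_rate / small_rate)


print(batch_ms(1000))              # time until 1000 messages are visible
print(batch_ms(10))                # time until 10 messages are visible
print(threads_to_match(1000, 10))  # threads needed for similar throughput
```

With these numbers two threads doing 10 messages per commit get close to the throughput of one thread doing 1000 messages per commit, and the data is visible after 20 ms instead of 1010 ms.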

Parallel  threads

  1. Using more threads should improve throughput, unless this is delayed by external factors – such as database locks.
  2. One customer found one thread was optimum because there were no database delays.

How many messages I can consume: Max number of messages can be processed? or throttle limit

There are papers written on this but here is a one minute overview

As fast as the queue manager can process data

  1. The rate at which MQ can write its logs
  2. Keep queue data in memory – ( buffer pools on z/OS, queue buffer on midrange), so few messages on the queue.


  1. Having parallel threads gives you better throughput than one thread.  You get overlapped writing to the log, the units of work are shorter in duration, you can get parallel IO.
  2. You may be limited by the network.   Having multiple threads from an application means the network can be better utilized.  One thread can be receiving data down the wire, while another thread is waiting in commit.
  3. You may be limited by where your programs run – eg short of CPU, or slow IO (for your System.out.println statements)

Application design

  1. You may get delays due to serialization if all threads are using the same queue.
  2. Remove the debug printf or System.out.println statements.
  3. Using a queue per business application is better than all applications sharing the same queue
  4. Using one reply to queue per web server may be better than a shared reply to queue – especially if you use Apache Camel.
  5. Use get first if possible.  Avoid scans of the queues.


The short answer….

You should be able to get thousands of 1KB messages a second through your Java application when using multiple threads.


What’s the difference between an MQ Message and a JMS Message

I had problems using the MQI to create a message for a JMS program to receive.

To see what was in the JMS message,  I used a Java program using JMS to write a message, and used my trusty C program to display it.

I could see that there were message properties in the message

Property 0 name <mcd.Msd> value <jms_text>
Property 1 name <jms.Dst> value <queue:///JMSQ1>
Property 2 name <jms.Rto> value <queue:///JMSQ2>
Property 3 name <jms.Tms> value <1571902099742>
Property 4 name <jms.Dlv> value <2>

These are described here.

The mcd.Msd value is one of jms_none, jms_text, jms_bytes, jms_map, jms_stream, jms_object.   This depends on whether you use Message message, BytesMessage message etc. to define your message type.  The JMS program receiving the message may be expecting a particular type.

The jms.Rto comes from the message.setJMSReplyTo(…).  It was set in the MQMD.ReplyToQ  as well as the message property.

It took me some time to find how to specify value such as for deliveryMode.  I found it here.  For example  message.setDeliveryMode(DeliveryMode.NON_PERSISTENT).   (This comes from javax.jms.DeliveryMode.NON_PERSISTENT,not a…. file).

I converted my simple program from JMQI to JMS, in a couple of hours, and was surprised to find it used fewer lines of code than using the JMQI.   Of course I may find I omitted some work, such as error handling, but it seems to be working OK.

Magic methods to decode Java MQ constants to strings.

I had been struggling with MQ and Java, and with decoding what the return code numbers meant, and found some gems of methods here.

String reasonCode = MQConstants.lookup(2035, "MQRC_.*");  gave MQRC_NOT_AUTHORIZED


String decode = MQConstants.decodeOptions(gmo.options, "MQGMO_.*");  gave me


I wish I had these a couple of years ago – it would have saved me a lot of time!


The methods are

static java.lang.String decodeOptions(int optionsP, java.lang.String optionPattern)
This helper method takes an integer representing a set of IBM MQ options for an MQI structure, and converts them into a string displaying the constants that the options represent.

static int getIntValue(java.lang.String name)
Returns the value of the named MQSeries constant as an int.

static java.lang.Object getValue(java.lang.String name)
Returns the value of the named MQSeries constant.

static java.lang.String lookup(int value, java.lang.String filter)
Returns the MQSeries constant name or names for the supplied int value.

static java.lang.String lookup(java.lang.Object value, java.lang.String filter)
Returns the MQSeries constant name or names for the supplied value of type Integer, String, byte[], or char[].

static java.lang.String lookupCompCode(int reason)
Convenience method for finding the constant name for a completion code.

static java.lang.String lookupReasonCode(int reason)
Convenience method for finding the constant name for a reason code.

static void main(java.lang.String[] args)

How do I get a client to disconnect?

I had a question from a customer who asked how they can reduce the number of client connections in use.  They had tried setting a disconnect interval (DISCINT) on the channel, but the connections were like weeds – you kill them off, and they grow back again.

DISCINT is “the length of time after which a channel closes down, if no message arrives during that period”.  This sounds perfect for most people.   The application is in an MQGET, and if no messages arrive, the channel can be disconnected, and the application gets connection broken.   The application can then decide to disconnect or reconnect.
If the application is not in an MQGET, then it will get notified of the broken connection next time it tries to use MQ.

Independent applications

Many applications are well written in that when they get Connection Broken, they just reconnect again, and so the DISCINT has no effect on reducing the number of connections. This may be good for availability but not for resource usage.   It may be good to have 1000 application instances running during the day, but perhaps not overnight when there is no work to do.   I've seen instances where the applications do an MQGET every minute, and with 1000 instances this can use a lot of CPU while doing no useful work.  In this case you want unused application instances to stop, and be restarted when needed.

You cannot use triggering with client connections (unless you have a very smart trigger monitor to produce an event which says start a client program over there).

Use automation to periodically check the queue depth and the number of input handles. If there is a high queue depth, or a low number of handles (e.g. 2), then start more application instances across your back-end servers.  Your applications can then disconnect if they have not received a message within, say, 10 minutes.  This should keep the right number of application instances active.
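The decision logic could look something like the sketch below. The thresholds and function names are hypothetical and need tuning for your workload. How you obtain the queue depth and input handle count (for example from DISPLAY QSTATUS) and how you start instances is left to your automation product.

```python
def instances_to_start(queue_depth, input_handles,
                       depth_threshold=1000, min_handles=2, step=5):
    """Decide how many extra application instances the automation starts.

    All thresholds here are illustrative values, not recommendations.
    """
    if queue_depth >= depth_threshold or input_handles <= min_handles:
        return step
    return 0


def should_disconnect(idle_seconds, idle_limit_seconds=600):
    """Application side: give up after 10 minutes without a message."""
    return idle_seconds >= idle_limit_seconds


print(instances_to_start(queue_depth=5000, input_handles=20))  # backlog
print(instances_to_start(queue_depth=0, input_handles=2))      # few getters
print(instances_to_start(queue_depth=10, input_handles=20))    # healthy
print(should_disconnect(idle_seconds=601))
```

The two halves work together: the automation adds instances when there is a backlog, and idle instances remove themselves.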

An administrator should be able to get this automation set up, but getting the application to disconnect could be a challenge, as this requires the application developer to change the code!

Running under a web server

If your applications are running under a web server you may have misconfigured connection pools.  You can specify the initial size of the pool, and this many connections are made.  As more connections are needed, more can be added to the pool until the pool maximum is reached. You should specify a timeout value, so that periodically the pool gets cleaned up and unused connections are removed, until the pool is back to the initial size.  You should review the initial size of the pools (is it too large?) and the timeout value.

This should just be an administrative change.

Good luck, you may be successful in reducing the number of client connections, but do not set your hopes too high.

How do I make my MDB transactional?

I found from the application trace  that my MDB was doing MQGET, MQCMIT in the listener, and MQOPEN, MQPUT, MQCLOSE and no MQCMIT in my application.    Digging into this I found that the MQPUT was NO_SYNCPOINT, which was a surprise to me!

My application had session = connection.createSession(true, 1); // true = transactional. So I expected it to work.

The ejb-jar.xml had

    <transaction-type>Container</transaction-type>
    <trans-attribute>NotSupported</trans-attribute>

I changed NotSupported to Required and it worked.


The application trace for the Listener part of the MDB gave me

Operation      CompCode MQRC HObj (ObjName) 
MQXF_XASTART            0000 -
MQXF_GET       MQCC_OK  0000    2 (JMSQ2 )
MQXF_XAEND              0000 -
MQXF_XAPREPARE          0000 -
MQXF_XACOMMIT           0000 -

The trace for the application part of the MDB gave me

Operation      CompCode MQRC HObj (ObjName)
MQXF_XASTART            0000 -
MQXF_OPEN      MQCC_OK  0000    2 (CP0000 )
MQXF_PUT       MQCC_OK  0000    2 (CP0000 )
MQXF_CLOSE     MQCC_OK  0000    2 (CP0000 )
MQXF_XAEND              0000 -
MQXF_XAPREPARE          0000 -
MQXF_XACOMMIT           0000 -

and the put options had _SYNCPOINT.

I had read documentation saying that you needed to have XAConnectionFactory instead of ConnectionFactory.  I could not get this to work, but found it was not needed for JMS; it may be needed for JDBC.

On WebLogic why isn't my MDB scaling past 10 instances?

This is another tale of one step back,  two steps sideways.  I was trying to understand why the JMX data on the MDBs was not as I expected, and why I was not getting tasks waiting.  I am still working on that little problem, but in passing I found I could not get my MDBs to scale.  I have rewritten parts of this post multiple times, as I understand more of it.  I believe the concepts are correct, but the implementation may be different to what I have described.

There are three parts to an MDB.

  1. A thread gets a message from the queue
  2. The message is passed to the “OnMessage()” method of the application
  3. The application uses a connection factory to get a connection to the queue manager and to send the reply.

Expanding this to provide more details.

Thread pools are used to reuse the MQ connection, as the MQCONN and MQDISC are expensive operations.  By using a thread pool, the repeated MQCONN and MQDISC can be avoided.

There is a specific pool for the application, and when threads are released from this pool, they are put into a general pool.   Periodically  threads can be removed from the general pool, by issuing MQDISC, and then deleting the thread.

Get the message from the queue

The thread has two modes of operation: async consume, or plain old fashioned MQGET.

If the channel has SHARECNV(0) there is a listener thread which browses the queue, and waits a short period (for example 5 seconds) for a message.  The wait is short so that the thread can take action if required (for example stop running).  This means if there is no traffic there is an empty MQGET every 5 seconds.   This can be expensive.

If the channel has SHARECNV(>0) then async consume is used.  Internally there is a thread which browses the queue, and multiple threads which can get the message.

The maximum number of threads which can get messages is defined in the ejb-jar.xml activation-config-property-name maxPoolDepth  value.

These threads are in a pool called EJBPoolRuntime.  Each MDB has a thread pool of this name, but from the JMX data you can identify the pool, as the JMX descriptor looks like MessageDrivenEJBRuntime=WMQ_IVT_MDB, Name=WMQ_IVT_MDB, ApplicationRuntime=MDB3, Type=EJBPoolRuntime, EJBComponentRuntime=MDB3/… where my MDB was called MDB3.

The parameters are defined in the ejb-jar.xml file.   The definitions are documented here.  The example below shows how to get from a queue called JMSQ2, and there will be no more than 37 threads able to get a message.

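          <activation-config-property>
            <activation-config-property-name>maxPoolDepth</activation-config-property-name>
            <activation-config-property-value>37</activation-config-property-value>
          </activation-config-property>
          <activation-config-property>
            <activation-config-property-name>destination</activation-config-property-name>
            <activation-config-property-value>JMSQ2</activation-config-property-value>
          </activation-config-property>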

Note:  I did get warnings like the following, which I ignored (as I think they are produced in error)

    • <Warning> <EJB> <BEA-015073> <Message-Driven Bean WMQ_IVT_MDB(Application: MDB3, EJBComponent: MDB3.jar) is configured with unknown activation-config-property name maxPoolDepth>
    • <Warning> <EJB> <BEA-015073> <Message-Driven Bean WMQ_IVT_MDB(Application: MDB3, EJBComponent: MDB3.jar) is configured with unknown activation-config-property name destination>

The default value of maxPoolDepth is 10 – this explains why I only had  10 threads getting messages from the queue.

Passing the message to the application for processing.

Once a message is available it will pass it to the OnMessage method of the application. There is some weblogic specific code, which seems to add little value. The concepts of this are

  1. There is an array of handles/Beans of size max-beans-in-free-pool.
  2. When the first message is processed, create “initial-beans-in-free-pool” beans and populate the array.  Creating each bean invokes the EJBCreate() method of the application.
  3. When a message arrives, then find a free element in this array,
    1. If the slot has a bean, then use it
    2. else allocate a bean and store it in the slot.   This allocation invokes the EJBCreate() method of the application.  On my laptop it took a seconds to allocate a new bean, which means there is a lag when responding to a spike in workload.
    3. call the OnMessage() method of the application.
  4. If all of the slots are in use – then wait.
  5. On return from the OnMessage() flag the entry as free
  6. Every idle-timeout-seconds scan the array, and free beans to make the current size the same as initial-beans-in-free-pool.  As part of this the EJBRemove() method of the application is invoked.

The definitions are documented here.

      <max-beans-in-free-pool>47</max-beans-in-free-pool>
      <initial-beans-in-free-pool>17</initial-beans-in-free-pool>
      <idle-timeout-seconds>60</idle-timeout-seconds>

I could find no benefit in using this pool.

The default max-beans-in-free-pool is 1000 which feels large enough.  You should make the initial-beans-in-free-pool the same or larger than the number of threads getting messages, see maxPoolDepth above.

If this value is too small, then periodically the pool will be purged down to the initial-beans-in-free-pool and then beans will be allocated as needed.  You will get a periodic drop in throughput.

Note the term max-beans-in-free-pool is not entirely accurate: the maximum number of threads for the pool is current threads in pool + active threads.   The term max-beans-in-free-pool  is accurate when there are no threads in use.

In the JMX statistics data, there is information on this pool.   The data name is like com.bea:ServerRuntime=AdminServer2, MessageDrivenEJBRuntime=WMQ_IVT_MDB, Name=WMQ_IVT_MDB, ApplicationRuntime=MDB3, Type=EJBPoolRuntime, where WMQ_IVT_MDB is the display name of the MDB, and MDB3 is the name of the jar file.  This allows you to identify the pool for each MDB.
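
As a rough sketch of how you might pick these pools out programmatically, the standard javax.management.ObjectName class can match names by pattern and extract individual keys. The class name EjbPoolNames and its helper methods are made up for the example, and I have written the name without the spaces after the commas.

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

class EjbPoolNames {
    /** Wraps the checked exception so callers can use plain strings. */
    static ObjectName parse(String name) {
        try {
            return new ObjectName(name);
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException(name, e);
        }
    }

    /** Pattern matching any EJB pool MBean in the com.bea domain. */
    static ObjectName poolPattern() {
        return parse("com.bea:Type=EJBPoolRuntime,*");
    }

    /** Pulls out the two keys that identify the MDB and its jar file. */
    static String describe(ObjectName name) {
        return "MDB=" + name.getKeyProperty("Name")
             + " jar=" + name.getKeyProperty("ApplicationRuntime");
    }

    public static void main(String[] args) {
        ObjectName n = parse("com.bea:ServerRuntime=AdminServer2,"
            + "MessageDrivenEJBRuntime=WMQ_IVT_MDB,Name=WMQ_IVT_MDB,"
            + "ApplicationRuntime=MDB3,Type=EJBPoolRuntime");
        if (poolPattern().apply(n)) System.out.println(describe(n));
    }
}
```

If you have an MBeanServerConnection to the server, passing the same pattern to queryNames(pattern, null) should return the set of pool names, one per MDB.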

Get a connection and send the reply – the application connectionFactory pool.

The application typically needs to issue an MQCONN, an MQOPEN of the reply-to queue, put the message, and issue an MQDISC before returning.   The MQCONN and MQDISC are expensive, so a pool is used to save the queue manager connection handle between calls.  The connections are saved in a thread pool.

In the MDB Java application there is code like ConnectionFactory cf = (ConnectionFactory)ctx.lookup("CF3");

Where the connectionFactory CF3 is defined in the resource Adapter configuration.

The connectionFactory cf can then be used when putting messages.

The logic is like

  • If there is a free thread in the connectionFactory pool then use it
  • else there is no free thread in the connectionFactory pool
    • if the number of threads in the connectionFactory pool is at the maximum value, then throw an exception
    • else create a new thread (doing an MQCONN etc)
  • when the java program issues connection.close(), return the thread to the connectionFactory pool.
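
The logic above can be sketched as follows. This is only my illustration of the behaviour I observed, not the resource adapter's code: ConnectionPoolSketch and Conn are made-up names, Conn stands in for what the text calls a thread (a saved queue manager connection), and the exception type is an arbitrary choice.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ConnectionPoolSketch {
    /** Hypothetical stand-in for a pooled queue manager connection. */
    static final class Conn {
        final int id;
        Conn(int id) { this.id = id; }
    }

    private final Deque<Conn> free = new ArrayDeque<>();
    private final int maxCapacity;
    private int total = 0;     // free + in use
    private int connects = 0;  // simulated MQCONN count

    ConnectionPoolSketch(int maxCapacity) { this.maxCapacity = maxCapacity; }

    /** Reuse a free connection, or create one if below the maximum. */
    synchronized Conn get() {
        Conn c = free.pollFirst();
        if (c != null) return c;               // free entry: no MQCONN needed
        if (total >= maxCapacity) {
            throw new IllegalStateException("connection pool is at its maximum");
        }
        total++;
        connects++;                            // the expensive MQCONN happens here
        return new Conn(total);
    }

    /** connection.close() returns the entry to the pool; no MQDISC is done. */
    synchronized void close(Conn c) { free.addFirst(c); }

    synchronized int connects() { return connects; }
}
```

The point of the sketch is that close() is cheap and a later get() reuses the saved connection, so the number of MQCONNs stays at the peak concurrency rather than growing with the number of messages.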

It looks like the queue handle is not cached, so there is an MQOPEN… MQCLOSE of the reply queue for every request.

You configure the connectionFactory resource pool from: Home, Deployments, click on your resource adapter, click on the Configuration tab, click on the + in front of the javax.jms, ConnectionFactory, click on the connectionFactory name, click on the Connection Pool tab, specify the parameters and click on the save button.
Note: You have to stop and restart the server or redeploy the application to pick up changes!

This pool size needs to have enough capacity to handle the case when all input threads are busy with an MQGET.

JMX provides statistics with a description like com.bea: ServerRuntime=AdminServer2, Name=CF3, ApplicationRuntime=colinra, Type=ConnectorConnectionPoolRuntime, ConnectorComponentRuntime=colinra_colin where CF3 is the name of the connection pool defined to the resource adapter, colinra is the name I gave to the resource adapter when I installed it, colin.rar is the name of the resource adapter file.

Changing userids

The application connectionFactory pool can be used by different MDBs.  You need to make sure this pool has enough capacity for all the MDBs using it.

If the pool is used by MDBs running with different userids, then when a thread is obtained, if the thread was last used for a different userid, it has to issue MQDISC and MQCONN with the current userid.  This defeats the purpose of having a connection pool.

To prevent this you should have a connection pool for MDBs running with the same userid.
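
A toy model shows the effect. It assumes a pool that remembers the userid each idle connection was last used with, and counts an MQDISC plus MQCONN whenever the next user is different; the class UseridPoolSketch and the userids USERA and USERB are made up for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class UseridPoolSketch {
    // Userid that each idle connection was last connected with.
    private final Deque<String> idle = new ArrayDeque<>();
    private int reconnects = 0;

    /** Borrow a connection for this userid, then hand it straight back. */
    void use(String userid) {
        String last = idle.pollFirst();
        if (last != null && !last.equals(userid)) {
            reconnects++; // MQDISC, then MQCONN with the new userid
        }
        idle.addFirst(userid);
    }

    int reconnects() { return reconnects; }

    public static void main(String[] args) {
        UseridPoolSketch shared = new UseridPoolSketch();
        for (String u : new String[] {"USERA", "USERB", "USERA", "USERB"}) {
            shared.use(u);
        }
        System.out.println("shared pool reconnects: " + shared.reconnects());
    }
}
```

With one shared pool and alternating userids, every request after the first has to reconnect. With one pool per userid, the same requests need no reconnects at all.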

Getting a thread from the general pool may have the same problem, so you should ensure your pools have a maximum limit which is suitable for expected peak processing, and initial-beans-in-free-pool for your normal processing.   This should reduce the need to switch userids.

Cleaning up the connectionFactory

When the connectionFactory is configured, you can specify

  • Shrink Frequency Seconds:
  • Shrink Enabled: true|false

These parameters effectively say that after “Shrink Frequency Seconds”, if the number of threads in the connectionFactory pool is larger than the initial pool size, then end threads (doing an MQDISC) to reduce the number of threads to the initial pool size.   If the initial pool size is badly chosen you may get 20 threads ending, so there are 20 MQDISCs, and because of the load, 20 threads are immediately created to handle the workload.  During this period there will be insufficient threads to handle the workload, so you will get a blip in the throughput.
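
The arithmetic of this blip can be reproduced with a small counter model. It is a sketch under my own assumptions (only idle connections are ended, and the pool starts empty); ShrinkingPoolSketch and burstTwice are made-up names, not WebLogic code.

```java
class ShrinkingPoolSketch {
    private final int initialCapacity;
    private int idle = 0, inUse = 0;
    private int connects = 0, disconnects = 0;

    ShrinkingPoolSketch(int initialCapacity) { this.initialCapacity = initialCapacity; }

    void borrow() {
        if (idle > 0) idle--; else connects++;   // no idle entry: MQCONN
        inUse++;
    }

    void giveBack() { inUse--; idle++; }

    /** Runs every "Shrink Frequency Seconds" when shrinking is enabled. */
    void shrink() {
        while (idle > 0 && idle + inUse > initialCapacity) {
            idle--;
            disconnects++;                       // MQDISC
        }
    }

    /** A burst of work, a shrink while idle, then the same burst again. */
    static int[] burstTwice(int initialCapacity, int burst) {
        ShrinkingPoolSketch p = new ShrinkingPoolSketch(initialCapacity);
        for (int i = 0; i < burst; i++) p.borrow();
        for (int i = 0; i < burst; i++) p.giveBack();
        p.shrink();
        for (int i = 0; i < burst; i++) p.borrow();
        return new int[] {p.connects, p.disconnects};
    }

    public static void main(String[] args) {
        int[] small = burstTwice(5, 25);
        int[] sized = burstTwice(25, 25);
        System.out.println("initial 5:  MQCONN=" + small[0] + " MQDISC=" + small[1]);
        System.out.println("initial 25: MQCONN=" + sized[0] + " MQDISC=" + sized[1]);
    }
}
```

With an initial size of 5 and bursts of 25, the shrink ends 20 connections and the next burst has to create 20 again. With an initial size of 25, nothing is ended and nothing has to be recreated.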

If you have one connectionFactory pool being used by a high importance MDB and by a low importance MDB, it could be that the high importance MDB is impacted by this “release/acquire”, and the low priority MDB is not affected.  Consider isolating the connectionFactory pools and specify the appropriate initial pool size.

To find out what was going on I used

  • DIS QSTATUS(inputqueue) to see the number of open handles.   This is the listener count (1) + the current number of threads doing MQGETs, so with maxPoolDepth = 19, this value was up to 20.
  • I changed my MDB application to display the instance number when it was deployed.
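    •  ...
      private final static AtomicInteger count = new AtomicInteger(0);
      private int instance;      // number of this bean instance (declaration assumed)
      private int messageCount;  // assumed to be incremented in onMessage()
      public void ejbCreate() {
        SimpleDateFormat sdftime = new SimpleDateFormat("HH:mm:ss.SSS");
        Timestamp now = new Timestamp(System.currentTimeMillis());
        instance = count.addAndGet(1);
        System.out.println(sdftime.format(now) + ":" + this.getClass().getSimpleName() + ":EJBCreate:" + instance);
      }
      public void ejbRemove() {
        // start of this message is assumed; it mirrors the ejbCreate() message
        System.out.println(this.getClass().getSimpleName() + ":EJBRemove:" + instance
                           + " messages processed " + messageCount);
      }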

This gave me a message which told me when the instance was created, so I could see when it was started.   I could then see more instances created as the workload increased.


  • By using a client connection, I could specify the appltag for the connection pool and so see the number of MQCONNs from the application connectionFactory.

What happens if I get the numbers wrong?

  1. If messages on the input queue are slow to be processed, or the queue depth is often too high, you may have a problem.
  2. If ejb-jar.xml maxPoolDepth is too small, this will limit the number of messages you can process concurrently.
  3. The WebLogic max-beans-in-free-pool is too small.  If all the beans in the pool (array) are busy, consider making the pool bigger.   The requests queue in the listeners waiting for a free MDB instance.   The JMX data has fields with names like “Wait count”, but in my tests these were always zero, so I think these fields are of no value.
  4. The number of connections in the connectionFactory is too small.  If the number of requests exceeded the pool size, the MDB instance got the exception MQJCA1011: Failed to allocate a JMS connection.  You need to change the Max Capacity in the resource adapter definition for the connectionFactory pool.
  5. If you find you have many MQDISCs and MQCONNs happening at the same instant, consider increasing the initial size of the connectionFactory pool.
  6. Make the initial values suitable for your average workload.  This will prevent  the periodic destroy and recreate of the connections and beans.


You may want to have more than one weblogic server for availability and scalability.

You could also deploy the same application with a different MDB name, so if you want to stop and restart an MDB, you have another MDB processing messages.