MQ has the ability to trace the activity of a message to its destination. As chart-ware, this statement is 100% accurate and looks like a great facility. However, as with many bits of chart-ware, the implementation is very difficult. I can see why people tend not to use this function.
What MQ facilities are used
- With an MQMD you can set a report option MQRO_COA, confirm on arrival. This says: when the message is put to the (remote) queue, send a copy of the MQMD (optionally plus payload) with msgtype of report to the replyTo queue and replyTo queue manager.
- With an MQMD you can set a report option MQRO_COD, confirm on delivery. This says: when the application gets the message from the (remote) queue, send a copy of the MQMD (optionally plus payload) with msgtype of report to the replyTo queue and replyTo queue manager.
- With an MQMD you can set a report option MQRO_ACTIVITY. When the message moves through the network, the channels report the get and send, and the receive and put, of the message. The messages produced have an embedded PCF message within the payload. (A message of format Embedded PCF, MQHEPCF, allows you to have a PCF followed by other data, for example application data.)
- Your application does some work and sends a reply to the replyTo queue and replyTo queue manager with msgtype of reply. (For example “Transfer £1,000,000 to Colin Paice – has been done”)
You tie all of these pieces together with the msgid and correlid of the original message.
When a message is put on a transmission queue, it has a transmission header, with its own MQMD etc. Your message is embedded within this transmission queue message, so your msgid and correlid are still available.
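As a rough illustration of tying the pieces together, the sketch below groups messages by correlid. It assumes each message has already been read into a small record, and that the reports and the reply carry the original msgid in their correlid (this depends on the report options chosen). The `Msg` class and its fields are hypothetical, not part of any MQ API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Msg:
    """Hypothetical record holding the MQMD fields of interest."""
    kind: str        # for example "COA", "COD", "ACTIVITY" or "REPLY"
    msgid: bytes
    correlid: bytes


def group_by_correlid(messages):
    """Return a dict mapping each correlid to the messages that carry it."""
    groups = defaultdict(list)
    for m in messages:
        groups[m.correlid].append(m)
    return dict(groups)
```

With the groups built, everything relating to one original message can be looked up with its msgid as the key.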
What information is available to you?
- When your message is put to the queue, the MQMD has the time the message was put.
- From the activity record for the sending channel, you have the time the message was got from the transmission queue, and from the embedded message you have the put time. You can now calculate the duration spent on the transmission queue.
- From the activity record for the sending channel, you have the time the data was sent over the network.
- From the activity record for the receiving channel, you have the time the data was received over the network. This should be close to the time it was sent. If not then the clocks on the two systems may not be in sync, and you can calculate a correction factor.
- From the activity record for the receiving channel you have the time the message was put to the queue.
- The time in the Confirm on Arrival record should be the same as (or close to) the time the receiving channel put the message to the queue. Apply the correction factor.
- From the Confirm on Delivery record, you know what time the message was got by the application, and so you can calculate duration = Time of Delivery – Time of Arrival, to see how long the message was on the server queue. If the message is persistent, then this will include time for the channel to force the data to the log, and do disk IO. You should be able to calculate the duration between the message being put and the time it is processed, to see the total lifetime of the message.
The time of day values are in the format HHMMSShh where hh is hundredths of a second. This allows you to get granularity down to 0.01 of a second.
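A minimal sketch of turning these values into something you can subtract. It assumes the date arrives as a separate YYYYMMDD string, as in the MQMD PutDate field, and that both values are already decoded to text. The function name is mine.

```python
from datetime import datetime, timedelta


def parse_mq_timestamp(put_date: str, put_time: str) -> datetime:
    """Combine 'YYYYMMDD' and 'HHMMSShh' strings into a datetime."""
    # The first six characters of the time are HHMMSS.
    base = datetime.strptime(put_date + put_time[:6], "%Y%m%d%H%M%S")
    # The last two characters are hundredths of a second (10 ms each).
    hundredths = int(put_time[6:8])
    return base + timedelta(milliseconds=10 * hundredths)


# Duration between two events on the same day.
elapsed = (parse_mq_timestamp("20190312", "14310002")
           - parse_mq_timestamp("20190312", "14305927"))
```

Subtracting two parsed values gives a `timedelta`, so durations that cross a second, minute or midnight boundary work without special handling.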
You now have a detailed picture of where time is being spent en route to the remote queue. Is the sending channel slow to send the message, or does the message get to the remote end quickly and the application is slow to process it? Wow – really useful stuff.
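The calculations described above can be sketched as follows. This assumes the times have already been extracted into datetime values. The event names in the dictionary are hypothetical labels of mine, and the correction factor simply treats any gap between send and receive as clock skew, ignoring network latency.

```python
from datetime import datetime


def breakdown(t):
    """Split a message's journey into stages.

    `t` maps hypothetical event names to datetime values.
    """
    # Correction factor: receive time should match send time, so any
    # difference is treated as clock skew between the two systems.
    skew = t["chl_receive"] - t["chl_send"]
    return {
        "on_xmit_queue": t["chl_get"] - t["app_put"],
        "clock_skew": skew,
        # Both times come from the remote system, so no correction needed.
        "on_server_queue": t["cod"] - t["coa"],
        # Remote time converted back to the sender's clock.
        "total": (t["cod"] - skew) - t["app_put"],
    }


# Made-up times where the remote clock runs two seconds ahead.
example = {
    "app_put": datetime(2019, 3, 12, 12, 0, 0),
    "chl_get": datetime(2019, 3, 12, 12, 0, 0, 50000),
    "chl_send": datetime(2019, 3, 12, 12, 0, 0, 60000),
    "chl_receive": datetime(2019, 3, 12, 12, 0, 2, 60000),
    "coa": datetime(2019, 3, 12, 12, 0, 2, 80000),
    "cod": datetime(2019, 3, 12, 12, 0, 2, 500000),
}
result = breakdown(example)
```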
It now starts to get messy.
These messages arrive at the specified reply to queue, and reply to queue manager.
To support using these report options, your application needs some logic like
If this is a reply message, then carry on as usual.
Else move message using MQPUT1 to MY.ACTIVITY.PROCESSING.QUEUE, so you can process these outside of the application.
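In Python the logic above might look something like this sketch. The `process_reply` and `put1` callables are hypothetical stand-ins for your normal reply handling and for an MQPUT1 wrapper. The value 2 for MQMT_REPLY is the MQMD MsgType constant as I know it from cmqc.h, so check it against your own headers.

```python
MQMT_REPLY = 2  # MQMD MsgType value for a reply message (from cmqc.h)

ACTIVITY_QUEUE = "MY.ACTIVITY.PROCESSING.QUEUE"


def dispatch(msg_type, body, process_reply, put1):
    """Handle replies as usual; move everything else to the activity queue."""
    if msg_type == MQMT_REPLY:
        process_reply(body)
        return "processed"
    # Not a reply, so it is a report or activity message: move it aside.
    put1(ACTIVITY_QUEUE, body)
    return "moved"
```

A real version would also need to pass the original MQMD along with the body, because the msgtype, feedback and correlid are what the later processing depends on.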
How do you process MY.ACTIVITY.PROCESSING.QUEUE?
This is the hard bit.
You have a mixture of PCF type messages, and your original message, (perhaps with a format of MQSTR) with the MQMD saying “report message”.
MQ provides programs which can process some PCF format messages on a queue, and another program for printing (in hex) non-PCF messages. There is no program which can handle both message types and do the calculations to extract all of the useful data.
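A first step towards handling both types in one program is to sort the messages by the MQMD Format field. The sketch below assumes the format has been read as an 8-character, blank-padded value. The set of formats treated as PCF is my own choice and may need extending.

```python
# Format names that carry PCF data (blank padding removed).
PCF_FORMATS = {"MQHEPCF", "MQADMIN", "MQPCF"}


def classify(fmt):
    """Return 'pcf', 'text' or 'other' for an MQMD Format value."""
    # Accept either bytes or str, since client libraries differ.
    name = fmt.decode("ascii") if isinstance(fmt, bytes) else fmt
    name = name.strip()
    if name in PCF_FORMATS:
        return "pcf"
    if name == "MQSTR":
        return "text"
    return "other"
```

Messages classified as "pcf" can then go to a PCF parser, while "text" messages only need their MQMD examined.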
I'm working on getting my Python programs to do all of this processing.
If I have made this more complex than necessary, because I have overlooked some capability, please let me know.
Select a queue manager on the main window as your source queue manager and then choose Action->Trace Message from the menu. Fill in your target queue and queue manager, and press send. You’ll get a picture like this:-
You can view the data in several ways: right click on the picture and choose Display Type. The Display Options part of the right click context menu changes what is shown in the above picture. Another display type that might be useful is a breakdown of the steps by looking at Display Type->Nodes:-