How do I process messages on the dead letter queue (DLQ)?

I was setting up security on my system, and using AMS to protect messages. I kept getting messages on the Dead Letter Queue. As the DLQ has been around since before MQ V1 was shipped (they hit this problem in development), I expected that processing these messages would be easy. There are some good bits and some not so good bits in the IBM supplied solution. I was reminded of a “call and response narration” game we enjoyed in the pub when I was a student, which went...

They are building a house in the street – (audience) Boo!
A public house – (audience) Hooray!
They don’t sell beer – (audience) Boo!
They give it away – (audience) Hooray!

For a supplied Dead Letter Queue handler it goes…

MQ provides a Dead Letter Handler program (runmqdlq) – Hooray!
On z/OS (CSQUDLQH) and midrange (runmqdlq). – Hooray.
It is rule based and can handle many scenarios – Hooray!
But not some of the difficult ones – Boo!
They provide a set of sample programs on midrange (amqsdlq) – Hooray!
But they are not well documented, didn’t build straight off, and are not available on z/OS – Boo!
It can process many similar messages in one go – Hooray!
But it cannot process just one message – Boo!

Why are messages put on the DLQ?

If a local application tries to put a message to a queue, and the queue is full then the application gets a return code, and takes an action. The message is not lost – it wasn’t created, and the DLQ was not used. If a message comes in from another queue manager, and the channel tries putting the message and gets queue full, it cannot just throw the message away. It puts it onto the DLQ.

Messages could be put on a DLQ for many reasons.

  • A message came in from a remote queue manager to be put to a local queue, but the queue was at max depth, so the message was put to the DLQ. This may be due to a short lived problem. The DLQ handler can process the DLQ, and every 60 seconds try moving the message from the DLQ back to the original queue. You can configure the rules so that if it tries 5 times and fails, it moves the message to a different queue.
  • A message came in from a remote queue manager, but the channel userid was not authorised to put to the queue. In this case retrying every 60 seconds is unlikely to solve the problem. The administrator needs to take an action, such as grant access and retry the put, or remove the message.
  • When AMS is used, if an ID tries to get the message and there are problems, such as the ID of the signer of the message is not authorised, the message is put to the SYSTEM.PROTECTION.ERROR.QUEUE queue. To resolve this, the AMS configuration needs to be changed, or the message moved to a quarantine queue. Once the configuration has been changed, put the message back on the queue for retry.

The runmqdlq handler provided with MQ

This is a bit of a strange beast. It is rule based, so you can configure rules to select messages with certain properties and take actions, such as retry, or move to a different queue.

The program on midrange is runmqdlq, and on z/OS CSQUDLQH.

The syntax for runmqdlq is

runmqdlq [-u userid] MYDEAD.QUEUE QMA <qrule.rul

You have to redirect the rules file into stdin; input is read until an empty line is processed. I would have preferred a -f filename option.

To end runmqdlq, set the input queue to get(DISABLED) because Ctrl-C does not work.

It processes messages silently unless there are problems. For example, I got

Dead-letter queue handler unable to put message: Rule 6 Reason 2035.

I had several problem messages on the DLQ, but I could not specify one message and get runmqdlq to process it. I had to write a program to move one message to a different queue, and then I could use runmqdlq on that queue. There is lots of good stuff in runmqdlq, but it doesn’t quite do the job.

Understanding the rules

The rules are the same for z/OS as mid-range.

Messages are read from the specified DLQ queue, and processed with a set of rules. The rules are described here. You can select on properties in the MQMD or the DLQ header. For example

DESTQ(MYQUEUE) REASON(MQRC_Q_FULL) ACTION(RETRY) RETRY(5)

DESTQ(MYQUEUE) REASON(MQRC_Q_FULL) ACTION(FWD) FWDQ(MYQUEUEOVERFLOW) HEADER(YES)

DESTQ(INQ*) PERSIST(MQPER_NOT_PERSISTENT) ACTION(DISCARD)

DESTQ(INQ*) PERSIST(MQPER_PERSISTENT) ACTION(IGNORE)

Runmqdlq wakes up on new messages, and scans the queue periodically (the default RETRYINT is 60 seconds). It keeps track of messages on the queue, for example how many times it has retried an operation. For each message it scans the rules until it finds the first matching rule, then takes the action.

For the rules above

DESTQ(MYQUEUE) REASON(MQRC_Q_FULL) ACTION(RETRY) RETRY(5)

DESTQ(MYQUEUE) REASON(MQRC_Q_FULL) ACTION(FWD) FWDQ(MYQUEUEOVERFLOW) HEADER(YES)

If a message’s destination was MYQUEUE, and the reason code was MQRC_Q_FULL, the put to the queue is retried, at most 5 times. After 5 attempts, the first rule is skipped on the next scan, the second rule is used, and the message is forwarded to the queue MYQUEUEOVERFLOW, keeping the DLQ header.

DESTQ(INQ*) PERSIST(MQPER_NOT_PERSISTENT) ACTION(DISCARD)

Non persistent messages whose destination matches INQ* are just discarded.

DESTQ(INQ*) PERSIST(MQPER_PERSISTENT) ACTION(IGNORE)

Persistent messages whose destination matches INQ* are just left on the DLQ, for some other processing.

If runmqdlq is restarted, then all processing is reset, as all state information is kept in memory.

You should have a strategy for processing the DLQ.

For example, see Planning for MQ Dead Letter Queue handling, because you do not want thousands of non persistent inquiry messages filling up the DLQ, and preventing important persistent messages from being put onto the DLQ.

You may want to provide an audit trail of messages on the DLQ, so when someone phones up and says “MQ has lost my message”, you can look in the DLQ error logs, and say, “no… it is still in MQ, on the PENDING_SECURITY_ACTION queue, waiting for the security people to give the userid permission to process the message”.

Writing your own DLQ handler

While the MQ provided program is pretty good, there are times when you need a bit more, for example

  • Writing an audit message for each message processed, and what action was taken.
  • Printing out information about the message, such as queue name, putter, reason code etc
  • Moving one message, based on message ID or Correlid to another queue.

A one pass application is not difficult to create; it is a typical server application. A multi-pass application is much harder, as you need to remember which messages have been processed.

  • I do not know if it is better to get with convert or not, especially if you are using AMS.
  • Print message information. You can use printMD from the amqsbcg0.c sample to print the MD.
  • You can create a similar function for printing the DLQ header. You may have to handle conversion yourself, for example big-endian/little-endian numbers.
  • You can print a hex string such as msgid using

for (ii = 0 ; ii < sizeof(msgid) ; ii++)
    printf("%02hhX", msgid[ii]);

  • If you specify a msgid as a parameter, you can read a hex string into a byte array using the following. The array had to be unsigned char for it to work, otherwise you get negative numbers.

unsigned char msgid[24];
size_t i;
// pIn points to the hex string passed as a parameter (2 characters per byte)
for (i = 0; i < sizeof(msgid); i++)
{
    sscanf(pIn + (i * 2), "%2hhx", &msgid[i]);
}

Remove the DLQ header if needed.

mqoo_server =… MQOO_SAVE_ALL_CONTEXT ;

MQGET(hConn,
      serverHandle,
      &mqmd,
      &mqgmo,
      lBuffer,
      pBuffer,
      &messageLength,
      &mqcc,
      &mqrc);

// the message data starts with the DLQ header
pMQDLH = (MQDLH *)pBuffer;

// move the format, CCSID and encoding from the DLQ header back to the mqmd
memcpy(mqmd.Format, pMQDLH->Format, sizeof(mqmd.Format));
mqmd.CodedCharSetId = pMQDLH->CodedCharSetId;
mqmd.Encoding       = pMQDLH->Encoding;

// options are bit flags, so OR them in rather than adding
mqpmo.Options |= MQPMO_PASS_ALL_CONTEXT;
mqpmo.Context = serverHandle;
long lDLQH = sizeof(MQDLH);

MQPUT1(hConn,
       &replyOD,
       &mqmd,
       &mqpmo,
       messageLength - lDLQH,     // reduce the data length by the size of the DLQ header
       (char *)pBuffer + lDLQH,   // point past the DLQ header
       &mqcc,
       &mqrc);
