Planning for MQ Dead Letter Queue handling.

With MQ, if a message cannot be successfully delivered, it can be put on a Dead Letter Queue for later processing.

You can have multiple queues

The system dead letter queue, where the MQ puts messages it cannot processed,
Application dead letter queues, and application can put messages to a queue,
The AMS dead letter queue for messages which had errors during get or put, for example a certificate mismatch.

Messages can be put to these queues for a variety of reasons.

Transient problems
- If a channel is putting a message to a queue, and the queue is full, then the channel can put the message to the Dead Letter Queue. The DLQ handler can then try to put the message to the original queue, and retry a number of times after an interval. If the queue full condition was transient, then the DLQ handler is likely to succeed. If an application stops processing a queue, you can get quickly get thousands of messages on the DLQ queue.
- The queue is put disabled. A queue can be set to put disabled, for example to stop messages from going onto a queue during queue maintenance. Once the maintenance has been done the queue can have put enabled.
Administration
- The putting channel is not authorised to put to the queue, so the message gets put to the DLQ. An administrator needs to check to see if the putter is allowed to put the message. If so, fix the security and put the message back on original queue. If not remove the message, and educate the developer.
- An AMS protected message has a problem, for example an unauthorised user has signed a message, or the id getting the message does not have a certificate to decrypt a message. You need to resolve any local certificate problems, or send the original message back to the requester saying it is in error.
Application
- The message is too large for the queue. The administrator needs to educate the developer and/or make the queue max message size larger.

You may have a policy that non persistent messages for a particular queue which end up on the dead letter queue should be purged. Persistent message for another queue should have special treatment.

You may want administrators to be able to look at the meta data about a message, destination queue, MSGID, the list of recipients who can decrypt a message; but not to look at the message content.

Setting up your environment to cover these areas need considerable planning.

Implementing a solution

You want to try to keep the main DLQ close to empty, for example if your DLQ fills up with non persistent inquiries, then putting an important persistent message to the DLQ may fail.

You can use the runmqdlq program on midrange or CSQUDLQH on z/OS, to specify rules for automatic processing of messages on the DLQ.

You can select on attributes like original destination queue name, the reason why the message was on the DLQ, userid in the MQMD; and specify an action

Retry the put to the original queue
Move to another queue
Purge it
Leave it

When a message is processed on the DLQ, the rules are applied, and the action of the first matching rule is applied. For example

DESTQ(MYQUEUE) REASON(MQRC_Q_FULL) ACTION(RETRY) RETRY(5)
DESTQ(MYQUEUE) REASON(MQRC_Q_FULL) ACTION(FWD) FWD(MYQUEUEOVERFLOW) HEADER(YES)

This says that if a messages destination was MYQUEUE, and the reason code was MQRC_Q_FULL, it retries the put to the queue, at most 5 times. After 5 attempts, the first rule is skipped, the second rule is used, and the message is forwarded to the queue MYQUEUEOVERFLOW keeping the DLQ header.

DEST(INQ*) PERSIST(MQPER_NON_PERSISTENT ACTION(DISCARD)

For message destination INQ* and non persistent messages, then just discard them.

DEST(INQ*) PERSIST(MQPER_PERSISTENT ACTION(LEAVE))

For message destination INQ* and persistent messages, then just leave them on the queue, for some other processing.

If runmqdlq or CSQUDLQH is restarted, then all processing is reset.

Generic rules

For transient type problems you may want to consider

Non persistent messages for a set of queues get purged, dont even try to put them back on the queue.
Persistent messages for INVOICE* queues get moved to INVOICE_DLQ queue, where you have another DLQ monitor running on the queue.

For administrator type problems

You could pass non persistent messages to an admin_DLQ_NP queue, and have a program which reads the meta data, and prints it to a file, then deletes the original message
You could pass persistent messages to an admin_DLQ_P queue
- have a program which reads the meta data, and prints it to a file, and leaves the message on the queue.
- Using the meta information resolve the problem.
- Have another program which takes the msgid and correlid as input parameters, then puts the message on the original queue. (If there is only one message, you could use the default DLQ handler to do this.)

For AMS problems

This is complicated by having to use a different queue. If the DLQ handler tries to put to the AMS protected queue, it will be “protected” (enciphered) again. You need to use put the message to an alias queue, with the original queue as the target. On midrange Java and C clients can disable AMS processing, either by using an environment variable, or through the MQCLIENT.ini file. See here.
This is also complicated by possibly needing access to information in the payload, such as the list of recipients, and decrypting the message to get the DN of the signer.
- Once you have resolved the problem, have another program which takes the msgid and correlid as input parameters, and puts the message on the alias queue (if there is only one message you could use the default DLQ handler to do this).

How do I check it I have got it right?

It is worth putting a process in place to monitor the depth of the dead letter queue, and if it does not become empty a few a minutes, display the contents of the queue, and add rules to handle the residual messages.

I do not think that IBM provides a list of return codes of messages that it puts onto the DLQ, I think you’ll have to go through the list (over 500!), and put a rule in place for each one. If an application invents its own return codes, you may need rules for these as well.

My quick look at the list includes

Common problems

2053 (0805) (RC2053): MQRC_Q_FULL
2056 (0808) (RC2056): MQRC_Q_SPACE_NOT_AVAILABLE
2071 (0817) (RC2071): MQRC_STORAGE_NOT_AVAILABLE

3 thoughts on “Planning for MQ Dead Letter Queue handling.”

Pingback: How do I process messages on the dead letter queue (DLQ)? – ColinPaice
Eli Adler says:

November 29, 2021 at 7:24 pm

Hey, Colin.
Sounds to me like you never left IBM, after all.
Thanks for this piece; it was really informative.

Quick question now.

For your examples:
DEST(INQ*) PERSIST(MQPER_NON_PERSISTENT ACTION(DISCARD)
DEST(INQ*) PERSIST(MQPER_PERSISTENT ACTION(LEAVE)
————————
Shouldn’t there be 2 )) at the tail ends of these rules? Like:
DEST(INQ*) PERSIST(MQPER_NON_PERSISTENT ACTION(DISCARD) )
DEST(INQ*) PERSIST(MQPER_PERSISTENT ACTION(LEAVE) )
Otherwise, the number of ‘(‘ do not match the number of ‘)’ .

LikeLike

1. colin paice says:
  
  November 30, 2021 at 8:28 am
  
  Good spot – thank you.. I’ll update it
  
  LikeLike