Connect to Liberty, the clever way, to give different qualities of service.

While I was investigating two TCP/IP stacks I discovered you can set up Liberty Web Server to support different classes of service depending on the TCP host name and port number.

You can configure <httpEndpoint…> with a host and port number, and point to other set up parameters and so configure

the host name
the httpsPort number
the maximum number of active connections for this definition
which keyring to use as the trust store
which keyring to use as the key store
which certificate the server should use in the key store
which TLS protocols for example TLS 1.2 or 1.3
what logging you want done: date,time, userid, url, response time
which file you want the access logging information to be written to
which sites can/cannot use this, the addressExcludeList and addressIncludeList.

How do you set up another HTTP address and port? It is really easy – just define another set of definitions!

Why would you want to do this?

You may want to restrict people’s access to the server. For example external people are told to access the server using a specified port, and you can specify which cipher specification should be used, and what trust store is used to validate a client authentication request.

You may want to restrict the number of connections into a port, and have a port for administrators so they can always log on.

How do I do this?

You need to define another httpEndpoint. This in turn points to other definitions, such as the SSL options and the access logging definition.

I set up a file called colin.xml and included it in the server.xml file.

<server> 
 <httpEndpoint id="colinstHttpEndpoint" 
   host="10.1.1.2" 
   accessLoggingRef="colinaccessLogging" 
   sslOptionsRef="colinSSLRefOptions"
   httpsPort="29443"> 

   <tcpOptions
     addressIncludeList="10.1.*.*" 
     maxOpenConnections="3" /> 
 </httpEndpoint> 
 
 <sslOptions 
   id="colinSSLRefOptions" 
   sslRef="colinSSLOptions" 
 /> 

 <httpAccessLogging id="colinaccessLogging" enabled="true"/> 

 <ssl clientAuthentication="true" 
   clientAuthenticationSupported="true" 
   id="colinSSLOptions" 
   keyStoreRef="racfKeyStore" 
   trustStoreRef="racfTrustStore"                                                                             
   serverKeyAlias="ZZZZ" 
   sslProtocol="TLSv1.2" /> 
                                                                                
 <keyStore fileBased="false" id="racfKeyStore" 
   location="safkeyring://START1/KEY" 
   password="password" readOnly="true" type="JCERACFKS"/> 

 <keyStore fileBased="false" id="racfTrustStore" 
   location="safkeyring://START1/TRUST" 
   password="password" readOnly="true" type="JCERACFKS"/> 

</server>

Where do the security violations go for MQ on z/OS?

This question came in from a customer who was reviewing the subsystem security on z/OS. For example CICS reports its own violations.

MQ security violations are reported by the security manager, RACF, and are displayed on the job log.

MQ delegates the security checks to RACF, so auditing is mostly done by RACF. The only exception is the RESLEVEL profile, for which MQ writes its own audit records to RACF.

See a section in the IBM documentation.

For example, userid COLIN is not authorised to issue MQ commands, so there are messages on the job log.

%CSQ9 START CHINIT
ICH408I USER(COLIN ) GROUP(SYS1 ) NAME(COLIN PAICE )
CSQ9.START.CHINIT CL(MQCMDS )
INSUFFICIENT ACCESS AUTHORITY
FROM CSQ9.** (G)
ACCESS INTENT(CONTROL) ACCESS ALLOWED(NONE )
%CSQ9 DEF QL(AAAA)
ICH408I USER(COLIN ) GROUP(SYS1 ) NAME(COLIN PAICE )
CSQ9.DEFINE.QLOCAL CL(MQCMDS )
INSUFFICIENT ACCESS AUTHORITY
FROM CSQ9.** (G)
ACCESS INTENT(ALTER ) ACCESS ALLOWED(NONE )

Trying to use a queue

ICH408I USER(COLIN ) GROUP(SYS1 ) NAME(COLIN PAICE )
CSQ9.ZZZZ CL(MQQUEUE )
INSUFFICIENT ACCESS AUTHORITY
ACCESS INTENT(UPDATE ) ACCESS ALLOWED(NONE )

The queue had been defined with AUDITING FAILURES(READ).

Another queue had been defined with NOTIFY(COLIN). This means that whenever there was a violation, userid COLIN got a message sent to its TSO session.

RACF reports violations and audit information to SMF. You can use standard RACF facilities, such as RACF report writer, to process the SMF data.

Using RACF report writer

This RACFRW command is documented in the z/OS Security Server RACF Auditor’s Guide. (Note that it is deprecated, but the replacement seems to leave it to the user to do all the summarising etc.)

//SMFDUMP EXEC PGM=IFASMFDP,REGION=0M
//SYSPRINT DD SYSOUT=A
//ADUPRINT DD SYSOUT=A
//OUTDD DD DISP=(MOD,CATLG),DSN=IBMUSER.SMF,
// SPACE=(CYL,(5,5)),
// DCB=(BLKSIZE=13000,RECFM=VB)
//SMFDATA DD DISP=SHR,DSN=SYS1.S0W1.MAN1
//SMFDATB DD DISP=SHR,DSN=SYS1.S0W1.MAN2
//SMFOUT DD DISP=(NEW,PASS,DELETE),SPACE=(CYL,(10,1)),
// DSN=&&SMFOUT
//SYSIN DD *
  INDD(SMFDATA,OPTIONS(DUMP))
  INDD(SMFDATB,OPTIONS(DUMP))
  OUTDD(SMFOUT,TYPE(020,030,080,081,083))
  DATE(2020221,2022221)
  START(0000)
  ABEND(NORETRY)
  USER2(IRRADU00) 
  USER3(IRRADU86) 
/* 
//S1  EXEC PGM=IKJEFT01,REGION=0M 
//SYSPRINT DD SYSOUT=* 
//SORTWK01 DD  DISP=(NEW,PASS,DELETE),SPACE=(CYL,(10,1)), 
//             DSN=&&SORT1 
//SYSTSPRT DD SYSOUT=* 
//RSMFIN  DD DISP=(SHR,DELETE),DSN=*.SMFDUMP.SMFOUT 
//SYSTSIN DD * 
  RACFRW TITLE('RACF REPORTS') GENSUM 
  SELECT VIOLATIONS 
  SUMMARY RESOURCE BY(USER) 
  END 
/*

The report gave


USER/                                               -------- I N T E N T S--------           
    *JOB                  SUCCESS WARNING VIOLATION ALTER CONTROL UPDATE READ TOTAL 

MQCMDS =+CSQ9.REFRESH.SECURITY                                                                                    
    COLIN      COLIN PAICE      0       0         1     1       0      0    0     1 

MQQUEUE=CSQ9.ZZZZ 
    ADCDC      ADCDC            0       0         1     0       0      1    0     1 
    COLIN      COLIN PAICE      0       0         6     0       0      6    0     6

From this we can see that userid COLIN (with owner’s name COLIN PAICE) had 6 violations trying to get UPDATE access to the queue ZZZZ (class MQQUEUE) in queue manager CSQ9.

The userid COLIN also tried to use the REFRESH SECURITY command. The + in +CSQ9 means that a generic profile was used. There was one violation, needing ALTER access.
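If you want to post-process the report, for example to flag any userid with violations, the counter columns are easy to pick up. The following is a minimal sketch, assuming each detail line ends with the eight counters in the order shown in the report heading (SUCCESS, WARNING, VIOLATION, ALTER, CONTROL, UPDATE, READ, TOTAL) and starts with the userid. The names racf_counts and parse_summary_line are made up for this illustration; they are not part of RACF.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder for one detail line of the RACFRW summary */
struct racf_counts {
    char user[9];   /* RACF userids are at most 8 characters */
    long success, warning, violation, alter, control, update, read, total;
};

/* Parse one detail line. The userid is the first token and the last
   eight tokens are the counters. Anything in between (the owner's
   name) is ignored. Returns 0 on success, -1 if the line does not
   look like a detail line. */
int parse_summary_line(const char *line, struct racf_counts *out)
{
    char copy[256];
    char *tokens[64];
    int n = 0;

    assert(line != NULL && out != NULL);
    if (strlen(line) >= sizeof copy)
        return -1;
    strcpy(copy, line);

    /* split the line on white space */
    for (char *t = strtok(copy, " \t\r\n"); t != NULL && n < 64;
         t = strtok(NULL, " \t\r\n"))
        tokens[n++] = t;
    if (n < 9 || strlen(tokens[0]) > 8)
        return -1;

    /* the last eight tokens must all be whole numbers */
    long v[8];
    for (int i = 0; i < 8; i++) {
        char *end;
        v[i] = strtol(tokens[n - 8 + i], &end, 10);
        if (*end != '\0')
            return -1;
    }

    strcpy(out->user, tokens[0]);
    out->success = v[0];  out->warning = v[1];  out->violation = v[2];
    out->alter = v[3];    out->control = v[4];  out->update = v[5];
    out->read = v[6];     out->total = v[7];
    return 0;
}
```

Resource lines such as MQQUEUE=CSQ9.ZZZZ and the heading lines are rejected, so a caller can remember the most recent resource line and attach it to the detail lines that follow.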

Auditing successes

When the queue had AUDITING ALL(READ) it wrote a record for all accesses to the queue – success or failure.

using

//SYSTSIN DD *
RACFRW TITLE('RACF REPORTS') GENSUM
SUMMARY RESOURCE BY(USER)
END
/*

and no Select statement, it reported all records. I had an application which opened a queue for output, put a message to it, opened the queue for input, got the message. The output of RACFRW had

USER/                                               -------- I N T E N T S--------           
    *JOB            SUCCESS WARNING VIOLATION ALTER CONTROL UPDATE READ TOTAL                                                                     
MQADMIN = 
    COLIN   COLIN PAICE    8     0         0     0       0      0    0      8
MQQUEUE = CSQ9.ZZZZ 
    COLIN   COLIN PAICE   14     0         1     0       0     15    0     15
    IBMUSER                2     0         0     2       0      0    0      2

For every open/close of the ZZZZ queue, there were two opens for update, and an open of the MQADMIN class – with no object.

With AUDITING FAILURES(READ), so only failures of READ access or above are logged, the output was

USER/                                               -------- I N T E N T S--------           
    *JOB            SUCCESS WARNING VIOLATION ALTER CONTROL UPDATE READ TOTAL                                                                     
MQADMIN = 
    COLIN   COLIN PAICE    2     0         0     0       0      0    0      2

With an entry once for each job.

How to administer AMS policies, and use the set policy command.

I had been using the setmqspl command (on z/OS and midrange) to manage my AMS policies. This command has the drawback that if you want to change a policy, for example add a new recipient, you have to specify the whole command. Jon Rumsey pointed out the midrange MQSC commands “set policy” and “display policy”, which allow you to add, delete, or replace recipients and signers.

Examples of midrange runmqsc set policy command

Exporting parameters
Add or remove recipients or signers
Changing other parameters

Exporting parameters

If you want to keep a copy of the AMS definitions you can use the display policy command, but this gives output like RECIP(CN=BBB,C=GB), without quotes. The set policy command needs the value within single quotes. The dmpmqcfg command does not support AMS policies.

To be able to capture the output so you can reuse it, you need to use the dspmqspl -export command. This gives output like

setmqspl -m QMA -p ABC -s SHA512 -e AES256 -r "CN=BBB,C=GB" -c 0 -t 0

This gives the parameters in a format that can be used directly.

Add or remove recipients or signers

Using runmqsc, define a policy using the default action(replace)

set policy(ABC) signalg(SHA512) recip('CN=AAA,C=GB') ENCALG(AES256)

You can add a new recipient

set policy(ABC) signalg(SHA512) recip('CN=BBB,C=GB') ENCALG(AES256) action(ADD)

You can now display it

DIS policy (ABC)
AMQ9086I: Display IBM MQ Advanced Message Security policy details.
POLICY(ABC) SIGNALG(SHA512)
ENCALG(AES256)
RECIP(CN=BBB,C=GB)
RECIP(CN=AAA,C=GB)
KEYREUSE(DISABLED)
ENFORCE

You can delete a recipient

set policy(ABC) SIGNALG(SHA256) ENCALG(AES128) RECIP('CN=AAA,C=GB') action(remove)

and display it

DIS policy(Abc)
AMQ9086I: Display IBM MQ Advanced Message Security policy details.
POLICY(ABC) SIGNALG(SHA512)
ENCALG(AES256) RECIP(CN=BBB,C=GB)

KEYREUSE(DISABLED)
ENFORCE

You have to specify SIGNALG and/or ENCALG each time, but for action(REMOVE|ADD) it can have any valid value (except NONE). The value is only used when ACTION(REPLACE) is used, or ACTION() is omitted. The following will add the recipient, and not change the signalg or encalg values.

set policy(ABC) recip('CN=CCC,C=GB') action(ADD) signalg(MD5) encalg(RC2)

You can specify multiple RECIP

set policy(ABC) signalg(SHA512) recip('CN=BBB,C=GB') recip('CN=DDD,C=GB') ENCALG(AES256) action(ADD)

or multiple signers

set policy(ABC) signalg(SHA512) signer('CN=BBB,C=GB') signer('CN=DDD,C=GB') ENCALG(AES256) action(ADD)

or multiple signers and recipients.

Changing other parameters

If you want to change an algorithm, whether every message must be protected (tolerate|enforce), or the key reuse, then you must use action(replace) and specify all the parameters. It might be easier to use dspmqspl -m … -p … -export, output it to a file, then modify the file.

Administering AMS on z/OS

On z/OS (and mid-range) you have dspmqspl and setmqspl commands. With the setmqspl command, you replace the entire statement.

It is good practice to have a PDSE with all of your definitions in, one member per policy, or perhaps all policies in one member – depending on how many policies you have. If you have a problem with your queue manager, you have a copy of the definitions.

Another good practice is to take a copy of a definition before you make the change (and keep it unchanged), so you can roll back to it if you need to undo a change.

You can use the export command to output all policies, or a selected policy. You can have this going into a sequential data set or a PDSE member. You might want to have two copies:

The before image – from before the change
The copy you update.

Of course you could always use the previous copy, but you cannot tell if someone has updated the definitions outside of your change control system, so taking a copy of the existing definitions is a good idea. You could always compare the previous copy, with the copy you just created to check there were no unauthorised changes.

You may want to make the same change to multiple queue managers, so having updates in a PDSE member is a good way of doing it. Just change the queue manager name and rerun the job.

On z/OS, remember to use the refresh command on the AMS address space for it to pick up any changes.

Compiling a C program on z/OS

You can compile programs in USS, or with JCL. I tend to prefer JCL, but do use USS (though it takes time to get the command right). It took me several attempts to compile and bind a program that uses USS services.

I thought people might be interested in the JCL I use, and the C Compiler options I specified.

I’ll give the JCL (so you can see how much you understand of it) then I’ll annotate it

//ADCDC4 JOB 1,MSGCLASS=H,COND=(4,LE)
//S1 JCLLIB ORDER=CBC.SCCNPRC
// SET LOADLIB=IBMUSER.LOAD
// SET LIBPRFX=CEE
//COMPILE EXEC PROC=EDCCB,
// LIBPRFX=&LIBPRFX,
// CPARM='OPTFILE(DD:SYSOPTF),NOLSEARCH,LSEARCH(/usr/include/)',
// BPARM='SIZE=(900K,124K),RENT,LIST,RMODE=ANY,AMODE=31'
//COMPILE.SYSOPTF DD *
LIST,SOURCE
aggregate(offsethex) 
xref
SEARCH(//'ADCD.C.H',//'SYS1.SIEAHDR.H')
TEST
RENT ILP32
OE
INFO(PAR,USE)
NOMARGINS EXPMAC SHOWINC XREF
LANGLVL(EXTENDED) sscom dll
DEFINE(_ALL_SOURCE)
DEBUG
/*
//COMPILE.SYSIN DD *
#pragma linkage(IRRSDL00 ,OS)
#line 26
#pragma runopts(POSIX(ON))
/*Include standard libraries */
#include <stdio.h> 
#include <stdlib.h> 
#include <string.h> 
#include <stdarg.h> 

int main( int argc, char *argv??(??))
{
printf("I'm here program %s. ",argv[0]);
for (int i = 1;i < argc; i++)
printf("Arg %d %s",i,argv[i]);
printf("\n");
}
//BIND.SYSLMOD DD DISP=SHR,DSN=&LOADLIB.
//BIND.SYSLIB DD DISP=SHR,DSN=&LIBPRFX..SCEELKED
//BIND.OBJLIB DD DISP=SHR,DSN=COLIN.OBJLIB
//BIND.GSK DD DISP=SHR,DSN=SYS1.SIEALNKE
//BIND.CSS DD DISP=SHR,DSN=SYS1.CSSLIB
//BIND.SYSIN DD *
INCLUDE CSS(IRRSDL00)
NAME MAINPROG(R)
//ISTEST EXEC PGM=MAINPROG,REGION=0M,PARM='PARM1, Parm2'
//STEPLIB DD DISP=SHR,DSN=&LOADLIB
//SYSPRINT DD SYSOUT=*,DCB=(LRECL=200)
//SYSOUT DD SYSOUT=*
//SYSERR DD SYSOUT=*

Annotated JCL

The compile options are defined here

//ADCDC4 JOB 1,MSGCLASS=H,COND=(4,LE) – If 4 is less than or equal to the return code of a step (that is, the return code is 4 or more), the job stops; otherwise it does the next step. If the compile fails with return code 8, the job stops.
//S1 JCLLIB ORDER=CBC.SCCNPRC – Use the procedures from this library.
// SET LOADLIB=IBMUSER.LOAD – A symbol used to say where to store the load module.
// SET LIBPRFX=CEE – This is used by the compile procedure.
//COMPILE EXEC PROC=EDCCB, – Execute the C (component EDC) compile and bind procedure,
// LIBPRFX=&LIBPRFX, – using this library prefix (defined above).
// CPARM='OPTFILE(DD:SYSOPTF),LSEARCH(/usr/include/)', – Read C options from //SYSOPTF, and use /usr/include/ to find header files.
// BPARM='SIZE=(900K,124K),RENT,LIST,RMODE=ANY,AMODE=31' – Binder parameters.
//COMPILE.SYSOPTF DD * – These additional C compiler options:
LIST, – Give the assembler output.
SOURCE – Display the source of the program (useful for showing compile errors).
aggregate(offsethex) – Show C structures with hex offsets.
xref – Show where things are used.
SEARCH(//'ADCD.C.H',//'SYS1.SIEAHDR.H') – Look for C header files in these libraries.
TEST – Produce information for some debug tools.
RENT – Produce re-entrant code.
ILP32 – Produce 32 bit code (compare with option LP64).
OE – Use POSIX standards when looking for header files.
INFO(PAR,USE) – Print out information messages. PAR emits warning messages on unused parameters. USE emits information about the usage of variables (for example, defined but not used).
NOMARGINS – Use all columns of the source, rather than restricting it to a range such as columns 1-72.
EXPMAC – Expand all macros in the source.
SHOWINC – Display the source of any include in the output.
LANGLVL(EXTENDED) – Use language extensions, for example use of long long.
sscom – Allow slash slash comments, "// comments…".
dll – Produce code which can be used in a DLL. Useful for some USS type programs.
DEFINE(_ALL_SOURCE) – Create a #define for _ALL_SOURCE.
DEBUG – Produce debugging information, such as line numbers in stack traces. It turns optimisation off.
/*
//COMPILE.SYSIN DD * – The main program follows.
#pragma linkage(IRRSDL00 ,OS) – The C program calls a z/OS function called IRRSDL00. This says it uses a standard z/OS parameter list.
#line 26 – This tells the C compiler to reset its line number, so the error messages come with the correct line number.
#pragma runopts(POSIX(ON)) – This is so POSIX (USS) functions can be used.
/*Include standard libraries */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdarg.h>
int main( int argc, char *argv??(??))
{
printf("I'm here program %s. ",argv[0]);
for (int i = 1;i < argc; i++)
printf("Arg %d %s",i,argv[i]);
printf("\n");
}
//BIND.SYSLMOD DD DISP=SHR,DSN=&LOADLIB. – Where to store the output from the bind.
//BIND.SYSLIB DD DISP=SHR,DSN=&LIBPRFX..SCEELKED – Use these libraries to resolve external routines.
//BIND.OBJLIB DD DISP=SHR,DSN=COLIN.OBJLIB – Load the compiled output from here.
//BIND.CSS DD DISP=SHR,DSN=SYS1.CSSLIB – My program needs a z/OS specific function. Get it from here.
//BIND.SYSIN DD *
INCLUDE CSS(IRRSDL00) – My program needed the z/OS callable service IRRSDL00, which is in the library SYS1.CSSLIB.
NAME MAINPROG(R) – This is the name of the program. It is stored in //BIND.SYSLMOD above.
//ISTEST EXEC PGM=MAINPROG,REGION=0M,PARM='PARM1, Parm2'
//STEPLIB DD DISP=SHR,DSN=&LOADLIB
//SYSPRINT DD SYSOUT=*,DCB=(LRECL=200) – It prints long lines of output, up to 200 characters long.
//SYSOUT DD SYSOUT=*
//SYSERR DD SYSOUT=*

checkAMS: program to check your AMS definitions are consistent with the z/OS keyring

A C program to verify that the certificates in MQ AMS configuration are in a RACF keyring. See here.

Overview of program

With AMS you specify the Distinguished Names (DNs) of users who are allowed to sign or encrypt MQ messages. The certificates for these DNs need to be in the xxxxAMSM’s drq.ams.keyring. If they are not present, or have problems, such as not being valid, the messages from AMS are not very helpful. The messages are as helpful as “one of the DNs in the configuration has a problem but I am not telling you which DN it was, nor what the problem was”.

CheckAMS has two parts:

Provide a useful list of information in the keyring
Take the output of the AMS dspmqspl command, and check the DNs are in the key store

Provide a useful list of the contents of a keyring.

With the RACDCERT commands you can list the contents of a keyring, for example owner and label; and you can display details about a certificate, such as the DN of the subject, and the Certificate Authority, but you cannot issue one command to display all the important information, nor ask, “is the DN for this issuer in the keystore”.

Example output from checkAMS, listing certificates in keyring:

Subject CN=SSCARSA1024,OU=CA,O=SSS,C=GB                                                         
Issuer  CN=SSCARSA1024,OU=CA,O=SSS,C=GB                                                         
Self signed                                                                                     
Valid date range 21/02/13 12:32:33 to 24/02/13 12:32:33                                         
Owner irrcerta/LINUXCA                                                                          
Usage:Certauth Status:Trust                                                                     
                                                                                                           
Subject CN=colin,OU=longou,O=SSS                                                                
Issuer  CN=TEMP4Certification Authority,OU=TEST,O=TEMP                                          
Valid date range 21/03/25 00:00:00 to 22/03/25 23:59:59                                         
Owner COLIN/TEST                                                                                
Usage:Site Status:Trust

The first certificate is owned by irrcerta and has label LINUXCA. Userid irrcerta means it belongs to CERTAUTH. The certificate is self signed, and has a long validity date. It has a usage of CERTAUTH, and is trusted.

The second certificate belongs to userid COLIN, and it has label TEST. It has a subject DN of CN=colin,OU=longou,O=SSS, and was issued by CN=TEMP4Certification Authority,OU=TEST,O=TEMP. It has a usage of Site, and is trusted.

Check the AMS set up

The program takes as input the output of the dspmqspl -m… -export command, and checks the DN against certificates in the keyring.

Example output

Userid START1, ring drq.ams.keyring                                                                                  
* Exported on Mon Mar 29 09:23:31 2021                                                                               
                                                                                                                      
dspmqspl -m CSQ9  -export                                                                                          
setmqspl -m CSQ9                                                                                                     
 -p AMSQ                                                                                                             
 -s SHA256                                                                                                           
 -a "CN=COLIN,O=SSS"                                                                                                 
   Owner COLIN/AMS Usage:Site Status:Trust Valid date range 21/03/21 00:00:00 to 22/03/21 18:45:00                  
 -a "O=aaaa, C=GB,CN=ja2"                                                                                            
 ! O=aaaa,C=GB,CN=ja2 Not found in key ring                                                                           
 -e AES256                                                                                                           
 -r "CN=COLIN,O=SSS"                                                                                                 
  Owner COLIN/AMS Usage:Site Status:Trust Valid date range 21/03/21 00:00:00 to 22/03/21 18:45:00                  
 -r "CN=ADCDB,O=SSS"

This shows the keyring was START1/drq.ams.keyring.

It prints out the exported file, and for the -a and -r records, it adds information about the certificate, or reports if it is not found.

It reports that “CN=COLIN,O=SSS” was found, the certificate belongs to userid COLIN, label AMS, it has usage of Site, it is trusted, and has a valid date.

It also reports “O=aaaa,C=GB,CN=ja2 Not found in key ring”. This is because the definition in AMS has the wrong order. The standard order is CN=ja2,O=aaaa,C=GB. This certificate is in the keyring, but the program could not find it. I could not see a way of converting bad format DNs to good DNs.
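One possible heuristic for spotting this kind of problem is to compare the two DNs as unordered sets of name=value pairs, and report “same parts, different order” as a hint. This is only a sketch of that idea and not something checkAMS is described as doing. The order of a DN is significant, so a match here means “probably a mis-ordered DN”, not “the same certificate”. The function names are made up for the example. It assumes there are no escaped commas in the values, that attribute types can be compared ignoring case, and that values are compared exactly.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define MAX_RDNS 16    /* arbitrary limits for this sketch */
#define MAX_DN   256

static int cmp_str(const void *a, const void *b)
{
    return strcmp(*(char *const *)a, *(char *const *)b);
}

/* Copy dn into buf, split it at commas, trim blanks round each part,
   upper-case the attribute type (the text before '='), and sort the
   parts. Returns the number of parts, or -1 if a limit is exceeded. */
static int split_dn(const char *dn, char *buf, char *parts[])
{
    int count = 0;
    char *p = buf;

    if (strlen(dn) >= MAX_DN)
        return -1;
    strcpy(buf, dn);
    while (p != NULL) {
        char *comma = strchr(p, ',');
        if (comma != NULL)
            *comma = '\0';
        while (isspace((unsigned char)*p))            /* leading blanks */
            p++;
        char *end = p + strlen(p);
        while (end > p && isspace((unsigned char)end[-1]))
            *--end = '\0';                            /* trailing blanks */
        for (char *q = p; *q != '\0' && *q != '='; q++)
            *q = (char)toupper((unsigned char)*q);
        if (*p != '\0') {
            if (count == MAX_RDNS)
                return -1;
            parts[count++] = p;
        }
        p = (comma != NULL) ? comma + 1 : NULL;
    }
    qsort(parts, (size_t)count, sizeof parts[0], cmp_str);
    return count;
}

/* Returns 1 if the two DNs contain the same parts in any order */
int same_rdn_set(const char *a, const char *b)
{
    char bufa[MAX_DN], bufb[MAX_DN];
    char *pa[MAX_RDNS], *pb[MAX_RDNS];

    assert(a != NULL && b != NULL);
    int na = split_dn(a, bufa, pa);
    int nb = split_dn(b, bufb, pb);
    if (na < 0 || na != nb)
        return 0;
    for (int i = 0; i < na; i++)
        if (strcmp(pa[i], pb[i]) != 0)
            return 0;
    return 1;
}
```

With this, the DN from the AMS policy, “O=aaaa, C=GB,CN=ja2”, and the DN in the keyring, “CN=ja2,O=aaaa,c=GB”, compare as having the same parts, so a checking program could print a more helpful message than “not found”.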

Contents of package.

The package is on git.

FTP the amscheck.xmit.bit to z/OS as binary. Then use TSO receive indsn(amscheck.xmit) to create the load module in a PDS.

Upload runamsch, ccasmch, asmcheck. and parmlist.h to a PDS.

Edit and submit runamsch. It runs dspmqspl and puts the output into a temporary file. The parm PARM=’START1 drq.ams.keyring’ is for userid START1 and the keyring drq.ams.keyring. Your userid will need access to the userid’s keyring.

If you want to compile the program

If you want to compile the program, you can edit ccasmch, and change the SYSIN, and where the header file is imported from.

How do I process messages on the dead letter queue (DLQ)?

I was setting up security on my system, and using AMS to protect messages. I kept getting messages on the Dead Letter Queues. As messages on the DLQ have been around from before MQ V1 was shipped (they hit this problem in development), I was expecting that to process them would be easy. There are some good bits and some not so good bits with the IBM supplied solution. I was reminded of a “call and response narration” game we enjoyed in the pub from when I was a student which went ..

They are building a house in the street – (audience) Boo!
A public house – (audience)Hooray!
They don’t sell beer – (audience) Boo!
They give it away – (audience) Hooray!

For a supplied Dead Letter Queue handler it goes…

MQ provides a Dead Letter Handler program (runmqdlq) – Hooray!
On z/OS (CSQUDLQH) and midrange (runmqdlq). – Hooray.
It is rule based and can handle many scenarios – Hooray!
But not some of the difficult ones – Boo!
They provide a set of sample programs on mid range (amqsdlq) – Hooray!
But they are not well documented, didn’t build straight off, and are not available on z/OS – Boo!
It can process many similar messages in one go – Hooray!
But not process just one message – Boo!

Why are messages put on the DLQ?

If a local application tries to put a message to a queue, and the queue is full then the application gets a return code, and takes an action. The message is not lost – it wasn’t created, and the DLQ was not used. If a message comes in from another queue manager, and the channel tries putting the message and gets queue full, it cannot just throw the message away. It puts it onto the DLQ.

Messages could be put on a DLQ for many reasons.

A message came in from a remote queue manager and was put to a local queue, but the queue was at max depth, so was put to the DLQ. This may be due to a short lived problem. The DLQ handler can process the DLQ queue, and every 60 seconds try moving the message from the DLQ back to the original. You can configure the rules so if it tries 5 times and fails, then it moves the message to a different queue.
A message came in from a remote queue manager, but the channel userid was not authorised to put to the queue. In this case retrying every 60 seconds is unlikely to solve the problem. The administrator needs to take an action, such as grant access and retry the put, or remove the message.
When AMS is used, if an ID tries to get the message and there are problems, such as the ID of the signer of the message is not authorised, the message is put to the SYSTEM.PROTECTION.ERROR.QUEUE queue. To resolve this, the AMS configuration needs to be changed, or the message moved to a quarantine queue. Once the configuration has been changed, put the message back on the queue for retry.

The runmqdlq handler provided with MQ

This is a bit of a strange beast. It is rule based so you can configure rules to select messages with certain properties and take actions, such as retry, or move to a different queue.

The program on midrange is runmqdlq, and on z/OS CSQUDLQH.

The syntax for runmqdlq is

runmqdlq [-u userid] MYDEAD.QUEUE QMA <qrule.rul

You have to redirect the rules file into stdin, and it is read until an empty line is processed. I would have preferred a -f filename option.

To end runmqdlq, set the input queue to get(DISABLED) because Ctrl-C does not work.

It processes messages silently, unless there are any problems, for example I got

Dead-letter queue handler unable to put message: Rule 6 Reason 2035.

I had several problem messages on the DLQ, but I could not specify one message and get runmqdlq to process it, so I had to write a program to move one message to a different queue, then I could use runmqdlq. There is lots of good stuff in runmqdlq, but it doesn’t quite do the job.

Understanding the rules.

The rules are the same for z/OS as mid-range.

Messages are read from the specified DLQ queue, and processed with a set of rules. The rules are described here. You can select on properties in the MQMD or the DLQ header. For example

DESTQ(MYQUEUE) REASON(MQRC_Q_FULL) ACTION(RETRY) RETRY(5)
DESTQ(MYQUEUE) REASON(MQRC_Q_FULL) ACTION(FWD) FWDQ(MYQUEUEOVERFLOW) HEADER(YES)
DESTQ(INQ*) PERSIST(MQPER_NOT_PERSISTENT) ACTION(DISCARD)
DESTQ(INQ*) PERSIST(MQPER_PERSISTENT) ACTION(IGNORE)

Runmqdlq wakes up on new messages, and scans the queue periodically (the default RETRYINT is 60 seconds). It keeps track of messages on the queue, for example how many times it has retried an operation. For each message it scans the rules until it finds the first matching rule, then takes the action.

For the rules above

DESTQ(MYQUEUE) REASON(MQRC_Q_FULL) ACTION(RETRY) RETRY(5)
DESTQ(MYQUEUE) REASON(MQRC_Q_FULL) ACTION(FWD) FWDQ(MYQUEUEOVERFLOW) HEADER(YES)

If a message’s destination was MYQUEUE, and the reason code was MQRC_Q_FULL, it retries the put to the queue, at most 5 times. After 5 attempts, the first rule is skipped, the second rule is used, and the message is forwarded to the queue MYQUEUEOVERFLOW, keeping the DLQ header.

DESTQ(INQ*) PERSIST(MQPER_NOT_PERSISTENT) ACTION(DISCARD)

For message destination INQ* and non persistent messages, then just discard them.

DESTQ(INQ*) PERSIST(MQPER_PERSISTENT) ACTION(IGNORE)

For message destination INQ* and persistent messages, then just leave them on the queue, for some other processing.

If runmqdlq is restarted, then all processing is reset, as all state information is kept in memory.
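The “first matching rule wins, and a retry rule is skipped once its retries are used up” behaviour can be modelled in a few lines of C. This is a sketch of the behaviour described above, not the runmqdlq implementation. The structure, the function names, and the simplified wildcard handling (only a trailing * is supported) are assumptions made for the illustration. The retry count is passed in by the caller, which mirrors the point that this state is only held in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum dlq_action { DLQ_RETRY, DLQ_FWD, DLQ_DISCARD, DLQ_IGNORE };

/* Hypothetical in-memory form of one rule */
struct dlq_rule {
    const char *destq;       /* queue name, optionally ending in '*'  */
    long reason;             /* reason code to match, 0 means any     */
    enum dlq_action action;
    int retry_limit;         /* only used when action is DLQ_RETRY    */
};

/* Match a queue name against a pattern with an optional trailing '*' */
int destq_matches(const char *pattern, const char *name)
{
    size_t n = strlen(pattern);
    if (n > 0 && pattern[n - 1] == '*')
        return strncmp(pattern, name, n - 1) == 0;
    return strcmp(pattern, name) == 0;
}

/* Return the index of the first rule that applies to this message,
   or -1 if none does. A retry rule no longer applies once the
   message has been retried retry_limit times. */
int pick_rule(const struct dlq_rule *rules, size_t count,
              const char *destq, long reason, int retries_done)
{
    assert(rules != NULL && destq != NULL);
    for (size_t i = 0; i < count; i++) {
        const struct dlq_rule *r = &rules[i];
        if (!destq_matches(r->destq, destq))
            continue;
        if (r->reason != 0 && r->reason != reason)
            continue;
        if (r->action == DLQ_RETRY && retries_done >= r->retry_limit)
            continue;
        return (int)i;
    }
    return -1;
}
```

With the rules for MYQUEUE above (MQRC_Q_FULL is reason code 2053), a message that has been retried fewer than 5 times selects the first rule, and after the fifth retry it falls through to the forward rule.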

You should have a strategy for processing the DLQ.

For example, see Planning for MQ Dead Letter Queue handling, because you do not want thousands of non persistent inquiry messages filling up the DLQ, and preventing important persistent messages from being put onto the DLQ.

You may want to provide an audit trail of messages on the DLQ, so when someone phones up and says “MQ has lost my message”, you can look in the DLQ error logs, and say, “no… it is still in MQ, on the PENDING_SECURITY_ACTION queue, waiting for the security people to give the userid permission to process the message”.

Writing your own DLQ handler

While the MQ provided program is pretty good, there are times when you need a bit more, for example

Writing an audit message for each message processed, and what action was taken.
Printing out information about the message, such as queue name, putter, reason code etc
Moving one message, based on message ID or Correlid to another queue.

A one pass application is not difficult to create, it is a typical server application. A multi-pass application is much harder as you need to remember which messages have been processed.

I do not know if it is better to get with convert or not, especially if you are using AMS.
Print message information. You can use printMD from the amqsbcg0.c sample to print the MD.
You can create a similar function for printing the DLQ header. You may have to handle conversion yourself, for example big-endian/little-endian numbers.
You can print a hex string such as msgid using

for (ii = 0 ; ii < sizeof(msgid) ; ii++)
  printf("%02hhX",msgid[ii]);

If you specify a msgid as a parameter, you can read a hex string into a byte array using the following. The array had to be unsigned char for it to work, otherwise you get negative numbers.

unsigned char msgid[24];
int i;
/* pIn points to the hex string, two characters for each byte */
for (i = 0; i < sizeof(msgid); i++)
{
  sscanf(pIn + (i * 2), "%2hhx", &msgid[i]);
}
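The two fragments above assume the input is a well formed hex string of the right length. If the msgid comes from a command line parameter it is worth checking it. Here is a sketch of a pair of helpers that do the same conversions but reject a string which is too short or has a character which is not a hex digit. The function names are made up for the example, and the caller supplies the length (24 for a msgid).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Value of one hex digit, or -1 if the character is not a hex digit */
static int hex_nibble(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Convert the first 2*len characters of hex into len bytes.
   Returns 0 if it worked, -1 if the string is too short or has a
   character which is not a hex digit. */
int hex_to_bytes(const char *hex, unsigned char *out, size_t len)
{
    assert(hex != NULL && out != NULL);
    if (strlen(hex) < 2 * len)
        return -1;
    for (size_t i = 0; i < len; i++) {
        int hi = hex_nibble(hex[2 * i]);
        int lo = hex_nibble(hex[2 * i + 1]);
        if (hi < 0 || lo < 0)
            return -1;
        out[i] = (unsigned char)((hi << 4) | lo);
    }
    return 0;
}

/* Convert len bytes into 2*len upper case hex characters plus a
   terminating null. out must have room for 2*len + 1 characters. */
void bytes_to_hex(const unsigned char *in, size_t len, char *out)
{
    static const char digits[] = "0123456789ABCDEF";
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < len; i++) {
        out[2 * i]     = digits[in[i] >> 4];
        out[2 * i + 1] = digits[in[i] & 0x0F];
    }
    out[2 * len] = '\0';
}
```

Using unsigned char throughout avoids the negative number problem mentioned above, because the shift and mask are always done on values between 0 and 255.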

Remove the DLQ header if needed.

mqoo_server =… MQOO_SAVE_ALL_CONTEXT ;
MQGET(hConn,
      serverHandle,
      &mqmd,
      &mqgmo,
      lBuffer,
      pBuffer,
      &messageLength,
      &mqcc,
      &mqrc);
…
// move the encoding, CCSID and format from the DLQ header back to the mqmd
mqmd.Encoding = pMQDLH -> Encoding;
memcpy(&mqmd.Format,&pMQDLH -> Format,sizeof(mqmd.Format));
memcpy(&mqmd.CodedCharSetId,&pMQDLH -> CodedCharSetId,sizeof(mqmd.CodedCharSetId));
// pass the context saved by the MQGET on to the new message
mqpmo.Options |= MQPMO_PASS_ALL_CONTEXT;
mqpmo.Context = serverHandle;
long lDLQH = sizeof(MQDLH);
…
MQPUT1( hConn,
        &replyOD ,
        &mqmd ,
        &mqpmo,
        messageLength -lDLQH, // reduce the data by the size of the DLQ header
        pBuffer+lDLQH,        // point past the DLQ header
        &mqcc,
        &mqrc );
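The fragment above subtracts the size of the DLQ header without checking that the message really starts with one. A defensive program might check first. Here is a sketch of such a check which does not need the MQ header files: the eye catcher and header length are passed in as parameters. In a real program these would presumably be the structure identifier and size of the MQDLH from the MQ headers; the function name is made up for this example. If the message has not been converted, the eye catcher may be in a different code page, which this sketch does not handle.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* If buf starts with the 4 character eye catcher and is at least
   hdr_len bytes long, set *payload and *payload_len to the data after
   the header and return 1. Otherwise leave them describing the whole
   buffer and return 0. */
int split_header(const unsigned char *buf, size_t len,
                 const char *eye, size_t hdr_len,
                 const unsigned char **payload, size_t *payload_len)
{
    assert(buf != NULL && eye != NULL);
    assert(payload != NULL && payload_len != NULL);

    if (hdr_len >= 4 && len >= hdr_len && memcmp(buf, eye, 4) == 0) {
        *payload = buf + hdr_len;       /* point past the header      */
        *payload_len = len - hdr_len;   /* reduce by the header size  */
        return 1;
    }
    *payload = buf;                     /* no header: use it all      */
    *payload_len = len;
    return 0;
}
```

The return code tells the caller whether it is safe to copy fields such as the format back from the header before putting the message.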

You can teach an old MQ program(mer) new tricks!

I wrote a program which could be used with local bindings on Linux, or as a client. Doing what I have done for 25 years, and following the IBM documentation I had a makefile with a create for each type.

gcc -m64 -o mer me.o -L/opt/mqm/lib64 -Wl,-rpath=/opt/mqm/lib64 -Wl,-rpath=/usr/lib64 -lmqm
gcc -m64 -o merc me.o -L/opt/mqm/lib64 -Wl,-rpath=/opt/mqm/lib64 -Wl,-rpath=/usr/lib64 -lmqic

Where -lmqm was for local bindings, and -lmqic was for client bindings.

For about the last 10 years, you have only needed one executable, not two!

Thanks to Morag Hughson of MQGem who pointed this out and said “You can make a client connection using something linked with mqm.lib. Just set MQ_CONNECT_TYPE to CLIENT.” See here.

I only need one program mer, and do not need the client version merc. I used

export MQ_CONNECT_TYPE=CLIENT
export MQCCDTURL=/home/colinpaice/c/ccdt.json
./mer CSQ9 CP0000

and it worked! (First time)

This support has been there since MQ 7.1, so as long as you have compiled your programs with MQ 7.1 or later you can use this support.

I’ll drop an email to Hursley because the documentation for generating a program says, for example

C client application, 64-bit, non-threaded

gcc -m64 -o amqsputC_64 amqsput0.c -I MQ_INSTALLATION_PATH/inc -L MQ_INSTALLATION_PATH/lib64 -Wl,-rpath=MQ_INSTALLATION_PATH/lib64 -Wl,-rpath=/usr/lib64 -lmqic

C server application, 64-bit, non-threaded

gcc -m64 -o amqsput_64 amqsput0.c -I MQ_INSTALLATION_PATH/inc -L MQ_INSTALLATION_PATH/lib64 -Wl,-rpath=MQ_INSTALLATION_PATH/lib64 -Wl,-rpath=/usr/lib64 -lmqm

It would be good if they told you about this great facility, and not only have it hidden away.

You could just build it once, and set the environment variable.

Using it

The documentation for MQ_CONNECT_TYPE says this is for MQCONNX.

If your application uses MQCONNX, then it will try local, then try as a client (using the MQCCDTURL environment variable), and you do not even need to specify MQ_CONNECT_TYPE. You can force it to use local or client by specifying MQ_CONNECT_TYPE.

My application was using the old style of MQCONN. For this to work I had to specify MQ_CONNECT_TYPE=CLIENT (and the MQCCDTURL).

You might also consider upgrading your application so you use MQCONNX instead of MQCONN. All you need is

MQCNO cno = {MQCNO_DEFAULT}; /* Connect Options*/
cno.Options = … ;
change MQCONN to MQCONNX and add the &cno.

plus testing it (for several weeks) of course.

Convert MQCONN to MQCONNX and you get connection to the local machine or to a client automatically – you do not need the MQ_CONNECT_TYPE.

See, you can get an old application to do new tricks.

Planning for MQ Dead Letter Queue handling.

With MQ, if a message cannot be successfully delivered, it can be put on a Dead Letter Queue for later processing.

You can have multiple queues

The system dead letter queue, where MQ puts messages it cannot process,
Application dead letter queues, where an application can put messages,
The AMS dead letter queue for messages which had errors during get or put, for example a certificate mismatch.

Messages can be put to these queues for a variety of reasons.

Transient problems
- If a channel is putting a message to a queue, and the queue is full, then the channel can put the message to the Dead Letter Queue. The DLQ handler can then try to put the message to the original queue, and retry a number of times after an interval. If the queue full condition was transient, then the DLQ handler is likely to succeed. If an application stops processing a queue, you can quickly get thousands of messages on the DLQ.
- The queue is put disabled. A queue can be set to put disabled, for example to stop messages from going onto a queue during queue maintenance. Once the maintenance has been done the queue can have put enabled.
Administration
- The putting channel is not authorised to put to the queue, so the message gets put to the DLQ. An administrator needs to check to see if the putter is allowed to put the message. If so, fix the security and put the message back on the original queue. If not, remove the message, and educate the developer.
- An AMS protected message has a problem, for example an unauthorised user has signed a message, or the id getting the message does not have a certificate to decrypt a message. You need to resolve any local certificate problems, or send the original message back to the requester saying it is in error.
Application
- The message is too large for the queue. The administrator needs to educate the developer and/or make the queue max message size larger.

You may have a policy that non persistent messages for a particular queue which end up on the dead letter queue should be purged. Persistent messages for another queue should have special treatment.

You may want administrators to be able to look at the meta data about a message, destination queue, MSGID, the list of recipients who can decrypt a message; but not to look at the message content.

Setting up your environment to cover these areas needs considerable planning.

Implementing a solution

You want to keep the main DLQ close to empty. For example, if your DLQ fills up with non persistent inquiries, then putting an important persistent message to the DLQ may fail.

You can use the runmqdlq program on midrange or CSQUDLQH on z/OS, to specify rules for automatic processing of messages on the DLQ.

You can select on attributes like original destination queue name, the reason why the message was on the DLQ, userid in the MQMD; and specify an action

Retry the put to the original queue
Move to another queue
Purge it
Leave it

When a message is processed on the DLQ, the rules are applied, and the action of the first matching rule is applied. For example

DESTQ(MYQUEUE) REASON(MQRC_Q_FULL) ACTION(RETRY) RETRY(5)
DESTQ(MYQUEUE) REASON(MQRC_Q_FULL) ACTION(FWD) FWDQ(MYQUEUEOVERFLOW) HEADER(YES)

This says that if a message’s destination was MYQUEUE, and the reason code was MQRC_Q_FULL, it retries the put to the queue, at most 5 times. After 5 attempts, the first rule is skipped, the second rule is used, and the message is forwarded to the queue MYQUEUEOVERFLOW keeping the DLQ header.

DESTQ(INQ*) PERSIST(MQPER_NOT_PERSISTENT) ACTION(DISCARD)

Non persistent messages with a destination matching INQ* are just discarded.

DESTQ(INQ*) PERSIST(MQPER_PERSISTENT) ACTION(IGNORE)

Persistent messages with a destination matching INQ* are just left on the queue, for some other processing.

If runmqdlq or CSQUDLQH is restarted, then all processing is reset.
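To make the “first matching rule wins” behaviour concrete, here is a simplified model of it in C. This is my own illustration and not how runmqdlq is implemented: the struct, the enum and the function names are invented, it only matches on destination queue and reason code (not persistence), and it only supports a trailing * wildcard.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum action { ACT_NONE, ACT_RETRY, ACT_FWD, ACT_DISCARD, ACT_IGNORE };

struct rule {
    const char *destq;   /* queue name, optionally ending in '*'      */
    int         reason;  /* reason code to match, 0 means any reason  */
    enum action action;  /* what to do when the rule matches          */
    int         retry;   /* maximum attempts, used only by ACT_RETRY  */
};

/* Match a queue name against a pattern with an optional trailing '*'. */
static int name_matches(const char *pattern, const char *name)
{
    size_t n = strlen(pattern);
    if (n > 0 && pattern[n - 1] == '*')
        return strncmp(pattern, name, n - 1) == 0;
    return strcmp(pattern, name) == 0;
}

/* Return the action of the first rule that matches. A RETRY rule is
   skipped once the message has used up its attempts, so a later rule
   for the same queue can take over. */
enum action first_match(const struct rule *rules, size_t nrules,
                        const char *destq, int reason, int attempts)
{
    assert(rules != NULL && destq != NULL);
    for (size_t i = 0; i < nrules; i++) {
        const struct rule *r = &rules[i];
        if (!name_matches(r->destq, destq))
            continue;
        if (r->reason != 0 && r->reason != reason)
            continue;
        if (r->action == ACT_RETRY && attempts >= r->retry)
            continue;
        return r->action;
    }
    return ACT_NONE;
}
```

With a RETRY rule with a limit of 5 followed by a FWD rule for the same queue and reason, a message with fewer than 5 attempts gets ACT_RETRY, and after 5 attempts it falls through to ACT_FWD. This also shows why the order of the rules matters.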

Generic rules

For transient type problems you may want to consider

Non persistent messages for a set of queues get purged, do not even try to put them back on the queue.
Persistent messages for INVOICE* queues get moved to INVOICE_DLQ queue, where you have another DLQ monitor running on the queue.

For administrator type problems

You could pass non persistent messages to an admin_DLQ_NP queue, and have a program which reads the meta data, and prints it to a file, then deletes the original message
You could pass persistent messages to an admin_DLQ_P queue
- have a program which reads the meta data, and prints it to a file, and leaves the message on the queue.
- Using the meta information resolve the problem.
- Have another program which takes the msgid and correlid as input parameters, then puts the message on the original queue. (If there is only one message, you could use the default DLQ handler to do this.)

For AMS problems

This is complicated by having to use a different queue. If the DLQ handler tries to put to the AMS protected queue, it will be “protected” (enciphered) again. You need to put the message to an alias queue, with the original queue as the target. On midrange Java and C clients can disable AMS processing, either by using an environment variable, or through the MQCLIENT.ini file. See here.
This is also complicated by possibly needing access to information in the payload, such as the list of recipients, and decrypting the message to get the DN of the signer.
- Once you have resolved the problem, have another program which takes the msgid and correlid as input parameters, and puts the message on the alias queue (if there is only one message you could use the default DLQ handler to do this).

How do I check if I have got it right?

It is worth putting a process in place to monitor the depth of the dead letter queue. If it does not become empty within a few minutes, display the contents of the queue, and add rules to handle the residual messages.

I do not think that IBM provides a list of return codes of messages that it puts onto the DLQ. I think you’ll have to go through the list (over 500!), and put a rule in place for each one. If an application invents its own return codes, you may need rules for these as well.

My quick look at the list includes

Common problems

2053 (0805) (RC2053): MQRC_Q_FULL
2056 (0808) (RC2056): MQRC_Q_SPACE_NOT_AVAILABLE
2071 (0817) (RC2071): MQRC_STORAGE_NOT_AVAILABLE
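When printing the reason from a DLQ header it helps to show the number, the hex value and the name together. A small sketch is below; the table only has the codes mentioned in this post, and reason_name and format_reason are names I made up. A real program would use the constants from cmqc.h and a much longer table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

struct reason_entry { int code; const char *name; };

/* Only the reason codes mentioned in this post. */
static const struct reason_entry reasons[] = {
    { 2053, "MQRC_Q_FULL" },
    { 2056, "MQRC_Q_SPACE_NOT_AVAILABLE" },
    { 2063, "MQRC_SECURITY_ERROR" },
    { 2071, "MQRC_STORAGE_NOT_AVAILABLE" },
};

/* Return the name for a reason code, or "UNKNOWN" if it is not in the table. */
const char *reason_name(int code)
{
    for (size_t i = 0; i < sizeof reasons / sizeof reasons[0]; i++)
        if (reasons[i].code == code)
            return reasons[i].name;
    return "UNKNOWN";
}

/* Format a reason code as decimal, hex and name into out.
   Returns the value returned by snprintf. */
int format_reason(int code, char *out, size_t outLen)
{
    assert(out != NULL && outLen > 0);
    return snprintf(out, outLen, "%d (0x%04X) %s",
                    code, (unsigned int)code, reason_name(code));
}
```

For example format_reason with 2053 gives the string 2053 (0x0805) MQRC_Q_FULL.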

Building amqsdlq sample Dead Letter Handler.

MQ provides a sample Dead Letter Queue Handler in /opt/mqm/samp/dlq/ (on Ubuntu). It looks and behaves just like runmqdlq. I think trying to extend it, for example to filter on MSGID or CORRELID would be difficult, and it would be easier to write a small program just to process one message with input queue, and pass filter parameters.

Build it

There are no instructions in the IBM documentation on how to build this. I used the following

create a directory to contain the code, for example: mkdir ~/dlq
go to this directory: cd ~/dlq
copy the code from MQ: cp -r /opt/mqm/samp/dlq/* .
make the directory read/write: chmod +w ~/dlq
install a yacc compiler: sudo apt install byacc
issue the make command: make all -f amqodqx.mk

If you get

gcc -m64 -I. -I/opt/mqm/inc -c -o amqodqka.o amqodqka.c
Assembler messages:
Fatal error: can’t create amqodqka.o: Permission denied

Your directory is not read/write. Use the chmod command.

If you get

yacc amqodqma.y
make: yacc: Command not found

You need to install a yacc compiler. I used sudo apt install byacc .

I do not understand why I got this message, because the build for the yacc files is commented out. It may be there is a default make rule for yacc.

YACC (a compiler generator) produces a C file. If you are not changing the keywords, you could rename the .y and .l files, so they are not used, and so use the shipped C files. In this case you will not need the yacc compiler.

Run it

export ODQ_MSG=amqsdlq.msg This is needed for the message catalog
./amqsdlq SYSTEM.PROTECTION.ERROR.QUEUE QMA < /home/colinpaice/mqams/dlq.rul
- for a while it failed with a syntax error in line 1 of the file, I rebuilt it, then suddenly it worked.
- Ctrl-c ended it
./amqsdlq ? gave me a segmentation core dump

I changed the make file to build a client

cc -m64 -o amqsdlqc… -lmqic_r

You should ensure you use a TLS channel if you do run this as a client.

I have a message on the AMS DLQ – what can I do about it?

If AMS has problems with a protected message, AMS can put the message on the SYSTEM.PROTECTION.ERROR.QUEUE queue. This blog post discusses what you can do about it. I consider this a hard problem – not in the same league as trying to simulate the beginnings of the Universe – more like climbing Ben Nevis mountain in Scotland, when you are only used to walking down to the shops.

What are the problems?

There are several problems you need to consider

Why is the message on the queue? Is it a problem with the putting application, or with the getting environment?
Which user had the problem? For example it may not be obvious which application instance had a problem, if applications come in through one channel, and many users have the same MCA userid.
What you need to do about it to get the message reprocessed, and prevent future problems.

Why is the message on the queue

The messages could be on the queue because

The message was signed, but the DN of the signer is not in the setmqspl list of authorised signers (-a). This is an example of an MQ configuration problem.
The user getting a message was not able to verify the signer’s certificate sent in the message, for example it is missing the CA of the signer, or missing the signer’s self signed certificate. This is an example of a user’s configuration problem.
The message was encrypted, but the user getting the message did not have access to a private certificate to allow the message to be decrypted. The user’s DN needs to be added to the recipients when the message is put and enciphered (or the user needs to be stopped from getting messages from this queue). This is an example of the putting application missing configuration information.

What other information is there to help me?

If you know which id had the problem, there should be information in the error logs. For example a problem within a Java JMS client program may be written to mqjms.log.0. A local application may write to the queue manager’s error log, for example /var/mqm/qmgrs/QMA/error/*01*

In the mqjms.log.0 I got

5 April 2021 at 14:53:23 BST[main] com.ibm.mq.ese.prot.MessageProtectionBCImpl
The receiver of this encrypted message is not on the message recipient list ‘CN=ja2,O=aaaa,C=GB’
The certificate of a user that is receiving a message is not on the message RecipientsInfo list.
Verify that the user is on a recipients list in a security policy definition.
5 April 2021 at 14:53:23 BST[main] com.ibm.mq.ese.intercept.JmqiGetInterceptorImpl
The IBM MQ Advanced Message Security Java interceptor failed to unprotect the received message.
An error occurred when the IBM MQ Advanced Message Security Java interceptor was unprotecting the received message.
See subsequent messages in the exception for more details about the cause of the error
5 April 2021 at 14:53:23 BST[main] com.ibm.mq.ese.service.EseMQServiceImpl
The IBM MQ Advanced Message Security interceptor has put a defective message on error handling queue ‘SYSTEM.PROTECTION.ERROR.QUEUE ‘.

(On z/OS the messages are less helpful.)

On the SYSTEM.PROTECTION.ERROR.QUEUE queue there was a message with a Dead Letter Header (DLH) with a reason 2063 0x0000080f MQRC_SECURITY_ERROR.

From the MQMD you can see the time the message was put to the queue, the putting application, and the user identification. This UserIdentifier may have been set, for example, by the channel MCAUSER or CHLAUTH, and so you do not always know where the original request came from.

If you are testing then you will know what caused the error.

If you are in a production like environment, you know the application, and as you will have configured all user keystores the same you may not need to know which specific user caused the problem. If there is a problem with a missing certificate, then you fix the problem, redeploy the keystore to all your users (as part of your automated process) and try again.

How do you tell what the problem is?

Your systems administrator needs a process for extracting meta data about messages, while keeping the application payload protected. You could build a process around displaying the recipients and signer from How do I find the recipients and signer of an AMS message? The systems administrator needs to know

the original queue name with the problem
the time, date, and user that put the message
the msgid and correlid of the message – so if you put it back on the queue, you know which message to process.
from the message, the type of protection: Integrity (it was signed), Encryption (it was encrypted), Privacy (it was encrypted with a signed payload)
any id information from the message, such as recipient DN’s, and the signer DN. See this post.
you may have to have some special processing to decrypt the Privacy payload, just to extract the signer information.

If this process can be automated, then any application content can be kept secure.

With this information and your “up to date application work book” (do you have one of these?), you should be able to identify the problem.

Once you have fixed the root cause of the problem….

The fix may be to change the setmqspl policy to add an authorised signer, or to add a certificate to the recipient’s keystore. You then need to get the message reprocessed.

You get the specific message from the message id and correlid.
You need to remove the DLQ header from the message.
You need some special set up for the queue. If you try to put to the original queue, it will get the AMS protection again, for example re-encrypted or resigned. You need a queue alias so you put to the queue alias bypassing any AMS processing.

There are lots of things you need to consider, which is why I consider this a hard problem.

This application would be a good example of where a message handle is used to “move” the message.

MQCRTMH(hConn,&cmho,&hMsg,&CompCode,&Reason);
gmo.MsgHandle = hMsg;
MQGET(hConn,….);
pmo.Action = MQACTP_FORWARD;
pmo.OriginalMsgHandle = hMsg;
MQPUT(…);

I found Learn to code the MQ Message Property MQI calls from MQGem Software useful in understanding message handles and message properties, and how to delete the DLQ header.