Are your client connections not configured for optimum high availability?

I would expect the answer for most people is – no, they are not configured for optimum high availability.

In researching my previous blog post on which queue manager to connect to, I found that the default options for CLNTWGHT and AFFINITY may not be the best. They were set up to provide consistency with a previous release. The documentation was missing the words “once you have migrated then consider changing these options”. As they are hard to understand, I expect most people have not changed the options.

The defaults are

  • CLNTWGHT(0)
  • AFFINITY(PREFERRED)

I did some testing and found some bits were good, predictable, and gave me high availability; other bits did not.

My recommendations for high availability and consistency are the complete opposite of the defaults:

  • use CLNTWGHT values > 0, with a value which would give you the appropriate load balancing
  • use AFFINITY(NONE)

There are several combinations of settings:

  • all clients use AFFINITY(NONE) – was reliable
    • CLNTWGHT > 0: this was reliable, and gave good load balancing
    • CLNTWGHT >= 0 (some channels with 0): this was reliable, but did not give good load balancing
  • all clients use AFFINITY(PREFERRED) – was consistent, but did not behave as I read the documentation
  • a mixture of clients with AFFINITY PREFERRED and NONE. This gave me weird, inconsistent behavior.

So as I said above my recommendations for high availability are

  • use CLNTWGHT values > 0, and with a value which would give you the appropriate load balancing.
  • use AFFINITY(NONE).

My set up

I had three queue managers set up on my machine: QMA, QMB and QMC.
I used channels
QMACLIENT for queue manager QMA,
QMBCLIENT for queue manager QMB,
QMCCLIENT for queue manager QMC.

The channels all had QMNAME(GROUPX)

A CCDT was used

A batch C program does (MQCONN to QMNAME *GROUPX, MQINQ for queue manager name, MQDISC) repeatedly.
After 100 iterations it prints out how many times each queue manager was used.

AFFINITY(NONE) and clntwght > 0 for all channels

  • QMACLIENT CLNTWGHT(50), chosen 50 % on average
  • QMBCLIENT CLNTWGHT(20), chosen 20 % on average
  • QMCCLIENT CLNTWGHT(30), chosen 30 % on average.

On average the number of times a queue manager was used, was the same as channel_weight/sum(weights).
For QMACLIENT this was 50 /(50+20+30) = 50 / 100 = 50%. This matches the chosen 50% of the time as seen above.
I shut down queue manager QMC, and reran the test and got

  • QMACLIENT CLNTWGHT(50), chosen 71 % on average
  • QMBCLIENT CLNTWGHT(20), chosen 28 % on average
  • QMCCLIENT CLNTWGHT(30) not selected.
    For QMACLIENT the weighting is 50/ (50 + 20) = 71%. So this works as expected.
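
The arithmetic above can be sketched in C. This is only an illustration of channel_weight/sum(weights) taken over the channels whose queue managers are running; the Channel structure and the expected_share function are hypothetical names made up for this example, not part of MQ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structure for illustration; not an MQ data type. */
typedef struct {
    const char *name;      /* channel name                        */
    int         clntwght;  /* CLNTWGHT value, > 0 in this sketch  */
    int         available; /* 1 if the queue manager is running   */
} Channel;

/* Expected percentage of connections for channel `which`:
   its weight divided by the sum of the weights of the available channels. */
double expected_share(const Channel *chl, size_t n, size_t which)
{
    assert(chl != NULL && which < n);
    int total = 0;
    for (size_t i = 0; i < n; i++)
        if (chl[i].available)
            total += chl[i].clntwght;
    if (!chl[which].available || total == 0)
        return 0.0;
    return 100.0 * chl[which].clntwght / total;
}
```

With all three queue managers running this gives 50% for QMACLIENT. With QMC marked as unavailable it gives about 71.4% for QMACLIENT and 28.6% for QMBCLIENT, in line with the figures measured above.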

AFFINITY(NONE) for all queue managers and clntwght >= 0

The documentation in the knowledge centre says any channels with CLNTWGHT=0 are considered first, and they are processed in alphabetical order. If none of these channels is available then the channel is selected as in the CLNTWGHT(>0) case above.

  • QMACLIENT CLNTWGHT(50) not chosen
  • QMBCLIENT CLNTWGHT(0) chosen 100% of the time
  • QMCCLIENT CLNTWGHT(30) not chosen

This shows that the CLNTWGHT(0) was the only one selected.
When CLNTWGHT for QMACLIENT was set to 0, (so both QMACLIENT and QMBCLIENT had CLNTWGHT(0) ), all the connections went to QMA – as expected, because of the alphabetical order.

If QMA was shut down, all the connections went to QMB. Again expected behavior.

With

  • QMACLIENT CLNTWGHT(0)
  • QMBCLIENT CLNTWGHT(20)
  • QMCCLIENT CLNTWGHT(30)

and QMA shut down, the connections were in the ratio of 20:30 as expected.

Summary: If you want all connections (from all machines) to go to the same queue manager, then you can do this by setting CLNTWGHT to 0.

I do not think this is a good idea, and suggest making all CLNTWGHT values > 0 to give workload balancing.

Using AFFINITY(PREFERRED)

The documentation for AFFINITY(PREFERRED) is not clear.
For AFFINITY(NONE) it takes the list of clients with CLNTWGHT(0), sorts the list by channel name, and then goes through this list till it can successfully connect. If this fails, then it picks a channel at random depending on the clntwghts.
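
The AFFINITY(NONE) selection just described can be sketched as a small simulation. This is not MQ's code; it only mimics the described behavior. The Chl structure, the pick_channel function and the `available` flag are hypothetical (a real client finds out whether a channel works by trying to connect), and the caller supplies the random number, for example from rand().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical structure for illustration; not an MQ data type. */
typedef struct {
    const char *name;      /* channel name                      */
    int         clntwght;  /* CLNTWGHT, 0 to 99                 */
    int         available; /* 1 if a connect would succeed      */
} Chl;

/* Returns the index of the chosen channel, or -1 if none can be used.
   `roll` is any random number supplied by the caller. */
int pick_channel(const Chl *chl, size_t n, unsigned roll)
{
    assert(chl != NULL);

    /* Step 1: CLNTWGHT(0) channels first, in channel name order. */
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (chl[i].clntwght == 0 && chl[i].available &&
            (best < 0 || strcmp(chl[i].name, chl[best].name) < 0))
            best = (int)i;
    }
    if (best >= 0)
        return best;

    /* Step 2: random choice biased by CLNTWGHT among the rest. */
    unsigned total = 0;
    for (size_t i = 0; i < n; i++)
        if (chl[i].clntwght > 0 && chl[i].available)
            total += (unsigned)chl[i].clntwght;
    if (total == 0)
        return -1;

    unsigned target = roll % total;
    for (size_t i = 0; i < n; i++) {
        if (chl[i].clntwght <= 0 || !chl[i].available)
            continue;
        if (target < (unsigned)chl[i].clntwght)
            return (int)i;
        target -= (unsigned)chl[i].clntwght;
    }
    return -1; /* not reached: target is always below total */
}
```

With QMBCLIENT at CLNTWGHT(0) the sketch always returns QMBCLIENT, and with QMACLIENT CLNTWGHT(0) unavailable it spreads the choice 20:30 between QMBCLIENT and QMCCLIENT, which is the behavior seen in the tests above.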

My interpretation of how PREFERRED works is

  • it builds a list of CLNTWGHT(0) sorted alphabetically,
  • then creates another list of the other channels selected at random with a bias of the CLNTWGHT and keeps that list for the duration of the program (or until the CCDT is changed).
  • Any threads within the process will use the same list.
  • For an application doing MQCONN, MQDISC and MQCONN it will access the same list.
  • With the client channels defined above, on different machines, or in different application instances, you may get a different list when CLNTWGHT > 0.

For example on different machines, or different application instances the lists may be:

  • QMACLIENT, QMBCLIENT, QMCCLIENT
  • QMBCLIENT, QMACLIENT, QMCCLIENT
  • QMACLIENT, QMBCLIENT, QMCCLIENT (same as the first one)
  • QMCCLIENT, QMACLIENT, QMBCLIENT

I’ll ignore the CLNTWGHT(0) as these would be at the front of the list in alphabetical order.

With

  • QMACLIENT CLNTWGHT(50) AFFINITY(PREFERRED)
  • QMBCLIENT CLNTWGHT(20) AFFINITY(PREFERRED)
  • QMCCLIENT CLNTWGHT(30) AFFINITY(PREFERRED)

According to the documentation, if I run my program I would expect 100% of the connections to one queue manager. This is what happened.

If I ran the job many times, I would expect the queue managers to be selected according to the CLNTWGHT.

I ran my program 10 times, in different terminal windows, and each time QMC got 100% of the connections. This was not what was expected!

I changed the QMBCLIENT CLNTWGHT from 20 to 10 and reran my program, and now all of my connections went to QMA!

With the QMBCLIENT CLNTWGHT 18 all the connections went to QMA, with QMBCLIENT CLNTWGHT 19 all the connections went to QMC.

This was totally unexpected behavior and not consistent with the documentation.

I would not use AFFINITY(PREFERRED) because it is unreliable and unpredictable. If you want to connect to the same queue manager specify the channel name in the MQCD and use mqcno.Options = MQCNO_USE_CD_SELECTION.

Having a mix of AFFINITY PREFERRED and NONE

With

  • QMACLIENT CLNTWGHT(50) AFFINITY(PREFERRED)
  • QMBCLIENT CLNTWGHT(20) AFFINITY(NONE)
  • QMCCLIENT CLNTWGHT(30) AFFINITY(NONE)

All of the connections went to QMA.

With

  • QMACLIENT CLNTWGHT(50) AFFINITY(NONE)
  • QMBCLIENT CLNTWGHT(20) AFFINITY(PREFERRED)
  • QMCCLIENT CLNTWGHT(30) AFFINITY(NONE)

there was a spread of connections as if the PREFERRED was ignored.

When I tried to validate the results – I got different results. (It may be something to do with the first or last object altered or defined).

Summary: With a mix of AFFINITY values NONE and PREFERRED, it is hard to predict what will happen, so this situation should be avoided.

How do I know which queue manager to connect to ?

Question: How difficult can it be to decide which queue manager to connect to?

Answer: For the easy, it is easy, for the hard it is hard.
I would not be surprised to find that in many applications the MQCONN(x) are not coded properly!


That is a typical question and answer from me – but let me go into more detail so you understand what I am talking about.
If you have only one queue manager then it is easy to know which queue manager to connect to – it is the one and only queue manager.
If you have more than one – it gets more complex. If your application has just committed a funds transfer request and the connection is broken

  • you may just decide to connect to any available queue manager, and ignore a possibly partial funds transfer request
  • or you might wait for a period trying to connect to the same queue manager, and then give up, and connect to another, and later worry about any orphaned message on the queue.

You now see why the easy scenario is easy, and for the hard one, you need to do some hard thinking and some programming to get the optimum response.

There is an additional complexity that when you connect to the same instance – it may be a highly available queue manager, and it may have restarted somewhere else. For the purposes of this blog post I’ll ignore this, and treat it as the same logical queue manager.

I had lots of help from Morag who helped me understand this topic, and gave me the sample code.

You have only one queue manager.

This is easy, you issue an MQCONN for the queue manager. If the connect is not successful, the program waits for a while and then retries. See – I said it was easy.

You have more than one queue manager, and getting a reply back is not important.

For example, you are using non persistent messages.

Your application can decide which queue manager it tries to connect to, or you can exploit queue manager groups in the CCDT.

On queue manager QMA you can define client channels for it and also for queue manager QMB

DEF CHL(QMA) CHLTYPE(CLNTCONN) QMNAME(GROUPX) 
CONNAME(LINUXA) CLNTWGHT(50)…
DEF CHL(QMB) CHLTYPE(CLNTCONN) QMNAME(GROUPX)
CONNAME(LINUXB) CLNTWGHT(50)…

DEF CHL(QMA) CHLTYPE(SVRCONN) ...

On Unix these are automatically put into the /var/mqm/qmgrs/../@ipcc/AMQCLCHL.TAB file. This is a binary file, and can be FTPed to the client machines that need it.

You can use the environment variables MQCHLLIB to specify the directory where the table is located, and MQCHLTAB to specify the file name of the table (it defaults to AMQCLCHL). See here for more information.

FTP the files in binary to your client machine, for example into ~/mq/.
I did
export MQCHLLIB=/home/colinpaice/mq
export MQCHLTAB=AMQCLCHL.TAB

I then used the command
set | grep MQ
to make sure those variables are set, and did not have MQSERVER set.

Sample MQCONN code (from Morag)….

MQLONG  CompCode, Reason;
MQHCONN hConn = MQHC_UNUSABLE_HCONN;
char * QMName = "*GROUPX";
MQCONN(QMName,
&hConn,
&CompCode,
&Reason);
// and MQ will pick one of the two entries in the CCDT.

The application connected with queue manager name *GROUPX . Under the covers the MQ code found the channel connections with QMNAME of GROUPX and picked one to use. The “*” says do not check the name of the queue manager when you actually do the connect. If you omit the “*” you will get return code MQRC_Q_MGR_NAME_ERROR 2058 (080A in hex) because “GROUPX” did not match the queue manager name of “QMA” or “QMB”. I stopped QMA, and reconnected the application, and it connected to QMB as expected.
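
Reason codes tend to be quoted in both decimal and hex (2058 is 080A), so a small helper for error messages can save some searching. This is a minimal sketch; format_reason is a hypothetical name and nothing here comes from the MQ headers.

```c
#include <assert.h>
#include <stdio.h>

/* Format a reason code as "2058 (080A) (RC2058)".
   Returns the value from snprintf (the length needed, excluding the
   terminating null), so the caller can detect truncation. */
int format_reason(char *buf, size_t len, long reason)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%ld (%04lX) (RC%ld)",
                    reason, (unsigned long)reason, reason);
}
```

For example, calling format_reason(text, sizeof text, Reason) after a failed MQCONN puts all three forms of the code into the text you then print.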

Common user error: When I tried connecting with queue manager name QMA, this failed with MQRC_Q_MGR_NAME_ERROR because there were no channel definitions with QMNAME value QMA. This was obvious once I had taken a trace, looked at the trace, had a cup of tea and a biscuit, and remembered I had fallen over this before. So this may be the first thing to check if you get this return code.

Using channels defined with the same QMNAME, if your connection breaks, you reconnect with the same queue manager name “*GROUPX” and you connect to a queue manager if there is one available. You can specify extra options to bias which one gets selected. See CLNTWGHT and AFFINITY. See the bottom of this blog entry.

You can use MQINQ to get back the name of the queue manager you are actually connected to (so you can put it in your error messages).

//   Open the queue manager object to find out its name 
od.ObjectType = MQOT_Q_MGR; // open the queue manager object
MQOPEN(Hcon, // connection handle
  &od, // object descriptor for queue
  MQOO_INQUIRE + // open it for inquire          
  MQOO_FAIL_IF_QUIESCING, // but not if MQM stopping      
  &Hobj, // returned object handle
  &OpenCode, // MQOPEN completion code
  &Reason); // reason code
// report reason, if any
if (Reason != MQRC_NONE)
{
printf("MQOPEN of qm object rc %d\n", Reason);
.....
}
// Now do the actual INQ
Selector = MQCA_Q_MGR_NAME;
MQINQ(Hcon, // connection handle
  Hobj, // object handle for q manager
  1, // inquire only one selector
  &Selector, // the selector to inquire
  0, // no integer attributes are needed
  NULL, // so no integer buffer
  MQ_Q_MGR_NAME_LENGTH, // inquiring a q manager name
  ActiveQMName, // the buffer for the name
  &CompCode, // MQINQ completion code
  &Reason); // reason code

printf("Queue manager in use %s\n",ActiveQMName);

You have more than one queue manager, and getting a reply back >is< important

Your application should have some logic to handle the case when your queue manager is running normally, there is a problem in the back end, and so you do not get your reply message within the expected time. Typical logic for when the MQGET times out is:

  • Produce an event saying “response not received”, to alert automation that there may be a problem somewhere in the back end
  • Produce an event saying “there is a piece of work that needs special processing – to manually redo or undo – update number…..”.
    • At a later time a program can get the orphaned message and resolve it.
    • You do not want an end user getting a message “The status of the funds transfer request to Colin Paice is … unknown” because the reply message is sitting unprocessed on the queue.
    • Note: putting a message to a queue may not be possible as the application may not be connected to a queue manager.

When deciding to connect to any available queue manager, or connect to a specific queue manager, there are two key options in mqcno.Options field:

  • MQCNO_CD_FOR_OUTPUT_ONLY. This means, do not use any data in the passed in MQCD <as the field description says – use it for output only>, but pick a valid and available channel from the CCDT, and return the details.
  • MQCNO_USE_CD_SELECTION. This means, use the information in the MQCD to connect to the queue manager

Sample code (from Morag) showing MQCONNX

MQLONG  CompCode, Reason;
MQHCONN hConn = MQHC_UNUSABLE_HCONN;
MQCNO cno = {MQCNO_DEFAULT};
MQCD cd = {MQCD_CLIENT_CONN_DEFAULT};
char * QMName = "*GROUPX";
cno.Version = MQCNO_VERSION_2;
cno.ClientConnPtr = &cd;
// Main connection - choose freely from the CCDT
cno.Options = MQCNO_CD_FOR_OUTPUT_ONLY;
MQCONNX(QMName,
&cno,
&hConn,
&CompCode,
&Reason);
: :
// Oops, I really need to go back to the same connection to continue.

MQDISC(...); // without this you get queue manager name error

// Using same MQCNO as earlier, it already has MQCD pointer set.

cno.Options = MQCNO_USE_CD_SELECTION;
MQCONNX(QMName,
&cno,
&hConn,
&CompCode,
&Reason);

Let me dig into a typical scenario to show the complexity

  • set mqcno.Options = MQCNO_CD_FOR_OUTPUT_ONLY
  • MQCONN to any queue manager
  • MQPUT1 of a persistent message within syncpoint
  • set mqcno.Options = MQCNO_USE_CD_SELECTION, as you now want the application to connect to the same queue manager if there is a problem
  • MQCMIT. After this you want to connect to the specific queue manager you were using
  • MQGET with WAIT
  • MQCMIT
  • set mqcno.Options = MQCNO_CD_FOR_OUTPUT_ONLY. Because the application is not in a business unit of work it can connect to any queue manager.

The tricky bit is in the MQGET with WAIT. If your queue manager needs to be restarted you need to know how long this is likely to take. It may be 5 seconds, it may be 1 minute depending on the amount of work that needs to be recovered. (So make sure you know what this time is.)

Let’s say it typically takes 5 seconds between failure of the queue manager the application is connected to, and restart complete. You need some logic like

mqget with wait..
problem....
failure_time = time_now()
waitfor = 5 seconds
mqcno.Options = MQCNO_USE_CD_SELECTION
loop:
MQCONN to specific queue manager
If this worked goto MQCONN_worked_OK
try_time = time_now()
If try_time - failure_time > waitfor + 1 second goto problem;
sleep 1 second
go to loop:
MQCONN_worked_OK:
MQOPEN the reply to queue
Reissue the MQGET with wait

problem:
report problem to automation
report special_processing_needed .... msgid...
mqcno.Options = MQCNO_CD_FOR_OUTPUT_ONLY
Go to start of program and connect to any queue manager
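
The retry loop in the logic above can be sketched in C without any MQ calls, which makes the timing logic easy to try out. Everything named here is hypothetical: ReconnectOps, reconnect_within and the FakeQmgr simulation are made up for this illustration. In a real client try_connect would issue MQCONNX with mqcno.Options = MQCNO_USE_CD_SELECTION, and now and pause would use the system clock and sleep.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callbacks so the loop can be shown without MQ. */
typedef struct {
    int  (*try_connect)(void *ctx);          /* returns 1 when connected  */
    long (*now)(void *ctx);                  /* current time in seconds   */
    void (*pause)(void *ctx, long seconds);  /* wait before the next try  */
    void *ctx;
} ReconnectOps;

/* Returns 1 if the same queue manager was reached again,
   0 if the caller should take the "problem" path. */
int reconnect_within(const ReconnectOps *ops, long waitfor)
{
    assert(ops != NULL && waitfor >= 0);
    long failure_time = ops->now(ops->ctx);
    for (;;) {
        if (ops->try_connect(ops->ctx))
            return 1;                        /* MQCONN_worked_OK */
        if (ops->now(ops->ctx) - failure_time > waitfor + 1)
            return 0;                        /* problem          */
        ops->pause(ops->ctx, 1);             /* sleep 1 second   */
    }
}

/* A simulated queue manager which is back up at time up_at (seconds). */
typedef struct { long clock; long up_at; } FakeQmgr;

int  fake_connect(void *ctx)       { FakeQmgr *q = ctx; return q->clock >= q->up_at; }
long fake_now(void *ctx)           { FakeQmgr *q = ctx; return q->clock; }
void fake_pause(void *ctx, long s) { FakeQmgr *q = ctx; q->clock += s; }
```

With waitfor set to 5, a simulated queue manager that restarts after 3 seconds is reached again, while one that takes 60 seconds makes the function return 0, and the application would then report the problem and connect to any queue manager.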

If you thought the discussion above was complex, it gets worse!

I had a long think about where to put the set mqcno.Options = MQCNO_USE_CD_SELECTION.  My first thoughts were to put it after the first MQCMIT, but this may be wrong.

With the logic

MQCONN
MQPUT1
MQCMIT

If the MQCMIT fails, it could have failed going to the queue manager, so the commit request did not actually get to the queue manager and the work was rolled back; or the commit could have worked, but the response did not get to your application.

The application should reconnect to the same queue manager and issue the MQGET WAIT. If the message arrives then the commit worked; if the MQGET times out, treat this as the MQGET WAIT timed out case (see above), and produce alerts. This is why I decided to put the set mqcno.Options = MQCNO_USE_CD_SELECTION before the commit. You could just as easily have had logic which checks the return code of the MQCMIT and then sets it.

A bit more detail on what is going on.

You can treat the MQCD as a black box object which you do not change, nor need to look into. I found it useful to see inside it. (So I could report problems with the channel name etc). The example below shows the fields displayed as a problem is introduced.

Before MQCONNX
set MQCNO_CD_FOR_OUTPUT_ONLY
pMQCD->ChannelName is '' - this is ignored
pMQCD->QMgrName is '' - this is ignored
QMName is '*GROUPX'. -this is needed
==MQCONNX had return code 0 0
pMQCD->ChannelName is 'QMBCLIENT'.
pMQCD->QMgrName is '';
QMName is '*GROUPX'.
MQINQ QMGR queue manager name gave QMB
Sleep
during the sleep endmqm -i QMB and strmqm QMB
After sleep
MQOPEN of queue manager object ended with reason code MQRC_CONNECTION_BROKEN = 2009.
Issue MQDISC, this ended with reason code 2009

set MQCNO_USE_CD_SELECTION
pMQCD->ChannelName is 'QMBCLIENT' - this is needed
pMQCD->QMgrName is ''
QMName is '*GROUPX'.
MQCONNX return code 0 0
MQINQ queue manager name is QMB

Anything else on clients?

It is good practice to periodically have the clients disconnect and reconnect to do work load balancing. For example

You have two queue managers QMA and QMB. On Monday morning between 0800 and 1000 QMA is shut down for essential maintenance. All the clients connect to QMB. QMA is restarted at 1000 – but does no work, because all the clients are connected to QMB. If your clients disconnect and reconnect then over time some will connect to QMA.

It is a good idea to have a spread of times before they disconnect, so if 100 clients connected at 0900, they disconnect and reconnect between 10pm and 3am to avoid all 100 disconnecting and reconnecting at the same time.
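
One way to get that spread is for each client to pick its own reconnect time: a base interval plus a random offset within a window. A minimal sketch, with the hypothetical name reconnect_after and the random number supplied by the caller (for example from rand(), seeded differently on each client):

```c
#include <assert.h>

/* Seconds to stay connected before disconnecting and reconnecting.
   The result is in the range base_secs to base_secs + spread_secs - 1.
   `roll` is any random number supplied by the caller. */
long reconnect_after(long base_secs, long spread_secs, unsigned roll)
{
    assert(base_secs >= 0 && spread_secs >= 0);
    if (spread_secs == 0)
        return base_secs;
    return base_secs + (long)(roll % (unsigned long)spread_secs);
}
```

For the example above, a client which connected at 0900 would use a base of 13 hours (46800 seconds, which takes it to 10pm) and a spread of 5 hours (18000 seconds, up to 3am).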

To get the spread of connections to the various queue managers, you need to use CLNTWGHT with a non zero value. If you omit CLNTWGHT, or specify a value of 0, then the channel chosen is the first alphabetically, in my case they would all go to QMA, and not to QMB.

I feel there is enough material on this for another blog post.

Not for humans but for search engines

As I stumble around trying to get things to work, I keep falling over little problems, so I thought I would blog the problems and solutions. So as the title says – not for humans but for search engines. If you hit these problems, then the search engine finds it for you.

 

AMQ9660 AMQ9660E: SSL key repository: password stash file absent or unusable.
Issue DIS QMGR SSLKEYR to display the location of the files. Check the value is accurate – and has mixed case if applicable.  Check the file name is specified without a suffix.

AMQ6125 AMQ6125E: An internal IBM MQ error has occurred.
AMQ6184 AMQ6184W: An internal IBM MQ error has occurred on queue manager QMA.

Check your client channel has QMNAME() with a value

AMQ9637 AMQ9637E  Channel is lacking a certificate.
DISPLAY QMGR CERTLABL and check it is in the keystore.

AMQ9518 AMQ9518E : File ‘….AMQCLCHL.TAB’ not found.
In mqclient.ini you specified a fully qualified file name. You need to specify

ChannelDefinitionDirectory=/var/mqm/qmgrs/QMA/@ipcc
ChannelDefinitionFile=AMQCLCHL.TAB

 

AMQ9641E AMQ9641  Remote CipherSpec error for channel ‘… to host ‘.

EXPLANATION:The remote end of channel…  on host ‘… has
indicated a CipherSpec error ‘SSLCIPH(‘ ‘) -> SSLCIPH(????)’. The channel did
not start.

Defining a channel with MQSERVER environment variable or having mqclient.ini with Channels:
ServerConnectionParms=…

The doc says:  The ServerConnectionParms attribute defines only a simple channel; you cannot use it to define a TLS channel or a channel with channel exits

 

MQRC_EPH_ERROR 2420 (0974) (RC2420)

Can also be due to invalid PCF ( see this return code from MQPUT1 page).

My problem was the mqmd.CodedCharSetId was set to -1 ( MQCCSI_EMBEDDED) because I copied it from an input message.

 

MQRC_Q_MGR_NAME_ERROR: 2058 (080A, 80A) (RC2058)

I got this using the new CCDT in JSON format.

I used tail -n50 /var/mqm/errors/*01*|less
and this reported
AMQ9518E: File ‘/var/mqm/AMQCLCHL.TAB’ not found.
EXPLANATION: The program requires that the file ‘/var/mqm/AMQCLCHL.TAB’ is present and available.
There is some unclear description which gave me the hint about mqclient.ini.

I had set up ccdt.json, but had not set up mqclient.ini.  I created this file with
CHANNELS:
ChannelDefinitionDirectory=.
ChannelDefinitionFile=ccdt.json

MQRC_HOST_NOT_AVAILABLE: 2538 (09EA) (RC2538)

  • You have specified a channel in MQCONNX and this is not in the CCDT, so if you have a channel called QMACLIENT, and use “QM” or “QM*” both will give MQRC_HOST_NOT_AVAILABLE.
  • You had a network problem, for example the application gets MQRC_CONNECTION_BROKEN. If the next MQ verb the application issues is MQCONN or MQCONNX this will fail with MQRC_HOST_NOT_AVAILABLE. You need to issue MQDISC, or retry the MQCONN(X) a second time.
  • You specified a connection address like 127.0.0.1:1414 when it was expecting 127.0.0.1(1414).
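
If your connection addresses come from configuration written as host:port, a small helper can rewrite them into the host(port) form before they go into the MQCD. This is a minimal sketch with a hypothetical function name, to_conname. It splits on the last colon, so it does not handle IPv6 addresses, which contain colons themselves.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Rewrite "host:port" as "host(port)".
   Returns 0 on success, -1 if there is no host or port in the input,
   or if the output buffer is too small. */
int to_conname(const char *in, char *out, size_t outlen)
{
    assert(in != NULL && out != NULL);
    const char *colon = strrchr(in, ':');
    if (colon == NULL || colon == in || colon[1] == '\0')
        return -1;
    int hostlen = (int)(colon - in);
    int n = snprintf(out, outlen, "%.*s(%s)", hostlen, in, colon + 1);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```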

MQRC_UNKNOWN_OBJECT_QMGR: 2086 (0826) (RC2086) with a client application

This can be caused when using a client connection and specifying a queue manager name of the format “*name” (for availability) . The application takes this queue manager name, and uses it in the MQOD.
If the first character of the Queue Manager Name is “*” then MQINQ should be used to retrieve the actual queue manager name, or do not use the “*name”.
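
One way to avoid copying a “*name” into the MQOD is to build the 48 character queue manager name field through a helper which leaves it blank when the connect name starts with “*”. This is a sketch under the assumption that a blank queue manager name in the MQOD means the queue manager you are connected to; set_object_qmgr_name and QMGR_NAME_LENGTH are hypothetical names (48 is the value of MQ_Q_MGR_NAME_LENGTH).

```c
#include <assert.h>
#include <string.h>

#define QMGR_NAME_LENGTH 48   /* same value as MQ_Q_MGR_NAME_LENGTH */

/* Fill a 48 byte, blank padded name field such as MQOD.ObjectQMgrName.
   A connect name starting with '*' is a group name, not a real queue
   manager name, so the field is left blank in that case. */
void set_object_qmgr_name(char field[QMGR_NAME_LENGTH], const char *connect_name)
{
    assert(field != NULL && connect_name != NULL);
    memset(field, ' ', QMGR_NAME_LENGTH);
    if (connect_name[0] == '*')
        return;
    size_t len = strlen(connect_name);
    if (len > QMGR_NAME_LENGTH)
        len = QMGR_NAME_LENGTH;
    memcpy(field, connect_name, len);
}
```

If you need the real queue manager name, for example for error messages, MQINQ is still the way to get it.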

MQRC_NOT_AUTHORIZED: 2035 (07F3) (RC2035) with MQCONNX

Trying to use MQCONNX to connect to a queue manager. The info from the Knowledge centre and the AMQ message say a blank userid or password was given. I also found the following can cause the same return code

  • mqcno.SecurityParmsPtr = 0;
  • csp.CSPPasswordLength = 0;
  • csp.CSPUserIdLength = 0;
  • csp.CSPPasswordPtr= 0;
  • csp.CSPUserIdPtr = 0;
  • csp.AuthenticationType != MQCSP_AUTH_USER_ID_AND_PWD;

MQRC_ENVIRONMENT_ERROR: 2012 (07DC) (RC2012) with MQCONNX

Trying to use MQCONNX with MQCNO_RECONNECT_Q_MGR or MQCNO_RECONNECT;

  • Not using threaded application. My C program was built with -lmqic instead of -lmqic_r -lpthread
  • SHRCONV = 0 on the channel definitions

MQRC_Q_MGR_NAME_ERROR: 2058 (080A) (RC2058)

  • export MQCHLLIB not pointing to correct location
  • export MQCHLTAB pointing to the wrong name, or not set and AMQCLCHL.TAB not found in the location pointed to by MQCHLLIB
  • remember to update your .profile so this does not happen again
  • you are using a CCDT and passed in a QMNAME of XXXX, for all channels with QMNAME XXXX none could connect to the queue manager in the conname.
  • You think you were using a mqclient.ini file … but are now in a different directory
  • You tried to connect with the queue manager name, and need to connect to the QM group name.
  • You forgot the * in front of the queue manager name when using groups.

MQRC_KEY_REPOSITORY_ERROR: 2381 (094D) (RC2381)

  • MQSSLKEYR not set to the keystore path and file name
  • you specified …/key.kdb instead of /key without the .kdb
  • remember to update your .profile so this does not happen again

 

MQRC_OPTIONS_ERROR: 2046 (07FE) (RC2046)

During MQCONNX: mqcno.Options = MQCNO_CD_FOR_OUTPUT_ONLY + MQCNO_USE_CD_SELECTION;

Solved it using

  • mqcno.Options = MQCNO_USE_CD_SELECTION
  • or
  • mqcno.Options = MQCNO_CD_FOR_OUTPUT_ONLY
  • but not both

Trying to connect as a client, but the queue manager specified does not have a matching entry in the CCDT.  You might need to specify *qmgrname to say pick any of these.

MQRC_CD_ERROR: 2277 (08E5) (RC2277)

I received a message in the /var/mqm/errors/*.LOG saying

AMQ9498E: The MQCD structure supplied was not valid.

EXPLANATION: The value of the ‘ChannelName’ field has the value ‘0’. This value is invalid for the operation requested.

This is only partially true. If you specify mqcno.Options=MQCNO_CD_FOR_OUTPUT_ONLY, this returns the name of the channel to you. In this case specifying a blank channel name is valid. If this options value is not specified, then a channel name is required.

AMQ9202E: Remote host not available, retry later.

EXPLANATION:
The attempt to allocate a conversation using TCP/IP to host ” for channel
QMZZZ was not successful. However the error may be a transitory one and it may be possible to successfully allocate a TCP/IP conversation later.

This is not strictly accurate.

In my MQCONNX I specified a channel name of QMZZZ which did not exist in the Client Channel Definition Table (CCDT).

  • Check the channel name in ClientConn.ChannelName
  • Specify mqcno.Options = MQCNO_CD_FOR_OUTPUT_ONLY so it ignores what is in the channel, and picks one from the entries in the CCDT.

AMQ9498E: The MQCD structure supplied was not valid.

EXPLANATION:
The value of the ‘ChannelName’ field has the value ‘0’. This value is invalid for the operation requested.
ACTION:
Change the parameter and retry the operation.

  • I got this when I specified a blank (not ‘0’ ) in the ChannelName field. If I specified mqcno.Options = MQCNO_CD_FOR_OUTPUT_ONLY I did not get this error message, as the specified channelname value is ignored. I fixed the problem by changing the MQCNO, not the MQCD

PCF: MQRCCF_MSG_LENGTH_ERROR: 3016 (0BC8) (RC3016)

I got this when using PCF and got my lengths mixed up, for example StrucLength was longer than the structure.

PCF: MQRCCF_CFST_PARM_ID_ERROR: 3015 (0BC7) (RC3015)

I got this when I issued INQUIRE_Q and passed in a channel name

PCF:MQRC_UNEXPECTED_ERROR 2195 (0893) RC2195

I also got back section MQIACF_ERROR_IDENTIFIER (1013) with a value of 2031619. I can’t find what this means.
My problem was I had specified an optional section – but not a required one.

PCF:MQRCCF_CFST_PARM_ID_ERROR 3015 (0BC7) RC3015

I got this when using MQCMD_INQUIRE_Q, and I had specified MQCACF_Q_NAMES instead of MQCA_Q_NAME ( no ‘s’).

If you look at MQCMD_INQUIRE_Q  it lists the valid options, and MQCA_Q_NAME is listed – but not MQCACF_Q_NAMES.

 

Oracle WebLogic

<BEA-320084> BEA-320084 The user principals=[] does not have authorization to view the logs.

I got this when using JMX to access the data.  The userid had not been set up to get this log data.  See here.

Specifically I got these trying to access WebLogic Type=WLDFAccessRuntime JMX data.

com.bea:ServerRuntime=...,Name=Accessor, Type=WLDFAccessRuntime,WLDFRuntime=WLDFRuntime
com.bea:ServerRuntime=...,Name=DataSourceLog, Type=WLDFDataAccessRuntime,...
com.bea:ServerRuntime=...,Name=DomainLog, Type=WLDFDataAccessRuntime,...
com.bea:ServerRuntime=...,Name=EventsDataArchive,Type=WLDFDataAccessRuntime,...
com.bea:ServerRuntime=...,Name=HTTPAccessLog, Type=WLDFDataAccessRuntime,...
com.bea:ServerRuntime=...,Name=HTTPAccessLog, Type=WLDFDataAccessRuntime,...
com.bea:ServerRuntime=...,Name=ServerLog, Type=WLDFDataAccessRuntime,...

<BEA-240003> BEA-240003

<Administration Console encountered the following error: weblogic.application.ModuleException: The following exception occurred while processing annotations: No EJBs found in the ejb-jar file ..   I got this when redeploying an MDB, using Redeploy this application using the following deployment files…  If I used Update this application in place … it worked.

 

<BEA-149265>  BEA-149265 

Failure occurred in the execution of deployment request with …. Error is: “weblogic.application.ModuleException: java.lang.ClassCastException: com.ibm.mq.connector.DefaultRuntimeHelperImpl cannot be cast to com.ibm.mq.connector.JCARuntimeHelper”

I got this when redeploying the MQ resource adapter in WebLogic, and selecting Redeploy this application using the following deployment files:

When I used Update this application in place with new deployment plan changes. (A deployment plan must be specified for this option) it worked successfully.

 

Weblogic. Remember to update your deployment to reflect the new plan when you are finished with your changes.

You have changed a configuration, such as a Resource Adapter or MDB.  You have to restart the server, or redeploy the application to pick up changes.

 

Java jar command java.io.FileNotFoundException: -C (No such file or directory)

jar cvfm abc.jar -C /home/colinpaice/xyz/  . 
java.io.FileNotFoundException: -C (No such file or directory)

In the jar cvfm command, the m says use the manifest provided.  In this case -C was taken as the manifest – and so was not found.
Solution

jar cvfm abc.jar ./META-INF/MANIFEST.MF -C …

 

JMXQuery

Error connecting to JMX endpoint: Failed to retrieve RMIServer stub: javax.naming.ServiceUnavailableException [Root exception is java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is: java.net.ConnectException: Connection refused (Connection refused)]

The web server was not set up to listen on the specified port – or the web server was not active.

 

Using Java and MQ

Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /opt/mqm/java/lib/libmqjbnd.so which might have disabled stack guard.

Ensure  you have -Djava.library.path=/opt/mqm/java/lib64 not -Djava.library.path=/opt/mqm/java/lib

MQWEB gets killed off after starting.

Gwydion kindly helped me with the following…. ( his words).

strmqweb is a shell script that sets some environment variables then calls the server start command to start Liberty. But, when you run it with the remote shell command, such as Ansible shell module, when the shell terminates it also kills the JVM child process, which kills the server.   The solution is to use nohup when running strmqweb, as described (half way down the page) here: https://ansibledaily.com/execute-detached-process-with-ansible.

Migrating your queue manager in an enterprise.

Migrating an isolated queue manager is not too difficult and some of it is covered in the MQ Knowledge Center.

Since my first blog post on this topic, I’ve had some comments asking for more detailed steps… so I’ve added a section at the bottom.

Below are some other things you need to think about when working in an enterprise when you have test and production, multiple queue managers per application (for HA or load balancing) and multiple applications. I am focusing on midrange MQ, and not z/OS though many of the topics are relevant to all platforms.

Consider the scenario where you have 4 queue managers
QMA and QMB supporting MOBILE and STATUS applications,
QMX and QMY supporting PAYROLL application.

You want a low risk approach, so you decide to upgrade the STATUS application first. This application uses QMA and QMB which are also used by business critical application MOBILE. This would be a high risk change.
It would be safer to first migrate application PAYROLL on QMX and QMY.

Looking at QMX and QMY.
You could migrate both queue managers the same weekend – this would be least work, but has a risk that you do not have a good fall back plan if it does not work as expected.
You could migrate QMX this weekend, and QMY next weekend if there were no problems found.
If QMX has problems you can continue using QMY while you resolve problems. If QMX has problems, then if QMY has problems or is shut down you have an availability issue, so you may want to define a new environment with QMZ (and the web server etc – so not a trivial task).

As well as production QMX and QMY you have test systems: You need to plan to migrate and test these pre-production systems before considering migrating production. While the test and production levels of MQ are different, you may want to freeze making application changes, and factor this in the plan.

If you have a machine with one MQ level of code, and multiple queue managers on it, you cannot just migrate one queue manager, as you delete the MQ executables and install the new version. You can use multiple installed levels of MQ – but you may have to migrate to this before exploiting it. See Multiple Installations.

Clustering. Remember to migrate your full repositories first – you might want to consider creating dedicated queue managers for your repositories if this is a problem.

License: You will need licenses for the versions of MQ you use. The MQ command dspmqver gives you information about your existing installation. Some licenses entitle you to support from IBM, others are for development or trial use.

There are three stages to migrating applications.

  • Run the applications with no changes on the upgraded system. These should run successfully, but MQ may do more checks, for example some data is meant to be on a 4 byte boundary. MQ now polices this.
  • Recompile the applications to use the newer MQ libraries. Some application MQ control blocks may be larger, and this may uncover application coding problems. For example uninitialised storage.
  • Exploitation of new function. Do this once you have successfully migrated the existing queue managers.

Testing: You need to test the normal working application, plus error paths, such as messages being put to a Dead Letter Queue, and making sure this process works.

Update your infrastructure harness: You need to review which new messages your automation should process, and what actions to take.
You need to decide what additional statistics etc to use and what reports you want to produce for capacity planning, health review and day to day running.

You have to worry about applications coming in to your queue manager. For example, what levels of MQ are they using? They may need to be rebuilt with the newer libraries. The client code may need to be upgraded. You can use the DIS CHS(…) RVERSION to display the level of MQ client code. Of course your challenge will be to get people outside of your organization to update their code – especially when they say they have lost the source to the program.
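Here is a minimal sketch in Python of how you could automate the RVERSION check. It scans saved DIS CHS(*) RVERSION output and lists the channels whose partner is below a level you choose. The sample output, the channel names and the 09000000 threshold are made up for illustration, and I am assuming RVERSION is reported in the 8 digit VVRRMMFF form.

```python
import re

# Hypothetical output saved from: DIS CHS(*) RVERSION
SAMPLE = """
AMQ8417I: Display Channel Status details.
   CHANNEL(QMACLIENT)                      CHLTYPE(SVRCONN)
   CONNAME(10.1.1.5)                       CURRENT
   RVERSION(08000005)                      STATUS(RUNNING)
AMQ8417I: Display Channel Status details.
   CHANNEL(PAYROLL.SVRCONN)                CHLTYPE(SVRCONN)
   CONNAME(10.1.1.9)                       CURRENT
   RVERSION(09010000)                      STATUS(RUNNING)
"""

def old_partners(mqsc_output, minimum="09000000"):
    """Return (channel, rversion) pairs whose partner level is below minimum."""
    found = []
    # Each status entry starts with an AMQ8417 message line.
    for block in re.split(r"AMQ8417\w*:", mqsc_output):
        channel = re.search(r"CHANNEL\(([^)]*)\)", block)
        version = re.search(r"RVERSION\((\d{8})\)", block)
        # Both values are 8 digit strings, so a string comparison is enough.
        if channel and version and version.group(1) < minimum:
            found.append((channel.group(1), version.group(1)))
    return found

print(old_partners(SAMPLE))
```

Running this regularly for a few weeks before the migration should catch clients which only connect occasionally.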

MQ is rarely used in isolation. You may need to upgrade web servers to a newer level which support the new level of MQ.

You may need to upgrade the hardware and operating system.

Going down to the next level of detail.

Exits

You need to check any exits you have can support new functions and different levels of control blocks. For example there are shared connections, and the MQMD can change size from release to release.

If you cannot have one exit that supports all levels of MQ, you’ll have to manage how you deploy the exit to match the queue manager level.

TLS and SSL setup

  • You need to review the TLS and SSL support. Newer levels of MQ remove support for weaker levels of TLS.
  • You need to review the end user certificates to make sure they are using supported levels of encryption.
  • You need to review the cipherspecs used by SSL channels, and upgrade them before you migrate the queue manager. (You could migrate to a newer version and see which channels fail to start, then fix them, but this is not so good).
  • As part of this cipherspec review you may wish to upgrade to strong cipher specs which use less CPU, or can be offloaded on z/OS.
  • You may have a problem sharing keystores. Make sure you include the keystore files in your backups. See APAR IT16295.
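As a rough sketch of the cipherspec review, the Python below reads saved DIS CHL(*) SSLCIPH output and reports the channels using a cipherspec from a "weak" list, and the channels with no cipherspec at all. The list of weak cipherspecs here is only illustrative and the channel names are invented. Take the real list from the documentation for the MQ level you are migrating to.

```python
import re

# Illustrative list only: use the deprecated CipherSpec list from the
# documentation for your target MQ level.
WEAK_CIPHERSPECS = {"RC4_MD5_US", "RC4_SHA_US", "TRIPLE_DES_SHA_US", "NULL_SHA"}

# Hypothetical output saved from: DIS CHL(*) SSLCIPH
SAMPLE = """
   CHANNEL(QMA.TO.QMB)                     CHLTYPE(SDR)
   SSLCIPH(TRIPLE_DES_SHA_US)
   CHANNEL(QMACLIENT)                      CHLTYPE(SVRCONN)
   SSLCIPH(TLS_RSA_WITH_AES_256_CBC_SHA256)
   CHANNEL(ADMIN.SVRCONN)                  CHLTYPE(SVRCONN)
   SSLCIPH( )
"""

def review_ciphers(mqsc_output):
    """Return (channels with a weak cipherspec, channels with no cipherspec)."""
    weak, no_tls = [], []
    # Pair each CHANNEL(...) with the SSLCIPH(...) that follows it.
    pairs = re.findall(r"CHANNEL\(([^)]*)\).*?SSLCIPH\(([^)]*)\)", mqsc_output, re.S)
    for channel, cipher in pairs:
        cipher = cipher.strip()
        if not cipher:
            no_tls.append(channel)            # TLS not used: document this
        elif cipher in WEAK_CIPHERSPECS:
            weak.append((channel, cipher))    # must change before migrating
    return weak, no_tls

weak, no_tls = review_ciphers(SAMPLE)
print("Weak:", weak)
print("No TLS:", no_tls)
```

Remember that both ends of a channel need the same cipherspec, so the output from every queue manager needs to be reviewed together.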

Building your applications

  • In some environments application developers compile programs on their own machines; in other environments, there is a process to generate applications on a central build machine. You will need to change the build environment to have the newer version header files, and change the build process to be able to use them.
  • You will need to set up a build environment so you can use the MQ V9 header files for just the application being migrated.
  • You may need to change your deploy tool so that a program compiled at MQ V9 is only deployed to TESTQMA (at MQ V9) and not to TESTQMB (still at MQ V7).
  • You need to change your deploy tool for test, pre-production and production.

Using the Client Channel Definition Table (CCDT)

  • Older clients must continue to use existing CCDT
  • Newer clients are able to understand older CCDTs.
  • For an application to use a newer version CCDT, you must update the MQ client.
  • So you need to be careful about moving the CCDT file around.
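The CCDT rules above boil down to "the client must be at least as new as the CCDT". Here is a small sketch in Python, assuming you keep an inventory of client levels somewhere. The application names and levels are invented for the example.

```python
def level(text):
    """Turn '9.1' or '8.0.0.5' into a 4 part tuple so levels compare correctly."""
    parts = [int(p) for p in text.split(".")]
    return tuple(parts + [0] * (4 - len(parts)))

def clients_blocking_ccdt(ccdt_level, client_levels):
    """Clients which are older than the CCDT and so cannot use it."""
    return sorted(name for name, lvl in client_levels.items()
                  if level(lvl) < level(ccdt_level))

# Hypothetical inventory of client machines and their MQ client level.
inventory = {"payroll-app": "8.0.0.5", "mobile-app": "9.1.0.0"}
print(clients_blocking_ccdt("9.1", inventory))
```

Any client in the list needs to stay on the old CCDT, or be upgraded, before the newer CCDT is copied to its machine.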

System management applications

You may have home grown applications that are used to manage MQ. These need to be changed to support new object types (such as chlauth records and topics) and new fields on objects. You cannot rely on a field of interest being the 5th in the list as it was in MQ V5.
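One way to avoid depending on the position of a field is to pick attributes out by keyword. This is a minimal sketch in Python which parses runmqsc style KEYWORD(value) output into a dictionary. The sample output and queue name are invented.

```python
import re

# Hypothetical output from: DIS QLOCAL(PAYROLL.IN) CURDEPTH MAXDEPTH
SAMPLE = """
AMQ8409I: Display Queue details.
   QUEUE(PAYROLL.IN)                       TYPE(QLOCAL)
   CURDEPTH(12)                            MAXDEPTH(5000)
"""

def attributes(mqsc_output):
    """Map each KEYWORD(value) pair to a dictionary entry, whatever its position."""
    pairs = re.findall(r"\b([A-Z][A-Z0-9]*)\(([^)]*)\)", mqsc_output)
    return {key: value.strip() for key, value in pairs}

queue = attributes(SAMPLE)
# Look fields up by name; .get() copes with attributes an older level does not have.
print(queue["QUEUE"], queue["CURDEPTH"], queue.get("MAXDEPTH"))
```

This simple pattern does not cope with values which themselves contain parentheses, so treat it as a starting point rather than a complete parser.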

MQ Console (MQWEB)

If you are using the MQ Console server to provide a web browser or REST API to a queue manager, you may need to do extra work for this.

You need an instance of MQ Console to support MQ V9.0 and a different instance to support MQ 9.1.

If you have multiple queue managers on a box, and plan to use MQ Multiple Installation to migrate one queue manager at a time, then you will need to consider the following

  • The box has QMA and QMB on it at MQ V9.0
  • This box uses MQCONSOLE-XX with port 9090
  • Install MQ 9.1 on the same box.
  • Migrate QMA to 9.1
  • Create an MQCONSOLE-YY at MQ 9.1 with port 9191
  • Change your web browser URL and REST api apps to use port 9191
  • Wait for a week
  • Migrate QMB to 9.1
  • Migrate MQCONSOLE-XX to 9.1
  • Web browser URL and REST API url can continue using port 9090
  • Shutdown MQCONSOLE-YY
  • Undo the changes you made to your web browser URL and REST API apps, so they go back to using port 9090 instead of port 9191

“The rest of the stuff”

I remember seeing a poster of a child sitting on a potty with a caption saying “no job is complete until the paper work is complete”.

Someone said that doing the actual migration of a queue manager took 1 hour. Doing the paper work (planning, change management, talking to users etc) took two weeks per queue manager.

And yes, you do need to update your documentation!

Education

You need to talk to the teams around your organization. This is mainly applications – but other teams as well (eg monitoring, networking).

  • Tell them what changes you will be making, the time scales etc.
  • There will be an application freeze during the migration.
  • The application teams will need to test their applications, and may need to make changes to them.
  • The application teams will get these new events/alerts which they need to handle.
  • You may learn about how they use MQ, and how this will affect your migration plans. (We used this unsupported program for which we have no source and no one knows how it works – which is critical to our business).
  • You may get a free trip to an exotic location to talk to the application teams (or you may get told to go to some hell hole)
  • You need to talk to people outside of your organization. The hard bit may be finding out who they are.

Security

  • You need to protect any new libraries.
  • MQ may have new facilities such as topics which you need to develop and implement a security policy for. In V9 MQ midrange now publishes statistics to a topic.
  • Your tools for processing MQ security events may need to be enhanced to handle new resource types or new events.

New messages and events

You need to review all new events or messages, and add automation to process them. You need to decide who gets notified, and what actions to take.

You need to review changed messages or alerts in case you are relying on “constant” information in a particular place in the message, which has been changed.

Backups

People often dump the configuration of their queue managers every day, so they can use runmqsc to recreate the queues etc. You need to back up all objects including topics and chlauth records, and check you can recreate them in a queue manager.
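A daily dump could be scripted along these lines. This is only a sketch in Python: it assumes the dmpmqcfg command is on the PATH of the user running it, and the dated file naming and backup directory are my own invention. Check the dmpmqcfg documentation for your level of MQ before relying on the options.

```python
import subprocess
from datetime import date
from pathlib import Path

def dump_command(qmgr):
    """Build a dmpmqcfg command: all attributes, all export and object types, MQSC format."""
    return ["dmpmqcfg", "-m", qmgr, "-a", "-x", "all", "-t", "all", "-o", "mqsc"]

def backup_path(qmgr, backup_dir, day=None):
    """One file per queue manager per day, for example QMX.2019-01-17.mqsc."""
    day = day or date.today()
    return Path(backup_dir) / f"{qmgr}.{day.isoformat()}.mqsc"

def dump_config(qmgr, backup_dir):
    """Run dmpmqcfg and save its output; raises if the command fails."""
    target = backup_path(qmgr, backup_dir)
    with open(target, "w") as out:
        subprocess.run(dump_command(qmgr), stdout=out, check=True)
    return target

print(" ".join(dump_command("QMX")))
```

To check the backup is usable, replay the file with runmqsc into a scratch queue manager and compare the objects.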

Back up your MQ libraries for queue manager and clients – or be able to redeploy them from your systems management software.

Performing the migration

This is documented in the Knowledge Centre. One path for migration involves deleting the old level of MQ and installing the new level of MQ. If you need to go back to the old level, you need to have a copy of the same MQ base + CSD level that you were running on!

Carefully check the documentation for the hops.

The Migration paths documentation says

  • You can migrate from V8.0 or later direct to the latest version.
  • To migrate from V7.0.1, you must first migrate to V8.0.
  • To migrate from V7.1 or V7.5, you must first migrate to V8.0 or V9.0.
  • You might have an extra step to go to MQ V9.1
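The hops listed above can be written down as a small lookup, which is handy if you have many queue managers at different levels. This Python sketch encodes only the rules quoted above. Anything else raises an error, so you go and check the Migration paths documentation rather than guess.

```python
def migration_hops(current):
    """Return the levels to migrate through, ending at 'latest'."""
    parts = [int(p) for p in current.split(".")]
    parts += [0] * (3 - len(parts))          # '7.5' becomes [7, 5, 0]
    version_release = tuple(parts[:2])
    if version_release >= (8, 0):
        return ["latest"]                    # V8.0 or later can go direct
    if version_release in ((7, 1), (7, 5)):
        return ["8.0 or 9.0", "latest"]
    if tuple(parts[:3]) == (7, 0, 1):
        return ["8.0", "latest"]
    raise ValueError(f"no path listed for {current}; check the Migration paths documentation")

for start in ("7.0.1", "7.5", "9.0"):
    print(start, "->", " -> ".join(migration_hops(start)))
```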

I found some really old doc saying

“If you are still on MQ version 5.3, you should plan a 2 step migration: first migrate to MQ v7.0.1 then migrate to 7.1 or 7.5”. This could be a challenge as you cannot get the MQ 7.0.1 or the MQ 7.1 product. One of the reasons for this two stage approach is that the layout of files changed, so you have to restart at MQ 7.0.1 to make these file system changes.

Finally…

If I have missed anything or got something wrong, please let me know and I’ll update the list.

Checklist for implementation

Different stages

  • Pre-reqs
  • Education for team doing migration
  • Investigate – until you have done the investigation you cannot plan the work. For example how many exits are used, and how many need to be changed.
  • Plan. The first time you do something you may be slow. Successive times should be faster as you should know what you are doing.
  • Implement/Migrate
  • Exploit new features.

Before you start

  • People doing the work need access to systems
  • Need to draw up a schedule (but you may need to do the investigation work before you know how much work there is to do)
  • Appoint a team leader.
  • Determine what skills you need, eg TLS, application design, build
  • Which people do admin – which people handle code eg review exit programs
  • Reporting and status
  • Communication with other teams – we will be migrating in.. and you will be asked to do some work..
  • Extract configuration to common disk, so people do not need to access each queue manager.
  • External customers – provide one list of changes for them if possible. This is better than giving them multiple lists of changes, and will help them understand the size of their work.

Education

  • Ensure everyone has basic knowledge
    • MQ commands
    • Unix commands
    • TLS and security (and stop using SSL)
    • Manage remote MQ from one site using remote runmqsc command or logon to each machine
    • Efficient way of processing data
      • Use GREP on a file to find something, pipe it … sort it, do not find things by hand
  • How the project will be tracked

Areas for migration

  • TLS parameters and using stronger encryption
  • TLS certificate strength
  • Exits
  • Applications
  • Queue manager
  • Clients using the queue manager. A client may be able to connect to many queue managers.

Investigate SSL/TLS

  • Which TLS parameters are being used
  • Which ones are not supported in newer versions of MQ?
    • SSLCIPH
    • Need to worry about both ends of the connection
  • Identify “right” TLS parameters to use
    • eg Strong encryption which can be offloaded on z/OS.
  • Will these cost more CPU? Is this a problem?
  • If TLS not being used – document this

Implement TLS

  • Need a plan to change any cipher specs which are out of support.
  • May need to make multiple changes across multiple queue managers at “same time” – coordinate different ends
  • Can be done before MQ migration.
  • Can be done AFTER MQ migration if you set a flag.
    • May make implementation easier
    • Still need a plan to change any which are out of support.
    • Still may need to make multiple changes across multiple queue managers at “same time”

Investigate certificates

  • Investigate if certificates are using weak encryption
    • Which certificates need to be changed? May need RACF/Security team to help report userids that need to change
  • Plan to roll out updated certificates
    • Include checking external Business partners
  • Investigate any other changes in MQ configuration
  • Check changes to your TLS keystore in APAR IT16295.

How to check a certificate

  • /opt/mqm/java/jre64/jre/bin/ikeycmd -cert -details -db key.kdb -label …
  • A password is required to access the source key database.
  • Please enter a password:
  • Label: CLIENT
  • Key Size: 2048
  • Serial Number: ….
  • Issued by: CN=colinpaiceCA, O=Stromness Software Solutions, ST=Orkney, C=GB ? Check this is still valid
  • Subject: CN=colinpaice, C=GB
  • Valid: From: Thursday, 17 January 2019 18:22:45 o’clock GMT To: Sunday, 31 May 2020 19:22:45 o’clock BST ? Check ‘to’ date
  • Signature Algorithm: SHA256withRSA (1.2.840.113549.1.1.11) ? I think this needs to be SHA256withRSA
  • Trust Status: enabled
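If you have many certificates to check, the details output can be screened by a script. This Python sketch takes text in the shape shown above and flags a small key size or an old signature hash. The 2048 bit minimum and the list of weak hashes are my assumptions, so set them to whatever your security team requires.

```python
# A few lines in the shape of the listing above.
SAMPLE = """\
Label: CLIENT
Key Size: 2048
Signature Algorithm: SHA256withRSA (1.2.840.113549.1.1.11)
"""

def certificate_concerns(details, min_key_size=2048, weak_hashes=("MD5", "SHA1")):
    """Return a list of reasons why this certificate may need replacing."""
    # Split each 'Name: value' line into a dictionary entry.
    fields = dict(line.split(": ", 1) for line in details.splitlines() if ": " in line)
    concerns = []
    if int(fields.get("Key Size", "0")) < min_key_size:
        concerns.append("key size below %d" % min_key_size)
    algorithm = fields.get("Signature Algorithm", "")
    if algorithm.upper().startswith(weak_hashes):
        concerns.append("weak signature algorithm " + algorithm.split()[0])
    return concerns

print(certificate_concerns(SAMPLE))
```

The expiry date and the issuer still need a check by eye, or a further rule in the script.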

Implement certificate change

  • This can be done at any time before migrating a queue manager

Investigate exits

  • Find which exits are being used
    • DIS CHL(*)… grep for EXIT
    • dis qmgr grep for exit
  • Queue manager and clients
  • /var/mqm/qmgrs/QMNAME/qm.ini, channel definition (grep for EXIT)
  • Check exits are at the correct level on all queue managers and clients (change date, size).
  • May need emails to business partner.
  • Do exits need to be converted from 32 bit to 64 bit?
  • Locate exit source
  • Review source
  • Control blocks may be bigger
  • May have to support new functions, eg shared connections
  • Is function still needed?
  • Document exit usage
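The "grep for EXIT" step can be done in a way that also tells you which channel each exit belongs to. This is a rough Python sketch working on saved DIS CHL(*) ALL output. The channel name and exit path in the sample are invented, and it assumes an unused exit shows as blank between the brackets.

```python
import re

# Hypothetical fragment of output saved from: DIS CHL(*) ALL
SAMPLE = """
   CHANNEL(QMA.TO.QMB)                     CHLTYPE(SDR)
   MSGEXIT( )                              RCVEXIT( )
   SCYEXIT(/var/mqm/exits64/scyexit(ChlExit))
   SENDEXIT( )
"""

def exits_in_use(mqsc_output):
    """Return {channel: {exit attribute: value}} for every non-blank ...EXIT attribute."""
    usage = {}
    channel = None
    for line in mqsc_output.splitlines():
        name = re.search(r"\bCHANNEL\(([^)]*)\)", line)
        if name:
            channel = name.group(1)
        # Match any keyword ending in EXIT which has a non-blank value.
        for attr, value in re.findall(r"\b(\w+EXIT)\(\s*([^\s)]\S*)", line):
            usage.setdefault(channel, {})[attr] = value[:-1]  # drop the closing ')'
    return usage

print(exits_in_use(SAMPLE))
```

The result gives you the list of exits to locate source for, and a start on the "document exit usage" task.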

Implement exit changes

  • Recompile all exits and deploy to all platforms before you do any migration work – check no problems
  • Change and test exits
  • Need to change build tools to allow builds with new levels of header files etc, and roll out to selected queue managers
  • Should work on old and new releases
  • May need a V9.1 MQ to test exits on before migration
  • Can be deployed before MQ Migration ? Or do you have requirements for specific levels of exits.
  • Create documentation for exits

Investigate applications

  • External business partners as well as internal
  • Need to get named contact for each application
  • Check level of MQ client code
  • Check TLS options
  • Identify where connection info is stored (AMQCHLTAB)
  • What co-req products need to be updated
    • Web servers
  • Is there test suite which includes error paths etc.
  • Identify build and deploy tools
  • Need capability to compile application using newer MQ header files, and deploy to one MQ

Implement application changes

  • Need to have change freeze during migration
  • Build project plan
  • Duration for testing
  • Which systems to be used for testing
  • Create process to update MQ client code
  • Make sure there is process to roll out changes in future
  • Need to allow buffer

Application recompile

  • Recompile programs using existing libraries and jar files – to make sure everything works before you migrate
  • Deploy and test
  • Change deploy process to use new versions of libraries
  • Recompile using newer versions of libraries
  • Deploy and test
    • Any problems found need to be validated at previous levels, or have conditional statements around it
  • Once all queue managers upgraded
    • Comment out code for compiling with previous libraries
    • To prevent accidents
    • In case of problems in production (before migration) needing a fix.

Investigate queue managers

  • Does the hardware need to be upgraded?
  • Are there any coreqs – eg multi instance or HA environments?
  • Any co-reqs eg upgrade web server, database?
  • Does the Operating System need to be upgraded
    • For example MQ is now 64 bit. Early versions were 32 bit
    • Newer versions of Java
  • Identify which applications run on this queue manager
    • Need plan for each application
  • Identify pre-reqs
    • TLS
    • Exits

Plan how you are going to update the Client Channel Definition Table

  • If you migrate a queue manager, then its CCDT will be migrated to the newer level.
  • Clients cannot use a CCDT from a higher level queue manager.
  • If you migrate your clients to the latest level you will have no problems with the CCDT
  • If you migrate the CCDT owner queue manager first, you need to be careful about copying the CCDT to other machines, to prevent a mismatch.

Plan queue managers

  • Plan software and hardware upgrades
  • Identify order of queue manager migration
    • Test, pre-prod, production
    • Full repositories then partial
      • Consider setting up new QM just for full repository?
    • Do one server, test applications, do other servers
    • Need to worry about multi instance and HA queue managers. These need to be coordinated and done at the same time.
  • Check license for MQ
  • May need to migrate queue manager multiple times
    • from MQ V5.3 to V7.x
    • from 7.x to V9.0
    • from 9.0 to 9.1
  • Clients first/later

Automation

  • Need to set up automation for new messages and new events

Backups etc

  • Make sure you have backed up your queue managers (and other items such as build configuration files) before you make any changes.

Do the migration

  • Follow the MQ knowledge centre.