Midrange now has a DIS APSTATUS command

This is a new command on 9.1.3 mid-range, part of the “uniform clustering” support.  (Uniform clustering is what I would call connection balancing; see Uniform clustering gets a tick from me.)

For example, I had two instances of the program oemput running, and the command gave

dis apSTATUS('oemput') 
AMQ8932I: Display application status details.
  APPLNAME(oemput) CLUSTER( )
  COUNT(2) MOVCOUNT(0) 
  BALANCED(NOTAPPLIC)

and

dis apSTATUS('oemput') type(local)
AMQ8932I: Display application status details.
  APPLNAME(oemput) 
  CONNTAG(MQCT4509BF5D0368DB23QMA_2018-08-16_13.32.14oemput)
  CONNS(1) IMMREASN(NOTCLIENT)
  IMMCOUNT(0) IMMDATE( )
  IMMTIME( ) MOVABLE(NO)
AMQ8932I: Display application status details.
  APPLNAME(oemput) 
  CONNTAG(MQCT4509BF5D017BDB23QMA_2018-08-16_13.32.14oemput)
  CONNS(1) IMMREASN(NOTCLIENT)
  IMMCOUNT(0) IMMDATE( )
  IMMTIME( ) MOVABLE(NO)

 

There is a different conntag for each instance of the program.  DIS QMGR QMID gives QMID(QMA_2018-08-16_13.32.14), and this queue manager identifier is visible inside each tag.

The tags are MQCT4509BF5D017BDB23QMA_2018-08-16_13.32.14oemput and MQCT4509BF5D0368DB23QMA_2018-08-16_13.32.14oemput; they differ only in the characters 17B and 368.
(Thanks to eagle-eyed Morag for pointing out the difference.)
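
The difference between the two tags is easy to miss by eye.  A minimal sketch for finding it is below; the class and method names are made up for illustration, and it assumes nothing about how MQ builds a CONNTAG beyond the two values shown above.

```java
import java.util.ArrayList;
import java.util.List;

final class ConnTagDiff {
    /** Returns the zero-based positions at which two equal-length strings differ. */
    static List<Integer> differingPositions(String a, String b) {
        if (a.length() != b.length()) {
            throw new IllegalArgumentException("tags have different lengths");
        }
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < a.length(); i++) {
            if (a.charAt(i) != b.charAt(i)) {
                positions.add(i);
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        String tag1 = "MQCT4509BF5D017BDB23QMA_2018-08-16_13.32.14oemput";
        String tag2 = "MQCT4509BF5D0368DB23QMA_2018-08-16_13.32.14oemput";
        // Prints the positions of the three characters that differ
        System.out.println(differingPositions(tag1, tag2));
    }
}
```

For the two tags above this reports three adjacent positions inside the hex string; everything else, including the queue manager identifier and the application name, is identical.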

What’s the difference between an MQ Message and a JMS Message

I had problems using the MQI to create a message for a JMS program to receive.

To see what was in a JMS message, I used a Java program using JMS to write a message, and used my trusty C program to display it.

I could see that there were message properties in the message:

Property 0 name <mcd.Msd> value <jms_text>
Property 1 name <jms.Dst> value <queue:///JMSQ1>
Property 2 name <jms.Rto> value <queue:///JMSQ2>
Property 3 name <jms.Tms> value <1571902099742>
Property 4 name <jms.Dlv> value <2>

These are described here.

The mcd.Msd value is one of jms_none, jms_text, jms_bytes, jms_map, jms_stream, jms_object.  It depends on whether you declare Message message, BytesMessage message etc. to define your message type.  The JMS program receiving the message may be expecting a particular type.

The jms.Rto value comes from message.setJMSReplyTo(…).  It was set in MQMD.ReplyToQ as well as in the message property.

It took me some time to find how to specify values such as deliveryMode.  I found it here.  For example producer.setDeliveryMode(DeliveryMode.NON_PERSISTENT) on the MessageProducer.  (This comes from javax.jms.DeliveryMode.NON_PERSISTENT, not a com.ibm… file.)
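
The jms.Tms and jms.Dlv values shown earlier are easier to read once decoded.  The sketch below assumes jms.Tms is the send time in milliseconds since 1970 and that jms.Dlv uses the javax.jms.DeliveryMode values (1 for NON_PERSISTENT, 2 for PERSISTENT).  The class and method names are made up for illustration, and it uses only the JDK, so no JMS jar is needed.

```java
import java.time.Instant;

final class JmsFolderValues {
    /** Interprets a jms.Tms value as milliseconds since the epoch. */
    static Instant timestamp(String tms) {
        return Instant.ofEpochMilli(Long.parseLong(tms));
    }

    /** Maps a jms.Dlv value to the matching javax.jms.DeliveryMode constant name. */
    static String deliveryMode(String dlv) {
        switch (Integer.parseInt(dlv)) {
            case 1:  return "NON_PERSISTENT";
            case 2:  return "PERSISTENT";
            default: return "UNKNOWN(" + dlv + ")";
        }
    }

    public static void main(String[] args) {
        // Values taken from the property display above
        System.out.println(timestamp("1571902099742")); // 2019-10-24T07:28:19.742Z
        System.out.println(deliveryMode("2"));          // PERSISTENT
    }
}
```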

I converted my simple program from JMQI to JMS in a couple of hours, and was surprised to find it used fewer lines of code than the JMQI version.  Of course I may find I omitted some work, such as error handling, but it seems to be working OK.

Magic methods to decode Java MQ constants to strings.

I had been struggling with MQ and Java, trying to decode what the return code numbers meant, and found some real gems of methods here.

String reasonCode = MQConstants.lookup(2035, "MQRC_.*");  gave MQRC_NOT_AUTHORIZED

and

String decode = MQConstants.decodeOptions(gmo.options, "MQGMO_.*");  gave me

MQGMO_WAIT | MQGMO_SYNCPOINT_IF_PERSISTENT | MQGMO_FAIL_IF_QUIESCING

I wish I had these a couple of years ago – it would have saved me a lot of time!

 

The methods are

static java.lang.String decodeOptions(int optionsP, java.lang.String optionPattern)
    This helper method takes an integer representing a set of IBM MQ options for an MQI structure, and converts them into a string displaying the constants that the options represent.

static int getIntValue(java.lang.String name)
    Returns the value of the named MQSeries constant as an int.

static java.lang.Object getValue(java.lang.String name)
    Returns the value of the named MQSeries constant.

static java.lang.String lookup(int value, java.lang.String filter)
    Returns the MQSeries constant name or names for the supplied int value.

static java.lang.String lookup(java.lang.Object value, java.lang.String filter)
    Returns the MQSeries constant name or names for the supplied value of type Integer, String, byte[], or char[].

static java.lang.String lookupCompCode(int reason)
    Convenience method for finding the constant name for a completion code.

static java.lang.String lookupReasonCode(int reason)
    Convenience method for finding the constant name for a reason code.

static void main(java.lang.String[] args)
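
To show the idea behind lookup and decodeOptions without needing the MQ jar, here is a small stand-alone sketch.  It is not IBM's implementation (I have not seen that); it simply uses reflection over a handful of constants copied into a hypothetical Consts class, and the class and method names are made up for illustration.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Map;
import java.util.TreeMap;

final class ConstantLookupSketch {
    /** A few MQ constant values, copied here so the sketch needs no MQ jar. */
    static final class Consts {
        public static final int MQRC_DATA_LENGTH_ERROR = 2010;
        public static final int MQRC_NOT_AUTHORIZED = 2035;
        public static final int MQRC_HOST_NOT_AVAILABLE = 2538;
        public static final int MQGMO_WAIT = 0x0001;
        public static final int MQGMO_SYNCPOINT = 0x0002;
        public static final int MQGMO_SYNCPOINT_IF_PERSISTENT = 0x1000;
        public static final int MQGMO_FAIL_IF_QUIESCING = 0x2000;
    }

    /** Collects value -> name for the static int fields whose name matches the regex. */
    private static Map<Integer, String> matching(String pattern) {
        Map<Integer, String> byValue = new TreeMap<>();
        for (Field f : Consts.class.getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers()) && f.getType() == int.class
                    && f.getName().matches(pattern)) {
                try {
                    byValue.put(f.getInt(null), f.getName());
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
        return byValue;
    }

    /** Returns the constant name for a value, or null if none matches the filter. */
    static String lookup(int value, String pattern) {
        return matching(pattern).get(value);
    }

    /** Lists the names of all option bits that are set, lowest value first. */
    static String decodeOptions(int options, String pattern) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Integer, String> e : matching(pattern).entrySet()) {
            int bit = e.getKey();
            if (bit != 0 && (options & bit) == bit) {
                if (sb.length() > 0) {
                    sb.append(" | ");
                }
                sb.append(e.getValue());
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(lookup(2035, "MQRC_.*"));
        System.out.println(decodeOptions(0x0001 | 0x1000 | 0x2000, "MQGMO_.*"));
    }
}
```

The filter matters because the same number can mean different things in different families of constants; 2 is MQGMO_SYNCPOINT here, but it is also a completion code.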

MQRC_DATA_LENGTH_ERROR with client

We had an application working on one system, and when we moved it to another system we got MQRC 2010, MQRC_DATA_LENGTH_ERROR.  It turned out that SYSTEM.DEF.SVRCONN had a MAXMSGL of 1, so the maximum message size allowed on this channel was 1 byte.

You can specify the maximum message length on the client, for example in the MQCD or the client channel definition table, but I think the negotiated value is the lower of the values at each end.

 

Setting the value to one on the z/OS end was part of stopping people using the default channel definitions.
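
As a worked example of the “lower of the two values” rule: this is a sketch of my understanding above, not a statement of how the channel code really negotiates.  The 4 MB client value is only illustrative, and the class and method names are made up.

```java
final class MaxMsgLengthSketch {
    static final int MQRC_NONE = 0;
    static final int MQRC_DATA_LENGTH_ERROR = 2010;

    /** Assumed rule: the channel uses the smaller of the two MAXMSGL values. */
    static int negotiated(int clientMaxMsgL, int serverMaxMsgL) {
        return Math.min(clientMaxMsgL, serverMaxMsgL);
    }

    /** Returns the reason code a message of this length would get under that rule. */
    static int reasonFor(int messageLength, int clientMaxMsgL, int serverMaxMsgL) {
        return messageLength > negotiated(clientMaxMsgL, serverMaxMsgL)
                ? MQRC_DATA_LENGTH_ERROR
                : MQRC_NONE;
    }

    public static void main(String[] args) {
        int clientMax = 4 * 1024 * 1024; // illustrative client setting
        int serverMax = 1;               // the MAXMSGL(1) on SYSTEM.DEF.SVRCONN
        System.out.println(negotiated(clientMax, serverMax));     // 1
        System.out.println(reasonFor(100, clientMax, serverMax)); // 2010
    }
}
```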

Any port in a storm? No.

I've just spent a day resolving a problem with specifying a port value when trying to connect to MQ.

I had

public long port = 1414;
String channel = "MYCHANNEL";
String hostname = "127.0.0.1";
Hashtable<String, Object> h = new Hashtable<String, Object>();
h.put(MQConstants.PORT_PROPERTY, dd.port);
h.put(MQConstants.CHANNEL_PROPERTY, channel);
h.put(MQConstants.TRANSPORT_PROPERTY, MQConstants.TRANSPORT_MQSERIES_CLIENT);
h.put(MQConstants.HOST_NAME_PROPERTY, hostname);
queueManager = new MQQueueManager("QMA", h);

(did you spot the problem?)

This failed with

MQConnection to QMA com.ibm.mq.MQException: MQJE001: Completion Code '2', Reason '2538'.
Caused by: com.ibm.mq.jmqi.JmqiException: CC=2;RC=2538;AMQ9204: Connection to host '127.0.0.1(0)' rejected.

This is saying it tried to connect with port 0!

I tried

String port = "1414"; and that failed the same way.

If I used

MQEnvironment.port = 1414; it worked.

This was tough to resolve, as I could find no documentation to help me.

Someone suggested public int port = 1414; and it worked!  What a way to spend a nice autumn day.
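
My guess at what happened is that the value in the Hashtable is only honoured when it is an Integer, and a Long or a String is quietly ignored, leaving the port as 0.  The sketch below imitates that assumed behaviour with plain JDK classes (readPortStrictly is a stand-in, not MQ code) and shows a small hypothetical helper, toIntegerPort, that converts whatever you have into an Integer before it goes into the table.

```java
import java.util.Hashtable;
import java.util.Map;

final class PortPropertySketch {
    static final String PORT_KEY = "port"; // stand-in for MQConstants.PORT_PROPERTY

    /** Imitates a reader that accepts only an Integer and otherwise falls back to 0. */
    static int readPortStrictly(Map<String, Object> props) {
        Object value = props.get(PORT_KEY);
        return (value instanceof Integer) ? (Integer) value : 0;
    }

    /** Converts a Long, Short, String etc. into an Integer, failing loudly if it cannot. */
    static Integer toIntegerPort(Object value) {
        if (value instanceof Number) {
            return Math.toIntExact(((Number) value).longValue());
        }
        return Integer.valueOf(value.toString().trim());
    }

    public static void main(String[] args) {
        long port = 1414; // the declaration that caused the problem
        Hashtable<String, Object> h = new Hashtable<>();

        h.put(PORT_KEY, port);                   // autoboxed to a Long
        System.out.println(readPortStrictly(h)); // 0

        h.put(PORT_KEY, toIntegerPort(port));    // now an Integer
        System.out.println(readPortStrictly(h)); // 1414
    }
}
```

Because the Hashtable is declared with Object values, the compiler cannot warn you that a long becomes a Long rather than an Integer.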

Whoops -deploying MDB in weblogic

I was quite happily using my MDB in webLogic, but when I changed its configuration, it did not pick up the changes.  It took a day to find out why, and I have learned much more about deploying MDBs.

My connection factory was using SYSTEM.DEF.SVRCONN, and I changed it to use a different client channel.  I stopped SYSTEM.DEF.SVRCONN (so I could check the change had worked), and restarted the webLogic instance.  I was surprised when my MDB failed to start, because the channel was stopped.  The MDB was still trying to use that channel.  It took a lot of head scratching to get it to work as I expected.

  • I had messages like <BEA-015073>  Message-Driven Bean …  is configured with unknown activation-config-property name failIfQuiesce.  This message is wrong, failIfQuiesce is supported by the IBM Resource adapter.
  • I had the same message with activation-config-property name cfLookup.   This was my problem.  I should have specified connectionFactoryLookup.
  • If you have <activation-config-property-name>connectionFactoryLookup… (specified in the ejb-jar) any other parameters you specify in the ejb-jar.xml file are ignored.
  • If you do not specify a connectionFactoryLookup, nor properties in the ejb-jar.xml file, defaults are provided, see Configuring the resource adapter for inbound communication.  In my case I had not specified  activation-config-property-name channel, and this defaulted to SYSTEM.DEF.SVRCONN, which is why it continued to use that channel.
  • It is worth putting <activation-config-property-name>applicationName … in your definitions so you can see what you are using.
    • dis qstatus(JMSQ3) type(handle) gave me APPLTAG(CF3Name) so I can tell which definitions are being used.
    • If you get APPLTAG(weblogic.Server) then you are taking the defaults.
  • The Oracle documentation says the precedence order is as below.  I do not think this is 100% accurate (I could not specify some of the parameters in the weblogic-ejb-jar.xml file).  I didn't try the Java program.
    1. properties set in the weblogic-ejb-jar.xml deployment descriptor
    2. activation-config-property properties set in the ejb-jar.xml deployment descriptor
    3. activationConfigProperty annotation properties in the java program.

What do I need to specify?

As a minimum you need to use connectionFactoryLookup or  specify

  1. applicationName – so you can identify which definitions are being used
  2. channel – which channel to use
  3. failIfQuiesce
  4. hostName
  5. port

 

The ejb-jar.xml file is in the META-INF directory.  Change the ejb-jar.xml or weblogic-ejb-jar.xml file, update the jar file using a command like jar -uvf MDB4.jar META-INF/ejb-jar.xml, and redeploy it.

WebLogic message does not have authorization to view the logs

I was using JMX to display the connectionPool statistics  from webLogic, and kept getting messages

<Error> <Diagnostics> <BEA-320084> <The user principals=[] does not have authorization to view the logs.>

I solved this by using the webLogic console

  • Click “Security Realms” in the Domain Structure box on the left hand side of the home page
  • Click on the name of your realm (myrealm)
  • Click on “Roles and Policies”
  • Click on “Realm Policies”
  • Expand Domain and select “View Policy Conditions” for “View Log”
  • Click on Add Condition
  • Select the role
  • Click on Finish
  • Click on Save

The change is available immediately on “Save”.

Did the JMS architects get it wrong? Possibly!

I was looking into a performance problem where a web server was doing 1 million MQCONNects a day!  During my investigations I found that the original design had a built-in performance overhead, and so the JMS architects had to fix it up by creating connection factories.

Below, I cover

  • the performance problem
  • writing my own connection factory
  • the problems of writing my own connection factory
  • it might just be easier to use a connection factory provided by your web server or JMS provider.

What is an EJB and an MDB?

In a web server you can have Enterprise Java Beans (EJBs), which are a package of Java applications doing enterprise type work (get a message, update a database, send a reply) conforming to a specific programming interface.

A Message Driven Bean is an EJB which responds to messages.  A message listener application gets messages from a queue and passes each message to the MDB.  In non-EJB terms this is just like a trigger monitor, starting a transaction and passing the data.

When you create the MDB you have a configuration file which allows a message listener to be created, and gives the parameters to use to connect to MQ, how many threads can be running concurrently, etc.

To process the message, the MDB has a basic structure of

  • method onMessage – this is given the message
  • ejbCreate – when the EJB is created, you can do initialisation here
  • ejbRemove – when the EJB is about to be deleted, you can do clean up here

and looks like

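import javax.ejb.MessageDrivenBean;
import javax.ejb.MessageDrivenContext;
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.Destination;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.MessageProducer;
import javax.jms.Session;
import javax.jms.TextMessage;
import javax.naming.InitialContext;

public class MDB
implements MessageListener, MessageDrivenBean
{
  private static final long serialVersionUID = -8070254332864574796L;

  // Called by the container for every message taken from the queue
  public void onMessage(Message message)
  {
    Connection connection = null;
    Session session = null;
    MessageProducer producer = null;

    try
    {
      // Look up the connection factory and find where the reply should go
      InitialContext ctx = new InitialContext();
      ConnectionFactory cf = (ConnectionFactory)ctx.lookup("CF2");
      Destination dest = message.getJMSReplyTo();

      // A new connection, session and producer for every message
      connection = cf.createConnection();
      session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
      producer = session.createProducer(dest);

      TextMessage response = session.createTextMessage("test response message from the WMQ_IVT_MDB");
      response.setJMSCorrelationID(message.getJMSMessageID());
      producer.send(response);
    }
    catch (Exception je)
    {
      System.out.println("Something went wrong." + je);
    }
  }

  public void ejbCreate() { }
  public void ejbRemove() { }
  // Required by the MessageDrivenBean interface
  public void setMessageDrivenContext(MessageDrivenContext context) { }
}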

Within the onMessage method is the logic connection = cf.createConnection();
Under the covers this createConnection() does an MQCONN.  With 1 million messages a day – this is 1 million MQCONNs!
There was also an MQOPEN, an MQCLOSE,  and an MQDISC before the onMessage() method returned.  This results in a huge performance overhead.

Attempts to fix the performance

People quickly found that this model was not efficient, and they came up with ways to improve the performance.

The idea of a connection pool was developed.  Instead of every createConnection() doing an MQCONN and every close doing an MQDISC, the code was changed to not do the MQDISC, but to keep the queue manager handle.  The next time a connection was requested, if one of these spare queue manager handles was available, it would be used.

These connection pools are sophisticated.  For example, they can free up connections that have not been used for a time period, so reducing the overall number of queue manager connections in use, and they can slowly increase the number of connections in the pool so there is not a sudden spike in requests.
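
The basic reuse idea can be sketched in a few lines.  This is a toy pool using only JDK classes, not what webLogic or the MQ resource adapter actually does: it has none of the sophistication described above (no idle timeout, no gradual growth), and the class name SimplePool is made up.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

final class SimplePool<T> {
    private final Supplier<T> factory;           // does the expensive "MQCONN"
    private final Deque<T> idle = new ArrayDeque<>();
    private int created;

    SimplePool(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Hands out a spare connection if there is one, otherwise creates a new one. */
    synchronized T borrow() {
        T connection = idle.pollFirst();
        if (connection == null) {
            created++;
            connection = factory.get();
        }
        return connection;
    }

    /** Instead of disconnecting, keep the connection for the next caller. */
    synchronized void release(T connection) {
        idle.addFirst(connection);
    }

    synchronized int createdCount() {
        return created;
    }

    public static void main(String[] args) {
        SimplePool<Object> pool = new SimplePool<>(Object::new);
        for (int i = 0; i < 1_000_000; i++) {
            Object connection = pool.borrow();
            pool.release(connection);
        }
        // One real connect for a million requests
        System.out.println(pool.createdCount());
    }
}
```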

I found it difficult to set up connection pooling in Oracle’s webLogic web server, so I thought I would write my own.  This was pretty easy, but then I discovered some complications.

Writing my own “connection pooling”.

It was clear to me that calling connection = cf.createConnection(); from the ejbCreate() method rather than from the onMessage() method would be much more efficient.

  • Move variables from being method variables to instance variables so they persist across onMessage() calls.
  • Create a connect() method which actually does the createConnection().  This can be called in the ejbCreate setup routine, and from onMessage() if the connection variable is null.
  • Have code in ejbRemove to close the connection, session etc.

public class IVTMDB
implements MessageListener, MessageDrivenBean
{
  private static final long serialVersionUID = -337338331639L;
  // create long lasting instance variables
  Connection connection = null;
  Session session = null;
  MessageProducer producer = null;
  InitialContext ctx = null;
  ConnectionFactory cf = null;

  // new method to do the connect
  public void connect()
  {
    try
    {
      ctx = new InitialContext();
      cf = (ConnectionFactory)ctx.lookup("CF2");
      connection = cf.createConnection();
      session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
    }
    catch (Exception je)
    {...
    }
  }

  // connect once, when the container creates the bean
  public void ejbCreate() {
    connect();
  }
...
}

This worked well.  The number of MQCONNs dropped to about 10 per day for many puts.

The problem looked solved, until I tried to shut down the queue manager.

If no work was being processed, then the MDBs were not being called, and so there were no MQ requests being made, and so the connections stayed active, and the queue manager did not shut down.

To solve this you can set connection.setExceptionListener(..), which will get notified if there are problems with the connection, and so you can close the connection etc.
You need to consider what to do if the connection cannot be made: do you wait for a short time period and try again, or do you return an error?
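
One possible answer to “wait or return an error” is to do both: retry a few times with a short pause, then give up.  A minimal sketch using only the JDK is below; the connect action is passed in as a Callable, so nothing here is MQ or JMS specific, and the names ConnectRetry and connectWithRetry are made up.

```java
import java.util.concurrent.Callable;

final class ConnectRetry {
    /** Tries the connect action up to maxAttempts times, pausing between failures. */
    static <T> T connectWithRetry(Callable<T> connect, int maxAttempts, long delayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return connect.call();
            } catch (Exception e) {
                last = e;                    // remember why it failed
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis);
                }
            }
        }
        throw last;                          // out of attempts: report the error
    }

    /** Demo: a connect that fails twice and then works; returns the number of attempts. */
    static int demoAttempts() {
        int[] calls = {0};
        try {
            connectWithRetry(() -> {
                if (++calls[0] < 3) {
                    throw new IllegalStateException("queue manager not available");
                }
                return "connected";
            }, 5, 10L);
        } catch (Exception e) {
            return -1;
        }
        return calls[0];
    }

    public static void main(String[] args) {
        System.out.println(demoAttempts()); // 3
    }
}
```

The number of attempts and the delay are the sort of values you would want to make configurable rather than hard code.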

I found that this “simple” way of avoiding all of the MQCONNects was quickly getting much more complex.  It was going to be much easier in the long term to get the provided connection factories working.   This was another challenge, taking about a week.  See the next few blog entries on how I did this for webLogic.

Are your mirrored file systems consistent?

It started with a question “Several years ago you told us about checking your MQ disks are consistent,  can you provide us with a link to any documentation please?”.

I’ll explain why this is important and what you need to do to ensure you have data integrity and you do not lose data integrity when you go to a backup site.

With some applications that write to multiple files, the order that data is actually written to the disk does not matter.  For example when you print data, it often stays in a buffer, and is written out when the buffer is full.

A transaction manager

With programs that handle transactions (a transaction manager) it is critical that writes to disk are done in the order they are issued.  If the writes are not in the correct order, then if the system crashes and tries to recover the transaction, the recovery may be missing key data (“it has taken the money from your account... it cannot see who should get the money?”) and so data integrity is lost.

With local disks, the sequence is

  • Write to file1,
  • Wait for confirmation that the IO has completed
  • Write to file2,
  • Wait for confirmation that the IO has completed
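
The local sequence above can be sketched with the JDK's FileChannel, where force(true) asks the operating system to push the data to the device before the call returns.  This is only an illustration of “write, wait, then write the next file”; it is not how MQ writes its logs, and the file names and class name are made up.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class OrderedWrites {
    /** Appends the data and does not return until the OS reports it has been forced out. */
    static void writeAndForce(Path file, String data) {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
            ByteBuffer buffer = ByteBuffer.wrap(data.getBytes(StandardCharsets.UTF_8));
            while (buffer.hasRemaining()) {
                channel.write(buffer);
            }
            channel.force(true); // wait for confirmation that the IO has completed
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes file1 then file2 in a temporary directory and returns what was written. */
    static String demo() {
        try {
            Path dir = Files.createTempDirectory("ordered-writes");
            Path file1 = dir.resolve("file1.log");
            Path file2 = dir.resolve("file2.dat");
            writeAndForce(file1, "debit account A;");  // returns only when forced
            writeAndForce(file2, "credit account B;"); // issued only after file1 is safe
            return new String(Files.readAllBytes(file1), StandardCharsets.UTF_8)
                    + new String(Files.readAllBytes(file2), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```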

Consider the case where file1 and file2 are on different file systems.  For example file1 could be the transaction log, and file2 could be queue data.  (Picture file system 1 on slow disks, and file system 2 on fast disks, so IO to file2 is faster than IO to file1.)

With mirrored disks with synchronous replication, the sequence is

  • Write to file1 local copy; send data to remote site, write to file1, send back OK when completed
  • Wait for confirmation that both IOs have completed
  • Write to file2 local copy; send data to remote site, write to file2, send back OK when completed
  • Wait for confirmation that both IOs have completed

With synchronous replication the two locations need to be within 10s of kilometers.  The response time of the file write depends on the distance.

With asynchronous replication the two locations can be 100s of kilometers apart.

In this case the sequence is

  • Write to file1 local copy; send data to remote site, write to file1, send back OK when completed
  • Wait for confirmation that the local IO has completed.
  • Write to file2 local copy; send data to remote site, write to file2, send back OK when completed
  • Wait for confirmation that the local IO has completed.

The disk subsystem manages the responses coming back from the remote end.

For capacity reasons there are usually multiple paths between the two sites.  It is possible that the data for file 2 gets there before the data for file1.  If the writes are done in the wrong order, this could be bad news.

Consistency group

The architecture of mirroring systems has the concept of a consistency group.  You define one or more consistency groups, and you put file systems into a consistency group.  For any files in the consistency group the write order will be honoured.  So in the case above, if the two files are in the same consistency group, the mirroring system will wait, write the data to file1, then write to file2.  This gives a solution with data integrity.
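
To picture what “the write order will be honoured” means, here is a toy model.  It assumes each write in a group carries a sequence number and that the remote end holds back any write that arrives early.  Real disk subsystems are far more involved and I am not claiming any product works this way; the class and method names are made up.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ConsistencyGroupSketch {
    private final Map<Long, String> pending = new HashMap<>(); // arrived early, held back
    private final List<String> applied = new ArrayList<>();    // written at the remote site
    private long nextSeq = 1;

    /** Called when a replicated write arrives, possibly out of order. */
    void receive(long seq, String write) {
        pending.put(seq, write);
        // Apply every write that is now contiguous with what has already been applied
        String next;
        while ((next = pending.remove(nextSeq)) != null) {
            applied.add(next);
            nextSeq++;
        }
    }

    List<String> applied() {
        return applied;
    }

    public static void main(String[] args) {
        ConsistencyGroupSketch remote = new ConsistencyGroupSketch();
        remote.receive(2, "file2: queue data"); // took the faster path, arrives first
        System.out.println(remote.applied());   // nothing applied yet
        remote.receive(1, "file1: log record");
        System.out.println(remote.applied());   // file1 first, then file2
    }
}
```

If file1 and file2 were in different groups, each would have its own sequence and nothing would hold the file2 write back, which is the exposure described above.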

The lurking problem.

Someone needs to assign the file systems to each consistency group.  The storage manager may have said

  • “all file systems are part of one consistency group”.
  • “production data is in one consistency group, test data is in another consistency group”
  • “I’ll guess, and hope people tell me their requirements”

How will I know if I have a problem?

The sure-fire way of finding out if you have a problem is to lose a site (for example a power outage).  For 99 times out of 100 it may be fine, and then one time in a hundred, you find you cannot restart your systems on the other site.  This is clearly the wrong time to find out.

Check with your storage administrator and give them information about the file systems that need to be part of the same consistency group.

Practice your fail over – perhaps weekly – at least monthly.

How to get my enterprise talking to your enterprise over MQ.

No,  this is not about Star Trek’s “beam me up Scottie”, but setting up MQ to MQ communications.

Of course you will want

  1. it to scale – you should be able to define more channels if you need more throughput than one channel can provide
  2. it to be secure – you can use TLS to protect the data on the channels
  3. it to be highly available – this is the trickier challenge which I’ll cover below.

Should I use clustering?

At first glance you may think clustering would be a good solution.   Many institutions are unhappy with being in a cluster with their business partner, so it is rare to have two enterprises both in the same cluster.

This means you are usually looking at a gateway between the two institutions, which means point-to-point.

There are two parts to being highly available:

  1. How long does it take before messages in the queue manager can flow out of the queue manager?
  2. How long does it take before new work can flow?

How long does it take before messages in the queue manager can flow out of the queue manager?

If your queue manager has ended, the messages are not available until the queue manager (and the channels) are available again.

If you are using z/OS and shared queue etc – other queue managers in the queue sharing group should be able to process the work.

If you are using mid-range MQ, you can use multi-instance queue managers.  One queue manager is active, the other is in standby, partially initialized, ready to take over once a failure has been detected.

How long does it take before new work can flow?

If you have only one queue manager, you have to wait until it restarts before work can flow.  If you have more than one queue manager, you can switch messages to avoid the down queue manager, by stopping channels into your gateway queue manager.  With clustering within your organization the messages should then flow to another gateway queue manager.

This will be much quicker than doing an emergency recreate of your queue manager!