Why do they ship java products on z/OS with the handbrake on? And how to take the brake off.

I noticed that it takes seconds to start MQ on my little z/OS machine, but minutes (feels like days) to start anything with Liberty Web server.  This include the MQWEB, z/OSMF,  and Z/OSConnect.  I mentioned this to an IBM colleague who asked if I was using Java Shared classes.  These get loaded into z/OS shared pages.

When I implemented it, my Liberty server came up in half the time!

I found this blog post which was very helpful, and showed me where to look for more information.  I subsequently found this document (from 2006!)

The kinder garden overview of how Java works.

  • You start with a program written in the Java language.
  • When you run this, Java converts it into byte codes
  • These byte codes get converted to native instructions  – so a byte code “push onto the stack” may become 8  390 assembler instructions.
  • This code can be optimised, for example code which is executed frequently can have the assembler instructions rewritten to go faster.  It might put code inline instead of out in a subroutine.
  • If you are using Java shared classes, this code can be written out and reused by other applications, or if you restart the server, it can reused what it created before.  Reusing the shared classes means that programs benefit because the byte codes have already been converted into native code, and optimisations have been done on the hot code.

What happens on z/OS?

By default, z/OS writes the code to virtual memory and does not save anything to disk.  If you restart your Java application within the same IPL, it can exploit the shared classes which have been converted to native code, and optimised – great- good design.   I found the second time I started the web server it took half the time.  However I IPL once a day, and start my web server once a day. I do not benefit from having it start faster a second time – as I only started it once per session. By default when you re-ipl, the shared classes code is discarded, and so next time you need the code, it has to be to convert to native instructions again, and it loses any optimisation which had been done.

What is the solution?

It is two easy steps:!

  1. Tell Java to write the information from memory to disk – to take a snaphot.
  2. After IPL tell Java to load memory from the disk image – to restore a snapshot.

It is as simple as that.

Background.

It is all to do with the java -Xshareclasses.

With your application you tell Java where to store information about the shared classed.  It defaults to Cache=/tmp/ name=javasharedresources.

In my jvm.options I overrode the defaults and specified

-Xshareclasses:nonFatal 
-Xshareclasses:groupAccess
-Xshareclasses:cacheDirPerm=0777
-Xshareclasses:cacheDir=/tmp,name=mqweb

If you give each application a name (such as mqweb)  you can isolate the cache to an application and not disrupt another JVM if you change the cache.  For example if you restore from a snapshot, only users of that “name” will be affected.

List what is in the cache

You can use the USS command,

java -Xshareclasses:cacheDir=/tmp/,listAllCaches

I used a batch job to do the same thing.

//IBMJAVA  JOB  1 
// SET V='listAllCaches' 
// SET C='/tmp/' 
//S1       EXEC PGM=BPXBATCH,REGION=0M, 
// PARM='SH java -Xshareclasses:cacheDir=&C,&V' 
//STDERR   DD   SYSOUT=* 
//STDOUT   DD   SYSOUT=*            

The output below, shows the cache name is mqweb.  Once you have created a snapshot it has an entry for it.

Listing all caches in cacheDir /tmp/                                                                          
                                                                                                              
Cache name       level         cache-type      feature         OS shmid       OS semid 
mqweb            Java8 64-bit  non-persistent  cr              8197           4101 

For MQWEB the default parameters are -Xshareclasses:cacheDir=/u/mqweb/servers/.classCache,name=liberty-%u” where /u/mqweb is the WLP parameter, where my parameter are defined, and %u is the userid the server is running under, so in my case liberty=START1.

When I had /u/mqweb/servers/.classCache, then the total command line was too long for BPXBATCH.   (Putting it into STDPARM gave me IEC020I 001-4 on the instream STDPARM because the resolved line wa greater than 80 characters.   I resolved this by adding -Xshareclasses:cacheDir=/u/mqweb,name=cache to the jvm.options file.

To take a snapshot


//IBMJAVA  JOB  1 
// SET C='/tmp/' 
// SET N='mqweb' 
// SET V='restoreFromSnapshot' 
// SET V='listAllCaches'
// SET V='snapshotCache' //S1 EXEC PGM=BPXBATCH,REGION=0M, // PARM='SH java -Xshareclasses:cacheDir=&C,name=&N,&V' //STDERR DD SYSOUT=* //STDOUT DD SYSOUT=* //

This job took a few seconds to run.

I believe you have to take the snapshot while your java application is executing – but I do not know for definite.

Restore a snapshot

To restore a snapshot just use restoreFromSnapshot in the above JCL. This took a few seconds to run. 

How to use it.

If you put the restoreFromSnaphot JCL at the start of the web server, it will preload it whenever you use your server.

If you take a snapshot every day before shutting down your server, you will get a copy with the latest optimisations.  If you do not take a new snapshot it continues to use the old one.

If you want to not use the shared cache you can get rid of it using the command destroySnapshot.

Is my cache big enough?

If you use the printStats request you get information like

Current statistics for cache "mqweb":                                                
...                                                                                     
cache size                           = 104857040                                     
softmx bytes                         = 104857040                                     
free bytes                           = 70294788 
...
Cache is 32% full                                     
                                                      
Cache is accessible to current user = true                                                 

The documentation says

When you specify -Xshareclasses without any parameters and without specifying either the -Xscmx or -XX:SharedCacheHardLimit options, a shared classes cache is created with a default size, as follows:

  • For 64-bit platforms, the default size is 300 MB, with a “soft” maximum limit for the initial size of the cache (-Xscmx) of 64MB, …

I had specified -Xscmx100m  which matches the value reported.

What is in the cache?

You can use the printAllStats command.  This displays information like

Classpath

1: 0x00000200259F279C CLASSPATH
/usr/lpp/java/J8.0_64/lib/s390x/compressedrefs/jclSC180/vm.jar
/usr/lpp/java/J8.0_64/lib/se-service.jar
/usr/lpp/java/J8.0_64/lib/math.jar

Methods for a class
  • 0x00000200259F24A4 ROMCLASS: java/util/HashMap at 0x000002001FF7AEB8.
  • ROMMETHOD: size Signature: ()I Address: 0x000002001FF7BA88
  • ROMMETHOD: put Signature: (Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object; Address: 0x000002001FF7BC50

This shows

  • there is a class HashMap. 
  • It has a method size() with no parameters returning an Int.  It is at…. in memory
  • There is another method put(Object o1, Object o2)  returning an Object.  It is at … in memory
Other stuff

There are sections with JITHINTS and other performance related data.

 

Taking the brakes off ZFS on z/OS – move it to OMVS

From z/OS 2.2 there is a performance advantage in running the ZFS file system as part of OMVS, rather than its own address space.  The IBM documentation says When running zFS in the OMVS address space, each file system vnode operation (such as creating a directory entry, removing a directory entry, or reading from a file) will have better overall performance. Each operation will take the same amount of time while inside zFS itself. The performance benefit occurs because z/OS UNIX can call zFS for each operation in a more efficient manner.  This will be relevant when you application is doing a lot of file IO – for example using a web server.

This move is not documented – but it is really easy!  It is mentioned here. Instructions are hidden in the installation instructions here.

Before I started

The IBM doc says You can determine if zFS is in its own address space by issuing D OMVS,PFS. If the output shows an ASNAME value, zFS is running as a colony address space.


OMVS     0010 ACTIVE             OMVS=(00,01,BP,IZ,RZ,BB)                
PFS CONFIGURATION INFORMATION                                            
 PFS TYPE   ENTRY      ASNAME    DESC      ST    START/EXIT TIME         
 ...   
  ZFS       IOEFSCM    ZFS       LOCAL     A     2021/02/17 17.35.06 

The steps I took…

  1. I added KERNELSTACKS(ABOVE) to USER.Z24A.PARMLIB(BPXPRM00).
  2. Being ultra cautious I re-ipled.
  3. The documentation talks about putting IOEZPRM DD in OMVS, then goes on to say As the preferred alternative to the IOEZPRM DDNAME specification, delete the IOEZPRM DDNAME and use the IOEPRMxx parmlib member.  So I did not change the OMVS proc.  When I reipled it worked and I got the message IOEZ00374I No IOEZPRM DD specified in OMVS proc. Parmlib search being used. 
  4. I edited USER.Z24A.PARMLIB(BPXPRM00) and removed the ASNAME in  FILESYSTYPE TYPE(ZFS) ENTRYPOINT(IOEFSCM)ASNAME(ZFS) Well I actually made a copy of the original line and put it between /* and */, then deleted the text.
  5. I reipled.

Afterwards

The  D OMVS,PFS command now gives  N/A instead of the Address Space Name

OMVS     0010 ACTIVE             OMVS=(00,01,BP,IZ,RZ,BB)                
PFS CONFIGURATION INFORMATION                                         
 PFS TYPE   ENTRY      ASNAME    DESC      ST    START/EXIT TIME         
...
  ZFS       IOEFSCM    N/A       LOCAL     A     2021/02/17 17.55.47  

Easy!

The hardest part was making sure I had an IPLable SARES1 in case I got it wrong!

Issuing commands…

I used to issue commands like f zfs,query,all. Now that the ZFS address space does not exist, you need to use  f omvs,pfs=zfs,query,all.

What should I monitor for MQ on z/OS – buffer pool

For the monitoring of MQ on z/OS, there are a couple of key metrics you need to keep an eye on for the buffer pools, as you sit watching the monitoring screen.

A quick overview of MQ buffer pools

An inefficient queue manager could have all messages stored on disk, in page sets.  An efficient queue manager would cache hot pages in memory to they can be accessed without doing any I/O.

The MQ Buffer Manager component does this caching, by using buffers  (so no surprise there).

A simplified view of the operation is as follows

  1. Messages are broken down into 4KB pages.
  2. Getting the contents of a pageset page, if the page is not in the buffers, the buffer manager reads it from disk into a buffer.
  3. If a page is requested and the contents are not required (for example it is about to be overwritten as part of a put) it does not need to read it from disk.
  4. If the page is updated, for example a new message, or a field in the message is updated during get processing,  the page is not usually written to disk immediately.   The write to disk is deferred (using a process called Deferred Write Process – another non surprise in the naming convention).  This has the advantage that there can be many updates to a page before it is written to disk.
  5. Any buffer page which has not been changed, and is not currently being used, is a free page.

If the system crashes, non persistent messages are thrown away, and persistent messages can be rebuilt from the log.

To reduce the restart time, pages containing persistent data are written out to the page set periodically.   This is driven by the log data set filling up which causes a checkpoint.  Updates which have been around for more than 2 checkpoints are written to disk.  During restart the page set knows how far back in the log restart needs to go.

In both cases, checkpoint and buffer pool filling up (when there are less than 15 % of free =85% in use) , once a page has been successfully written to the page set, the buffer is marked as free.

Pages for non-persistent messages can be written out to disk.

If the buffer pool is critically short of free buffers, and there are less than 5% free buffers, then pages are written to the page set immediately rather than use the deferred write process.  This allows some application work to be done while the buffer manger is trying to make more free pages available.

What is typical behaviour?

The buffer pool is working well.

When a message is got, the buffer is already in the buffer pool so there is no I/O to read from the page set.

The buffer pool is less than 85% full (has more than 15% free pages), there is periodic I/O to the page set because pages with persistent data are written to the page set at checkpoint.

The buffer pool fills up and has more 85% in-use pages.

This can occur if the number and size of the messages being processed is bigger than the size of the buffer pool. This could be a lot of messages in a unit of work,  big messages using lots of pages, or lots of transactions putting and getting messages.  It can also occur when there are lots of puts, but no gets.

If the buffer pool has between 85%  and 95% of in-use pages( between 15%  and 5% free pages),  the buffer manager is working hard to keep free pages available.

There will be I/O intermittently at checkpoints, and a steady I/O as the buffer manager writes the pages to the page set.

If messages are being read from disk, there will be read activity from the pageset, but the buffer pool page can be reused as soon as the data has been copied from the buffer pool page.

The buffer pool has less than 5% free pages.

The buffer manager is in overdrive.  It is working very hard to keep free pages in the buffer pool.   There will be pages written out to the page set as it tries to increase the number of free pages.  Gets may require reads from page set I/O.  All of this I/O can cause I/O contention and all the page set I/Os slow down, and so MQ API request using this buffer pool slow down.

What should be monitored

Most cars these days have a “low fuel light” and a “take me to a garage” light.  For monitoring we can provide similar.

Monitor the % buffer pool full.

  • If it is below 85% things are OK
  • If it is between 85% and 95% then this needs to be monitored, it may be “business as usual”.
  • If it is >=95% this needs to be alerted.  It can be caused by applications or channels not processing messages

Monitoring the number of pages written does not give much information.

It could be caused by a checkpoint activity, or because the buffer pool is filling up.

Monitoring the number of pages read from the page set can provide some insight.

If you compare today with this time last week you can check the profile is similar.

If the buffer pool is below 85% used,

  • Messages being got are typically in the buffer pool so there is no read I/O.
  • If there is read I/O this could be for messages which were not in the buffer pool – for example reading the queue after a queue manager restart.

If the buffer pool than 85% in-use  and less than 95% in-use  this can be caused by a large message work load coming in, and MQ is being “elastic” and queuing the messages.  Even short lived messages may be read from disk.  The number of read I/Os give an indication of throughput.  Compare this with the previous week to see if the profile is similar.

If the buffer pool is more than 95% in-use this will have an impact on performance, as every page processed is likely to have I/O to the page set, and elongated I/O response time due to the contention.

What to do

You may want “operators notes” either on paper or online which describe the expected status of the buffer pools on individual queue managers.

  1. PROD QMPR
    1. BP 1 green less than 85% busy
    2. BP 2 green less than 85% busy
    3. BP3 green except for Friday night when it goes amber  read I/O rate 6000 I/O per minute.
  2. TEST QMTE
    1. BP 1 green less than 85% busy
    2. BP 2 green less than 85% busy
    3. BP 3 usually amber – used for bulk data transfer

What to do the buffer statistics mean?

There are statistic on the buffer pool usage.

  1. Buffer pool number.
  2. Size of buffer pool – when the data was produced.
  3. Low – the lowest number of free pages in the interval.  100* (Size – log)/Size  gives you % full.
  4. Now – the current number of free pages in the interval.
  5. Getp – the number of requests ‘get page with contents’.  If the page is not in the buffer pool then it is read from the page set.
  6. Getn.   A new page is needed.   The contents are not relevant as it is about to be overwritten.  If the page is not in the buffer pool, just allocate a buffer, and do not read from the page set.
  7. STW – set write intent.  This means the page was got for update. I have not seen a use for this. For example
    1. A put wants to put part of a message on the page
    2. A get is being done and it wants to set the “message has been got” flag.
    3. The message has been got, and so pointers to a page need to be updated.
  8. RIO -the number of read requests to the page set.  If this is greater than zero
    1. The request is for messages which have not been used since the queue manager started
    2. The buffer pool had reached 85%, pages had been moved out to the page set, and the buffer has been reused.
  9. WIO the number of write I/Os that were done.  This write could be due to a checkpoint, or because the buffer pool filled up.
  10. TPW total pages written, a measure of how busy the buffer pool was.   This write could be due to a checkpoint, or because the buffer pool filled up.
  11. IMW – immediate write.  I have not used this value, sometimes I observe it is high, but it is not useful.  This can be caused by
    1. the buffer pool being over 95% full, so all application write I/O is synchronous,
    2. or a page was being updated during the last checkpoint, and it needs to be written to the page set when the update has finished.  This should be a small number.  Frequent checkpoints (eg every minute) can increase this value.
  12. DWT – the number of times the Deferred Write processor was started.  This number has little value.
    1. The DWP could have started and been so busy that it never ended, so this counter is 1.
    2. The DWP could have started, written a few pages and stopped – and done this repeatedly, in this case the value could be high.
  13. DMC.   The number of times the buffer pool crossed the 95% limit.  If this is > 0 it tells you the buffer pool crossed the limit
    1. This could have crossed just once, and stayed over 95%
    2. This could have gone above 95%, then below 95% etc.
  14. SOS – the buffer pool was totally short on storage – there were no free pages.  This is a critical indicator.  You may need to do capacity planning, and make the buffer pool bigger, or see if there was a problem where messages were not being got.
  15. STL – the number of times a “free buffer” was reused.  A buffer was associated with a page of a page set.   The buffer has been reused and is now for a different page.  If STL is zero, it means all pages that were used were in the buffer pool
  16. STLA – A measure of contention when pages are being reused.  This is typically zero.

Summary

Now you know as much as I do about buffer pools you’ll see that the %full (or %free) is the key measure.  If the buffer pool is > 85% used pages, then the I/O rate is a measure as to how hard the buffer manager is running.

Do performance measurements need to be accurate and precise?

Like many performance questions the answer is “it depends”.

This question is one of the “simple” questions – which most people get wrong. For example you can have a very precise answer which is inaccurate.

Precision

People often give the wrong precision. For example

  • What is the difference between 1 and 1.0?
  • What is the difference between 999 and 1000, and 1000.0?

What is the difference between 1 and 1.0?

With decimal points you can tell the precision from the number of digits after the decimal point.

  • The value 1 is a number between 0.5 and 1.49, with a range of almost 1
  • the number 1.0 is between 0.95 and 1.049. with a range of almost 0.1

What is the difference between 999 and 1000, and 1000.0?

With numbers without a decimal point, the convention is the number of significant non zero digits.

  • 999 is in the range 998.5 to 999.49 a range of almost 1
  • 1000 has one significant digit – so this can represent a number in the range 500 to 1490, a range of almost 1000!
  • 1000.0 represents a number in the range 999.5 to 1000.49 a range of almost 1

Is precision important ?

If you have a set of measurements with values 9.6 9.8,9.9, 10.0, 10.1, 10.2 10.4 the average is 10.0. Saying the average is 10.0 is misleading as it implies the value is between 0.995 and 10.049. The value have a spread – and in this case, saying the value is 10, is better than saying 10.0. You could say 10.0 ± 0.4 to give an even better answer.

Is accuracy important?

Depending on what you want to deliver from your measurements, the CPU figures you collect may not be accurate enough – or they may be good enough not to worry about.

Why are the CPU used figures inaccurate?

There is a gap between what is reported and what is actually used. For a high proportion of measurements the reported CPU is accurate enough. For some measurements you have to factor in this missing usage.

Below is a simplistic view of work being dispatched on an operating system.

  • The dispatcher scans the list of work to find the next work to dispatch
  • Get “Before – clock time”
  • Get “Before – processor CPU time”
  • Invoke the work. It does calculations, starts an IO request and waits for the response, by returning control to the dispatcher
  • Get “After – processor CPU time”
  • Get “After – clock time”
  • Calculate the difference in clock times and accumulate this in the “work – active time”
  • Calculate the different in the processor CPU times to determine how much CPU time was used for the work, and accumulate this in the “work – CPU time”
  • The dispatcher scans the list of work to find the next work to dispatch
  • etc

The elapsed time figures, and the CPU time used are both accurate, but they do not include the elapsed time or the CPU time the dispatcher took to initiate the work and update the statistics. We can give a better estimate of the true costs.

When running a measurement z/OS can report the CPU used by each engine across the duration of the measurement. We might have

  • Total CPU used across the duration 50 seconds
  • Total CPU used for job A 40 seconds
  • Total CPU used for job B 9.5 seconds
  • Difference in CPU 0.5 seconds

A common way of allocating this “missing CPU” is to allocate it to the jobs depending on the amount of CPU the jobs used in the period. Job A would have 40 + (0.5 *40/50) = 40.4, and job B would have 9.5 + (0.5 * 9.5/50) = 9.6, a total of 50.

Is this completely accurate? – no but it is more accurate than before!

Is this important?

If you want to see the effects of defining an index on an SQL table, a throughput improvement of 100 times or two time, it is good enough to say yes – it improver performance. This does not need to be precise or very accurate.

If you have enhanced your program to remove bottlenecks and can show the CPU per transaction has dropped from 1.2 to 1.1 units then this relative comparison is valid – even if the cost is not entirely accurate.

If you want to give numbers for other people to use, such as the cost of 1000 transactions a second is 1.2 seconds of CPU on this configuration, you need to ensure the figures are accurate.

Do performance measurements need to be accurate and precise?

For many measurements the numbers do not need to be very precise, and the accuracy may be good enough.

How do I use the mid-range MQ accounting data?

Like many topics I look at, this topic seems much harder than I expected.  I thought about this as I walked the long way to the shops, up a steep hill and down the other side (and came back the short way along the flat).  I wondered if it was because I ignore the small problems (like one queue manager and two queues), and always look at the bigger scale problems, for example with hundreds of queue managers and hundreds of queues.

You have to be careful processing and interpreting the data, as it is easy to give an inaccurate picture see “Be careful” at the bottom of this post.

Here are some of the things I found out about the data.

Enabling accounting trace

  • You need to alter qmgr acctQ(ON) acctqMQI(ON) before starting your applications  (and channels).  Enabling it after the program has started running does not capture the data for existing tasks.  You may want to consider enabling accounting trace, then shutdown and restart the queue manager to ensure that all channels are capturing data.
  • The interval ACCTINT is a minimum time.  The accounting records are produced after this time, and with some MQ activity.  If there is no MQ activity, no accounting record is produced, so be careful calculating rates eg messages processed a second.
  • Altering ACCTINT does not take effect until after the accounting record has been produced.  For example if you have ACCTINT(1800) and you change it to ACCTINT(60), the “ACCTINT(1800)” will have to expire, produce the records, then the ACCTINT(60) becomes operative.
  • You can use the queue attribute ACCTQ() to disable the queue accounting information for a queue.

The data

  • The MQI accounting has channel and connection information, the queue accounting does not.  This means you cannot tell the end point from just the queue data.
  • The MQI accounting and Queue accounting have a common field the “connectionId”.   It looks like this is made up with the first 12 characters of the the queue manager name, and a unique large number (perhaps a time stamp).  If you are using many machines, with similar long queue manager names you may want to use a combination field of machineName.connectionId, to make this truly unique.
  • I had an application using a dynamic reply to queue.  I ran this application 1000 times, so 1000 dynamic queues were used. When the application put to the server, and used the dynamic reply queue, the server had a queue record for each dynamic queue.   There were up to 100 queue sections in each queue record, and 11 accounting queue messages for the server ( 1000 dynamic queues, and one server input queue).  These were produced at the end of the accounting interval, they all had the same header information, connectionId, start time etc. You do not know in advance how many queue records there will be.
  • Compare this to using a clustered queue on a remote queue manager, the server queue accounting record on the remote system.had just two queues, the server input queue, and the SYSTEM.CLUSTER.TRANSMIT.QUEUE.
    • The cluster receiver channel on the originator’s queue manager had a queue entry for each dynamic queue.
  • In all my testing the MQI record was produced before the queue accounting record for a program.   If you want to merge the MQI and the Queue records, save information from the MQI record in a table, keyed with the connectionId. When the Queue records come along you use same connectioID key to get the connection information and MQI data.
    • You can get rid of the MQI key data from your table when queue record has less than 100 queues.
    • If the queue record has exactly 100 queues, you do not know if this is middle in a series or the last of the series.  To prevent a storage leak, you may want to store the time within the table and have a timer periodically delete these entries after a few seconds – or just restart the program once a day.
  • The header part of the data has a “sequenceNumber” field.    This  is usually incremented with every set of records.
  • On the SYSTEM.ADMIN.ACCOUNTING.QUEUE, messages for different program instances can be interleaved, for example client1 MQI, client2 MQI, client1 Queue, client3 MQI, client2 Queue, client1 Queue, client3 Queue.
  • You do not get information about the queue name as used by an application, the record has the queue name as used by the queue manager (which may be the same as that which the application used).  For example if your program uses a QALIAS -> clustered queue.   The queue record will have the remote queue name used: SYSTEM.CLUSTER.TRANSMIT.QUEUE, not what the application used.
    • You can use the activity trace to get a profile of what queues an application uses, and feed this into your processing
  • You do not get information about topics or subscriptions names.
  • You may want to convert connectionName  from 127.0.0.1 format to a domain name.

Using the data in your enterprise

You need to capture and reduce the data into a usable form.

From a central machine you can use the amqsevt sample over a client channel for each queue manager and output the data in json format.
I used a python script to process this data. For example:

  • Output the data into a file of format yymmdd.hh.machine.queueManager.json.  You can then use program like jq to take the json data for a particular day (or hour of day) and merge the output from all your queue managers to one stream, for reporting.
    • You get a new file every day/every hour for each queue manager, and allows you to archive or delete old files.
  • Depending on what you want to do with the data, you need to select different fields.  You may not be able to summarise the data by queue name,  as you may find that all application are using clustered queues, and so it is reported as SYSTEM.CLUSTER.TRANSMIT.QUEUE.  You might consider
    • ConnectionName – the IP address where the client came from
    • Channel Name
    • Userid
    • The queue manager where the application connected to
    • The queue manager group (see below)
    • The program name
    • Record start time, and record end time
    • The interval of the record – for a client application, this may be 1 second or less.  For a long running server or channel this will be the accounting interval
    • Number of puts that worked.
    • Number of gets that worked
    • Number of gets that failed.
    • Queue name used for puts
    • Queue name used for gets.
  • You can now display interesting things like
    • Over a day, the average number of queue managers used by an application.   Was this just one (a single point of failure), mostly using one (an imbalance), or spread across multiple queue managers(good).
    • If application did a few messages and ended, and repeated this frequently.  You can review the application and get it to stay connected for longer, and avoid expensive MQCONN and MQDISC, and so save CPU.
    • Did an application stay connected to one queue manager all day?  It is good practice to disconnect and reconnect perhaps hourly to spread the connections across available queue managers
    • You could charge departments for usage based on userid, or program name, and charge on the number or size of messages processed and connections (remembering that an MQCONN and MQDISC are very expensive).
    • You may want to group queue managers  so the queue managers FrontEnd1, FrontEnd2, FrontEnd3 would be grouped as FrontEndALL.   This provides summary information.

Be careful

The time interval of an accounting record can vary.  For example if you are collecting data at hh:15 and hh:45, and you business hours are  0900 to 1700.  If you have no traffic after 1700, and the next MQ request is 0900 the next day, the accounting record will be from 1645 today to 0901 tomorrow.

  • If you are calculating rates eg messages per second then the rates will be wrong, as the time interval is too long.
  • The start times of an interval may vary from day to day.  Yesterday it may have been 0910 to 0940, today it is 0929 to 0959.  It makes it hard to compare like for like.
  • If you try to partition the data into hourly buckets 0900 to 1000,1000 to 1100 etc. using data from 0950: 10:20 by putting a third of the numbers in the 0900:1000 bucket, and the other two thirds into 1000:1100 bucket, then for a record from1645 today to 0900 tomorrow will spread the data across the night and so give a false picture.

You may want to set your accounting interval to 30 minutes, and restart your servers  at 0830, so they are recording on the hour.  At the end of the day, shut down and restart your servers to capture all of the data in the correct time interval bucket

When is mid-range accounting information produced?

I was using the mid-range accounting information to draw graphs of usage, and I found I was missing some data.

There is a  “Collect Accounting” Time for every queue every ACCTINT seconds (default 1800 seconds = 30 minutes).  After this time, any MQ activity will cause the accounting record to be produced.  This does not mean you get records every half hour as you do on z/OS, it means you get records with a minimum interval of 30 minutes for long running tasks.

Setup

I had a server which got from its input queue and put a reply message to the reply-to-queue.

Every minutes an application started once a minute which put messages to this server, got the replies and ended.

When are the records produced?

Accounting data is produced (if collecting is enabled) when:

  • an MQDISC is issued, either explicitly implicitly
  • for long running tasks  the accounting record(s) seems to be produced at when the current time is past the “Collect Accounting time”, when there has been some MQ activity. For example  there were accounting records for a server at the following times
    • The queue manager was started at 12:35:51, and the server started soon afterwards
    • 12:36:04 to 13:06:33.   An application put a message to the server queue and got the response back.   This is 27 seconds after the half hour
    • 13:06:33 to 13:36:42  The application had been putting messages to the server and getting the responses back.   This is 6 seconds after the half hour
    • 13:36:42 to 14:29:48 this interval is 57 minutes.  The server did no work from 1400 to 14:29:48 ( as I was having my lunch).  At 14:29:48 a message arrived, and the accounting record was written for the server.
    • 14:29:48 to 15:00:27 during this time messages were being processed, the interval is just over the 30 minutes.

What does this mean?

  • If you want accounting data with an interval “on the half hour”, you need to start your queue manager “just before the half hour”.
  • Data may not be in the time period you expect.  If you have accounting record produced at 1645, the data collected between 1645 and 17:14  may not appear until the first message is processed the next day. The record havean  interval  from 16:45 to  09:00:01 the next day.  You may not want to display averages if the interval is longer than 45 minutes.
  • You may want to stop and restart the servers every night to have the accounting data in the correct day.

 

mqweb – how to get a chrome browser trace

How to get a chrome trace

See Troubleshooting Chrome network issues  and the description here on how to collect a trace.

  • Open a tab with the chrome://net-export/ url.
  • Click start logging to disk
  • Select a file location
  • In another tab select the mqweb url
  • Click on the “stop” button in the window
  • If you select show file – it opens the json file.   This has all the information you need to process the file, but it is much easier to use the provided tools
  • The filename is given for example “FILE: /home/colinpaice/Downloads/chrome-net-export-log.json
  • Click on “The log file can be loaded using the netlog_viewer.” link.   This gets you to a page which says
  • This app loads NetLog files generated by Chromium’s chrome://net-export. Log data is processed and visualized entirely on the client side (your browser). Data is never uploaded to a remote endpoint.
  • Select  https://netlog-viewer.appspot.com/ to invoke the formatter.
  • Drag your netlog file, or use “choose file”
  • Select events, and this displays all of the traffic
  • In the search bar at the top enter your port 9443, or error
  • You get a list like
  • NONE HOST_RESOLVER_IMPL_REQUEST
    1083 URL_REQUEST https://127.0.0.1:9443/ibmmq/console/
    1084 DISK_CACHE_ENTRY
    1085 HTTP_STREAM_JOB_CONTROLLER https://127.0.0.1:9443/
    1086 HTTP_STREAM_JOB https://127.0.0.1:9443/
  • If the background  is pale green – it is good.  If it is pink (pale red) there was a problem.
  • Click on a line and it displays trace information in a window.  For example the first URL_REQUEST gave
    • t= 8 [st= 8]        HTTP_STREAM_JOB_CONTROLLER_BOUND
                          --> source_dependency = 1089 (HTTP_STREAM_JOB_CONTROLLER)
      t=65 [st=65]        HTTP_STREAM_REQUEST_BOUND_TO_JOB
                          --> source_dependency = 1090 (HTTP_STREAM_JOB)
      t=65 [st=65]     -HTTP_STREAM_REQUEST
      t=65 [st=65]      URL_REQUEST_DELEGATE_SSL_CERTIFICATE_ERROR  [dt=1]
      t=66 [st=66]      CANCELLED
                        --> net_error = -200 (ERR_CERT_COMMON_NAME_INVALID)
      t=66 [st=66]   -URL_REQUEST_START_JOB
                      --> net_error = -200 (ERR_CERT_COMMON_NAME_INVALID)
      t=66 [st=66]    URL_REQUEST_DELEGATE_RESPONSE_STARTED  [dt=0]
      t=66 [st=66] -REQUEST_ALIVE
      
    • SSL_CONNECT_JOB gave me
      1087: SSL_CONNECT_JOB
      ssl/127.0.0.1:9443
      Start Time: 2020-01-29 08:41:25.699
      t= 1 [st= 0] +CONNECT_JOB  [dt=64]
      t= 1 [st= 0]    SOCKET_POOL_CONNECT_JOB_CREATED
                      --> backup_job = false
                      --> group_id = "ssl/127.0.0.1:9443"
      t= 1 [st= 0]   +SSL_CONNECT_JOB_CONNECT  [dt=64]
      t= 1 [st= 0]     +TRANSPORT_CONNECT_JOB_CONNECT  [dt=0]
      t= 1 [st= 0]        HOST_RESOLVER_IMPL_REQUEST  [dt=0]
                          --> address_family = 0
                          --> allow_cached_response = true
                          --> host = "127.0.0.1:9443"
                          --> is_speculative = false
      t= 1 [st= 0]        CONNECT_JOB_SET_SOCKET
      t= 1 [st= 0]     -TRANSPORT_CONNECT_JOB_CONNECT
      t=65 [st=64]      CONNECT_JOB_SET_SOCKET
      t=65 [st=64]   -SSL_CONNECT_JOB_CONNECT
                      --> net_error = -200 (ERR_CERT_COMMON_NAME_INVALID)
      t=65 [st=64] -CONNECT_JOB
      

Understanding Chromium trace and performance data

I found this link very useful to explain the developer information, such as trace, performance etc.

mqweb – performance notes

  • I found facilities in Liberty which can improve the performance of your mqweb server by 1% – ish, by using http/2 protocol and ALPN
  • Ive documented where time is spent in the mq rest exchange.

Use of http/2 and ALPN to improve performance.

According to Wikipedia, Application-Layer Protocol Negotiation (ALPN) is a Transport Layer Security (TLS) extension that allows the application layer to negotiate which protocol should be performed over a secure connection in a manner that avoids additional round trips and which is independent of the application-layer protocols. It is needed by secure HTTP/2 connections, which improves the compression of web pages and reduces their latency compared to HTTP/1.x.

mqweb configuration.

This is a liberty web browser configuration, see this page.

For example

 <httpEndpoint id="defaultHttpEndpoint"
   host="${httpHost}" 
   httpPort="${httpPort}"
   httpsPort="${httpsPort}"
   protocolVersion="http/2"
   >
   <httpOptions removeServerHeader="false"/>

</httpEndpoint>

Client configuration

Most web  browsers support this with no additional configuration needed.

With curl you specify ––http2.

With curl, ALPN is enabled by default (as long as curl is built with the ALPN support).

With the curl ––verbose option on a curl request,  you get

  • * ALPN, offering h2 – this tells you that curl has the support for http2.
  • * ALPN, offering http/1.1

and one of

  • * ALPN, server did not agree to a protocol
  • * ALPN, server accepted to use h2

The “* ALPN, server accepted to use h2” says that mqweb is configured for http2.

With pycurl you specify

 c.setopt(pycurl.SSL_ENABLE_ALPN,1)
 c.setopt(pycurl.HTTP_VERSION,pycurl.CURL_HTTP_VERSION_2_0)

Performance test

I did a quick performance test of a pycurl program getting a 1024 byte message (1024 * the character ‘x’) using TLS certificates.

HTTP support Amount of “application data” sent Total data sent.
http/1.1 2414 7151
http/2 2320 7097

So a slight reduction in the number of bytes send when using http/2.

The time to get 10 messages was 55 ms with http/2, and 77ms with http/1.1,  though there was significant variation in repeated measurements, so I would not rely on these measurements.

Where is the time being spent?

cURL and pycurl can report the times from the underlying libcurl package.  See TIMES here.

The times (from the start of the request) are

  • Name lookup
  • Connect
  • Application connect
  • Pre transfer
  • Start transfer
  • Total time

Total time- Start transfer = duration of application data transfer.

Connect duration = Connect Time – Name lookup Time etc.

For a pycurl request getting two messages from a queue the durations were

Duration in microseconds First message Second messages
Name_lookup 4265 32
Connect 53 3
APP Connect 18985 0
Pre Transfer 31 42
Start Transfer 12644 11036
Transfer of application data 264 235

Most of the time is spent setting up the connection, if the same connection can be reused, then the second and successive requests are much faster.

In round numbers, the first message took 50 ms, successive messages took between 10 and 15 ms.

Rant: I find the IBM Knowledge center on the web runs like a dog with a wooden leg

While playing with the mqweb stuff, I found I was searching for materials on mqweb in the IBM knowledge center.    I got fed up with it being so slow, so I’ve spent some time looking into it.  The slowness may be due to “performance code”  within the page which measures how slowly it goes.  We had a basset hound who had one of its front legs in plaster, and the display of the web pages reminds me of how it used to run.

It is so bad, I see the picture stuttering as it build up.

  • I see the blue header
  • then “Do you want to” which finally ends up at the bottom of the screen.
  • table of contents on the left hand side
  • the page with the content on it appears
  • finally the banner saying “free trial.   Try RESTful APIs to and from your IBM Z mainframe”.

This banner is annoying – I cannot  get rid of it.  It takes up 2cm out of the 15 cm space in my browser – that’s 13% of the real estate!  I keep being asked to give comments on the web site… I do, but I think any comments are being ignored.

I compared the IBM site with the BBC, which has lots of coloured image files,  using the “lighthouse” capability within the Chrome browser.

Site First meaningful paint, seconds Time to interactive, seconds
IBM 9.1 KC page q132130_.htm 0.6 5.8
BBC news page with lots of images 0.3 1.5

Wow, 5.8 seconds – even worse than I thought!

With my broadband, I get download speed of about 53 Mb/Second and upload about 17 Mb/Second.  Ping took about 30 ms to both IBM and to the BBC.  We are on an island, north of Scotland, so I think our response time is typical.

How did I get this data?

In Google Chrome,  Ctrl-shift I, select the Audits tab,  type your URL at the top, press enter

Select “desktop”, Performance, No Throttling.

Click on “Run Audits”.  It runs for a few seconds and stops.

There is a lot of good information.

If you click on “view trace” button, then at the bottom you get a summary chart.

  •      93 ms Loading
  • 3419 ms Scripting
  •   321 ms  Rendering
  •     31 ms Painting
  •  885 ms System

So most of the time is spent scripting!

What sites are used?

I took the trace file, extracted the records with “url” and counted the occurrences.

  • 7357 1.www.s81c.com – an IBM site
  • 5347 http://www.ibm.com
  •   240 tags.tiqcdn.com – Tealium enterprise tag management and marketing software.
  •     42 consent.trustarc.com – TrustArc Cookie Consent Manager
  •    34 9j ?
  •     25 consent.truste.com – TrustArc Cookie Consent Manager
  •     13 consent-st.trustarc.com – TrustArc Cookie Consent Manager
  •    12 js.logentries.com –  Live Log Management and Analytics
  •      7 mapvip.podc.sl.edst.ibm.com
  •      3 www-api.ibm.com
  •      3 idaas.iam.ibm.com

And there was me thinking that the knowlegde center was like a fat pipe squirting down the data, when in fact it sends data down a drop at a time. It also tells other sites what you are looking at.

You can use the “source” tab, and explore all the files which were downloaded.  For example  there is the >V9.1.0  jpg file, along with .js and .css files used in formatting.

What are the top use java script files?

There seem to be a couple of hot java script files, taking over 2 seconds.  (on http://www.s81c.com file  js/www.js… )  The text inside the files begin with IBMPerformance…   I think that a hot function within this,  is the time function, so maybe this code is timing every thing it does, and so slowing it down.

What helps me?

This link explains how to understand the trace and performance data from Chrome.

 

mqweb what’s the difference between the message API and the admin API?

At first glance it looks like the answer is in the question.  You can use

  • the messaging REST API put and get messages
  • the admin REST API to administer queue manager objects

In a couple of places the IBM documentation says you can use the messaging API to administer your objects, which is true at the general sense, but not the specific sense.  Until I hit a problem I thought there was one “messaging REST API” with different flavors of syntax.

Security

The admin API authorisation is managed through <security-role name=”MQWebAdmin”> and <security-role name=”MQWebAdminRO”> sections in the mqwebuser.xml file.

The messaging API authorisation is managed through <security-role name=”MQWebUser”> sections.

Access to resources is done using the Alternate Userid.  I can see in the activity trace that the userid is colinpaice(the id mqweb is running under), but the open of a queue was done with alternate userid testuser.  When I tried to browse messages on a queue, I got a message saying my userid did not have the correct authority. I used setmqaut, and mqsc command refresh security(*) to resolve it.

Cost of the admin interface

The admin interface has a request like

https://127.0.0.1:9443/ibmmq/rest/v1/admin/qmgr/QMA/queue/CP0000?attributes=*

which returns all of the attributes of the queue CP0000.  From the activity trace we can see

  • MQCONN + MQDISC
  • MQOPEN, MQINQ, MQCLOSE of the manager object – twice
  • MQOPEN, MQPUT, MQCLOSE to the SYSTEM.ADMIN.COMMAND.QUEUE
  • MQOPEN, MQGET, MQCLOSE to the SYSTEM.REST.REPLY.QUEUE
  • MQCMIT
  • MQBACK – the JMQI code always does this to be sure that there is no outstanding unit of work,

The most expensive request is the MQCONNect.

Using the admin interface is fine for administration because changes to objects are usually done infrequently.   If you are considering the admin interface to monitor objects, for example plot queue depths over time, the mq rest API may not be the best solution.

Cost of the messaging interface

The messaging API interface uses connection pooling.   When the application does an MQDISC, the connection is returned to a pool, and can be reused if the same userid does an MQCONN.  If the connection is not used for a period, it can be removed from the pool and an MQDISC done to release the connection.    This should eliminate frequent MQCONN and MQDISCs.

From the activity trace we see

MQOPEN, MQGET,MQGET,MQCLOSE of the queue, and no MQCONN.

There will be an MQCONN, is there is no connection available for that userid in the pool, but this should be infrequent.