Writing your own program to process Published Monitoring data.

How it works

You subscribe to a topic, and the data is delivered to the queue you specify at regular intervals, typically every 10 seconds. The data is in a PCF-like format.

A subscription for the data looks like

DEF SUB(MONCPUSS) +
TOPICSTR('$SYS/MQ/INFO/QMGR/QMA/Monitor/CPU/SystemSummary') +
DEST(MYQUEUE)

or, for all (both) of the CPU records,

DEF SUB(MONCPUALL) +
TOPICSTR('$SYS/MQ/INFO/QMGR/QMA/Monitor/CPU/#') +
DEST(MYQUEUE) +
USERDATA('MONCPUALL')

Layout of the data

With the MQ statistics and accounting information that was available before V9, the data is provided in a PCF message as a set of structures, each with an identifier and a value. For example, there is a structure with the identifier MQIAMO64_PUT_BYTES, and its data is a 64-bit number.
With the published data you do not get an identifier and a value. Each data value is identified by a key made up of three values (the monitor class, the monitor type, and a field number, as the example data later in this post shows), and you have to use the key to look up what the value represents. You can write your own mapping from keys to meanings, or subscribe to a METADATA topic, which gives you the mapping and the units of the data, and use that to look up each field. The original design was that your processing program subscribes to the metadata when it starts, gets the retained values, and then uses these values in its processing.
This is not very usable. For example:

  1. The published messages may be put to a remote queue, so that a centralised program can process the messages from many queue managers. That program will not be able to subscribe to the metadata topic. Also, if the fields differ between releases, your program has to manage which fields are available for each queue manager.
  2. The descriptions provided by the metadata may not be suitable for your processing (for example as column headings in a spreadsheet, or as field names in Splunk or Elasticsearch).

I wrote my own lookup value tables, so I could use my existing processing.
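A lookup table of this kind can be as simple as a dictionary keyed on the three values. The following is a minimal sketch, not the actual tables used for this post: the names FIELD_TABLE and describe_field are made up for illustration, and only a few DISK/Log entries (class 1, type 2, described later in this post) are filled in.

```python
# Hypothetical lookup table:
#   (monitor class, monitor type, field key) -> (unit, description)
# Only a few DISK/Log entries are shown; a real table would cover
# every class and type you subscribe to.
FIELD_TABLE = {
    (1, 2, 0): ("Unit", "Log - bytes in use"),
    (1, 2, 1): ("Unit", "Log - bytes max"),
    (1, 2, 4): ("Delta", "Log - physical bytes written"),
    (1, 2, 6): ("Microseconds", "Log - write latency"),
    (1, 2, 7): ("Percent", "Log - current primary space in use"),
}


def describe_field(monitor_class, monitor_type, key):
    """Return (unit, description), or a placeholder for keys not in the table."""
    placeholder = (
        "Unknown",
        "class %d type %d field %d" % (monitor_class, monitor_type, key),
    )
    return FIELD_TABLE.get((monitor_class, monitor_type, key), placeholder)


print(describe_field(1, 2, 7))
print(describe_field(1, 2, 99))
```

Returning a placeholder for unknown keys, rather than raising an error, means a record from a newer release with extra fields can still be processed.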

PCF record has the wrong type.

Using my Python code to process the messages on the queue, the data looks like

"PCFheader": {
  "Type": "STATISTICS",
  "StrucLength": 36,
  "Version": 3,
  "Command": "NONE",
  "MsgSeqNumber": 1,
  "Control": "LAST",
  "CompCode": 0,
  "Reason": 0,
  "ParameterCount": 14
}

The PCF header has "Type": "STATISTICS", which is confusing. I think it needed a new type of "MONITORING".

You cannot immediately tell if this is a true Statistics record or a Monitoring record. There are several ways to find out:

  1. The MQMD for the monitoring record has "PutApplType": "QMGR_PUBLISH". For a true Statistics record the value is "PutApplType": "QMGR".
  2. You could dig deep into the data and see whether there is a MONITOR_CLASS field (you may have to process all of the fields, as the order of the structures can change).
  3. You can use message properties, or get the RFH2 header, to find the topic used to publish the data. I extended this by specifying userdata(…..) on the DEF SUB. This data was returned in the RFH2 header, so I could look at the RFH2 header and see what the data was for.

"RFH2": [
  {
    "StrucId": "RFH ",
    "mqps": "<mqps>
      <Top>$SYS/MQ/INFO/QMGR/QMA/Monitor/CPU/QmgrSummary</Top>
      <Sud>COLIN1</Sud>
    </mqps> "
  }

I could tell which topic it came from, and from the userdata (COLIN1) I could go back to the subscription. You could also try a command like dis sub(*) where(topicstr,lk,'$SYS/MQ/INFO/QMGR/QMA/Monitor/CPU/*') to display which subscriptions are for this topic.
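The checks above could be sketched in Python as follows. This is an illustration under assumptions, not a definitive implementation: it assumes the MQMD and the data have already been parsed into dictionaries shaped like the JSON output shown in this post, and that the mqps folder is available as a string. The function names parse_mqps and is_monitoring_record are hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_mqps(mqps_xml):
    """Extract the topic (<Top>) and subscription user data (<Sud>) from an mqps folder."""
    root = ET.fromstring(mqps_xml.strip())
    return root.findtext("Top"), root.findtext("Sud")


def is_monitoring_record(mqmd, data):
    """Check 1 (the MQMD PutApplType), then check 2 (a MONITOR_CLASS field)."""
    if mqmd.get("PutApplType") == "QMGR_PUBLISH":
        return True
    return "MONITOR_CLASS" in data


# The mqps folder from the RFH2 example above, on one line.
mqps = ("<mqps><Top>$SYS/MQ/INFO/QMGR/QMA/Monitor/CPU/QmgrSummary</Top>"
        "<Sud>COLIN1</Sud></mqps> ")
topic, userdata = parse_mqps(mqps)
print(topic, userdata)
print(is_monitoring_record({"PutApplType": "QMGR_PUBLISH"}, {}))
```

Checking the MQMD first is cheap. Looking for MONITOR_CLASS in the data is a fallback that works even if the MQMD is not available to the program.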

What the data looks like for published Monitoring data

In JSON format, the data for DISK/Log looks like

"Data": {
  "Q_MGR_NAME": "QMA",
  "MONITOR_CLASS": 1,
  "MONITOR_TYPE": 2,
  "MONITOR_INTERVAL": 10000233,
  "0": 50331648,
  "1": 83886080,
  "2": 16443994112,
  "3": 19549782016,
  "4": 0,
  "5": 0,
  "6": 919,
  "12": 4118,
  "7": 3496,
  "8": 5010
}

MONITOR_CLASS 1 is DISK and, within DISK, MONITOR_TYPE 2 is Log, so this data is for the DISK/Log subscription.

You always get Q_MGR_NAME, MONITOR_CLASS, MONITOR_TYPE, and MONITOR_INTERVAL.

The monitor interval is in microseconds, so 10000233 is 10.000233 seconds.

Note that the data fields are not always in order (12 comes before 7 above), and some fields were not present (9, 10, 11).

For DISK/Log the available fields are

  • Key Units: Description: value from the data above
  • 0000 Unit: Log – bytes in use: 50331648
  • 0001 Unit: Log – bytes max: 83886080
  • 0002 Unit: Log file system – bytes in use: 16443994112
  • 0003 Unit: Log file system – bytes max: 19549782016
  • 0004 Delta: Log – physical bytes written: 0
  • 0005 Delta: Log – logical bytes written: 0
  • 0006 Microseconds: Log – write latency: 919
  • 0007 Percent: Log – current primary space in use: 3496
  • 0008 Percent: Log – workload primary space utilization: 5010
  • 0009 MB: Log – bytes required for media recovery: missing
  • 0010 MB: Log – bytes occupied by reusable extents: missing
  • 0011 MB: Log – bytes occupied by extents waiting to be archived: missing
  • 0012 Unit: Log – write size: 4118
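Because fields can arrive in any order and some can be absent, it is safer to process a record by key than by position. A minimal sketch, assuming the record has been parsed into a dictionary like the "Data" example above; the names DISK_LOG_FIELDS and decode_disk_log are made up for illustration.

```python
# Descriptions for DISK/Log (MONITOR_CLASS 1, MONITOR_TYPE 2), from the list above.
DISK_LOG_FIELDS = {
    0: "Log - bytes in use",
    1: "Log - bytes max",
    2: "Log file system - bytes in use",
    3: "Log file system - bytes max",
    4: "Log - physical bytes written",
    5: "Log - logical bytes written",
    6: "Log - write latency",
    7: "Log - current primary space in use",
    8: "Log - workload primary space utilization",
    9: "Log - bytes required for media recovery",
    10: "Log - bytes occupied by reusable extents",
    11: "Log - bytes occupied by extents waiting to be archived",
    12: "Log - write size",
}

HEADER_FIELDS = ("Q_MGR_NAME", "MONITOR_CLASS", "MONITOR_TYPE", "MONITOR_INTERVAL")


def decode_disk_log(data):
    """Return (named values, sorted list of field keys absent from the record)."""
    values = {}
    for key, value in data.items():
        if key in HEADER_FIELDS:
            continue
        name = DISK_LOG_FIELDS.get(int(key), "Unknown field %s" % key)
        values[name] = value
    missing = sorted(k for k in DISK_LOG_FIELDS if str(k) not in data)
    return values, missing


record = {
    "Q_MGR_NAME": "QMA", "MONITOR_CLASS": 1, "MONITOR_TYPE": 2,
    "MONITOR_INTERVAL": 10000233,
    "0": 50331648, "1": 83886080, "2": 16443994112, "3": 19549782016,
    "4": 0, "5": 0, "6": 919, "12": 4118, "7": 3496, "8": 5010,
}
values, missing = decode_disk_log(record)
print(missing)
print(values["Log - write size"])
```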

The units are

  • Unit: This is an absolute number, and a delta would not be meaningful.
  • Delta: This is (value at the end of the interval) – (value at the start of the interval). This could be the number of messages put, or the number of bytes processed.
  • Hundredths: To get a useful number, convert it to floating point and divide by 100.
  • KB:
  • Percent: To get a useful number, convert it to floating point and divide by 100, so 3496 becomes 34.96%.
  • Microseconds:
  • MB:
  • GB:
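The scaling for Hundredths and Percent could be sketched as below. The function name scale_value is hypothetical. Other units are passed through unchanged, which is an assumption of this sketch.

```python
def scale_value(unit, raw):
    """Scale a raw published value according to its unit.

    Percent and Hundredths are published as integers 100 times too large;
    everything else is returned unchanged (an assumption of this sketch).
    """
    if unit in ("Percent", "Hundredths"):
        return raw / 100.0
    return raw


print(scale_value("Percent", 3496))
print(scale_value("Unit", 4118))
```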

With some data you can convert a figure to a rate. You are given the interval, so (Log – physical bytes written)/interval and (Log – logical bytes written)/interval are useful rates.
The rate of MQ requests a second is interesting, but (RAM available)/interval is not a valid calculation, because RAM available is an absolute value and not a delta.
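A rate calculation could look like the sketch below. It only makes sense for Delta fields. The function name per_second is hypothetical, and the delta in the example is a made-up figure, since the sample record above has 0 for both Delta fields.

```python
def per_second(delta, interval_microseconds):
    """Convert a Delta value into a rate per second, using MONITOR_INTERVAL."""
    seconds = interval_microseconds / 1_000_000
    return delta / seconds


# MONITOR_INTERVAL from the sample record; the delta is a made-up value.
interval = 10000233
print(per_second(5_000_000, interval))
```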

You may want to calculate these useful fields as you process them.

I am working on some Python scripts to output the data in JSON format.

You cannot use the very good sample amqsevt, because the data in the monitoring records is not true PCF data. For example, for the log data above the sample produced

"eventData" : {
  "queueMgrName" : "QMA",
  "monitorClass" : 1,
  "monitorType" : 2,
  "monitorInterval" : 10000776,
  "monitorFlagsNone" : 50331648,
  "applType" : 83886080,
  "codedCharSetId" : 16457003008,
  "currentQueueDepth" : 19549782016,
  "defInputOpenOption" : 0,
  "defPersistence" : 0,
  "defPriority" : 965,
  "usage" : "Unknown [4128]",
  "definitionType" : "Unknown [3512]",
  "hardenGetBackout" : "Unknown [5018]"

The key of 3 maps to MQIA_CURRENT_Q_DEPTH, so the wrong descriptors and values are displayed. It should be Log file system – bytes max.
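The collision can be illustrated with two small tables. This is only a sketch of the idea, not how amqsevt is implemented: the generic names are taken from the amqsevt output above, the monitoring names from the DISK/Log list, and the function name label is hypothetical.

```python
# Names a generic PCF formatter gives to integer keys 1 to 3
# (taken from the amqsevt output above).
GENERIC_PCF_NAMES = {
    1: "applType",
    2: "codedCharSetId",
    3: "currentQueueDepth",
}

# What the same keys mean in a DISK/Log monitoring record.
DISK_LOG_NAMES = {
    1: "Log - bytes max",
    2: "Log file system - bytes in use",
    3: "Log file system - bytes max",
}


def label(key, is_monitoring):
    """Pick the name table by record kind, not by key alone."""
    table = DISK_LOG_NAMES if is_monitoring else GENERIC_PCF_NAMES
    return table.get(key, "Unknown [%d]" % key)


print(label(3, is_monitoring=False))
print(label(3, is_monitoring=True))
```

The same key gives two different names, so a program has to know it is looking at a monitoring record before it chooses a lookup table.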