IBM Blog 2016 November

So when does a message really expire?

Nov 28 2016

I love going to customers because they use MQ in many different ways.
A customer’s problem was that they send a request to a back end, but sometimes the back end does not reply fast enough. The question was: how can they use MQ to tell an application when it has not received a reply within a given time period?

Thanks to Gwydion Tudur for the answer below.

The plan was to put a message with an expiry time to a local queue, as well as sending the message to the back end. If the back end replies in time, then get the message from the local queue. If the back end does not reply, then the local message expires. The expiry report message, put to the reply-to queue named in the MQMD of the local message, can be used to trigger an application to respond to the end user.
Will this work? Yes, but perhaps not as expected.

The next question was: if I specify a timeout of 2 seconds, will the expiry report message always be delivered after 2 seconds? The answer is possibly not.

If there is high activity on the local queue (frequent MQGETs) and the queue depth is low, then you may get your expiry report message close to your expiry time. If the activity on the queue is very low, then your expiry report may not be generated until the next EXPRYINT scan of the queue.

Digging into this in more detail…

There are two ways of checking for expiry:

When an application does an MQGET on a queue, and the queue manager comes across a message where the expiry time has passed, it will delete the message and generate the expiry report message. If there are 1000 messages on the queue, and it takes 1 millisecond to process each message, then it will take 1 second before the last message is processed, and so the report message for the last message would be generated after 1 second. This relies on applications performing frequent MQGETs on the queue.
The queue manager can be enabled to check queues for expired messages; see the queue manager attribute EXPRYINT. There is one QMGR task which checks all queues; EXPRYINT specifies (in seconds) how often this task runs to scan queues and discard expired messages. The minimum is 5 seconds. If you have many queues, or queues with many messages, this can be disruptive. If you have a queue with millions of messages, and these messages are on a page set, then these messages will have to be read in and checked, which may take over a second. In this case it is unlikely that a message with a 2 second expiry will have its report message generated close to 2 seconds.
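The timing of the two mechanisms can be sketched with a little arithmetic. This is only a rough model, not how the queue manager is implemented: the function names are made up for illustration, scans are assumed to run at exactly regular intervals, and a scan that runs at the moment of expiry is assumed to catch the message.

```python
import math


def queue_walk_ms(queue_depth, per_message_ms):
    """Time for an MQGET-driven walk to reach the last message on the queue.

    Hypothetical model: every message costs the same time to process.
    """
    return queue_depth * per_message_ms


def first_scan_at_or_after(expiry_s, scan_interval_s, first_scan_s=0):
    """Time (seconds after the put) of the first periodic scan that can see
    the message as expired.

    Assumes scans run at first_scan_s, first_scan_s + scan_interval_s, ...
    """
    if scan_interval_s <= 0:
        raise ValueError("scan interval must be positive")
    if expiry_s <= first_scan_s:
        return first_scan_s
    scans_needed = math.ceil((expiry_s - first_scan_s) / scan_interval_s)
    return first_scan_s + scans_needed * scan_interval_s


# 1000 messages at 1 ms each: the last one is reached after 1000 ms.
print(queue_walk_ms(1000, 1))
# 2 second expiry, scan every 5 seconds starting at put time: report at 5 s.
print(first_scan_at_or_after(2, 5))
```

With no MQGET activity, the report time depends on where the scan cycle happens to be when the message is put, so a 2 second expiry can turn into anything up to the scan interval.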

How many buffers do I need for SMDS?

Nov 28 2016

I was asked this at a customer, and we found this is not covered in the Knowledge Centre or anywhere else. I gave the classic performance response ‘it depends’. The customer then said ‘ok it depends on what?’, and I said I would look into it and write it up. I’d like to thank Gwydion Tudur for help with the answer.

There are two parts to the answer – an easy part, and a hard part. I’ll give an overview of how SMDS works, and cover the easy part first.

If there is a shortage of buffers this means some requests are slower, and can lead to increased IO to the data sets.

SMDS Overview

Each queue manager has an SMDS which it writes to. An entry is still written to the Coupling Facility structure for each message offloaded to SMDS, and contains data such as QueueManagerName.Record_number_in_data_set.
If my application is on MQP1, and it does an MQPUT, the information may be like MQP1.9009.
When an MQGET is issued, the data in the CF structure says where to get the message from. For example, if my application is connected to MQP1, and the CF structure data says MQPQ.4993, then the data set for queue manager MQPQ is used and record 4993 is read.
If the message was put using the same queue manager as the application doing the MQGET then the data may already be in the SMDS buffers, and so you avoid a disk read.
Once a message has been deleted the buffer is marked as empty.

The easy part

The queue manager needs to write the message data to SMDS. To do this it will use a buffer.

If there is no buffer available then the application will wait for a free buffer.
If the data fits into one buffer, then one buffer is used.
If the data is bigger than one buffer, and if buffers are available, then more buffers will be used, and the IOs done in parallel.
If there is a shortage of buffers then the application reuses the same buffer, and the IOs are done sequentially.

As soon as an IO has completed, the buffer is available for reuse.
If the put message rate is low, and the message data fits into one buffer, then one buffer may be enough. As the put rate increases then there will be more IOs in parallel and so more buffers may be needed.
If there are bigger messages then more buffers should be used. The same logic applies when getting messages from an SMDS, but see below for the hard bit.

You need enough buffers to be able to do all of your IOs without waiting for buffers. The DISPLAY USAGE TYPE(SMDS) command tells you about the buffers in use. If the value of Lowest free is negative then you have had applications waiting, and you should increase the number of buffers by at least this value. For many people the defaults are fine.
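The rule of thumb above amounts to a one-line calculation. The sketch below assumes you have read the current buffer count and the Lowest free value off the command output by hand; the function name is hypothetical and nothing here parses real MQ output.

```python
def suggested_buffer_count(current_buffers, lowest_free):
    """Minimum buffer count suggested by the 'Lowest free' rule of thumb.

    A negative lowest_free means applications had to wait for a buffer,
    so the count is raised by at least that shortfall. A zero or positive
    value means no change is needed on these grounds.
    """
    if current_buffers <= 0:
        raise ValueError("current_buffers must be positive")
    if lowest_free >= 0:
        return current_buffers
    return current_buffers + abs(lowest_free)


print(suggested_buffer_count(100, -12))  # shortfall of 12 buffers
print(suggested_buffer_count(100, 5))    # never ran out
```

This gives a floor, not a target: it covers waits for buffers, but says nothing about how many extra buffers would help MQGETs avoid reads from the data set.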

The hard part
The simple view above is fine for many MQ environments. Having more buffers may improve performance when getting messages from the SMDS on the same queue manager.

After data is written to the SMDS the data is kept in a buffer until the buffer needs to be reused. If an MQGET is issued, and the data is still in a buffer, then the buffer is reused, and this avoids an IO to read from the SMDS. This is faster than having to read from the SMDS. In the DISPLAY USAGE TYPE(SMDS) command, the ‘saved’ buffers have valid data. The reads saved % tells you what proportion of the MQGET requests were satisfied from the Saved buffers.
This use of a saved buffer to read a message will only apply to messages put by the queue manager where the MQGET is issued – if you get a message from a different queue manager, it is in the other queue manager’s buffer, not where the MQGET is issued from.
If an application browses a message from any queue manager’s SMDS, and then re-browses, or gets the message, the data may be in saved buffers.

If the queue depths are low (for example below 100) then having 100 buffers may be enough so the MQGETs are from the buffer and not from the disk.
If the queue depth is high, or you are using very large messages then you may need many buffers. As long as Saved buffers is less than Total buffers your gets should be from the SMDS buffers.
It may be impractical to have enough buffers to avoid all reads from the data set. If you have a very large number of buffers, then this will use a lot of virtual storage, which will need a lot of real storage, and both of these may impact the z/OS system – more auxiliary storage required, and more real storage to avoid paging.

If you have an application where the message is put and then immediately got, this may use a saved buffer. If you have an application where the messages are long lived, then the data may not be in saved buffers.

The hard part of sizing the number of SMDS buffers is to allocate enough for most of the time, but not so many that they cause problems.

The best way to find out is to gradually increase the number of buffers and see if you get benefit without causing problems. You may find that a number like 100 buffers is enough.

What you need to do now

Use the DISPLAY USAGE TYPE(SMDS) command and check that Lowest free is positive. If it is not, then increase the number of buffers.

What’s the difference between a CPACF and a crypto express?

Nov 21 2016

My colleague Tony Sharkey wrote the following words for a customer. I thought they were worth sharing.

z Machines (z13) these days have:

4 drawers,
Each drawer has 2 nodes
Each node has 3 memory chip modules (MCM)
Each MCM has between 6 and 8 processors, which can be configured as GP’s, zIIP, IFL etc

Each processor also has a CPACF (CP Assist for Cryptographic Function)
– This changed on zEC12 (prior to that, it was 1 CPACF per 2 processors)

The CPACF processor is used for encryption, decryption and hashing, and supports a ‘special’ instruction set. The instructions are used by System SSL (GSKit), which MQ (and MQ AMS) exploits.
They must be enabled (feature #3863).
Work run on CPACF is charged to the owning Address Space
Work is run in series – i.e. either GP or CPACF is doing work – not in parallel.
– on zEC12 onwards, this means no waiting for the CPACF processor.

MQ can run SSL channels which are secure using just GP’s (and CPACF).

CryptoExpress cards
However there are CryptoExpress (CEX) cards which can be added to offload (some) of the cost of cryptography.
CEX cards can be configured as co-processors or accelerators or PKCS processors.

Each card has a number (8?) of processors that can be configured for different purposes.

MQ can use either co-processors or accelerators. We have been given guidance that the accelerator is more optimal for MQ’s purposes.

MQ (as it uses System SSL) can only offload secret key negotiation to the CEX card, i.e. at channel start and when the SSLRKEYC trigger is met.

In reality, some part of the key negotiation will be performed on GP (and CPACF) regardless of CEX availability.

Also certain SSLCIPH specs are not supported by the CEX cards (as per https://www.ibm.com/developerworks/community/blogs/c4142f9d-6cf1-44ef-a44a-b09428ad96d1/entry/is_my_ssl_channel_using_hardware_assist?lang=en )

MQ does not need CEX to run – it can work perfectly well with just GP (and CPACF), but you will see increased cost relating to secret key negotiation, and this may have an impact on what else the processors can do.

This is documented in MP16 (see https://ibm-messaging.github.io/mqperf/mp16.pdf) in the Channel Initiator section, specifically SSL and TLS.

So now you know.

Unexpected RACF message when using CHINIT

Nov 4 2016

We had a customer report a RACF message about the chinit accessing the wrong TCP stack.
When I recreated it on my system I got:

ICH408I USER(SCENSTC ) GROUP(… ) NAME(… )
EZB.STACKACCESS.MVCA.TCPIP CL(SERVAUTH)
WARNING: INSUFFICIENT AUTHORITY – TEMPORARY ACCESS ALLOWED
FROM EZB.** (G)

My chinit was using TCPIP2 stack – not TCPIP.
There is an MQ APAR for this: PI67827.

How do I edit that ASCII file in uss?

Nov 3 2016

I had to edit a file for WAS configuration, and WAS needs the files in ASCII – so I was at a loss as to how to edit it.

When I used oedit or obrowse from a uss command line, the contents were garbage.
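The garbage is simply ASCII bytes being interpreted as EBCDIC. A small illustration, run anywhere Python is available: cp037 is used here only as a representative EBCDIC code page that ships with Python, and the code page on a given z/OS system may well be different. The sample text and function name are invented for the example.

```python
ASCII_CODEC = "iso8859-1"   # the code page the file was written in
EBCDIC_CODEC = "cp037"      # stand-in EBCDIC code page (an assumption)


def read_as(raw_bytes, codec):
    """Interpret the same bytes under a given code page."""
    return raw_bytes.decode(codec)


text = "server.port=9080"          # made-up configuration line
raw = text.encode(ASCII_CODEC)     # what is actually stored in the file

print(repr(read_as(raw, ASCII_CODEC)))   # readable
print(repr(read_as(raw, EBCDIC_CODEC)))  # unreadable: wrong code page
```

The bytes in the file never change. What changes is which code page the viewer or editor assumes, and that is what tagging the file controls.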

Paul Dennis told me to use ISPF 3.17 (udlist – UNIX directory list) and then the line command va to view in ASCII and ea to edit in ASCII!

I also found the chtag command to enable this for uss command line.

chtag -tc ISO8859-1 CHANGES.TXT

oedit CHANGES.TXT then displays the data ok – but obrowse does not.

To display tag information use chtag -p filename

This gave

t ISO8859-1 T=on filename

where

t is text

ISO8859-1 is the ASCII code page

T=on indicates the file has uniformly encoded text data. Only files with txtflag = ON and a valid codeset are candidates for automatic conversion

Before the file had been tagged the output was

– untagged T=off filename

For more information see the “Using Enhanced ASCII Functionality” topic of z/OS UNIX System Services Planning.

For the list of prefix commands see here

Why is my DB2 stored procedure using a lot of CPU?

Nov 1 2016

Using SQL on z/OS you can use functions that map to MQPUT and MQGET to process MQ messages.
In this case the MQGET wait time was too small, which meant that there were repeated MQCONN … MQOPEN, MQGET, MQDISC calls, and these added to the CPU time.

Ok, that was the answer for those in a hurry. In the rest of this blog I describe in more detail how to use MQ from DB2.

Simple SQL statement

SELECT LENGTH(MSG),MSG FROM PROCTABLE

MSG is a column in table
Length() is a function

This can return all of the rows (or be limited to perhaps 100 rows)

SQL statement using MQ

SELECT DB2MQ.MQSEND('TRY_SEND2', 'CPPOLICY2', 'TEST DATA') FROM SYSIBM.SYSDUMMY1;

MQSEND is a special function

DB2 has configuration tables for MQ. I set these up using

INSERT INTO SYSIBM.MQSERVICE_TABLE VALUES('TRY_SEND','MQPA','CP0000',500,785,'COLIN');
INSERT INTO SYSIBM.MQPOLICY_TABLE (POLICYNAME, SEND_PERSISTENCE, DESC) VALUES('CPPOLICY2','Y','TEST POLICY FOR PERSISTENT MESSAGES');
COMMIT WORK;

See here for a description of the layout of the tables.

The MQSERVICE table has

Service name: TRY_SEND (used in the MQSEND function)
Queue manager: MQPA
Queue name: CP0000
CCSID: 500
Encoding: 785
Comment: COLIN

The MQPolicy table has columns including priority, persistence, expiry, retry count, retry interval, correlid, reply to queue, reply to queue manager, syncpoint and RCV_WAIT_INTERVAL.

The RCV_WAIT_INTERVAL corresponds to the WaitInterval field in the get message options structure (MQGMO). The default is 10 milliseconds.

SYSIBM.SYSDUMMY1 is a pseudo ‘one row table’ – so the SELECT returns one row.

MQReceive

MQReceive returns a “table” with columns

MSG
CORRELID
TOPIC
QNAME
MSGID
MSGFORMAT

MQReceiveALL

SELECT T.MSG, T.QNAME
FROM TABLE (
MQRECEIVEALL (
'TRY_SEND', /* queue etc */
'CPPOLICY2', /* policy eg syncpoint */
'1234') /* this correlid */
) T; /* MQRECEIVEALL returns a table, here called T */

MQRECEIVEALL generates table T.
This gets messages with correlid '1234'.

Stored procedure.

The code for these MQ functions runs in a stored procedure. When the MQGET with wait returned no message, the program running in the stored procedure ended. When the next MQRECEIVE function was issued, DB2 started the DB2 supplied program in the stored procedure address space again, and it had to do MQCONN, MQOPEN… etc.

Having a larger value of RCV_WAIT_INTERVAL should keep the program in the stored procedure address space active for longer, and so eliminate the repeated MQCONN, MQOPEN, MQCLOSE and MQDISC requests.
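A rough way to see the effect is to count how often the program has to reconnect for a given pattern of message arrivals. This is a simplified model, not DB2's actual behaviour: it assumes the program ends whenever a gap between messages is longer than the wait interval, and that each restart costs one connect cycle. The function name and the arrival gaps are invented for the example.

```python
def count_connect_cycles(gaps_ms, wait_interval_ms):
    """Count MQCONN/MQOPEN cycles needed to receive a stream of messages.

    gaps_ms: time in milliseconds between consecutive message arrivals.
    Simplified model: the first receive always connects, and any gap
    longer than the wait interval ends the program, so the next receive
    has to connect again.
    """
    cycles = 1
    for gap in gaps_ms:
        if gap > wait_interval_ms:
            cycles += 1
    return cycles


gaps = [50, 5, 200, 8, 30]  # made-up arrival gaps in milliseconds
for wait in (10, 100, 500):
    print(wait, count_connect_cycles(gaps, wait))
```

In this model a wait interval of 10 milliseconds reconnects after most gaps, while a wait longer than the largest gap connects only once. The trade-off is that a longer wait keeps the stored procedure task occupied while it waits.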