ICSF return codes – not for humans

Below are some of the errors I experienced using ICSF

IEC143I 213-85 RC=X’00000008′,RSN=X’0000271C’

Colin’s answer.

It could not find the key. Perhaps the CKDS was updated using the KGUP utility. Try refreshing the CKDS (either in batch or using the ISPF panels). For example

//REFRESHE EXEC PGM=CSFEUTIL,PARM=’CSF.SCSFCKDS,REFRESH’
//REFRESHF EXEC PGM=CSFPUTIL,PARM=’REFRESH,CSF.SCSFPKDS.NEW’

Perhaps you are trying to encrypt a data set with a non symmetric key – for example a PKI.b

IEC143I 213-8,RC=X’00000008′,RSN=X’00000BF3′

BF3 (3059) The provided key_identifier refers to an encrypted variable-length CCA key token or a key label of an encrypted variable-length CCA key token. The key-management field in the CCA token does not allow its use in high performance encrypted key operations.


User action: Supply a key token or the label of a key token with the required key-management settings.

Colin’s comments

With CSNBKTB2 I got the 0xbf3 when ‘XPRTCPAC’ was missing. For example I needed rule_array = ‘INTERNAL’||’AES ‘||’CIPHER ‘||’ANY-MODE’||’XPRTCPAC’

The doc for AES CIPHER says XPRTCPAC Allow export to CPACF protected key format.

I also got this trying to use an EXPORTER or an IMPORTER key. This does not support XPRTCPAC.

IEC143I 213-85, RC=X’00000008′,RSN=X’00000BFB’

The provided symmetric key label refers to an encrypted CCA key token, and the CSFKEYS profile covering it does not allow its use in high performance encrypted key operations.

User action: Contact your ICSF or RACF administrator if you need to use this key in calls to Symmetric Key Encipher (CSNBSYE) or Symmetric Key Decipher (CSNBSYD). Otherwise, use Encipher (CSNBENC) or Decipher (CSNBDEC) instead.

Colin’s answer

Define the profile with the bold text

RDEFINE CSFKEYS DES5 UACC(NONE) –
ICSF(SYMCPACFWRAP(YES) SYMCPACFRET(YES))

IEC143I 213-85, RC=X’00000008′,RSN=X’0000272C’

Colin’s comment

I got this when I tried to use

ADD TYPE(DECIPHER) ALGORITHM(DES) LABEL(DES5) CLEAR or
ADD TYPE(ENCIPHER) ALGORITHM(DES) LABEL(DES5) CLEAR

The following worked

ADD TYPE(CIPHER) ALGORITHM(DES) LABEL(DES5) CLEAR

UPDATE: AES256 is used for dataset encryption.

ADD LABEL(OUTAES5) TYPE(DATA ) ALGORITHM(AES)

CSFG1094 TRANSKEY label TOO WEAK.

Colin’s comment

I was trying to use TRANSKEY but the length of the transkey is shorter than the key being defined, for example

ADD LABEL(ATOB) TYPE(EXPORTER) CLEAR LENGTH(16) ALGORITHM(DES)

ADD LABEL(KEY2) TYPE(DATA) LENGTH(24)TRANSKEY(ATOB) ALGORITHM(DES)

IEC143I 213-86

During open processing for an encrypted extended format data set, on return from the ICSF service used to process the key label associated with the data set, the system detected that the encryption type of the data key associated with the key label was not of a supported encryption type. Only encryption keys of type AES256 are supported for extended format data sets.

Colin’s comment

Using

ADD TYPE(DATA) ALGORITHM(AES) LABEL(AES5) LENGTH(32)

with length(32) works. With length(16) it gives 213-86

Update: Only encryption keys of type AES256 are supported for ANY data set encryption.

IEC143I 213-85, RC=X’00000008′,RSN=X’0000085E’

The key usage attributes of the variable-length key token does not allow the requested operation. For example, the request might have been to encrypt data, but encryption is not allowed, or the request might have been to use the ECB cipher mode, but that mode is not allowed.
User action: Use the variable-length key token in a manner consistent with its usage attributes or create a new key token with the desired attributes

Colin’s comments 1

I got this when I had

ADD TYPE(CIPHER ) ALGORITHM(AES) LENGTH(32) LAB(AESCI)

Changing it to type(DATA) worked.

Colin’s comments 2

  • I had a C program and used CSNBKTB2. When CBC was defaulted I got the 85E, when I used ANY-MODE it worked.
  • I had another program which used rule_array = ‘INTERNAL’||’AES ‘||’CIPHER ‘||’XPRTCPAC’||’ANY-MODE’||’ENCRYPT ‘. Without ENCYPT it worked. With both ENCRYPT and DECRYPT it worked.

Return codes

048 ( 72 ) The value specified for length parameter for a key token, key, or text field is not valid.

Colin’s comments.

I got this in CSNDSYI2 when using a private key with a small key size(1024). When I used a private key with key size of 4096 it worked.

09B ( 155 ) The value that the generated_key_identifier parameter specifies is not valid,

or it is not consistent with the value that the key_form parameter specifies.

Colin’s comments

Case 1.

I was trying to generate an IMPORTER and an EXPORTER key. I used CSNBKTB2 to build a skeleton. When I used CSNBKGN2 to generate the token. I got this return code. I think this is because I did not provide a Transport Encryption Key (KEK)

When I used CSNDEDH passing the output from CSNBKTB2, the private key label, and the public key label, it worked, and I could add it to the CKDS using CSNBKRC2.

Case 2.

CSNBKGN2 only accepts skeletons created with type = CIPHER, HMAC, or MAC. See table 77. Trying to use a skeleton for EXPORTER or IMPORTER give you this message.

Case 3.

In CSNBKTB2 I had specified

‘INTERNAL’||’AES ‘||’CIPHER ‘||’XPRTCPAC’||’ANY-MODE’||’DECRYPT ‘

CSNBKGN2 gave me rs 155. Remove the DECRYPT and it worked

F6 ( 246 ) Not documented

I got

  • CSNDKRC  add pkds getting 0 246

I got this with

rc = 'FFFFFFFF'x
rs = 'FFFFFFFF'x
ADDRESS LINKPGM "CSNDKRC",
  'myrc' 'myrs' ,
 ...

because I had not initialised myrc and myrs.

7FB ( 2040 ) Bad data

This check is based on the first byte in the key identifier parameter. The key identifier provided is either an
internal token, where an external or null token was required; or an external or null token, where an internal token was required. The token provided may be none of these, and, therefore, the parameter is not a key identifier at all.

Colin’s comment

  • Check you are passing in the right data! I had misspelt a variable.
  • I was trying to import a PKI public certificate – when it was an AES exported certificate
  • I was trying to use a PKI private certificate of type ECC. CSNDSYI2 only accepts … key enciphered under an RSA public key or AES EXPORTER key.
  • You are trying to use CSNDSYI2 for a DATA key when you should be using CSNDSYI.

806 ( 2054 ) Invalid RSA enciphered key cryptogram; OAEP optional encoding parameters failed validation.

Colin’s comments

I got this when I used the wrong private key to decrypt a key in CSNDSYI2. When I used the correct key it worked.

829 ( 2089 ) The algorithm does not match the algorithm of the key identifier

Colin’s comment.

  • I got this because I had a private key created as an ECC. Where it was expecting an RSA key.
  • CSNDSYX trying to use an PKI public key with ECC…. to encryption under an application supplied RSA public key or AES EXPORTER key.

86A ( 2154 ) Bad key type

At least one key token passed to this callable service does not have the required key type for the specified
function.

Colin’s comments.

  • I got this trying to use an Importer key instead of an Exporter key.
  • I got this trying to use a DH key when an RSA key was expected. The requirements were CSNDSYX: RSA public key or AES EXPORTER

86E ( 2158 ) Not in the books

I got this doing Diffie-Hellman key exchange CSNDEDH using a private key and a public key.

  • With private ECC Curve: PRIME Bits 521 and public ECC Curve: PRIME Bits 384 I got reason code 2158.
  • With private ECC Curve: PRIME Bits 521 and public ECC Curve: PRIME Bits 521 I got reason code 0.
  • With private ECC Curve: PRIME Bits 521 and public ECC Curve: BRAINPOOL Bits 521 I got reason code 2158.
  • With private ECC Curve: BRAINPOOL Bits 521 and public ECC Curve: BRAINPOOL Bits 521 I got reason code 0.

It looks like you have to have matching curve type, and matching size (in bits) for it to work. The documentation under ECC Diffie-Hellman (CSNDEDH and CSNFEDH) says

The ECC curve type and size must be the same as the type (Prime, Brainpool, or Koblitz) and size of
the ECC key-token specified by the public key identifier parameter.

DC9 ( 3529 ) Bad label

A key identifier was supplied to a callable service as a key token or the label of a key token in a key data set. Either the key type of the key or the algorithm of the key is unsupported by the cryptographic features available to ICSF.

Colin’s comment

Perhaps you specified a label name – when it did not exit.

PKA Key Generate (CSNDPKG):generated_key_token_length: The length of the generated key token or label for the generated key token.

I assumed you could give it a label, and it would store the data under that label.

2B30 ( 11056 ) The input PKA token contains length fields that are not valid.


User action: Re-create the key token.

Colin’s comment
2B30 (11056) The skeleton_key_identifier_length field is not valid.
User action:  Check  the skeleton_key_identifier_length and skeleton_key_identifier  (returned from CSNDPKB fields key_token_length,key_token)

2AF8 ( 11000 ) The value specified for length parameter for a key token, key, or text field is not valid.


User action: Correct the appropriate length field parameter. For example I had target_key_identifier_length as 1000, but the documentation said The maximum value is 725 bytes.

Colin’s comment

Make sure you pass the address of the length eg &size, not the size itself.

Make sure you are adding to the correct database. If you try to add a PKI to a CKDS you will get this reason code.

Make fields bigger. I got this with

RSA_enciphered_key_length: The length of the RSA_enciphered_key parameter. This service updates this field with the actual length of the RSA_enciphered_key it generates. The maximum size is 512 bytes.

I had to make it 530 before it worked. Note when I came to check this at a later data – it all worked perfectly and I did not need to make it bigger!)

271C ( 10012 ) A key label was supplied for a key identifier parameter.

This label is the label of a key in the in-storage CKDS or PKDS. A key record with that label (and the specific type if required by the ICSF callable service) could not be found. For a retained key label, this error code is also returned if the key is not found in the CCA coprocessor specified in the PKDS record.

Colin’s comment.

  • I had specified a key of type data (which existed in the CKDS), but it was expecting a key of type Exporter, so was not found and could not find the label in the PKDS).
  • CSNDSYX trying to use an PKI public key with ECC…. to encryption under an application supplied RSA public key or AES EXPORTER key.
  • You specified a key, but the key was not char[64] and had garbage in the value. This can occur if you use a C null terminated string.

2740 ( 10048 ) The key_type parameter does not contain one of the valid types for the service or the keyword TOKEN.

Colin’s comment

I was trying to use CSNDEDH which required a private key and a public key of type ECC. I had specified an RSA key.

Can you send a secret to Mars?

I’ve been reading a book on cryptography, and playing with encryption of data between two sites, and these got me thinking about the history of cryptography.

Hundreds of years ago cryptography was used to send state secrets. With a lot of data, experts could crack and read your correspondence. You could have a “one time pad” where you used a key just once, which helped. The biggest problem was key distribution. In theory you could send the keys (or the pad of keys) through the post, but if people are intercepting and reading your mail – they get to see the keys!

With computers and mathematicians, sending keys to someone has got easier. It is harder to break, but not impossible.

I saw a document recently which said send your key on paper, in an envelope, by courier and not over the network. If people can read documents wrapped in jars from 2000 years ago in a CAT scanner – reading through an envelope should be easy. Also, there may be no rocket going to Mars to take your letter.

There are two common techniques used today for two ends to get a common secret: RSA and Diffie-Hellman

RSA

With RSA you have a private key (which only you has access to) and a public key – which every one can have access to. If you want to send me some data then you encrypt it with my public key and send the data to me. I use my private key and can then decrypt it.

Is this good enough ?

If you can afford to send someone to Mars, you can afford to have a team of programmers to create every possible private-public key combination and have a look up – if the public key is x then the private key is X. If you advertise a public key then some people can decrypt your message. If we do not advertise which private/public key to use then it makes it harder. But how do we negotiate in secret which private/public key to use?

Diffie-Hellman

From a mathematical perspective I found this is much more interesting. We want to communicate to get a secret key, knowing that people can monitor the network and see what we are sending.

It relies on some number theory properties

  • (x **a ) **b = (x**b) **a for example (a**2) **3 = (a* a*) * (a* a) * (a*a) = (a * a * a) * ( a * a * a) = (a**3) **2
  • modular arithmetic x mod y is the remainder if you divide x by y. 17 mod 5 is 2
  • combining these (x ** a) mod y = (x mod y ) **a mod y. x**a could have hundreds or millions of digits. “x mod y” is less than y, so may fit in 64 bit integers.
  • We agree two numbers x and y, which the spy can snoop. I think of a big number a, you think of a big number b.
  • I calculate (x **a) mod y = X and send it to you. You take X and calculate X ** b mod y.
  • You calculate x **b mod y = Y and send it to me. I take Y and calculate Y**a mod y
  • The two numbers are the same, and we can use it as our secret key. The secret agent who does not know a or b cannot calculate the secret.

The secret agents listening on the connection do not know which values of a and b we used, and there could be an infinite number of ‘a’s which all give the same value of (x **a) mod y. Typically people use a and b with thousands of digits.

If you have an x with thousands of digits, and an a with thousands of digits x **a will have millions of digits so you have to have special routines to do the calculations. Fortunately the mod y calculation reduces the size down to a more manageable number – with only thousands of digits.

Who said mathematics was boring?

Is this good enough ?

If you use small numbers then it is easy to crack. Today’s thinking is using more than 2048 bits will be hard to crack.

Backwards migration problem with load modules? – try using DLLs

I had a problem where I compiled a program on z/OS 2.4. When it ran on z/OS 2.3 it abended. This was because I had included some z/OS code with my load module, which was not compatible with back level versions of z/OS.
I didn’t have a z/OS 2.3 to run on …. what could I do? A quick search showed me that using DLL’s was the right answer. You may be using this today without knowing it, for example your C program does not contain the whole printf executable, it just contains a link to it.

What did I do?

My old JCL had

//COMPILE EXEC PROC=EDCCB,….
//BIND.SYSLMOD DD DISP=SHR,DSN=&LOADLIB.
//BIND.SYSLIB DD DISP=SHR,DSN=&LIBPRFX..SCEELKED
//BIND.OBJLIB DD DISP=SHR,DSN=COLIN.OBJLIB
//BIND.GSK DD DISP=SHR,DSN=SYS1.SIEALNKE
//BIND.CSS DD DISP=SHR,DSN=SYS1.CSSLIB
//BIND.SYSIN DD *
INCLUDE GSK(GSKCMS31)
INCLUDE GSK(GSKSSL)
INCLUDE CSS(IRRSDL00)
NAME AMSCHECK(R)

Where the INCLUDE GSK(..) copied in the (z/OS 2.4 specific ) code needed to execute.

My new JCL had

//BIND.SYSIN DD *
SETOPT PARM(CASE=MIXED)
include /usr/lpp/gskssl/lib/GSKSSL.x
include /usr/lpp/gskssl/lib/GSKCMS31.x

INCLUDE CSS(IRRSDL00)
NAME AMSCHECK(R)

How does it work?

The GSKSSL.x has

include /usr/lpp/gskssl/lib/GSKSSL.x
IMPORT CODE,’GSKSSL’,’gsk_attribute_get_buffer’
IMPORT CODE,’GSKSSL’,’gsk_attribute_get_cert_info’

This says gsk_attribute_get_buffer is in module GSKSSL, use dynamic resolve at execution time.

When the program is executed, the loader looks for entries which have pending dynamic resolve. In this case it sees gsk_attribute_get_buffer, It loads the module GSKSSL from the executing system, looks inside it, locates the gsk_attribute_get_buffer code and resolves the address.

On my z/OS 2.4 system it loaded the 2.4 version of GSKSSL, on the z/OS 2.3 system it loaded the 2.3 version of GSKSSL. As long as the parameters to the calls are consistent it will work.

Does this work for my own code?

Yes. You have to build it a certain way. Look for side-deck in the C user guide and C Programming guide.

Are there any other benefits?

Yes – your load modules are smaller, because the imported code is obtained at run time, and not included in your load module.

One Minute MVS – ICSF. It might be better if they finished it.

This is another post in the series of “One Minute MVS” which aims to give the basics you need to be able to get started with a topic.

The IBM documentation says: ICSF provides support for

  • The ANSI Data Encryption Algorithm (DES) and Advanced Encryption Standard (AES) encryption and decryption
  • DES key management and transport
  • AES key management and transport
  • Financial services including PINs, payment card industry transactions and ATMs
  • Public key operations including key generation, digital signatures and wrapping symmetric keys for transport
  • MAC and hash generation
  • Acceleration of handshake and frame encryption for SSL
  • PKCS #11 API

which has too many buzz words for me.

The manual Getting Started with z/OS Data Set Encryption is a very useful book.

My interpretation of what ICSF is:

ICSF…

  • Can create and store Public and Private certificates (as used in SSL and TLS). RACF can store its certificates in ICSF.
  • Can store symmetric keys used to encrypt data – such as data sets. Note: if you are using TLS the actual encryption of data over the network is done with a symmetric key.
  • Mange the hardware, tamper proof, keystores provided with z hardware. (You have to access the physical z machine to enter the master cryptographic keys).
  • To check credit card PIN number and other checks.
  • You can configure ICSF in a group of datasets, and switch to a different set.

ICSF facilities

ICSF has

  • A callable services API to allow you to call ICSF services from a program. There is a header file /usr/include/csfbext.h and SYS1.SIEAHDR.H which defines the function parameters.
  • Some ISPF panels to help you mange the ICSF entities.
  • Some batch command interfaces.

ICSF is not very usable

I found that ICSF was not very usable. For example

  • The ISPF panels are not intuitive. You can update the ICSF datasets in batch. You then have to refresh the in-memory copy.
    • For example, to refresh the CKDS data set and list it, using the ISPF panels for ICSF
      • 2 KDS MANAGEMENT -> 1 CKDS MANAGEMENT -> 1 CKDS OPERATIONS -> 2 REFRESH – Activate an updated CKDS.
      • Then PF3, 3 times.
    • To list the contents
      • 5 UTILITY -> 5 CKDS KEYS -> 1 List and manage all records
    • I would have had a page for KDS, and had a refresh option, and a list option on the same page.
  • I expected to be able to use ICSF using commands. It looks like I have to write programs to use the API! Some REXX programs are available for a subset of the function.
  • There is a lack of consistency. Two utilities doing similar things one has PARM=’REFESH,name’ the other has PARM=’name,REFRESH’. The utility for PKDS is CSFPUTIL. The utility for CKDS is CSFEUTIL.

It feels like ICSF has not yet been finished, more of an “Here are some API’s – good luck as you are on your own”, than a guide for the new user.

I’m writing some C programs to do some basic definitions, and pass parameters. I should not need to do this.

ICSF concepts

There are asymmetric keys such as private key and its public key. If you encrypt some text with a public key you need the private key to decrypt it. If you encrypt with the private key – you need the public key to decrypt it.

There are symmetric keys where the same key is used at each end. For example encrypt: change A to s, B to !; decrypt change s to A, ! to B.

Asymmetric keys are usually used in negotiating or sending a symmetric key to the remote end.

Symmetric keys are usually used to encrypt the payload. It is good practice to change the symmetric key periodically to make it harder for someone to break the cipher.

ICSF has

  • PKDS (Public Key Data Set) for storing Private and Public Asymmetric keys.
  • CKDS (Cryptographic Key Data Set) which is used to store Symmetric keys
  • TKDS (Token Key Data Set). When you are using keys stored in the hardware cryptographic facility, you have a token to reference the data.

The data in these key data sets, may be encrypted, for example by the hardware cryptographic facility. You can configure ICSF so your keys are encrypted, and in normal operation they are not available in clear text, as they are encrypted by the tamper proof hardware, and are used within the hardware, where they are decrypted and used.

Keys can be in one of several state

  • Active. The key can be used to process data. It is within its start and end dates (if present)
  • Archive. The key cannot be used to process data. For example you have backed up an encrypted data set to tape. If you delete the key, the data cannot be processed. If they key is archived then it cannot be used. If you need to access the backed up dataset, you can change the status to Active for the duration of the work.
  • Inactive – A key which has not been archived, and is outside of the start and end dates (if present).
  • Pre-active – I cannot find what this is. It is mentioned in the ISPF panels.

Configuration

You can configure which ICSF data sets are current using, and other parameters via SYS1.PARMLIB(CSFPRMxx).

When you start the CSF procedure you specify the xx. For example in SYS1.PROCLIB concatenation member CSF,

//CSF PROC PRM=00
//CSF EXEC PGM=CSFINIT,PARM=&PRM,…..

S CSF,PRM=CP

You can use the SETICSF operator command to change some parameters for the duration of the CSF task.

You can use D ICSF, for example d ICSF,kds to give

CKDS CSF.SCSFCKDS FORMAT=KDSR SYSPLEX=N MKVPs=DES AES
PKDS CSF.SCSFPKDS FORMAT=VARIABLE SYSPLEX=N MKVPs=RSA ECC
No TKDS was provided.

To change use a different CSFPRMxx, you have to stop and restart CSF, specifying the CSFPRM suffix.

You need to plan your ICSF usage

You can set up data set encryption so when you create a data set, the data is automatically encrypted. You define a key and give it a label. If you delete the key, the data cannot be read. To be able to process the dataset you need RACF access to the dataset, and RACF access to the key.

The label is associated with the data set (so it you send the dataset to a remote system, it will still have the same label).

Most organisations say you much change your password periodically, typically every month. For similar reasons organisations say you should define a new key and use it, typically every month. This is called Key Rotation, where you roll over your key to a new value.

I could have a key with label ColinDSKey, and encrypt my data sets with it. I encrypt data set COLIN.AUG2021.LOG with this key. Next month if I create a new key and reuse the same label, I can encrypt the data set COLIN.SEPT2021.LOG and use the new key. However I will be unable to read COLIN.AUG2021.LOG, because ColinDSKey now has a new value.

I’ve seen presentations which say ” just re-encrypt all your datasets with the new key”. This sounds like a lot of work and a major disruption.

Another approach is to change the key label, for example ColinAug2021DSKey. When I generate a new key, it has a label ColinSept2021DSKey. I configure datasets to use the new label. Datasets with the old label can still be used, as long as the key exists. You can tell ICSF to archive the label (and key), so it cannot be used. If it is needed you make the key active, use the data set, and re archive the key.

To set the label for a data set, you can

  • Specify it in JCL. So you may have to change your JCL every month, or use a symbolic such as the month and year.

// SET KEY=’OUTAESKEY’
//S4 EXEC PGM=IKJEFT01,REGION=0M
//SYSTSPRT DD DSN=IBMUSER.ENC,DISP=(MOD,CATLG),SPACE=(CYL,(1,1)),
// DSKEYLBL=&KEY,DSNTYPE=EXTREQ,DCB=(LRECL=132,BLKSIZE=3200)
//SYSTSIN DD *
LU *
/*

Use SMS and have rules to generate the label depending on your profile and the name of the data set.

Define a RACF profile for example ADDSD ‘PROTECT.*’ UACC(NONE) DFP(DATAKEY(AES2)) says for all datasets with a HLQ of PROTECT, then use key AES2.

When a dataset is allocated you get a message

IGD17150I DATA SET PROTECT.ENC IS ELIGIBLE FOR ACCESS METHOD ENCRYPTION. KEY LABEL IS (AES5)

You also need to think about encrypting your databases, which is another jump in complexity. As an old text book said “We’ll leave this as an exercise for the reader”.

To make the planning just a little! more complex; If you send an encrypted data set to another z/OS system, it will have the same label as when it was originally created, so you need the keys sent (securely) to the remote system for the data set to be processed, and coordinate naming conventions.

As the section header says You need to plan your ICSF usage. ICSF and data set encryption needs a lot of planning.

One Minute MVS – tuning stack and heap pools

These days many applications use a stack and heap to manage storage used by an application. For C and Cobol programs on z/OS these use the C run time facilities. As Java uses the C run time facilities, it also uses the stack and heap.

If the stack and heap are not configured appropriately it can lead to an increase in CPU. With the introduction of 64 bit storage, tuning the heap pools and stack is no longer critical. You used to have to carefully manage the stack and heap pool sizes so you didn’t run out of storage.

The 5 second information on what to check, is the number of segments freed for the stack and heap should be zero. If the value is large then a lot of CPU is being used to manage the storage.

The topics are

Kinder garden background to stack.

When a C (main) program starts, it needs storage for the variables uses in the program. For example

int i;
for (ii=0;ii<3:ii++)
{}

char * p = malloc(1024);

The variables ii and p are variables within the function, and will be on the functions stack. p is a pointer.

The block of storage from the malloc(1024) will be obtained from the heap, and its address stored in p.

When the main program calls a function the function needs storage for the variables it uses. This can be done in several ways

  1. Each function uses a z/OS GETMAIN request on entry, to allocate storage, and a z/OS FREEMAIN request on exit. These storage requests are expensive.
  2. The main program has a block of storage which functions can use. For the main program uses bytes 0 to 1500 of this block, and the first function needs 500 bytes, so uses bytes 1501 to 2000. If this function calls another function, the lower level function uses storage from 2001 on wards. This is what usually happens, it is very efficient, and is known as a “stack”.

Intermediate level for stack

It starts to get interesting when initial block of storage allocated in the main program is not big enough.

There are several approaches to take when this occurs

  1. Each function does a storage GETMAIN on entry, and FREEMAIN on exit. This is expensive.
  2. Allocate another big block of storage, so successive functions now use this block, just like in the kinder garden case. When functions return to the one that caused a new block to be allocated,
    1. this new block is freed. This is not as expensive as the previous case.
    2. this block is retained, and stored for future requests. This is the cheapest case. However a large block has been allocated, and may never be used again.

How big a block should it allocate?

When using a stack, the size of the block to allocate is the larger of the user specified size, and the size required for the function. If the specified secondary size was 16KB, and a function needs 20KB of storage, then it will allocate at least 20KB.

How do I get the statistics?

For your C programs you can specify options in the #PRAGMA statement or, the easier way, is to specify it through JCL. You specify C run time options through //CEEOPTS … For example

//CEEOPTS DD *
STACK(2K,12K,ANYWHERE,FREE,2K,2K)
RPTSTG(ON)

Where

  • STACK(…) is the size of the stack
  • RPTSTG(ON) says collect and display statistics.

There is a small overhead in collecting the data.

The output is like:

STACK statistics:                                                
  Initial size:                                2048     
  Increment size:                             12288     
  Maximum used by all concurrent threads:  16218808     
  Largest used by any thread:              16218808     
  Number of segments allocated:                2004     
  Number of segments freed:                    2002     

Interpreting the stack statistics

From the above data

  • This shows the initial stack size was 2KB and an increment of 12KB.
  • The stack was extended 2004 times.
  • Because the statement had STACK(2K,12K,ANYWHERE,FREE,2K,2K), when the secondary extension became free it was FREEMAINed back to z/OS.

When KEEP was used instead of FREE, the storage was not returned back to z/OS.

The statistics looked like

STACK statistics:                                                
  Initial size:                               2048     
  Increment size:                            12288     
  Maximum used by all concurrent thread:  16218808     
  Largest used by any thread:             16218808     
  Number of segments allocated:               1003     
  Number of segments freed:                      0     

What to check for and what to set

For most systems, the key setting is KEEP, so that freed blocks are not released. You can see this a) from the definition b) Number of segments freed is 0.

If a request to allocate a new segment fails, then the C run time can try releasing segments that are not in use. If this happens the “”segments freed” will be incremented.

Check that the “segments freed” is zero, and if not, investigate why not.

When a program is running for a long time, a small number of “segments allocated” is not a problem.

Make the initial size larger, closer to the “Largest used of any thread” may improve the storage utilisation. With smaller segments there is likely to be unused space, which was too small for a functions request, causing the next segment to be used. So a better definition would be

STACK(16M,12K,ANYWHERE,KEEP,2K,2K)

Which gave

STACK statistics:                                                          
  Initial size:                                     16777216               
  Increment size:                                      12288               
  Maximum used by all concurrent threads:           16193752               
  Largest used by any thread:                       16193752               
  Number of segments allocated:                            1               
  Number of segments freed:                                0               

Which shows that just one segment was allocated.


Kinder garden background to heap

When there is a malloc() request in C, or a new … in Java, the storage may exist outside of the function. The storage is obtained from the heap.

The heap has blocks of storage which can be reused. The blocks may all be of the same size, or or different sizes. It uses CPU time to scan free blocks looking for the best one to reuse. With more blocks it can use increasing amounts of CPU.

There are heap pools which avoids the costs of searching for the “right” block. It uses a pools of blocks. For example:

  1. there is a heap pool with 1KB fixed size blocks
  2. there is another heap pool with 16KB blocks
  3. there is another heap pool with 256 KB blocks.

If there is a malloc request for 600 bytes, a block will be taken from the 1KB heap pool.

If there is a malloc request for 32KB, a block would be used from the 256KB pool.

If there is a malloc request for 512KB, it will issue a GETMAIN request.

Intermediate level for heap

If there is a request for a block of heap storage, and there is no free storage, a large segment of storage can be obtained, and divided up into blocks for the stack. If the heap has 1KB blocks, and a request for another block fails, it may issue a GETMAIN request for 100 * 1KB and then add 100 blocks of 1KB to the heap. As storage is freed, the blocks are added to the free list in the heap pool.

There is the same logic as for the stack, about returning storage.

  1. If KEEP is specified, then any storage that is released, stays in the thread pool. This is the cheapest solution.
  2. If FREE is specified, then when all the blocks in an additional segment have been freed, then free the segment back to the z/OS. This is more expensive than KEEP, as you may get frequent GETMAIN and FREEMAIN requests.

How many heap pools do I need and of what size blocks?

There is usually a range of block sizes used in a heap. The C run time supports up to 12 cell sizes. Using a Liberty Web server, there was a range of storage requests, from under 8 bytes to 64KB.

With most requests there will frequently be space wasted. If you want a block which is 16 bytes long, but the pool with the smallest block size is 1KB – most of the storage is wasted.
The C run time gives you suggestions on the configuration of the heap pools, the initial size of the pool and the size of the blocks in the pool.

Defining a heap pool

How to define a heap pool is defined here.

You specify the size of overall size of storage in the heap using the HEAP statement. For example for a 16MB total heap size.

HEAP(16M,32768,ANYWHERE,FREE,8192,4096)

You then specify the pool sizes


HEAPPOOL(ON,32,1,64,2,128,4,256,1,1024,7,4096,1,0)

The figures in bold are the size of the blocks in the pool.

  • 32,1 says maximum size of blocks in the pool is 32 bytes, allocate 1% of the heap size to this pool
  • 64,2 says maximum size of blocks in the pool is 64 bytes, allocate 2% of the heap size to this pool
  • 128,4 says maximum size of blocks in the pool is 128 bytes, allocate 4% of the heap size to this pool
  • 256,1 says maximum size of blocks in the pool is 256 bytes, allocate 1% of the heap size to this pool
  • 1024,7 says maximum size of blocks in the pool is 1024 bytes, allocate 7% of the heap size to this pool
  • 4096,1 says maximum size of blocks in the pool is 4096 bytes, allocate 1% of the heap size to this pool
  • 0 says end of definition.

Note, the percentages do not have to add up to 100%.

For example, with the CEEOPTS

HEAP(16M,32768,ANYWHERE,FREE,8192,4096)
HEAPPOOLS(ON,32,50,64,1,128,1,256,1,1024,7,4096,1,0)

After running my application, the data in //SYSOUT is


HEAPPOOLS Summary:                                                         
  Specified Element   Extent   Cells Per  Extents    Maximum      Cells In 
  Cell Size Size      Percent  Extent     Allocated  Cells Used   Use      
  ------------------------------------------------------------------------ 
       32        40    50      209715           0           0           0 
       64        72      1        2330           1        1002           2 
      128       136      1        1233           0           0           0 
      256       264      1         635           0           0           0 
     1024      1032      7        1137           1           2           0 
     4096      4104      1          40           1           1           1 
  ------------------------------------------------------------------------ 

For the cell size of 32, 50% of the pool was allocated to it,

Each block has a header, and the total size of the 32 byte block is 40 bytes. The number of 40 bytes units in 50% of 16 MB is 8MB/40 = 209715, so these figures match up.

(Note with 64 bit heap pools, you just specify the absolute number you want – not a percentage of anything).

Within the program there was a loop doing malloc(50). This uses cell pool with size 64 bytes. 1002 blocks(cells) were used.

The output also has

Suggested Percentages for current Cell Sizes:
HEAPP(ON,32,1,64,1,128,1,256,1,1024,1,4096,1,0)


Suggested Cell Sizes:
HEAPP(ON,56,,280,,848,,2080,,4096,,0)

I found this confusing and not well documented. It is another of the topics that once you understand it it make sense.

Suggested Percentages for current Cell Sizes

The first “suggested… ” values are the suggestions for the size of the pools if you do not change the size of the cells.

I had specified 50% for the 32 byte cell pool. As this cell pool was not used ( 0 allocated cells) then it suggests making this as 1%, so the suggestion is HEAPP(ON,32,1

You could cut and paste this into you //CEEOPTS statement.

Suggested Cell Sizes

The C run times has a profile of all the sizes of blocks used, and has suggested some better cell sizes. For example as I had no requests for storage less than 32 bytes, making it bigger makes sense. For optimum storage usage, it suggests of using sizes of 56, 280,848,2080,4096 bytes.

Note it does not give suggested number of blocks. I think this is poor design. Because it knows the profile it could have an attempt at specifying the numbers.

If you want to try this definition, you need to add some values such as

HEAPP(ON,56,1,280,1,848,1,2080,1,4096,1,0)

Then rerun your program, and see what percentage figures it recommends, update the figures, and test again. Not the easiest way of working.

What to check for and what to set

There can be two heap pools. One for 64 bit storage ( HEAPPOOL64) the other for 31 bit storage (HEAPPOOL).

The default configuration should be “KEEP”, so any storage obtained is kept and not freed. This saves the cost of expensive GETMAINS and FREEMAINs.

If the address space is constrained for storage, the C run time can go round each heap pool and free up segments which are in use.

The value “Number of segments freed” for each heap should be 0. If not, find out why (has the pool been specified incorrectly, or was there a storage shortage).

You can specify how big each pool is

  • for HEAPPOOL the HEAP size, and the percentage to be allocated to each pool – so two numbers to change
  • for HEAPPOOL64 you specify the size of each pool directly.

The sizes you specify are not that sensitive, as the pools will grow to meet the demand. Allocating one large block is cheaper that allocating 50 smaller blocks – but for a server, this different can be ignored.

With a 4MB heap specified

HEAP(4M,32768,ANYWHERE,FREE,8192,4096)
HEAPP(ON,56,1,280,1,848,1,2080,1,4096,1,0)

the heap report was

 HEAPPOOLS Summary: 
   Specified Element   Extent   Cells Per  Extents    Maximum      Cells In 
   Cell Size Size      Percent  Extent     Allocated  Cells Used   Use 
   ------------------------------------------------------------------------ 
        56        64      1         655           2        1002           2 
       280       288      1         145           1           1           0 
       848       856      1          48           1           1           0 
      2080      2088      1          20           1           1           1 
      4096      4104      1          10           0           0           0 
   ------------------------------------------------------------------------ 
   Suggested Percentages for current Cell Sizes: 
     HEAPP(ON,56,2,280,1,848,1,2080,1,4096,1,0) 

With a small(16KB) heap specified

HEAP(16K,32768,ANYWHERE,FREE,8192,4096)
HEAPP(ON,56,1,280,1,848,1,2080,1,4096,1,0)

The output was

HEAPPOOLS Summary:                                                            
  Specified Element   Extent   Cells Per  Extents    Maximum      Cells In    
  Cell Size Size      Percent  Extent     Allocated  Cells Used   Use         
  ------------------------------------------------------------------------    
       56        64      1           4         251        1002           2    
      280       288      1           4           1           1           0    
      848       856      1           4           1           1           0    
     2080      2088      1           4           1           1           1    
     4096      4104      1           4           0           0           0    
  ------------------------------------------------------------------------    
  Suggested Percentages for current Cell Sizes:                               
    HEAPP(ON,56,90,280,2,848,6,2080,13,4096,1,0)                             

and we can see it had to allocate 251 extents for all the request.

Once the system has “warmed up” there should not be a major difference in performance. I would allocate the heap to be big enough to start with, and avoid extensions.

With the C run time there are heaps as well as heap pools. My C run time report gave

64bit User HEAP statistics:
31bit User HEAP statistics:
24bit User HEAP statistics:
64bit Library HEAP statistics:
31bit Library HEAP statistics:
24bit Library HEAP statistics:
64bit I/O HEAP statistics:
31bit I/O HEAP statistics:
24bit I/O HEAP statistics:

You should check all of these and make the initial size the same as the suggested recommended size. This way the storage will be allocated at startup, and you avoid problems of a request to expand the heap failing due to lack of storage during a buys period.

Advanced level for heap

While the above discussion is suitable for many workloads, especially if they are single threaded. It can get more complex when there are multiple thread using the heappools.

If you have a “hot” or highly active pool you can get contention when obtaining and releasing blocks from the heap pool. You can define multiple pools for an element size. For example

HEAPP(ON,(56,4),1,280,1,848,1,2080,1,4096,1,0)

The (56,4) says make 4 pools with block size of 56 bytes.

The output has

HEAPPOOLS Summary:                                                          
  Specified Element   Extent   Cells Per  Extents    Maximum      Cells In  
  Cell Size Size      Percent  Extent     Allocated  Cells Used   Use       
  ------------------------------------------------------------------------  
       56       64     1           4         251        1002           2  
       56       64     1           4           0           0           0  
       56       64     1           4           0           0           0  
       56       64     1           4           0           0           0  
      280       288      1           4           1           1           0  
      848       856      1           4           1           1           0  
     2080      2088      1           4           1           1           1  
     4096      4104      1           4           0           0           0  
  ------------------------------------------------------------------------  

We can see there are now 4 pools with cell size of 56 bytes. The documentation says Multiple pools are allocated with the same cell size and a portion of the threads are assigned to allocate cells out of each of the pools.

If you have 16 threads you might expect 4 threads to be allocated to each pool.

How do you know if you have a “hot” pool.

You cannot tell from the summary, as you just get the maximum cells used.

In the report is the count of requests for different storage ranges.

Pool  2     size:   160 Get Requests:           777707 
  Successful Get Heap requests:    81-   88                 77934 
  Successful Get Heap requests:    89-   96                 59912 
  Successful Get Heap requests:    97-  104                 47233 
  Successful Get Heap requests:   105-  112                 60263 
  Successful Get Heap requests:   113-  120                 80064 
  Successful Get Heap requests:   121-  128                302815 
  Successful Get Heap requests:   129-  136                 59762 
  Successful Get Heap requests:   137-  144                 43744 
  Successful Get Heap requests:   145-  152                 17307 
  Successful Get Heap requests:   153-  160                 28673
Pool  3     size:   288 Get Requests:            65642  

I used ISPF edit, to process the report.

By extracting the records with size: you get the count of requests per pool.

Pool  1     size:    80 Get Requests:           462187 
Pool  2     size:   160 Get Requests:           777707 
Pool  3     size:   288 Get Requests:            65642 
Pool  4     size:   792 Get Requests:            18293 
Pool  5     size:  1520 Get Requests:            23861 
Pool  6     size:  2728 Get Requests:            11677 
Pool  7     size:  4400 Get Requests:            48943 
Pool  8     size:  8360 Get Requests:            18646 
Pool  9     size: 14376 Get Requests:             1916 
Pool 10     size: 24120 Get Requests:             1961 
Pool 11     size: 37880 Get Requests:             4833 
Pool 12     size: 65536 Get Requests:              716 
Requests greater than the largest cell size:               1652 

It might be worth splitting Pool 2 and seeing if makes a difference in CPU usage at peak time. If it has a benefit, try Pool 1.

You can also sort the “Successful Heap requests” count, and see what range has the most requests. I don’t know what you would use this information for, unless you were investigating why so much storage was being used.

Ph D level for heap

For high use application on boxes with many CPUs you can get contention for storage at the hardware cache level.

Before a CPU can use storage, it has to get the 256 byte cache line into the processor cache. If two CPU’s are fighting for storage in the same 256 bytes the throughput goes down.

By specifying

HEAPP(ALIGN….

It ensures each block is isolated in its own cache line. This can lead to an increase in virtual storage, but you should get improved throughput at the high end. It may make very little difference when there is little load, or on an LPAR with few engines.

Can I define a disk Read Only to z/OS?

As part of migrating z/OS to a new service level, I wanted to mount old volumes Read-Only, so they were not updated when the new level was used. (For example z/OS updates the dataset last access time in the VTOC). I was running on zPDT, or z/OS on top of Linux, so all of the hardware is emulated. On a real machine you may be able to configure the storage subsystem.

I had four options

  • Make the disk on Linux read only – this worked, and was easy.
  • Copy the disks of interest so I had write access to a copy. This worked, and was easy.
  • Use the zPDT command awsmount 0ac5 -m /mnt/zimages/zOS/A4USR1 –readonly . This worked and was easy.
  • Update the Hardware Configuration Definition (HCD) to make a disk read only. I could define it, but not activate it because this read-only support is for PPRC mirrored disks. I could not vary the address online.

This blog post describes how I changed the HCD to add a read only disk.

This was a journey going into areas I had not been in before (creating IODFs).

The Hardware Configuration Definition(HCD) defines the configuration of the hardware. In day’s gone by the systems programmer would have to do a “sysgen” and used macros to define devices, then assemble it and use it. Nowadays you can maintain the configuration using ISPF panels.

What does the HCD do, and what is an OSCONFIG?

The documentation is not very clear about HCD. There are tiny clues, where it mentions making disks read-only, in OSCONFIG, but does not explain how to display and use the OSCONFIG. Now I know, it is easy.

  • You define each device, or group of similar devices in the HCD.
  • For each OS Configuration (OSCONFIG) you define each operating system image, and which devices belong in which OSCONFIG. See, … simple!

For example you define your configuration, including production and test devices, in the HCD. You then configure

  • A test system with only the test volumes
  • A production system with only the production volumes
  • The sysprog’s system with both test and production devices. From this machine, the systems programmer can create production or test configurations.

Getting started with HCD

The HCD is panel driven from ISPF.

You have to work with a copy of the IODF, and the system will generate a copy for you (suffixed with .WORK). I created a copy, made changes, then created a new IODF.

What is currently being used?

From the main HCD panel

  • 2. Activate or process configuration data
    • 5. View active configuration

Create a copy

From main menu use

  • 6. Maintain I/O definition files
    • 2. Copy I/O definition file

and follow the prompts.

On the home page it has the name of the current IODF being worked on, update it if necessary.

Display the OSCONFIG

Use the ISPF configuration panels for HCD:

  • 1. Define, modify, or view configuration data
    • 1. Operating system configurations

It then lists the available OSCONFIGs. Use / to select one, then select

  • 7. Work with attached devices

This lists the devices. You can scroll or use “L AF0” to locate the devices.

Put / in front to display the options. At the right it gives the command, so

  • 8. Delete . . . . . . . . . . . . . . (d)

I can either use /, and 8, or use ‘d’ (instead of the /) to delete an entry.

PF3 to return to “Define, Modify, or View Configuration Data”.

Add new devices

Use

  • 5. I/O devices

This lists the devices. Use F11 to add

  • Device number 0af0
  • Number of devices 16
  • Device type 3390

Press enter.

It displays a list of OS Configs, select one.

  • option 1 select

You are prompted to configure the devices

  • OFFLINE No Device considered online or offline at IPL
  • DYNAMIC Yes Device supports dynamic configuration
  • LOCANY No UCB can reside in 31 bit storage
  • WLMPAV Yes Device supports work load manager
  • READ-ONLY Sec Restrict access to read requests (SEC or NO)
  • SHARED No Device shared with other systems
  • SHAREDUP No Shared when system physically partitioned

Press enter. To make this read-only I specified Shared=no and read-only=sec. (Sec is for secondary device. The read write copy of the mirrored is is the primary device).

Use PF3 to return.

Activate the configuration

From the HCD home page,

  • 2. Activate or process configuration data
    • 1. Build production I/O definition file

Create production eg “‘SYS1.IODF88”

then

  • 6. Activate or verify configuration dynamically

This displays

  • Currently active IODF . : SYS1.IODF99
  • IODF to be activated . : SYS1.IODF88
  • Test only . . . . . . . . Yes (Yes or No)

Use Test only = YES to validate it, then repeat with Test only = NO. This will make it live.

For me, the SYS1.IODFxx dataset, was created on the wrong volume. It has to be on the same volume as the SYS1.IPLPARM and other IPL information for a successful IPL.

Move the SYS1.IODF to the IPL parm volume.

Change your IPL loadxx member in SYS1.IPLPARM to point to the new IODF.

Although I had specified A4SYS1 as the volume for the SYS1.IODF88, SMS allocation routines allocated it on a different volume. I had to move it to the correct volume. See here.

Once I had IPLed with the new IODF

The command

D U,,,,0AF0,1 gave

UNIT TYPE STATUS   VOLSER     VOLSTATE      SS   
0AF0 3390 F-NRD-RO                /RSDNT     0   

Which says there is no device mounted, but it has been defined as RO.

I varied it online and I got

V 0AF0,ONLINE
IEE103I UNIT 0AF0 NOT BROUGHT ONLINE
IEE763I NAME= IECDINIT CODE= 000000000110088F
IEA434I DEVICE ONLINE IS NOT ALLOWED, R/O SEC PPRC STATE NOT VALID
IEE764I END OF IEE103I RELATED MESSAGES

Which means it was unable to mount my disk as it was not part of a PPRC mirrored DASD environment. I had defined a disk as Read Only, but was not able to use it.

Running a headless Linux meant I was running disk less, and had no backups.

I had a Linux server and had a USB attached disk which I used to do backups. When I logged on after boot using the locally attached screen and keyboard the USB disk was visible. I configured an auto backup procedure, and checked it worked whenever I powered on the server.

I got into the a habit of using telnet to logon and accessing the system remotely. By chance, I checked to see if the backup disk was full, and found the disk was not visible. When I logged on with a local screen and keyboard, the disk was there, and had not been updated for over 100 days.

Digging around I found that USB disks can be mounted at startup or when a user logs on.

The mount information is in a file /etc/fstab.

I used the Ubuntu program “disks” to display and manage the disks. I selected a USB disk, clicked on the settings button, and selected “Edit Mount Options”. By default it had “User Session Defaults” on – which means mount the USB when a user logs on locally. I set

  • User Session Defaults off
  • Mount at system startup
  • Show in user interface
  • Mount Point /mnt/backup1

Next time I rebooted in headless mode, the disk was there as /mnt/backup.

I checked my backups – and they were done. I remember one of the points from when I use to do a MQ health check with customers.

Always check your backups are being done – and are backup what you expect.

Colin Paice

I should have paid more attention!

Moving a system dataset was a challenge

As part of configuring the IO on my z/OS system using HCD, I needed to create a dataset on the IPL volume. This was a challenge, but I got there, the long way.

When I used the HCD to create a SYS1.IODFxx dataset, I specified the DASD volume I wanted to put it on. Unfortunately, because SMS got in the way and overrode my the volume I had specified, and picked a different one!

I could have changed the SMS definitions to say do not play with dataset beginning with SYS1, but I thought it would be easy to move it. After a while I got the following JCL to work

//IBMIODF JOB   ACCOUNTING INFORMATION,REGION=NNNNK 
//STEP1    EXEC  PGM=ADRDSSU,REGION=0M 
//SYSPRINT DD    SYSOUT=A 
//DASD1    DD    UNIT=3390,VOL=(PRIVATE,SER=USER00),DISP=OLD 
//DASD2    DD    UNIT=3390,VOL=(PRIVATE,SER=A4SYS1),DISP=OLD 
//SYSIN    DD    * 
 COPY DATASET(INCLUDE('SYS1.IODF88.CLUSTER')) SPHERE - 
  PROCESS(SYS1)  - 
   BYPASSACS('SYS1.IODF88.CLUSTER') - 
   NULLSTORCLAS - 
  LOGINDDNAME(DASD1) OUTDDNAME(DASD2) DELETE CATALOG 
/* 

Notes:

  1. I had to specify the name with its cluster name. Without this I got message ADR383W code 05.
  2. Although I had specified the target volid of A4SYS1, it was moved to A4USR1! I had to specify
    1. PROCESS(SYS1) I think it gives an extra layer of security. For example many people can have access to DFDSS to move data sets, around, but only a few people would want to move SYS1.** data sets around.
    2. BYPASSACS(…) to bypass SMS and not use ACS routines to allocate the volume. I had to specify the dataset name, using “*” did not move it to the required volume.
    3. NULLSTORCLASS to tell SMS not to use a storage class.

How do I see what is in my z/OS HCD?

I’m running ADCD z/OS on zD&T, and wanted to see what was in my HCD (Hardware Configuration Definition). There are some ISPF panels, which allow you to print things, but they had some “required” parameters which I didn’t have and didn’t need.

This post shows the JCL I used, and gives a quick overview of the output

The JCL to create a batch report. This uses program CBDMGHCP. This parameters are described here. Note you can have the output created in XML format.

//IBMHCD JOB MSGCLASS=H
//GCREP EXEC PGM=CBDMGHCP,
// PARM='REPORT,CSMEN,,,,,00'
//HCDIODFS DD DSN=SYS1.IODF99,DISP=SHR
//HCDRPT DD SYSOUT=,
// DCB=(RECFM=FBA,LRECL=200,BLKSIZE=6400)
//HCDMLOG DD SYSOUT=,
// DCB=(RECFM=FBA,LRECL=200,BLKSIZE=6400)

The output data sets require the DCB information. If this is not provided you get messages like

IEC141I 013-34,IGG0199G,IBMHCD,GCREP,HCDRPT.

The output of the report in HCDRPT

At the top is

TIME: 16:46 DATE: 2021-07-12
IODF NAME: SYS1.IODF99
IODF TYPE: Production
IODF VERSION: 5
IODF VOLUME: A4SYS1
DESCRIPTION: m3000 with SCSI 08-0B only

It lists the sections

PROCESSOR SUMMARY REPORT                         A
PARTITION REPORT                                 B
IOCDS REPORT                                     C
CHANNEL PATH SUMMARY REPORT                      D
CONTROL UNIT SUMMARY REPORT                      G
DEVICE SUMMARY REPORT                            I
SWITCH SUMMARY REPORT                            K 
SWITCH DETAIL  REPORT                            L
SWITCH CONFIGURATION SUMMARY REPORT              M
SWITCH CONFIGURATION DETAIL REPORT               N
OPERATING SYSTEM SUMMARY REPORT                  O
OS DEVICE REPORT                                 P
OS DEVICE DETAIL REPORT                          Q
EDT REPORT  (MVS ONLY)                           R
OS CONSOLE REPORT                                S

Component reports heading

Following the list of sections are the data. Each section has a header like

CONTROL UNIT SUMMARY REPORT TIME:... DATE:...  PAGE G- 1

Where the title (CONTROL UNIT SUMMARY REPORT) and the character following the PAGE match the list of sections, which has “CONTROL UNIT SUMMARY REPORT G”.

To go to a section use Find ‘PAGE G’ .

CONTROL UNIT SUMMARY REPORT

   CONTROL UNIT        
 NUMBER  TYPE-MODEL    
 _____________________ 
 -  -  -  -  -  -  -  -
  0700   3174 
 -  -  -  -  -  -  -  -
  0A80   3990 
  0A81   3990 
  0A82   3990 
  0A83   3990 

Where 0700 is a console, and 0a80 is disk on an emulated 3990.

DEVICE SUMMARY REPORT

--- DEVICE ---    DEVICE                                                
 NUMBER,RANGE    TYPE-MODEL    ATTACHING CONTROL UNITS         
______________  _____________  |____|___
   0700         3270-X          0700 
   0A80         3390            0A80

OPERATING SYSTEM SUMMARY REPORT

 OPERATING                                     
 SYSTEM ID   TYPE        GEN    DESCRIPTION    
 _________   ________    ___    _______________
 OS390       MVS                ADCD ZOS IODF 

MVS DEVICE REPORT

DEV#,RANGE  TYPE-MODEL ... 
__________ __________  ...

0700,64    3270-X....
0A80,112   3390  ....          

Followed by columns of data with headings ( given in the report)

KEY            KEY DESCRIPTION 
---            --------------- 
DEV#,RANGE  -  DEVICE NUMBER, COUNT OF DEVICES (DECIMAL) 
TYPE-MODEL  -  DEVICE TYPE AND MODEL 
SS          -  SUBCHANNEL SET ID 
BASE        -  BASE DEVICE NUMBER FOR MULTIPLE EXPOSURE DEVICES 
UCB-TYPE    -  UCB TYPE BYTES 
ERP-NAME    -  ERROR RECOVERY PROGRAM 
DDT-NAME    -  DEVICE DESCRIPTOR TABLE 
MLT-NAME    -  MODULE LIST TABLE 
OPT         -  OPTIONAL MLT INDICATOR 
UIM-NAME    -  UNIT INFORMATION MODULE SUPPORTING THE DEVICE 
ATI         -  ATTENTION TABLE INDEX (UCBATI) 
AL          -  ALTERNATE CONTROL UNIT (UCBALTCU) 
SH          -  SHARED UP OPTION (UCBSHRUP) 
SW          -  DEVICE CAN BE SWAPPED BY DDR (UCBSWAPF) 
MX          -  DEVICE HAS MULTIPLE EXPOSURES (UCBMTPXP) 
MI          -  MIH PROCESSING SHOULD BE BYPASSED (UCBMIHPB) 
O           -  MLT IS OPTIONAL 
Y           -  DEVICE SUPPORTS THIS FEATURE 
BLANK       -  DEVICE DOES NOT SUPPORT THIS FEATURE 

The UIM is a an object with parameters defined for the device type. They are defined here. The options are defined here. For example a 3270 can have a selector pen!. a 3800 printer can have a burster.

There is a summary of devices.

        TOTAL NUMBER OF DEVICES BY CLASS 
        -------------------------------- 
CLASS NAME             CLASS TYPE    DEVICE COUNT 
----------             ----------    ------------ 
TAPE                       80               64 
COMMUNICATION DEVICES      40                0 
C-T-C                      41               24 
DASD                       20             1001 
GRAPHICS                   10               95 
UNIT RECORD                08                3 
CHARACTER READERS          04                0 
TOTAL NUMBER OF I/O DEVICES DEFINED BY THIS I/O CONFIGURATION     1187 

MVS DEVICE DETAIL REPORT

NUMBER,RANGE TYPE   SS PARAMETER                        FEATURE
____________ ______ __ _______________________________ _______
   0700,64   3270-X  0 OFFLINE=NO                       SELPEN,... 
   0A80,112  3390    0 OFFLINE=NO,DYNAMIC=YES,LOCANY=NO SHARED 

E D T REPORT

The Eligible Device Table Report has

                                AFFINITY  ALLOCATION   
 NAME TYPE  VIO  TOKEN  PREF    INDEX     DEVICE TYPE  DEVICE NUMBER LIST 
 _________  ___  _____  ______  ________  ___________  ___________________
 GENERIC                   280    FFFF     3010200F    ... 0A80- 0AFF ...
 GENERIC                  3800    FFFF     12001009    0700- 073F   

N I P Console REPORT

The Nucleus IP l Console Report has

 DEVICE #        TYPE-MODEL 
 ________       _____________ 
 0700           3270-X 
 0701           3270-X 

I’m sorry I haven’t a clue…

As well as being a very popular British comedy, it is how I sometimes feel about what is happening inside the Liberty Web servers, and products like z/OSMF, z/OS Connect and MQWEB. It feels like a spacecraft in cartoons – there are usually only two controls – start and stop.

One reason for this is that the developers often do not have to use the product in production, and have not sat there, head in hand saying “what is going on ?”.

In this post I’ll cover

What data values to expose

As a concept, if you give someone a lever to pull – you need to give them a way of showing the effect of pulling the level.

If you give someone a tuning parameter, they need to know the impact of using the tuning parameter. For example

  • you implement a pool of blocks of storage.
  • you can configure the number of maximum number of blocks
  • if a thread needs some storage, and there is a free block in the pool, then assign the block to the thread. When the thread has finished with it, the thread goes back into the pool.
  • if all the blocks in the pool are in-use, allocate a block. When the thread has finished with the block – free it.
  • if you specify a very large number of blocks it could cause a storage shortage

The big questions with this example is “how big do you make the pool”?

To be able to specify the correct pool size you need to know information like

  • What was the maximum number of blocks used – in total
  • How many times were additional blocks allocated (and freed)
  • What was the total number of blocks requested.

You might decide that the pool is big enough if less than1% of requests had to allocate a block.

If you find that the maximum value used was 1% of the size of the pool, you can make the pool much smaller.

If you find that 99% of the requests were allocated/freed, this indicates the pool is much to small and you need to increase the size.

For other areas you could display

  • The number of authentication requests that were userid+ password, or were from a certificate.
  • The number of authentication requests which failed.
  • The list of userid names in the userid cache.
  • How many times each application was invoked.
  • The number of times a thread had to wait for a resource.
  • The elapsed time waiting for a resource, and what the resource was.

What attributes to expose

You look at the data to ask

  • Do I have a problem now?
  • Will I have a problem in the future? You need to collect information over time and look at trends.
  • When we had a problem yesterday, did this component contribute to it? You need to have historical data.

It is not obvious what data attributes you should display.

  • The “value now” is is easy to understand.
  • The “average value” is harder. Is this from the start of the application (6 months ago), or a weighted average (99 * previous average + current value)/100. With this weighted average, a change since the previous value indicates the trend.
  • The maximum value is hard – from when? There may have been a peak at startup, and small peaks since then will not show up. Having a “reset command” can be useful, or have it reset on a timer – such as display and reset every 10 minutes.
  • If you “reset” the values and display the value before any activity, what do you display? “0”s for all of the values, or the values when the reset command was issued.

Resetting values can make it easier to understand the data. Comparing two 8 digit numbers is much harder than comparing two 2 digit numbers.

How to expose data

Java has a Java Management eXtension (JMX) for reporting management information. It looks very well designed, is easy to use, and very compact! There is an extensive document from Oracle here.

I found Basic Introduction to JMX by Baeldung , was an excellent article with code samples on GitHub. I got these working in Eclipse within an hour!

The principal behind JMX is …

For each field you want to expose you have a get… method.

You define an interface with name class| |”MBean” which defines all of the methods for displaying the data.

public interface myClassMBean {
public String getOwner();
public int getMaxSize();
}

You define the class and the methods to expose the data.

public class myClass implements myClassMBean{

// and the methods to expose the data

public String getOwner() {
return fileOwner;
}

public int getMaxSize() {
return fileSize;
}

}

And you tell JMX to implement it

myClass myClassInstance = new myClass(); // create the instance of myClass

MBeanServer server = ManagementFactory.getPlatformMBeanServer();
ObjectName objectName =….
server.registerMBean(myClassInstance, objectName);

Where myClassInstance is a class instance. The JMX code extracts the name of the class from the object, and can the identify all the methods defined in the class||”MBean” interface. Tools like jconsole can then query these methods, and invoke them.

ObjectName is an object like

ObjectName objectName = new ObjectName(“ColinJava:type=files,name=onefile”);

Where “ColinJava” is a high level element, “type” is a category, and “name” is the description of the instance .

That’s it.

When you use jconsole ( or other tools) to display it you get

You could have

MBeanServer server = ManagementFactory.getPlatformMBeanServer();

ObjectName bigPoolName = new ObjectName(“ColinJava:type=threadpool,name=BigPool”);
server.registerMBean(bigpoolInstance, bigPoolName);

ObjectName medPoolName = new ObjectName(“ColinJava:type=threadpool,name=MedPool”);
server.registerMBean(medpoolInstance, medPoolname);

ObjectName smPoolName = new ObjectName(“ColinJava:type=threadpool,name=SmallPool”);
server.registerMBean(smallpoolInstance,smPoolName);

This would display the stats data for three pools

  • ColinJava
    • threadpool
      • Bigpool..
      • MedPool….
      • SmallPool…

And so build up a tree like

  • ColinJava
    • threadpool
      • Bigpool..
      • MedPool….
      • SmallPool…
    • Userids
      • Userid+password
      • Certificate
    • Applications
      • Application 1
      • Application 2
    • Errors
      • Applications
      • Authentication

You can also have set…() methods to set values, but you need to be more careful; checking authorities, and possibly synchronising updates with other concurrent activity.

You can also have methods like resetStats() which show up within jconsole as Operations.

How do I build up the list of what is needed?

It is easy to expose data values which have little value. I remember MQ had a field in the statistics “Number of times the hash table changed”. I never found a use for this. Other times I thought “If only we had a count of ……”

You can collect information from problems reported to you. “It was hard to diagnose because… if we had the count of … the end user could have fixed it without calling us”.

Your performance team is another good source of candidates fields. Part of the performance team’s job is to identify statistics to make it easier to tune the system, and reduce the resources used. It is not just about identifying hot spots.

Before you implement the collection of data, you could present to your team on how the data will be used, and produce some typical graphs. You should get some good feedback, even if it is “I dont understand it”.

What can I use to display the data

There are several ways of displaying the data.

  • jconsole – which comes as part of Java can display the data in a window
  • python – you can issue a query can capture the data. I have this set up to capture the data every 10 seconds
  • other tools using the standard interfaces.