Ways of logging on to MQWEB on z/OS.

There are different ways of connecting to the MQ Web Server on z/OS (this is based on the z/OS Liberty Web server). Some ways use the SAF interface, which is the interface to the z/OS security manager. IBM provides RACF; there are other security managers such as TOP SECRET and ACF2. Userid information is stored in the security manager database.

The ways of connecting to the MQ Web server on z/OS.

No security. Use no_security.xml to set up the MQ Web Server.
Hard coded userids and passwords in a file. Use basic_registry.xml. This defines userid information like <user name="mqadmin" password="mqadmin">. This is suitable only for a sandbox. The password can be obscured or left in plain text.
Logon by z/OS userid and password. Use zos_saf_registry.xml. Logon is by userid and password and checked by a SAF call to the z/OS security manager. The userid is checked for access to a resource like MQWEB.com.ibm.mq.console.MQWebAdmin in class(EJBROLE) and MQWEB in class(APPL).
Connect with a client certificate, and authenticate using userid and password. This uses zos_saf_registry.xml plus additional configuration. The userid, password and access to the EJBROLE and APPL resources are checked by the SAF interface. The certificate is not used to check access; it is used only for the TLS handshake.
Certificate authentication, where a password is not required. Connect using a client certificate. This uses zos_saf_registry.xml plus additional configuration. Using the SAF interface, the certificate maps to a z/OS userid; this userid is used for checking access to the EJBROLE and APPL resources.

The configuration for using TLS is not clear.

I found the documentation for the TLS configuration to be unclear. Two parameters are <ssl clientAuthentication clientAuthenticationSupported…/>. The documentation says:

If you specify clientAuthentication="true", the server requests that a client sends a certificate. However, if the client does not have a certificate, or the certificate is not trusted by the server, the handshake does not succeed.
If you specify clientAuthenticationSupported="true", the server requests that a client sends a certificate. However, if the client does not have a certificate, or the certificate is not trusted by the server, the handshake might still succeed.
If you do not specify either clientAuthentication or clientAuthenticationSupported, or you specify clientAuthentication="false" or clientAuthenticationSupported="false", the server does not request that a client send a certificate during the handshake.

I experimented with the different options and the results are below.

I used a web browser with several possible certificates that could be used for authentication. I was given a pop-up which listed them. Chrome remembers the choice. With Firefox, you can tick an option “set as default”. If this is unticked you get prompted every time.
I used a browser with no certificates for authentication.

When a session was not allowed, I got (from Firefox) Secure Connection Failed. An error occurred during a connection to 10.1.1.2:9443. PR_END_OF_FILE_ERROR

Client Authentication	Client Authentication Supported	Browser with certificates	Browser without certificates
true	ignored	Pick certificate, userid and password NOT required	PR_END_OF_FILE_ERROR
false	true	Pick certificate, userid and password NOT required	A variety of results: one of PR_END_OF_FILE_ERROR, a blank screen, or userid and password required
false	false	Userid and password required	Userid and password required
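
The table can be restated as a small lookup. This is only a sketch of the results observed above, not of Liberty's actual logic; the function name and the outcome strings are illustrative labels.

```python
def observed_logon(client_auth, client_auth_supported, browser_has_cert):
    """Return the logon behaviour observed for one combination of settings."""
    if client_auth:
        # clientAuthenticationSupported is ignored when clientAuthentication is true
        return "certificate logon" if browser_has_cert else "handshake fails"
    if client_auth_supported:
        # without a certificate the observed results were not consistent
        return "certificate logon" if browser_has_cert else "varies"
    return "userid and password"


for combo in [(True, False, False), (False, True, True), (False, False, True)]:
    print(combo, "->", observed_logon(*combo))
```

The "varies" outcome covers the mixed results seen with clientAuthenticationSupported="true" and a browser with no certificates.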

When using certificates, you can choose to specify userid and password instead of client authentication by using the URL https://10.1.1.2:9443/ibmmq/console/login.html instead of https://10.1.1.2:9443/ibmmq/console .

Note well.

The server caches credential information. If you change the configuration and refresh the server, the change may not be picked up immediately.

Once you have logged on successfully, a cookie is stored in your browser. This may be used to authenticate, until the token has expired. To be sure of clearing this token I restarted my browser.

Why do they ship java products on z/OS with the handbrake on? And how to take the brake off.

I noticed that it takes seconds to start MQ on my little z/OS machine, but minutes (feels like days) to start anything with the Liberty Web server. This includes MQWEB, z/OSMF, and z/OS Connect. I mentioned this to an IBM colleague who asked if I was using Java shared classes. These get loaded into z/OS shared pages.

When I implemented it, my Liberty server came up in half the time!

I found this blog post which was very helpful, and showed me where to look for more information. I subsequently found this document (from 2006!)

The kindergarten overview of how Java works.

You start with a program written in the Java language.
Before it can run, the program is compiled into byte codes.
When the program runs, these byte codes get converted to native instructions, so a byte code “push onto the stack” may become eight 390 assembler instructions.
This code can be optimised, for example code which is executed frequently can have the assembler instructions rewritten to go faster. It might put code inline instead of out in a subroutine.
If you are using Java shared classes, this code can be written out and reused by other applications, or if you restart the server, it can reuse what it created before. Reusing the shared classes means that programs benefit because the byte codes have already been converted into native code, and optimisations have been done on the hot code.

What happens on z/OS?

By default, z/OS writes the code to virtual memory and does not save anything to disk. If you restart your Java application within the same IPL, it can exploit the shared classes which have been converted to native code and optimised. This is a good design, and I found the second time I started the web server it took half the time. However, I IPL once a day and start my web server once a day, so I do not benefit from a faster second start. By default when you re-IPL, the shared classes code is discarded, so next time you need the code it has to be converted to native instructions again, and it loses any optimisation which had been done.

What is the solution?

It is two easy steps:

Tell Java to write the information from memory to disk, to take a snapshot.
After IPL tell Java to load memory from the disk image, to restore a snapshot.

It is as simple as that.

Background.

It is all to do with the Java -Xshareclasses option.

With your application you tell Java where to store information about the shared classes. It defaults to Cache=/tmp/ name=javasharedresources.

In my jvm.options I overrode the defaults and specified

-Xshareclasses:nonFatal 
-Xshareclasses:groupAccess 
-Xshareclasses:cacheDirPerm=0777 
-Xshareclasses:cacheDir=/tmp,name=mqweb

If you give each application a name (such as mqweb) you can isolate the cache to an application and not disrupt another JVM if you change the cache. For example if you restore from a snapshot, only users of that “name” will be affected.

List what is in the cache

You can use the USS command,

java -Xshareclasses:cacheDir=/tmp/,listAllCaches

I used a batch job to do the same thing.

//IBMJAVA  JOB  1 
// SET V='listAllCaches' 
// SET C='/tmp/' 
//S1       EXEC PGM=BPXBATCH,REGION=0M, 
// PARM='SH java -Xshareclasses:cacheDir=&C,&V' 
//STDERR   DD   SYSOUT=* 
//STDOUT   DD   SYSOUT=*

The output below shows the cache name is mqweb. Once you have created a snapshot, the list has an entry for it as well.

Listing all caches in cacheDir /tmp/                                                                          
                                                                                                              
Cache name       level         cache-type      feature         OS shmid       OS semid 
mqweb            Java8 64-bit  non-persistent  cr              8197           4101
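
If you want to check the listing from a script, the output can be split into fields. This is a minimal sketch which assumes the columns are separated by two or more spaces, as in the sample above; the function name and dictionary keys are illustrative.

```python
import re

SAMPLE = """\
Listing all caches in cacheDir /tmp/

Cache name       level         cache-type      feature         OS shmid       OS semid
mqweb            Java8 64-bit  non-persistent  cr              8197           4101
"""


def parse_caches(text):
    """Return one dictionary per cache row in listAllCaches output."""
    caches = []
    for line in text.splitlines():
        fields = re.split(r"\s{2,}", line.strip())
        # Data rows have at least name, level, cache-type and feature;
        # the column heading row is skipped by its first column.
        if len(fields) < 4 or fields[0] == "Cache name":
            continue
        caches.append({"name": fields[0], "level": fields[1],
                       "type": fields[2], "feature": fields[3]})
    return caches


print(parse_caches(SAMPLE))
```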

For a different product I got

Incompatible shared caches                                     
rseapi                  Java8 32-bit  non-persistent  default

The “Incompatible shared caches” heading looks like it means you are using 64-bit Java, but there is a cache which was created by 32-bit Java.

For MQWEB the default parameters are -Xshareclasses:cacheDir=/u/mqweb/servers/.classCache,name=liberty-%u where /u/mqweb is the WLP parameter, where my parameters are defined, and %u is the userid the server is running under, so in my case liberty-START1.

When I had /u/mqweb/servers/.classCache, the total command line was too long for BPXBATCH. (Putting it into STDPARM gave me IEC020I 001-4 on the instream STDPARM because the resolved line was greater than 80 characters.) I resolved this by adding -Xshareclasses:cacheDir=/u/mqweb,name=cache to the jvm.options file.

To take a snapshot


//IBMJAVA  JOB  1 
// SET C='/tmp/' 
// SET N='mqweb' 
//* The last SET V= below is the one used; reorder the lines to pick
//* a different action.
// SET V='restoreFromSnapshot' 
// SET V='listAllCaches'
// SET V='snapshotCache'
//S1       EXEC PGM=BPXBATCH,REGION=0M, 
// PARM='SH java -Xshareclasses:cacheDir=&C,name=&N,&V' 
//STDERR   DD   SYSOUT=* 
//STDOUT   DD   SYSOUT=* 
//

This job took a few seconds to run.

I believe you have to take the snapshot while your Java application is executing – but I do not know for definite.

Restore a snapshot

To restore a snapshot just use restoreFromSnapshot in the above JCL. This took a few seconds to run.
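
The only thing that changes between these jobs is the last sub-option. As a sketch, the command string that the JCL passes to BPXBATCH could be built like this; the function name and its defaults are illustrative, and the action names are the ones used in this post.

```python
ACTIONS = {"listAllCaches", "snapshotCache", "restoreFromSnapshot",
           "destroySnapshot", "printStats", "printAllStats"}


def shareclasses_command(action, cache_dir="/tmp/", name="mqweb"):
    """Build the java command for one shared classes cache action."""
    if action not in ACTIONS:
        raise ValueError("unknown action: " + action)
    return "java -Xshareclasses:cacheDir={},name={},{}".format(cache_dir, name, action)


# Take a snapshot before shutdown, restore it after the next IPL.
print(shareclasses_command("snapshotCache"))
print(shareclasses_command("restoreFromSnapshot"))
```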

How to use it.

If you run the restoreFromSnapshot JCL before the web server starts, the cache is preloaded whenever you start your server.

If you take a snapshot every day before shutting down your server, you will get a copy with the latest optimisations. If you do not take a new snapshot it continues to use the old one.

If you no longer want the snapshot, you can get rid of it using the destroySnapshot option.

Is my cache big enough?

If you use the printStats request you get information like

Current statistics for cache "mqweb":                                                
...                                                                                     
cache size                           = 104857040                                     
softmx bytes                         = 104857040                                     
free bytes                           = 70294788 
...
Cache is 32% full                                     
                                                      
Cache is accessible to current user = true

The documentation says

When you specify -Xshareclasses without any parameters and without specifying either the -Xscmx or -XX:SharedCacheHardLimit options, a shared classes cache is created with a default size, as follows:

For 64-bit platforms, the default size is 300 MB, with a “soft” maximum limit for the initial size of the cache (-Xscmx) of 64MB, …

I had specified -Xscmx100m which matches the value reported.
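
To answer "is my cache big enough?" from a script, the numbers can be pulled out of the printStats output. This is a sketch: the assumption that the percentage is used bytes over the softmx size, rounded down, is a guess which happens to reproduce the 32% above. The function names are illustrative.

```python
import re

SAMPLE = """\
cache size                           = 104857040
softmx bytes                         = 104857040
free bytes                           = 70294788
"""


def parse_stats(text):
    """Collect the 'name = number' lines from printStats output."""
    stats = {}
    for line in text.splitlines():
        match = re.match(r"\s*(.+?)\s*=\s*(\d+)\s*$", line)
        if match:
            stats[match.group(1)] = int(match.group(2))
    return stats


def percent_full(stats):
    """Assumed formula: (softmx - free) / softmx, rounded down."""
    used = stats["softmx bytes"] - stats["free bytes"]
    return used * 100 // stats["softmx bytes"]


stats = parse_stats(SAMPLE)
print(percent_full(stats), "% full")
# -Xscmx100m is 100 * 1024 * 1024 bytes; the reported size is slightly smaller.
print(100 * 1024 * 1024 - stats["softmx bytes"], "bytes difference")
```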

What is in the cache?

You can use the printAllStats command. This displays information like

Classpath

1: 0x00000200259F279C CLASSPATH
/usr/lpp/java/J8.0_64/lib/s390x/compressedrefs/jclSC180/vm.jar
/usr/lpp/java/J8.0_64/lib/se-service.jar
/usr/lpp/java/J8.0_64/lib/math.jar

Methods for a class

0x00000200259F24A4 ROMCLASS: java/util/HashMap at 0x000002001FF7AEB8.
ROMMETHOD: size Signature: ()I Address: 0x000002001FF7BA88
ROMMETHOD: put Signature: (Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object; Address: 0x000002001FF7BC50

This shows

There is a class HashMap.
It has a method size() with no parameters returning an int. It is at…. in memory.
There is another method put(Object o1, Object o2) returning an Object. It is at … in memory.
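
The Signature strings use the standard JVM descriptor encoding (I for int, Lname; for a class, [ for an array). A small sketch of a decoder, with illustrative function names, shows how the two signatures above turn into the readable form:

```python
PRIMITIVES = {"B": "byte", "C": "char", "D": "double", "F": "float",
              "I": "int", "J": "long", "S": "short", "Z": "boolean",
              "V": "void"}


def read_type(sig, pos):
    """Decode one type starting at sig[pos]; return (name, next position)."""
    dims = 0
    while sig[pos] == "[":          # each [ adds one array dimension
        dims += 1
        pos += 1
    if sig[pos] == "L":             # class name runs up to the semicolon
        end = sig.index(";", pos)
        name = sig[pos + 1:end].replace("/", ".")
        pos = end + 1
    else:
        name = PRIMITIVES[sig[pos]]
        pos += 1
    return name + "[]" * dims, pos


def describe(method, sig):
    """Turn a method name and descriptor into 'return name(params)'."""
    if not sig.startswith("("):
        raise ValueError("not a method signature: " + sig)
    pos = 1
    params = []
    while sig[pos] != ")":
        param, pos = read_type(sig, pos)
        params.append(param)
    ret, _ = read_type(sig, pos + 1)
    return "{} {}({})".format(ret, method, ", ".join(params))


print(describe("size", "()I"))
print(describe("put", "(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;"))
```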

Other stuff

There are sections with JITHINTS and other performance related data.

Getting SSL/TLS to work on MQ on z/OS

After I succeeded in getting TLS 1.3 to run on MQ midrange 9.2, I thought I would try it on z/OS. I had not used TLS on z/OS for about 10 years, so I was coming to the topic with very rusty knowledge.

I searched the Knowledge Centre and found lots of hits, none of which were relevant. I eventually found an SSL related keyword, and this got me to the topic Working with SSL/TLS on z/OS. I think this is well documented. It covered all the things I had to do.

The remainder of this post covers the bits not covered by the documentation.

Define SSLTASKS.

You need to define SSLTASKS to be able to use TLS on z/OS. See the comments here. I used

%CSQ9 ALTER QMGR SSLTASKS(5)

You need to restart the CHINIT if you change the value of SSLTASKS.

Set up the keyring for the queue manager.

See here. This post shows how to create the keyring, import a CA from z/OS, and import a CA from a Linux system.

If you alter the keyring or CERTLABL you just need a REFRESH SECURITY TYPE(SSL) command to pick up the changes.

Defining the channel

I tried to define the channel. As this failed for security reasons, I have given the RACF setup I had to do.

In this section I defined profiles for specific commands, for example DEFINE.CHANNEL. I could have defined DEFINE.* to allow all define commands.

I used a channel called TLS, and defined the resource CSQ9.CHANNEL.TLS* to allow my ID to define TLS, TLS1 etc.

The command %CSQ9 DEF CHL(TLS) CHLTYPE(SVRCONN) gave me

ICH408I USER(CSQOPR ) GROUP(SYS1 ) NAME(COLIN PAICE ) 167
CSQ9.DEFINE.CHANNEL CL(MQCMDS )
WARNING: INSUFFICIENT AUTHORITY - TEMPORARY ACCESS ALLOWED
ACCESS INTENT(ALTER ) ACCESS ALLOWED(NONE )

I used the RACF commands in a batch job.

/* RDELETE MQCMDS CSQ9.DEFINE.CHANNEL
RDEF MQCMDS CSQ9.DEFINE.CHANNEL UACC(NONE)
PERMIT CSQ9.DEFINE.CHANNEL CLASS(MQCMDS ) -
ID(COLIN,IBMUSER) ACCESS(ALTER )

I also set up CSQ9.DELETE.CHANNEL and CSQ9.ALTER.CHANNEL in a similar way, so my userid could maintain the channels.

I refreshed MQ security with %CSQ9 REFRESH SECURITY to pick up the changes.

I reissued the command %CSQ9 DEF CHL(TLS ) CHLTYPE(SVRCONN) and got

ICH408I USER(COLIN ) GROUP(SYS1 ) NAME(COLIN PAICE )
CSQ9.CHANNEL.TLS CL(MQADMIN )
PROFILE NOT FOUND - REQUIRED FOR AUTHORITY CHECKING
ACCESS INTENT(ALTER ) ACCESS ALLOWED(NONE )

I used the RACF commands in a batch job.

/* RDELETE MQADMIN CSQ9.CHANNEL.TLS*
RDEF MQADMIN CSQ9.CHANNEL.TLS* UACC(NONE) WARNING
PERMIT CSQ9.CHANNEL.TLS* CLASS(MQADMIN) -
ID(COLIN,IBMUSER) ACCESS(ALTER )
SETROPTS RACLIST(MQADMIN) REFRESH

I issued the commands

%CSQ9 refresh security
%CSQ9 DEF CHL(TLS ) CHLTYPE(SVRCONN)

And successfully defined the channel.

I changed the cipher spec.

I selected a cipher spec from the list.

%CSQ9 alter chl(TLS) chltype(SVRCONN) SSLCIPH(ECDHE_RSA_AES_128_CBC_SHA256)

When I started the channel I got

CSQX631E … CSQXRESP Cipher specifications differ, channel TLS local=ECDHE_RSA_AES_128_CBC_SHA256 remote=TLS_RSA_WITH_AES_256_GCM_SHA384
connection 10.1.0.2 (10.1.0.2)

This was clear; I love clear messages.
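
Because the message carries both values, it is easy to pick them out when scanning a CHINIT job log. A sketch, assuming the message text has the same layout as the one above (the function name is illustrative):

```python
import re

MESSAGE = ("CSQX631E … CSQXRESP Cipher specifications differ, channel TLS "
           "local=ECDHE_RSA_AES_128_CBC_SHA256 "
           "remote=TLS_RSA_WITH_AES_256_GCM_SHA384")


def cipher_mismatch(message):
    """Return (channel, local, remote) from a CSQX631E message, or None."""
    match = re.search(r"channel (\S+) local=(\S+) remote=(\S+)", message)
    return match.groups() if match else None


print(cipher_mismatch(MESSAGE))
```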

I decided to change the z/OS end

%CSQ9 alter chl(TLS) chltype(SVRCONN) SSLCIPH(TLS_RSA_WITH_AES_256_GCM_SHA384 )

and the client connected successfully.

With MQ 9.2 I could (and did) change this to

%CSQ9 alter chl(TLS) chltype(SVRCONN) SSLCIPH(ANY_TLS12)

and the client worked successfully. ANY_TLS12 provides a wide list of supported cipher specifications, including TLS_RSA_WITH_AES_256_GCM_SHA384 and ECDHE_RSA_AES_128_CBC_SHA256.

When I am ready to support TLS 1.3 I will use ANY_TLS12_OR_HIGHER and ANY_TLS13_OR_HIGHER.

Connect a client to it!

My client had already connected to a midrange queue manager, so I had a working client environment. See here for the journey.

I created a .json file for the CCDT connection to z/OS. I specified

{ "channel":
  [{
    "name": "TLS",
    "clientConnection":
    {
      "connection":
      [{
        "host": "10.1.1.2",
        "port": 1414
       }],
      "queueManager": "CSQ9"
    },
    "transmissionSecurity":
    {
      "cipherSpecification": "ANY_TLS12",
      "certificateLabel": "rsaca256_client",
      "certificatePeerName": ""
    },
    "type": "clientConnection"
  }]
}

When it connected I got messages

+CSQX511I %CSQ9 CSQXRESP Channel TLS started connection 10.1.0.2
ICH408I USER(COLINPAI) GROUP( ) NAME(??? )
LOGON/JOB INITIATION - USER AT TERMINAL NOT RACF-DEFINED
IRR012I VERIFICATION FAILED. USER PROFILE NOT FOUND.
+CSQX512I %CSQ9 CSQXRESP Channel TLS no longer active connection 10.1.0.2

COLINPAI came from the userid on the Linux machine (colinpaice) upper cased and truncated. This id will be flowed and used as the MCAUSER if you don’t set it to anything else, using CHLAUTH for example (Thanks to Morag for this information).
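
The transformation described above is small enough to show directly. This sketch only reproduces the observed result (upper case, first eight characters); it is not MQ's code, and the function name is illustrative.

```python
def flowed_userid(client_userid):
    """Upper case the client userid and truncate it to 8 characters."""
    return client_userid.upper()[:8]


print(flowed_userid("colinpaice"))
```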

Enable chlauth

To be able to map from the DN in a certificate to a z/OS userid you have to use MQ CHLAUTH. See Mapping a client user ID to an MCAUSER user ID.

Check it is enabled at the queue manager level and enable it if needed.

%CSQ9 DIS QMGR CHLAUTH
%CSQ9 ALTER QMGR CHLAUTH(ENABLED)

Define a mapping from certificate to userid

I used

//S1 EXEC PGM=CSQUTIL,PARM='CSQ9' 
//STEPLIB  DD DSN=COLIN.MQ921.SCSQLOAD,DISP=SHR 
//         DD DSN=COLIN.MQ921.SCSQANLE,DISP=SHR 
//SYSPRINT DD SYSOUT=* 
//SYSIN   DD * 
 COMMAND DDNAME(COMMAND) 
//COMMAND DD * 
 SET CHLAUTH('TLS') + 
     TYPE(SSLPEERMAP) SSLPEER('O="cpwebuser"') + 
     ACTION(REPLACE)   + 
     MCAUSER(ADCDD ) CHCKCLNT(ASQMGR) 
/*

This says for channel TLS, take the Organisation(O=..) from the certificate, and if it is cpwebuser then set the ID to ADCDD.
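
As a much simplified sketch of what that rule does: split the peer's DN into attributes and compare the O= value. Real SSLPEER matching in MQ has more rules (wildcards, multiple attributes, ordering), and this split on commas assumes no commas inside values. The function names and defaults are illustrative.

```python
def parse_dn(dn):
    """Split 'K1=v1,K2=v2' into {key: [values]}; keys are upper cased."""
    attributes = {}
    for part in dn.split(","):
        key, _, value = part.strip().partition("=")
        attributes.setdefault(key.upper(), []).append(value.strip('"'))
    return attributes


def map_mcauser(peer_dn, attribute="O", value="cpwebuser", mcauser="ADCDD"):
    """Return the MCAUSER if the DN has the wanted attribute value, else None."""
    return mcauser if value in parse_dn(peer_dn).get(attribute, []) else None


print(map_mcauser("SERIALNUMBER=01:90,CN=rsaca256,O=cpwebuser,C=GB"))
```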

Check it works

Once the channel had started I used

%CSQ9 DIS CHS(TLS)
It displayed the following, where I have removed lines which are not relevant to TLS and added some comments.

CHSTATUS(TLS)
CHLTYPE(SVRCONN)
…
SECPROT(TLSV12) - this is the level of the protocol
SSLCERTI(CN=SSCARSA1024,OU=CA,O=SSS,C=GB) - this is the DN of the issuer of the SSLPEER certificate (below)
SSLCERTU(START1) - the IBM documentation says “The local user ID associated with the remote certificate.” I don't know where this comes from, how to change it, or where it is used.
SSLCIPH(TLS_RSA_WITH_AES_256_GCM_SHA384) - the negotiated cipher spec
SSLRKEYS(0) - the number of successful TLS key resets
SSLKEYTI() - the time at which the previous successful TLS secret key was reset. The secret key has not been reset.
SSLKEYDA() - the date on which the previous successful TLS secret key was reset. The secret key has not been reset.
SSLPEER(SERIALNUMBER=01:90,CN=rsaca256,O=cpwebuser,C=GB, UNSTRUCTUREDNAME=openssl_ca_user_cnf.keyAgreement2, UNSTRUCTUREDNAME=localhost, UNSTRUCTUREDADDRESS=127.0.0.1) - this is information from the certificate at the remote end
…
MCAUSER(ADCDD) - this is the userid (set by the CHLAUTH above) used by this channel
LOCLADDR(10.1.1.2(1414)) - this is the address the connection came in from. This value will be different if you have different IP stacks and different listener ports.

Taking the brakes off ZFS on z/OS – move it to OMVS

From z/OS 2.2 there is a performance advantage in running the zFS file system as part of OMVS, rather than in its own address space. The IBM documentation says “When running zFS in the OMVS address space, each file system vnode operation (such as creating a directory entry, removing a directory entry, or reading from a file) will have better overall performance. Each operation will take the same amount of time while inside zFS itself. The performance benefit occurs because z/OS UNIX can call zFS for each operation in a more efficient manner.” This will be relevant when your application is doing a lot of file I/O, for example when using a web server.

This move is not documented – but it is really easy! It is mentioned here. Instructions are hidden in the installation instructions here.

Before I started

The IBM doc says You can determine if zFS is in its own address space by issuing D OMVS,PFS. If the output shows an ASNAME value, zFS is running as a colony address space.

OMVS     0010 ACTIVE             OMVS=(00,01,BP,IZ,RZ,BB)                
PFS CONFIGURATION INFORMATION                                            
 PFS TYPE   ENTRY      ASNAME    DESC      ST    START/EXIT TIME         
 ...   
  ZFS       IOEFSCM    ZFS       LOCAL     A     2021/02/17 17.35.06

The steps I took…

I added KERNELSTACKS(ABOVE) to USER.Z24A.PARMLIB(BPXPRM00).
Being ultra cautious I re-IPLed.
The documentation talks about putting an IOEZPRM DD in OMVS, then goes on to say “As the preferred alternative to the IOEZPRM DDNAME specification, delete the IOEZPRM DDNAME and use the IOEPRMxx parmlib member.” So I did not change the OMVS proc. When I re-IPLed it worked and I got the message IOEZ00374I No IOEZPRM DD specified in OMVS proc. Parmlib search being used.
I edited USER.Z24A.PARMLIB(BPXPRM00) and removed ASNAME(ZFS) from the statement FILESYSTYPE TYPE(ZFS) ENTRYPOINT(IOEFSCM) ASNAME(ZFS). Well, I actually made a copy of the original line and put it between /* and */, then deleted the text.
I re-IPLed.

Afterwards

The D OMVS,PFS command now gives N/A instead of the Address Space Name

OMVS     0010 ACTIVE             OMVS=(00,01,BP,IZ,RZ,BB)                
PFS CONFIGURATION INFORMATION                                         
 PFS TYPE   ENTRY      ASNAME    DESC      ST    START/EXIT TIME         
...
  ZFS       IOEFSCM    N/A       LOCAL     A     2021/02/17 17.55.47
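
If you capture the D OMVS,PFS output, the before and after check can be automated. This sketch assumes the ZFS row has the same column layout as the displays in this post; the function names are illustrative.

```python
AFTER = """\
 PFS TYPE   ENTRY      ASNAME    DESC      ST    START/EXIT TIME
  ZFS       IOEFSCM    N/A       LOCAL     A     2021/02/17 17.55.47
"""


def zfs_asname(display_output):
    """Return the ASNAME column of the ZFS row, or None if there is no row."""
    for line in display_output.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "ZFS":
            return fields[2]
    return None


def zfs_runs_in_omvs(display_output):
    """N/A in the ASNAME column means zFS has no address space of its own."""
    return zfs_asname(display_output) == "N/A"


print(zfs_runs_in_omvs(AFTER))
```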

Easy!

The hardest part was making sure I had an IPLable SARES1 in case I got it wrong!

Issuing commands…

I used to issue commands like f zfs,query,all. Now that the ZFS address space does not exist, you need to use f omvs,pfs=zfs,query,all.

Colin's updates to MQ messages

As I was trying to get TLS to work on midrange, I had many MQ error messages. Sometimes the messages were a bit vague: “you’ve had a problem. Resolve it and restart the channel”.

Below is the list of messages I’ve added comments to. I’ve done it as a blog post as well-known search engines are not finding the pages.

Mid range

z/OS

Client

JMSCMQ0001: IBM MQ call failed with compcode ‘2’ (‘MQCC_FAILED’) reason ‘2278’ (‘MQRC_CLIENT_CONN_ERROR’)

Control unit cache section (SMF 42.2) statistics appear strange and unloved. Should I use them? No!

Why should I not use them?

The SMF 42 subtype 2 control unit cache records seem to have strange data in them, and seem to add little value. The RMF 74 subtype 5 record (Cache Subsystem Device Activity) includes most of the data in the 42.2, and contains additional data. The 74.5 record layout has good documentation. The 42.2 record layout does not.

Why do I think the record is unloved?

Some of the record formats are non standard.

Many SMF records use a time since midnight in hundredths of a second. SMF 42.2 does it in seconds (so I had to change my formatting routine, sigh).
SMF dates tend to be packed decimal. In SMF 42.2 a date is documented as “Year, in the form 0cyy, where c is 0 for 19xx and 1 for 20xx, and yy”. It is really just the year minus 1900. The value x79 is decimal 121, so 1900 + 121 is 2021, this year.
It has two data sections, “Statistics gathered from last update period” and “Statistics gathered from current update period”. The current is 90 seconds after the “last” (should that be previous interval?). I don't know what data these sections contain. Is it for the whole duration between the SMF records, or just the last 90 seconds of each interval?
It has a field “Fast write bypasses per minute (an integer).” for each of the intervals. I don’t know what it means especially when you consider the interval between the two sections is 90 seconds.
The SMF records were produced every 30 minutes. I don't know how to get the data for the other 27 minutes.
There are sections for each SMS managed volume. The data basically says “Is Cache and Fast write” enabled for this volume. Yes/Yes. So what?
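
The two format differences can be shown in a few lines. This sketch follows what the records were observed to contain (year stored as year minus 1900, time in whole seconds), not the documented 0cyy layout; the function names are illustrative.

```python
from datetime import timedelta


def smf422_year(raw_value):
    """Year as observed in SMF 42.2: the stored value plus 1900."""
    return 1900 + raw_value


def time_of_day(value, units_per_second=1):
    """Time since midnight; use units_per_second=100 for hundredths."""
    return timedelta(seconds=value / units_per_second)


print(smf422_year(0x79))
# The same time of day, as SMF 42.2 seconds and as the usual hundredths.
print(time_of_day(45296), time_of_day(4529600, units_per_second=100))
```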

As this has a subtype of 2 I expect this record was produced 30+ years ago when people did not really know what they wanted in the SMF records. I expect Development quietly ignored this subtype, and created other subtypes with useful data. This is why I think you should use other records instead of these 42.2 records.

Does storage controller pining have anything to do with Norwegian Fiords?

No – it means I need to wear my glasses!

I was looking at the disk storage controller statistics and saw references to pinned storage. I naturally thought this was a reference to the Monty Python dead parrot sketch where the Norwegian Blue was “pining for the Fiords”.

I've often wondered what pinned storage was. I found “Pinned data cannot be removed from the cache by the storage machine because of a hardware failure”. I think that if the amount of pinned storage is greater than zero, you should phone someone.

There are other fields, such as “Unavailable storage”.

These are examples of providing data but no information. You need to know that if these fields are non zero, this is a problem and you need to do something about it; just displaying a number has little value. I would have hoped that the documentation would have said, in big bold letters: If these numbers are not zero then take action!

How do I use Linux to manage my corporate certificates?

Having used z/OS to be my corporate Certificate Authority I thought I would use Linux to be a corporate CA, and manage z/OS certificates. For more information on Certificate Authorities, and signing certificates, see here.

Setting up your Corporate CA on Linux

At the top of the CA certificate hierarchy is a self signed certificate.

Create the CA self signed certificate

openssl req -x509 -config openssl-ca.cnf -newkey rsa:4096 -days 3000 -nodes -subj "/C=GB/O=SSS/OU=CA/CN=SSCA8" -out cacert.pem -keyout cacert.key.pem -outform PEM -addext basicConstraints="critical,CA:TRUE, pathlen:0" -addext keyUsage="keyCertSign, digitalSignature"

This creates a certificate with

-x509 says make it self signed, so it is my enterprise master CA
using a 4096-bit RSA key
a subject "/C=GB/O=SSS/OU=CA/CN=SSCA8"
valid for 3000 days
no DES encryption (-nodes) of the private key in the output file
the certificate (the subject and public key) is stored in cacert.pem
the private key is stored in cacert.key.pem
using format PEM
extra parameters CA:TRUE and keyUsage…

Display it

openssl x509 -in cacert.pem -text -noout|less

Create a personal certificate on Linux and sign it.

Create a personal certificate on Linux and get it signed by the CA created above.

I set up a shell script to do the work

name="adcdd"
subj="/C=GB/O=cpwebuser/CN=adcdd"
#Passwords are stored in a file called password.file
passwords="-passin file:password.file -passout file:password.file"

#clean out the old files (-f so a missing file is not an error)
rm -f $name.key.pem
rm -f $name.csr
rm -f $name.pem

CA="cacert"

# generate a private key using Elliptic curve and type secp256r1
openssl ecparam -name secp256r1 -genkey -noout -out $name.key.pem

#create a certificate signing request (CSR)
openssl req -new -key $name.key.pem -out $name.csr -outform PEM -subj "$subj" $passwords

#sign it, or send it off to be signed. Get the $name.pem back from the request.
openssl ca -config openssl-ca-user.cnf -md sha384 -out $name.pem -cert cacert.pem -keyfile cacert.key.pem -policy signing_policy -extensions signing_mqweb -md sha256 -infiles $name.csr

#Get the *.pem file back, if required, and merge the files to form the .p12 file
openssl pkcs12 -export -inkey $name.key.pem -in $name.pem -out $name.p12 -CAfile $CA.pem -chain -name $name $passwords

I stored common information in configuration files, such as openssl-ca-user.cnf. This had sections called signing_policy and signing_mqweb; the latter had

[ signing_mqweb ]

subjectAltName = DNS:localhost, IP:127.0.0.1
keyUsage = digitalSignature, keyEncipherment
extendedKeyUsage = serverAuth

Use Linux to be the Certificate Authority for my z/OS RACDCERT certificates.

Create a certificate on z/OS.

//IBMRACF JOB 1,MSGCLASS=H 
//S1 EXEC PGM=IKJEFT01,REGION=0M 
//SYSPRINT DD SYSOUT=* 
//SYSTSPRT DD SYSOUT=* 
//SYSTSIN DD * 
 
RACDCERT ID(START1) DELETE(LABEL('MYCERTL')) 
 /* create the certificate - note it is self-signed, not signed by a CA */
RACDCERT ID(START1) GENCERT - 
  SUBJECTSDN(CN('MYCERTL') - 
  O('SSS') - 
  OU('SSS')) - 
  ALTNAME(IP(10.1.1.2) - 
  DOMAIN('WWW.ME2.COM') )- 
  SIZE(4096) - 
  RSA - 
  WITHLABEL('MYCERTL') 

 /* convert this to a certificate request and output it */
RACDCERT GENREQ (LABEL('MYCERTL')) ID(START1) - 
DSN('IBMUSER.CERT.MYCERTL.CSR') 
/*

The first line of the data set is -----BEGIN NEW CERTIFICATE REQUEST----- which is what I expect for a Certificate Signing Request.

FTP this down to the Linux machine as mycertl.csr and use the openssl ca command. This uses the cacert.*.pem files created above

openssl ca -config openssl-ca-user.cnf -md sha384 -out mycertl.pem -notext -cert cacert.pem -keyfile cacert.key.pem -policy signing_policy -extensions signing_mqweb -md sha256 -infiles mycertl.csr

Note: My openssl-ca-user.cnf is given below.

I carefully checked the details displayed, and replied y to both questions.

This produced

Using configuration from openssl-ca-user.cnf
Check that the request matches the signature
Signature ok
The Subject's Distinguished Name is as follows
organizationName :PRINTABLE:'SSS'
organizationalUnitName:PRINTABLE:'SSS'
commonName :PRINTABLE:'MYCERTL'
Certificate is to be certified until Oct 14 15:28:45 2023 GMT (1000 days)
Sign the certificate? [y/n]:y

1 out of 1 certificate requests certified, commit? [y/n]y
Write out database with 1 new entries
Data Base Updated

If the option -notext is not specified, the output file contains the readable interpretation of the certificate. Specify -notext to get the output file where the first line is -----BEGIN CERTIFICATE-----
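
The first-line checks mentioned in this post can be scripted before a file is uploaded or added to RACF. This is a sketch which only looks at the first line; it does not validate the certificate. The function name and return strings are illustrative.

```python
KINDS = {
    "-----BEGIN CERTIFICATE-----": "certificate",
    "-----BEGIN NEW CERTIFICATE REQUEST-----": "certificate request",
    "-----BEGIN CERTIFICATE REQUEST-----": "certificate request",
}


def pem_kind(text):
    """Classify a file by its first non-blank line."""
    lines = text.strip().splitlines()
    first = lines[0].strip() if lines else ""
    return KINDS.get(first, "not PEM on first line")


print(pem_kind("-----BEGIN NEW CERTIFICATE REQUEST-----\nMIIB...\n"))
```

A file written without -notext starts with readable text, so it is reported as "not PEM on first line".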

Upload this output file to z/OS (for example "put mycertl.pem 'IBMUSER.CERT.MYCERTL.PEM'").

Check the contents before you add it to the RACF keystore

RACDCERT CHECKCERT('IBMUSER.CERT.MYCERTL.PEM')

Add the certificate to the keystore

//IBMRACF JOB 1,MSGCLASS=H 
//S1 EXEC PGM=IKJEFT01,REGION=0M 
//SYSPRINT DD SYSOUT=* 
//SYSTSPRT DD SYSOUT=* 
//SYSTSIN DD * 
RACDCERT ADD('IBMUSER.CERT.MYCERTL.PEM') - 
    ID(START1) WITHLABEL('MYCERTL') 
/*

The command racdcert list (label('MYCERTL')) id(start1) shows the certificate has NOTRUST, so it will not be visible on any keyring. You need

RACDCERT ID(START1) ALTER(LABEL('MYCERTL')) TRUST

and you will need to connect it to any keyrings.

Upload the CA certificate into the RACF database.

You will also need to upload the CA's certificate (containing its public key) to the RACF database as a CERTAUTH, and connect it to any ring that uses a certificate signed by the Linux CA.

FTP the certificate file, cacert.pem (created above), to z/OS as text. Once you have FTPed the file, check the first line is "-----BEGIN CERTIFICATE-----".

Add it to RACF

You can add the certificate owned by a userid (rather than CERTAUTH). The certificate needs to be connected to the keyring with USAGE(CERTAUTH).

//IBMRACF JOB 1,MSGCLASS=H 
//S1 EXEC PGM=IKJEFT01,REGION=0M 
//SYSPRINT DD SYSOUT=* 
//SYSTSPRT DD SYSOUT=* 
//SYSTSIN DD *  
 /* delete it if needed 
RACDCERT DELETE (LABEL('Linux-CA256')) ID(START1)

RACDCERT id(start1) ADD('IBMUSER.CA256.PEM') - 
          WITHLABEL('Linux-CA256') TRUST 

RACDCERT CONNECT(id(start1) LABEL('Linux-CA256') - 
      RING(TRUST) USAGE(CERTAUTH)) ID(START1) 

RACDCERT CONNECT(id(start1) LABEL('Linux-CA256') - 
      RING(DANRING) USAGE(CERTAUTH)) ID(START1) 

RACDCERT LISTRING(TRUST ) ID(START1) 

racdcert list (label('Linux-CA256')) id(start1) 

SETROPTS RACLIST(DIGTCERT,DIGTRING ) refresh

My openssl-ca-user.cnf file.

HOME = .
RANDFILE = $ENV::HOME/.rnd

####################################################################
[ ca ]
default_ca = CA_default # The default ca section

[ CA_default ]
default_days = 1000 # How long to certify for
default_crl_days = 30 # How long before next CRL

default_md = sha256 # Use public key default MD
preserve = no # Keep passed DN ordering

x509_extensions = ca_extensions # The extensions to add to the cert

email_in_dn = no # Don’t concat the email in the DN
copy_extensions = copy # Required to copy SANs from CSR to cert

base_dir = .
certificate = $base_dir/cacert.pem # The CA certificate
private_key = $base_dir/cakey.pem # The CA private key
new_certs_dir = $base_dir # Location for new certs after signing
database = $base_dir/index.txt # Database index file
serial = $base_dir/serial.txt # The current serial number

unique_subject = no # Set to ‘no’ to allow creation of
# several certificates with same subject.

[ signing_policy ]
countryName = optional
stateOrProvinceName = optional
localityName = optional
organizationName = optional
organizationalUnitName = optional
commonName = supplied
emailAddress = optional

##########

[ signing_mqweb ]

subjectAltName = DNS:localhost, IP:127.0.0.1
keyUsage = digitalSignature, keyEncipherment
extendedKeyUsage = serverAuth

How to backup only key data sets on z/OS

I’ve been backing up my datasets on z/OS, and wondered what the best way of doing it was.

I wanted to backup datasets containing data I wanted to keep, but did not want to backup other data sets which could easily be recreated, such as IPCS dump dataset, the output of compiles, or the SMF records.

DFDSS has a backup and restore program which is very powerful. With it you can

Process data sets under a High Level Qualifier – include or exclude data sets.
Backup only changed data sets
Backup individual files in a ZFS or USS – but this is limited: you have to explicitly specify the files you want to backup. You cannot backup a directory.

You cannot backup individual members of a PDS(E). You have to backup the whole PDS(E). If you need to restore a member, restore the backup with a different HLQ and select the members from that.

What should I use?

I tend to use XMIT and DFDSS – the Storage Management component on z/OS. DFDSS tends to be used by the data managers as it can backup groups of data sets, volumes, etc.

Backing up using XMIT.

This has the advantage that the output file is a card image, which is a portable format.

I have a job

//MYLIBS1 JCLLIB ORDER=USER.Z24A.PROCLIB 
// SET TODAY='D201224' 
//S1 EXEC PROC=BACKUP,P=USER.Z24A.PARMLIB,DD=&TODAY. 
//S2 EXEC PROC=BACKUP,P=USER.Z24A.PROCLIB,DD=&TODAY.

Where

P is the name of the data set to back up
DD is the date qualifier for the backup. I set today’s date in TODAY and pass it as DD=&TODAY.

The backup procedure has

//BACKUP PROC P='USER.Z24A.PROCLIB',DD='UNKNOWN' 
//S1 EXEC PGM=IKJEFT01,REGION=0M, 
// PARM='XMIT A.A DSN(''&P'') OUTDSN(''BACKUP.&P..&DD'')' 
//SYSPRINT DD SYSOUT=* 
//SYSTSPRT DD SYSOUT=* 
//SYSTSIN DD * 
// PEND

The command that gets generated when P is USER.Z24A.PROCLIB and DD=D201224 is

XMIT A.A DSN('USER.Z24A.PROCLIB') OUTDSN('BACKUP.USER.Z24A.PROCLIB.D201224')

This makes it easy to find the backups for a file, and for a particular date.
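The naming scheme can be sketched in Python, for example to generate the date qualifier instead of typing it by hand. This is a minimal sketch, not part of the job above: the function names are made up, and I am assuming the D201224 qualifier is D followed by yymmdd. The 44 character limit is the standard z/OS data set name limit.

```python
from datetime import date

MAX_DSN_LEN = 44  # z/OS data set names are limited to 44 characters


def date_qualifier(day: date) -> str:
    """Return a qualifier such as D201224 (D followed by yymmdd)."""
    return "D" + day.strftime("%y%m%d")


def xmit_command(dsn: str, day: date, prefix: str = "BACKUP") -> str:
    """Build the XMIT command that the BACKUP procedure generates."""
    out_dsn = f"{prefix}.{dsn}.{date_qualifier(day)}"
    if len(out_dsn) > MAX_DSN_LEN:
        raise ValueError(f"{out_dsn} is longer than {MAX_DSN_LEN} characters")
    return f"XMIT A.A DSN('{dsn}') OUTDSN('{out_dsn}')"


if __name__ == "__main__":
    print(xmit_command("USER.Z24A.PROCLIB", date(2020, 12, 24)))
```

The length check is there because adding BACKUP. and a date qualifier to a name that is already long can take it past the limit.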

To restore a file you use the command TSO RECEIVE INDSN('BACKUP.USER.Z24A.PROCLIB.D201224').

Using DFDSS to backup

This is a powerful program, and it is worth taking baby steps to understand it.

The basic job is

//IBMDFDSS JOB 1,MSGCLASS=H 
//S1 EXEC PGM=ADRDSSU,REGION=0M,PARM='TYPRUN=NORUN' 
//TARGET DD DSN=COLIN.BACKUP.DFDSS,DISP=(MOD,CATLG), 
// SPACE=(CYL,(50,50),RLSE) 
//SYSPRINT DD SYSOUT=* 
//SYSIN DD * 
 DUMP - 
   DATASET(INCLUDE(COLIN.JCL,COLIN.WLM,COLIN.C) - 
   BY(DSCHA,EQ,YES)) - 
   OUTDDNAME(TARGET) - 
   COMPRESS 
/*

For the syntax of the dump data set command see here.
This dumps the specified data sets, COLIN.JCL, COLIN.WLM and COLIN.C, and writes them to one file through the TARGET DD statement. TARGET is defined as a data set (COLIN.BACKUP.DFDSS).
This does not actually do the backup because it has TYPRUN=NORUN.
You can specify many filter criteria, in the BY(…) such as last reference, size, etc. See here.
The BY(DSCHA,EQ,YES) says dump datasets only if they have the “changed flag” set. The Changed flag is set when a data set was open for output. Using ADRDSSU with the RESET option resets the changed flag. This allows you to backup only data sets which have changed – see below.
It compresses the files as it backs up the files.

I did have

DATASET(INCLUDE(COLIN.**) - 
   EXCLUDE(COLIN.O.**,COLIN.SMP*.**,COLIN.DDIR ) - 
    BY(DSCHA,EQ,YES)) -

Which says backup all data sets with the High Level Qualifier COLIN, but exclude the listed files. I ran this using TYPRUN=NORUN, and this listed 100+ datasets. Whoops, so I changed it to explicitly include the files I wanted to backup. Once I had determined the files I wanted to backup I removed the TYPRUN=NORUN, and backed up the datasets.

Using DFDSS to restore

You can restore from the DFDSS backups using a job like

//S1 EXEC PGM=ADRDSSU,REGION=0M,PARM='TYPRUN=NORUN' 
//TARGET DD DSN=COLIN.BACKUP.DFDSS,DISP=SHR
//SYSPRINT DD SYSOUT=* 
//SYSIN DD * 
  RESTORE - 
    DATASET(INCLUDE(COLIN.C) ) - 
    RENAME(COLINN) - 
    INDDNAME(TARGET) 
/*

This says restore the data sets specified in the INCLUDE, from the backup data set defined by //TARGET, and rename the HLQ to COLINN.

Initially I specified PARM='TYPRUN=NORUN' so it did not actually try to restore the files. It reported

THE INPUT DUMP DATA SET BEING PROCESSED IS IN LOGICAL DATA SET FORMAT AND WAS CREATED BY Z/OS DFSMSDSS 
VERSION 2 RELEASE 4 MODIFICATION LEVEL 0 ON 2020.359 17:16:44 
DATA SET COLINN.C WAS SELECTED 
PROCESSING BYPASSED DUE TO NORUN OPTION 
THE FOLLOWING DATA SETS WERE SUCCESSFULLY PROCESSED 
COLIN.C

From the time stamp 2020.359 17:16:44 we can see I was using the expected backup.

Once you are happy you have the right backup, and list of data sets, you can remove the PARM='TYPRUN=NORUN' to restore the data.

If you have backed up COLIN.JCL and SUE.JCL, and try to rename on restore (so you do not overwrite existing files), it would fail because it would create COLINN.JCL from the first data set and then try to create COLINN.JCL again from the other! To get round this use INCLUDE(COLIN.**) RENAME(COLINN) and INCLUDE(SUE.*) RENAME(SUEN).
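The clash is easy to see if you work out the new names before you restore. Below is a small Python sketch of that check. It is only an illustration of the renaming arithmetic and does not call DFDSS; the function names are made up.

```python
def rename_hlq(dsn: str, new_hlq: str) -> str:
    """Replace the high level qualifier of a data set name."""
    _, _, rest = dsn.partition(".")
    return f"{new_hlq}.{rest}" if rest else new_hlq


def find_collisions(dsns, new_hlq):
    """Return {new name: [old names]} for every new name produced more than once."""
    targets = {}
    for dsn in dsns:
        targets.setdefault(rename_hlq(dsn, new_hlq), []).append(dsn)
    return {new: old for new, old in targets.items() if len(old) > 1}


if __name__ == "__main__":
    # Both data sets map to COLINN.JCL, so a single rename cannot work.
    print(find_collisions(["COLIN.JCL", "SUE.JCL"], "COLINN"))
```

An empty result means one new HLQ is safe for that list of data sets. Anything else means you need a different new HLQ for each original HLQ.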

What’s in the backup?

You can use the following to list the contents (with TYPRUN=NORUN)

RESTORE - 
  DATASET(INCLUDE(**) ) - 
  INDDNAME(TARGET)

Note that because this job does not have REPLACE, it will not overwrite any files.

Using advanced backup facilities.

Each dataset has a changed-flag associated with it. If this bit is on, the data set has been changed. You can display this in the data set section of ISMF. Field 24 – CHG IND, or if you have access to the DCOLLECT output, it is in one of the flags.

If you use

DUMP - 
   DATASET(INCLUDE(COLIN.JCL,COLIN.WLM,COLIN.C) - 
   BY(DSCHA,EQ,YES)) - 
   RESET - 
   OUTDDNAME(TARGET) - 
   COMPRESS

it will backup the data sets, and reset the changed flag. In my case it backed up the 3 data sets I had specified.

When I reran the same job, it backed up NO data sets, giving me a message

ADR383W (001)-DTDSC(01), DATA SET COLIN.JCL NOT SELECTED, 01.
Where 01 means the fully qualified data set name did not pass the INCLUDE, EXCLUDE, and/or BY filtering criteria.

This is because I had specified BY(DSCHA,EQ,YES), which says select only data sets with the changed flag (DSCHA) on. The first DUMP request RESET the flag, so the second DUMP job skipped the data sets.

You can exploit this by backing up all data sets once a week, and just changed data sets during the week.
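One way to drive this is to generate the DUMP control statements depending on the day of the week. The Python sketch below is an illustration only. The function names, the choice of Sunday for the full backup, and using RESET on the full backup as well are all my assumptions. The statements it produces follow the layout of the DUMP examples above.

```python
from datetime import date


def dump_statements(datasets, changed_only):
    """Return ADRDSSU DUMP control statements as one string."""
    include = "   DATASET(INCLUDE(" + ",".join(datasets) + ")"
    lines = [" DUMP -"]
    if changed_only:
        # Select only data sets with the changed flag on.
        lines.append(include + " -")
        lines.append("   BY(DSCHA,EQ,YES)) -")
    else:
        # Full backup: no BY filter.
        lines.append(include + ") -")
    lines += ["   RESET -", "   OUTDDNAME(TARGET) -", "   COMPRESS"]
    return "\n".join(lines)


def statements_for(day, datasets, full_backup_weekday=6):
    """Full backup on one weekday (6 is Sunday for date.weekday()),
    changed data sets only on the other days."""
    changed_only = day.weekday() != full_backup_weekday
    return dump_statements(datasets, changed_only)


if __name__ == "__main__":
    print(statements_for(date.today(), ["COLIN.JCL", "COLIN.WLM", "COLIN.C"]))
```

With a long list of data sets the INCLUDE line would need to be split over several lines to fit in the control statement columns. The sketch does not do that.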

You might need to keep the output of the dump job in a member of a PDS, so you can search for your dataset name to find the date when a backup was done which included the file.

How many backups should I keep?

This depends on whether you are backing up all files, or just changed files. You can use a GDG (generation data group, see here), where each backup is a new generation of the data set. If you specify 3 generations, then when you create the 4th copy, it deletes copy 1, and so on.
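The roll-off behaviour can be pictured with a few lines of Python. This is just a model of "keep the newest N generations", not anything that talks to the catalog; the function name and the copy names are made up.

```python
from collections import deque


def simulate_gdg(limit, backups):
    """Return the backups still kept after each one is added in turn,
    when at most `limit` generations are retained."""
    kept = deque(maxlen=limit)  # the oldest entry drops off when full
    for backup in backups:
        kept.append(backup)
    return list(kept)


if __name__ == "__main__":
    print(simulate_gdg(3, ["copy1", "copy2", "copy3", "copy4"]))
```

With a limit of 3 and four backups, copy1 has rolled off and copy2, copy3 and copy4 remain. If you only back up changed data sets, the generation you need may roll off sooner than you expect, so the limit needs to cover at least one full backup.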

How can I replicate the RACF definitions for MQ on z/OS?

If you are the very careful person who makes all updates to RACF only through batch jobs, then this is easy – take the old jobs, and change the queue manager name and rerun them.

For the other 99.99% of us, read on…

Even if you have been careful to keep track of any changes to security definitions, someone else may have made a change either using the native TSO commands, or via the ISPF panels.
You can list the RACF database, but there is no easy way of listing the RACF database in command format, to allow you to do a global rename, and submit the commands.

I have found two ways of extracting the RACF definitions.

Using an unloaded copy of the RACF database
Using RACF commands to extract and recreate the requests

Using an unloaded copy of the RACF database

I discovered dbsync on a RACF tools repository which does most of the hard work. You can run a RACF utility to unload the RACF database into a flat file (omitting sensitive information like passwords etc). Dbsync is a rexx program which takes two copies of an unloaded database, and generates the RACF commands for the differences. I simply used my existing unloaded file and a null file, and got out the commands to create all of the entries.

The steps are

Unload the RACF database
Get dbsync into your z/OS system
Run DBsync
Edit the files, and remove all lines which are not relevant
Run the output to create/modify the definitions

Unload the database

//IBMUSUN JOB 1,MSGCLASS=H 
//* use the TSO RVARY command to display databases 
//UNLOAD EXEC PGM=IRRDBU00,PARM=NOLOCKINPUT 
//SYSPRINT DD SYSOUT=* 
//INDD1 DD DISP=SHR,DSN=SYS1.RACFDS 
//OUTDD DD DISP=(MOD,CATLG),DSN=COLIN.RACF.UNLOAD, 
// SPACE=(CYL,(1,1)),DCB=(LRECL=4096,RECFM=VB,BLKSIZE=13030)

Of course this assumes you have the authority to create this file. If not, ask a friendly sysprog to run the job, then edit the output to delete all records which do not have MQ in them.

Run dbsync

I had to make the following changes

Dataset 1 was the dataset I created above
Dataset 2 was a dummy

Modify the sort step to output to a temporary output file

//COLINRA JOB 1,MSGCLASS=H 
//* ftp://public.dhe.ibm.com/eserver/zseries/zos/racf/dbsync/ 
//SORT1 EXEC PGM=SORT 
//SYSOUT DD SYSOUT=* 
//SORTIN DD DISP=SHR,DSN=COLIN.RACF.UNLOAD 
//SORTOUT DD DISP=(NEW,PASS),DSN=&TEMP1,SPACE=(CYL,(1,1)) 
//SYSIN DD * 
SORT FIELDS=(5,2,CH,A,7,1,AQ,A,8,549,CH,A) 
ALTSEQ CODE=(F080,F181,F282,F383,F484,F585,F686,F787,F888,F989, 
C191,C292,C393,C494,C595,C696,C797,C898,C999, 
D1A1,D2A2,D3A3,D4A4,D5A5,D6A6,D7A7,D8A8,D9A9, 
E2B2,E3B3,E4B4,E5B5,E6B6,E7B7,E8B8,E9B9) 
OPTION VLSHRT,DYNALLOC=(SYSDA,3) 
/*

Delete the sort of the other data set – as I was using a dummy file

Run dbsync

I changed the bold lines below, the template JCL had

//OUTSCD1 DD DSN=your.dsname.for.outscd1,
// DISP=(NEW,CATLG),

so I changed

your.dsname.for to COLIN.RACF
NEW,CATLG to MOD,CATLG
Upper cased the changed lines using the ucc…ucc ISPF edit line command.

//DBSYNC EXEC PGM=IKJEFT01,REGION=5000K,DYNAMNBR=50,PARM='%DBSYNC' 
//SYSPRINT DD SYSOUT=* 
//SYSTSPRT DD SYSOUT=* 
//SYSTSIN DD DUMMY 
//SYSEXEC DD DISP=SHR,DSN=COLIN.DBSYNC.REXX 
//OPTIONS DD * 
/* your options here 
//INDD1 DD DISP=SHR,DSN=*.SORT1.SORTOUT 
//INDD2 DD DUMMY 
//OUTADD1 DD DSN=COLIN.RACF.ADDFILE1, 
// DISP=(MOD,CATLG), 
// UNIT=SYSDA,SPACE=(CYL,(25,25),RLSE), 
// DCB=(RECFM=VB,LRECL=255,BLKSIZE=6400) 
etc

The output was rexx commands in a file, such as

"rdefine MQCMDS CSQ9.** owner(IBMUSER ) uacc(CONTROL )
    audit(failures(READ )) level(00)"
"permit CSQ9.** class(MQCMDS) reset"
"rdefine MQQUEUE CSQ9.** owner(IBMUSER ) uacc(NONE )
     audit(failures(READ )) level(00) warning notify(IBMUSER )"
"permit CSQ9.** class(MQQUEUE) reset"
"rdefine MQCONN CSQ9.BATCH owner(IBMUSER ) uacc(CONTROL )
    audit(failures(READ )) level(00)"
"permit CSQ9.BATCH class(MQCONN) reset"
"rdefine MQCONN CSQ9.CHIN owner(IBMUSER ) uacc(READ )
    audit(failures(READ )) level(00)"
"permit CSQ9.BATCH class(MQCONN) id(IBMUSER ) access(ALTER )"
"permit CSQ9.BATCH class(MQCONN) id(START1 ) access(UPDATE )"
"permit CSQ9.CHIN class(MQCONN) id(IBMUSER ) access(ALTER )"

You edit and run the Rexx exec to issue the commands.
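The edit is mostly a global change of the queue manager name. If you have downloaded the generated commands as text, it could be sketched in Python as below. This is an illustration, not part of dbsync: the function name and the new name CSQA are made up, and I am assuming the queue manager name only needs changing where it is the first qualifier of a profile (followed by a period).

```python
import re


def rename_qmgr(commands, old, new):
    """Change the queue manager qualifier in each command line.

    Only matches `old` when it is followed by a period and is not the
    tail of a longer name, so XCSQ9.FOO or owner(CSQ9) are left alone.
    """
    pattern = re.compile(rf"(?<![A-Z0-9@#$]){re.escape(old)}(?=\.)")
    return [pattern.sub(new, line) for line in commands]


if __name__ == "__main__":
    generated = [
        '"rdefine MQCONN CSQ9.BATCH owner(IBMUSER ) uacc(CONTROL )',
        '    audit(failures(READ )) level(00)"',
        '"permit CSQ9.BATCH class(MQCONN) id(START1 ) access(UPDATE )"',
    ]
    for line in rename_qmgr(generated, "CSQ9", "CSQA"):
        print(line)
```

Check the result by eye before running it. Owners, userids and notify values are deliberately left unchanged, and they may not exist on the target system.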

Easy – it took me less than half an hour from start to finish.

Using RACF commands to extract and recreate the requests

I found that most people do not have access to an unloaded RACF database. My normal userid does not have the authority to create the unloaded copy.

I put an exec up on Github. It issues a display command for each class in MQCMDS MXCMDS MQQUEUE MXQUEUE MXTOPIC MQADMIN MXADMIN MQCONN and formats it as an RDEFINE command, and then issues the permit command to give people access to it. It writes the output into the file being edited.

Use ISPF to edit a member where you want the output.

Make sure the rexx exec is in the SYSPROC or SYSEXEC concatenation, for example use ISRDDN to check.

Syntax

genclass <queuemanagername>

The output is like

 /* class:MXCMDS profile:MQPA class not found 
 /* class:MXQUEUE profile:MQPA profile not found 
 /* class:MXTOPIC profile:MQPA profile not found 
 /* class:MXADMIN profile:MQPA profile not found 
RDEFINE MQCONN - 
MQPA.CICS - 
- /* Create date 07/17/20 
OWNER(ADCDA) - 
- /* Last reference Date 07/17/20 
- /* Last changed date 07/17/20 
- /* Alter count 0 
- /* Control count 0 
- /* Update count 0 
- /* Read count 0 
UACC(NONE) - 
LEVEL(0) - 
- /* Global audit NONE 
/* Permit MQPA.CICS CLASS(MQCONN ) RESET 
Permit MQPA.CICS CLASS(MQCONN ) ID(ADCDA ) ACCESS(ALTER ) 
Permit MQPA.CICS CLASS(MQCONN ) ID(START1 ) ACCESS(READ ) 
/* class:MQCONN profile:MQPA.CICS profile not found

It includes a commented-out Permit… RESET line, which you can uncomment if you want to remove all existing access first.