When is activity trace enabled?

I found the documentation for activity trace was not clear about when the activity trace settings take effect.

In mqat.ini you can provide information as to which applications (including channels) you want traced.

For example
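A minimal ApplicationTrace stanza might look like this (a sketch; myprog is an illustrative application name, and the stanza can contain other keys such as ApplClass):

```ini
ApplicationTrace:
   ApplName=myprog
   Trace=ON
```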


This file and trace value are checked when the application connects.  If you have TRACE=ON when the application connects, and you change it to TRACE=OFF, it will continue tracing.

If you have TRACE=OFF specified, and the application connects, changing it to TRACE=ON will not produce any records.


  • TRACE=ON: the application will be traced.
  • TRACE=OFF: the application will not be traced.
  • TRACE= (or omitted): tracing depends on alter qmgr ACTVTRC(ON|OFF). For a long running transaction, if you use alter qmgr to turn tracing on and later off, you will get trace records for the application for the period in between.

If you have
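something like this (a sketch of the stanzas; the program names come from the text below):

```ini
ApplicationTrace:
   ApplName=prog*
   Trace=OFF

ApplicationTrace:
   ApplName=progput
   Trace=ON
```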



then program progput will have trace turned on because the definition is more specific.

You could have
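for example (a sketch of the ApplicationTrace stanza syntax):

```ini
ApplicationTrace:
   ApplName=prog*
   Trace=ON

ApplicationTrace:
   ApplName=progzzz
   Trace=OFF
```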



to be able to turn trace on for all programs beginning with prog, but not to trace progzzz.


Thanks to Morag of MQGEM who got in contact with me, and said that long running tasks are notified of a change to the mqat.ini file, if the file has changed and a queue manager attribute has been changed – even if it is changed to the same value.

This and lots of other great info about activity trace (a whole presentation’s worth of information) is available here.

The power of exclude in ISPF edit

I hope most people know about the ISPF edit exclude facility.  You can issue commands like

X ALL /* exclude everything */
DELETE ALL X /* delete everything which is excluded */
F ZZZZ all
F AAAA all
FLIP /* make the excluded visible, and the displayed hidden */
CHANGE 'ABC' 'ZYX' ALL NX /* do not change hidden lines */

You can also exploit this in macros.  See ISREDIT XSTATUS.

/* rexx */ 
"ISREDIT MACRO"                      /* this is an edit macro         */
"ISREDIT EXCLUDE 'AAA' ALL PREFIX"   /* just like the command         */
"ISREDIT LOCATE .ZLAST"              /* go to the last line           */
"ISREDIT (r,c) = CURSOR"             /* get the row and column number */
/* r has the row number of the last line */
do i = 1 to r
  "ISREDIT (xnx) = XSTATUS "i        /* get X or NX for line i        */
  say "Line exclude status is" i xnx
  "ISREDIT (line) = LINE "i          /* extract the line              */
  if (...) then                      /* your own test goes here       */
    "ISREDIT XSTATUS "i" = X"        /* exclude it                    */
end

You can also see if a line has been changed this edit session:

"ISREDIT (status) = LINE_STATUS "i /* for line i */
if (substr(status,8,1) = '1') then
say "Line" i " was changed this session "



It is now so easy to have hide/show text in your html page. Auto-twisties are here.

Click me to show the rest of the information. Go on, click me

This text is hidden until the summary line is clicked.


This is done using

<details><summary> This is displayed</summary>this is hidden until the summary is clicked</details>

So easy and impressive, isn't it? You used to have to set up scripts etc to do this. Now it is built in. Of course, you'll need to format the summary so it looks clickable.

Note: another technique, the accordion, requires JavaScript to work. <details>… works without JavaScript.

You can also have hover text using <span title="hovertext">displayed text</span>.

How to backup only key data sets on z/OS

I’ve been backing up my datasets on z/OS, and wondered what the best way of doing it was.

I wanted to backup datasets containing data I wanted to keep, but did not want to backup other data sets which could easily be recreated, such as IPCS dump dataset, the output of compiles, or the SMF records.

DFDSS has a backup and restore program which is very powerful.  With it you can

  • Process data sets under a High Level Qualifier – include or exclude data sets.
  • Backup only changed data sets
  • Backup individual files in a ZFS or USS – but this is limited: you have to explicitly specify the files you want to backup. You cannot backup a directory.

You cannot backup individual members of a PDS(E). You have to backup the whole PDS(E). If you need to restore a member, restore the backup with a different HLQ and select the members from that.

What should I use?

I tend to use XMIT and DFDSS – the Storage Management component on z/OS. DFDSS tends to be used by the data managers, as it can backup groups of data sets, volumes, etc.

Backing up using XMIT.

This has the advantage that the output file is a card image, which is a portable format.

I have a job

// SET TODAY='D201224'


  • P is the name of the dataset
  • TODAY is where I set today's date

The backup procedure has


The command that gets generated when P is USER.Z24A.PROCLIB and DD=D201224 is
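something like this (a sketch – node.userid is whatever your JES node and userid are):

```text
XMIT node.userid DATASET('USER.Z24A.PROCLIB') -
     OUTDATASET('BACKUP.USER.Z24A.PROCLIB.D201224')
```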


This makes it easy to find the backups for a file, and for a particular date.

To restore a file you use the command TSO RECEIVE INDSN('BACKUP.USER.Z24A.PROCLIB.D201224').

Using DFDSS to backup

This is a powerful program, and it is worth taking baby steps to understand it.

The basic job is

// SPACE=(CYL,(50,50),RLSE)
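Only the SPACE line of my job survives above; a sketch of the whole dump job, based on the description below (the JCL details are illustrative):

```jcl
//BACKUP   EXEC PGM=ADRDSSU,PARM='TYPRUN=NORUN'
//SYSPRINT DD SYSOUT=*
//TARGET   DD DSN=COLIN.BACKUP.DFDSS,DISP=(NEW,CATLG),
//            SPACE=(CYL,(50,50),RLSE)
//SYSIN    DD *
  DUMP DATASET(INCLUDE(COLIN.JCL,COLIN.WLM,COLIN.C) -
       BY(DSCHA,EQ,YES)) -
       OUTDDNAME(TARGET) COMPRESS
/*
```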
  • For the syntax of the dump data set command see here.
  • This dumps the specified data sets (COLIN.JCL, COLIN.WLM, COLIN.C) and puts them in one file through //TARGET. TARGET is defined as a dataset (COLIN.BACKUP.DFDSS).
  • This does not actually do the backup because it has TYPRUN=NORUN.
  • You can specify many filter criteria, in the BY(…) such as last reference, size, etc.  See here.
  • The BY(DSCHA,EQ,YES) says dump datasets only if they have the "changed flag" set. The changed flag is set when a data set has been opened for output. Using ADRDSSU with the RESET option resets the changed flag. This allows you to backup only data sets which have changed – see below.
  • It compresses the files as it backs them up.

I did have


which says backup all data sets with the High Level Qualifier COLIN.**, but exclude the listed files. I ran this using TYPRUN=NORUN, and it listed 100+ datasets. Whoops – so I changed it to explicitly include the files I wanted to backup. Once I had determined the files I wanted to backup, I removed the TYPRUN=NORUN and backed up the datasets.

Using DFDSS to restore

You can restore from the DFDSS backups using a job like
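A sketch of such a restore job (the dataset names follow the examples in this post; details are illustrative):

```jcl
//RESTORE  EXEC PGM=ADRDSSU,PARM='TYPRUN=NORUN'
//SYSPRINT DD SYSOUT=*
//TARGET   DD DISP=SHR,DSN=COLIN.BACKUP.DFDSS
//SYSIN    DD *
  RESTORE DATASET(INCLUDE(COLIN.JCL,COLIN.WLM,COLIN.C)) -
       RENAMEU(COLINN) -
       INDDNAME(TARGET) CATALOG
/*
```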


This says restore the files specified in the INCLUDE…, rename the HLQ to COLINN, and read from the backup dataset via //TARGET.

Initially I specified PARM='TYPRUN=NORUN' so it did not actually try to restore the files. It reported


From the time stamp 2020.359 17:16:44 we can see I was using the expected backup.

Once you are happy you have the right backup, and list of data sets, you can remove the PARM='TYPRUN=NORUN' to restore the data.

If you have backed up COLIN.JCL and SUE.JCL, and try to rename on restore with a single new HLQ (so you do not overwrite existing files), it would fail because it would create COLINN.JCL from COLIN.JCL and then try to create COLINN.JCL from SUE.JCL! To get round this, use INCLUDE(COLIN.**) RENAMEU(COLINN) and INCLUDE(SUE.**) RENAMEU(SUEN) in separate RESTORE commands.


What’s in the backup?

You can use the following to list the contents  (with TYPRUN=NORUN)
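For example (a sketch – a RESTORE with TYPRUN=NORUN lists what would be restored without restoring anything):

```jcl
//LIST     EXEC PGM=ADRDSSU,PARM='TYPRUN=NORUN'
//SYSPRINT DD SYSOUT=*
//TARGET   DD DISP=SHR,DSN=COLIN.BACKUP.DFDSS
//SYSIN    DD *
  RESTORE DATASET(INCLUDE(**)) INDDNAME(TARGET)
/*
```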


Note: because this job does not have REPLACE, it will not overwrite any files.

Using advanced backup facilities.

Each dataset has a changed flag associated with it. If this bit is on, the data set has been changed since the flag was last reset. You can display this in the data set section of ISMF (field 24 – CHG IND), or, if you have access to the DCOLLECT output, it is in one of the flags.

If you use
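a DUMP command with the RESET keyword added (a sketch):

```jcl
  DUMP DATASET(INCLUDE(COLIN.JCL,COLIN.WLM,COLIN.C) -
       BY(DSCHA,EQ,YES)) -
       OUTDDNAME(TARGET) RESET COMPRESS
```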


it will backup the data sets, and reset the changed flag.  In my case it backed up the 3 data sets I had specified.

When I reran the same job, it backed up no data sets, giving me a message

where 01 means "The fully qualified data set name did not pass the INCLUDE, EXCLUDE, and/or BY filtering criteria".

This is because I had specified BY(DSCHA,EQ,YES), which says filter by data sets with the change flag (DSCHA) on. The first DUMP request RESET the flag; the second DUMP job skipped the data sets.

You can exploit this by backing up all data sets once a week, and just changed data sets during the week.

You might need to keep the output of the dump job in a member of a PDS, so you can search for your dataset name to find the dates of the backups which included the file.

How many backups should I keep?

This depends on whether you are backing up all files or just changed files. You can use a GDG (see here), where you keep generations of a dataset. If you specify 3 generations, then when you create the 4th copy, generation 1 is deleted, and so on.
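A GDG base can be defined with IDCAMS; a sketch (the dataset name is illustrative):

```jcl
//DEFGDG   EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  DEFINE GDG(NAME(COLIN.BACKUP) LIMIT(3) SCRATCH)
/*
```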

How can I replicate the RACF definitions for MQ on z/OS?

If you are the very careful person who makes all updates to RACF only through batch jobs, then this is easy – take the old jobs, and change the queue manager name and rerun them.

For the other 99.99% of us,  read on…

Even if you have been careful to keep track of any changes to security definitions,  someone else may have made a change either using the native TSO commands, or via the ISPF panels. 
You can list the RACF database, but there is no easy way of listing the RACF database in command format, to allow you to do a global rename, and submit the commands.

I have found two ways of extracting the RACF definitions.

  1. Using an unloaded copy of the RACF database
  2. Using RACF commands to extract and recreate the requests

Using an unloaded copy of the RACF database

I discovered dbsync on a RACF tools repository which does most of the hard work.   You can run a RACF utility to unload the RACF database into a flat file (omitting sensitive information like passwords etc).  Dbsync is a rexx program which takes two copies of an unloaded database, and generates the RACF commands for the differences. I simply used my existing unloaded file and a null file, and got out the commands to create all of the entries.

The steps are

  1. Unload the RACF database
  2. Get dbsync into your z/OS system
  3. Run DBsync
  4. Edit the files, and remove all lines which are not relevant
  5. Run the output to create/modify the definitions

Unload the database

//* use the TSO RVARY command to display databases
// SPACE=(CYL,(1,1)),DCB=(LRECL=4096,RECFM=VB,BLKSIZE=13030)
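Only fragments of my job survive above; a sketch of an unload job using IRRDBU00, the RACF database unload utility (the RACF database name is illustrative – use the TSO RVARY command to display the databases, as the comment says):

```jcl
//UNLOAD   EXEC PGM=IRRDBU00,PARM=NOLOCKINPUT
//SYSPRINT DD SYSOUT=*
//INDD1    DD DISP=SHR,DSN=SYS1.RACFDS
//OUTDD    DD DSN=COLIN.RACF.UNLOAD,DISP=(NEW,CATLG),
//            SPACE=(CYL,(1,1)),DCB=(LRECL=4096,RECFM=VB,BLKSIZE=13030)
```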

Of course this assumes you have the authority to create this file. If not, ask a friendly sysprog to run the command, then edit the output to delete all records which do not have MQ in them.

Run dbsync

I had to make the following changes

  1. Dataset 1 was the dataset I created above
  2. Dataset 2 was a dummy

Modify the sort step to output to a temporary output file

//* ftp://public.dhe.ibm.com/eserver/zseries/zos/racf/dbsync/
SORT FIELDS=(5,2,CH,A,7,1,AQ,A,8,549,CH,A)
ALTSEQ CODE=(F080,F181,F282,F383,F484,F585,F686,F787,F888,F989,

Delete the sort of the other data set – as I was using a dummy file

Run dbsync

I changed the bold lines below, the template JCL had

//OUTSCD1 DD DSN=your.dsname.for.outscd1,

so I changed

  • your.dsname.for to COLIN.RACF
  • Upper cased the changed lines using the ucc…ucc ISPF edit line command.
/* your options here */

The output was rexx commands in a file, such as

"rdefine MQCMDS CSQ9.** owner(IBMUSER ) uacc(CONTROL )
    audit(failures(READ )) level(00)"
"permit CSQ9.** class(MQCMDS) reset"
"rdefine MQQUEUE CSQ9.** owner(IBMUSER ) uacc(NONE )
     audit(failures(READ )) level(00) warning notify(IBMUSER )"
"permit CSQ9.** class(MQQUEUE) reset"
"rdefine MQCONN CSQ9.BATCH owner(IBMUSER ) uacc(CONTROL )
    audit(failures(READ )) level(00)"
"permit CSQ9.BATCH class(MQCONN) reset"
"rdefine MQCONN CSQ9.CHIN owner(IBMUSER ) uacc(READ )
    audit(failures(READ )) level(00)"
"permit CSQ9.BATCH class(MQCONN) id(IBMUSER ) access(ALTER )"
"permit CSQ9.BATCH class(MQCONN) id(START1 ) access(UPDATE )"
"permit CSQ9.CHIN class(MQCONN) id(IBMUSER ) access(ALTER )"

You edit and run the Rexx exec to issue the commands.

Easy – it took me less than half an hour from start to finish.

Using RACF commands to extract and recreate the requests

I found that most people do not have access to an unloaded RACF database.  My normal userid does not have the authority to create the unloaded copy. 

I put an exec up on GitHub. It issues a display command for each of the classes MQCMDS, MXCMDS, MQQUEUE, MXQUEUE, MXTOPIC, MQADMIN, MXADMIN and MQCONN, formats the output as an RDEFINE command, and then issues the permit commands to give people access. It writes the output into the file being edited.

Use ISPF to edit a member where you want the output.

Make sure the rexx exec is in the SYSPROC or SYSEXEC concatenation, for example use ISRDDN to check.


genclass <queuemanagername>

The output is like

 /* class:MXCMDS profile:MQPA class not found 
/* class:MXQUEUE profile:MQPA profile not found
/* class:MXTOPIC profile:MQPA profile not found
/* class:MXADMIN profile:MQPA profile not found
- /* Create date 07/17/20
- /* Last reference Date 07/17/20
- /* Last changed date 07/17/20
- /* Alter count 0
- /* Control count 0
- /* Update count 0
- /* Read count 0
LEVEL(0) -
- /* Global audit NONE
/* class:MQCONN profile:MQPA.CICS profile not found

It includes a Permit… RESET if you want to remove all access

How to capture TSO command output into an ISPF edit session

One way of doing this, is to run a TSO command in batch, use SDSF to look at the output, and then use SE to edit the spool file.

This was too boring, so I wrote a quick macro ETSO

/* REXX */
"ISREDIT MACRO (cmd rest)"   /* edit macro; pick up the command to run */
parse source p1 p2 p3 p4 p5 p6 p7 env p9
if env <> "ISPF" then do
  Say "This has to run from an ISPF edit session "
  return 8
end
/* prevent auto save ... user has to explicitly save or cancel */
"ISREDIT autoSave off prompt"
x = OUTTRAP('VAR.')
address tso cmd rest
y = OUTTRAP('OFF')
/* insert each record in turn, last to first, at the top of the file */
do i = var.0 to 1 by -1
  "ISREDIT LINE_AFTER 0 = '"var.i"'"
end

This macro needs to go into a dataset in the SYSEXEC or SYSPROC concatenation – see TSO ISRDDN. I typed etso LU in an edit session, and it inserted the output of TSO LU at the top of the file.

For capturing long lines of output, you’ll need to be in a file with a long record length.


To create a temporary dataset to run the above command, I created another rexx exec called TEMPDSN.

/* REXX */
parse source p1 p2 p3 p4 p5 p6 p7 env p9
if env <> "ISPF" then do
  Say "This has to run from an ISPF session "
  return 8
end
u = userid()
address TSO "ALLOC DSN('"u".temptemp') space(1,1) tracks NEW DELETE ",
"BLKSIZE(1330) LRECL(133) recfm(F B) DDNAME(MYDD)"
if (rc <> 0) then exit
address ISPEXEC "EDIT DATASET('"u".temptemp')"

You can then do even cleverer things by having TEMPDSN call ETSO as a macro, passing parameters from the TSO command line using the ISPF variable support.

Updated TEMPDSN, changes in bold.

/* REXX */ 
parse source p1 p2 p3 p4 p5 p6 p7 env p9
say "env" env
if env <> "ISPF" then do
  Say "This has to run from an ISPF session "
  return 8
end
tempargs = ""
/* if we were passed any parameters */
parse arg args
if ( length(args) > 0) then
  tempargs = args
address ISPEXEC "VPUT tempargs PROFILE"

u = userid()
address TSO "ALLOC DSN('"u".temptemp') space(1,1) tracks NEW DELETE ",
"BLKSIZE(1330) LRECL(133) recfm(F B) DDNAME(MYDD)"
if (rc <> 0) then exit
address ISPEXEC "EDIT DATASET('"u".temptemp') MACRO(ETSO)"

if ( length(tempargs) > 0) then
  address ISPEXEC "VERASE tempargs PROFILE"

and the macro ETSO has

/* REXX */ 
"ISREDIT MACRO (cmd rest)"   /* edit macro; pick up any arguments */
parse source p1 p2 p3 p4 p5 p6 p7 env p9
if env <> "ISPF" then do
  Say "This has to run from an ISPF edit session "
  return 8
end
/* prevent auto save ... user has to explicitly save or cancel */
"ISREDIT autoSave off prompt"
if (length(cmd) = 0) then do
  address ISPEXEC "VGET tempargs PROFILE"
  if (rc = 0) then
    cmd = tempargs
end
x = OUTTRAP('VAR.')   /* capture the TSO command output */
address tso cmd rest
y = OUTTRAP('OFF')
/* insert each record at the top of the file */
do i = var.0 to 1 by -1
  "ISREDIT LINE_AFTER 0 = '"var.i"'"
end

From ISPF 6, the TSO command line, I issued tempdsn racdcert list certauth and it gave me a file with the output from the command; the file has a record length of 133. If you try to save the data, the data set will still be deleted on close, because it was allocated NEW DELETE.



Familiarity makes you blind to ugliness.

I’ve been using z/OS in some form for over 40 years and I do not see many major problems with it (some other people may – but it cannot be very many). It is the same with MQ on z/OS – but since I left IBM, and did not use MQ on z/OS for a year or so, and am now coming to it as a customer, I do see a few things that could be improved. When these products were developed, people were grateful for any solution which worked, rather than an elegant solution which did not work. In general people do not like to be told their baby is ugly.

I have been programming in Python, and gone back to C to extract certificate information from RACF, and C looks really ugly in comparison.  It may be lack of vision from the people who managed the C language, or lack of vision from the people who set standards, but the result is ugly.

My little problem (well, one of them)

I have decoded a string returned from RACF and it is in an ugly structure which uses

typedef struct _attribute {
  oid      identifier;
  buffer * pValue;
  int      lName;
} x509_rdn_attribute;

typedef enum {
  unknown    = 0,
  name       = 1,
  surname    = 2,
  givenName  = 3,
  initials   = 4,
  commonName = 6
} oid;

My problem is that I have an oid identifier of 6 – how do I map this to the string "commonName"?

I could use a big switch statement, but that means I have to preprocess the list, or type it all in. I would then need to keep an eye on the provided definitions, and update my file if they change. The product could provide a function to do the work for me. A good start – but perhaps the wrong answer.

If I ruled the world…

As well as "Every day would be the first day of Spring", I would have the C language provide a function for each enum, so mune_oid(2) returns "surname".

I wrote some code to interpret distributed MQ trace, and I had to do this reverse mapping for many field types. In the end, I wrote a Python script which took the CMQC.h header file and transformed each enum into a function which did the switch and returned the symbolic name for the number.

I came up with the idea of using definitions like colin.h with clever macros,

TYPEDEF_ENUM(oid)
  ENUM(unknown,0,descriptive comment)
  ENUM(name,1,persons name)
END_TYPEDEF_ENUM(oid)

For normal usage I would define macros

#define TYPEDEF_ENUM(a)      typedef enum {
#define ENUM(a,b,c)          a=b, /* c */
#define END_TYPEDEF_ENUM(a)  } a;
#include <colin.h>

For the lookup processing the macros could be

#define TYPEDEF_ENUM(a)      char * lookup_##a(int in) { switch(in) {
#define ENUM(a,b,c)          case b: return #a; /* c */
#define END_TYPEDEF_ENUM(a)  } return "Unknown"; }
#include <colin.h>

but this solution means I have to include colin.h more than once, which may cause duplicate definitions.

My solution looks uglier than the problem, so I think I’ll just stick to my Python solution and creating a mune… function.
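A common way round the multiple-include problem is the "X-macro" pattern: keep the list in one macro in the same file and expand it twice, once as the enum and once as the lookup. This is a sketch of the idea using the oid values from the example above (the comments and helper names are illustrative):

```c
/* One master list of (name, value, comment), maintained in one place. */
#define OID_LIST(X) \
    X(unknown,    0, "unknown attribute") \
    X(name,       1, "person's name")     \
    X(surname,    2, "surname")           \
    X(givenName,  3, "given name")        \
    X(initials,   4, "initials")          \
    X(commonName, 6, "common name")

/* Expand the list once as the enum definition... */
#define AS_ENUM(n, v, c) n = v,
typedef enum { OID_LIST(AS_ENUM) } oid;
#undef AS_ENUM

/* ...and once as the number-to-name lookup. */
#define AS_CASE(n, v, c) case v: return #n;
const char *oid_name(oid id) {
    switch (id) { OID_LIST(AS_CASE) }
    return "Unknown";
}
#undef AS_CASE
```

With this, oid_name(commonName) gives the string "commonName", and any value not in the list falls through to "Unknown" – and the list of values exists in exactly one place.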


Here’s another nice mess I’ve gotten into!

Or "How to clean up the master catalog when you have filled it up with junk". Looking at my z/OS system, I was reminded of my grandfather's garage/workshop, where the tools were carefully hung up on walls, and the chisels were carefully stored in a cupboard to keep them sharp. He had boxes of screws, different sizes and types in different boxes. My father had a shed with a big box of tools. In the box were his chisels, hammers, saws etc. He had a big jar of "Screws – miscellaneous – to be sorted". The z/OS master catalog should be like my grandfather's garage, but I had made it like my father's shed.

Well, what a mess I found!   This blog post describes some of the things I had to do to clean it up and follow best practices.

In days of old, well before PCs were invented, all data sets were cataloged in the master catalog.  Once you got 10s of thousands of data sets on z/OS, the time to search the catalog for a dataset name increased, and the catalogs got larger and harder to manage.  They solved this about 40 years ago by developing the User Catalog  – a catalog for User entries.
Instead of 10,000 entries for my COLIN.* data sets, there should be an alias COLIN in the Master Catalog which points to another catalog, which could be just for me, or could be shared with other users. This means that even if I have 1 million datasets in the user catalog, the access time for system datasets is not affected. What I expected to see in the master catalog is the system datasets, and aliases for the userids. Instead I had over 400 entries for COLIN.* datasets, 500 BACKUP.COLIN.* datasets, 2000 MQ.ARCHIVE.* datasets, etc. What a mess!

Steps to solve this.

Prevention is better than cure.

You can use the RACF Option PROTECTALL.  This says a userid needs a RACF profile before it can create a dataset.  This means each userid (and group) needs a profile  like ‘COLIN.*’, and give the userid access to this profile.  Once you have done this for all userids, you can use the RACF command SETROPTS PROTECTALL(WARNING) to enable this support.   This will allow users to create datasets, when there is no profile, but produces a warning message on the operator console – so you can fix it.  An authorised person can use SETROPTS NOPROTECTALL to turn this off.  Once you have this running with no warnings you can use the command SETROPTS PROTECTALL to make it live – without warnings, and you will live happily ever after, or at least till the next problem.


  1. Whenever you create a userid you need to create the RACF dataset profile for the userid.
  2. You also need to set up an ALIAS for the new userid to point to a User Catalog.

How bad is the problem?

You can use IDCAMS to print the contents of a catalog
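For example (a sketch – the master catalog name is illustrative):

```jcl
//LIST     EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  LISTCAT CATALOG(CATALOG.Z24A.MASTER)
/*
```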


This has output like

ALIAS --------- BAQ300

This says there are datasets BACKUP… which should not be in the catalog.
There is an Alias BAQ300 which points to a user catalog.   This is what I expect.

The IDCAMS command
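something like (the master catalog name is illustrative):

```jcl
  LISTCAT ALIAS CATALOG(CATALOG.Z24A.MASTER)
```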


lists all of the aliases in the catalog, for example

ALIAS --------- BAQ300 

This shows that for the high level qualifier BAQ300, the system goes and looks in the user catalog USERCAT.Z24A.PRODS.

Moving the entries out of the Master Catalog

The steps to move the COLIN.* entries out of the Master Catalog are

  1. Create a User Catalog
  2. Create an ALIAS COLIN2 which points to this User Catalog. 
  3. Rename COLIN…. to COLIN2….
  4. Create an ALIAS COLIN for all new data sets.
  5. Rename COLIN2… to COLIN…
  6. Delete the ALIAS COLIN2.

Create a user catalog

Use IDCAMS to create a user catalog

MEGABYTES(15 15) -
FREESPACE(10 10) -
STRNO(3 ) ) -
BUFND(4) ) -
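The fragment above came from a DEFINE USERCATALOG command; a fuller sketch (the catalog name and volume are illustrative):

```jcl
//DEFCAT   EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  DEFINE USERCATALOG ( -
    NAME(USERCAT.COLIN) -
    VOLUME(A4USR1) -
    MEGABYTES(15 15) -
    FREESPACE(10 10) -
    STRNO(3) -
    BUFND(4) )
/*
```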

To list what is in a user catalog

Use an IDCAMS LISTCAT command similar to the one for the master catalog, specifying the user catalog's name.


Create an alias for COLIN2
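An IDCAMS sketch (the user catalog name is illustrative):

```jcl
  DEFINE ALIAS(NAME(COLIN2) RELATE(USERCAT.COLIN))
```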


Get the COLIN.* entries from the Master Catalog into the User Catalog

This was a bit of a challenge, as I could not see how to do a global rename.

You can rename non-VSAM datasets either using ISPF 3.4 or using the TSO rename command in batch.

The problem occurs with the VSAM data sets.   When I tried to use the IDCAMS rename, I got an error code IGG0CLE6-122 which says I tried to do a rename, which would cause a change of catalog.

The only way I found of doing it was to copy the datasets to a new High Level Qualifier, and delete the originals.   Fortunately DFDSS has a utility which can do this for you.
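A sketch of such a copy job – a DFDSS COPY with RENAMEU to change the HLQ and DELETE to remove the originals (the JCL details are illustrative):

```jcl
//COPY     EXEC PGM=ADRDSSU
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  COPY DATASET(INCLUDE(COLIN.**)) -
       RENAMEU(COLIN2) -
       DELETE CATALOG
/*
```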


Most of the data sets were “renamed” to COLIN2… but I had a ZFS which was in use, and some dataset aliases.  I used

  • the TSO command unmount filesystem('COLIN.ZCONNECT.BETA.ZFS')
  • the IDCAMS command DELETE COLIN.SCEERUN ALIAS for each of the aliases.

and reran the copy job.   This time it renamed the ZFS.  The renaming steps are

  • Check there are no datasets with the HLQ COLIN.
  • Define an alias for COLIN in the master catalog to point to a user catalog.
  • Rerun the copy job to copy from COLIN2 back to COLIN.
  • Mount the file system.
  • Redefine the alias to data sets (eg COLIN.SCEERUN).
  • Delete the alias for COLIN2.

To be super efficient, and like my grandfather, I could have upgraded the SMS ACS routines to force data sets to have the “correct” storage class, data class, or management class.  The job output showed  “Data set COLIN2.STOQ.CPY has been allocated with newname  COLIN.STOQ.CPY using STORCLAS SCBASE,  no DATACLAS, and no MGMTCLAS“.  These classes were OK for me, but may not be for a multi-user z/OS system.

One last thing, don’t forget to add the new user catalog to your list of datasets to backup.

Should I run MQ for Linux, on z/OS on my Linux?

Yes – this sounds crazy.  Let me break it down.

  1. zPDT is a product from IBM that allows me to run System/390 applications on my laptop. I am running z/OS 2.4 and MQ 9.1.3. For normal editing and running MQ, it feels as fast as when I had my own real hardware at IBM.
  2. z/OS 2.4 can run docker images in a special z/OS address space called zCX.
  3. I can run MQ distributed in docker.

Stringing these together, I can run MQ distributed in a docker environment; the docker environment runs on z/OS, and the z/OS runs on my laptop! To learn more about running this scenario on real z/OS hardware… 

20 years ago, someone was enthusiastically telling me how you could partition distributed servers using a product called VMWare.    They kept saying how good it was, and asking why wasn’t I excited about it.  I said that when I joined IBM – 20 years before the discussion (so 40 years ago), the development platform was multiple VS1 (an early MVS) running on VM/360.  Someone had VM/360 running under VM/360 with VS1 running on that!  Now that was impressive!

Now if only I could get z/OS to run in docker….

Should I Use MQ workflow in z/OSMF or not? Not.

It was a bit windy up here in Orkney, so instead of going out for a blustery walk, I thought I would have a look at the MQ Workflow in z/OSMF; in theory it makes it easier to deploy a queue manager on z/OS. It took a couple of days to get z/OSMF working, a few hours to find there were no instructions on how to use the MQ workflow in z/OSMF, and after a few false starts, I gave up because there were so many problems! I hate giving up on a challenge, but I could not find solutions to the problems I experienced.

I’ve written some instructions below on how I used z/OSMF and the MQ workflow. They may be inaccurate or missing information as I was unable to successfully deploy a queue manager.

I think that the workflow technique could be useful, but not with the current implementation. You could use the MQ material as an example of how to do things.

I’ve documented below some of the holes I found.

Overall this approach seems broken.

I came to this subject knowing nothing about it. Maybe I should have gone on a two week course to find out how to use it. Even if I had had the education, I think it is broken. It may be that better documentation is needed.

For example

  • Rather than simplifying the process, the project owner may have to learn about creating and maintaining the workflows and the auxiliary files the flows use – for example loops in the JCL files.
  • I have an MQ fix in a library, so I have a QFIX library in my steplib. The generated JCL has just the three libraries SCSQLOAD, SCSQAUTH and SCSQANLE in STEPLIB. I cannot see how to get the QFIX library included. This sort of change should be trivial.
  • If I had upgraded from CSQ913 to CSQ920, I do not see how the workflows get refreshed to point to the new libraries.
  • There are bits missing – setting up the STARTED profile for the queue manager, and specifying the userid.

Documentation and instructions to tell you how to use it.

The MQ documentation points to the program directory, which just lists the contents of the ../zosmf directory. Using your psychic powers you are meant to know that you have to go to the z/OSMF web console and use the Workflows icon. There is a book, z/OSMF Programming Guide, which tells you how to create a workflow – but not how to use one!

The z/OSMF documentation (which you can access from z/OSMF) is available here. It is a bit like “here is all we know about how to use it”, rather than “baby steps for the new user”. There is a lot to read, and it looks pretty comprehensive. My eyes glazed over after a few minutes reading and so it was hard to see how to get started.

Before you start

You need access to the files in the ZFS.

  • For example  /usr/lpp/mqm/V9R1M1/zosmf/provision.xml this is used without change.   This has all of the instructions that the system needs to create the workflow.
  • Copy /usr/lpp/mqm/V9R1M1/zosmf/workflow_variables.properties  to your directory.  You need to edit it and fill in all of the variables between <…>.   There are a lot of parameters to configure!

You need a working z/OSMF where you can logon, use TSO etc.

Using z/OSMF to use the work flow

When I logged on to z/OSMF the first time, I had a blank web page. I played around with the things at the bottom left of the screen, and then I got a lot of large icons. These look a bit of a mess – rather than autoflowing them to fill the screen, you either need to scroll sideways or make your screen very wide (18 inches wide). These icons do not seem to be in any logical order: it starts with "Workflows", and "SDSF" is a distance away from "SDSF settings". I could not see how to make the icons smaller, or how to sort them.

I selected "Classic" from the option by my userid – this was much more usable: it had a compact list of actions, grouped sensibly, down the side of the screen.

Using “workflow” to define MQ queue managers.

Baby steps instructions to get you started are below.

  • Click on the Workflows icon.
  • From the actions pull down, select Create Workflow.
  • In the workflow definition file field, enter /usr/lpp/mqm/V9R1M1/zosmf/provision.xml or whatever workflow file you plan to use.
  • In the Workflow variable input file, enter the name of your edited workflow_variables.properties file.  Once you have selected this the system copies the content. To use a different file, or update the file,  you have to create a new Workflow.  If you update it, any workflows based on it do not get updated.
  • From the System pull down, select the system this is for.
  • Click ‘next’.   This will validate the variables it will be using.  Variables it does not use, do not get checked.
    • It complained that CSQ_ARC_LOG_PFX = <MQ.ARC.LOG.PFX> had not been filled in
    • I changed it to CSQ_ARC_LOG_PFX = MQ.ARCLOG – and it still complained
    • It would only allow CSQ_ARC_LOG_PFX = MQ, pity as my standards were MQDATA.ARCHIVE.queue_manager etc.
  • Once the input files have been validated it displays a window “Create Workflow”.  Tick the box “Assign all steps to owner userid”.  You can (re-)assign them to people later.
  • Click Finish.
  • It displays “Procedure to provision a MQ for zOS Queue manager 0 Workflow_o” and lists all of the steps – all 22 of them.
  • You are meant to do the steps in order.   The first step has State “Ready”, the rest are “Not ready”.
  • I found the interface unreliable.  For example
    • Right click on the Title of the first item.  Choose “Assignment And Ownership”.  All of the items are greyed out and cannot be selected.
    • If you click the tick box in the first column, click on “Actions” pull down above the workflow steps.  Select “Assignment And Ownership”.  You can now  assign the item to someone else.
      • If you select this, you get the “Add Assignees” panel.  By default it lists groups.  If you go to the “Actions” pull down, you can add a SAF userid or group.
      • Select the userids or groups and use the “Add >” to assign the task to them.
  • With the list of tasks, you can pick “select all”, and assign them;  go to actions pull down, select Assignment And Ownership, and add the userids or groups.
  • Once you are back at the workflow you have to accept the task(s).   Select those you are interested in, and pick Actions -> Accept. 
  • Single left click on the Title – “Specify Queue Manager Criteria”.  It displays a tabbed pane with tabs
    • General – a description of the task
    • Details – it says who the task has been assigned to  and its status.
    • Notes – you can add items to this
    • Perform – this actually does the task.
  • Click on the “Perform” tab.  If this is greyed out, it may not have been assigned to you, or you may not have accepted it.
    • It gives a choice of environments, eg TEST, Production.  Select one
    • The command prefix eg !ZCT1.
    • The SSID.
    • Click “Next”.  It gives you Review Instructions; click Finish.
  • You get back to the list of tasks.
    • Task 1 is now marked as Complete.
    • Task 2 is now ‘ready’.
    • Left single click task 2 “Validate the Software Service Instance Name Length”.
    • The Dependencies tab now has an item “Task 1 complete”.  This is all looking good.
  • Note: Instead of going into each task, sometimes you can use the Action -> perform to go straight there – but usually not.
  • Click on Perform
    • Click next
    • It displays some JCL which you can change, click Next. 
    • It displays “review JCL”.
      • It displays Maximum Record Length 1024.  This failed for me – I changed it to 80, and it still used 1024!
    • Click Next.  When I clicked Finish, I got “The request cannot be completed because an error occurred. The following error data is returned: “IZUG476E:The HTTP request to the secondary z/OSMF instance “S0W1” failed with error type “HttpConnectionFailed” and response code “0” .”  The customising book mentions this, saying you get it if you use AT-TLS – which I was not using.  It may be caused by not having the IP address in my Linux /etc/hosts file.  Later, I added the address to the /etc/hosts file on my laptop, restarted z/OSMF and it worked.
    • I unticked “Submit JCL”, and ticked “Save JCL”.  I saved it in a Unix file, and clicked finish.  It does not save the location so you have to type it in every time (or keep it in your clipboard), so not very usable.
    • Although I had specified  Maximum Record Length of 80, it still had 1024.  I submitted the job and it complained with “IEB311I CONFLICTING DCB PARAMETERS”.   I edited to change 1024 to 80 in the SPACE and DCB, and submitted it.  The JCL then worked.
    • When the REXX exec ran, it failed with …The workflow Software Service Instance Name (${_workflow-softwareServiceInstanceName})….  The substitution had not been done.  I don’t know why – but it means you cannot do any work with it.
    • When I tried another step, this also had no customisation done, so I gave up.
    • Sometimes when I ran this, the “Finish” button stayed greyed out, so I was unable to complete the step.  Shutting down and restarting z/OSMF usually fixed it.
  • I looked at the job “Define MQ Queue Manager Security Permissions” – this job creates a profile to disable MQ security – it did not define the security permissions for normal use.
  • I tried the step to dynamically allocate a port for the MQ channel initiator.  I got the same IZUG476E error as before.  I fixed my IP address, and got another error: it received status 404 from the REST request.  In /var/zosmfcp/data/logs/zosmfServer/logs/messages.log I had SRVE0190E: File not found: /resource-mgmt/rest/1.0/rdp/network/port/actions/obtain.  For more information on getting a port see here.

Many things did not work, so I gave up.

Put messages to a queue workflow. 

I tried this, and had a little (very little) more success.

As before I did

  • Workflows
  • Actions pull down
  • Create workflow.   I used
    • /usr/lpp/mqm/V9R1M1/zosmf/putQueue.xml
    • and the same variable input file
  • Lots of clicks – including  Assign all steps to owner userid
  • Click Finish.   This produced a workflow with one step!
  • Left click on the step.  Go to Perform.  This lists
    • Subsystem ID
    • Queue name
    • Message_data
    • Number of messages.
  • Click Next.
  • Click Next, this shows the job card
  • Click Next,  this shows the job.
  • Click Next.  It has “Submit JCL” ticked.  Click Finish.   This time it managed to submit the JCL successfully!
  • After several seconds it displays the “Status” page, and after some more seconds, it displays the job output.
  • There is a tabbed panel with tabs for JESMSGLG, JESJCL, JESYSMSG and SYSPRINT.
  • I had a JCL error – I had made a mistake in the  MQ libraries High level qualifier.
  • I updated my workflow_variables.properties file, but I could not find a way of telling the workflow to use the updated variable properties file.  To solve this I had to
    • Go back to the Workflows screen where it lists the workflows I have created. 
    • Delete the workflow instance.
    • Create a new workflow instance, which picks up the changed file
    • I then deployed it and “performed” the change, and the JCL worked.
    • I would have preferred a quick edit to some JCL and resubmit the job, rather than the relatively long process I had to follow.
  • If this had happened during the “deploy a queue manager” workflow it would have been really difficult.  There is no “Undo Step”, so I think you would have had to create the De-provision workflow – which would have failed because many of the steps would not have been done – or delete the provision workflow, fix the script, and redo all of the steps (or skip them).

If this all worked, would I use it?

There were too many clicks for my liking.  It feels like trying to simplify things has made them more complex.  There are many more things that could go wrong – and many did go wrong, and they were hard to fix.  Some problems I could not fix.  I think these workflows are provided as an example for the customer.  Good luck with it!

A practical guide to getting z/OSMF working.

Every product using Liberty seems to have a different way of configuring it.  At first I thought specifying parameters to z/OSMF was great, as you do it through a SYS1.PARMLIB member.  Then I found you have other files to configure; then I found that I could not reuse my existing definitions from z/OS Connect and MQWEB.  Then I found it erases any local copy of the server.xml file, and rebuilds it from the shipped server.xml and the configuration files each time.  Later I used this to my advantage.  Once again I seemed to be struggling against the product to do simple things.  Having gone through the pain, and learnt how to configure z/OSMF, I find its configuration is OK.

You specify some parameters in the SYS1.PARMLIB(IZUPRMxx) concatenation.  See here for the syntax and the list of parameters.  In mid 2020 APAR PH24088 allowed you to change these parameters dynamically, using a command such as:


Before you start.

I had many problems with certificates before I could successfully logon to z/OSMF.

I initially found it hard to understand where to specify configuration options, as I was expecting to specify them in the server.xml file. See z/OSMF configuration options mapping for the list of options you can specify.

If you change the options you have to restart the server.  Other systems that use Liberty have a refresh option which tells the system to reread the server.xml file.  z/OSMF stores the variables in the bootstrap.options file, and I could not find a refresh command which refreshed the data.  (There is a refresh command, which does not refresh.)  See z/OSMF commands.

 Define the userid

I used a userid with a home directory /var/zosmfcp/data/home/izusvr.   I had to issue

mkdir /var/zosmfcp
chown izusvr /var/zosmfcp

mkdir -p /var/zosmfcp/data/home/izusvr
chown izusvr /var/zosmfcp/data/home/izusvr

touch /var/zosmfcp/configuration/local_override.cfg 
chmod g+r /var/zosmfcp/configuration/local_override.cfg
chown :IZUADMIN /var/zosmfcp/configuration/local_override.cfg
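The commands above can be collected into one small script.  This is a sketch, not IBM's official setup: the base path defaults to ./zosmfcp so it can be rehearsed off-host (on z/OS you would set BASE=/var/zosmfcp), and the chown steps are skipped when the izusvr userid and IZUADMIN group do not exist.

```shell
#!/bin/sh
# Sketch of the z/OSMF directory setup described above.
# BASE defaults to ./zosmfcp for rehearsal; on z/OS use BASE=/var/zosmfcp.
BASE=${BASE:-./zosmfcp}

mkdir -p "$BASE/data/home/izusvr"      # home directory for the server userid
mkdir -p "$BASE/configuration"

# local_override.cfg must exist and be readable by the IZUADMIN group
touch "$BASE/configuration/local_override.cfg"
chmod g+r "$BASE/configuration/local_override.cfg"

# Ownership only makes sense on z/OS, where izusvr and IZUADMIN are defined
chown izusvr "$BASE/data/home/izusvr" 2>/dev/null || \
    echo "chown skipped - izusvr not defined on this system"
chown :IZUADMIN "$BASE/configuration/local_override.cfg" 2>/dev/null || \
    echo "chown skipped - IZUADMIN not defined on this system"

echo "created $BASE/configuration/local_override.cfg"
```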

Getting the digital certificate right.

I had problems using the provided certificate definitions.

  1. The CN did not match what I expected.
  2. There was no ALTNAME specified, for example ALTNAME((IP(…))) (where … was the external IP address of my z/OS).  The browser complained because the certificate was not acceptable.  An ALTNAME must match the IP address or host name the data came from.  Without a valid ALTNAME you can get into a logon loop.  Using Chrome I got
    1. “Your connection is not private”. Common name invalid.
    2. Click on Advanced and “proceed to..  (unsafe)”
    3. Enter userid and password.
    4. I had the display with all of the icons.  Click on the pull-up and switch to “classic interface”.
    5. I got “Your connection is not private”. Common name invalid, and round the loop again.
  3. The keystore is also used as the trust store, so it needs the Certificate Authority’s certificates.  Other products using Liberty use a separate trust store.  (The keystore contains the certificate the server uses to identify itself.  The trust store contains the certificates, such as Certificate Authority certificates, used to validate certificates sent from clients to the server.)   With z/OSMF there is no definition for the trust store.   To make the keystore work as a trust store, the keystore needs:
    1. the CA certificate for the server.  z/OSMF talks to itself over TLS, which means each end of the conversation within the server needs it to validate the server’s certificate.
    2. the CA for any certificates in any browsers being used.
    3. I had to use the following statements to convert my keystore to add the trust store entries.

      LABEL('Linux-CA2') -
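Only the LABEL('Linux-CA2') fragment of my statements survives above.  As a hedged reconstruction (not my exact commands), a RACDCERT CONNECT of a CA certificate into the keyring would look something like this, assuming the //START1/KEY2 keyring mentioned later and that the CA certificate is already defined under CERTAUTH:

```
RACDCERT ID(START1) CONNECT(CERTAUTH  -
         LABEL('Linux-CA2')           -
         RING(KEY2)                   -
         USAGE(CERTAUTH))
SETROPTS RACLIST(DIGTRING) REFRESH
```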

Reusing my existing keyring

Eventually I got this to work.  I had to…

    1. Connect the CA of the z/OS server into the keyring.
    2. Update /var/zosmfcp/configuration/local_override.cfg for ring //START1/KEY2

The z/OSMF started task userid requests CONTROL access to the keyring. 

It requests CONTROL access (RACF reports this!), but UPDATE access seems to work.  See RACF: Sharing one certificate among multiple servers.

With only READ access I got:

CWWKO0801E: Unable to initialize SSL connection. Unauthorized access was denied or security settings have expired. Exception is javax.net.ssl.SSLException: Received fatal alert: certificate_unknown

If it does not have UPDATE access, then z/OSMF cannot see the private certificate.

Use the correct keystore type. 

My RACF keyring keystore only worked when I had a keystore type of JCERACFKS.  I specified it in /var/zosmfcp/configuration/local_override.cfg.


Before starting the server the first time

If you specify TRACE=’Y’, either in the procedure or as part of the start command, it traces the creation of the configuration file, and turns on the debug script.  TRACE=’X’ gives a BASH trace as well.
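For example, assuming the default started task name IZUSVR1 (yours may differ), the trace could be turned on from the console with:

```
S IZUSVR1,TRACE='Y'
```

TRACE='X' is specified the same way.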

It looks like the name of the server is hard coded internally as zosmfServer, and the value in the JCL PARMS= is ignored.

Once it has started

If you do not get the logon screen, you may have to wait.  Running z/OS under my Ubuntu laptop is normally fine for editing etc., but it takes about 10 minutes to start z/OSMF.  If you get problems, with incomplete data displayed or messages saying resources were not found, wait and see if they get resolved.  Sometimes I had to close and restart my browser.

Useful to know…

  1. Some, but not all, error messages are in the STDERR output from the started task.
  2. The logs and trace are in /var/zosmfcp/data/logs/zosmfServer/logs/.  Other products using Liberty keep all of a server’s files under the server’s own directory tree; z/OSMF does not.
  3. The configuration is in /var/zosmfcp/configuration and /var/zosmfcp/configuration/servers/zosmfServer.   This is a different directory tree layout from other Liberty components.
  4. If you want to add one JVM option, edit the local_override.cfg and add   JVM_OPTIONS=-Doption=value.  See here if you want to add more options.  I used JVM_OPTIONS=-Djavax.net.debug=ssl:handshake to give me the trace of the TLS handshake.
  5. If you have problems with a certificate not being found on z/OS, issue a SETROPTS command to be 100% sure that what is defined to RACF is active in RACF.  Use SETROPTS RACLIST(DIGTCERT,DIGTRING,RDATALIB) REFRESH.
  6. Check keyring contents using racdcert listring(KEY2) id(START1)
  7. Check certificate status using RACDCERT LIST(LABEL(‘IZUZZZZ2’)) ID(START1) and check it has Status: TRUST and that the dates are valid.
  8. If your browser is not working as expected – restart it.

Tailoring your web browser environment.

Some requests, for example workflow actions, use a REST request to perform the action.  Although I had specified a HOSTNAME, z/OSMF used an internal address of S0W1.DAL-EBIS.IHOST.COM.  When the REST request was issued from my browser, it could not find the back-end z/OS.  I had to add an entry for S0W1.DAL-EBIS.IHOST.COM to my Linux /etc/hosts file.
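The /etc/hosts entry looks like the following; the IP address shown is a placeholder for whichever address your z/OS system is reachable on:

```
# /etc/hosts on the workstation running the browser
# (10.1.1.2 is a placeholder - use your z/OS system's address)
10.1.1.2   S0W1.DAL-EBIS.IHOST.COM
```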

Even after I had resolved the TCPIP problem on z/OS which caused this strange name to be used, z/OSMF continued to use it. When I recreated the environment the problem was resolved. I could not see this value in any of the configuration files.

Getting round the configuration problems

I initially found it hard to specify additional configuration options to override what was provided by z/OSMF – for example, to reuse what I had from other Liberty servers.

I changed the server.xml file to include

<include optional="false" location="${server.config.dir}/colin.xml"/> 

This allowed me to put things in the colin.xml file.  I’m sure this solution is not recommended by IBM, but it is a practical one.  The server.xml file may be read-only to you.
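An include file like colin.xml must be a valid Liberty configuration fragment with a &lt;server&gt; root element.  As a hedged sketch only: the keyring //START1/KEY2 comes from earlier in this post, but the keyStore id of defaultKeyStore is an assumption about what the generated server.xml uses, so check the shipped configuration before copying this.

```xml
<server>
    <!-- Sketch only: assumes this merges with the generated server.xml.
         The keyStore id is an assumption; the ring matches //START1/KEY2 above. -->
    <keyStore id="defaultKeyStore"
              location="safkeyring://START1/KEY2"
              type="JCERACFKS"
              password="password"
              fileBased="false"
              readOnly="true"/>
</server>
```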

You should not need this solution if you can use the z/OSMF configuration options mapping.