Getting SSH to work to z/OS

I have two versions of z/OS, old and new(!). I had problems getting ssh to work because of key problems.

The problem

I tried to update my laptop key to the server

ssh-copy-id colin@10.1.1.2

This gave

/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed

/usr/bin/ssh-copy-id: ERROR: @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
ERROR: @ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @
ERROR: @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
ERROR: IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
ERROR: Someone could be eavesdropping on you right now (man-in-the-middle attack)!
ERROR: It is also possible that a host key has just been changed.
ERROR: The fingerprint for the ED25519 key sent by the remote host is
ERROR: SHA256:2mUOVfdSedJVQIzZiGsRkOe9Vkc1bkyuDNp5H+VrZ98.
ERROR: Please contact your system administrator.
ERROR: Add correct host key in /home/colin/.ssh/known_hosts to get rid of this message.
ERROR: Offending ED25519 key in /home/colin/.ssh/known_hosts:1
ERROR: remove with:
ERROR: ssh-keygen -f '/home/colin/.ssh/known_hosts' -R '10.1.1.2'
ERROR: Host key for 10.1.1.2 has changed and you have requested strict checking.
ERROR: Host key verification failed.

Searching the internet I got suggestions saying “delete the old line from the file”. I didn’t want to do this because it meant I would not be able to go back to the old system and work as before.

Solutions

I edited /home/colin/.ssh/known_hosts and commented out line 1 by putting a # at the front (the :1 above means the first line). I repeated the command and it reported the same message for line :2. I commented that out as well.

I got further

colin@ColinNew:~$ ssh-copy-id colin@10.1.1.2
The authenticity of host '10.1.1.2 (10.1.1.2)' can't be established.
ED25519 key fingerprint is SHA256:2mUOVfdSedJVQIzZiGsRkOe9Vkc1bkyuDNp5H+VrZ98.
This key is not known by any other names.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 2 key(s) remain to be installed -- if you are prompted now it is to install the new keys
colin@10.1.1.2: Permission denied (publickey,hostbased).

I had to start the SYSLOGD on z/OS to capture the output from SSHD.

In the SSHD log under /var/log (yours may be different) it said

FOTS2307 User COLIN from 10.1.0.2 not allowed because not listed in AllowUsers 

In my SSHD config file /etc/ssh/sshd_config I had

# Allow specific user IDs 
AllowUsers IBMUSER

I added COLIN to the list and restarted SSHD. (I do not know how to refresh SSHD)

This time the error log had

trying public key file /u/tmp/zowet/colin/.ssh/authorized_keys 
Could not open authorized keys '/u/tmp/zowet/colin/.ssh/authorized_keys': ...

I fixed this, tried to logon, and this time it worked.

On Linux, I edited /home/colin/.ssh/known_hosts and un-commented the lines I had commented out before.
I tried the ssh command again, and it still worked!

Python calling C functions – passing structures

I’ve written how you can pass simple data from Python to a C function, see Python calling C functions.

This article explains how you can pass structures and pointers to buffers in the Python program. It extends Python calling C functions, and allows you to move logic from the C program to a Python program.

Using complex arguments

The examples in Python calling C functions were for using simple elements, such as Integers or strings.

I have a C structure I need to pass to a C function. The example below passes in an eye catcher, some lengths, and a buffer for the C function to use.

The C structure

typedef struct querycb {                                                         
char Eyecatcher[4]; /* Eye catcher offset 0 */
uint16_t Length; /* Length of the block 4 */
char Rsvd1[1]; /* Reserved 6 */
uint8_t Version; /* Version number 7 */
char Flags[2]; /* Flags 8 */
uint16_t Reserved8; // 10
uint32_t Count; // number returned 12
uint32_t lBuffer; // length of buffer 16
uint32_t Reservedx ; // 20
void *pBuffer; // 24
} querycb;

The Python code

import ctypes
from struct import pack

# create the variables
eyec = "EYEC".encode("cp500") # char[4] eye catcher
l = 32 # uint16_t
res1 = 0 # char[1]
version = 1 # uint8_t -same as a char
flags = 0 # char[2]
res2 = 0 # uint16_t
count = 0 # uint32_t
lBuffer = 4000 # uint32_t
res3 = 0 # uint32_t
# pBuffer # void *
# allocate a buffer for the C program to use and put some data
# into it
pBuffer = ctypes.create_string_buffer(b'abcdefg',size=lBuffer)
# cast the pBuffer so it is a void *
pB = ctypes.cast(pBuffer, ctypes.c_void_p)
# use the struct.pack function. See @4shbbhhiiiP below
# @ native byte order and alignment
# 4s a 4 byte string, the eye catcher
# h half word
# bb two char fields, res1 and version
# hh two half words, flags and res2
# iii three integer fields: count, lBuffer and res3
# P void * pointer
# Note pB is a ctypes object, we need its value, so pB.value
p = pack("@4shbbhhiiiP", eyec,l,res1,version,flags,
res2,count,lBuffer,res3,pB.value)

#create first parm
p1 = ctypes.c_int(3) # pass in the integer 3 as an example
# create second parm
p2 = ctypes.cast(p, ctypes.c_void_p)

# invoke the function

retcode = lib.conn(p1,p2)
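
One way (my own suggestion, not part of the original code) to catch layout mistakes early is to check that the size struct.pack produces matches the size of the C structure:

from struct import calcsize

# On a 64-bit build the structure is 32 bytes: 24 bytes of fields
# followed by an 8-byte pointer.
assert calcsize("@4shbbhhiiiP") == 32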

The C program

int conn(int * p1, char * p2) 
// int conn(int max,...)
{
typedef struct querycb {
char Eyecatcher[4]; /* Eye catcher 0 */
uint16_t Length; /* Length of the block 4 */
char Rsvd1[1]; /* Reserved 6 */
uint8_t Version; /* Version number 7 */
char Flags[2]; /* Flags 8 */
uint16_t Reserved8; // 10
uint32_t Count; // number returned 12
uint32_t lBuffer; // length of buffer 16
uint32_t Reservedx ; // 20
void *pBuffer; // 24
} querycb;

querycb * pcb = (querycb * ) p2;

printf("P1 %i\n",*p1);
printHex(stdout,p2,32);
printf("Now the structure\n")
printHex(stdout,pcb -> pBuffer,32);
return 0 ;
}

The output

P1 3
00000000 : D8D9D7C2 00200001 00000000 00000000 ..... .......... EYEC............
00000010 : 00000FA0 00000000 00000050 0901BCB0 ...........P.... ...........&....
Now the structure
00000000 : 61626364 65666700 00000000 00000000 abcdefg......... /...............
00000010 : 00000000 00000000 00000000 00000000 ................ ................

Where

  • EYEC is the passed in eye catcher
  • 00000FA0 is the length of 4000
  • 00000050 0901BCB0 is the 64-bit address of the buffer
  • abcdefg is the data used to initialise the buffer

Observations

It took me a couple of hours to get this to work. I found it hard to get the cast and the ctypes functions to work successfully. There may be a better way of coding it; if so please tell me. The code works, which is the objective – but there may be better, more correct, ways of doing it.

Benefits

By using this technique I was able to move the code which sets up the structure needed by the z/OS service from my C program into Python. My C program just parses the input parameters, sets up the linkage for the z/OS service, and invokes the service.

Of course I did not have the constants available from the C header file for the service, but that's a different problem.

Python safely iterating

I was using a Python program to access a z/OS service, and found there were times when my code did not clean up and close the resource.

It took me an afternoon to find out how to do it. I found pyzfile by daveyc an excellent example of how to handle these advanced Python topics.

pyzfile example

The documentation has

from pyzfile import *
try:
    with ZFile("//'USERID.CNTL(JCL)'", "rb,type=record", encoding='cp1047') as file:
        for rec in file:
            print(rec)
except ZFileError as e:
    print(e)

Breaking this down

Understanding the “with”

try:
    with ZFile("//'USERID.CNTL(JCL)'", "rb,type=record", encoding='cp1047') as file:
        ...
        do something with file
        ...
except ZFileError as e:
    print(e)

When the with ZFile(…) as file: is executed the code conceptually does

  • standard set up processing
  • open the file and return the handle
  • do processing using the file handle
  • when ever it leaves the with code section perform the close activity

Note: This could have been done with

try:
    open the file
    ...
    do something
    ...
except:
    ...
finally:                     # do this every time
    if the file was opened:
        close the file

but this is not quite as tidy and compact as the with syntax.

In more detail…

  • The def __init__(self,..): method is invoked and passed the parameters. It saves parameters using statements like self.p1
  • The __enter__(self): method is invoked, passing the instance data (self). It has no other parameters.
    • In the pyzfile, the code issues return self._open(). This invokes the function _open to open the data set.
  • When the with processing completes, it invokes the function __exit__(self, exc_type, exc_value, exc_traceback): This is invoked whether the code returned normally, or got an exception.
    • In the pyzfile code, this executes self.close(). So however the “with” processing ends, the close is always executed (see the sketch below).
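
Putting these pieces together, here is a minimal sketch (my own illustration, not pyzfile itself) of a class that can be used with “with”:

class Resource:
    def __init__(self, name):
        self.name = name                  # save the parameters

    def __enter__(self):
        print("open", self.name)          # acquire the resource
        return self                       # this is what "as" binds to

    def __exit__(self, exc_type, exc_value, exc_traceback):
        print("close", self.name)         # always runs, even after an exception
        return False                      # False means re-raise any exception

with Resource("demo") as r:
    print("using", r.name)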

Handling errors

I’ve seen that, when using the “with” clause, people tend to raise exceptions when problems are found.

For example, in the pyzfile code there is

class ZFileError(Exception):
    """ ZFile exception """
    def __init__(self, message: str, amrc: dict = None):
        self.message = message
        self.amrc = amrc
        if amrc is None:
            self.amrc = {}
        super().__init__(self.message)

    def __str__(self) -> str:
        return self.message

    def amrc(self):
        """
        Returns the amrc dict at the time of error.

        :return: The ``__amrc`` structure at the time of error.
        """
        return self.amrc

class ZFile:
    ...
    def _open(self):
        ...
        self.handle = open...
        if not self.handle:
            raise ZFileError(f"Error opening file '{self.filename}': "
                             f"{self.lib.zfile_strerror().decode('utf-8')}")
        return self

Understanding the “for”

The code above had

    with ZFile("//'USERID.CNTL(JCL)'", "rb,type=record",encoding='cp1047') as file:
for rec in file:
print(rec)

How does the “for” work?

The key to this code is this pair of functions:

##################################################
# Iterators
##################################################
def __iter__(self):
    return self

def __next__(self):
    ret = self.read()
    if not ret:
        raise StopIteration
    return ret

When the for statement is processed, it calls the __next__ function. This does the work of getting the next record and returning it.
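
As an illustration (my own sketch, not part of pyzfile), the for loop is roughly equivalent to:

it = iter(file)            # calls file.__iter__()
while True:
    try:
        rec = next(it)     # calls it.__next__()
    except StopIteration:  # raised when there are no more records
        break
    print(rec)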

There is a lot of confusing documentation about iterators, iteration and iterables. Let’s see if my description helps clarify or just adds more confusion.

Something is iter-able if you can do iteration on it, where iteration means taking each element in turn.

In Python a list is iter-able

for l in [0, 1, 2, 3]:
    print(l)

will iterate over the list and return each element of the list

0
1
2
3

Records in a file are a bit more abstract: you cannot see the whole file, but you can ask for the next record, and move through the file until there are no more records.

An iterator is the mechanism by which you iterate. Think of it as a function. The Python documentation is pretty clear.

Most people define

def __iter__(self):
    return self

For most people, just specify this. The PhD class may use something different.

The mechanism of “for” uses the __next__ function

def __next__(self):
    ret = self.read()
    if not ret:
        raise StopIteration
    return ret

This obtains the next element of data. If there are no more elements, it raises the StopIteration exception.

If you do not handle the StopIteration exception, then Python handles it for you and leaves the processing loop.

Conclusion

With both of these techniques “with” and “for” I could extract records from a z/OS resource.

I’ve used the “with” and “for” with yield to hide implementation detail

# create the function to read the file
def readfile(name):
    try:
        with ZFile(name, "rb,type=record,noseek") as file:
            for rec in file:
                yield rec
    except ZFileError as e:
        print(e)

# process the file using for ... readfile()
def reader(...):
    for line in readfile("//'IBMUSER.RMF'"):
        # do something with the data

Many is so last year – logstreams is the way to go.

I’ve been looking into the SMF real-time interface, where an application program can get records directly from SMF, and does not have to post-process SMF datasets or log streams. To use the real-time support, SMF needs to use log streams.

What is SMF?

SMF is System Management Facility. z/OS and the subsystems can write data to SMF for post processing. Typical records are audit and accounting records from z/OS, RACF or CICS, changes to SMS, and changes to resources. Each product has one or more SMF record-type numbers allocated to it. Within each SMF record type you can have sub-types, for example the z/OS SMF 30 record has a sub-type for job start, another sub-type for job step end, and another sub-type for job end.

Display SMF options

The command

d smf

gave

   NAME                VOLSER SIZE(BLKS) %FULL  STATUS
   P-SYS1.S0W1.MAN1    B3SYS1       7200     0  ALTERNATE
   S-SYS1.S0W1.MAN3    USER04      72000     1  ACTIVE

showing which datasets are being used, and giving information about them.

The command

d smf,o

displays all of the SMF options, and where they came from – for example a parmlib member, or from the SETSMF command.

IEE967I 08.44.41 SMF PARAMETERS 489                
MEMBER = SMFPRM00
...
SYNCVAL(00) -- DEFAULT
DUMPABND(RETRY) -- DEFAULT
INMEM(IFASMF.COLIN,TYPE(30,42),RESSIZMAX(0128M)) -- PARMLIB
SUBSYS(STC,NOTYPE(14:19,62:69,99)) -- SYS
...
STATUS(010000) -- PARMLIB
INTVAL(01) -- PARMLIB
MAXDORM(0001) -- PARMLIB
REC(PERM) -- PARMLIB
NOPROMPT -- PARMLIB
DSNAME(SYS1.S0W1.MAN3) -- PARMLIB
DSNAME(SYS1.S0W1.MAN1) -- PARMLIB

ACTIVE -- PARMLIB

The old way of recording SMF data

SMF had a set of datasets it would use in turn. Typically these were named like SYS1.MANX, SYS1.MANY, or SYS1.PROD.MAN2 etc. When the active dataset filled up, SMF would switch to the next empty dataset. You (or automation) then run a job to either copy the records to another dataset, or post-process the records, and then clear the dataset for reuse.

As computers got bigger, more work was done, more records were written and writing records to disk could not keep up.

Logstreams is the way forward.

A log stream is a stream of data which can be written to a Coupling Facility(CF) structure, or to a dataset on disk. Typically writing to a CF is faster than writing to disk.

With MANx datasets, all records were written to one dataset. With log streams, you can configure SMF to have multiple log streams, and you configure which record type(s) go to which log stream. This means you can have CICS records going to the “CICS log stream”, RACF records going to the “RACF log stream”, and the remainder going to a default log stream.

Having multiple logstreams means data can be written to many log streams concurrently, and so avoids the bottleneck of writing to a MANx dataset.

Setting up security profiles

It took me several attempts to configure the security profiles.

Be able to define and delete logstreams

//IBMUSER1 JOB   1,MSGCLASS=H 
//KEYCERTS EXEC PGM=IKJEFT01
//SYSPRINT DD SYSOUT=*
//SYSTSPRT DD SYSOUT=*
//SYSTSIN DD *
RDEFINE FACILITY MVSADMIN.LOGR UACC(NONE)
permit MVSADMIN.LOGR class(FACILITY) -
access(control) ID(SYS1)
setr raclist(facility) refresh

Define individual logstreams

RDEFINE LOGSTRM IFASMF.** UACC(NONE) 
PERMIT IFASMF.** class(LOGSTRM ) -
access(ALTER ) ID(SYS1)
setr raclist(logstrm ) refresh

Giving SMF access to the logstreams

RDEFINE FACILITY IFA.IFASMF.* UACC(READ)
setr raclist(facility) refresh

Setting up logstreams

You need to set up at least one log stream. It is easy to define more and change the SMF configuration.

I used the define logstream command

//IBMLOG JOB 1,MSGCLASS=H 
//LOGDEF EXEC PGM=IXCMIAPU,REGION=4M
//SYSPRINT DD SYSOUT=*
//SYSIN DD *
DATA TYPE(LOGR) REPORT(YES)

DELETE LOGSTREAM NAME(IFASMF.DEFAULT)
DEFINE LOGSTREAM NAME(IFASMF.DEFAULT)
DESCRIPTION(SMF_LOGSTREAM)
MODEL(NO)
DASDONLY(YES)
STG_SIZE(65532)
LS_SIZE(15000)
HLQ(IXGLOGR)
HIGHOFFLOAD(80)
LOWOFFLOAD(0)
AUTODELETE(YES) /* DELETE OPTION */
OFFLOADRECALL(NO)
MAXBUFSIZE(65532)
DIAG(NO)
RETPD(1) /* DELETE 1 DAYS */
//

I also defined a log stream IFASMF.COLIN.

With the HLQ(IXGLOGR) definition, behind the logstreams were data sets like

Dataset                             Volume
IXGLOGR.IFASMF.COLIN.ADCDPL         *VSAM*
IXGLOGR.IFASMF.COLIN.ADCDPL.DATA    USER05
IXGLOGR.IFASMF.COLIN.A0000000       *VSAM*
IXGLOGR.IFASMF.COLIN.A0000000.DATA  USER04

Configure SMF

I created a member SMFPRMLS in a user.parmlib

ACTIVE                          /* ACTIVE SMF RECORDING             */ 
DSNAME(SYS1.&SYSNAME..MAN1,
SYS1.&SYSNAME..MAN3)
RECORDING(LOGSTREAM)
NOPROMPT /* DO NOT PROMPT OPERATOR */
REC(PERM) /* TYPE 17 PERM RECORDS ONLY */
MAXDORM(0001) /* WRITE IDLE BUFFER AFTER 1 SEC */
INTVAL(01) /* EVERY MINUTE */
STATUS(010000) /* WRITE SMF STATS AFTER 1 HOUR */
JWT(0400) /* 522 AFTER 30 MINUTES */
SID(&SYSNAME(1:4))
LISTDSN /* LIST DATA SET STATUS AT IPL */
DEFAULTLSNAME(IFASMF.DEFAULT)
LSNAME(IFASMF.COLIN,TYPE(30,42))

AUTHSETSMF
SYS(NOTYPE(14:19,62:69,99),EXITS(IEFU83,IEFU84,IEFACTRT,
IEFUSI,IEFUJI,IEFU29),NOINTERVAL,NODETAIL)
SUBSYS(STC,EXITS(IEFU29,IEFU83,IEFU84,IEFUJP,IEFUSO))
INMEM(IFASMF.COLI2,RESSIZMAX(128M),TYPE(30,42))

I activated it using the command

t smf=ls

When this failed, because my log stream definitions were not correct, the SMF collection defaulted to using the specified SYS1.MANx datasets.
The important bits of the SMFPRMxx file are

  • RECORDING(LOGSTREAM) – use logstreams rather than datasets
  • LSNAME(IFASMF.COLIN,TYPE(30,42)) for record types 30 and 42 write them to this log stream
  • DEFAULTLSNAME(IFASMF.DEFAULT) If there is no LSNAME for a record type – then write them to this log stream

You can issue setsmf commands to override the existing definition.

Processing SMF records

For SMF datasets

Use JCL like the following:

// SET SMFPDS=SYS1.S0W1.MAN1                
// SET SMFSDS=SYS1.S0W1.MAN3
//SMFDUMP EXEC PGM=IFASMFDP
//DUMPINA DD DSN=&SMFPDS,DISP=SHR,AMP=('BUFSP=65536')
//DUMPINB DD DSN=&SMFSDS,DISP=SHR,AMP=('BUFSP=65536')
//DUMPOUT DD DISP=(NEW,CATLG),DSN=&RMF,SPACE=(CYL,(10,10))
//* DCB=(LRECL=32760,RECFM=VBS)
//* DCB=(BLKSIZE=0,LRECL=32760,RECFM=VBS)
//*UMPOUT DD DISP=SHR,DSN=IBMUSER.RMF,SPACE=(CYL,(1,1))
//SYSPRINT DD SYSOUT=*
//SYSIN DD *
INDD(DUMPINA,OPTIONS(DUMP))
INDD(DUMPINB,OPTIONS(DUMP))
OUTDD(DUMPOUT,TYPE(42,80,30))
RELATIVEDATE(BYDAY,0,1)
START(0000)
END(2300)
/*

This processes records within the specified time range in the datasets.

For log streams

Use JCL like the following – using PGM=IFASMFDL

//IBMSMFL  JOB 1,MSGCLASS=H 
//* DUMP THE SMF DATASETS
// SET SMF=IBMUSER.SMF
//*
//S1 EXEC PGM=IEFBR14
//DUMPOUT DD DISP=(MOD,DELETE),DSN=&SMF,SPACE=(CYL,(1,1))
//*
//SMFDUMP EXEC PGM=IFASMFDL,REGION=0M
//DUMPOUT DD DISP=(NEW,CATLG),DSN=&SMF,SPACE=(CYL,(10,10))
//SYSPRINT DD SYSOUT=*
//SYSIN DD *
LSNAME(IFASMF.COLIN,OPTIONS(DUMP))
OUTDD(DUMPOUT,TYPE(30))
RELATIVEDATE(BYDAY,0,1)
START(0000)
END(2300)
/*
//

When you specify a date range, it will read not only the active log stream datasets, but any archive ones it created, and which are available.

Display SMF

With log streams the D SMF command gave

   LOGSTREAM NAME               BUFFERS        STATUS
   A-IFASMF.DEFAULT                 774        CONNECTED
   A-IFASMF.COLIN                   584        CONNECTED
   A-IFASMF.INMEM                     0        IN-MEMORY

Dumping SMF data – last n day’s worth

For many years, I’ve been processing SMF data, and using the date option like DATE(2026012,2027000). Every day, I had to change it to match today’s date, and submit the job.

I’ve just discovered you can give relative dates, for example RELATIVEDATE(BYDAY,0,1), which says go back 0 days and include 1 day – so just process today.

The output listing has, for today’s date day 19 of 2026:

IFA834I RELATIVEDATE PARAMETER RESULTS IN START DATE 2026.019, END DATE 2026.019
IFA836I RELATIVEDATE RANGE EXTENDS INTO FUTURE, END DATE AND TIME USED IS 2026.019 11:29

You can specify BYDAY, BYWEEK, and BYMONTH.

This function has been around for years! I wonder how much time I’ve wasted on doing it the old way.

The Python interface to RACF is great.

The Python package pysear for working with RACF is great. The source is on github, and the documentation starts here. It is well documented, and there are good examples.

I’ve managed to do a lot of processing with very little of my own code.

One project I’ve been meaning to do for some time is to extract the contents of a RACF database, compare them with a different database, and show the differences. IBM provides a batch program and a very large Rexx exec; this has some bugs and is not very nice to use. There is a Rexx interface, which worked, but I found I was writing a lot of code. Then I found the pysear code.

Background

The data returned for userids (and other types of data) have segments.
You can display the base segment for a user.

tso lu colin

To display the tso base segment

tso lu colin tso

Field names returned by pysear have the segment name as a prefix, for example base:max_incorrect_password_attempts.

My first query

What are the active classes in RACF?

See the example.

from sear import sear
import json
import sys
result = sear(
    {
        "operation": "extract",
        "admin_type": "racf-options"
    },
)
json_data = json.dumps(result.result   , indent=2)
print(json_data)

For error handling, see the Error handling section below.

This produces output like

{
  "profile": {
    "base": {
      "base:active_classes": [
        "DATASET",
        "USER", ...
      ],
      "base:add_creator_to_access_list": true,
      ...
      "base:max_incorrect_password_attempts": 3,
      ...
}

To process the active classes one at a time you need code like

for ac in result.result["profile"]["base"]["base:active_classes"]:
    print("Active class:",ac)

The returned attributes are called traits. See here for the traits for RACF options. The traits show

Trait                  base:max_incorrect_password_attempts
RACF Key               revoke
Data Types             String
Operators Allowed      "set", "delete"
Supported Operations   "alter", "extract"

Because this attribute is a single-valued object, you can set it or delete it.

You can use this attribute for example

result = sear(
    {
        "operation": "alter",
        "admin_type": "racf-options",
        "traits": {
            "base:max_incorrect_password_attempts": 5,
        },
    },
)

The trait "base:active_classes" is a list of classes ["DATASET", "USER", …]

The trait is

Trait                  base:active_classes
RACF Key               classact
Data Types             string
Operators Allowed      "add", "remove"
Supported Operations   "alter", "extract"

Because it is a list, you can add or remove an element; you do not use set or delete, which would replace the whole list.

Some traits, such as use counts, have Operators Allowed of N/A. You can only extract and display the information.

My second query

What are the userids in RACF?

The traits are listed here, and code examples are here.

I used

from sear import sear
import json

# get all userids beginning with ZWE
users = sear(
    {
        "operation": "search",
        "admin_type": "user",
        "userid_filter": "ZWE",
    },
)
profiles  = users.result["profiles"]
# Now process each profile in turn.
# because this is for userid profiles we need admin_type=user and userid=....
for profile in profiles:
    user = sear(
       {
          "operation": "extract",
          "admin_type": "user",
          "userid": profile,
       }, 
    )
    segments = user.result["profile"]
    #print("segment",segments)
    for segment in segments:   # eg base or omvs
      for w1,v1 in segments[segment].items():
          #print(w1,v1)
          #for w2,v2 in v1.items():
          #  print(w1,w2,v2 )
          json_data = json.dumps(v1  , indent=2)
          print(w1,json_data)

This gave

===PROFILE=== ZWESIUSR
base:auditor false
base:automatic_dataset_protection false
base:create_date "05/06/20"
base:default_group "ZWEADMIN"
base:group_connections [
  {
    ...
    "base:group_connection_group": "IZUADMIN",
    ...
    "base:group_connection_owner": "IBMUSER",
    ...
},
{
    ...
    "base:group_connection_group": "IZUUSER",
   ...
}
...
omvs:default_shell "/bin/sh"
omvs:home_directory "/apps/zowe/v10/home/zwesiusr"
omvs:uid 990017
===PROFILE=== ZWESVUSR
...

Notes on using search and extract

If you use “operation”: “search” you need a ….._filter. If you use extract you use the data type directly, such as “userid”:…

Processing resources

You can process RACF resources, for example the OPERCMDS profiles for MVS.DISPLAY commands.

The sear command needs a "class": … value, for example

result = sear(
    {
        "operation": "search",
        "admin_type": "resource",
        "class": "OPERCMDS",
        "resource_filter": "MVS.**",
    },
)
result = sear(
    {
        "operation": "extract",
        "admin_type": "resource",
        "resource": "MVS.DISPLAY",
        "class": "Opercmds",
    },
)

The value of the class is converted to upper case.

Changing a profile

To change a profile, for example to issue a PERMIT command, use code like:

from sear import sear
import json

result = sear(
    {   "operation": "alter",
        "admin_type": "permission",
        "resource": "MVS.DISPLAY.*",
        "userid": "ADCDG",
        "traits": {
          "base:access": "CONTROL"
        },
        "class": "OPERCMDS"

    },
)
json_data = json.dumps(result.result   , indent=2)
print(json_data)

The output was

{
  "commands": [
    {
      "command": "PERMIT MVS.DISPLAY.* CLASS(OPERCMDS)ACCESS (CONTROL) ID(ADCDG)",
      "messages": [
        "ICH06011I RACLISTED PROFILES FOR OPERCMDS WILL NOT REFLECT THE UPDATE(S) UNTIL A SETROPTS REFRESH IS ISSUED"
      ]
    }
  ],
  "return_codes": {
    "racf_reason_code": 0,
    "racf_return_code": 0,
    "saf_return_code": 0,
    "sear_return_code": 0
  }
}

Error handling

Return codes and error messages

There are two layers of error handling.

  • Invalid requests – problems detected by pysear
  • Non-zero return codes from the underlying RACF code.

If pysear detects a problem it returns it in

result.result.get("errors") 

For example you have specified an invalid parameter such as “userzzz“:”MINE”

If this field is not present, then the request was passed to the RACF service, which returns multiple values. See IRRSMO00 return and reason codes. There will be values for:

  • SAF return code
  • RACF return code
  • RACF reason code
  • sear return code.

If the RACF return code is zero then the request was successful.

To make error handling easier – and have one error handler for all requests – I used


try:
    result = try_sear(search)
except Exception as ex:
    print("Exception-Colin Line112:", ex)
    quit()

Where try_sear was

def try_sear(data):
    # execute the request
    result = sear(data)
    if result.result.get("errors") != None:
        print("Request:", result.request)
        print("Error with request:", result.result["errors"])
        raise ValueError("errors")
    elif (result.result["return_codes"]["racf_reason_code"] != 0):
        rcs = result.result["return_codes"]
        print("SAF Return code", rcs["saf_return_code"],
              "RACF Return code", rcs["racf_return_code"],
              "RACF Reason code", rcs["racf_reason_code"],
              )
        raise ValueError("return codes")
    return result

Overall

This interface is very easy to use.
I use it to extract definitions from one RACF database and save them as JSON files, repeat with a different (historical) RACF database, then compare the two sets of JSON files to see the differences.

Note: The sear command only works with the active database, so I had to make the historical database active, run the commands, and switch back to the current database.

Using ISPF edit macros to display the junk in a catalog

You can use IDCAMS DCOLLECT to collect SMS information about data sets on your z/OS system. This gives lots of information about a dataset, size, creation date, SMS attributes etc.

With processing you can get reports on datasets, volumes, and what is using all the space. This allows you to delete datasets which are no longer needed.

This does not help when you are trying to clean out your catalogs and remove stuff which should not be in them. For example, there are usually entries in a catalog which should really be in user catalogs.

I could not find tools to help me with this. I fell back to using an ISPF edit macro to process a LISTCAT listing and extract the relevant data. It is not difficult (once you know how) and it is quick and easy.

This blog post gives some examples of how you can use ISPF edit macros to process data in data sets or spool.

The output from the short Rexx exec is

TCPIP.ETC.SERVICES             1998.284 B3SYS1
SYS1.RACFDS                    1999.288 B3CFG1
SYS1.IPLPARM                   1999.288 B3SYS1
...
LOG.MISC                       2025.107 USER04
IBMUSER.S0W1.SPFTEMP3.CNTL     2026.002 USER07
IBMUSER.S0W1.SPFLOG1.LIST      2026.013 USER04
IBMUSER.SMF                    2026.013 USER07

With this I asked: what is LOG.MISC 2025.107 doing in the catalog? It is there because I did not have the controls in place to stop people putting datasets into the catalog.

Instead of just displaying the information, I could have had the exec create IDCAMS statements, for example to get a dataset recataloged or deleted, based on creation date or other information.

Get your LISTCAT listing

I used

//IBMLISC JOB 1,MSGCLASS=H 
// EXPORT SYMLIST=(*)
// SET CAT=&SYSVER.
//S1 EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN DD *,SYMBOLS=JCLONLY
LISTCAT NONVSAM CATALOG(CATALOG.&CAT..MASTER) ALL
/*
  • The // SET CAT=&SYSVER. gets a local copy of the system symbol &SYSVER. You can use the operator command D SYMBOLS to list all the system symbols defined. On my system &SYSVER is Z31B
  • In //SYSIN DD *,SYMBOLS=JCLONLY the SYMBOLS=JCLONLY says substitute variables in the following SYSIN data, using the JCL symbols. &CAT is Z31B, and so the catalog name becomes CATALOG.Z31B.MASTER. You cannot use &SYSVER directly in the SYSIN data.

Edit the listing

I used SDSF and the SE line command on the output of the LISTCAT. You get an ISPF edit session with the spool data.

Run the exec

I have a Rexx exec called LISTCATN in USER.Z31B.CLIST. I’ll describe it in sections below

Standard Rexx starting code

/* REXX */ 
/*
exec to extract NONVSAM records from a catalog listing
*/
ADDRESS ISPEXEC
'ISREDIT MACRO (parms) '

Use MACRO(parms) to get the parameters passed to the macro

Define parsing arguments

I define search arguments in a variable stem. This separates the data from the logic, and makes it easy to change or extend.

  data.1 = "NONVSAM 2 10" 
fcol.1 = 18
flen.1 = 48

data.2 = "CREATION 38 48 "
fcol.2 = 54
flen.2 = 8

data.3 = "VOLSER 8 15 "
fcol.3 = 27
flen.3 = 8

data.0 = 3
sortcols = "50 60"

Later there is code

  • do I = 1 to data.0 this processes each section in the stems
  • There is a find data.1, which substitutes to “find NONVSAM 2 10”. This says find the string NONVSAM in columns 2 to 10
  • If the find locates the string, the code retrieves the line. The code does a substring from fcol.1 for length flen.1 and saves the value
  • data.0 = 3 says there are three data sections.
  • sortcols = “50 60” is used at the end to sort the file by the date column.

Remove uninteresting records

  "ISREDIT autosave off     " 
"ISREDIT exclude all"
"ISREDIT find NONVSAM 2 10 ALL "
"ISREDIT find CREATION ALL "
"ISREDIT find VOLSER ALL "
if rc != 0 then data.0 = 2 /* ignore the volser */
"ISREDIT delete all x "
  • “ISREDIT autosave off ” I have this as standard in ISPF edit macros, basically it says do not save the data if I press PF3.
  • “ISREDIT exclude all” – exclude every line; the find commands below make the lines containing the found strings visible again
  • “ISREDIT find NONVSAM 2 10 ALL ” find these lines
  • “ISREDIT find CREATION ALL ”
  • “ISREDIT find VOLSER ALL ”
  • If volser was not found, then listcat wasn’t specified with the right statement, so do not try to process any VOLSER records
    • if rc != 0 then data.0 = 2 /* ignore the volser */
  • “ISREDIT delete all x ” delete all the records which are still excluded leaving only the records I searched for.

Process the records

do j = 1  by 1
  string = ""
  do i = 1 to data.0
    "ISREDIT find "data.i
    if rc <> 0 then leave
    "ISREDIT (f) = LINENUM .ZCSR "
    "ISREDIT (d) = LINE " f
    name = substr(d,fcol.i,flen.i) /* from col and length */
    string = string || " " || name
  end

  if rc <> 0 then leave
  out.j = string
end

This code uses the data in the variable stems defined higher up. It keeps the logic separate from the search data.

  • do j = 1 by 1 iterate through the whole file until the end of file
  • string = “” preset the output string
  • do i = 1 to data.0 for the records we specified
  • “ISREDIT find “data.i find it
  • if rc <> 0 then leave if not found then leave
  • “ISREDIT (f) = LINENUM .ZCSR ” Get the line number where the find found the data.
  • “ISREDIT ( d ) = LINE ” f get the line contents – getting the line number found in the previous step
  • name = substr(d,fcol.i,flen.i) /* from col and length */ extract the field of interest from the line
  • string = string || ” ” || name build up a string of the values found
  • end
  • if rc <> 0 then leave we got a not found, so end of file.
  • out.j = string save the data in a stem for processing below
  • end

Do something with the records

You can do processing on the data, for example create JCL to delete the dataset.

In this example I delete all records from the file, and insert the saved records

  "ISREDIT exclude all" 
"ISREDIT delete all x "
do i = 1 to j -1
v = out.i
"ISREDIT LINE_after .zcsr = (v)"
end
"ISREDIT sort " sortcols
exit
  • “ISREDIT delete all x “ delete all the lines in the file (they were all excluded by the previous command)
  • do i = 1 to j -1 we have a stem of the records we processed iterate over them
  • v = out.i make a copy of the data to make it easy for ISPF; ISPF only does simple variable substitution
  • “ISREDIT LINE_after .zcsr = (v)” insert after the current (last) line the value from v, which is the saved string.
  • end
  • “ISREDIT sort ” sortcols sort on the creation date
  • exit

The output

The output from this is the dataset name, the create date, and the volume it is on.

TCPIP.ETC.SERVICES             1998.284 B3SYS1
SYS1.RACFDS                    1999.288 B3CFG1
SYS1.IPLPARM                   1999.288 B3SYS1
...
IBMUSER.S0W1.SPFTEMP3.CNTL     2026.002 USER07
IBMUSER.S0W1.SPFLOG1.LIST      2026.013 USER04
IBMUSER.SMF                    2026.013 USER07

From the date information I can see which entries were due to me – because they were all after January 2025.

Different ways of processing records

Not every dataset has the same information. For example, after deleting uninteresting rows, the listing looks like:

NONVSAM ------- ADCD.DYNISPF.ISPPLIB 
DATASET-OWNER-----(NULL) CREATION--------2016.236
VOLSER------------B3SYS1 DEVTYPE------X'3010200F' FSEQN------------------0
NONVSAM ------- ADCD.WLM
DATASET-OWNER-----(NULL) CREATION--------2023.010
STORAGECLASS -----SCBASE MANAGEMENTCLASS---(NULL)
DATACLASS --------(NULL) LBACKUP ---0000.000.0000
VOLSER------------B3USR1 DEVTYPE------X'3010200F' FSEQN------------------0

The second dataset ADCD.WLM has SMS information, Storage Class, Management Class, and Data Class, which are not present with the first dataset ADCD.DYNISPF.ISPPLIB.

You could process this sequentially and have logic like…

If the row starts with

  • NONVSAM – then write out the previous information, get the dataset name, and start again
  • VOLSER – then parse the volser value
  • DATASET – then parse the creation date
  • STORAGECLASS – then parse the SC and MC values
  • DATACLASS – then parse the DC value

For example

"ISREDIT     (last)  = LINENUM .ZLAST" 
do j = 1  by 1  to last 
  "ISREDIT      ( d )  = LINE   " j 
  if substr(d,2,7) = "NONVSAM" then 
  do 
      count = count + 1 
      string =  dsn cd vol sc mc dc 
      sc = "        " 
      mc = "        " 
      dc = "        " 
      vol= "      " 
      dsn= "      " 
      cd = "      " 
      out.count = string 
      say string 
      /* do the next */ 
      dsn = substr(d,18,48) 
  end 
  else 
  if substr(d,9,7) = "DATASET" then cd = substr(d,54,8) 
  else 
  if substr(d,9,6) = "VOLSER" then vol = substr(d,27,6) 
  else 
  if substr(d,9,6) = "STORAG" then 
  do 
     sc = substr(d,27,8) 
     mc = substr(d,56,8) 
  end 
  else 
  if substr(d,9,6) = "DATACL" then vol = substr(d,27,8) 
end 

This gives output like

NFS.CNTL                       2000.336 B3SYS1
SYS1.RACFDS.BACKUP             2001.164 B3CFG1
SYS1.UADS                      2003.137 B3CFG1
NETVIEW.ADCD.NTVTABS           2009.027 B3USR1 SCBASE   (NULL)
SYT1.ZOS.CNTL                  2012.013 B3USR1 SCBASE   (NULL)
TCPIP.PROFILE.TCPIP            2016.236 B3SYS1

So not difficult at all.

What’s going on in my program in Unix Services?

On Linux, starting a Python program is subsecond. On z/OS, running on zD&T (so emulated hardware), it takes about 2 seconds. I wondered what was causing this – is my ZFS file system slow?

I used the Unix command

bpxtrace -o /tmp/trace -f format -c  python ac.py

to capture a trace.

This produced output like

       PID ASID TCB    Local time      System call           Additional trace
- - - - - - - - - - - - - - - - - - - - - - - - - -
65589 0049 8FB2F8 08:27:13.722080 Call open parms: 0000004D
65589 0049 8FB2F8 08:27:13.722714 Exit open rv=00000004
65589 0049 8FB2F8 08:27:13.722741 Call fstat parms: 00000004
65589 0049 8FB2F8 08:27:13.722785 Exit fstat rv=00000000
65589 0049 8FB2F8 08:27:13.722817 Call lseek parms: 00000004
65589 0049 8FB2F8 08:27:13.722824 Exit lseek rv=00000000
65589 0049 8FB2F8 08:27:13.722836 Call lseek parms: 00000004
65589 0049 8FB2F8 08:27:13.722883 Exit lseek rv=00000000
65589 0049 8FB2F8 08:27:13.722896 Call lseek parms: 00000004
65589 0049 8FB2F8 08:27:13.722901 Exit lseek rv=00000000

This is ok, but I want to know how long the calls took.

I wrote an ISPF Rexx script

/* REXX */
/*
   exec to calculate the elapsed time of each system call
   (Exit time minus Call time) from a formatted bpxtrace listing
*/
ADDRESS ISPEXEC
'ISREDIT MACRO (lines) '
"ISREDIT (f) = LINENUM .ZLAST"
sum = 0
do j = 3 by 2 to f
  k = j + 1
  "ISREDIT ( bef) = LINE " j
  "ISREDIT ( aft) = LINE " k
  parse var bef . 27 mm 29 . 30 ss 39 . 40 what
  parse var aft . 27 am 29 . 30 as 39
  before = 60 * mm + ss
  after  = 60 * am + as
  sum = sum + (after - before)
  delta = format(after - before,1,6)
  string = "== "delta what
  "ISREDIT LINE (k) = (string)"

end
say sum
exit

running the Rexx script produced output in the file like

     65589 0049 8FB2F8 08:27:13.722080 Call open      
== 0.000630 Call open
65589 0049 8FB2F8 08:27:13.722741 Call fstat
== 0.000050 Call fstat
65589 0049 8FB2F8 08:27:13.722817 Call lseek
== 0.000000 Call lseek

Using the ISPF commands

  • X all
  • f “==” all
  • del all x
  • sort 1 11

This gave

== 0.000000 Call lseek 
== 0.000000 Call lseek
...
0.000820 Call close parms: 00000005 00000000 00000000 05FC0119
0.001450 Call cond_timed_wait parms: 00000000 000F4240 00000001 00000000 20861040
0.002450 Call loadhfs parms: 00000048 /u/tmp/zowet/colin/envz/lib/python3...
0.004200 Call loadhfs parms: 00000008 CELQDCPP 00000000 0000010C 61E9F3F1
0.004310 Call loadhfs parms: 00000007 CXXRT64 00000000 0000010C 61E9F3F1
0.034780 Call mvsprocclp parms: 00000100 00000000 00000000 00000000
0.042970 Call mvsprocclp parms: 00000100 2081D1D8 2086CC75 00000000

There were nearly 500 lines in the output file. 400 entries were 100 microseconds or less. There were 6 entries taking longer than 1 millisecond.
My trace file covered a duration of 1.1 seconds. The individual times added up to 0.13 seconds, so it looks like the delays are not caused by the file system, and I need to look elsewhere.

From data to reports missing the potholes

I’ve been doing work with datasets on z/OS to produce reports. These range from SMF data to DCOLLECT data on datasets and SMS data.

It took a while to “get it right”, because I made some poor decisions as to how to process the data, and some of my processing was much more complex than it needed to be. It was easiest to start again!

I’ve been working with Python and Python tools, and other tools available on the platforms. See Pandas 102, notes on using pandas.

My current approach is to use some Python code to read a record, parse it into a dictionary (dict), and add the dict to a list of records. I then either pass the list of dicts to Pandas to display, or externalise the data and have a second Python program read the externalised data and do the Pandas processing.

Reading the data

The data is usually in data sets rather than files in Unix Services. You can copy a dataset to a file, but it is easier to use the python package pyzfile to read datasets directly.

import sys
from pyzfile import *

def readfile():
    try:
        with ZFile("//'COLIN.DCOLLECT.OUT'", "rb,type=record,noseek") as file:
            for rec in file:
                #l = len(rec)
                yield rec
    except ZFileError as e:
        print(e, file=sys.stderr)

Often a data source will contain a mixture of record types. For example, a dump of SMF datasets may contain many different record types and subtypes.

You need to consider if you want to process all record types in one pass, or process one record type in one run, and a different record type in a different run.

Processing the data

You will normally have a mapping of the layout of the data in a record. Often there is a mix of record types; you need to decide which record types you process and which you ignore.

Field names

Some of the field names in a record are cryptic; they were created when field names could only be 8 characters or less. For example DCDDSNAM. This stands for DCollect record type D, field name DSNAMe. You need to decide what to name the field. Do you name it DCDDSNAM, and tell the reader to go and look in the documentation to understand the field names in the report, or do you try to add value and just call it DSN, or DataSetName? You cannot guess some fields, such as DCDVSAMI. This is VSAM Inconsistency.

You also need to consider the printed report. If you have a one character field in the record, and a field name which is 20 characters long, by default the printed field will be 20 characters wide, and so waste space in the report. If the field is rarely used you could call it BF1, for Boring Field 1.

Character strings

Python works in ASCII, and strings need to be in ASCII to be printable. You will need to convert character data from EBCDIC to ASCII.

You can use a slice to extract data from a record. For example, to extract a string and convert it:

DSN = record[20:63].decode('cp500').strip()

Integers

Integers – you will need to convert these to internal format. I found the Python struct module very good for this. You give a string of conversion characters (integer, integer, …) and it returns a tuple of the data. If you are processing the data on a different platform, you may need to worry about big-endian and little-endian conversion of numbers.
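
As a small illustration (the sample bytes below are made up, not from a real record):

from struct import unpack

# ">" means big-endian, which is the byte order z/OS uses; "H" is a
# 2-byte unsigned integer and "I" a 4-byte unsigned integer.
data = bytes.fromhex("00200000002A")
length, count = unpack(">HI", data)
print(length, count)   # 32 42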

Strange integers

Some records have units like hundredths of a second. You may want to convert these to float

float_value = float(input_value)/100

Packed numbers

Packed decimal numbers are often used for dates. For example, a date in yyyyddd format for year 2025, day 5 is X'2025005F', where the F is a sign digit. You cannot just print it as an integer (it comes out as 539295839).
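
A minimal sketch (my own helper, not from any library) of turning such a field into a printable string:

def packed_date(field: bytes) -> str:
    # X'2025005F' -> "2025005"; hex() gives the digits, and the last
    # character is the sign nibble, which we drop
    return field.hex()[:-1]

print(packed_date(bytes.fromhex("2025005F")))   # 2025005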

Bit masks

Bit masks are very common, for example there is a 1 byte field DCVFLAG1 with values:

  • DCVUSPVT 0x20 Private
  • DCVUSPUB 0x10 Public
  • DCVUSSTG 0x08 Storage
  • DCVSHRDS 0X04 Device is sharable

If the value of the field is 0x14, what do you return? I would create a field Flag1 with the value of a list ["Public", "Shareable"]. If all the bits were off, this would return an empty list []. It would be easy to create ["DCVUSPUB", "DCVSHRDS"] or just display the hex value 14 (or 0x14) – but this makes it hard for the people reading the reports to interpret the data. A sketch of this decoding follows.
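
A minimal sketch, using the DCVFLAG1 bits listed above (the function name is my own):

# map bit value -> readable name
FLAG1_NAMES = {0x20: "Private", 0x10: "Public", 0x08: "Storage", 0x04: "Shareable"}

def decode_flags(value):
    # return the names of all the bits which are set
    return [name for bit, name in FLAG1_NAMES.items() if value & bit]

print(decode_flags(0x14))   # ['Public', 'Shareable']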

Triplets

SMF records contain triplets. These are defined by [offset to the start of the data, length of each section, count of sections] within the record.

For example in the SMF30 record there are many triplet sections. There is one for “usage data” involved in usage based pricing. There can be zero or more sections like

  • Product owner
  • Product name
  • TCB time used in hundredths of a second

How are you going to process this?
The SMF record has 3 fields for usage

  • SMF30UDO Offset to Usage Data section in SMF 30 record
  • SMF30UDL Length of each Usage Data section in SMF 30 record
  • SMF30UDN Number of Usage Data section in SMF 30 record

I would create a variable UsageData = [{"ProdOwner": …, "ProdName": …, "TCBTime": …}, {"ProdOwner": …, "ProdName": …, "TCBTime": …}, ]

and convert TCBTime from an integer representing hundredths of a second to a floating point number (see the sketch below).
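
A hedged sketch of walking a triplet-described section; the 8-byte owner, 8-byte name and 4-byte TCB time layout here is illustrative, not the real SMF 30 mapping:

from struct import unpack_from

def usage_sections(record, offset, length, count):
    sections = []
    for i in range(count):
        base = offset + i * length
        owner = record[base:base + 8].decode("cp500").strip()
        name  = record[base + 8:base + 16].decode("cp500").strip()
        (tcb_hundredths,) = unpack_from(">I", record, base + 16)
        sections.append({"ProdOwner": owner, "ProdName": name,
                         "TCBTime": tcb_hundredths / 100.0})
    return sections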

Having these triplets makes a challenge when printing the record. You could decide to

  • omit this data
  • summarise the data – and provide only a sum of the TCBTime value
  • give the data as a list of dicts, then have a Pandas step to copy only the fields you need for your reports.

For this usage data, I may want a report showing which jobs used which product, and how much CPU the job used in that product. Although I may capture the data as a list of products, I could extract the data and create another data record with

  • jobname1, product1, … CPU used1
  • jobname1, product2, … CPU used2
  • jobname2, product1, … CPU used1
  • jobname2, product3, … CPU used3

and remove the product data from the original data record.

Do you want all of the fields?

You may want to ignore fields, such as reserved values, record length values, record type, and any fields you are not interested in. Record length tends to be the first field, and it is usually not interesting when generating default reports.

How to handle a different length record?

The format of many records change with new releases, typically adding new fields.

You need to be able to handle records from the previous release, where the record is shorter. For example, do not add these fields to your dict, or add them with a value of None.
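
A minimal sketch (the offset 128 and the field name are illustrative):

from struct import unpack_from

def new_field(record):
    # the field is only present in records from the newer release, which are longer
    if len(record) >= 132:
        return unpack_from(">I", record, 128)[0]
    return None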

Now I’ve got a record – now what?

Once you have got your record, and created a dict from the contents {fieldname1: value1, fieldname2: value2, …}, you could just add it to the list to be passed to Pandas. It is not always that simple.

I found that some records need post processing before saving.

Calculations

For a DCOLLECT record, there is a field which says

DCVFRESP: Free Space on Volume (in KB when DCVCYLMG is set to 0, or in MB when DCVCYLMG is set to 1)

You need to check bit DCVCYLMG and have logic like

if DCVCYLMG == 1:
    data["FreeSpVolKB"] = data["FreeSpVolKB"] * 1024

Adding or deleting fields

For some fields I did some calculations to simplify the processing. For example I wanted average time when I had total_time, and count.

I created average_time = total_time / count, added this field, and deleted the total_time and count fields.
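
A tiny sketch of this, with hypothetical field names:

# replace the raw counters with the derived value
count = data.pop("count")
total = data.pop("total_time")
data["average_time"] = total / count if count else None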

Error handling

I found some records have an error flag, for example “Error calculating volume capacity”. You need to decide what to do.

  • Do you include them, and risk that the calculation/display of volume capacity might be wrong?
  • Do you report such records during the collection stage, and not include them in the overall data?

How do you accumulate the data: dicts or lists?

When using Pandas you can build each record as a dict of values {"kw1": "v1", "kw2": "v2"}, then build a list of dicts [{}, {}, …]

or have each "column" hold a list of values {"Jobname": ["job1", "job2", …], "CPUUsed": [99, 101, …], …}. As you process each field you append it to the appropriate "column" list.

# a dict of lists
datad = {"dsn": ["ABC", "DEF"],
         "volser": ["SYSRES", "USER02"]}
datad["dsn"].append("GHI")
datad["volser"].append("OLDRES")

pdd = pd.DataFrame(datad)


# a list of dicts
dictl = [
    {"dsn": "ABC", "volser": "SYSRES"},
    {"dsn": "DEF", "volser": "USER02S"}]
newdict = {"dsn": "GHI", "volser": "OLDRES"}

dictl.append(newdict)

pdl = pd.DataFrame.from_records(dictl)

I think it is better to capture your data in a dict, then add the dict to the list of records.

For example with

DCVFRESP: Free Space on Volume (in KB when DCVCYLMG is set to 0, or in MB when DCVCYLMG is set to 1)

If you use a dict to collect the data, you can then easily massage the values, before adding the dict to the list.

if DCVCYLMG == 1:
    data["FreeSpVolKB"] = data["FreeSpVolKB"] * 1024

grand_data.append(data)

If you try to do this using “column” values it gets really messy trying to do a similar calculation.

Using the data

It took a long time to process the dataset and create the Python data. I found it quicker overall to process the dataset once and externalise the data using pickle or JSON, then have different Python programs which read the data in and process it. For example:

  • Creating a new data structure using just the columns I was interested in.
  • Filtering which rows I wanted.
  • Save it

Pandas 102, notes on using pandas

Pandas is a great tool for displaying data from Python. You give it arrays of data, and it can display, summarise, group, print and plot. It is used for the simplest data, up to data analysts processing megabytes of data.

There is a lot of good information about getting started with Pandas, and about how you can do advanced things with Pandas. I did the Pandas 101 level of reading, but I struggled with the next step, so my notes for the 102 level are below. Knowing that something can be done means you can go and look for it. If you look but cannot find, it may be that you are using the wrong search arguments, or there is no information on it.

Working with data

I’ve been working with “flat files” on z/OS. For example the output of DCOLLECT which is information about dataset etc from SMS.

One lesson I learned was that you should isolate the extraction from the processing (except for trivial amounts of data). Extracting data from flat files can be expensive and take a long time; for example it may include conversion from EBCDIC to ASCII. It is better to capture the data from the flat file in Python variables, then write the data to disk using JSON or pickle (Python object serialisation). As a separate step, read the data into memory from your saved file, then do your data processing work with pandas or other tools.

Feeding data into Pandas

The work I’ve done has been two dimensional, rows and columns; you can have multi-dimensional data.

You can use a list of dictionaries (dicts), or a dict of lists:

# a dict of lists
datad = {"dsn": ["ABC", "DEF"],
         "volser": ["SYSRES", "USER02"]}
pdd = pd.DataFrame(datad)

# a list of dicts
datal = [{"dsn": "ABC", "volser": "SYSRES"},
         {"dsn": "DEF", "volser": "USER02S"},
         ]

pdl = pd.DataFrame.from_records(datal)

Processing data like pdd = pd.DataFrame(datad) creates a pandas data frame. You take actions on this data frame. You can create other data frames from an original data frame, for example with a subset of the rows and columns.

I was processing a large dataset, and found it easiest to create a dict for each row of data, and then accumulate the rows in a list. Before I used Pandas, I had just printed out each row. I do not know which performs better. Someone else used a dict of lists, and appended each row’s data to the “dsn” or “volser” list.

What can you do with it?

The first thing is to print it. Once the data is in Pandas you can use either of pdd or pdl above.

print(pdd)

gave

   dsn   volser
0  ABC   SYSRES
1  DEF  USER02S

Where the 0, 1 are the row numbers of the data.

With my real data I got

                   DSN  ...  AllocSpace
0    SYS1.VVDS.VA4RES1  ...        1660
1   SYS1.VTOCIX.A4RES1  ...         830
2          CBC.SCCNCMP  ...      241043
3          CBC.SCLBDLL  ...         885
4         CBC.SCLBDLL2  ...         996
..                 ...  ...         ...
93        SYS1.SERBLPA  ...         498
94       SYS1.SERBMENU  ...         277
95       SYS1.SERBPENU  ...       17652
96          SYS1.SERBT  ...         885
97       SYS1.SERBTENU  ...         332

[98 rows x 7 columns]

The data was formatted to match my window size. With a larger window I got more columns.

You can change this by using

pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)

Which columns are displayed?

Rather than all of the columns being displayed you can select which columns are displayed.

You can tell from the data you passed to pandas, or use the command

print(list(opd.columns.values))

This displays the values of the column names, as a list.

To display the columns you specify use

print(opd[["DSN","VOLSER","ExpDate","CrDate","LastRef","63bit alloc space KB", "AllocSpace"]])

You can say display all but the specified columns

print(opd.loc[:, ~opd.columns.isin(["ExpDate","CrDate","LastRef"])])

Select which rows you want displayed

print(opd.loc[opd["VOLSER"].str.startswith("A4"),["DSN","VOLSER",]])

or

print(opd.loc[opd["DSN"].str.startswith("SYS1."),["DSN","VOLSER",]])

gave

                   DSN  VOLSER
0    SYS1.VVDS.VA4RES1  A4RES1
1   SYS1.VTOCIX.A4RES1  A4RES1
12        SYS1.ADFMAC1  A4RES1
13        SYS1.CBRDBRM  A4RES1
14         SYS1.CMDLIB  A4RES1
..                 ...     ...
93        SYS1.SERBLPA  A4RES1
94       SYS1.SERBMENU  A4RES1
95       SYS1.SERBPENU  A4RES1
96          SYS1.SERBT  A4RES1
97       SYS1.SERBTENU  A4RES1

[88 rows x 2 columns]

From this we can see that 88 (out of 98) rows were displayed: rows 0, 1, 12, 13, … but not rows 2, 3, …

What does .loc do?

My interpretation of this (which I haven’t seen documented)

If there is one parameter, this is a list of the columns you want.

If there are two parameters, the second is the list of the columns you want displayed. The first parameter is conceptually a list of True or False, with one value per row, saying whether the row should be selected or not. So for

print(opd.loc[opd["VOLSER"].str.startswith("A4"),["DSN","VOLSER",]])

opd["VOLSER"].str.startswith("A4")

says take the column called VOLSER, treat each value as a string, and if it starts with the string "A4" return True, else return False. This returns one entry per row.
The opd.loc[opd["VOLSER"].str.startswith("A4"), ...] then selects the rows.
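
A small self-contained illustration of this boolean-mask idea (the two-row data frame below is made up):

import pandas as pd

opd = pd.DataFrame({"DSN": ["SYS1.LINKLIB", "CBC.SCCNCMP"],
                    "VOLSER": ["A4RES1", "C4RES1"]})
mask = opd["VOLSER"].str.startswith("A4")   # one True/False value per row
print(mask.tolist())                        # [True, False]
print(opd.loc[mask, ["DSN", "VOLSER"]])     # only the rows where mask is True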

You can select rows and columns

print(opd.loc[opd["VOLSER"].str.startswith("A4"),["DSN","VOLSER","63bit alloc space KB",]])

You can process the data, such as sort

The following statement extracts columns from the original data, sorts the data, and creates a new data frame. The new data frame is then printed.

sdata= opd[["DSN","VOLSER","63bit alloc space KB",]].sort_values(by=["63bit alloc space KB","DSN"], ascending=False)
print(sdata)

This gave

                   DSN  VOLSER  63bit alloc space KB
2          CBC.SCCNCMP  A4RES1                241043
35         SYS1.MACLIB  A4RES1                210664
36        SYS1.LINKLIB  A4RES1                166008
90       SYS1.SEEQINST  A4RES1                103534
42        SYS1.SAMPLIB  A4RES1                 82617
..                 ...     ...                   ...
62       SYS1.SBPXTENU  A4RES1                    55
51        SYS1.SBDTMSG  A4RES1                    55
45        SYS1.SBDTCMD  A4RES1                    55
12        SYS1.ADFMAC1  A4RES1                    55
6        FFST.SEPWMOD3  A4RES1                    55

[98 rows x 3 columns]

showing all the rows, and the three columns, which had been copied to the sdata data frame.

Saving data

Reading an external file and processing the data into Python arrays took an order of magnitude longer than processing it in Pandas.

You should consider a two step approach to looking at data

  • Extract the data and export it in an accessible format, such as pickle or JSON. While getting this part working, use only a few rows of data. Once it works, you can process all of the data.
  • Do the analysis using the exported data.

Export the data

You should consider externalising the data in JSON or pickle format, for example:

import pickle

# write out the data to a file
fPickle = open('pickledata', 'wb')
# source, destination
pickle.dump(opd, fPickle)
fPickle.close()

Import and do the analysis


# and read it in
fPickle = open('pickledata', 'rb')
opd = pickle.load(fPickle)
fPickle.close()
print(opd)

Processing multiple data sources as one

If you have multiple sets of data, for example for Monday, Tuesday, Wednesday, etc you can use

week = pd.concat([monday, tuesday, wednesday, thursday, friday])

Processing data within fields

Within my data, I have a field with information like

                  DSN  VOLSER         FormatType
0   SYS1.VVDS.VC4RES1  C4RES1                 []
1  SYS1.VTOCIX.C4RES1  C4RES1            [Fixed]
2         CBC.SCCNCMP  C4RES1  [Fixed, Variable]
3         CBC.SCLBDLL  C4RES1  [Fixed, Variable]
4        CBC.SCLBDLL2  C4RES1  [Fixed, Variable]

Where the data under FormatType is a list. You can reference elements in a list.

For example

x =  data.FormatType.apply(lambda x: 'Variable' in x)
print(x)

gives

0    False
1    False
2     True
3     True
4     True

The command

print(data.loc[ data.FormatType.apply(lambda x: 'Blocked' in x)])

gives

            DSN  VOLSER         FormatType
2   CBC.SCCNCMP  C4RES1  [Fixed, Variable]
3   CBC.SCLBDLL  C4RES1  [Fixed, Variable]
4  CBC.SCLBDLL2  C4RES1  [Fixed, Variable]

Basic operations on columns

You can do basic operations on columns such as

print(dataset[["CountIO","CacheHits"]].sum())

The sum() function adds up the specified columns (count(), max() etc. work similarly).

This gave

[361 rows x 10 columns]
CountIO      74667.0
CacheHits     1731.0
dtype: float64

An operation like

print(dataset.sum())

would have totalled all of the columns, including some where a total is meaningless, for example a column holding a maximum value.

Doing aggregation, count, sum, maximum, minimum etc.

Simple aggregation

You can aggregate data

# Extract just the fields of interest
d = dataset[["DSN","CountIO","CacheHits"]]
print(d.groupby("DSN").sum())

Gave

                                          CountIO  CacheHits
DSN
ADCD.Z31B.PARMLIB                            68.0       60.0
ADCD.Z31B.PROCLIB                            66.0       66.0
ADCD.Z31B.VTAMLST                           141.0      141.0
COLIN.TCPPARMS                                4.0        4.0
FEU.Z31B.PARMLIB                              4.0        0.0
IXGLOGR.ATR.S0W1.RM.DATA.A0000000.DATA        4.0        0.0
SYS1.DAE                                      0.0        0.0
SYS1.DBBLIB                                 974.0      932.0

More complex aggregation

The .agg() function gives you much more control over what, and how, you want to process the data.

print(d.groupby("DSN").agg({'DSN' : ['count'], 'CountIO' : ['sum','max'],"CacheHits": ["sum"]}))

gave

                                        DSN CountIO        CacheHits
                                      count     sum    max       sum
DSN
ADCD.Z31B.PARMLIB                        19    68.0    7.0      60.0
ADCD.Z31B.PROCLIB                        30    66.0    3.0      66.0
ADCD.Z31B.VTAMLST                         6   141.0   41.0     141.0
COLIN.TCPPARMS                            2     4.0    3.0       4.0
FEU.Z31B.PARMLIB                          1     4.0    4.0       0.0
IXGLOGR.ATR.S0W1.RM.DATA.A0000000.DATA    4     4.0    1.0       0.0
SYS1.DAE                                  1     0.0    NaN       0.0
SYS1.DBBLIB                               2   974.0  932.0     932.0

Notes:

  • The columns are not in the order I specified. It is hard to see which field max applies to.
  • There is a Not a Number (NaN) in one of the values. You need to allow for this.
  • In the simple case, .sum() by default tries to sum all of the columns. Using .agg you can specify which columns you want to process.