Getting SSH to work to z/OS

I have two versions of z/OS, old and new(!). I had problems getting ssh to work because of key problems.

The problem

I tried to update my laptop key to the server

ssh-copy-id colin@10.1.1.2

This gave

/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed

/usr/bin/ssh-copy-id: ERROR: @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
ERROR: @ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @
ERROR: @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
ERROR: IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
ERROR: Someone could be eavesdropping on you right now (man-in-the-middle attack)!
ERROR: It is also possible that a host key has just been changed.
ERROR: The fingerprint for the ED25519 key sent by the remote host is
ERROR: SHA256:2mUOVfdSedJVQIzZiGsRkOe9Vkc1bkyuDNp5H+VrZ98.
ERROR: Please contact your system administrator.
ERROR: Add correct host key in /home/colin/.ssh/known_hosts to get rid of this message.
ERROR: Offending ED25519 key in /home/colin/.ssh/known_hosts:1
ERROR: remove with:
ERROR: ssh-keygen -f '/home/colin/.ssh/known_hosts' -R '10.1.1.2'
ERROR: Host key for 10.1.1.2 has changed and you have requested strict checking.
ERROR: Host key verification failed.

Searching the internet I got suggestions saying “delete the old line from the file”. I didn’t want to do this because it meant I would not be able to go back to the old system and work as before.

Solutions

I edited /home/colin/.ssh/known_hosts and commented out line 1 by putting a # at the front (the :1 above means the first line). I repeated the command and it reported the same message for line :2. I commented that out as well.

I got further

colin@ColinNew:~$ ssh-copy-id colin@10.1.1.2
The authenticity of host '10.1.1.2 (10.1.1.2)' can't be established.
ED25519 key fingerprint is SHA256:2mUOVfdSedJVQIzZiGsRkOe9Vkc1bkyuDNp5H+VrZ98.
This key is not known by any other names.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 2 key(s) remain to be installed -- if you are prompted now it is to install the new keys
colin@10.1.1.2: Permission denied (publickey,hostbased).

I had to start the SYSLOGD on z/OS to capture the output from SSHD.

In the SSHD log under /var/log (yours may be different) it said

FOTS2307 User COLIN from 10.1.0.2 not allowed because not listed in AllowUsers 

In my SSHD config file /etc/ssh/sshd_config I had

# Allow specific user IDs 
AllowUsers IBMUSER

I added COLIN to the list and restarted SSHD. (I do not know how to refresh SSHD)

This time the error log had

trying public key file /u/tmp/zowet/colin/.ssh/authorized_keys 
Could not open authorized keys '/u/tmp/zowet/colin/.ssh/authorized_keys': ...

I fixed this, tried to logon, and this time it worked.

On Linux, I edited /home/colin/.ssh/known_hosts and un-commented the lines I had commented out before.
I tried the ssh command again, and it still worked!

Python calling C functions – passing structures

I’ve written how you can pass simple data from Python to a C function, see Python calling C functions.

This article explains how you can pass structures and pointers to buffers in the Python program. It extends Python calling C functions, and allows you to move logic from the C program to a Python program.

Using complex arguments

The examples in Python calling C functions were for using simple elements, such as Integers or strings.

I have a C structure I need to pass to a C function. The example below passes in an eye catcher, some lengths, and a buffer for the C function to use.

The C structure

typedef struct querycb {                                                         
char Eyecatcher[4]; /* Eye catcher offset 0 */
uint16_t Length; /* Length of the block 4 */
char Rsvd1[1]; /* Reserved 6 */
uint8_t Version; /* Version number 7 */
char Flags[2]; /* Flags 8 */
uint16_t Reserved8; // 10
uint32_t Count; // number returned 12
uint32_t lBuffer; // length of buffer 16
uint32_t Reservedx ; // 20
void *pBuffer; // 24
} querycb;

The Python code

import ctypes
from struct import pack

# create the variables
eyec = "EYEC".encode("cp500") # char[4] eye catcher
l = 32 # uint16_t
res1 = 0 # char[1]
version = 1 # uint8_t -same as a char
flags = 0 # char[2]
res2 = 0 # uint16_t
count = 0 # uint32_t
lBuffer = 4000 # uint32_t
res3 = 0 # uint32_t
# pBuffer # void *
# allocate a buffer for the C program to use and put some data
# into it
pBuffer = ctypes.create_string_buffer(b'abcdefg',size=lBuffer)
# cast the pBuffer so it is a void *
pB = ctypes.cast(pBuffer, ctypes.c_void_p)
# use the struct.pack function. See @4shbbhhiiiP below
# @ native byte order and alignment
# 4s a 4 byte string, the eye catcher
# h half word
# bb two char fields, res1 and version
# hh two half words, flags and res2
# iii three integer fields: count, lBuffer and res3
# P void * pointer
# Note pB is a ctypes object, we need its value, so pB.value
p = pack("@4shbbhhiiiP", eyec,l,res1,version,flags,
res2,count,lBuffer,res3,pB.value)

#create first parm
p1 = ctypes.c_int(3) # pass in the integer 3 as an example
# create second parm
p2 = ctypes.cast(p, ctypes.c_void_p)

# invoke the function

retcode = lib.conn(p1,p2)
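
One way (my own suggestion, not part of the original code) to catch layout mistakes early is to check that the size struct.pack produces matches the size of the C structure:

from struct import calcsize

# On a 64-bit build the structure is 32 bytes: 24 bytes of fields
# followed by an 8-byte pointer.
assert calcsize("@4shbbhhiiiP") == 32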

The C program

int conn(int * p1, char * p2) 
// int conn(int max,...)
{
typedef struct querycb {
char Eyecatcher[4]; /* Eye catcher 0 */
uint16_t Length; /* Length of the block 4 */
char Rsvd1[1]; /* Reserved 6 */
uint8_t Version; /* Version number 7 */
char Flags[2]; /* Flags 8 */
uint16_t Reserved8; // 10
uint32_t Count; // number returned 12
uint32_t lBuffer; // length of buffer 16
uint32_t Reservedx ; // 20
void *pBuffer; // 24
} querycb;

querycb * pcb = (querycb * ) p2;

printf("P1 %i\n",*p1);
printHex(stdout,p2,32);
printf("Now the structure\n")
printHex(stdout,pcb -> pBuffer,32);
return 0 ;
}

The output

P1 3
00000000 : D8D9D7C2 00200001 00000000 00000000 ..... .......... EYEC............
00000010 : 00000FA0 00000000 00000050 0901BCB0 ...........P.... ...........&....
Now the structure
00000000 : 61626364 65666700 00000000 00000000 abcdefg......... /...............
00000010 : 00000000 00000000 00000000 00000000 ................ ................

Where

  • EYEC is the passed in eye catcher
  • 00000FA0 is the length of 4000
  • 00000050 0901BCB0 is the 64-bit address of the buffer
  • abcdefg is the data used to initialise the buffer

Observations

It took me a couple of hours to get this to work. I found it hard to get the cast and the ctypes functions to work successfully. There may be a better way of coding it; if so please tell me. The code works, which is the objective – but there may be better, more correct, ways of doing it.

Benefits

By using this technique I was able to move the code which sets up the structure needed by the z/OS service from my C program into Python. My C program just parses the input parameters, sets up the linkage for the z/OS service, and invokes the service.

Of course I did not have the constants available from the C header file for the service, but that's a different problem.

Python safely iterating

I was using a Python program to access a z/OS service, and found there were times when my code did not clean up and close the resource.

It took me an afternoon to find out how to do it. I found pyzfile by daveyc an excellent example of how to handle these advanced Python topics.

pyzfile example

The documentation has

from pyzfile import *
try:
    with ZFile("//'USERID.CNTL(JCL)'", "rb,type=record", encoding='cp1047') as file:
        for rec in file:
            print(rec)
except ZFileError as e:
    print(e)

Breaking this down

Understanding the “with”

try:
    with ZFile("//'USERID.CNTL(JCL)'", "rb,type=record", encoding='cp1047') as file:
        ...
        do something with file
        ...
except ZFileError as e:
    print(e)

When the with ZFile(…) as file: is executed the code conceptually does

  • standard set up processing
  • open the file and return the handle
  • do processing using the file handle
  • when ever it leaves the with code section perform the close activity

Note: This could have been done with

try:
    open the file
    ...
    do something
    ...
except:
    ...
finally:                     # do this every time
    if the file was opened:
        close the file

but this is not quite as tidy and compact as the with syntax.

In more detail…

  • The def __init__(self,..): method is invoked and passed the parameters. It saves parameters using statements like self.p1
  • The __enter__(self): method is invoked, passing the instance data (self). It has no other parameters.
    • In the pyzfile, the code issues return self._open(). This invokes the function _open to open the data set.
  • When the with processing completes, it invokes the function __exit__(self, exc_type, exc_value, exc_traceback): This is invoked whether the code returned normally, or got an exception.
    • In the pyzfile code, this executes self.close(). So however the “with” processing ends, the close is always executed (see the sketch below).
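
Putting these pieces together, here is a minimal sketch (my own illustration, not pyzfile itself) of a class that can be used with “with”:

class Resource:
    def __init__(self, name):
        self.name = name                  # save the parameters

    def __enter__(self):
        print("open", self.name)          # acquire the resource
        return self                       # this is what "as" binds to

    def __exit__(self, exc_type, exc_value, exc_traceback):
        print("close", self.name)         # always runs, even after an exception
        return False                      # False means re-raise any exception

with Resource("demo") as r:
    print("using", r.name)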

Handling errors

I’ve seen that, when using the “with” clause, people tend to raise exceptions when problems are found.

For example, in the pyzfile code there is

class ZFileError(Exception):
    """ ZFile exception """
    def __init__(self, message: str, amrc: dict = None):
        self.message = message
        self.amrc = amrc
        if amrc is None:
            self.amrc = {}
        super().__init__(self.message)

    def __str__(self) -> str:
        return self.message

    def amrc(self):
        """
        Returns the amrc dict at the time of error.

        :return: The ``__amrc`` structure at the time of error.
        """
        return self.amrc

class ZFile:
    ...
    def _open(self):
        ...
        self.handle = open...
        if not self.handle:
            raise ZFileError(f"Error opening file '{self.filename}': "
                             f"{self.lib.zfile_strerror().decode('utf-8')}")
        return self

Understanding the “for”

The code above had

    with ZFile("//'USERID.CNTL(JCL)'", "rb,type=record",encoding='cp1047') as file:
for rec in file:
print(rec)

How does the “for” work?

The key to this code is this pair of functions:

##################################################
# Iterators
##################################################
def __iter__(self):
    return self

def __next__(self):
    ret = self.read()
    if not ret:
        raise StopIteration
    return ret

When the for statement is processed, it calls the __next__ function. This does the work of getting the next record and returning it.
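
As an illustration (my own sketch, not part of pyzfile), the for loop is roughly equivalent to:

it = iter(file)            # calls file.__iter__()
while True:
    try:
        rec = next(it)     # calls it.__next__()
    except StopIteration:  # raised when there are no more records
        break
    print(rec)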

There is a lot of confusing documentation about iterators, iteration and iterables. Let’s see if my description helps clarify or just adds more confusion.

Something is iter-able if you can do iteration on it, where iteration means taking each element in turn.

In Python a list is iter-able

for l in [0, 1, 2, 3]:
    print(l)

will iterate over the list and return each element of the list

0
1
2
3

Records in a file are a bit more abstract: you cannot see the whole file, but you can ask for the next record, and move through the file until there are no more records.

An iterator is the mechanism by which you iterate. Think of it as a function. The Python documentation is pretty clear.

Most people define

def __iter__(self):
    return self

For most people, just specify this. The PhD class may use something different.

The mechanism of “for” uses the __next__ function

def __next__(self):
    ret = self.read()
    if not ret:
        raise StopIteration
    return ret

This obtains the next element of data. If there are no more elements, it raises the StopIteration exception.

If you do not handle the StopIteration exception, then Python handles it for you and leaves the processing loop.

Conclusion

With both of these techniques “with” and “for” I could extract records from a z/OS resource.

I’ve used the “with” and “for” with yield to hide implementation detail

# create the function to read the file
def readfile(name):
    try:
        with ZFile(name, "rb,type=record,noseek") as file:
            for rec in file:
                yield rec
    except ZFileError as e:
        print(e)

# process the file using for ... readfile()
def reader(...):
    for line in readfile("//'IBMUSER.RMF'"):
        # do something with the data

Many is so last year – logstreams is the way to go.

I’ve been looking into the SMF real-time interface, where an application program can get records directly from SMF, and does not have to post-process SMF datasets or log streams. To use the real-time support, SMF needs to use log streams.

What is SMF?

SMF is System Management Facility. z/OS and the subsystems can write data to SMF for post processing. Typical records are audit and accounting records from z/OS, RACF or CICS, changes to SMS, and changes to resources. Each product has one or more SMF record-type numbers allocated to it. Within each SMF record type you can have sub-types, for example the z/OS SMF 30 record has a sub-type for job start, another sub-type for job step end, and another sub-type for job end.

Display SMF options

The command

d smf

gave

   NAME                VOLSER SIZE(BLKS) %FULL  STATUS
   P-SYS1.S0W1.MAN1    B3SYS1       7200     0  ALTERNATE
   S-SYS1.S0W1.MAN3    USER04      72000     1  ACTIVE

showing which datasets are being used, and giving information about them.

The command

d smf,o

displays all of the SMF options, and where they came from – for example a parmlib member, or from the SETSMF command.

IEE967I 08.44.41 SMF PARAMETERS 489                
MEMBER = SMFPRM00
...
SYNCVAL(00) -- DEFAULT
DUMPABND(RETRY) -- DEFAULT
INMEM(IFASMF.COLIN,TYPE(30,42),RESSIZMAX(0128M)) -- PARMLIB
SUBSYS(STC,NOTYPE(14:19,62:69,99)) -- SYS
...
STATUS(010000) -- PARMLIB
INTVAL(01) -- PARMLIB
MAXDORM(0001) -- PARMLIB
REC(PERM) -- PARMLIB
NOPROMPT -- PARMLIB
DSNAME(SYS1.S0W1.MAN3) -- PARMLIB
DSNAME(SYS1.S0W1.MAN1) -- PARMLIB

ACTIVE -- PARMLIB

The old way of recording SMF data

SMF had a set of datasets it would use in turn. Typically these were named like SYS1.MANX, SYS1.MANY, or SYS1.PROD.MAN2 etc. When the active dataset filled up, SMF would switch to the next empty dataset. You (or automation) then run a job to either copy the records to another dataset, or post-process the records, and then clear the dataset for reuse.

As computers got bigger, more work was done, more records were written and writing records to disk could not keep up.

Logstreams is the way forward.

A log stream is a stream of data which can be written to a Coupling Facility(CF) structure, or to a dataset on disk. Typically writing to a CF is faster than writing to disk.

With MANx datasets, all records were written to one dataset. With log streams, you can configure SMF to have multiple log streams, and you configure which record type(s) go to which log stream. This means you can have CICS records going to the “CICS log stream”, RACF records going to the “RACF log stream”, and the remainder going to a default log stream.

Having multiple logstreams means data can be written to many log streams concurrently, and so avoids the bottleneck of writing to a MANx dataset.

Setting up security profiles

It took me several attempts to configure the security profiles.

Be able to define and delete logstreams

//IBMUSER1 JOB   1,MSGCLASS=H 
//KEYCERTS EXEC PGM=IKJEFT01
//SYSPRINT DD SYSOUT=*
//SYSTSPRT DD SYSOUT=*
//SYSTSIN DD *
RDEFINE FACILITY MVSADMIN.LOGR UACC(NONE)
permit MVSADMIN.LOGR class(FACILITY) -
access(control) ID(SYS1)
setr raclist(facility) refresh

Define individual logstreams

RDEFINE LOGSTRM IFASMF.** UACC(NONE) 
PERMIT IFASMF.** class(LOGSTRM ) -
access(ALTER ) ID(SYS1)
setr raclist(logstrm ) refresh

Giving SMF access to the logstreams

RDEFINE FACILITY IFA.IFASMF.* UACC(READ)
setr raclist(facility) refresh

Setting up logstreams

You need to set up at least one log stream. It is easy to define more and change the SMF configuration.

I used the define logstream command

//IBMLOG JOB 1,MSGCLASS=H 
//LOGDEF EXEC PGM=IXCMIAPU,REGION=4M
//SYSPRINT DD SYSOUT=*
//SYSIN DD *
DATA TYPE(LOGR) REPORT(YES)

DELETE LOGSTREAM NAME(IFASMF.DEFAULT)
DEFINE LOGSTREAM NAME(IFASMF.DEFAULT)
DESCRIPTION(SMF_LOGSTREAM)
MODEL(NO)
DASDONLY(YES)
STG_SIZE(65532)
LS_SIZE(15000)
HLQ(IXGLOGR)
HIGHOFFLOAD(80)
LOWOFFLOAD(0)
AUTODELETE(YES) /* DELETE OPTION */
OFFLOADRECALL(NO)
MAXBUFSIZE(65532)
DIAG(NO)
RETPD(1) /* DELETE 1 DAYS */
//

I also defined a log stream IFASMF.COLIN.

With the HLQ(IXGLOGR) definition, behind the logstreams were data sets like

Dataset                             Volume
IXGLOGR.IFASMF.COLIN.ADCDPL         *VSAM*
IXGLOGR.IFASMF.COLIN.ADCDPL.DATA    USER05
IXGLOGR.IFASMF.COLIN.A0000000       *VSAM*
IXGLOGR.IFASMF.COLIN.A0000000.DATA  USER04

Configure SMF

I created a member SMFPRMLS in a user.parmlib

ACTIVE                          /* ACTIVE SMF RECORDING             */ 
DSNAME(SYS1.&SYSNAME..MAN1,
SYS1.&SYSNAME..MAN3)
RECORDING(LOGSTREAM)
NOPROMPT /* DO NOT PROMPT OPERATOR */
REC(PERM) /* TYPE 17 PERM RECORDS ONLY */
MAXDORM(0001) /* WRITE IDLE BUFFER AFTER 1 SEC */
INTVAL(01) /* EVERY MINUTE */
STATUS(010000) /* WRITE SMF STATS AFTER 1 HOUR */
JWT(0400) /* 522 AFTER 30 MINUTES */
SID(&SYSNAME(1:4))
LISTDSN /* LIST DATA SET STATUS AT IPL */
DEFAULTLSNAME(IFASMF.DEFAULT)
LSNAME(IFASMF.COLIN,TYPE(30,42))

AUTHSETSMF
SYS(NOTYPE(14:19,62:69,99),EXITS(IEFU83,IEFU84,IEFACTRT,
IEFUSI,IEFUJI,IEFU29),NOINTERVAL,NODETAIL)
SUBSYS(STC,EXITS(IEFU29,IEFU83,IEFU84,IEFUJP,IEFUSO))
INMEM(IFASMF.COLI2,RESSIZMAX(128M),TYPE(30,42))

I activated it using the command

t smf=ls

When this failed, because my log stream definitions were not correct, the SMF collection defaulted to using the specified SYS1.MANx datasets.
The important bits of the SMFPRMxx file are

  • RECORDING(LOGSTREAM) – use logstreams rather than datasets
  • LSNAME(IFASMF.COLIN,TYPE(30,42)) for record types 30 and 42 write them to this log stream
  • DEFAULTLSNAME(IFASMF.DEFAULT) If there is no LSNAME for a record type – then write them to this log stream

You can issue setsmf commands to override the existing definition.

Processing SMF records

For SMF datasets

Use JCL like the following:

// SET SMFPDS=SYS1.S0W1.MAN1                
// SET SMFSDS=SYS1.S0W1.MAN3
//SMFDUMP EXEC PGM=IFASMFDP
//DUMPINA DD DSN=&SMFPDS,DISP=SHR,AMP=('BUFSP=65536')
//DUMPINB DD DSN=&SMFSDS,DISP=SHR,AMP=('BUFSP=65536')
//DUMPOUT DD DISP=(NEW,CATLG),DSN=&RMF,SPACE=(CYL,(10,10))
//* DCB=(LRECL=32760,RECFM=VBS)
//* DCB=(BLKSIZE=0,LRECL=32760,RECFM=VBS)
//*UMPOUT DD DISP=SHR,DSN=IBMUSER.RMF,SPACE=(CYL,(1,1))
//SYSPRINT DD SYSOUT=*
//SYSIN DD *
INDD(DUMPINA,OPTIONS(DUMP))
INDD(DUMPINB,OPTIONS(DUMP))
OUTDD(DUMPOUT,TYPE(42,80,30))
RELATIVEDATE(BYDAY,0,1)
START(0000)
END(2300)
/*

This processes records within the specified time range in the datasets.

For log streams

Use JCL like the following – using PGM=IFASMFDL

//IBMSMFL  JOB 1,MSGCLASS=H 
//* DUMP THE SMF DATASETS
// SET SMF=IBMUSER.SMF
//*
//S1 EXEC PGM=IEFBR14
//DUMPOUT DD DISP=(MOD,DELETE),DSN=&SMF,SPACE=(CYL,(1,1))
//*
//SMFDUMP EXEC PGM=IFASMFDL,REGION=0M
//DUMPOUT DD DISP=(NEW,CATLG),DSN=&SMF,SPACE=(CYL,(10,10))
//SYSPRINT DD SYSOUT=*
//SYSIN DD *
LSNAME(IFASMF.COLIN,OPTIONS(DUMP))
OUTDD(DUMPOUT,TYPE(30))
RELATIVEDATE(BYDAY,0,1)
START(0000)
END(2300)
/*
//

When you specify a date range, it will read not only the active log stream datasets, but any archive ones it created, and which are available.

Display SMF

With log streams the D SMF command gave

   LOGSTREAM NAME               BUFFERS        STATUS
   A-IFASMF.DEFAULT                 774        CONNECTED
   A-IFASMF.COLIN                   584        CONNECTED
   A-IFASMF.INMEM                     0        IN-MEMORY

Dumping SMF data – last n day’s worth

For many years, I’ve been processing SMF data, and using the date option like DATE(2026012,2027000). Every day, I had to change it to match today’s date, and submit the job.

I’ve just discovered you can give relative dates, for example RELATIVEDATE(BYDAY,0,1), which says go back 0 days and include 1 day – so just process today.

The output listing has, for today’s date day 19 of 2026:

IFA834I RELATIVEDATE PARAMETER RESULTS IN START DATE 2026.019, END DATE 2026.019
IFA836I RELATIVEDATE RANGE EXTENDS INTO FUTURE, END DATE AND TIME USED IS 2026.019 11:29

You can specify BYDAY, BYWEEK, and BYMONTH.

This function has been around for years! I wonder how much time I’ve wasted on doing it the old way.

The Python interface to RACF is great.

The Python package pysear for working with RACF is great. The source is on github, and the documentation starts here. It is well documented, and there are good examples.

I’ve managed to do a lot of processing with very little of my own code.

One project I’ve been meaning to do for some time is to extract the contents of a RACF database, compare them with a different database, and show the differences. IBM provides a batch program and a very large Rexx exec; this has some bugs and is not very nice to use. There is a Rexx interface, which worked, but I found I was writing a lot of code. Then I found the pysear code.

Background

The data returned for userids (and other types of data) have segments.
You can display the base segment for a user.

tso lu colin

To display the tso base segment

tso lu colin tso

Field names returned by pysear have the segment name as a prefix, for example base:max_incorrect_password_attempts.

My first query

What are the active classes in RACF?

See the example.

from sear import sear
import json
import sys
result = sear(
    {
        "operation": "extract",
        "admin_type": "racf-options"
    },
)
json_data = json.dumps(result.result   , indent=2)
print(json_data)

For error handling, see the Error handling section below.

This produces output like

{
  "profile": {
    "base": {
      "base:active_classes": [
        "DATASET",
        "USER", ...
      ],
      "base:add_creator_to_access_list": true,
      ...
      "base:max_incorrect_password_attempts": 3,
      ...
}

To process the active classes one at a time you need code like

for ac in result.result["profile"]["base"]["base:active_classes"]:
    print("Active class:",ac)

The returned attributes are called traits. See here for the traits for RACF options. The traits show

Trait                  base:max_incorrect_password_attempts
RACF Key               revoke
Data Types             String
Operators Allowed      "set", "delete"
Supported Operations   "alter", "extract"

Because this attribute is a single-valued object, you can set it or delete it.

You can use this attribute for example

result = sear(
    {
        "operation": "alter",
        "admin_type": "racf-options",
        "traits": {
            "base:max_incorrect_password_attempts": 5,
        },
    },
)

The trait "base:active_classes" is a list of classes ["DATASET", "USER", …]

The trait is

Trait                  base:active_classes
RACF Key               classact
Data Types             string
Operators Allowed      "add", "remove"
Supported Operations   "alter", "extract"

Because it is a list, you can add or remove an element; you do not use set or delete, which would replace the whole list.

Some traits, such as use counts, have Operators Allowed of N/A. You can only extract and display the information.

My second query

What are the userids in RACF?

The traits are listed here, and code examples are here.

I used

from sear import sear
import json

# get all userids beginning with ZWE
users = sear(
    {
        "operation": "search",
        "admin_type": "user",
        "userid_filter": "ZWE",
    },
)
profiles  = users.result["profiles"]
# Now process each profile in turn.
# because this is for userid profiles we need admin_type=user and userid=....
for profile in profiles:
    user = sear(
       {
          "operation": "extract",
          "admin_type": "user",
          "userid": profile,
       }, 
    )
    segments = user.result["profile"]
    #print("segment",segments)
    for segment in segments:   # eg base or omvs
      for w1,v1 in segments[segment].items():
          #print(w1,v1)
          #for w2,v2 in v1.items():
          #  print(w1,w2,v2 )
          json_data = json.dumps(v1  , indent=2)
          print(w1,json_data)

This gave

===PROFILE=== ZWESIUSR
base:auditor false
base:automatic_dataset_protection false
base:create_date "05/06/20"
base:default_group "ZWEADMIN"
base:group_connections [
  {
    ...
    "base:group_connection_group": "IZUADMIN",
    ...
    "base:group_connection_owner": "IBMUSER",
    ...
},
{
    ...
    "base:group_connection_group": "IZUUSER",
   ...
}
...
omvs:default_shell "/bin/sh"
omvs:home_directory "/apps/zowe/v10/home/zwesiusr"
omvs:uid 990017
===PROFILE=== ZWESVUSR
...

Notes on using search and extract

If you use “operation”: “search” you need a ….._filter. If you use extract you use the data type directly, such as “userid”:…

Processing resources

You can process RACF resources, for example the OPERCMDS profiles for MVS.DISPLAY commands.

The sear command needs a "class": … value, for example

result = sear(
    {
        "operation": "search",
        "admin_type": "resource",
        "class": "OPERCMDS",
        "resource_filter": "MVS.**",
    },
)
result = sear(
    {
        "operation": "extract",
        "admin_type": "resource",
        "resource": "MVS.DISPLAY",
        "class": "Opercmds",
    },
)

The value of the class is converted to upper case.

Changing a profile

To change a profile, for example to issue a PERMIT command, use code like:

from sear import sear
import json

result = sear(
    {   "operation": "alter",
        "admin_type": "permission",
        "resource": "MVS.DISPLAY.*",
        "userid": "ADCDG",
        "traits": {
          "base:access": "CONTROL"
        },
        "class": "OPERCMDS"

    },
)
json_data = json.dumps(result.result   , indent=2)
print(json_data)

The output was

{
  "commands": [
    {
      "command": "PERMIT MVS.DISPLAY.* CLASS(OPERCMDS)ACCESS (CONTROL) ID(ADCDG)",
      "messages": [
        "ICH06011I RACLISTED PROFILES FOR OPERCMDS WILL NOT REFLECT THE UPDATE(S) UNTIL A SETROPTS REFRESH IS ISSUED"
      ]
    }
  ],
  "return_codes": {
    "racf_reason_code": 0,
    "racf_return_code": 0,
    "saf_return_code": 0,
    "sear_return_code": 0
  }
}

Error handling

Return codes and error messages

There are two layers of error handling.

  • Invalid requests – problems detected by pysear
  • Non-zero return codes from the underlying RACF code.

If pysear detects a problem it returns it in

result.result.get("errors") 

For example you have specified an invalid parameter such as “userzzz“:”MINE”

If this field is not present, then the request was passed to the RACF service, which returns multiple values. See IRRSMO00 return and reason codes. There will be values for:

  • SAF return code
  • RACF return code
  • RACF reason code
  • sear return code.

If the RACF return code is zero then the request was successful.

To make error handling easier – and have one error handler for all requests – I used


try:
    result = try_sear(search)
except Exception as ex:
    print("Exception-Colin Line112:", ex)
    quit()

Where try_sear was

def try_sear(data):
    # execute the request
    result = sear(data)
    if result.result.get("errors") != None:
        print("Request:", result.request)
        print("Error with request:", result.result["errors"])
        raise ValueError("errors")
    elif (result.result["return_codes"]["racf_reason_code"] != 0):
        rcs = result.result["return_codes"]
        print("SAF Return code", rcs["saf_return_code"],
              "RACF Return code", rcs["racf_return_code"],
              "RACF Reason code", rcs["racf_reason_code"],
              )
        raise ValueError("return codes")
    return result

Overall

This interface is very easy to use.
I use it to extract definitions from one RACF database and save them as JSON files, repeat with a different (historical) RACF database, then compare the two sets of JSON files to see the differences.

Note: The sear command only works with the active database, so I had to make the historical database active, run the commands, and switch back to the current database.

Using ISPF edit macros to display the junk in a catalog

You can use IDCAMS DCOLLECT to collect SMS information about data sets on your z/OS system. This gives lots of information about a dataset, size, creation date, SMS attributes etc.

With processing you can get reports on datasets, volumes, and what is using all the space. This allows you to delete datasets which are no longer needed.

This does not help when you are trying to clean out your catalogs and remove stuff which should not be in them. For example, there are usually entries in a catalog which should really be in user catalogs.

I could not find tools to help me with this. I fell back to using an ISPF edit macro to process a LISTCAT listing and extract the relevant data. It is not difficult (once you know how) and it is quick and easy.

This blog post gives some examples of how you can use ISPF edit macros to process data in data sets or spool.

The output from the short Rexx exec is

TCPIP.ETC.SERVICES             1998.284 B3SYS1
SYS1.RACFDS                    1999.288 B3CFG1
SYS1.IPLPARM                   1999.288 B3SYS1
...
LOG.MISC                       2025.107 USER04
IBMUSER.S0W1.SPFTEMP3.CNTL     2026.002 USER07
IBMUSER.S0W1.SPFLOG1.LIST      2026.013 USER04
IBMUSER.SMF                    2026.013 USER07

With this I asked: what is LOG.MISC 2025.107 doing in the catalog? It is there because I did not have the controls in place to stop people putting datasets into the catalog.

Instead of just displaying the information, I could have had the exec create IDCAMS statements, for example to get a dataset recataloged or deleted, based on creation date or other information.

Get your LISTCAT listing

I used

//IBMLISC JOB 1,MSGCLASS=H 
// EXPORT SYMLIST=(*)
// SET CAT=&SYSVER.
//S1 EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN DD *,SYMBOLS=JCLONLY
LISTCAT NONVSAM CATALOG(CATALOG.&CAT..MASTER) ALL
/*
  • The // SET CAT=&SYSVER. gets a local copy of the system symbol &SYSVER. You can use the operator command D SYMBOLS to list all the system symbols defined. On my system &SYSVER is Z31B
  • In //SYSIN DD *,SYMBOLS=JCLONLY the SYMBOLS=JCLONLY says substitute variables in the following SYSIN data, using the JCL symbols. &CAT is Z31B, and so the catalog name becomes CATALOG.Z31B.MASTER. You cannot use &SYSVER directly in the SYSIN data.

Edit the listing

I used SDSF and the SE line command on the output of the LISTCAT. You get an ISPF edit session with the spool data.

Run the exec

I have a Rexx exec called LISTCATN in USER.Z31B.CLIST. I’ll describe it in sections below

Standard Rexx starting code

/* REXX */ 
/*
exec to extract NONVSAM records from a catalog listing
*/
ADDRESS ISPEXEC
'ISREDIT MACRO (parms) '

Use MACRO(parms) to get the parameters passed to the macro

Define parsing arguments

I define search arguments in a variable stem. This separates the data from the logic, and makes it easy to change or extend.

  data.1 = "NONVSAM 2 10" 
fcol.1 = 18
flen.1 = 48

data.2 = "CREATION 38 48 "
fcol.2 = 54
flen.2 = 8

data.3 = "VOLSER 8 15 "
fcol.3 = 27
flen.3 = 8

data.0 = 3
sortcols = "50 60"

Later there is code

  • do I = 1 to data.0 this processes each section in the stems
  • There is a find data.1, which substitutes to “find NONVSAM 2 10”. This says find the string NONVSAM in columns 2 to 10
  • If the find locates the string, the code retrieves the line. The code does a substring from fcol.1 for length flen.1 and saves the value
  • data.0 = 3 says there are three data sections.
  • sortcols = “50 60” is used at the end to sort the file by the date column.

Remove uninteresting records

  "ISREDIT autosave off     " 
"ISREDIT exclude all"
"ISREDIT find NONVSAM 2 10 ALL "
"ISREDIT find CREATION ALL "
"ISREDIT find VOLSER ALL "
if rc != 0 then data.0 = 2 /* ignore the volser */
"ISREDIT delete all x "
  • “ISREDIT autosave off ” I have this as standard in ISPF edit macros, basically it says do not save the data if I press PF3.
  • “ISREDIT exclude all” – exclude every line; the find commands below make the lines containing the found strings visible again
  • “ISREDIT find NONVSAM 2 10 ALL ” find these lines
  • “ISREDIT find CREATION ALL ”
  • “ISREDIT find VOLSER ALL ”
  • If volser was not found, then listcat wasn’t specified with the right statement, so do not try to process any VOLSER records
    • if rc != 0 then data.0 = 2 /* ignore the volser */
  • “ISREDIT delete all x ” delete all the records which are still excluded leaving only the records I searched for.

Process the records

do j = 1  by 1
  string = ""
  do i = 1 to data.0
    "ISREDIT find "data.i
    if rc <> 0 then leave
    "ISREDIT (f) = LINENUM .ZCSR "
    "ISREDIT (d) = LINE " f
    name = substr(d,fcol.i,flen.i) /* from col and length */
    string = string || " " || name
  end

  if rc <> 0 then leave
  out.j = string
end

This code uses the data in the variable stems defined higher up. It keeps the logic separate from the search data.

  • do j = 1 by 1 iterate through the whole file until the end of file
  • string = “” preset the output string
  • do i = 1 to data.0 for the records we specified
  • “ISREDIT find “data.i find it
  • if rc <> 0 then leave if not found then leave
  • “ISREDIT (f) = LINENUM .ZCSR ” Get the line number where the find found the data.
  • “ISREDIT ( d ) = LINE ” f get the line contents – getting the line number found in the previous step
  • name = substr(d,fcol.i,flen.i) /* from col and length */ extract the field of interest from the line
  • string = string || ” ” || name build up a string of the values found
  • end
  • if rc <> 0 then leave we got a not found, so end of file.
  • out.j = string save the data in a stem for processing below
  • end

Do something with the records

You can do processing on the data, for example create JCL to delete the dataset.

In this example I delete all records from the file, and insert the saved records

  "ISREDIT exclude all" 
"ISREDIT delete all x "
do i = 1 to j -1
v = out.i
"ISREDIT LINE_after .zcsr = (v)"
end
"ISREDIT sort " sortcols
exit
  • “ISREDIT delete all x “ delete all the lines in the file (they were all excluded by the previous command)
  • do i = 1 to j -1 we have a stem of the records we processed iterate over them
  • v = out.i make a copy of the data to make it easy for ISPF; ISPF only does simple variable substitution
  • “ISREDIT LINE_after .zcsr = (v)” insert after the current (last) line the value from v, which is the saved string.
  • end
  • “ISREDIT sort ” sortcols sort on the creation date
  • exit

The output

The output from this is the dataset name, the create date, and the volume it is on.

TCPIP.ETC.SERVICES             1998.284 B3SYS1
SYS1.RACFDS                    1999.288 B3CFG1
SYS1.IPLPARM                   1999.288 B3SYS1
...
IBMUSER.S0W1.SPFTEMP3.CNTL     2026.002 USER07
IBMUSER.S0W1.SPFLOG1.LIST      2026.013 USER04
IBMUSER.SMF                    2026.013 USER07

From the date information I can see which entries were due to me – because they were all after January 2025.

Different ways of processing records

Not every dataset has the same information. For example, after deleting uninteresting rows, the listing looks like:

NONVSAM ------- ADCD.DYNISPF.ISPPLIB 
DATASET-OWNER-----(NULL) CREATION--------2016.236
VOLSER------------B3SYS1 DEVTYPE------X'3010200F' FSEQN------------------0
NONVSAM ------- ADCD.WLM
DATASET-OWNER-----(NULL) CREATION--------2023.010
STORAGECLASS -----SCBASE MANAGEMENTCLASS---(NULL)
DATACLASS --------(NULL) LBACKUP ---0000.000.0000
VOLSER------------B3USR1 DEVTYPE------X'3010200F' FSEQN------------------0

The second dataset ADCD.WLM has SMS information, Storage Class, Management Class, and Data Class, which are not present with the first dataset ADCD.DYNISPF.ISPPLIB.

You could process this sequentially and have logic like…

If the row starts with

  • NONVSAM – then write out the previous information, get the dataset name, and start again
  • VOLSER – then parse the volser value
  • DATASET – then parse the creation date
  • STORAGECLASS – then parse the SC and MC values
  • DATACLASS – then parse the DC value

For example

"ISREDIT     (last)  = LINENUM .ZLAST" 
do j = 1  by 1  to last 
  "ISREDIT      ( d )  = LINE   " j 
  if substr(d,2,7) = "NONVSAM" then 
  do 
      count = count + 1 
      string =  dsn cd vol sc mc dc 
      sc = "        " 
      mc = "        " 
      dc = "        " 
      vol= "      " 
      dsn= "      " 
      cd = "      " 
      out.count = string 
      say string 
      /* do the next */ 
      dsn = substr(d,18,48) 
  end 
  else 
  if substr(d,9,7) = "DATASET" then cd = substr(d,54,8) 
  else 
  if substr(d,9,6) = "VOLSER" then vol = substr(d,27,6) 
  else 
  if substr(d,9,6) = "STORAG" then 
  do 
     sc = substr(d,27,8) 
     mc = substr(d,56,8) 
  end 
  else 
  if substr(d,9,6) = "DATACL" then vol = substr(d,27,8) 
end 

This gives output like

NFS.CNTL                       2000.336 B3SYS1
SYS1.RACFDS.BACKUP             2001.164 B3CFG1
SYS1.UADS                      2003.137 B3CFG1
NETVIEW.ADCD.NTVTABS           2009.027 B3USR1 SCBASE   (NULL)
SYT1.ZOS.CNTL                  2012.013 B3USR1 SCBASE   (NULL)
TCPIP.PROFILE.TCPIP            2016.236 B3SYS1

So not difficult at all.

What’s going on in my program in Unix Services?

On Linux, starting a Python program is subsecond. On z/OS, running on zD&T (so emulated hardware), it takes about 2 seconds. I wondered what was causing this – is my ZFS file system slow?

I used the Unix command

bpxtrace -o /tmp/trace -f format -c  python ac.py

to capture a trace.

This produced output like

       PID ASID TCB    Local time      System call           Additional trace
- - - - - - - - - - - - - - - - - - - - - - - - - -
65589 0049 8FB2F8 08:27:13.722080 Call open parms: 0000004D
65589 0049 8FB2F8 08:27:13.722714 Exit open rv=00000004
65589 0049 8FB2F8 08:27:13.722741 Call fstat parms: 00000004
65589 0049 8FB2F8 08:27:13.722785 Exit fstat rv=00000000
65589 0049 8FB2F8 08:27:13.722817 Call lseek parms: 00000004
65589 0049 8FB2F8 08:27:13.722824 Exit lseek rv=00000000
65589 0049 8FB2F8 08:27:13.722836 Call lseek parms: 00000004
65589 0049 8FB2F8 08:27:13.722883 Exit lseek rv=00000000
65589 0049 8FB2F8 08:27:13.722896 Call lseek parms: 00000004
65589 0049 8FB2F8 08:27:13.722901 Exit lseek rv=00000000

This is ok, but I want to know how long the calls took.

I wrote an ISPF Rexx script

/* REXX */
/*
   exec to calculate the elapsed time of each system call
   (Exit time minus Call time) from a formatted bpxtrace listing
*/
ADDRESS ISPEXEC
'ISREDIT MACRO (lines) '
"ISREDIT (f) = LINENUM .ZLAST"
sum = 0
do j = 3 by 2 to f
  k = j + 1
  "ISREDIT ( bef) = LINE " j
  "ISREDIT ( aft) = LINE " k
  parse var bef . 27 mm 29 . 30 ss 39 . 40 what
  parse var aft . 27 am 29 . 30 as 39
  before = 60 * mm + ss
  after  = 60 * am + as
  sum = sum + (after - before)
  delta = format(after - before,1,6)
  string = "== "delta what
  "ISREDIT LINE (k) = (string)"

end
say sum
exit

running the Rexx script produced output in the file like

     65589 0049 8FB2F8 08:27:13.722080 Call open      
== 0.000630 Call open
65589 0049 8FB2F8 08:27:13.722741 Call fstat
== 0.000050 Call fstat
65589 0049 8FB2F8 08:27:13.722817 Call lseek
== 0.000000 Call lseek

Using the ISPF commands

  • X all
  • f “==” all
  • del all x
  • sort 1 11

This gave

== 0.000000 Call lseek 
== 0.000000 Call lseek
...
0.000820 Call close parms: 00000005 00000000 00000000 05FC0119
0.001450 Call cond_timed_wait parms: 00000000 000F4240 00000001 00000000 20861040
0.002450 Call loadhfs parms: 00000048 /u/tmp/zowet/colin/envz/lib/python3...
0.004200 Call loadhfs parms: 00000008 CELQDCPP 00000000 0000010C 61E9F3F1
0.004310 Call loadhfs parms: 00000007 CXXRT64 00000000 0000010C 61E9F3F1
0.034780 Call mvsprocclp parms: 00000100 00000000 00000000 00000000
0.042970 Call mvsprocclp parms: 00000100 2081D1D8 2086CC75 00000000

There were nearly 500 lines in the output file. 400 entries were 100 microseconds or less. There were 6 entries taking longer than 1 millisecond.
My trace file covered a duration of 1.1 seconds. The individual times added up to 0.13 seconds, so it looks like the delays are not caused by the file system, and I need to look elsewhere.

From data to reports missing the potholes

I’ve been doing work with datasets on z/OS to produce reports. These range from SMF data to DCOLLECT data on datasets and SMS data.

It took a while to “get it right”, because I made some poor decisions as to how to process the data, and some of my processing was much more complex than it needed to be. It was easiest to start again!

I’ve been working with Python and Python tools, and other tools available on the platforms. See Pandas 102, notes on using pandas.

My current approach is to use some Python code to read a record, parse it into a dictionary (dict), and add the dict to a list of records. I then either pass the list of dicts to Pandas to display, or externalise the data and have a second Python program read the externalised data and do the Pandas processing.

Reading the data

The data is usually in data sets rather than files in Unix Services. You can copy a dataset to a file, but it is easier to use the python package pyzfile to read datasets directly.

import sys
from pyzfile import *

def readfile():
    try:
        with ZFile("//'COLIN.DCOLLECT.OUT'", "rb,type=record,noseek") as file:
            for rec in file:
                #l = len(rec)
                yield rec
    except ZFileError as e:
        print(e, file=sys.stderr)

Often a data source will contain a mixture of record types. For example, a dump of SMF datasets may contain many different record types and subtypes.

You need to consider if you want to process all record types in one pass, or process one record type in one run, and a different record type in a different run.

Processing the data

You will normally have a mapping of the layout of the data in a record. Often there is a mix of record types; you need to decide which record types you process and which you ignore.

Field names

Some of the field names in a record are cryptic; they were created when field names could only be 8 characters or less. For example DCDDSNAM. This stands for DCollect record type D, field name DSNAMe. You need to decide what to name the field. Do you name it DCDDSNAM, and tell the reader to go and look in the documentation to understand the field names in the report, or do you try to add value and just call it DSN, or DataSetName? You cannot guess some fields, such as DCDVSAMI. This is VSAM Inconsistency.

You also need to consider the printed report. If you have a one character field in the record, and a field name which is 20 characters long, by default the printed field will be 20 characters wide, and so waste space in the report. If the field is rarely used you could call it BF1, for Boring Field 1.

Character strings

Python works in ASCII, and strings need to be in ASCII to be printable. You will need to convert character data from EBCDIC to ASCII.

You can use a slice to extract data from a record. For example, to extract a string and convert it:

DSN = record[20:63].decode('cp500').strip()

Integers

Integers – you will need to convert these to internal format. I found the Python struct module very good for this. You give a string of conversion characters (integer, integer, …) and it returns a tuple of the data. If you are processing the data on a different platform, you may need to worry about big-endian and little-endian conversion of numbers.
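
As a small illustration (the sample bytes below are made up, not from a real record):

from struct import unpack

# ">" means big-endian, which is the byte order z/OS uses; "H" is a
# 2-byte unsigned integer and "I" a 4-byte unsigned integer.
data = bytes.fromhex("00200000002A")
length, count = unpack(">HI", data)
print(length, count)   # 32 42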

Strange integers

Some records have units like hundredths of a second. You may want to convert these to float

float_value = float(input_value)/100

Packed numbers

Packed decimal numbers are often used for dates. For example, a date in yyyyddd format for year 2025, day 5 is X'2025005F', where the F is a sign digit. You cannot just print it as an integer (it comes out as 539295839).
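
A minimal sketch (my own helper, not from any library) of turning such a field into a printable string:

def packed_date(field: bytes) -> str:
    # X'2025005F' -> "2025005"; hex() gives the digits, and the last
    # character is the sign nibble, which we drop
    return field.hex()[:-1]

print(packed_date(bytes.fromhex("2025005F")))   # 2025005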

Bit masks

Bit masks are very common, for example there is a 1 byte field DCVFLAG1 with values:

  • DCVUSPVT 0x20 Private
  • DCVUSPUB 0x10 Public
  • DCVUSSTG 0x08 Storage
  • DCVSHRDS 0X04 Device is sharable

If the value of the field is 0x14, what do you return? I would create a field Flag1 with the value of a list ["Public", "Shareable"]. If all the bits were off, this would return an empty list []. It would be easy to create ["DCVUSPUB", "DCVSHRDS"] or just display the hex value 14 (or 0x14) – but this makes it hard for the people reading the reports to interpret the data. A sketch of this decoding follows.
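
A minimal sketch, using the DCVFLAG1 bits listed above (the function name is my own):

# map bit value -> readable name
FLAG1_NAMES = {0x20: "Private", 0x10: "Public", 0x08: "Storage", 0x04: "Shareable"}

def decode_flags(value):
    # return the names of all the bits which are set
    return [name for bit, name in FLAG1_NAMES.items() if value & bit]

print(decode_flags(0x14))   # ['Public', 'Shareable']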

Triplets

SMF records contain triplets. These are defined by [offset to the start of the data, length of each section, count of sections] within the record.

For example in the SMF30 record there are many triplet sections. There is one for “usage data” involved in usage based pricing. There can be zero or more sections like

  • Product owner
  • Product name
  • TCB time used in hundredths of a second

How are you going to process this?
The SMF record has 3 fields for usage

  • SMF30UDO Offset to Usage Data section in SMF 30 record
  • SMF30UDL Length of each Usage Data section in SMF 30 record
  • SMF30UDN Number of Usage Data section in SMF 30 record

I would create a variable UsageData = [{"ProdOwner": …, "ProdName": …, "TCBTime": …}, {"ProdOwner": …, "ProdName": …, "TCBTime": …}, ]

and convert TCBTime from an integer representing hundredths of a second to a floating point number (see the sketch below).
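
A hedged sketch of walking a triplet-described section; the 8-byte owner, 8-byte name and 4-byte TCB time layout here is illustrative, not the real SMF 30 mapping:

from struct import unpack_from

def usage_sections(record, offset, length, count):
    sections = []
    for i in range(count):
        base = offset + i * length
        owner = record[base:base + 8].decode("cp500").strip()
        name  = record[base + 8:base + 16].decode("cp500").strip()
        (tcb_hundredths,) = unpack_from(">I", record, base + 16)
        sections.append({"ProdOwner": owner, "ProdName": name,
                         "TCBTime": tcb_hundredths / 100.0})
    return sections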

Having these triplets makes a challenge when printing the record. You could decide to

  • omit this data
  • summarise the data – and provide only a sum of the TCBTime value
  • give the data as a list of dicts, then have a Pandas step to copy only the fields you need for your reports.

For this usage data, I may want a report showing which jobs used which product, and how much CPU the job used in that product. Although I may capture the data as a list of products, I could extract the data and create another data record with

  • jobname1, product1, … CPU used1
  • jobname1, product2, … CPU used2
  • jobname2, product1, … CPU used1
  • jobname2, product3, … CPU used3

and remove the product data from the original data record.

Do you want all of the fields?

You may want to ignore fields, such as reserved values, record length values, record type, and any fields you are not interested in. Record length tends to be the first field, and it is usually not interesting when generating default reports.

How to handle a different length record?

The format of many records change with new releases, typically adding new fields.

You need to be able to handle records from the previous release, where the record is shorter. For example, do not add these fields to your dict, or add them with a value of None.
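
A minimal sketch (the offset 128 and the field name are illustrative):

from struct import unpack_from

def new_field(record):
    # the field is only present in records from the newer release, which are longer
    if len(record) >= 132:
        return unpack_from(">I", record, 128)[0]
    return None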

Now I’ve got a record – now what?

Once you have got your record, and created a dict from the contents {fieldname1: value1, fieldname2: value2, …}, you could just add it to the list to be passed to Pandas. It is not always that simple.

I found that some records need post processing before saving.

Calculations

For a DCOLLECT record, there is a field which says

DCVFRESP: Free Space on Volume (in KB when DCVCYLMG is set to 0, or in MB when DCVCYLMG is set to 1)

You need to check bit DCVCYLMG and have logic like

if DCVCYLMG == 1:
    data["FreeSpVolKB"] = data["FreeSpVolKB"] * 1024

Adding or deleting fields

For some fields I did some calculations to simplify the processing. For example I wanted average time when I had total_time, and count.

I created average_time = total_time / count, added this field, and deleted the total_time and count fields.
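
A tiny sketch of this, with hypothetical field names:

# replace the raw counters with the derived value
count = data.pop("count")
total = data.pop("total_time")
data["average_time"] = total / count if count else None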

Error handling

I found some records have an error flag, for example “Error calculating volume capacity”. You need to decide what to do.

  • Do you include them, and risk that the calculation/display of volume capacity might be wrong?
  • Do you report such records during the collection stage, and not include them in the overall data?

How do you accumulate the data: dicts or lists?

When using Pandas you can build each record as a dict of values {"kw1": "v1", "kw2": "v2"}, then build a list of dicts [{}, {}, …]

or have each "column" hold a list of values {"Jobname": ["job1", "job2", …], "CPUUsed": [99, 101, …], …}. As you process each field you append it to the appropriate "column" list.

# a dict of lists
datad = {"dsn": ["ABC", "DEF"],
         "volser": ["SYSRES", "USER02"]}
datad["dsn"].append("GHI")
datad["volser"].append("OLDRES")

pdd = pd.DataFrame(datad)


# a list of dicts
dictl = [
    {"dsn": "ABC", "volser": "SYSRES"},
    {"dsn": "DEF", "volser": "USER02S"}]
newdict = {"dsn": "GHI", "volser": "OLDRES"}

dictl.append(newdict)

pdl = pd.DataFrame.from_records(dictl)

I think it is better to capture your data in a dict, then add the dict to the list of records.

For example with

DCVFRESP: Free Space on Volume (in KB when DCVCYLMG is set to 0, or in MB when DCVCYLMG is set to 1)

If you use a dict to collect the data, you can then easily massage the values, before adding the dict to the list.

if DCVCYLMG == 1:
    data["FreeSpVolKB"] = data["FreeSpVolKB"] * 1024

grand_data.append(data)

If you try to do this using “column” values it gets really messy trying to do a similar calculation.

Using the data

It took a long time to process the dataset and create the Python data. I found it quicker overall to process the dataset once and externalise the data using pickle or JSON, then have different Python programs which read the data in and process it. For example:

  • Creating a new data structure using just the columns I was interested in.
  • Filtering which rows I wanted.
  • Save it

Pandas 102, notes on using pandas

Pandas is a great tool for displaying data from Python. You give it arrays of data, and it can display, summarise, group, print and plot. It is used for the simplest data, up to data analysts processing megabytes of data.

There is a lot of good information about getting started with Pandas, and about how you can do advanced things with Pandas. I did the Pandas 101 level of reading, but I struggled with the next step, so my notes for the 102 level are below. Knowing that something can be done means you can go and look for it. If you look but cannot find, it may be that you are using the wrong search arguments, or there is no information on it.

Working with data

I’ve been working with “flat files” on z/OS. For example the output of DCOLLECT which is information about dataset etc from SMS.

One lesson I learned was that you should isolate the extraction from the processing (except for trivial amounts of data). Extracting data from flat files can be expensive and take a long time; for example it may include conversion from EBCDIC to ASCII. It is better to capture the data from the flat file in Python variables, then write the data to disk using JSON or pickle (Python object serialisation). As a separate step, read the data into memory from your saved file, then do your data processing work with pandas or other tools.

Feeding data into Pandas

The work I’ve done has been two dimensional, rows and columns; you can have multi-dimensional data.

You can use a list of dictionaries (dicts), or a dict of lists:

# a dict of lists
datad = {"dsn": ["ABC", "DEF"],
         "volser": ["SYSRES", "USER02"]}
pdd = pd.DataFrame(datad)

# a list of dicts
datal = [{"dsn": "ABC", "volser": "SYSRES"},
         {"dsn": "DEF", "volser": "USER02S"},
         ]

pdl = pd.DataFrame.from_records(datal)

Processing data like pdd = pd.DataFrame(datad) creates a pandas data frame. You take actions on this data frame. You can create other data frames from an original data frame, for example with a subset of the rows and columns.

I was processing a large dataset, and found it easiest to create a dict for each row of data, and then accumulate the rows in a list. Before I used Pandas, I had just printed out each row. I do not know which performs better. Someone else used a dict of lists, and appended each row’s data to the “dsn” or “volser” list.

What can you do with it?

The first thing is to print it. Once the data is in Pandas you can use either of pdd or pdl above.

print(pdd)

gave

   dsn   volser
0  ABC   SYSRES
1  DEF  USER02S

Where the 0, 1 are the row numbers of the data.

With my real data I got

                   DSN  ...  AllocSpace
0    SYS1.VVDS.VA4RES1  ...        1660
1   SYS1.VTOCIX.A4RES1  ...         830
2          CBC.SCCNCMP  ...      241043
3          CBC.SCLBDLL  ...         885
4         CBC.SCLBDLL2  ...         996
..                 ...  ...         ...
93        SYS1.SERBLPA  ...         498
94       SYS1.SERBMENU  ...         277
95       SYS1.SERBPENU  ...       17652
96          SYS1.SERBT  ...         885
97       SYS1.SERBTENU  ...         332

[98 rows x 7 columns]

The data was formatted to match my window size. With a larger window I got more columns.

You can change this by using

pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)

Which columns are displayed?

Rather than all of the columns being displayed you can select which columns are displayed.

You can tell from the data you passed to pandas, or use the command

print(list(opd.columns.values))

This displays the values of the column names, as a list.

To display the columns you specify use

print(opd[["DSN","VOLSER","ExpDate","CrDate","LastRef","63bit alloc space KB", "AllocSpace"]])

You can say display all but the specified columns

print(opd.loc[:, ~opd.columns.isin(["ExpDate","CrDate","LastRef"])])

Select which rows you want displayed

print(opd.loc[opd["VOLSER"].str.startswith("A4"),["DSN","VOLSER",]])

or

print(opd.loc[opd["DSN"].str.startswith("SYS1."),["DSN","VOLSER",]])

gave

                   DSN  VOLSER
0    SYS1.VVDS.VA4RES1  A4RES1
1   SYS1.VTOCIX.A4RES1  A4RES1
12        SYS1.ADFMAC1  A4RES1
13        SYS1.CBRDBRM  A4RES1
14         SYS1.CMDLIB  A4RES1
..                 ...     ...
93        SYS1.SERBLPA  A4RES1
94       SYS1.SERBMENU  A4RES1
95       SYS1.SERBPENU  A4RES1
96          SYS1.SERBT  A4RES1
97       SYS1.SERBTENU  A4RES1

[88 rows x 2 columns]

From this we can see that 88 (out of 98) rows were displayed: rows 0, 1, 12, 13, … but not rows 2, 3, …

What does .loc do?

My interpretation of this (which I haven’t seen documented)

If there is one parameter, this is a list of the columns you want.

If there are two parameters, the second is the list of the columns you want displayed. The first parameter is conceptually a list of True or False, with one value per row, saying whether the row should be selected or not. So for

print(opd.loc[opd["VOLSER"].str.startswith("A4"),["DSN","VOLSER",]])

opd["VOLSER"].str.startswith("A4")

says take the column called VOLSER, treat each value as a string, and if it starts with the string "A4" return True, else return False. This returns one entry per row.
The opd.loc[opd["VOLSER"].str.startswith("A4"), ...] then selects the rows.
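
A small self-contained illustration of this boolean-mask idea (the two-row data frame below is made up):

import pandas as pd

opd = pd.DataFrame({"DSN": ["SYS1.LINKLIB", "CBC.SCCNCMP"],
                    "VOLSER": ["A4RES1", "C4RES1"]})
mask = opd["VOLSER"].str.startswith("A4")   # one True/False value per row
print(mask.tolist())                        # [True, False]
print(opd.loc[mask, ["DSN", "VOLSER"]])     # only the rows where mask is True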

You can select rows and columns

print(opd.loc[opd["VOLSER"].str.startswith("A4"),["DSN","VOLSER","63bit alloc space KB",]])

You can process the data, such as sort

The following statement extracts columns from the original data, sorts the data, and creates a new data frame. The new data frame is then printed.

sdata= opd[["DSN","VOLSER","63bit alloc space KB",]].sort_values(by=["63bit alloc space KB","DSN"], ascending=False)
print(sdata)

This gave

                   DSN  VOLSER  63bit alloc space KB
2          CBC.SCCNCMP  A4RES1                241043
35         SYS1.MACLIB  A4RES1                210664
36        SYS1.LINKLIB  A4RES1                166008
90       SYS1.SEEQINST  A4RES1                103534
42        SYS1.SAMPLIB  A4RES1                 82617
..                 ...     ...                   ...
62       SYS1.SBPXTENU  A4RES1                    55
51        SYS1.SBDTMSG  A4RES1                    55
45        SYS1.SBDTCMD  A4RES1                    55
12        SYS1.ADFMAC1  A4RES1                    55
6        FFST.SEPWMOD3  A4RES1                    55

[98 rows x 3 columns]

showing all the rows, and the three columns, which had been copied to the sdata data frame.

Saving data

Reading an external file and processing the data into Python arrays took an order of magnitude longer than processing it in Pandas.

You should consider a two step approach to looking at data

  • Extract the data and export it in an accessible format, such as pickle or JSON. While getting this part working, use only a few rows of data. Once it works, you can process all of the data.
  • Do the analysis using the exported data.

Export the data

You should consider externalising the data in JSON or pickle format, for example:

import pickle

# write out the data to a file
fPickle = open('pickledata', 'wb')
# source, destination
pickle.dump(opd, fPickle)
fPickle.close()

Import and do the analysis


# and read it in
fPickle = open('pickledata', 'rb')
opd = pickle.load(fPickle)
fPickle.close()
print(opd)

Processing multiple data sources as one

If you have multiple sets of data, for example for Monday, Tuesday, Wednesday, etc you can use

week = pd.concat([monday, tuesday, wednesday, thursday, friday])

Processing data within fields

Within my data, I have a field with information like

                  DSN  VOLSER         FormatType
0   SYS1.VVDS.VC4RES1  C4RES1                 []
1  SYS1.VTOCIX.C4RES1  C4RES1            [Fixed]
2         CBC.SCCNCMP  C4RES1  [Fixed, Variable]
3         CBC.SCLBDLL  C4RES1  [Fixed, Variable]
4        CBC.SCLBDLL2  C4RES1  [Fixed, Variable]

Where the data under FormatType is a list. You can reference elements in a list.

For example

x =  data.FormatType.apply(lambda x: 'Variable' in x)
print(x)

gives

0    False
1    False
2     True
3     True
4     True

The command

print(data.loc[ data.FormatType.apply(lambda x: 'Blocked' in x)])

gives

            DSN  VOLSER         FormatType
2   CBC.SCCNCMP  C4RES1  [Fixed, Variable]
3   CBC.SCLBDLL  C4RES1  [Fixed, Variable]
4  CBC.SCLBDLL2  C4RES1  [Fixed, Variable]

Basic operations on columns

You can do basic operations on columns such as

print(dataset[["CountIO","CacheHits"]].sum())

The sum() function adds up the specified columns (count(), max() etc. work similarly).

This gave

[361 rows x 10 columns]
CountIO      74667.0
CacheHits     1731.0
dtype: float64

An operation like

print(dataset.sum())

would have totalled all of the columns, including some where a total is meaningless, for example a column holding a maximum value.

Doing aggregation, count, sum, maximum, minimum etc.

Simple aggregation

You can aggregate data

# Extract just the fields of interest
d = dataset[["DSN","CountIO","CacheHits"]]
print(d.groupby("DSN").sum())

Gave

                                          CountIO  CacheHits
DSN
ADCD.Z31B.PARMLIB                            68.0       60.0
ADCD.Z31B.PROCLIB                            66.0       66.0
ADCD.Z31B.VTAMLST                           141.0      141.0
COLIN.TCPPARMS                                4.0        4.0
FEU.Z31B.PARMLIB                              4.0        0.0
IXGLOGR.ATR.S0W1.RM.DATA.A0000000.DATA        4.0        0.0
SYS1.DAE                                      0.0        0.0
SYS1.DBBLIB                                 974.0      932.0

More complex aggregation

The .agg() function gives you much more control over what, and how, you want to process the data.

print(d.groupby("DSN").agg({'DSN' : ['count'], 'CountIO' : ['sum','max'],"CacheHits": ["sum"]}))

gave

                                        DSN CountIO        CacheHits
                                      count     sum    max       sum
DSN
ADCD.Z31B.PARMLIB                        19    68.0    7.0      60.0
ADCD.Z31B.PROCLIB                        30    66.0    3.0      66.0
ADCD.Z31B.VTAMLST                         6   141.0   41.0     141.0
COLIN.TCPPARMS                            2     4.0    3.0       4.0
FEU.Z31B.PARMLIB                          1     4.0    4.0       0.0
IXGLOGR.ATR.S0W1.RM.DATA.A0000000.DATA    4     4.0    1.0       0.0
SYS1.DAE                                  1     0.0    NaN       0.0
SYS1.DBBLIB                               2   974.0  932.0     932.0

Notes:

  • The columns are not in the order I specified. It is hard to see which field max applies to.
  • There is a Not a Number (NaN) in one of the values. You need to allow for this.
  • In the simple case, .sum() by default tries to sum all of the columns. Using .agg you can specify which columns you want to process.