Moving to the z/OS standard image and onward

For vendors and people like me who used ZD&T or zPDT to run z/OS on an IBM provided emulator on Linux, moving to the new standard image is a challenge.

Below are my thoughts on how to make it easier to use the standard image.

What does migration mean?

The term migration may mean different things to different people.

  • “Production customers” have a z/OS image, and they refresh the products while keeping userids, user datasets etc. the same. The products (from IBM and vendors) gradually change over time, typically every 3-6 months. This process is well known, and has been used over many decades.
  • With the IBM standard image, IBM makes a new level of z/OS available, and you have to migrate userids, datasets etc. into the image. Every 3-6 months there may be a new image available. Moving from one level of standard image to another is new and not documented. It is easy to do this the wrong way, and so make migration hard. It may take time to migrate to the first standard image, but moving to later images should take no more than half an hour.

This blog post suggests ways of making it easy to set up and use the standard image.

Moving to the first standard image may mean a lot of work, but if you do it the right way moving on should be easy.

Setting the direction

My recommendations are below; I would welcome discussion on these topics.

A couple of years ago I wrote a series of blog posts starting with Migrating an ADCD z/OS release to the next release. A lot of the information is still relevant. Below I’ve tried to refine it for the migration to the standard image.


Restrict what you put into the master catalog.

You can restrict what users put into the master catalog. For example, enforce that every data set high level qualifier has a RACF profile, and allow general users to add only user catalog entries to the master catalog.


Ensure you use a user catalog

If your datasets are in a user catalog, then to go to the next standard image you just import the user catalog. If you’ve cataloged datasets in the master catalog, these are not immediately transferable to a new system.

Use USER. datasets, not SYS1. datasets

You can configure z/OS so it uses parmlib and proclib datasets you specify. On ZD&T there are USER.Z31B.* datasets (PROCLIB, PARMLIB, CLIST etc.). You can copy and use these on each new standard image.

If you have changed ADCD.* or SYS1.* datasets, use ISPF 3.4 and sort on the “changed” column to see members changed since you first used the system, then move them to your USER.* datasets.

Create resources using JCL rather than issuing commands, or using the ISPF panels

Use JCL to issue commands in batch TSO, rather than issuing the commands manually. For example, with the standard image you may get one userid (IBMUSER), and you want to create more. Have a JCL member containing the commands to create the additional userids.

Once created, you just submit the same JCL on the follow-on standard image.
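The approach above can be sketched as a batch TSO job, in the same style as the RACDCERT job later in this post. The job name, group, userid, UID, and home directory here are all illustrative; the exact ADDUSER operands will vary for your system.

```jcl
//ADDUSERS JOB 1,MSGCLASS=H
//S1       EXEC PGM=IKJEFT01,REGION=0M
//SYSTSPRT DD SYSOUT=*
//SYSTSIN  DD *
  ADDGROUP MYGROUP
  ADDUSER  COLIN2 DFLTGRP(MYGROUP) +
    OMVS(UID(990022) HOME('/my/colin2') PROGRAM('/bin/sh')) +
    TSO(PROC(ISPFPROC))
/*
```

Keep the member in your migration dataset; on the next standard image the same job can be submitted unchanged.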

Have an ordering to the members in your migration dataset.

If you have to define a group before you create a userid which uses it, either name the members so they sort into order (R1GROUP, R2USER1), or use multiple PDSEs, e.g. COLIN.DO1GROUP.JCL and COLIN.DO2USERS.JCL, where the members within a data set can be run in any order.

OMVS file systems

I have multiple ZFS (file systems) which I mount on the z/OS image. If these are cataloged in the user catalog, they can be mounted on the new system and used.

You need to think about where to mount them. If the new image has been configured to use automount, this can cause problems. Automount is an OMVS facility which can create a ZFS and mount it for a user. You can allocate a ZFS on a per-userid basis, so if one userid uses lots of disk space, it does not affect other users; that user just runs out of space.

When automount is active on the /u directory, if I try to mount my file system on /u, for example /u/colinzfs, the mount will fail because /u/colinzfs is already allocated.

You need to use another directory, perhaps /my, to mount your ZFS on.

If a user’s home directory is something like /my/colin, the SSH keys (in ~/.ssh) will be available on the new system, without having to set them up again.

Changing files in system file systems

Try to avoid changing the system file systems, for example /etc, /var, and /usr.

If you have changed the system file systems, check which files have changed since you started using the current image, and move them to your own file system.

Userids and OMVS

You can use the RACF AUTOUID facility which allocates a UID for the userid. This means you do not need to manage the list of UIDs. This makes life easier for an administrator, but harder for a standard image user.

If you use AUTOUID on the current system you may get a UID such as 990021. On the newer image, your userid may be given a different UID, depending on the order and number of requests made. Having a different UID can cause problems when using your ZFS. For example, the files for my userid COLIN have owner UID 990021. On the newer system I may get UID 990033; as this is different from 990021, I will not have access to my files.

You should consider explicitly allocating a UID which stays with the user.
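As a quick illustration of the mismatch, a minimal Python sketch: `owned_by` is a hypothetical helper, and the UID values shown earlier (990021, 990033) are just examples. It compares a file's owner with a given UID using `os.stat`.

```python
import os
import tempfile

def owned_by(path, uid):
    """Hypothetical helper: is the file at `path` owned by `uid`?
    Run this against files in your ZFS, with the UID your userid was
    given on the new image, to spot ownership mismatches."""
    return os.stat(path).st_uid == uid

# A file you create is owned by your current UID ...
with tempfile.NamedTemporaryFile() as f:
    print(owned_by(f.name, os.getuid()))       # True
    # ... but not by a different UID, such as the one AUTOUID
    # handed out on the old image.
    print(owned_by(f.name, os.getuid() + 1))   # False
```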

If you want to extract RACF profiles from the current system, see the extract program. This will create the RACF commands needed to define the profiles. You can specify userids, datasets or classes.

Certificates

You can use RACF commands to display and extract keyring information, and certificates (public and private parts). These can be imported on the newer system. This means your client applications will continue to work.

ICSF

You can configure which data sets ICSF uses in the CSFPRMxx member in parmlib. Mine are prefixed with COLIN…

Started tasks

Many started tasks associated with OMVS, (or TCPIP) store configuration in /etc/. For example the file /etc/hosts and the directory /etc/ssh.

You may be able to change the started tasks to use files in your ZFS.

For example

//SSHD    PROC 
//SSHD EXEC PGM=BPXBATCH,REGION=0M,TIME=NOLIMIT,
// PARM='PGM /usr/sbin/sshd -f /my/etc/ssh/sshd_config '

What packages are installed?

You can issue

zopen query -i > installed 

to see what is installed

This gave me

Package   Version  File                               Releaseline             
bash 5.3.9 bash-5.3.20260204_143226.zos STABLE
curl 8.18.0 curl-8.18.0.20260205_151329.zos STABLE
git 2.53.0 git-v2.53.0.20260212_134939.zos STABLE
gpg 2.5.17 gnupg-2.5.17.20260130_021013.zos STABLE
jq 1.8.1 jq-jq-1.8.1.20250919_125054.zos STABLE
less 692 less-v692-rel.20260209_153821.zos STABLE
libpsl 1.0.0 libpsl-master.20260102_060204.zos STABLE
libssh2 1.11.1 libssh2-1.11.1.20260102_060940.zos STABLE
meta 0.8.4 meta-main.20260116_055504.zos STABLE
ncurses 6.6 ncurses-6.6.20260129_223023.zos STABLE
openssl 3.6.0 openssl-3.6.0.20260101_102819. STABLE

and

pip list

which gave

Package      Version
------------ -----------
ansible-core 2.20.3
cffi 1.14.6
cryptography 3.3.2
Jinja2 3.1.6
MarkupSafe 3.0.3
packaging 26.0
pip 26.0.1
pycparser 2.20
pysear 0.4.0
PyYAML 6.0.3
pyzfile 1.0.0.post2
resolvelib 1.2.1
six 1.16.0
tzdata 2025.3
zoautil-py 1.2.5.10

How do I allocate a Unix id on z/OS?

To use Unix services (sometimes known as USS) on z/OS, a userid needs a UID (user identifier). This, as on Unix, is an integer. A user can be pre-allocated a permanent UID, or be allocated a UID when needed. See Automatically assigning unique IDs through UNIX services.

Unique or not Unique?

It is good practice for each userid to have a unique UID. If users share the same UID,

  • The users share ownership and access to the same files.
  • If you ask for the userid associated with a UID – you may get the wrong answer!

However some superusers need a UID of 0.

You can set this as shared with

altuser colin OMVS(UID(0) SHARED)

Instead of allocating UID(0) you can use the BPX.SUPERUSER resource in the FACILITY class to get the authority to do most of the tasks that require superuser authority.

  1. You can explicitly specify an id which you allocate (this means you need a list of ids and owners, so you know which ids are free).
  2. You can have z/OS do this for you. See Enabling automatic assignment of unique UNIX identities.

You can use ADDUSER COLIN OMVS(AUTOUID) which allocates an available UID.

Should I use AUTOUID?

I run z/OS on a ZD&T image. Every 6 months or so there is a new level of z/OS which I can download. I then need to migrate userids, datasets etc. to this new system. This is different from a normal customer z/OS environment, where you have an existing system and you migrate a new version of z/OS into it.

I have ZFS file systems for all of my user data.
On the current system my userid COLIN was automatically allocated as 0000990021. Files that I own have this id.

When I get my next system, if I allocate userid COLIN with AUTOUID, it may get a different UID say 990011. Because my userid 990011 is different to the owner of the files 990021, I may not be able to access “my” files.

I could change all of my files to have a new owner (and group), or I could ensure my userid on both systems is the same 990021. Using the same UID was much easier.
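The first option – changing the owner of all the files – can be sketched in Python. `reassign_owner` is a hypothetical helper, not part of any product; on z/OS you would run it under a userid with superuser authority, since changing ownership to a different UID needs that.

```python
import os

def reassign_owner(root, old_uid, new_uid):
    """Sketch: walk a directory tree and change the owner of every
    file and directory currently owned by old_uid to new_uid.
    The group is left unchanged (gid -1)."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.lstat(path).st_uid == old_uid:
                # gid -1 means "leave the group as it is"
                os.chown(path, new_uid, -1, follow_symlinks=False)
```

For example `reassign_owner("/my/colin", 990021, 990033)` would move ownership from the old UID to the new one.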

How is the range of AUTOIDs defined?

This is done with the RACF FACILITY class profile BPX.NEXT.USER. On my system it has

APPLICATION DATA 990041-1000000/990020-1000000

Can I define a model profile?

You can configure OMVS to automatically give a userid a UID (if it does not have one) and define the rest of the OMVS profile using a model OMVS segment. See Steps for automatically assigning unique IDs through UNIX services.

Users need a home directory

Users need a home directory. There are several ways of doing this.

  • Give users an entry HOME('/u/mostusers'). Everyone shares the same directory – not a good idea, because they would all share the SSH keys etc.
  • You could specify HOME('/u/mostusers/&racuid') and so specify the userid as part of the definition. This could be done in the model profile mentioned above. If you use this method you need to create the directory, for example as part of creating the userid.
  • Use automount. See Unix services automount is a blessing and curse, where you define a template and the hard work is done for you. For example, for each userid automount can create a ZFS and mount it.

I only use a few userids, so manually allocating the userid and the home directory was easy to do.

Note: If you use automount of a directory, such as /u/, you cannot mount other file systems in /u/; you would have to use a different directory, for example /usr/.

How do I create a load module in a PDS from Unix?

This is another of the little problems which are easy once you know the answer.

I used the shell program to compile my program.

name=extract 

export _C89_CCMODE=1

p1="-Wc,arch(8),target(zOSV2R3),list,source,ilp32,gonum,asm,float(ieee)"
p7="-Wc,ASM,ASMLIB(//'SYS1.MACLIB') "
p8="-Wc,LIST(c.lst),SOURCE,NOWARN64,XREF,SHOWINC -Wa,LIST(133),RENT"

# compile it
xlc $p1 $p7 $p8 -c $name.c -o $name.o

l1="-Wl,LIST,MAP,XREF,AC=1 "
# create an executable in the file system
/bin/xlc $name.o -o $name -V $l1 1>a
extattr +a $name

# create a load module in a PDS
/bin/xlc $name.o -o "//'COLIN.LOAD(EXTRACT)'" -V $l1 1>a

Create an executable in the file system

The first bind (xlc) step creates an executable named “extract” in the file system.

Specify the load module

The second bind step specifies a load module in a PDS; the load module is stored in COLIN.LOAD. If you copy and paste the line, make sure you have the correct quotes (double quote, //, single quote, dataset(member), single quote, double quote). Sometimes my pasting lost a quote.

Process assembler code

My program has some assembler code…

 asm( ASM_PREFIX 
" STORAGE RELEASE,...
:"r0", "r1" , "r15" );

It needs the options "-Wc,ASM,ASMLIB(//'SYS1.MACLIB')" to compile it, specifying the location of the assembler macros.

Binder parameters

The parameters in -Wl,LIST,MAP,XREF,AC=1 are passed to the binder.

Message – wrong suffix on the source file

Without the export _C89_CCMODE=1 I got the message

FSUM3008 Specify a file with the correct suffix (.c, .i, .s, .o, .x, .p, .I, or .a), or a corresponding data set name, instead of -o ./extract.

How do I enter a password on the z/OS console for my program?

I wanted to run a job/started task which prompts the operator for a password. Of course being a password, you do not want it written to the job log for every one to see.

In assembler you can write a message on the console, and have z/OS post an ECB when the message is replied to.

         WTOR  'ROUTECD9 ',REPLY,40,ECB,ROUTCDE=(9)
         WAIT  1,ECB=ECB
...
ECB      DC    F'0'
REPLY    DS    CL40

The documentation for ROUTCDE says

  • 9 System Security. The message gives information about security checking, such as a request for a password.

When this ran, the output on the console was as follows. The … is where I typed R 6,abcdefg

@06 ROUTECD9 
...
R 6 SUPPRESSED
IEE600I REPLY TO 06 IS;SUPPRESSED

With ROUTCDE=(1) the output was

@07 ROUTECD1                      
R 7,ABCDEFG
IEE600I REPLY TO 07 IS;ABCDEFG

With no ROUTCDE keyword specified the output was

@08 NOROUTECD                          
R 8 SUPPRESSED
IEE600I REPLY TO 08 IS;SUPPRESSED

The lesson is that you have to specify ROUTCDE=(1) if you want the reply to be displayed. If you omit the ROUTCDE keyword, or specify a value of 9, the output is suppressed.

Can I do this from a C program?

The C run time __console2() function allows you to issue console messages. If you pass an address for modstr, the __console2() function waits until an operator stop or modify command is issued for the job. If a NULL address is passed for modstr, the message is displayed and control returns immediately. The text of the modify command is visible on the console.

To get suppressed text you would need to issue the WTOR Macro using __ASM(…) in your C program.

Can I share a VSAM file (ZFS) between systems?

I had the situation where I am using ZD&T – a z/OS emulator running on Linux, where the 3390 disks are emulated in Linux files. I have an old image and a new image, and I want to use a ZFS from the new image on the old image to test out a fix.

The high level answer to the original question is “it depends”.

Run in a sysplex

This is how you run in a production environment: you have a sysplex, with a (master) catalog shared by all systems. I cannot create this environment in ZD&T, and setting up a sysplex is a lot of work for a simple requirement.

Copy the Linux file

Because the 3390 volumes are emulated as Linux files, you can copy the Linux file and use that copy in the old ZD&T image, avoiding the risk of damaging the new one. The Linux file name is different, but the VOLID is the same. I was told you can use import catalog to get this to work; I haven’t tried it.

The cluster is in a shared user catalog.

If the VSAM cluster is defined in a user catalog, and the user catalog can be used on both systems, then the cluster can be used on both systems (but not at the same time). When the cluster is used, information about the active system is stored in the cluster. When the file system is unmounted, or OMVS is shut down, this system information is removed. If you do not unmount, or shut down OMVS cleanly, then when the file system is mounted on the other system, the mount will detect that the file system was last used elsewhere, and wait for a minute or so to make sure the other system is inactive. If the mount command is issued during OMVS startup, OMVS will wait for this time. If you have 10 shared file systems, OMVS will wait for each in turn – which can significantly delay OMVS startup.

When the cluster is in the master catalog

Someone suggested

You could mount the volume to your new system and import connect the master catalog of the old system to the new one and define the old alias for the ZFS in the new master pointing to the old master which is now a user catalog to the new system.  If it’s not currently different, you could rename it on the old system to a new HLQ that is different from the existing one and then do the import connect of the master as a usercat and define the new alias pointing to the old ZFS.

This feels too dangerous to me!

Pax the files in the directory

You can use Pax to unload the contents of the directory to a dataset, then load the data from the dataset on the other system.

cd /usr/lpp....
pax -W "seqparms='space=(cyl,(10,10))'" -wzvf "//'COLIN.PAX.PYMQI2'" -x os390 .

On the other system

mkdir mydir
cd mydir
pax -rf "//'COLIN.PAX.PYMQI2A'" .

Note when using cut and paste make sure you have all of the single quotes and double quotes. I found they sometimes got lost in the pasting.

Using DFDSS

See Migrating an ADCD z/OS release: VSAM files

I can’t even spell Ansible on z/OS

The phrase “I can’t even spell….” is a British phrase which means “I know so little about this that I cannot even pronounce or write the word.”

I wanted to see if I could use Ansible to extract some information from z/OS. There is a lot of documentation available, but it felt like the documentation started at chapter 2 of the instruction book, missing the first set of instructions.

Below are the instructions to get the most basic ping request working.

On z/OS

Ansible is a python package which you need to install.

pip install ansible-core

This may install several packages

It is better to do this in an SSH terminal session rather than from ISPF -> OMVS; for example, pip may display a progress bar, which works better in a real terminal.

On Linux

Setup

sudo apt install ansible

I made a directory to store my Ansible files in

mkdir ansible
cd ansible

There is some good documentation here.

Edit the inventory.ini

[myhosts]
10.1.1.2

[myhosts:vars]
ansible_python_interpreter=/usr/lpp/IBM/cyp/v3r12/pyz/bin/python

Where

  • [myhosts]… is the IP address of the remote system.
  • [myhosts:vars] ansible_python_interpreter=… is needed for Ansible to work. It is the location of Python on z/OS.

Check the connection

Ansible uses an SSH session to get to the back end. Check this works before you use Ansible.

ssh colin@10.1.1.2

I have set this up for passwordless logon.

Try the ping

ansible myhosts -u colin -m ping -i inventory.ini

Where

  • -i inventory.ini specifies the configuration file
  • myhosts selects which section of the configuration file to use
  • -u colin logs on with this userid
  • -m ping runs this module

When this worked I got

10.1.1.2 | SUCCESS => {
"changed": false,
"ping": "pong"
}

The command took about 10 seconds to run.

You may not need to specify the -u information.

What can go wrong?

I experienced

Invalid userid

ansible myhosts -u colinaa -m ping -i inventory.ini

10.1.1.2 | UNREACHABLE! => {
"changed": false,
"msg": "Failed to connect to the host via ssh: colinaa@10.1.1.2: Permission denied (publickey,password).",
"unreachable": true
}

This means you got to the system, but you specified an invalid user, or the userid was unable to connect over SSH.

Python configuration missing

ansible myhosts -u colin -m ping -i inventory.ini

This originally gave me

[WARNING]: No python interpreters found for host 10.1.1.2 (tried ['python3.12', 'python3.11',
'python3.10', 'python3.9', 'python3.8', 'python3.7', 'python3.6', '/usr/bin/python3',
'/usr/libexec/platform-python', 'python2.7', '/usr/bin/python', 'python'])
10.1.1.2 | FAILED! => {
"ansible_facts": {
"discovered_interpreter_python": "/usr/bin/python"
},
"changed": false,
"module_stderr": "Shared connection to 10.1.1.2 closed.\r\n",
"module_stdout": "/usr/bin/python: FSUM7351 not found\r\n",
"msg": "MODULE FAILURE\nSee stdout/stderr for the exact error",
"rc": 127
}

Edit the inventory.ini and add the ansible_python_interpreter information.

[myhosts]
10.1.1.2

[myhosts:vars]
ansible_python_interpreter=/usr/lpp/IBM/cyp/v3r12/pyz/bin/python

My certificate has expired – how do I renew it ?

This is a question with an easy answer – once you know it.

//IBMRACF  JOB 1,MSGCLASS=H 
//S1 EXEC PGM=IKJEFT01,REGION=0M
//SYSPRINT DD SYSOUT=*
//SYSTSPRT DD SYSOUT=*
//SYSTSIN DD *
RACDCERT ID(START1) GENREQ(LABEL('NEWTECCTEST')) -
DSN('COLIN.CERT.REQ')

RACDCERT ID(START1) GENCERT('COLIN.CERT.REQ') -
NOTAFTER( DATE(2027-12-21)) -
SIGNWITH (CERTAUTH LABEL('DOCZOSCA'))
RACDCERT LIST (LABEL('NEWTECCTEST')) ID(START1)
//

The first command takes my existing (expired) certificate belonging to userid START1 and creates a certificate request in the data set. The request looks like

-----BEGIN NEW CERTIFICATE REQUEST-----                               
MIIBgjCCAQcCAQAwNzEUMBIGA1UEChMLTkVXVEVDQ1RFU1QxDDAKBgNVBAsTA1NT
...
qZgQtwIwbYYgRWDQcPOZ92sVszf5Bv+mslcDjNAuM5Sj4Z9uadnKsaTmiy6h16tr
TpPAW84d
-----END NEW CERTIFICATE REQUEST-----

The GENCERT command renews it with the specified date. If you omit the date, it defaults to a year from the start date.

With most of my gencert requests, I have specified information like

RACDCERT ID(COLIN) GENCERT -
SUBJECTSDN(CN('10.1.1.2') -
O('NISTEC256') -
OU('SSS')) -
ALTNAME(IP(10.1.1.2)) -
NISTECC -
KEYUSAGE(HANDSHAKE) -
SIZE(256) -
SIGNWITH(CERTAUTH LABEL('DOCZOSCA')) -
WITHLABEL('NISTEC256')
Because I passed a data set, the information was taken from the data set. I think it ignores the SUBJECTSDN etc. data if a data set is used.

When I specified a 2028 date I got message

IRRD113I The certificate that you are creating has an incorrect date range.  The certificate is added with NOTRUST status.  

The IRRD113I message says

“has an incorrect date range”, the date range of the certificate being added is not within the date range established by the CA (certificate authority) certificate.

This is a hint that I need to renew my CA certificate as it will expire in the next two years.

After the gencert command was successful, the list command gave

Digital certificate information for user START1:                    

Label: NEWTECCTEST
Certificate ID: 2Qbi48HZ4/HVxebjxcPD48Xi40BA
Status: NOTRUST
Start Date: 2026/02/25 00:00:00
End Date: 2027/12/21 23:59:59
Serial Number:
>5B<
Issuer's Name:
>CN=DocZosCA.OU=CA.O=COLIN<
Subject's Name:
>CN=10.1.1.2.OU=SSS.O=NEWTECCTEST<
Subject's AltNames:
IP: 10.1.1.2
Signing Algorithm: sha256RSA
Key Usage: HANDSHAKE
Key Type: NIST ECC
Key Size: 384
Private Key: YES
...

Once I had renewed it, I had to restart the servers using it so they picked up the updated certificate.

Logging on to Git (on z/OS)

I’ve gradually been moving away from being 100% ISPF, and moving to OMVS. I use SSH terminals to access the Command Line Interface (CLI) just like I use on Linux, and I do most of my editing with VScode on Linux accessing the files on z/OS over sshfs so they look as if they are in a local Linux directory.

I wanted to use Git on z/OS. It was easy to install and start using, but I had problems logging on to Git.

As I understand it there are several ways of logging on to Git. I’ve used two, HTTPS and SSH.

HTTPS

You can logon to Git with a userid and a Personal Access Token (PAT). A PAT is like a sophisticated password. To get a PAT, go to your Git home page, click on your photo, and click Settings. On the public profile page which is displayed, at the bottom of the left hand column is <> Developer settings. Click on this link, then click on Personal access tokens.

Click on Tokens (classic) -> Generate new token (classic). You have to verify, so I clicked send code via email. Copy the PAT.

When you create a new PAT you can specify what the token can do, for example

  • full control of the private repository, or just access the public repository, or access the commit state.
  • can control the public keys
  • delete repositories

Click on generate token. A token is displayed such as ghp_7OSehXd6lP1234Gy0KRvqpmABALX8L618ycad. Copy this and save it somewhere securely. If you lose it, it is easy to delete and create another.

If you use Git over HTTPS, for example https://github.com/colinpaiceABC/ColinsRepo, it will prompt for a userid (colinpaiceABC) and password; here “password” means the PAT.

You can store the userid and PAT for scripts etc to use to logon.

When you create the PAT you specify the validity period, for example two weeks, so you will need to have a process in place to renew the token.

SSH

You can logon to Git using SSH. Because keys are stored on your local machine, and on the Git server, you do not need to enter userid and password/PAT each time.

Git has excellent documentation on using ssh.

You need an SSH key. Check in directory ~/.ssh for files like id_….pub; I have id_ed25519.pub and id_rsa.pub. If you do not have a key, follow the Git documentation to create one.

Once I had my key I used the documentation on how to add it to Git.

Check you are using the ….pub file. It looks like

ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAA...XX/Xk colin@ColinNew

Add it using picture -> settings -> SSH and GPG keys ….

To use this you access Git via

git clone git@github.com:colinpaicemq/MQTools.git

If it doesn’t work as expected.

I got into a mess because I used

git clone https://github.com/colinpaicemq/MQTools.git

to clone the repository. When I tried to update the repository it asked me for userid and password!

You can change whether you use HTTPS or SSH to logon. For example to set SSH

git remote set-url origin git@github.com:colinpaicetest/testrepro.git

See the documentation.

How to stop when blocked.

I hit an interesting challenge when working with SMF-real-time.

An application can connect to the SMF real time service, and it will be sent data as it is produced. Depending on a flag the application can use:

  • Non blocking. The application gets a record if there is one available. If not, it gets a return code saying “No records available”.
  • Blocking. The application waits (potentially for ever) for a record.

Which is the best one to use?

This is a general problem; it is not just SMF real time that has these challenges.

What are the options- generally?

Non blocking

You need to loop to see if there are records, and if not, wait. The challenge is how long to wait. If you wait for 10 seconds, there may be a 10 second delay between the record being created and your application getting it. A shorter wait reduces this window, but the application does more loops per minute, and so the CPU cost increases.
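A minimal Python sketch of this polling pattern, using a `queue.Queue` as a stand-in for the SMF real time interface (the SMF API itself is not shown); `poll_loop` and its parameters are illustrative:

```python
import queue
import time

def poll_loop(source, interval=1.0, max_empty_polls=3):
    """Non-blocking get pattern: try for a record, sleep when there is
    none.  A short interval means low latency but more wake-ups (CPU);
    a long interval means fewer wake-ups but records wait longer."""
    empty = 0
    records = []
    while empty < max_empty_polls:
        try:
            # the "No records available" case raises queue.Empty
            records.append(source.get_nowait())
            empty = 0
        except queue.Empty:
            empty += 1
            time.sleep(interval)   # the latency/CPU trade-off
    return records

q = queue.Queue()
for r in ("rec1", "rec2"):
    q.put(r)
print(poll_loop(q, interval=0.01))   # ['rec1', 'rec2']
```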

Blocking

If you use blocking, there is no extra CPU from looping round. The problem is how you stop your application cleanly when it is in a blocked wait, to allow clean-up at the end of processing.

A third way

MQSeries processes application messages asynchronously. You can say “wait for a message, but time out after a specified interval if no messages have been received”.

This method is very successful, but some applications abused it. They want their application to wake up every n seconds and check their application’s shutdown flag; if the flag is set, then shut down.

The correct answer in this case is to have MQ post an Event Control Block (ECB), and have the application’s shutdown code post another ECB; the mainline code waits for either of the ECBs to be posted, and takes the appropriate action. However the lazy way of sleeping, waking, checking and sleeping again is quick and easy to code.
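The ECB pattern can be sketched in Python: instead of sleeping and polling, the consumer blocks on a single channel that both the message source and the shutdown request feed, so it wakes immediately whichever happens first. This is an analogue of waiting on either of two ECBs, not the MQ API.

```python
import queue
import threading

SHUTDOWN = object()   # sentinel posted by the shutdown path

def consumer(q):
    """Block on one channel that both the message source and the
    shutdown request feed.  No polling loop, no fixed sleep interval."""
    handled = []
    while True:
        item = q.get()        # blocks until either side posts
        if item is SHUTDOWN:
            break             # clean shutdown, no timeout needed
        handled.append(item)
    return handled

q = queue.Queue()
result = []
t = threading.Thread(target=lambda: result.append(consumer(q)))
t.start()
q.put("msg1")        # the "message arrived" event
q.put(SHUTDOWN)      # the "please stop" event
t.join()
print(result[0])     # ['msg1']
```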

What are the options for SMF real time?

With the SMF real time code, while one thread is running – and in a blocked wait, another thread can issue a disconnect() request. This wakes up the waiting thread with a “you’ve been woken up because of a disconnect” return code.

The solution is to use a threading model.

The basic SMF get code

i = 0
while True:
    x, rc = f.get(wait=True)       # Blocking get
    if x == "":                    # Happens after disc()
        print("Get returned Null string, exiting loop")
        break
    if rc != "OK":
        print("get returned", rc)
        print("Disconnect", f.disc())
    i += 1
    # it worked, do something with the record
    ...
    print(i, len(x))               # Process the data
    ...

Make the SMF get a blocking get with timeout

With every get request, create another thread to wake up and issue the disconnect, to cause the blocked get to wake up.

This may be expensive, as it creates and cancels a timer thread for every record.

from threading import Timer

# This runs after the specified interval, and wakes up the blocked get.
def timerpop(f):
    f.disc()

def blocking_get_loop(f, timeout=10, event=None, max_records=None):
    i = 0
    while True:
        # run timerpop(f) after `timeout` seconds
        t = Timer(timeout, timerpop, args=(f,))
        t.start()
        x, rc = f.get(wait=True)   # Blocking get
        t.cancel()

        ....

Wait for a terminal interrupt

Vignesh S sent me some code which I’ve taken and modified.

The code to the SMF get is in a separate thread. This means the main thread can execute the disconnect and wake up the blocked request.

def blocking_smf(stream_name="IFASMF.INMEM", debug=0, max_records=None):
    f = pySMFRT(stream_name, debug=debug)
    f.conn(stream_name, debug=2)       # Explicit connect

    # Start the blocking loop in a separate thread
    get_thread = threading.Thread(target=blocking_get_loop,
                                  args=(f,),
                                  kwargs={"max_records": max_records})
    get_thread.start()

    try:
        # Main thread: wait for user input to stop
        input("Press Enter to stop...\n")
        print("Stopping...")
    except KeyboardInterrupt:
        print("Interrupted, stopping...")
    finally:
        f.disc()                       # This unblocks the get() call
        get_thread.join()              # Wait for thread to exit

# Usage example: run until Enter or interrupt, or max_records
if __name__ == "__main__":
    blocking_smf(max_records=6)        # execute the code above

The key code is

get_thread = threading.Thread(target=blocking_get_loop,
                              args=(f,),
                              kwargs={"max_records": max_records})
get_thread.start()

This attaches a thread which executes the function blocking_get_loop, passing the positional argument f and the keyword argument max_records.

The code to do the gets and process the requests is the same as before, with the addition of the count of records processed.

def blocking_get_loop(f, max_records=None):
    i = 0
    while True:
        x, rc = f.get(wait=True)       # Blocking get
        if x == "":                    # Happens after disc()
            print("Get returned Null string, exiting loop")
            break
        if rc != "OK":
            print("get returned", rc)
            print("Disconnect", f.disc())
        i += 1
        # it worked, do something with the record
        ...
        print(i, len(x))               # Process the data
        ...
        #
        if max_records and i >= max_records:
            print("Reached max records, stopping")
            # clean up if ended because of number of records
            print("Disconnect", f.disc())
            break

If the requested number of records has been processed, or there has been an error, or unexpected data, then disconnect is called, and the function returns.

Handle a terminal interrupt

The code

try:
# Main thread: wait for user input to stop
input("Press Enter to stop...\n")
print("Stopping...")
except KeyboardInterrupt:
print("Interrupted, stopping...")
finally:
f.disc() # This unblocks the get() call
get_thread.join() # Wait for thread to exit

waits for input from the terminal.

This solution is not perfect because if the requested number of records are processed quickly, you still have to enter something at the keyboard.

Use an event with time out

One problem has been notifying the main task when the SMF get task has finished. You can use an event for this.

In the main logic have

def blocking_smf(stream_name="IFASMF.INMEM", debug=0, max_records=None):
    f = pySMFRT(stream_name, debug=debug)
    f.conn(stream_name, debug=2)       # Explicit connect
    myevent = threading.Event()

    # Start the blocking loop in a separate thread
    get_thread = threading.Thread(target=blocking_get_loop,
                                  args=(f,),
                                  kwargs={"max_records": max_records,
                                          "event": myevent})
    get_thread.start()
    # wait for the SMF get task to end - or the event to time out
    if not myevent.wait(timeout=30):
        print("We timed out")
        f.disc()                       # wake up the blocking get

    get_thread.join()                  # Wait for it to finish

In the SMF code

def blocking_get_loop(f, max_records=None, event=None):
    i = 0
    while True:
        x, rc = f.get(wait=True)       # Blocking get
        if x == "":                    # Happens after disc()
            print("Get returned Null string, exiting loop")
            break
        if rc != "OK":
            print("get returned", rc)
            print("Disconnect", f.disc())
        i += 1
        print(i, len(x))               # Process the data
        if max_records and i >= max_records:
            print("Reached max records, stopping")
            print("Disconnect", f.disc())
            break
    if event is not None:
        event.set()                    # wake up the main task


Why oh why is my application waiting?

I’ve been working on a presentation on performance, and came up with an analogy which made one aspect really obvious…. but I’ll come to that.

This blog post is a short discussion about software performance, and what affects it.

Obvious statement #1

The statement used to be “an application is either using CPU, or waiting”. I prefer to add “or using CPU and waiting”, which is not obvious unless you know what it means.

Obvious statement #2

All applications wait at the same speed. If you are waiting for a request from a remote server, it does not matter how fast your client machine is.

Where can an application wait?

I’ll go from longest to shortest wait times.

Waiting for the end user

If you have displayed a menu for an end user to complete, you might wait minutes (or hours) for the end user to complete the information and send it.

Waiting for a remote request

This can be a request to a remote server to do something: to buy something, a simple web lookup, or a name server lookup. These should all take under a second.

Waiting for disk I/O

If your application is doing database work, for example with DB2, there can be many disk I/Os. Any updates are logged to disk for recovery purposes. If your disk response time is typically 1 ms, you may have to wait several milliseconds.

When your application issues a commit and wants to log data, there will likely be an I/O already in progress, so you have to wait for that I/O to complete before any more data can be written. Typically a database can write 16 4KB pages at a time. If database logging is very active, you may have to wait until any data queued in the log buffers has been written before your application’s data can be written.

An I/O consists of a set-up followed by data transmission. The set-up time is usually fairly constant, but more data takes more time to transfer: writing 16 4KB pages will usually take longer than writing one 4KB page.

An application writing to a file may buffer up several records before writing them to the external medium. Your application wrote 10 records, but there was only one I/O.

These I/Os should be measured in milliseconds (or microseconds).
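The buffering effect can be seen with ordinary file I/O in Python. A sketch (the 64 KB buffer size is an arbitrary choice for illustration):

```python
import os
import tempfile

# Write 10 records through a large buffer; the data reaches the
# file system in (at most) one physical write, when the buffer is flushed.
path = os.path.join(tempfile.mkdtemp(), "records.dat")
with open(path, "wb", buffering=64 * 1024) as fh:  # 64 KB buffer
    for n in range(10):
        fh.write(b"record %d\n" % n)  # buffered in memory - no I/O yet
    # close() flushes the buffer: one write for 10 records

with open(path, "rb") as fh:
    records = fh.read().splitlines()
print(len(records))  # 10 records arrived, via a single flush
```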

Database and record locks

If your application wants to update some information in a database record, it could do

  • Get record for update (this prevents other threads from updating it)
  • Display a menu for the end user to complete
  • When the form has been completed, update the record and commit.

This is an example of “Waiting for the end user”. Another application wanting to update the same record may get an “unavailable” response, or wait until the first application has finished.

You can work around this using logic like

  • Each record has a last updated timestamp.
  • Read the record, note the last-updated timestamp, and display the menu.
  • When the form has been completed:
    • Read the record for update from the database, and check the “last updated” timestamp.
    • If the timestamp matches the saved value, update the information and commit the changes.
    • If the timestamp does not match, the record has been updated by someone else – release it, go back to the top, and try again.
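The timestamp check can be sketched in Python. This uses a toy in-memory “database” (a dict); read_record and update_record are illustrative names, not a real database API:

```python
import time

# Toy in-memory "database": record id -> {"value": ..., "updated": timestamp}
db = {1: {"value": "old", "updated": 100.0}}

def read_record(rec_id):
    rec = db[rec_id]
    return rec["value"], rec["updated"]

def update_record(rec_id, new_value, saved_timestamp):
    """Update only if nobody has changed the record since we read it."""
    rec = db[rec_id]
    if rec["updated"] != saved_timestamp:
        return False  # someone else updated it - caller must re-read and retry
    rec["value"] = new_value
    rec["updated"] = time.time()
    return True

value, ts = read_record(1)        # read, noting the timestamp
# ... end user fills in the form; the record is NOT held for update ...
ok = update_record(1, "new", ts)  # commit only if the timestamp still matches
print(ok)  # True
```

The point of the design is that the record is never locked while the end user is thinking; a conflict is detected at commit time instead.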

Coupling Facility access

This is measured in 10s of microseconds. The busier the CF is, the longer requests take.

Latches

Latches are used for serialisation of small sections of code, for example updating storage chains.

Suppose you have two queues of work elements: one for queued work, one for in-progress work. In a single-threaded application you can simply move a work element between the queues. With multiple threads you need some form of locking.

In its simplest form it is

pthread_mutex_lock(mymutex)
work = waitqueue.pop()
active.push(work)
pthread_mutex_unlock(mymutex)

You should design your code so few threads have to wait.
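The same idea in runnable form, using Python’s threading primitives (a sketch; the queue names mirror the pseudocode above):

```python
import threading
from collections import deque

# The lock is held only for the brief move between the two queues.
lock = threading.Lock()
waitqueue = deque(["work1", "work2"])
active = deque()

def take_work():
    with lock:                     # latch: held for a few instructions only
        work = waitqueue.popleft()
        active.append(work)
    return work

print(take_work())  # "work1" moves from waitqueue to active
```

Keeping the locked region this small is what makes a latch cheap: threads rarely collide because nobody holds it for long.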

Waiting for CPU

This can be due to

  • The LPAR is constrained for CPU; other work gets priority, and your application is not dispatched.
  • The CEC (physical box) is constrained for CPU and your LPAR is not being dispatched.

If your LPAR has been configured to use only one CPU, and there is spare capacity in the CEC, your LPAR will not be able to use it.

Waiting for paging etc

In these days of lots of real storage in the CEC, waiting for paging is not much of an issue. If the virtual page you want is not available, the operating system has to allocate a page and map it to real storage.

Waiting for data – using CPU and waiting.

Some 101 education on IBM Z architecture

  • The processors for the z architecture are in books. Think of a book as being a physical card which you can plug/unplug from a rack.
  • You can have multiple books.
  • Each book has one or more chips
  • Each chip has one or more CPUs.
  • There is cache (RAM) for each CPU
  • There is cache for each chip
  • There is cache for each book
  • At a hardware level, when you are updating a real page, it is locked to your CPU.
  • If another CPU wants to use the same real page, it has to send a message to the holding CPU requesting exclusive use
  • The physical distance between two CPUs on the same chip is measured in millimeters
  • The distance between two CPUs in the same book is measured in centimeters
  • The distance between two CPUs in different books could be a metre.
  • The time to send information depends on the distance it has to travel. Sharing data between two CPUs on the same chip will be faster than sharing data between CPUs in different books.

Some instructions like compare and swap are used for serialising access to one field.

  • Load register 4 with the value from the data field. This could be slow if the real page has to be fetched from another CPU. It could be fast if the storage is in the CPU, chip or book cache.
  • Load register 5 with new value
  • Compare and swap does
    • Get the exclusive lock on the data field
    • If the value of the data field matches the value in register 4 (the compare)
    • then replace it with the value in register 5 (the swap)
    • else say mismatch
    • Unlock.

These instructions (especially the first load) can take a long time, especially if the data field is “owned” by another CPU, and the hardware has to go and get the storage from a CPU in a different book, a metre away.
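Python has no user-visible compare-and-swap instruction, but the load/compare/swap/retry pattern can be sketched with a lock standing in for the hardware atomicity (CASWord is an illustrative name, not a real library class):

```python
import threading

class CASWord:
    """A word supporting compare-and-swap, emulated with a lock.
    On real hardware this is a single atomic instruction."""
    def __init__(self, value=0):
        self.value = value
        self._lock = threading.Lock()

    def compare_and_swap(self, expected, new):
        with self._lock:
            if self.value == expected:  # the compare
                self.value = new        # the swap
                return True
            return False                # mismatch - caller must retry

counter = CASWord(0)

def increment(word):
    while True:                  # the typical CAS retry loop
        old = word.value         # load the current value ("register 4")
        if word.compare_and_swap(old, old + 1):
            break                # the swap succeeded; a mismatch loops round
```

The retry loop is the key point: under contention, each mismatch means another trip to fetch the field from whichever CPU owns it.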

A common use for Compare and Swap is a common trace table: each thread gets the next free element and sets the new next-free pointer. With many CPUs actively using Compare and Swap on the same field, these instructions can become a major bottleneck.

A better design is to give each application thread its own trace buffer, avoiding the need for a serialisation instruction, so there is no contention.
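The per-thread design can be sketched with thread-local storage, which gives each thread its own buffer so the hot path needs no lock at all (a sketch, not a real trace implementation):

```python
import threading

tls = threading.local()          # each thread sees its own "buffer" attribute
all_buffers = []
all_buffers_lock = threading.Lock()  # taken only once per thread, at setup

def trace(entry):
    buf = getattr(tls, "buffer", None)
    if buf is None:
        buf = tls.buffer = []
        with all_buffers_lock:   # rare: register the new per-thread buffer
            all_buffers.append(buf)
    buf.append(entry)            # hot path: no lock, no contention

def worker(n):
    for i in range(100):
        trace((n, i))

threads = [threading.Thread(target=worker, args=(t,)) for t in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sum(len(b) for b in all_buffers))  # 400 entries, written without contention
```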

Storage contention

We finally get to the bit with the analogy to explain storage contention

You have an array of counters, with one slot for each potential thread. With 16 threads, your array has 16 elements.

Each thread updates its counter regularly.

Imagine you are sitting in a classroom listening to me lecture about performance and storage contention.

I have a sheet of paper with 16 boxes drawn on it, one per person (equivalent to one per thread).
I pick a person in the front row, and ask them to make a tick on the page in their box every 5 seconds.

Tick, tick, tick … easy

Now I introduce a second person, and it gets harder. The first person makes a tick; I then walk the piece of paper across the classroom to the second person, who makes a tick. I walk back to the first, who makes another tick, and so on.

This will be very slow.

It gets worse. My colleague is giving the same lecture upstairs. I now do my two people, then go up a floor so someone in the other classroom can make a mark. I then go back down to my classroom, and my people (who have been waiting for me) can make their ticks.

How to solve the contention?

The obvious answer is to give each person their own page, so there is no contention. In hardware terms that might be a 4KB page – or a 256-byte cache line.
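In code, “a page each” means padding the counter array so each thread’s counter sits in its own cache line. A sketch using ctypes (the 256-byte line size matches IBM Z; many other machines use 64 bytes):

```python
import ctypes

CACHE_LINE = 256  # bytes per cache line on IBM Z; commonly 64 elsewhere
N_THREADS = 16

class PaddedCounter(ctypes.Structure):
    # one 8-byte counter, plus padding to fill out a whole cache line,
    # so two threads' counters never share a line
    _fields_ = [("value", ctypes.c_int64),
                ("_pad", ctypes.c_char * (CACHE_LINE - 8))]

# Without the padding, 16 8-byte counters would pack into 128 bytes
# and share cache lines - every update would contend.
counters = (PaddedCounter * N_THREADS)()  # one cache line per thread

counters[3].value += 1  # thread 3 updates its own line; nobody else waits
print(ctypes.sizeof(PaddedCounter), counters[3].value)  # 256 1
```

The trade-off is memory: 16 counters now occupy 4KB instead of 128 bytes, which is exactly the “a page each” answer from the analogy.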

I love this analogy; it has many levels of truth.