Python could not read a data set I sent from z/OS USS.

I created a file in Unix System Services, and FTPed it down to my Linux box. I could edit it, and process it with no problems, until I came to read in the file using Python.

Python gave me

File "<frozen codecs>", line 322, in decode
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb8 in position 3996: invalid start byte

The Linux command file pagentn.txt gave me

pagentn.txt: ISO-8859 text

whereas other files had ASCII text.

I changed my Python program to have

with open("/home/colinpaice/python/pagentn.txt", encoding="ISO-8859-1") as file:

and it worked!

I browsed the web, and found a Python way of finding the code page of a file

import chardet

# read the raw bytes, so no decoding is attempted
with open(infile, 'rb') as f:
    rawdata = f.read()
result = chardet.detect(rawdata)
charenc = result['encoding']

It returned a dict with

result {'encoding': 'ISO-8859-1', 'confidence': 0.73, 'language': ''}
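
Putting the two together, the detected encoding can be fed straight into open(); a minimal sketch continuing from the snippet above (chardet returns None for the encoding when it cannot decide, hence the fallback):

# use the encoding detected above, falling back to UTF-8 if
# chardet could not decide (result['encoding'] is None in that case)
with open(infile, encoding=charenc or "utf-8") as file:
    for line in file:
        print(line, end="")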

One minute MVS: data set file system (/dsfs) on z/OS

There is a good overview of dsfs here.

My understanding of how dsfs works is that you can access z/OS datasets, members and spool, through a Unix file system interface, and so use commands like “ls”. When you read a member, it is read into the dsfs file system, and the Unix commands work on this data. When you write to the Unix file, it is written to the dsfs file system, and when the file is closed, the data is written to the dataset.

Some information is cached in the dsfs file system, and some definitions, such as DCB information on a per-userid basis, are also stored.

DSFS is defined to OMVS as a Physical File System. Part of the definition of the PFS is the started task name. When the PFS is defined to OMVS, it starts the DSFS started task.

To stop it, you issue a command to OMVS to stop the PFS.

Permissions

The dsfs started task has threads which do work. When you issue a request, one of the threads becomes your userid and accesses the data sets, so there is no change to the standard RACF data set protection.

User defaults

You can define defaults for when data sets are created. These are user dependent. To set these for another user, you have to become a superuser, switch to that id, and then issue the dsadm command.

Changing configuration

You can change the configuration file and restart DSFS (which will be disruptive).

You can make changes to the active system (which are not saved) using the dsadm command from OMVS.

Diagnosis

Messages are produced, and are documented in the M&C manual. If you get a reason code, you can use bpxmtext code to display the meaning of the code.

Creating data sets and members from /dsfs

You can configure defaults when using dsfs to create data sets and members. For example

Creating a PDS(E)

cd /dsfs/txt/colin
dsadm createparm -path . -pdsmodel "dsntype(library) dsorg(po) lrecl(133) recfm(vb)
blksize(0)"

For the path /dsfs/txt/colin (-path .) it creates a model for PDS creation (-pdsmodel) with the parameters dsntype…blksize(0).

Note: blksize(0) says "pick the best block size".

You can display this using

dsadm fileinfo   -path /dsfs/txt/colin  

and get

path: /dsfs/txt/colin                                                                                
fid 49,1 anode 1285,4548
length 16384 format BLOCKED
1K blocks 16 dir tree status VALID
PDS model anode 31,60 PS model anode 0,0
object type DIR object linkcount 181
object genvalue 0x00000000 dir version 1
dir name count 431 data set name type HLQ DIR
recfm na lrecl na
data mode TEXT data set status RETRIEVED
data set name COLIN
ENQ held NO
direct blocks 0x00001BAF 0x00001BB0 0xFFFFFFFF 0xFFFFFFFF
0xFFFFFFFF 0xFFFFFFFF 0xFFFFFFFF 0xFFFFFFFF
indirect blocks 0xFFFFFFFF 0xFFFFFFFF 0xFFFFFFFF
mtime Sep 9 10:09:55 2024 atime Sep 9 10:09:55 2024
ctime Sep 9 10:09:55 2024 create date Sep 9 2024
not encrypted not compressed
PDS model dsntype(library) dsorg(po) lrecl(133) recfm(vb) blksize(0)
PS model na
vnode,vntok 0x00000050,,0x072003C0 0x0099A7F0,,0x00000000
opens oi=0 rd=0 wr=0
file segments na file unscheduled na
meta buffers 0 dirty meta buffers 2

Creating a sequential file

dsadm createparm -path /dsfs/txt/colin -psmodel "dsorg(ps) lrecl(80) recfm(fb) blksize(3200)"

The command

dsadm fileinfo   -path /dsfs/txt/colin 

now gives

PS model               dsorg(ps) lrecl(80) recfm(fb) blksize(3200)    

The command

echo "hi" > /dsfs/txt/colin/seq2          

creates a data set COLIN.SEQ2 with characteristics

Record format . . . : FB
Record length . . . : 80
Block size . . . . : 3200

If you specify blksize(0) the system applies a default. In my case this was Block size . . . . : 27920, which makes good use of the disk space.
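
Once the data set exists, any Unix program can use it through the /dsfs path. For example, a minimal Python sketch (run under z/OS Unix System Services, and assuming the COLIN.SEQ2 data set created above) which reads the records as lines:

# read the sequential data set COLIN.SEQ2 through the dsfs text view;
# each record appears as a newline-terminated line
with open("/dsfs/txt/colin/seq2") as f:
    for record in f:
        print(record.rstrip())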

Different data set types

There are three data set types

  • bin – binary for example load modules
  • rec – records
  • txt – text files, with a new line character “at the end of the line”

You can define the data set characteristics for each of these.

Problems I experienced when using /dsfs

I played around with /dsfs and found a couple of problems.

My userid is COLIN.

Data set not seen for a time

I created a data set COLIN.ENCR2.DSN in ISPF (and also in batch).

The command

ls -ltr /dsfs/txt/colin/en*

found the data set /dsfs/txt/colin/encrypt.jcl, but it did not find /dsfs/txt/colin/encr2.dsn.

A while later (a minute or two) it found /dsfs/txt/colin/encr2.dsn. This is because the DIRECTORY_REFRESH_TIMEOUT parameter controls how often the directory is refreshed. Its default value is 60 seconds. You can change it dynamically via the command dsadm config -directory_refresh_timeout <number>. The expected value is a number in the range 30-600.

Encrypted data set not seen

I had set up a data set which was encrypted, and empty. I could not list it using /dsfs.

I had to set up an encrypted ZFS for DSFS (as the documentation says) for it to be seen. I expected the data set name to be visible, but not the contents.

Create the label in ICSF

//IBMICSF  JOB 1,MSGCLASS=H 
//STEP10 EXEC PGM=CSFKGUP
// SET CKDS=COLIN.SCSFCKDS
//CSFCKDS DD DISP=OLD,DSN=&CKDS
//* LENGTH(32) GENERATES A 256 BIT KEY
//CSFIN DD *,LRECL=80
ADD TYPE(DATA) ALGORITHM(AES),
LABEL(DSFSAES ) LENGTH(32)
DELETE TYPE(DATA) LABEL(COLINBATCHAES )
/*
//CSFDIAG DD SYSOUT=*,LRECL=133
//CSFKEYS DD SYSOUT=*,LRECL=1044
//CSFSTMNT DD SYSOUT=*,LRECL=80
//* REFRESH THE IN MEMORY DATA
//REFRESH EXEC PGM=CSFEUTIL,PARM='&CKDS,REFRESH'

Give the DSFS userid access to the label

//IBMRACF2 JOB 1,MSGCLASS=H,RESTART=S2 
// EXPORT SYMLIST=(*)
// SET KEY=DSFSAES
//S1 EXEC PGM=IKJEFT01,REGION=0M
//SYSPRINT DD SYSOUT=*
//SYSTSPRT DD SYSOUT=*
//SYSTSIN DD *,SYMBOLS=(JCLONLY)
RDELETE CSFKEYS &KEY
RDEFINE CSFKEYS &KEY +
ICSF(SYMCPACFWRAP(YES) SYMCPACFRET(YES)) +
UACC(NONE)
PERMIT &KEY +
CLASS(CSFKEYS) ID(DSFS ) +
ACCESS(READ)
SETROPTS RACLIST(CSFKEYS) REFRESH

RLIST CSFKEYS &KEY AUTHUSER ICSF

ADDSD 'COLIN.ZFS.DSFS' UACC(NONE)
ALTDSD 'COLIN.ZFS.DSFS' UACC(NONE) DFP(DATAKEY(&KEY))
/*

Define the ZFS for DSFS

I stopped DSFS using the operator command

f omvs,stoppfs=dsfs

I could then delete and recreate the ZFS. If I had wanted to keep the contents of the DSFS ZFS, I think I could have used IDCAMS REPRO to copy the contents to a new file.

//IBMCLUST  JOB 1 
//DEFINE EXEC PGM=IDCAMS,REGION=0M
//SYSPRINT DD SYSOUT=*
//SYSIN DD *
DELETE COLIN.ZFS.DSFS
DEFINE CLUSTER (NAME(COLIN.ZFS.DSFS) -
VOLUMES(* ) -
CYL(100 1000) -
KEYLABEL(DSFSAES) -
DATACLASS(DCEATTR) -
ZFS )
LISTCAT ENT(COLIN.ZFS.DSFS) ALL
/*

The LISTCAT output had

LISTCAT ENT(COLIN.ZFS.DSFS) ALL                                                                            
CLUSTER ------- COLIN.ZFS.DSFS
...
ENCRYPTIONDATA
DATA SET ENCRYPTION----(YES)
DATA SET KEY LABEL-----DSFSAES

I restarted DSFS.

With the DSFS ZFS encrypted, the ls command for the encrypted data set worked, and I was able to read the content.

When my userid COLIN, which had access to the encrypted data set but not the encryption key, tried to display the contents, there were messages on the system log

ICH408I USER(COLIN   ) GROUP(SYS1    ) NAME(####################) 008    
COLINBATCHAES CL(CSFKEYS )
INSUFFICIENT ACCESS AUTHORITY
ACCESS INTENT(READ ) ACCESS ALLOWED(NONE )
IDFS00338E RACROUTE REQUEST=FASTAUTH call to authorize access to the keylabel COLINBATCHAES
for data set COLIN.ENCR.DSN for user COLIN failed with SAF rc 8 RACF rc 8 Reason Code 0.

and the OMVS session got

cat: /dsfs/txt/colin/encr.dsn: EDC5164I SAF/RACF error.    

It is hard work getting error messages

I got

COLIN:/u: >dsadm config -hlq_list_add SYS1.PARMLIB                                                                                
IDFS00324E Could not add the specified high-level qualifier directory names to the HLQ list, error code 139 reason code ED04618D.

To get the full description of the error I had to use

COLIN:/u: >bpxmtext ED04618D                                                                                                      
DSFS Tue Sep 12 19:21:54 EDT 2023
Description: The caller is not UID 0.

Action: Ensure that you are properly authorized for this operation. This
subcommand requires UID 0 or READ access to the SUPERUSER.FILESYS.PFSCTL
profile in the z/OS UNIXPRIV class.

Where’s the $?

Because the OMVS shell uses $ to replace variables, you have to quote the data set name.

IBMUSER:/u/ibmuser: >ls /dsfs/txt/sys1.proclib/$$$COIBM                           
ls: FSUM6785 File or directory "/dsfs/txt/sys1.proclib/50397215" is not found

You have to use

ls '/dsfs/txt/sys1.proclib/$$$COIBM'     

with quotes around it, or escape the special characters.

ls /dsfs/t*/sys1/proclib/\$\$\$coibm

Setting up data set file system /dsfs on z/OS

There is a good overview of dsfs here.

The documentation is pretty good. It is clear – but I would have ordered the sections differently.

Concept

My understanding of how dsfs works is that you can access z/OS datasets, members and spool, through a Unix file system interface, and so use commands like “ls”. When you read a member, it is read into the dsfs file system, and the Unix commands work on this data. When you write to the Unix file, it is written to the dsfs file system, and the data is written to the dataset when the file is closed.

Some information is cached in the dsfs file system, and some definitions, such as DCB information on a per-userid basis, are also stored.

Setup the RACF definitions

//IBMRACFL JOB 1,MSGCLASS=H                                
//S1 EXEC PGM=IKJEFT01,REGION=0M
//SYSPRINT DD SYSOUT=*
//SYSTSPRT DD SYSOUT=*
//SYSTSIN DD *
ADDGROUP DSFSGRP SUPGROUP(SYS1)
ADDUSER DSFS DFLTGRP(DSFSGRP) AUTHORITY(USE) UACC(NONE)
RDEFINE STARTED DSFS.** STDATA(USER(DSFS))
SETROPTS RACLIST(STARTED)
SETROPTS RACLIST(STARTED) REFRESH
/*
//

The started task procedure is in IOE.SIOESAMP(DSFS)

You can configure it to use the parmlib concatenation. I configured it to use a member in USER.Z31A.PARMLIB.

//DSFS     PROC REGSIZE=0M                                            
//*
//DSFSGO EXEC PGM=BPXVCLNY,REGION=&REGSIZE,TIME=1440
//*
//*STEPLIB DD DISP=SHR,DSN=hlq.SIEALNKE <--DSFS LOADLIB
//IDFZPRM DD DISP=SHR,DSN=USER.Z31A.PARMLIB(DSFS) <--DSFS PARM FILE

This is started automatically when you activate the dsfs physical file system (in a BPXPRMxx member).

Create the ZFS

If you are going to use encrypted data sets the ZFS needs to be encrypted. See One minute MVS – Using individual data set encryption on z/OS.

I do not know how much space you actually need. If you specify a secondary extent, the ZFS can expand as needed.

If you want the ZFS to be larger than 4GB you need to specify Extended Addressability. On ADCD there is an SMS Data Class DCEATTR with extended attributes.

The documentation talks about having the z/OS image name as part of the data set name when part of a sysplex. I specified the name as part of the DSFS startup.

If you are using ADCD and you use a “system” HLQ such as ZFS.DSFS, then when you move to the next release of ADCD you’ll have to migrate it. By using COLIN as a HLQ, it gets picked up automatically in the next release after I have imported the user catalog.

//IBMCLUST  JOB 1 
//DEFINE EXEC PGM=IDCAMS,REGION=0M
//SYSPRINT DD SYSOUT=*
//SYSIN DD *
DELETE COLIN.ZFS.DSFS
DEFINE CLUSTER (NAME(COLIN.ZFS.DSFS) -
VOLUMES(* ) -
CYL(100 1000) -
DATACLASS(DCEATTR) -
ZFS )
LISTCAT ENT(COLIN.ZFS.DSFS) ALL
/*

The ZFS is formatted when it is first used, and as it expands.

Startup parameters

In USER.Z31A.PARMLIB(DSFS), referred to in the DSFS started procedure, I have

* 
UTFS_NAME=COLIN.ZFS.DSFS

Without the UTFS_NAME, it will look for a data set with the image name. If you are running in a sysplex each system needs its own ZFS.

Define the root

In TSO OMVS use mkdir /dsfs

Define the physical file system, and mount the file system

I created USER.Z31A.PARMLIB(BPXPRMFS) with

FILESYSTYPE TYPE(DSFS) ENTRYPOINT(IDFFSCM) ASNAME(DSFS)

MOUNT TYPE(DSFS)
MODE(RDWR)
NOAUTOMOVE
MOUNTPOINT('/dsfs')
FILESYSTEM('COLIN.ZFS.DSFS')

Activate and start it

You can use the operator command set omvs=(FS) to activate the member bpxprmfs (defined above). This starts the address space DSFS. You could have this activated as part of IPL in the usual way.

The first time I started DSFS, I got messages

SET OMVS=(FS)                                                         
IEE252I MEMBER BPXPRMFS FOUND IN USER.Z31A.PARMLIB
BPXO032I THE SET OMVS COMMAND WAS SUCCESSFUL.
IEF196I IEF285I USER.Z31A.PARMLIB KEPT
IEF196I IEF285I VOL SER NOS= A3CFG1.
...
$HASP100 DSFS ON STCINRDR
$HASP373 DSFS STARTED
IEF403I DSFS - STARTED - TIME=08.51.23
IDFS00153I DSFS kernel: Initializing z/OS DSFS 279
Version 03.01.00 Service Level OA65531 - HZFS510.
Created on Tue Sep 12 19:21:54 EDT 2023.
Address space asid x47.
IDFS00064I USER.Z31A.PARMLIB(DSFS) is the configuration data set currently in use.
IDFS00352I DSFS is using HLQ EXCLUDE mode.
IDFS00039I DSFS kernel: initialization complete.
BPXF025I FILE SYSTEM COLIN.ZFS.DSFS IS BEING MOUNTED.
IDFS00320I Formatting to 4096 control interval 18000 for primary
IDFS00003I Primary extent loaded successfully for COLIN.ZFS.DSFS.
IDFS00034I Detaching utility file system COLIN.ZFS.DSFS.

Stop dsfs

If you get messages saying the data set is in use etc., this may be because the DSFS address space is running. You can stop it using f omvs,stoppfs=dsfs, fix any problems, and restart it.

Setting up permissions

Some of the dsadm commands, for example those which change the configuration, need authorization. The issuer must be logged in as a root user (UID=0) or have READ authority to the
SUPERUSER.FILESYS.PFSCTL resource in the z/OS UNIXPRIV class. Note: If you are permitted READ to the BPX.SUPERUSER resource in the RACF FACILITY class, you can become UID 0 by issuing the su command.

PERMIT SUPERUSER.FILESYS.PFSCTL CLASS(UNIXPRIV) ID(userid) ACCESS(READ)
SETROPTS RACLIST(UNIXPRIV) REFRESH

PERMIT BPX.SUPERUSER CLASS(FACILITY) ID(userid) ACCESS(READ)
SETROPTS RACLIST(FACILITY) REFRESH

Once it has started – now what?

In OMVS you can use the ls -ltr command

IBMUSER:/u/ibmuser: >ls -ltr  /dsfs                           
total 0
drwxrwxrwx 2 OMVSKERN SYS1 0 Sep 9 03:51 txt
drwxrwxrwx 2 OMVSKERN SYS1 0 Sep 9 03:51 sysout
drwxrwxrwx 2 OMVSKERN SYS1 0 Sep 9 03:51 rec
drwxrwxrwx 2 OMVSKERN SYS1 0 Sep 9 03:51 bin

You can use the dsadm commands in Unix System Services

dsadm fsinfo
File System Name: COLIN.ZFS.DSFS

System: S0W1 Devno: 0
Size: 72000K Free 8K Blocks: 8732
Free 1K Fragments: 7 Log File Size: 2056K
Bitmap Size: 16K Anode Table Size: 8K
File System Objects: 7 Version: 1
Overflow Pages: 0 Overflow HighWater: 0
Space Monitoring: 0,0
ENOSPC Errors: 0 Disk IO Errors: 0
Status: NE,NC

File System Creation Time: Sep 9 03:51:25 2024
Mount Time: Sep 9 03:51:25 2024

Last Grow Time: n/a

How do I diff on z/OS with Unix files and directories

I wanted to compare the contents of two directories in Unix System Services on z/OS before I merged them. This took me some time to do because the documentation is lacking.

With ISPF you can use SUPERC (3.13), give it two PDSs and it shows you the differences.

On Unix there is the diff command. This can compare individual files, or directories. It can display just the changes, or the changes in context.

File a

AOnly 
line1
Aline2
line3

File b

line1 
Bline2
line3
BOnly
BOnly2

Using ISPF edit compare to show differences

You can use diff to show the differences between two files, but the output is not easy to understand. ISPF EDIT has the compare facility. If you know two files are different you can use

  • oedit /etc/zexpl/rseapi.env
  • use the primary command compare
    • enter the fully qualified name /u/ibmuser/zexpl/rseapi.env in the "Name . . . . ." field. If you specify +/filename the + means the same directory.
    • press PF3 and it will show the differences

Because I tend to remove comments to make it easier to see the content, I usually use

  • oview /etc/ssh/sshd_config to get into ISPF edit, where no changes are saved.
  • Comments start with a #. x all;f '#' all 1 1; f ' ' 1 1 all;del all nx removes comment lines and blank lines
  • reset to show the hidden lines
  • compare, where I then specify my version of the file, and see the changes.

  • Lines like ====== TrustedUserCAKeys /etc/ssh/user_ca_key.pub are from my copy.
  • Lines in green with a line number are in both files 000006 Subsystem sftp /usr/lib/ssh/sftp-server
  • Lines like .OAAAA UseDNS yes are from the base file

Update your version, make a copy of the original, then replace the original with your version.

diff -c1 a b – show the files and the changes in context

The prefix on each line shows what has to happen to the first file for it to match the second file:

  • - to be removed
  • ! to be changed
  • + to be added

The command diff -c1 a b gives

*** a Tue Aug 27 02:48:56 2024              
--- b Tue Aug 27 02:50:41 2024
***************
*** 1,4 ****
- AOnly
line1
! Aline2
line3
--- 1,5 ----
line1
! Bline2
line3
+ BOnly
+ BOnly2
  • *** a Tue Aug 27 02:48:56 2024 the first file name, and last changed date
  • --- b Tue Aug 27 02:50:41 2024 the second file name, and last changed date
  • *** 1,4 **** the *** show it is file 1, lines 1 to 4
  • - AOnly this line is in file a but not in file b, so would need to be removed (-) from file a
  • ! Aline2 this line is also in file b, but different
  • --- 1,5 ---- this is file b, lines 1 to 5
  • ! Bline2 this line is also in file a, but different
  • + BOnly this line is in file b and was additional (+) to file a
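
If you prefer to produce this format programmatically, Python's difflib can generate the same context-diff layout; a minimal sketch using the files a and b from above, where n=1 gives one line of context like diff -c1:

import difflib

with open("a") as f:
    a = f.readlines()
with open("b") as f:
    b = f.readlines()

# context_diff produces the *** / --- / ! + - format shown above
for line in difflib.context_diff(a, b, fromfile="a", tofile="b", n=1):
    print(line, end="")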

When one file exists but is empty you get output like

*** /etc/resolv.conf Wed Mar  6 11:54:50 2024                         
--- /u/ibmuser/temp/resolv.conf Thu Dec 7 05:40:24 2023
***************
*** 0 ****
--- 1,2 ----
+ nameserver 127.0.0.1
+ TCPIPJOBNAME TCPIP

which follows the rules I explained above. *** 0 **** shows that file a has no lines; the --- 1,2 ---- line shows lines 1 to 2 come from the other file.

diff a b – show just the changes

gives

1d0             
< AOnly

3c2
< Aline2
---
> Bline2

4a4,5
> BOnly
> BOnly2

The output can be split into sections. The first line of each section is like

  • 1d0 line 1 of file a needs to be deleted; it has no counterpart after line 0 of file b
  • 3c2 line 3 of file a is changed, corresponding to line 2 of file b
  • 4a4,5 after line 4 of file a, add lines 4 to 5 of file b

The < and > tell you which file the data came from

When data is changed it gives the lines

  • < content of file a
  • output divider
  • > content of file b

When the data is in file a and not file b

  • < contents of file a

When the data is in file b and not file a

  • > contents of file b

diff -s dir1 dir2 compare the directory contents

If you give diff two directories it compares the directory contents; the -s option also reports files which are identical.

You can use

diff -c1 dir1 dir2 

with -c1 to display the changes in context (how I like it).

With the directory entries you get records like

Only in /u/ibmuser/temp: test.tar 
Common subdirectories: /etc/wbem and /u/ibmuser/temp/wbem
Only in /etc: yylex.c
diff -c1 /etc/hosts /u/ibmuser/temp/hosts
*** /etc/hosts Wed Mar 6 11:06:55 2024
--- /u/ibmuser/temp/hosts Tue Feb 28 12:43:07 2023

You can find which files are missing from /etc by using grep 'Only in /u/ibmuser/temp' on the output.

It shows the command used for the individual files, and the output

diff -c1 /etc/hosts /u/ibmuser/temp/hosts
*** /etc/hosts Wed Mar 6 11:06:55 2024
--- /u/ibmuser/temp/hosts Tue Feb 28 12:43:07 2023
...

diff -s -r dir1 dir2 compare the directory contents

The -r option also compares the files in common subdirectories, recursively.
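
An alternative way to compare directories is Python's filecmp module; a minimal sketch comparing the two directories used above. Note that filecmp reports which files differ, it does not show the line-by-line changes:

import filecmp

# compare the two directory trees used in the examples above
dc = filecmp.dircmp("/etc", "/u/ibmuser/temp")
print("Only in", dc.left, ":", dc.left_only)
print("Only in", dc.right, ":", dc.right_only)
print("Files which differ:", dc.diff_files)

# report_full_closure() recurses into common subdirectories, like -r
dc.report_full_closure()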

z/OS System SSL strange behaviour with environment variables

I was trying to use System SSL to write a program to use native z/OS TLS facilities. I wasted a couple of hours because it said it could not find my keyring. Then when I collected a trace, it sometimes did not find the file, which did exist, as I could list it.

If I used

//START1   EXEC PGM=GSKMAIN,REGION=0M, 
//* PARM='4000'
// PARM=('ENVAR("_CEE_ENVFILE=DD:STDENV")/4000')
//STDENV DD PATH='/u/ibmuser/gskparms'

When the USS file had

GSK_TRACE_FILE=/tmp/zzztrace.file 
GSK_TRACE=0xff
GSK_KEYRING_FILE=START1/TN3270

This worked fine.

When I used

//START1   EXEC PGM=GSKMAIN,REGION=0M, 
//* PARM='4000'
// PARM=('ENVAR("_CEE_ENVFILE=DD:STDENV")/4000')
//STDENV DD *
GSK_TRACE_FILE=/tmp/zzztrace.file
GSK_TRACE=0xff
GSK_KEYRING_FILE=START1/TN3270
/*

This failed to work.

If I looked in the trace file I had

ENTRY gsk_open_keyring(): ---> Keyring 'START1/TN3270                       ' 

It had taken the whole length of the 80 byte input record, so START1/TN3270 padded with blanks was not found.

The trace file was not /tmp/zzztrace.file, it was /tmp/zzztrace.file padded with lots of blanks!

The answer is to use an environment file in USS, not in JCL or a data set.
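
A quick way of checking an environment file for this problem is to look for trailing blanks on each line; a minimal sketch, assuming the /u/ibmuser/gskparms file from the example above:

# flag lines with trailing blanks, which would become part of the value
with open("/u/ibmuser/gskparms") as f:
    for n, line in enumerate(f, 1):
        text = line.rstrip("\n")
        if text != text.rstrip():
            print(f"line {n} has trailing blanks: {text!r}")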

Setting up for PING can be difficult!

  • Setting up ping within one machine is trivial
  • Setting up ping between two machines is relatively easy
  • Setting up ping with three machines can be hard to get right and to get working.

Most documentation describes how to set up PING between two machines, and does not mention anything more complex.

Ancient history (well, the 1980s)

To understand modern TCPIP networks, it helps to know about the origins of Ethernet.

Originally with Ethernet there was a bus; a very long cable. You plugged your computer into this bus. Each computer on the bus had a unique 48 bit MAC address. You could send a request to another computer on this bus using its Ethernet address. You could also send a request to ALL computers on this bus. These days this might be: send to all computers, "does anyone have this IP address…?"

An Ethernet bus is connected to an Ethernet router which can route packets to other routers.

Ethernet has evolved and instead of a long bus shared by all users, you have a switch device, and you plug the Ethernet cable from your computer into the switch. Conceptually it is the same, but with the switch you get much better performance and usability. You still have routers to get between different Ethernet environments. The request sent to all computers ("does anyone have this IP address…?") goes to the switch, and the switch sends it to all plugged-in computers.

When you start TCPIP, it configures the Ethernet hardware adapter with the addresses it should listen for.

Other terminology and concepts

  • A router is a networking device which forwards packets between different networks. Packets of data get sent from the originator through zero or more routers to the final destination.
  • A gateway has several meanings
    • It can connect different network types, for example act as a protocol translator, it can have a built-in firewall
    • It can route IP traffic between different IP subnets
    • It can loosely mean a router

Usually a router or gateway is a dedicated device or hardware.

You can have a computer act as a router or gateway; it takes data from one interface and routes it to a different interface. The computer can pass the data through a firewall, or transform it, for example converting internal IP addresses to external IP addresses (NAT translation).

At a concept level, a computer’s network adapter is configured only to listen for packets which match the IP addresses of the interface, and ignore the rest.

If you want a computer to act as a router or gateway, you need to configure the network adapter to listen to all traffic, and to process the traffic. This is important when routing traffic from computer A through computer B to computer C.

A simple one hop ping, first time

I have set up my laptop to talk to a server over Ethernet.

I have configured my laptop using

sudo ip -6 addr add 2001:7::/64 dev enp0s31f6

This says: for any IPv6 address starting with 2001:0007:0000:0000, try sending it down the Ethernet cable with interface name enp0s31f6.
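
You can check this prefix matching with Python's ipaddress module, which expands the :: shorthand and tests whether an address is within a subnet:

import ipaddress

net = ipaddress.ip_network("2001:7::/64")
addr = ipaddress.ip_address("2001:7::2")

print(addr.exploded)   # 2001:0007:0000:0000:0000:0000:0000:0002
print(addr in net)     # True - the first 64 bits match the prefix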

Ping a non existing address

If I try to ping an address, say 2001:7::99, a packet is sent to all computers on the Ethernet bus

from myipAddress to everyone on the Ethernet bus, does anyone have IP address 2001:7::99?

There is no reply because no computer on the bus has the address.

Ping the adjacent box

On the server, the IP address of the Ethernet cable is 2001:7::2.

If I ping this address, there are the following flows

From myipAddress (and myMAC) to everyone on the Ethernet bus, does anyone have IP address 2001:7::2?
From 2001:7::2 to myipAddress (MAC), yes I have that IP address. My Ethernet MAC address is …

My laptop puts the remote IP address and its MAC address into its neighbour cache, then issues the ping request.

The ping request looks in the neighbour cache, finds the IP address and MAC address, and issues

From myipAddress to 2001:7::2 at macAddress .. ping request.

The second and later times I issue a ping

The ping request looks in the neighbour cache, finds the IP address and MAC address, and issues the ping

From myipAddress to 2001:7::2 at macAddress .. ping request.

So while the neighbour cache still has the IP address of the target, and its MAC address, the ping can be routed to the next hop.
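
On Linux you can display the neighbour cache with the iproute2 command ip -6 neigh show. A minimal sketch wrapping it from Python:

import subprocess

# each entry maps an IP address to the MAC address learned from
# neighbour discovery, with its state (for example REACHABLE or STALE)
out = subprocess.run(["ip", "-6", "neigh", "show"],
                     capture_output=True, text=True, check=True)
print(out.stdout)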

Setting up a multi hop ping

I have my laptop connected via a physical Ethernet cable to my server. The server is connected to z/OS via a virtual Ethernet connection. The IP address of the z/OS end of the virtual Ethernet cable is 2001:7::4.

The ping 2001:7::4 request on my laptop does not work (as we saw above), because TCPIP asked everyone on the Ethernet bus if it has address 2001:7::4, and no machine replied.

You need to define the link to the server machine as a gateway or router-like device which can handle IP addresses which are not on the Ethernet bus. You define it like

sudo ip -6 route add 2001:7::/64 via 2001:7::2

This says for any traffic with IP address 2001:0007:0000:0000:* send it via address 2001:7::2.

This requires 2001:7::2 to be known about, and so needs the following to be configured first

sudo ip -6 route add 2001:7::2 dev enp0s31f6

for TCPIP to know where to send the traffic to.

The route add 2001:7::2 dev enp0s31f6 command sends a broadcast on the Ethernet bus asking "does anyone have 2001:7::2?". My server replies saying yes, I have it. This is the same as for the ping above.

In summary

To send traffic on the same Ethernet bus you use

sudo ip -6 route add 2001:7::2 dev enp0s31f6

To route it via a router, switch, or computer acting as a router or switch you need

sudo ip -6 route add 2001:7::/64 via 2001:7::2 or
sudo ip -6 route add 2001:7::/64 via 2001:7::2 <dev enp0s31f6>

The computer acting as a router or switch may need additional configuration. For example to allow it to route traffic from one Ethernet bus to another Ethernet bus you need to enable packet forwarding. On Linux to enable forwarding for all interfaces use

sysctl -w net.ipv6.conf.all.forwarding=1

On z/OS you use a static route such as

ROUTE 2001:7::/64 2001:7::3 JFPORTCP6

Where 2001:7::/64 is the range of addresses, 2001:7::3 is (one of) the addresses of the interface at the remote end of the cable, and JFPORTCP6 is the interface name. This is similar to the Linux route statement above.

You might need to set up the firewall. On my server I needed

sudo ufw route allow in on eno1 out on tap2

What IP Address does the sender have?

This is where it starts to get more complex.

Every network connection has at least one IP address.

With IP V6

  • Each interface gets an “internal” IP address such as fe80::9b07:33a1:aa30:e272
  • You can allocate an external address using the Linux command sudo ip -6 addr add 2001:7::1 dev enp0s31f6
  • If the interface has one external IP address configured, then this will be used.
  • If the interface has more than one external address configured then the first in the list may be used (not always).
  • If the interface does not have an external IP address, then TCPIP will find one from another interface, or allocate one for the duration of the request, such as 2a00:23c5:978f:9999:210a:1e9b:94a4:c8e. This address comes from my wireless connection

When my laptop is started, only the wireless connection has an IPV6 address. A ping request to 2001:7::4 had the origin IP address of 2a00:23c5:978f:9999:210a:1e9b:94a4:c8e which is the first address in the list for the wireless connection.
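
You can see which source address the stack would pick for a given destination by connecting a UDP socket; the connect() chooses a route and a source address but sends no packets. A minimal sketch, assuming there is a route to the destination:

import socket

# connect() on a UDP socket selects a route and source address without
# sending anything; getsockname() shows the address that would be used
s = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
s.connect(("2001:7::4", 9))     # destination from the example above
print(s.getsockname()[0])
s.close()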

You can tell ping which address to use, for example

ping -I 2a00:23c5:978f:9999:cff1:dc13:4fc6:f21b 2001:7::4 (this fails because the server does not know how to send the response back to the requester)

I defined a new address for the interface using

sudo ip -6 addr rep 2002:7::1 dev enp0s31f6

I could issue ping -I 2002:7::1 2001:7::4, but this failed to get a response, because the back-end and intermediate nodes did not know how to get the response back to 2002:7::1.

Does this sender address matter?

Yes, because the remote end needs to have a definition to be able to send the response back.

I had my laptop connected to a Linux server over Ethernet, which in turn was connected to z/OS over a virtual Ethernet.

On z/OS I could see the ping request arrive, but it could not send the response back because it did not know how to get to 2a00:23c5:978f:9999….

I configured the laptop end of the Ethernet to give it an IP address 2001:7::1

sudo ip -6 addr add 2001:7::1/64 dev enp0s31f6

I configured the server to laptop to have an IP address of 2001:7::2

sudo ip -6 route add 2001:7::1/128 dev eno1

I configured the server to z/OS with an IP address and a route

sudo ip -6 addr add 2001:7::3/128 dev tap2
sudo ip -6 route add 2001:7::4/128 dev tap2

Now when I did the ping, the originator was 2001:7::1.

I configured the z/OS interface to send stuff back

INTERFACE JFPORTCP6 
DEFINE IPAQENET6
CHPIDTYPE OSD
PORTNAME PORTC

INTERFACE JFPORTCP6
ADDADDR 2001:DB8:8::9
INTERFACE JFPORTCP6
ADDADDR 2001:DB8::9
INTERFACE JFPORTCP6
ADDADDR 2001:7::4

START JFPORTCP6

and the routing

BEGINRoutes 
; Destination Gateway LinkName Size
ROUTE 2001:7::/64 2001:7::3 JFPORTCP6 MTU 5000
...
ENDRoutes

This says: send all traffic with destination address 2001:0007:0000:000…. to interface JFPORTCP6. This interface is connected to a gateway with the address of the remote end of the Ethernet (so on the server), 2001:7::3.

The server machine needs to act as a router between the different Ethernet buses. You can display and configure this using

sysctl -a |grep forwarding
sudo sysctl -w net.ipv6.conf.all.forwarding=1

Packet forwarding on z/OS

By default the z/OS interface only listens for packets with one of the IP addresses of the interface. For z/OS to be able to be a router (accept all packets, and route them to other interfaces) you need:

  • IPCONFIG DATAGRAMFWD in the TCP/IP Profile
  • PRIROUTER on the interface definition. This configures the Ethernet adapter (hardware) so that if a datagram is received at this device for an unknown IP address, the datagram is routed to this TCP/IP instance.

But normally if you are running z/OS you use a cheaper physical router, rather than using z/OS to do your routing. It might only be people like me, who run z/OS on their laptop, who want to try routing through z/OS.

Passing parameters to Java programs

There is a quote by Alexander Pope

A little learning is a dangerous thing.
Drink deep, or taste not the Pierian Spring;
There shallow draughts intoxicate the brain,
and drinking largely sobers us again.

I was having problems passing parameters to Java. I had a little knowledge about Java web servers on z/OS, so I was confident I knew how to configure a web server; unfortunately I did not have enough knowledge.

I was trying to get keyrings working with a web server on z/OS, but what I specified was not being picked up. In this post I'll focus on keyrings, but it applies to other parameters as well.

Setting the default values

You can give Java a default keyring name

-Djavax.net.ssl.keyStore=safkeyring://START1/MQRING

This says: if the application does not specify a keystore, use this value. If the application does not create its own keystore, a default keystore is created using these parameters.

Specify the value yourself

You can create your own keystore, passing the keystore name and keystore type. You could use your own option, such as -DCOLINKEYRINGNAME=safkeyring://START1/MQRING, and use that when configuring the keystore to Java. In this case the -Djavax.net.ssl.keyStore=safkeyring://START1/MQRING is not used. This is the same for other parameters.

Web server configuration

The Apache Tomcat and IBM Liberty web servers are configured using server.xml files, and the web server creates the keystores etc. using the information in the file.

For example a web server may have content like

<Connector port="${port.http}" protocol="${http.protocol}" 
     SSLEnabled="true" 
     maxThreads="550" scheme="https" secure="true" 
     clientAuth="false" sslProtocol="TLS" 
     keystoreFile="${keystoreFile}" 
     keystorePass="${keystorePass}" 
     keystoreType="${keystoreType}" 
     sslEnabledProtocols="${ssl.enabled.protocols}" /> 

where the ${…} is a placeholder. ${keystoreFile} would be replaced from the variable keystoreFile, set for example with the option -DkeystoreFile="safkeyring://START1/MYRING".

You can have multiple <Connector..> each with different parameters, so using -Djavax.net.ssl.keyStore etc would not work.

You need to know which options to use, because setting the Java defaults using -Djavax.net.ssl.keyStore… may be overridden.

You also need to know which server.xml file is being used! I was getting frustrated because the server.xml I was changing was not the server.xml being used!

Write instructions for your target audience – not for yourself.

Over the last couple of weeks, I’ve been asked questions about installing two products on z/OS. I looked at the installation documentation, and it was written the way I would write it for myself – it was not written for other people to follow.

I sent some comments to one of the developers, and as the comments mainly apply to other products as well, I thought I would write them down, for when another product comes along.

I've been doing some documentation for AT-TLS, which allows you to give applications TLS support without changing the application, so I'll focus on a product using TCP/IP.

What is the environment?

The environment can range from one person running z/OS on a laptop, to running a Parallel Sysplex where you have multiple z/OS instances running as a Single System Image; and taking it further, you can have multiple sites.

What levels of software

Within a Sysplex you can have different levels of software, for example one image at z/OS 2.4 and another image at z/OS 2.5. You tend to upgrade one system to the next release, then when this has been demonstrated to be stable, migrate the other systems in turn.

Within one z/OS image you can have multiple levels of products, for example MQ 9.2.3 and MQ 9.1. People may have multiple levels so they test the newer level, and when it looks stable, they switch to the newer level and later remove the older level. If the newer level does not work in production – they can easily switch back to the previous level.

Each version may have specific requirements.

  • If your product has an SVC, you may need an SVC for each version, unless the higher level SVC supports the lower level code.
  • If your product uses a TCP/IP port, you will need a port for each instance.

You need to ensure your product can run in this environment, with more than one version installed on an image.

How do things run?

Often z/OS images and programs run for many months, for example IPLing every three months to get the latest fixes on. Your product instance may run for three months before restarting. If you write messages to the joblog, or have output going to the JES2 spool, you want to be able to purge old output without shutting down your instance. You can specify options to "spin" off output and make the file purge-able.

Your instance may need to be able to refresh its parameters. For example, if a key in a keyring changes, you need to close and reopen the keyring. This implies a refresh command, or the keyring is opened for each request.

Who is responsible for the system?

For me – I am the only person using the system and I am responsible for everything.

For big systems there will be functions allocated to different departments:

  • Installation of software (getting the libraries and files to the z/OS image)
  • The z/OS systems team – creating and updating the base z/OS system
  • The Security team – this may be split into platform security(RACF), and network security
  • Data management – responsible for data, backup (and restore), migration of unused data sets to tape, ensuring there is enough disk space available.
  • Communications team – responsible for TCPIP connectivity, DNS, firewalls etc.
  • Database team – responsible for DB2 and other products
  • Liberty – responsible for Liberty, and for products such as z/OSMF built on top of Liberty.
  • MQ – responsible for MQ, and MQ to MQ connectivity.

Some responsibilities could be done by different teams, for example creating the security profile when creating a started task. This is a “security” task – but the z/OS systems programmer will usually do it.

How are systems changes managed?

Changes are usually made on a test system and migrated into production. I’ve seen a rule “nothing goes into production which has not been tested”. Some implications of this are

  • No changes are typed into production. A file can be copied into production, and a file may have symbolic substitution, such as SYSTEM=&SYSNAME. You can use cut and paste, but no typing. This eliminates problems like 0 being misread as O, and 1,i,l looking similar.
  • Changes are automated.
  • Every change needs a back-out process – and this back-out has been tested.
    • Delete is a 2 phase operation. Today you do a rename rather than a delete; next week you do the actual delete. If there is a problem with the change you can just rename it back again. Some objects have non obvious attributes, and if you recreate an object, it may be different, and not work the same way as it used to.

There are usually change review meetings. You have to write a change request, outlining

  • the change description
  • the impact on the existing system
  • the back-out plan
  • dependencies
  • which areas are affected.

You might have one change request for all areas (z/OS, security, networking), or a change request for each area, one for z/OS, one for security, one for networking.

Affected areas have to approve changes in their area.

How to write installation instructions

You need to be aware of differences between installing a product first time, and successive times. For example creating a security definition. It is easy to re-test an install, and not realise you already have security profiles set up. A pristine new image is great for testing installation because it is clean, and you have to do everything.

Instructions like

  • Task 1 – create sys1.proclib member
  • Task 2 – define security profile
  • Task 3 – allocate disk storage
  • Task 4 – define another security profile
  • Task 5 – update parmlib

may make sense when one person is doing the work, but not if there are many teams.

It is better to have a summary by role like

  • z/OS systems programmer
    • create proclib member
    • update parmlib
  • Security team
    • Define security profile 1
    • Define security profile 2
  • Storage management team
    • Allocate disk space

and have links to the specific topics. This way it is very clear what a team’s responsibilities are, and you can raise one change request per team.

This summary also gives a good road map so you can see the scale of the installation task.

It is also good to indicate if this needs to be done once only per z/OS image, or for every instance. For example

  • APF authorise the load libraries – once per z/OS image
  • Create a JCL procedure in SYS1.PROCLIB – once per instance

Some tasks for the different roles

z/OS system programmers

  • Create alias for MYPROD.* to a user catalog
  • APF authorise MYPROD…. datasets
  • Create PARMLIB entries
  • Update LNKLST and LPA
  • Update PROCLIB concatenation with product JCL
  • Create security profiles for any started tasks; which userid should be used?
  • WLM classification of the started task or job.
  • Schedule deletion of any old log files older than a specified criterion
  • When there are multiple instances per LPAR, decide whether to use S MYSTASK1, S MYSTASK2, or S MYSTASK.T1, S MYSTASK.T2
  • Do you need to specify JESLOG SPIN to allow JES2 logs to be spun regularly, or when they are greater than a certain size, or any DD SYSOUT with SPIN?
  • ISPF
    • Add any ISPF Panels etc into logon procedures, or provide CLIST to do it.
    • Update your ISPF “extras” panel to add product to page.
  • Try to avoid SVCs. There are better ways, for example using authorized services.
  • Propagate the changes to all systems in the Sysplex.
  • What CF structures are needed? Do they have any specific characteristics, such as duplexed?
  • How much (e)CSA is needed, for each product instance?
  • Does your product need any Storage Class Memory (SCM)?

Security team

  • Create groups as needed eg MYPRODSYS, MYPRODRO, and make requester’s userid group special, so they can add and remove userids to and from the groups.
  • Create a userid for the started task. Create the userid with NOPASSWORD, to prevent people logging on with the userid and password.
  • Protect the MYPROD.* datasets, for example members of group MYPRODSYS can update the datasets, members of group MYPRODRO only have read-only access.
  • Create any other profiles.
  • Create any certificate or keyrings, and give users access to them.
  • Set up profiles for who can issue operator commands against the jobs or procedures.
  • Does the product require an "applid"? For example, users must have access to a specific APPL to be able to use the facilities. An application can use pthread_security_applid_np to change the userid a thread is running on, but the user must have access to an applid. The default applid is OMVSAPPL.
  • Do users needing to use this product need anything specific? Such as id(0), needing a Unix Segment, or access to any protected resources? See below for id(0).
  • If a client authenticates to the server, the server needs access to BPX.SERVER in the RACF FACILITY class.
  • The started task userid may need access to BPX.DAEMON.
  • If a userid needs access to another user’s keyring, the requestor needs read access to user.ring.LST in CLASS(RDATALIB) or access to IRR.DIGTCERT.LISTRING.
  • If a userid needs access to a private key in another user's keyring, the requester needs CONTROL access to user.ring.LST in CLASS(RDATALIB).
  • You might need to program control data sets, for example RDEF PROGRAM * ADDMEM('SYS1.LINKLIB'//NOPADCHK) UACC(READ).
  • Users may need access to the ICSF classes CSFSERV and CSFKEYS.
  • Use of CLASS(SURROGAT) BPX.SRV.<userid> to allow one userid to be a surrogate for another userid.
  • Use of CLASS(FACILITY) BPX.CONSOLE to remove the generation of BPXM023I messages on the syslog.

Storage team

  • How much disk space is needed once the product has been installed, for data sets, and Unix file systems. This includes product libraries and instance data, and logs which can grow without limit.
  • How much temporary space is needed during the install.
  • Where do Unix files for the product go? For example /opt/ or /var….
  • Where do instance files go? For example on image-local disks, or sysplex shared disks. If you have an instance on every member of the Sysplex, where do you put the instance files?
  • How much data will be produced in normal running – for example traces or logs.
  • When can the data be pruned?
  • Does the product need its own ZFS for instance data, to keep it isolated so it cannot impact other products?
  • Do any additional Storage Classes etc need to be defined? These determine if and when data sets are migrated to tape, or get deleted.
  • Are any other definitions needed? For example the data sets MYPROD.LOG* need to go on the fastest disks, while MYPROD.SAMPLES* can go on any disks, and could be migrated.

Database team

  • What databases, tables, indexes etc are required?
  • How much disk space is needed?
  • What volume of updates per second. Can the existing DB2 instances sustain the additional throughput?
  • What security and protection is needed at the table level and at the field level.
  • What groups are permitted to access which fields?
  • What auditing is needed?
  • Is encryption needed?

MQ

  • Do you need to use MQ Shared Queues between queue managers?
  • How much data will be logged per second?
  • What is the space needed for the message storage, disk space, buffer pool and Coupling Facility?
  • Product specific definitions.
  • Security protection of any product specific definitions.

Networking

  • Which port(s) to use?
    • Do you need to control access to ports with the SAF resource on the PORT entry, and permit access to profile EZB.PORTACCESS.sysname.tcpname.resname
    • Use of SHAREPORT and SHAREPORTWLM
  • Use of Sysplex Distributor to route work coming in to a Sysplex to any available system?
  • Update the port list, so only a specific job can use it
  • RACF profile for port?
  • Which cipher specs
  • Which level of TLS
  • Which certificates
  • Any AT-TLS profile?
  • Any firewall changes?
  • Any class of service?
  • Any changes to syslogd profile?
  • Are there any additional sites that will be accessed, and so need adding to the “allow” list.

Automation

  • If the started tasks or jobs need to be started at IPL, create the definitions. Do they have any pre-reqs, for example requiring DB2 to be active?
  • If the jobs are shut down during the day, should they be automatically restarted?
  • Add automation to shut down any jobs or started tasks when the system is shut down
  • Which product messages need to be managed – such as events requiring operator action, or events reported to the site wide monitoring team.

Operations

  • Play book for the product: how to start and stop it
  • Are there any other commands?

Monitoring

  • Any SMF data that needs to be collected.
  • Any other monitoring.
  • How much additional CPU will be needed – at first, and in the long term.

Making your product secure

Many sites are ultra careful about keeping their system secure. The philosophy is give a user access for what they need to do – but no more. For example

  • They will not be comfortable installing a non-IBM SVC into their system. An SVC can be used from any address space, so if there is any weakness in the SVC it could be exploited.
  • Using id(0) (superuser) in Unix Services is not allowed. The userid needs to be given specific permission. If the code “changes userid” then services like pthread_security_applid_np() should be used; where the applid is part of the configuration. Alternatives include __login_applid. End users of this facility will need read access to the specific applid.

TLS and SSL

If you are using TLS there are other considerations

  • Any certificate you generate needs a long validity date, and JCL to recreate it when it expires.
  • If you create a Certificate Authority you need to document how to export it and distribute it to other platforms
  • Browsers and applications may verify the host name, so you need to generate a certificate with a valid name. The external z/OS name may be different from the internal name.
  • You should support TLS 1.2 and TLS 1.3. Other TLS and SSL versions are deprecated.
  • It is good practice to have one keyring with the server certificate with its private key, and a “common” trust store keyring which has the Certificate Authorities for all the sites connecting to the z/OS image. If you connect to a new site, you update the common keyring, and all applications pick up the new CA. If you have one keyring just for your instance, you need to maintain multiple keyrings when a new certificate is added, one for each application.