Follow the instructions to install Java on z/OS and screw up production.

I downloaded an update to Java on z/OS, and noticed that the instructions would cause problems!

Java is in a directory like

/usr/lpp/java/J7.1        
/usr/lpp/java/J7.1_64     
/usr/lpp/java/J8.0        
/usr/lpp/java/J8.0_64

I have Java V7 and V8, each in 31-bit and 64-bit versions.

The installation instructions say

cd to /usr/lpp/java and unpax the uploaded SDK8_64bit_SR8_FP6.PAX.Z file.

This will unpack the file into

/usr/lpp/java/J8.0_64

Overwriting your production system with the new version, so it goes live without any testing, and for a few minutes you have a mixture of files from different fixpacks!

What I would suggest is

Unzip the file: gzip -l -d ibm-semeru-open-jre_x64_linux_11.0.20_8_openj9-0.40.0.tar.gz unpacks the file into a .tar and deletes the .gz
FTP it to z/OS in binary
Create a ZFS for the new Java.
mkdir /usr/lpp/java/new
mount the ZFS in /usr/lpp/java/new
cd /usr/lpp/java/new
pax -ppx -rvzf SDK8_64bit_SR8_FP6.PAX.Z

This will create files under

/usr/lpp/java/new/J8.0_64

Test this by setting

export JAVA_HOME="/usr/lpp/java/new/J8.0_64"

and using it.

When you have finished your testing, you can mount the new ZFS on /usr/lpp/java/J8.0_64 .

You may want to do this as part of an IPL, because an unmount of a ZFS will terminate anything using the file system, such as z/OSMF.

You could set up /usr/lpp/java/old/J8.0_64 and /usr/lpp/java/new/J8.0_64 and use /usr/lpp/java/J8.0_64 as an alias to one or the other.

How to debug a bash script when the easy way does not work.

I was trying to debug a program product, and to specify a Java override. The configuration used bash scripts. My problem was which script should I put my fix in?

How to debug bash scripts

From searching the internet, the ways of debugging a simple, self-contained bash script are either

edit the file and add -x, as in #!/bin/sh -x; this gives output for just this file,
or use the sh -x … command line option to enable trace.

The sh -x … command applies to the specified program name. You cannot enable the trace this way for any programs it calls.

There is no way of saying trace this command … and any other programs it calls.

Editing the script may not be easy

Editing the file was a challenge, as it was on a read only file system. I had to unmount it, and mount it read/write. Easy on my one person z/OS ; not so easy on a typical z/OS image with many users. You do not want to make a change and break someone else.

I used the Unix command

df -P /usr/lpp/IBM/...

on the file to find which file system it was on, then used the TSO commands

unmount filesystem('ABC100.ZFS') Immediate
mount filesystem('ABC100.ZFS') mode(RDWR)                   
  type(HFS) mountpoint('/usr/lpp/IBM/abcdef')

Or change the BPX* parmlib member and re-ipl which seems overkill.

I did not really want to have to edit the production scripts – but this was only at the server startup. Only one instance was affected.

The chicken and the egg problem

How do I know which file to edit to add the “-x”? After lots of investigation I found one file deep down, but I could not see who called it.

I used two commands

ps -o pid,ppid,args -p $PPID   
ps -o pid,ppid,args -u $LOGNAME

The first says give me information about the thread’s parent PID.

The second says give information about threads running with the userid of the thread.

The first gave

      PID       PPID COMMAND                                                                         
 33620145   67174575 /bin/sh -c /Z24C/usr/lpp/IBM/..../bin/envvars.sh -F/parameter

Which shows the thread was called from the envvars.sh script with the parameter -F/parameter. This was a start, but who called envvars.sh?

The second command gave

         PID       PPID COMMAND  
(1) 83951745          1 BPXBATCH    
(2) 83951791   83951745 /bin/sh /usr/lpp/IBM/.../start.sh                                     
(7) 50397360   33619999 oedit /usr/lpp/IBM/.../bin/config.final.env                           
(3) 83951836   83951791 /bin/sh /Z24C/usr/lpp/IBM/.../bin/abcmain.sh run -config /Z24C       
(5) 67174622   33620191 /bin/sh -x /Z24C/usr/lpp/IBM/.../bin/envvars.sh -F/tmp/RSEAPI_e       
(4) 33620191   83951836 /bin/sh -c /Z24C/usr/lpp/IBM/.../bin/envvars.sh -F/tmp/RSEAPI_e       
(6) 83951881   67174622 ps -o pid,ppid,args -u STCRSE

You have to start at the top of the process tree, and find the children of each process. The numbers in parentheses in the listing above match the steps below.

(1) I know the script was started from BPXBATCH. This has a process id of 83951745 (first column of data).
(2) The process with this process id as its parent (PPID, second column of data) is /bin/sh /usr/lpp/IBM/…/start.sh. This is what BPXBATCH executes. If you have a system with lots of BPXBATCH instances running, you can locate the command to find which thread is of interest. This process has a process id of 83951791.
(3) This process invokes /bin/sh /Z24C/usr/lpp/IBM/…/bin/abcmain.sh run -config /Z24C with a PID of 83951836.
(4) This process invokes /bin/sh -c /Z24C/usr/lpp/IBM/…/bin/envvars.sh -F/tmp/RSEAPI_e with a PID of 33620191.
(5) This invokes /bin/sh -x /Z24C/usr/lpp/IBM/…/bin/envvars.sh -F/tmp/RSEAPI_e (again!)
(6) This issues the command ps -o pid,ppid,args -u STCRSE which displays the thread information. We have got to the end.
(7) This was from me editing a file in a Unix Services session, so it is not relevant to the investigation.

Who am I ?

You can use

echo "path to me ->  ${0}     "

which gave me

path to me -> /home/colinpaice/Downloads/test.sh

What would I do to make it easier to debug?

In each shell script have code like

# if the symbol myprog_test exists
if [ -z "${myprog_test+x}" ]
then 
   : # myprog_test is not set, so do nothing (a then branch needs a command)
else
  # echo "specified "${myprog_test}
  # if it has a value > 0 then write the path
  if [ $myprog_test -gt 0 ]
  then 
     echo "# path to me --------------->  ${0}     "
  fi
  #if the value is 2 or more then start the trace
  if [ $myprog_test -gt 1 ]
  then 
     set -x 
  fi
fi

This checks to see if the global variable myprog_test is set. If it is set to 1 or larger, it displays the path name; if it is set to 2 or larger, it turns trace on using set -x.

With this, each script checks a variable myprod_myscript (where myprod is the name of the product and myscript is the name of the script), and takes the appropriate action.

To turn the traces on, you list all of the scripts in the directory, then use

export myprod_myscript=1

for each script (of interest). This will then give you a trace of which scripts were invoked. You can then set the value to 2 for the scripts you want traced with set -x.

Some scripts use a variable $SHLVL which gives you the call depth. This would be useful, but it is not supported in the shell on z/OS.

Is it as easy(!) as this?

Not quite. Many services are started as a started task, where the JCL is like

//ABCAPI   EXEC PGM=BPXBATCH,REGION=0M,TIME=NOLIMIT, 
//            PARM='PGM &HOME./tomcat.base/start.sh'

You can put your global variables in STDENV. Note: this is a list of variables, not a shell script, so you cannot do

xxx=1
yyy=$xxx

Why doesn’t ctrl-s work in ISPF edit? – ah it does now.

I had been editing a file, saving it, and finding the changes were not being picked up. Looking back, it was obvious; I was using CTRL-S the familiar Linux command, instead of F3 on ISPF.

I fixed this by configuring X3270 (on Linux).

My file /home/colin/.x3270pro now has

...
x3270.keymap: mine
! Definition of the 'mine' keymap
x3270.keymap.mine: #override \
    Alt<Key>4:          String("\\x00a2")\n\
    Ctrl<Key>backslash: String("\\x00a2")\n\
    <Key>Escape:    Clear()\n\
    <Key>End:        FieldEnd()\n\
    Ctrl<Key>Delete:   EraseEOF()\n\
    Ctrl<Key>Right:    NextWord()\n\
    Ctrl<Key>Left:    PreviousWord()\n\
    Ctrl<Key>Up:    Home()\n\
    <Key>Control_L: Reset()\n\
    <Key>Control_R: Reset()\n\
    <Key>Prior: PF(7)\n\
    <Key>Next: PF(8)\n\
    <Btn3Down>:   PA(1)\n\
    Ctrl<Key>1:   PA(1)\n\
    Ctrl<Key>s:   MoveCursor(3,15) String("save") Enter()\n\

When I started a new X3270 session, Ctrl-S went to the command line, typed save and pressed enter. Job done! The numbers are 0 based, so 3 means line 4 on the screen.

This makes life so much easier!

Parsing command line values

I wanted to pass multiple parameters to a z/OS batch program and parse the data. There are several different ways of doing it – what is the best way ?

This question is complicated by

You may have more data than fits in the PARM= in the JCL. See Defining the parameter string to JCL.
C provides some parse routines. Although they work in Unix System Services, they do not work 100% in batch JCL. See Parsing with single character options, and getsubopt to parse keyword=value.
It may be easier to write a “parser” yourself than work around the usability of the solutions. See My basic command line parser(101)
Or have it table driven. Advanced – table – ize it

Checking options

Processing command line options can mean a two stage process. Reading the command line, and then checking to ensure a valid combination of options have been specified.

If you have an option -debug with a value in the range 0 to 3, you can either check the range as the option is processed, or have a separate section of checks once all the parameters have been parsed. If there is no order requirement on the parameters, you need separate code to check the parameters. If you can require the parameters to be in order, you might be able to have code “if -b is specified, then check -a has already been specified”.

I usually prefer a separate section of code, as it makes the code clearer.

Command styles

On z/OS there are two styles of commands

def xx(abc) parm1(value) xyz

or the Unix way

-def -xx abc -parm1 -1 -a --value value1 -xyz.

Where you can have

short options “-a” and “-1”
long option with two “-“, as in “--value”,
“option value” as in “-xx abc”
“option and concatenated value” as in “-xyz”; option -x, value yz

I was interested in the “Unix way”.

One Unix way is to have single character option names like -a -A -B -0. This is easy to program – but it means the end user needs to lookup the option name every time as the options are not usually memorable.
Other platforms (but not z/OS) have parsing support for long names like --userid value.
You can parse a string like ro,rw,name=value, where you have keyword=value using getsubopt.
I wrote a simple parser, and a table driven parser for when I had many options.

Defining the parameter string to JCL.

The traditional way of defining a parameter string in batch is EXEC PGM=MYPROG,PARM='….' but the parameter is limited to 100 characters.

I tend to use

// SET P1=COLIN.PKIICSF.C 
// SET P2="optional"
//S1 EXEC PGM=MYPROG,PARM='parms &P1 &P2'

You can get round the parameter length limitation using

//ISTEST   EXEC PGM=CGEN,REGION=0M,PARMDD=MYPARMS 
//MYPARMS DD * 
/ 
 -detail 0 
 -debug 0 
 -log "COLINZZZ" 
 -cert d

Where the ‘/’ on its own delimits the C run time options from my program’s options.

The values start in column 2 of the data. If a value starts in column 1, it is concatenated to the value in the previous line.

You can use JCL and System symbols

// EXPORT SYMLIST=(*) 
// SET LOG='LOG LOG' 
//ISTEST   EXEC PGM=CGEN,REGION=0M,PARMDD=MYPARMS 
//MYPARMS DD *,SYMBOLS=EXECSYS
/ 
 -log "COLINZZZ" 
 -log "&log"
 ...

This produced -log COLINZZZ -log “LOG LOG”…

Parsing the data

C main programs have two parameters: a count of the number of parameters, and an array of null terminated strings.

You can process these

#include <stdio.h>

int main( int argc, char *argv??(??)) 
{ 
  int iArg; 
  for (iArg = 1;iArg< argc; iArg ++   ) 
  { 
    printf(".%s.\n",argv[iArg]); 
  } 
  return 0; 
}

Running this job

//CPARMS   EXEC  CCPROC,PROG=PARMS 
//ISTEST   EXEC PGM=PARMS,REGION=0M,PARMDD=MYPARMS 
//MYPARMS DD * 
/ 
 -debug 0 
 -log "COLIN  ZZZ" 
 -cert 
 -ae colin@gmail.com

gave

.-debug.                   
.0.                        
.-log.                     
.COLIN  ZZZ.               
.-cert.                    
.-ae.                      
.colin@gmail.com.

and we can see the string “COLIN ZZZ” in double quotes was passed in as a single string.

Parsing with single character options

C has a routine getopt, for processing single character options like -a… and -1… (but not -name) for example

while ((opt = getopt(argc, argv, "ab:c:")) != -1) 
   { 
       switch (opt) { 
       case 'a': 
           printf("-a received\n"); 
           break; 
       case 'b': 
           printf("-b received \n"); 
           printf("optarg %p\n",(void *) optarg); 
           if (optarg) 
             printf("-b received value %s\n",optarg); 
           else 
             printf("-b optarg is0       \n"); 
           break; 
       case 'c': 
           printf("-c received\n"); 
           printf("optarg %p\n",(void *) optarg); 
           if (optarg) 
             printf("-c received value %s\n",optarg); 
           else 
             printf("-c optarg is0       \n"); 
           break; 
       default: /* '?' */ 
           printf("Unknown\n"); 
     } 
   }

The string “ab:c:” tells the getopt function that

-a is expected with no value
-b “:” says a value is expected
-c “:” says a value is expected

I could only get this running in a Unix environment or in a BPXBATCH job. In batch, I did not get the values after the option.

When I used

//BPX EXEC PGM=BPXBATCH,REGION=0M,
// PARM='PGM /u/tmp/zos/parm.so -a -b 1 -cc1 '

the output included

-b received value b1
-c received value c1

This shows that both the separated form (“-b value”) and the concatenated form (“-cc1”) are acceptable forms of input.

Other platforms have a getopt_long function where you can pass in long names such as --value abc.

getsubopt to parse keyword=value

You can use getsubopt to process an argument string like “ro,rw,name=colinpaice”.

If you had an argument like “ro, rw, name=colinpaice” this is three strings and you would have to use getsubopt on each string!

You have code like

#include <stdio.h>
#include <stdlib.h>   // getsubopt

int main( int argc, char *argv??(??)) 
{ 
 enum { 
       RO_OPT = 0, 
       RW_OPT, 
       NAME_OPT 
   }; 
   char *const token[] = { 
       [RO_OPT]   = "ro", 
       [RW_OPT]   = "rw", 
       [NAME_OPT] = "name", 
       NULL 
   }; 
   char *subopts; 
   char *value; 
   int errfnd = 0;   // set this to 1 to stop parsing after an error

   subopts = argv[1]; 
 while (*subopts != '\0' && !errfnd) { 
   switch (getsubopt(&subopts, token, &value)) { 
     case RO_OPT: 
       printf("RO_OPT specified \n"); 
       break; 
     case RW_OPT: 
       printf("RW_OPT specified \n"); 
       break; 
     case NAME_OPT: 
       if (value == NULL) { 
          printf("Missing value for " 
                 "suboption '%s'\n", token[NAME_OPT]); 
           continue; 
       } 
       else 
         printf("NAME_OPT value:%s\n",value);
         break; 
    default: 
         printf("Option not found %s\n",value); 
         break; 
     }  // switch 
   } // while 
 }

Within this is code

enum.. this defines constants RO_OPT = 0, RW_OPT = 1 etc
char *const token[] defines a mapping from keywords “ro”, “rw” etc to the constants defined above
getsubopt(&subopts, token, &value) processes the string, passes the mapping, and the field to receive the value

This works, but was not trivial to program

It did not support name=”colin paice” with an imbedded blank in it.

My basic command line parser(101)

I have code

for (iArg = 1;iArg< argc; iArg ++   ) 
{ 
  // -cert is a keyword with no value it is present or not
  if (strcmp(argv[iArg],"-cert") == 0) 
  { 
    function_code = GENCERT    ; 
    continue; 
  } 
  else 
  //  debug needs an option
  if (strcmp(argv[iArg],"-debug") == 0 
      && iArg +1 < argc) // we have a value 
      { 
        iArg  ++; 
        debug = atoi(argv[iArg]); 
        continue; 
      } 
  else 
  ...
  else 
    printf("Unknown parameter or problem near parameter %s\n", 
           argv[iArg]);
  }   // for outer - parameters

This logic processes keywords with no parameters such as -cert, and keywords which have a value such as -debug.

The code if (strcmp(argv[iArg],"-debug") == 0 && iArg +1 < argc) checks to see if the keyword has been specified, and that there is a parameter following it (that is, we have not run off the end of the parameters).

Advanced – table – ize it

For a program with a large number of parameters I used a different approach. I created a table with the option name, and a pointer to the variable for that option.

For example

getStr lookUpStr[] = { 
    {"-debug", &debug     }, 
    {"-type",  &type       }, 
    {(char *) -1,  0} 
   };

You then check each parameter against the list. To add a new option – you just update the table, with the new option, and the variable.

#include <stdio.h>
#include <string.h>

int main( int argc, char *argv??(??)) 
{ 
   char * debug = "Not specified"; 
   char * type   = "Not specified"; 
   typedef struct getStr 
   { 
      char * name; 
      char ** value; 
   } getStr; 
   getStr lookUpStr[] = { 
       {"-debug", &debug     }, 
       {"-type",  &type       }, 
       {(char *) -1,  0} 
      }; 
  int iArg; 
  for (iArg = 1;iArg< argc; iArg ++   ) 
  { 
   int found = 0; 
   getStr * pGetStr =&lookUpStr[0];
   // iterate over the options with string values
   for (; pGetStr -> name != (char *)  -1; pGetStr ++) 
   { 
     // look for the argument in the table
     if (strcmp(pGetStr ->name, argv[iArg]) == 0) 
     { 
       found = 1; 
       iArg ++; 
       if (iArg < argc) // if there are enough parameters
                        // so save the pointer to the data
        *( pGetStr -> value)= argv[iArg] ; 
       else 
         printf("Missing value for %s\n", pGetStr -> name);       
       break;  // skip the rest of the table
     }  // if (strcmp(pGetStr ->name, argv[iArg]) == 0) 
     if (found > 0) break; 
    } // for (; pGetStr -> name != (char *)  -1; pGetStr ++) 
   
   if (found == 0) 
   // iterate over the options with int values 
   ....
  } 
  printf("Debug %s\n",debug); 
  printf("Type  %s\n",type ); 
  return 0; 
}

This can be extended so you have

getStr lookUpStr[] = { 
    {"-debug", &debug, "char" }, 
    {"-type",  &type ,"int"       }, 
    {(char *) -1,  0, 0} 
   };

and have logic like

if (strcmp(pGetStr ->name, argv[iArg]) == 0) 
     { 
       found = 1; 
       iArg ++; 
       if (iArg < argc) // if there are enough parameters
       {
        // value is assumed to be declared as void * in this version
        if (strcmp(pGetStr -> type, "char") == 0) 
          *((char **) pGetStr -> value) = argv[iArg]; 
        else 
        if (strcmp(pGetStr -> type, "int") == 0)
          *((int *) pGetStr -> value) = atoi(argv[iArg]);
      ...   
     }

You can go further and have a function pointer

getStr lookUpStr[] = { 
    {"-debug", &debug,myint }, 
    {"-loop", &loop  ,myint },  
    {"-type",  &type , mystring  }, 
    {"-type",  &type , myspecial  }, 
    {(char *) -1,  0, 0} 
   };

and you have a little function for each option. The function “myspecial(argv[iarg])” looked up values {“approved”, “rejected”…} etc and returned a number representation of the data.

This takes a bit more work to set up, but overall is cleaner and clearer.

Setting up z/OS for TLS clients

There is a lot of configuration needed when setting up TLS(SSL) between a server and a client. There are many options and it is easy to misconfigure. The diagnostic information you get when the TLS handshake fails is usually insufficient to identify any problems.

You need the following on z/OS:

One or more Certificate Authority certificates. You can create and use your own for testing. If you want to work with external sites you’ll need a proper (external) CA, but for validation and proof of concept you can create your own CA. You could set up a top level CA CN=CA,O=MYORG, and another one (signed by CN=CA,O=MYORG), called CN=CA,OU=TEST,O=MYORG. Either or both of the public CA certificates will need to be sent to the clients and imported into their keystore.
A private/public key, signed by a CA (such as signed by CN=CA,OU=TEST,O=MYORG).
The private key is associated with a userid.
- The signing operation takes the data (the public key), does a hash sum calculation on the data, encrypts this hash sum, and stores the encrypted hash value, and CA public certificate with the original data. To check the signature, the receiving application compares the CA with its local copy, if that matches, does the same checksum calculation, decodes the encrypted hash sum – and checks the decrypted and locally calculated values match.
- A certificate is created using one from a list of algorithms. (For example, Elliptic Curve, RSA). When the certificate is sent to the client, the client needs to support the algorithm. Either end can be configured, for example, to support Elliptic Curve, but not RSA.
A keyring to contain your private key(s) – this can also contain CA public certificates of the partners (clients or servers).
A “site” keyring (public keystore, or trust ring) which holds the public CA certificates of all the other sites you work with. If you have only one keyring per user or application, you need to update each of them if you need to add a new CA to your environment. Many applications are only designed to work with one keyring. Java applications tend to have a key store (for the private key) and a trust store for the CAs.
Some applications can support more than one private certificate on a keyring. The certificate needs to match what the client can support.
For certificates which are sent to your server, you need a copy of the CA(s) used to sign the incoming certificate. If you have a copy of the CA, then you can validate any certificate that the CA signed. This means you do not have to have a copy of the public certificate of every client. You just need the CA.
- Some applications need access to just one CA in the chain, other applications need access to all certificates in the CA chain.
As part of the TLS handshake
- the client sends up a list of the valid cipher specs it supports (which algorithms, and size of key)
- the server sends down a subset of the list of cipher spec to use (from the client’s list)
- the server can also send down its certificate, which contains information such as the distinguished name CN=zSERVER, OU=TEST, O= MYORG, and host name.
- the client can validate these names – to make sure the host name in the certificate matches the host, and what it was expecting.
- if requested, the client can send up its certificate for identification. The server can validate the certificate, and can optionally map it to a userid on the server.
A userid can be given permission to read certificates in another user’s keyring. A userid needs a higher level of authority to be able to access the private key in another id’s keyring.

Create the Certificate Authority

//IBMRACF  JOB 1,MSGCLASS=H                               
//S1  EXEC PGM=IKJEFT01,REGION=0M                         
//SYSPRINT DD SYSOUT=*                                    
//SYSTSPRT DD SYSOUT=*                                    
//SYSTSIN DD * 
RACDCERT certauth LIST(label('DOCZOSCA')) 
RACDCERT CERTAUTH DELETE(LABEL('DOCZOSCA'))               
RACDCERT GENCERT  -                                         
  CERTAUTH -                                                
  SUBJECTSDN(CN('DocZosCA')- 
             O('COLIN') -                                   
             OU('CA')) - 
  NOTAFTER(   DATE(2027-07-02  ))-                          
  KEYUSAGE(   CERTSIGN )  -      
  SIZE(2048) -                                              
  WITHLABEL('DOCZOSCA') 
/*
//

This certificate is created against “user” CERTAUTH. Keyusage CERTSIGN means it can be used to sign certificates. “user” CERTAUTH is often displayed internally as “irrcerta”.

Once it has been created the certificate should be connected to every ring that may use it, see below.

Export the CA certificate to a file, so clients can access it

RACDCERT CERTAUTH EXPORT(LABEL('DOCZOSCA')) -
  DSN('IBMUSER.CERT.DOC.CA.PEM') -
  FORMAT(CERTB64) -
  PASSWORD('password')

The exported file is base64 encoded text. It can be sent to the clients, so they can validate certificates sent from the server. This file could be sent using cut and paste, or FTP.

Create the keyring for user START1.

The instructions below list the ring first, in case you need to know what it was before you deleted it.

RACDCERT LISTRING(TN3270)  ID(START1) 

RACDCERT DELRING(TN3270) ID(START1) 

RACDCERT ADDRING(TN3270) ID(START1)                                                          

RACDCERT LISTRING(TN3270)  ID(START1) 
SETROPTS RACLIST(DIGTCERT,DIGTRING ) refresh

Connect the CA to every keyring that needs to use it

RACDCERT ID(START1) CONNECT(RING(TN3270) - 
                            CERTAUTH LABEL('DOCZOSCA'))

Create a user certificate and sign it on z/OS

This creates a certificate and gets it signed as one operation. Alternatively, you can create a certificate, export it, send it off to a remote CA, import the signed certificate, and add it to a userid.

RACDCERT ID(START1) DELETE(LABEL('NISTECC521')) 
                                                                
RACDCERT ID(START1) GENCERT -                                   
  SUBJECTSDN(CN('10.1.1.2') - 
             O('NISTECC521') -                                  
             OU('SSS')) -                                       
   ALTNAME(IP(10.1.1.2))-                                       
   NISTECC - 
   KEYUSAGE(HANDSHAKE) - 
   SIZE(521) - 
   SIGNWITH (CERTAUTH LABEL('DOCZOSCA')) -                      
   WITHLABEL('NISTECC521')     -                                
                                                                
RACDCERT id(START1) ALTER(LABEL('NISTECC521'))TRUST             

RACDCERT ID(START1) CONNECT(RING(TN3270) -                      
                            ID(START1)  -                       
                            DEFAULT  - 
                            LABEL('NISTECC521') )               
SETROPTS RACLIST(DIGTCERT,DIGTRING ) refresh                    
RACDCERT LIST(LABEL('NISTECC521' )) ID(START1)                  
RACDCERT LISTRING(TN3270)  ID(START1)

This creates a certificate with type Elliptic Curve (NISTECC) with a key size of 521. It is signed with the CA certificate created above.

The ALTNAME is the Subject Alternative Name. The client can use this field to verify that the name in the certificate matches the IP address of the connection.

It is connected to the user’s keyring as the DEFAULT. The default certificate is used if the label of a certificate is not specified when using the keyring.

Give a user access to the keyring

PERMIT START1.TN3270.LST CLASS(RDATALIB)  -    
    ID(COLIN )  ACCESS(UPDATE )                          
SETROPTS RACLIST(RDATALIB) refresh

Update access gives userid COLIN access to the private key.
Read access only gives access to the public keys in the ring.

You would typically give a group of userids access, rather than individual userids.

Import the clients’ CAs used to sign the client certificates

This is the opposite to Export the CA certificate to a file so clients can access it above.

Copy the certificate to z/OS. This can be done using FTP or cut and paste.

Use it!

I used it in AT-TLS

TTLSConnectionAdvancedParms       TNCOonAdvParms 
{ 
 ServerCertificateLabel  NISTECC521
 ...
} 
TTLSSignatureParms                TNESigParms 
{ 
   CLientECurves Any 
} 
TTLSEnvironmentAction                 TNEA 
{ 
  HandshakeRole                       ServerWithClientAuth 
  TTLSKeyringParms 
  { 
    Keyring                   start1/TN3270 
  } 
...
}

Write instructions for your target audience – not for yourself.

Over the last couple of weeks, I’ve been asked questions about installing two products on z/OS. I looked at the installation documentation, and it was written the way I would write it for myself – it was not written for other people to follow.

I sent some comments to one of the developers, and as the comments mainly apply to the other products as well, I thought I would write them down – for when another product comes along.

I’ve been doing some documentation for AT-TLS, which allows you to give applications TLS support without changing the application, so I’ll focus on a product using TCP/IP.

What is the environment?

The environment can range from one person running z/OS on a laptop, to running a Parallel Sysplex where you have multiple z/OS instances running as a Single System Image; and taking it further, you can have multiple sites.

What levels of software

Within a Sysplex you can have different levels of software, for example one image at z/OS 2.4 and another image at z/OS 2.5. You tend to upgrade one system to the next release, then when this has been demonstrated to be stable, migrate the other systems in turn.

Within one z/OS image you can have multiple levels of products, for example MQ 9.2.3 and MQ 9.1. People may have multiple levels so they test the newer level, and when it looks stable, they switch to the newer level and later remove the older level. If the newer level does not work in production – they can easily switch back to the previous level.

Each version may have specific requirements.

If your product has an SVC, you may need an SVC for each version, unless the higher level SVC supports the lower level code.
If your product uses a TCP/IP port, you will need a port for each instance.

You need to ensure your product can run in this environment, with more than one version installed on an image.

How do things run?

Often z/OS images and programs run for many months, for example IPLing every three months to get the latest fixes on. Your product instance may run for 3 months before restarting. If you write messages to the joblog, or have output going to the JES2 spool, you want to be able to purge old output without shutting down your instance. You can specify options to “spin” off output and make the file purge-able.

Your instance may need to be able to refresh its parameters. For example, if a key in a keyring changes, you need to close and reopen the keyring. This implies a refresh command, or the keyring is opened for each request.

Who is responsible for the system?

For me – I am the only person using the system and I am responsible for everything.

For big systems there will be functions allocated to different departments:

Installation of software (getting the libraries and files to the z/OS image)
The z/OS systems team – creating and updating the base z/OS system
The Security team – this may be split into platform security(RACF), and network security
Data management – responsible for data, backup (and restore), migration of unused data sets to tape, ensuring there is enough disk space available.
Communications team – responsible for TCPIP connectivity, DNS, firewalls etc.
Database team – responsible for DB2 and other products
Liberty and z/OSMF etc built on top of Liberty.
MQ – responsible for MQ, and MQ to MQ connectivity.

Some responsibilities could be done by different teams, for example creating the security profile when creating a started task. This is a “security” task – but the z/OS systems programmer will usually do it.

How are systems changes managed?

Changes are usually made on a test system and migrated into production. I’ve seen a rule “nothing goes into production which has not been tested”. Some implications of this are

No changes are typed into production. A file can be copied into production, and a file may have symbolic substitution, such as SYSTEM=&SYSNAME. You can use cut and paste, but no typing. This eliminates problems like 0 being misread as O, and 1,i,l looking similar.
Changes are automated.
Every change needs a back-out process – and this back-out has been tested.
- Delete is a 2 phase operation. Today you do a rename rather than a delete; next week you do the actual delete. If there is a problem with the change you can just rename it back again. Some objects have non obvious attributes, and if you recreate an object, it may be different, and not work the same way as it used to.

There are usually change review meetings. You have to write a change request, outlining

the change description
the impact on the existing system
the back-out plan
dependencies
which areas are affected.

You might have one change request for all areas (z/OS, security, networking), or a change request for each area, one for z/OS, one for security, one for networking.

Affected areas have to approve changes in their area.

How to write installation instructions

You need to be aware of differences between installing a product first time, and successive times. For example creating a security definition. It is easy to re-test an install, and not realise you already have security profiles set up. A pristine new image is great for testing installation because it is clean, and you have to do everything.

Instructions like

Task 1 – create sys1.proclib member
Task 2 – define security profile
Task 3 – allocate disk storage
Task 4 – define another security profile
Task 5 – update parmlib

may make sense when one person is doing the work, but not if there are many teams.

It is better to have a summary by role like

z/OS systems programmer
- create proclib member
- update parmlib
Security team
- Define security profile 1
- Define security profile 2
Storage management team
- Allocate disk space

and have links to the specific topics. This way it is very clear what a team’s responsibilities are, and you can raise one change request per team.

This summary also gives a good road map so you can see the scale of the installation task.
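If the install tasks are kept as a simple list of (task, role) pairs, the summary by role can be generated rather than maintained by hand. The sketch below uses the example tasks from above; the data layout and function names are assumptions for illustration only.

```python
from collections import defaultdict

# Tasks in the order one person would do them, each tagged with the owning role.
TASKS = [
    ("create proclib member", "z/OS systems programmer"),
    ("define security profile 1", "Security team"),
    ("allocate disk space", "Storage management team"),
    ("define security profile 2", "Security team"),
    ("update parmlib", "z/OS systems programmer"),
]


def summary_by_role(tasks):
    """Group tasks by role, keeping the original order within each role."""
    by_role = defaultdict(list)
    for task, role in tasks:
        by_role[role].append(task)
    return dict(by_role)


def render(by_role):
    """Format the summary in the same style as the role list above."""
    lines = []
    for role, tasks in by_role.items():
        lines.append(role)
        lines.extend(f"- {task}" for task in tasks)
    return "\n".join(lines)


if __name__ == "__main__":
    print(render(summary_by_role(TASKS)))
```

Each key in the result maps naturally onto one change request per team.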

It is also good to indicate if this needs to be done once only per z/OS image, or for every instance. For example

APF authorise the load libraries – once per z/OS image
Create a JCL procedure in SYS1.PROCLIB – once per instance

Some tasks for the different roles

z/OS system programmers

Create alias for MYPROD.* to a user catalog
APF authorise MYPROD…. datasets
Create PARMLIB entries
Update LNKLST and LPA
Update PROCLIB concatenation with product JCL
Create security profiles for any started tasks; which userid should be used?
WLM classification of the started task or job.
Schedule deletion of log files older than a specified age.
When multiple instances per LPAR, decide whether to use S MYSTASK1, S MYSTASK2, or S MYSTASK.T1, S MYSTASK.T2
Do you need to specify JESLOG SPIN to allow JES2 logs to be spun regularly, or when they are greater than a certain size, or any DD SYSOUT with SPIN?
ISPF
- Add any ISPF Panels etc into logon procedures, or provide CLIST to do it.
- Update your ISPF “extras” panel to add product to page.
Try to avoid SVCs. There are better ways, for example using authorized services.
Propagate the changes to all systems in the Sysplex.
What CF structures are needed. Do they have any specific characteristics, such as duplexed?
How much (e)CSA is needed, for each product instance.
Does your product need any Storage Class Memory (SCM).

Security team

Create groups as needed eg MYPRODSYS, MYPRODRO, and make requester’s userid group special, so they can add and remove userids to and from the groups.
Create a userid for the started task. Create the userid with NOPASSWORD, to prevent people logging on with the userid and password.
Protect the MYPROD.* datasets, for example members of group MYPRODSYS can update the datasets, members of group MYPRODRO only have read-only access.
Create any other profiles.
Create any certificate or keyrings, and give users access to them.
Set up profiles for who can issue operator commands against the jobs or procedures.
Does the product require an “applid”? For example users must have access to a specific APPL to be able to use the facilities. An application can use pthread_security_applid_np to change the userid a thread is running under – but the users must have access to an applid. The default applid is OMVSAPPL.
Do users needing to use this product need anything specific? Such as id(0), needing a Unix Segment, or access to any protected resources? See below for id(0).
If a client authenticates to the server, the server needs access to BPX.SERVER in the RACF FACILITY.
The started task userid may need access to BPX.DAEMON.
If a userid needs access to another user’s keyring, the requester needs read access to user.ring.LST in CLASS(RDATALIB) or access to IRR.DIGTCERT.LISTRING.
If a userid needs access to a private key in another user’s keyring, the requester needs control access to user.ring.LST in CLASS(RDATALIB).
You might need to program control data sets, for example RDEF PROGRAM * ADDMEM('SYS1.LINKLIB'//NOPADCHK) UACC(READ) .
Users may need access to ICSF class CSFSERV and CSFKEYS.
Use of CLASS(SURROGAT) BPX.SRV.<userid> to allow one userid to be a surrogate for another userid.
Use of CLASS(FACILITY) BPX.CONSOLE to remove the generation of BPXM023I messages on the syslog.

Storage team

How much disk space is needed once the product has been installed, for data sets, and Unix file systems. This includes product libraries and instance data, and logs which can grow without limit.
How much temporary space is needed during the install.
Where do Unix files for the product go? for example /opt/ or /var….
Where do instance files go? For example on image local disks, or sysplex shared disks. If you have an instance on every member of the Sysplex – where do you put the instance files?
How much data will be produced in normal running – for example traces or logs.
When can the data be pruned?
Does the product need its own ZFS for instance data, to keep it isolated so that it cannot impact other products?
Do any additional Storage Classes etc need to be defined? These determine if and when datasets are migrated to tape, or get deleted.
Are any other definitions needed. For example for datasets MYPROD.LOG*, they need to go on the fastest disks, MYPROD.SAMPLES* can go on any disks, and could be migrated.

Database team

What databases, tables, indexes etc are required?
How much disk space is needed.
What volume of updates per second. Can the existing DB2 instances sustain the additional throughput?
What security and protection is needed at the table level and at the field level.
What groups are permitted to access which fields?
What auditing is needed?
Is encryption needed?

MQ

Do you need to use MQ Shared Queue between queue managers?
How much data will be logged per second?
What is the space needed for the message storage, disk space, buffer pool and Coupling Facility?
Product specific definitions.
Security protection of any product specific definitions.

Networking

Which port(s) to use?
- Do you need to control access to ports with the SAF resource on the PORT entry, and permit access to profile EZB.PORTACCESS.sysname.tcpname.resname
- Use of SHAREPORT and SHAREPORTWLM
Use of Sysplex Distributor to route work coming in to a Sysplex to any available system?
Update the port list – so only specific jobs can use it
RACF profile for port?
Which cipher specs
Which level of TLS
Which certificates
Any AT-TLS profile?
Any firewall changes?
Any class of service?
Any changes to syslogd profile?
Are there any additional sites that will be accessed, and so need adding to the “allow” list.

Automation

If the started tasks, or jobs need to be started at IPL, create the definitions. Do they have any pre-reqs, for example requiring DB2 to be active.
If the jobs are shutdown during the day, should they be automatically restarted?
Add automation to shut down any jobs or started tasks, when the system is shutdown
Which product messages need to be managed – such as events requiring operator action, or events reported to the site wide monitoring team.

Operations

Play book for product, how to start, and stop it
Are there any other commands?

Monitoring

Any SMF data needed to be collected.
Any other monitoring.
How much additional CPU will be needed – at first, and in the long term.

Making your product secure

Many sites are ultra careful about keeping their system secure. The philosophy is to give a user access to what they need to do – but no more. For example

They will not be comfortable installing a non IBM SVC into their system. An SVC can be used from any address space, so if there is any weakness in the SVC it could be exploited.
Using id(0) (superuser) in Unix Services is not allowed. The userid needs to be given specific permission. If the code “changes userid” then services like pthread_security_applid_np() should be used; where the applid is part of the configuration. Alternatives include __login_applid. End users of this facility will need read access to the specific applid.

TLS and SSL

If you are using TLS there are other considerations

Any certificate you generate needs a long validity date, and JCL to recreate it when it expires.
If you create a Certificate Authority you need to document how to export it and distribute it to other platforms
Browsers and applications may verify the host name, so you need to generate a certificate with a valid name. The external z/OS name may be different from the internal name.
You should support TLS V1.2 and TLS V1.3. Other TLS and SSL versions are deprecated.
It is good practice to have one keyring with the server certificate with its private key, and a “common” trust store keyring which has the Certificate Authorities for all the sites connecting to the z/OS image. If you connect to a new site, you update the common keyring, and all applications pick up the new CA. If you have one keyring just for your instance, you need to maintain multiple keyrings when a new certificate is added, one for each application.

How do I trace TCP/IP sockets on z/OS?

I stumbled on this by accident.

In the TCPIP.DATA configuration file you can specify

 TRACE SOCKET

I copied the configuration file, and made the change. I used it in my JCL

//SYSTCPD DD DISP=SHR,DSN=USER.Z24C.TCPPARMS(MYDATA)

Note: if you use TRACE SOCKET in the configuration file used by everyone – then everyone will get their sockets traced – which may not be what you want.

The output to SYSPRINT is like

request = HCreate                                                                         
                                                                                          
EZY3829I  pre   0xe3e2d9c2 00c00001 00010000 00000020 e3c3d7c9 d7404040 00000000 00000000 
EZY3830I        0x00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
EZY3831I        0x00000000 1fa77318 00000000 00000000 00000000 00000080 00000000 00000000 
EZY3832I        0x00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
EZY3833I        0xffff0002 00000000 00000000 40404040 40404040 f18681f7 f68686f8 00000000 
EZY3834I        0x00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
                                                                                          
request = HCreate                                                                         
                                                                                          
EZY3835I  post  0xe3e2d9c2 00c00001 00010000 00000020 e3c3d7c9 d7404040 00000000 00000000 
EZY3830I        0x7f5ec0f0 00010000 00000000 00000000 00000000 00000000 00000000 00000000 
EZY3831I        0x00000000 1fa77318 00000000 00000000 00000000 00000080 00000000 00000000 
EZY3832I        0x00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
EZY3833I        0xffff0002 00000031 00000000 40404040 40404040 f18681f7 f68686f8 00000000 
EZY3834I        0x00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

These messages and the content are not documented – they are for IBM Software Support.

Compiling the TCP/IP samples on z/OS

Communications server (TCPIP) on z/OS provides some samples. I had problems getting these to compile, because the JCL in the documentation was a) wrong and b) about 20 years behind times.

Samples

There are some samples in TCPIP.SEZAINST

TCPS: a server which listens on a port
TCPC: a client which connects to a server using IP address and port
UDPC: C socket UDP client
UDPS: C socket UDP server
MTCCLNT: C socket Multitasking client
MTCSRVR: C socket Multitasking server
MTCCSUB: C socket subtask MTCCSUB

The JCL I used is

//COLCOMPI   JOB 1,MSGCLASS=H,COND=(4,LE) 
//S1          JCLLIB ORDER=CBC.SCCNPRC 
// SET LOADLIB=COLIN.LOAD 
// SET LIBPRFX=CEE 
// SET SOURCE=COLIN.C.SOURCE(TCPSORIG) 
//COMPILE  EXEC PROC=EDCCB, 
//       LIBPRFX=&LIBPRFX, 
//       CPARM='OPTFILE(DD:SYSOPTF),LSEARCH(/usr/include/)', 
// BPARM='SIZE=(900K,124K),RENT,LIST,RMODE=ANY,AMODE=31' 
//COMPILE.SYSLIB DD 
//               DD 
//               DD DISP=SHR,DSN=TCPIP.SEZACMAC 
//*              DD DISP=SHR,DSN=TCPIP.SEZANMAC  for IOCTL 
//COMPILE.SYSOPTF DD * 
DEF(_OE_SOCKETS) 
DEF(MVS) 
LIST,SOURCE 
TEST 
RENT ILP32        LO 
INFO(PAR,USE) 
NOMARGINS EXPMAC   SHOWINC XREF 
LANGLVL(EXTENDED) sscom dll 
DEBUG 
/* 
//COMPILE.SYSIN    DD  DISP=SHR,DSN=&SOURCE 
//BIND.SYSLMOD DD DISP=SHR,DSN=&LOADLIB. 
//BIND.SYSLIB  DD DISP=SHR,DSN=TCPIP.SEZARNT1 
//             DD DISP=SHR,DSN=&LIBPRFX..SCEELKED 
//* BIND.GSK     DD DISP=SHR,DSN=SYS1.SIEALNKE 
//* BIND.CSS    DD DISP=SHR,DSN=SYS1.CSSLIB 
//BIND.SYSIN DD * 
  NAME  TCPS(R) 
//START1   EXEC PGM=TCPS,REGION=0M, 
// PARM='4000          ' 
//STEPLIB  DD DISP=SHR,DSN=&LOADLIB 
//SYSERR   DD SYSOUT=*,DCB=(LRECL=200) 
//SYSOUT   DD SYSOUT=*,DCB=(LRECL=200) 
//SYSPRINT DD SYSOUT=*,DCB=(LRECL=200)

Change the source

The samples do not compile with the above JCL. I needed to remove some includes

#include <manifest.h> 
// #include <bsdtypes.h> 
#include <socket.h> 
#include <in.h> 
// #include <netdb.h> 
#include <stdio.h>

With the original sample I got compiler messages

ERROR CCN3334 CEE.SCEEH.SYS.H(TYPES):66 Identifier dev_t has already been defined on line 98 of “TCPIP.SEZACMAC(BSDTYPES)”.
ERROR CCN3334 CEE.SCEEH.SYS.H(TYPES):77 Identifier gid_t has already been defined on line 101 of “TCPIP.SEZACMAC(BSDTYPES)”.
ERROR CCN3334 CEE.SCEEH.SYS.H(TYPES):162 Identifier uid_t has already been defined on line 100 of “TCPIP.SEZACMAC(BSDTYPES)”.
ERROR CCN3334 CEE.SCEEH.H(NETDB):87 Identifier in_addr has already been defined on line 158 of “TCPIP.SEZACMAC(IN)”.

INFORMATIONAL CCN3409 TCPIP.SEZAINST(TCPS):133 The static variable “ibmcopyr” is defined but never referenced.

I tried many combinations of #define but could not get it to compile, unless I removed the #includes.

Compile problems I stumbled upon

Identifier dev_t has already been defined on line ...                                                     
Identifier gid_t has already been defined on line ...                                                     
Identifier uid_t has already been defined on line ....

This was caused by the wrong libraries in SYSLIB. I needed

CEE.SCEEH.H
CEE.SCEEH.SYS.H
TCPIP.SEZACMAC
TCPIP.SEZANMAC

The compile problems were caused by CEE.SCEEH.SYS.H being missing.

Execution problems

I had a strange execution problem when I tried to use AT-TLS within the program.

EDC5000I No error occurred. (errno2=0x05620062)

The errno2 reason from TSO BPXMTEXT 05620062 was

BPXFSOPN 04/27/18
JRNoFileNoCreatFlag: A service tried to open a nonexistent file without O_CREAT

Action: The open service request cannot be processed. Correct the name or the open flags and retry the operation.

Which seems very strange. I have a feeling that this field is not properly initialised and that this value can be ignored.

Colin’s “TCPIP on z/OS” message explanations

Purpose

This blog post is a repository of my interpretation of the messages from the z/OS Communications Server family of products. I’ve tried to add more information, or explain what some of the values are. It is aimed at search engines, not as a readable article.

EZZ7853I AREA LINK STATE DATABASE

This message can come from

OSPF external advertisements : The DISPLAY TCPIP,tcpipjobname,OMPROUTE,OSPF,EXTERNAL
OSPF area link state database: The DISPLAY TCPIP,tcpipjobname, OMPROUTE, OSPF, DATABASE, AREAID=area-id

in topic DISPLAY TCPIP,,OMPROUTE.

Type

1 Router links advertisement
2 Network links advertisements
3 Network summaries
4 Autonomous System (whole network) summaries
5 Autonomous System (whole network) external advertisements (DISPLAY TCPIP, tcpipjobname, OMPROUTE, OSPF,EXTERNAL)

EZZ0318I HOST WAS FOUND ON LINE 8 AND FIRST HOP ADDRESS OR AN = WAS EXPECTED

I got this with

ROUTE 2001:db8::7/128 host 2001:db8:1::3    IFPORTCP6      MTU 5000

Which has a first hop address! The problem was /128. Remove this and it worked. If you then issue TSO NETSTAT ROUTE it gives

DestIP:   2001:db8::7/128 
  Gw:     2001:db8:1::3 
  Intf:   IFPORTCP6         Refcnt:  0000000000 
  Flgs:   UGHS              MTU:     5000

EZZ7904I Packet authentication failure, from 10.1.1.1, type 2

An OSPF packet of the specified type was received. The packet fails to authenticate.

System programmer response

Verify the authentication type and authentication key specified for the appropriate interfaces on this and the source router. The types and keys must match in order for authentication to succeed. If MD5 authentication is being used and OMPROUTE is stopped or recycled, ensure that it stays down for at least 3 times the largest configured dead router interval of the OSPF interfaces that use MD5 authentication, in order to age out the authentication sequence numbers on routers that did not recycle.

Types are

0 Null authentication
1 Simple password
2 Cryptographic authentication

See OSPF Version 2.

From the message description, this could be a timing issue.

EZZ7921I OSPF adjacency failure, neighbor 10.1.1.1, old state 128, new state 4, event 10

EZZ7921I.

I got this restarting frr on Linux.

The Neighbor State Codes can be one of the following:

1 Down
2 Attempt
4 Init (session has (re)started)
8 2-way
16 ExStart
32 Exchange
64 Loading
128 Full. The router has sent and received an entire sequence of Database Description Packets.

The Neighbor Event Codes can be one of the following:

7 SeqNumberMismatch
8 BadLSReq
10 1-way. A Hello packet has been received from the neighbor, in which this router is not mentioned. This indicates that communication with the neighbor is not bidirectional. For example the remote end is restarting.
11 KillNbr
12 InactivityTimer
13 LLDown
15 NoProg. This event is not described in RFC1583. This is an indication that adjacency establishment with the neighbor failed to complete in a reasonable time period (Dead_Router_Interval seconds). Adjacency establishment restarts.
16 MaxAdj. This event is not described in RFC2328. This indicates that OMPROUTE has exceeded the futile neighbour state loop threshold (DR_Max_Adj_Attempt). Even if a redundant parallel interface (primary or backup) exists, OMPROUTE continues to attempt to establish adjacency with the same neighbouring designated router over the existing or alternate interface.
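The two tables above can be turned into a small decoder for the EZZ7921I message. This is a sketch only: the message layout in the regular expression is taken from the example in the heading above, and the function name is made up for the illustration.

```python
import re

# Neighbor state and event codes, copied from the tables above.
NEIGHBOR_STATES = {
    1: "Down", 2: "Attempt", 4: "Init", 8: "2-way",
    16: "ExStart", 32: "Exchange", 64: "Loading", 128: "Full",
}
NEIGHBOR_EVENTS = {
    7: "SeqNumberMismatch", 8: "BadLSReq", 10: "1-way", 11: "KillNbr",
    12: "InactivityTimer", 13: "LLDown", 15: "NoProg", 16: "MaxAdj",
}

# Layout assumed from the sample message text in this post.
PATTERN = re.compile(
    r"EZZ7921I OSPF adjacency failure, neighbor ([0-9.]+), "
    r"old state (\d+), new state (\d+), event (\d+)",
    re.IGNORECASE,
)


def decode_ezz7921i(message):
    """Return the neighbor and the decoded state/event names, or None."""
    match = PATTERN.search(message)
    if match is None:
        return None
    old, new, event = (int(match.group(i)) for i in (2, 3, 4))
    return {
        "neighbor": match.group(1),
        "old_state": NEIGHBOR_STATES.get(old, f"unknown ({old})"),
        "new_state": NEIGHBOR_STATES.get(new, f"unknown ({new})"),
        "event": NEIGHBOR_EVENTS.get(event, f"unknown ({event})"),
    }
```

For the message in the heading this gives old state Full, new state Init and event 1-way, which matches what you would expect when the remote end restarts.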

EZZ7905I No matching OSPF neighbor for packet from 10.1.1.1, type 4

EZZ7905I No matching OSPF neighbor for packet from 10.1.1.1, type 4
EZZ7904I Packet authentication failure, from 10.1.1.1, type 2

I got these when I was using OSPF Authentication_type=MD5, and the Authentication_Key_ID did not match.

BPXF024I

You get messages prefixed by this message if SYSLOGD is not running.

For example

BPXF024I (TCPIP) Oct 6 10:11:10 omproute 67174435 : EZZ8100I OMPROUTE subagent Starting

With the SYSLOGD running you get

EZZ8100I OMPROUTE SUBAGENT STARTING

TELNET and AT-TLS

EZZ6035I TN3270 DEBUG CONN DETAIL 1035-00 Policy is invalid for the conntype specified.

EZZ6035I TN3270 DEBUG CONN DETAIL 
IP..PORT: 10.1.0.2..34588
CONN: 0000004E LU: MOD: EZBTTACP
RCODE: 1035-00 Policy is invalid for the conntype specified.
PARM1: PARM2: SECURE PARM3: POLICY NOT APPLCNTRL

POLICY NOT APPLCNTRL

The AT-TLS policy needs

TTLSEnvironmentAdvancedParms CSQ1-ENVIRONMENT-ADVANCED 
{ 
  ApplicationControlled         On 
...
}

Now you know, it is obvious that APPLCNTRL in the message means ApplicationControlled!

PARM2: SECURE PARM3: NO POLICY

EZZ6035I TN3270 DEBUG CONN   DETAIL                      
  RCODE: 1035-00  Policy is invalid for the conntype specified.      
  PARM1:          PARM2: SECURE   PARM3: NO POLICY

There is no AT-TLS policy for the port being used. The message does not tell you which port or policy is being used. The operator command “D TCPIP,TN3270,PROFILE” shows which ports are in use.

EZZ6060I TN3270 PROFILE DISPLAY 968                            
  PERSIS   FUNCTION        DIA  SECURITY    TIMERS   MISC      
 (LMTGCAK)(OPATSKTQSSHRTL)(DRF)(PCKLECXN23)(IPKPSTS)(SMLT)     
  L******  ***TSBTQ***RT*  TJ*  TSTTTT**TT  IP**STT  SMD*      
----- PORT:  2023  ACTIVE           PROF: CURR CONNS:      0

The TS under security means TLS connection, Secure Connection.

Use the Unix commands pasearch -t 1>a and then oedit a to display the configuration, and search for “port”. The port value may be specified – or it may be within a range.

LocalPortFrom: 2023 LocalPortTo: 2025
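Because the port may be hidden inside a range, searching the saved output for the port number itself can miss it. The sketch below scans text for lines of the LocalPortFrom/LocalPortTo shape shown above and reports the ranges that contain a given port. It assumes both values are on one line, as in the extract above; the rest of the pasearch output format is not assumed, and the function name is made up.

```python
import re

# Matches the line shape shown in the extract above.
RANGE = re.compile(r"LocalPortFrom:\s*(\d+)\s+LocalPortTo:\s*(\d+)")


def ranges_covering(port, text):
    """Return (line number, from, to) for every range that includes port."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = RANGE.search(line)
        if match is None:
            continue
        low, high = int(match.group(1)), int(match.group(2))
        if low <= port <= high:
            hits.append((lineno, low, high))
    return hits


if __name__ == "__main__":
    # "a" is the file written by: pasearch -t 1>a
    # The encoding may need adjusting for your system.
    with open("a") as saved:
        for hit in ranges_covering(2024, saved.read()):
            print(hit)
```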

EZZ6035I TN3270 RCODE: 1030-01 TTLS Ioctl failed for query or init HS.

PARM1: FFFFFFFF PARM2: 00000464 PARM3: 77B77221

The PARM1 value is the return value, the PARM2 value is the return code, and the PARM3 value is the reason code for the ioctl failure; these values are defined in z/OS UNIX System Services Messages and Codes.

Error number 464 (hex) is ENOTCONN: The socket is not connected.
Reason code 7221: The connection was not in the proper state for retrieving.
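The PARM values are hexadecimal, which is easy to overlook when looking up the codes. A small sketch of the conversion is below. It follows the interpretation used above (PARM1 is a signed return value, PARM2 the errno, and the reason code is the lower half of PARM3); the function and key names are made up for the example.

```python
def decode_ioctl_parms(parm1, parm2, parm3):
    """Convert the three hex PARM strings from EZZ6035I into lookup values."""
    value = int(parm1, 16)
    if value >= 0x80000000:           # treat as a signed 32-bit number
        value -= 0x100000000
    errno_value = int(parm2, 16)
    reason = int(parm3, 16)
    return {
        "return_value": value,                        # FFFFFFFF is -1
        "errno_hex": f"{errno_value:X}",              # 464, ENOTCONN in the text above
        "errno_decimal": errno_value,
        "reason_high_half": f"{reason >> 16:04X}",
        "reason_low_half": f"{reason & 0xFFFF:04X}",  # the value to look up
    }


if __name__ == "__main__":
    print(decode_ioctl_parms("FFFFFFFF", "00000464", "77B77221"))
```

For the values above this gives a return value of -1, errno hex 464 (decimal 1124) and reason code 7221.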

I got this when

there were problems with the System SSL configuration, such as invalid certificate name,
when the z/OS certificate was not suitable eg the key needed to be bigger
the HandshakeRole ServerWithClientAuth was specified – it should be HandshakeRole Server
Breton Imhauser said this could also include a TCP connection flood – crude Denial Of Service attempt of TN3270. This is what it looks like when a set of remote clients are repeatedly establishing a tcp connection with your TN3270 and hanging up. They establish the TCP connection and FIN-ACK it without telnet negotiation. User may claim it was a “heartbeat” test of the host.

In my /etc/syslog.conf I have

daemon.debug /var/log/SSHDdebug

There were additional messages in this file after the TLS handshake problem.

EZZ6035I TN3270 DEBUG CONFIG EXCEPTION RCODE: 600F-00 System SSL initiation failed.

PARM1: 000000CA PARM2: 00000000 PARM3: GSK_ENVIRONMENT_INIT

AT-TLS did not have access to the keyring. For example need access to

RDEFINE RDATALIB START1.MQRING.LST UACC(NONE)
PERMIT START1.MQRING.LST CLASS(RDATALIB) ID(TCPIP) ACCESS(CONTROL)
tso setropts refresh raclist(rdatalib)

and perhaps access to

PERMIT IRR.DIGTCERT.LISTRING CLASS(FACILITY) ID(TCPIP) ACCESS(READ)

1030-02 – also to do with keyrings.

EZZ6035I TN3270 DEBUG TASK EXCEPTION TASK: MAIN MOD: EZBTZMST
RCODE: 1016-01 Port Task setup failed.
PARM1: 0000102B PARM2: 00000BCF PARM3: 00000000
EZZ6006I TN3270 CANNOT LISTEN ON PORT 3023, CONNECTION MANAGER TERMINATED, RSN =102B

This was caused by

PORT 
...
   3023 TCP *   SAF     VERIFY

and getting

EZD1313I REQUIRED SAF SERVAUTH PROFILE NOT FOUND EZB.PORTACCESS.S0W1.TCPIP.VERIFY

Define the profile and give the userid access to it.

OMPRoute

EZZ7815I Socket 11 bind to port 521, address :: failed, errno=111:EDC5111I Permission denied., errno2=74637246

This was caused by

PORT
   520 UDP OMPROUTE            ; RouteD Server 
   521 UDP OMPROUTE            ; RouteD Server for IP V6

The name after the UDP (OMPROUTE) did not match my job name which was trying to use it.

EDC5111I Permission denied. errno2=0x744C7246.

The errno2 value was 0x744C7246. This problem occurred when using port 22 (SSH).

Changing to port 2222 showed that it was just port 22, the other configuration worked.

Commenting out the RESTRICTLOWPORTS and the PORT reservation for “22 SSHD” showed it was one of those.

The RESTRICTLOWPORTS parameter controls access to unreserved ports below port 1024: an application cannot obtain a port in the range 1 – 1023 that has not been reserved by a PORT or PORTRANGE statement, unless the application is APF-authorized or has OMVS superuser [UID(0)] authority.

The solution was to use port reservation such as

    22 TCP SSHD* NOAUTOLOG  ; OpenSSH SSHD server

EZZ7811I COULD NOT ESTABLISH AFFINITY WITH INET, ERRNO=1011:

EDC8011I A NAME OF A PFS WAS SPECIFIED THAT EITHER IS NOT CONFIGURED OR IS NOT A SOCKETS PFS., ERRNO2=11B3005A

I had RESOLVER_CONFIG=//’ADCD.Z24C.TCPPARMS(TCPDATA)’ pointing to an invalid data set.

EZZ7937I THE IPV6 OSPF ROUTING PROTOCOL IS DISABLED

The message in the documentation is pretty useless.

It means there were no valid IPv6 interfaces defined, and no IPv6 addresses.

EZZ7956I OSPF area 0.0.0.3 not configured, interface JFPORTCP6 not installed

The documentation

I was missing an IPv6_area for the interface

IPv6_AREA Area_Number=0.0.0.3; 
IPv6_OSPF_Interface 
      Name = JFPORTCP6 
      Attaches_To_area=0.0.0.3 
      Prefix=2001:db8:8::/64 
      ; 
IPv6_Default_Route 
      Name=JFPORTCP6 
      Next_Hop=2300::1 
      ; 
IPv6_OSPF 
        RouterID = 7.7.7.7

EZZ8125I IPV6 OSPF ROUTERS NONE

The documentation is useless.

I got NONE even though I had a router.

EZZ7886I NOT CONNECTED TO AREA SPECIFIED ON … DISPLAY COMMAND

I got this response to the F P1,IPV6OSPF,database command. You do not specify an area!

OSPF on z/OS, basic commands

This article follows on from getting the simplest example of OSPF working. It gives the z/OS commands to display useful information.

I want to

Display a summary of the configuration, number of links, number of interfaces (one line of output per area). F OMP1,OSPF,areasum
Display the routers. F OMP1,ospf,database,areaid=0.0.0.0
Display the directly connected links. F OMP1,ospf,neighbor
Display all links, by IP address from F OMP1,OSPF, LSA, LSTYPE=2,LSID=…
Display links for a specific ospf router. F OMP1,OSPF, LSA, LSTYPE=1,LSID=…
Display the IP addresses in the network. Either use F OMP1,RTTABLE or for each router F OMP1,OSPF,LSA,LSTYPE=1,LSID=…. , LINK ID: is the IP address of the remote end, LINK DATA: is the IP address of the router’s end.
Display all of the routers in the network. F OMP1,ospf,database,areaid=0.0.0.0 and extract “LS ORIGINATOR”
Display the connection from an ospf router. F OMP1,OSPF,LSA,LSTYPE=1,LSID=….

OMP1

I configured multiple TCPIP subsystems, and each one had an OMPROUTE defined. I used a started task OMP1, as the OMPROUTE for my base TCPIP.

If you have only one TCPIP subsystem, you can use OMPROUTE as your name.

F OMP1,OSPF,areasum

This displays the area summary.

AREA ID        AUTHENTICATION   #IFCS  #NETS  #RTRS  #BRDRS DEMAND     
0.0.0.0           NONE              2      3      3      0  OFF

F OMP1,OSPF,EXTERNAL

EZZ7853I AREA LINK STATE DATABASE                        
TYPE LS DESTINATION     LS ORIGINATOR     SEQNO     AGE   XSUM
                # ADVERTISEMENTS:       0                     
                CHECKSUM TOTAL:         0X0

F OMP1,ospf,list,areas

“Displays all information concerning configured OSPF areas and their associated ranges.”

 EZZ7832I AREA CONFIGURATION 820 
 AREA ID          AUTYPE          STUB? DEFAULT-COST IMPORT-SUMMARIES? 
 0.0.0.0          0=NONE           NO          N/A           N/A 
                                                                               
 --AREA RANGES-- 
 AREA ID          ADDRESS          MASK             ADVERTISE? 
 0.0.0.0          11.11.0.0        255.255.255.0    YES

The entry with address 11.11.0.0 comes from the omproute configuration file entry

range ip_address=11.11.0.1 
      subnet_mask=255.255.255.0 
      ;
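The display shows 11.11.0.0 rather than the configured 11.11.0.1 because the address is masked with the subnet mask. The sketch below reproduces that arithmetic with the Python standard library; it illustrates the masking only, not how OMPROUTE is implemented, and the function name is made up.

```python
import ipaddress


def range_as_displayed(ip_address, subnet_mask):
    """Apply the subnet mask to the configured address."""
    # strict=False allows host bits to be set in the address.
    network = ipaddress.ip_network(f"{ip_address}/{subnet_mask}", strict=False)
    return str(network.network_address), str(network.netmask)


if __name__ == "__main__":
    print(range_as_displayed("11.11.0.1", "255.255.255.0"))
```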

F OMP1,ospf,list,ifs

“For each OSPF interface, display the IP address and configured parameters as coded in the OMPROUTE configuration file”

 EZZ7833I INTERFACE CONFIGURATION 822 
 IP ADDRESS      AREA             COST RTRNS TRDLY PRI HELLO  DEAD DB_EX 
 10.1.3.2        0.0.0.0             1     5     1   1    10    40    40 
 10.1.1.2        0.0.0.0             1     5     1   1    10    40    40

F OMP1,ospf,list,nbma

“Displays the interface address and polling interval related to interfaces connected to nonbroadcast multiaccess networks.”

 NBMA CONFIGURATION 824 
 INTERFACE ADDR      POLL INTERVAL 
 << NONE CONFIGURED >>

F OMP1,ospf,list,nbrs

“Displays the configured neighbors on non-broadcast networks”

 NEIGHBOR CONFIGURATION 826 
 NEIGHBOR ADDR     INTERFACE ADDRESS   DR ELIGIBLE? 
 << NONE CONFIGURED >>

F OMP1,ospf,list,vlinks

“Displays all virtual links that have been configured with this router as an endpoint.”

VIRTUAL LINK CONFIGURATION 828 
 VIRTUAL ENDPOINT     TRANSIT AREA      RTRNS  TRNSDLY HELLO  DEAD DB_EX 
 << NONE CONFIGURED >>

F OMP1,ospf,database,areaid=0.0.0.0

EZZ7853I AREA LINK STATE DATABASE                           
TYPE LS DESTINATION     LS ORIGINATOR     SEQNO     AGE   XSUM     
  1  1.2.3.4            1.2.3.4         0X80000013   61  0X3D8D    
  1  9.2.3.4            9.2.3.4         0X8000001A  393  0X5A78    
  1 @10.1.1.2           10.1.1.2        0X8000000D  286  0X9E22    
  2  10.1.0.2           1.2.3.4         0X80000006 1241  0XC35E    
  2  10.1.1.1           9.2.3.4         0X80000003  353  0X8197    
  2 @10.1.1.2           10.1.1.2        0X80000005 3600  0X64BD    
  2  10.1.3.1           9.2.3.4         0X80000003  383  0X6BAB    
  2 @10.1.3.2           10.1.1.2        0X80000005 3600  0X4ED1

(LS) Type is described here.

1 Router links advertisement
2 Network link advertisement
3 Summary link advertisement
4 Summary ASBR advertisement
5 Autonomous System (AS – think entire network) external link.

LS ORIGINATOR: Indicates the router that originated the advertisement.
LS DESTINATION: Indicates an IP destination (network, subnet, or host).

From the above

TYPE LS DESTINATION     LS ORIGINATOR
  2  10.1.0.2           1.2.3.4

means router 1.2.3.4 told everyone that it has the end of a network link, and its address is 10.1.0.2.

TYPE LS DESTINATION     LS ORIGINATOR      
  1  1.2.3.4            1.2.3.4

says router 1.2.3.4 told everyone “here I am, router 1.2.3.4”.
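If you capture the EZZ7853I output, the rows can be picked apart with a short script, for example to list the routers (LS ORIGINATOR) or to build a follow-up LSA command from the type and destination. The sketch below assumes the column layout shown in the display above. It strips the "@" prefix that appears on some destinations without interpreting it, and the function names and the OMP1 default are illustrative.

```python
import re

SAMPLE = """\
EZZ7853I AREA LINK STATE DATABASE
TYPE LS DESTINATION     LS ORIGINATOR     SEQNO     AGE   XSUM
  1  1.2.3.4            1.2.3.4         0X80000013   61  0X3D8D
  1 @10.1.1.2           10.1.1.2        0X8000000D  286  0X9E22
  2  10.1.0.2           1.2.3.4         0X80000006 1241  0XC35E
  2  10.1.1.1           9.2.3.4         0X80000003  353  0X8197
"""

# type, optional "@", destination, originator, seqno, age, checksum
ROW = re.compile(
    r"^\s*(\d+)\s+@?(\S+)\s+(\S+)\s+0X[0-9A-F]+\s+\d+\s+0X[0-9A-F]+\s*$"
)


def parse_database(text):
    """Return one dict per data row; header and message lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            rows.append({
                "type": int(match.group(1)),
                "destination": match.group(2),
                "originator": match.group(3),
            })
    return rows


def originators(rows):
    """Unique LS ORIGINATOR values, in the order first seen."""
    return list(dict.fromkeys(row["originator"] for row in rows))


def lsa_command(row, jobname="OMP1"):
    """Build the follow-up command from the row's type and destination."""
    return f"F {jobname},OSPF,LSA,LSTYPE={row['type']},LSID={row['destination']}"


if __name__ == "__main__":
    parsed = parse_database(SAMPLE)
    print(originators(parsed))
    print(lsa_command(parsed[0]))
```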

You can use the type and destination in the command:

F OMP1,OSPF,LSA,LSTYPE=…,LSID=…

For example

F OMP1,OSPF,LSA,LSTYPE=1,LSID=1.2.3.4
F OMP1,OSPF,LSA,LSTYPE=2,LSID=10.1.0.3

Both commands are described below.

F OMP1,OSPF,LSA,LSTYPE=1,LSID=1.2.3.4

This allows you to see a lot of information about an individual element of the OSPF database.

LSTYPE=1 is for Router Links Advertisement.

The valid LSID values are given in the output of F OMP1,ospf,database,areaid=0.0.0.0 above.

F OMP1,OSPF,LSA,LSTYPE=1,LSID=9.2.3.4 
EZZ7880I LSA DETAILS  
  LS DESTINATION (ID): 9.2.3.4                     
  LS ORIGINATOR:   9.2.3.4 
  ROUTER TYPE:      (0X00)                         
  # ROUTER IFCS:   3                        
    LINK ID:          10.1.0.2        
    LINK DATA:        10.1.0.3        
    INTERFACE TYPE:   2               
    
    LINK ID:          10.1.1.2        
    LINK DATA:        10.1.1.1        
    INTERFACE TYPE:   2               
   
    LINK ID:          10.1.3.2        
    LINK DATA:        10.1.3.1        
    INTERFACE TYPE:   2

LINK ID: Is the IP address of the remote end
LINK DATA: Is the IP address of the router’s end
INTERFACE TYPE: 2 is “Network links”.

F OMP1,OSPF,LSA,LSTYPE=2,LSID=10.1.0.3

This allows you to see a lot of information about an individual element of the OSPF database.

LSTYPE=2 is “Network links: the set of routers attached to a network”.

The valid LSID values are given in the output of F OMP1,ospf,database,areaid=0.0.0.0 above, with type=2.

F OMP1,OSPF,LSA,LSTYPE=2,LSID=10.1.0.3                     
EZZ7880I LSA DETAILS                                   
LS OPTIONS:      E (0X02)                          
LS TYPE:         2                                 
LS DESTINATION (ID): 10.1.0.3                      
LS ORIGINATOR:   9.2.3.4                           
NETWORK MASK:    255.255.255.0                     
 ATTACHED ROUTER: 1.2.3.4          (100)    
 ATTACHED ROUTER: 9.2.3.4          (100)

Where (100) is the metric.

F OMP1,ospf,if

 EZZ7849I INTERFACES 832 
 IFC ADDRESS     PHYS         ASSOC. AREA     TYPE   STATE  #NBRS  #ADJS 
 10.1.3.2        JFPORTCP4    0.0.0.0         BRDCST   64      1      1 
 10.1.1.2        ETH1         0.0.0.0         BRDCST   64      1      1

F OMP1,ospf,neighbor

EZZ7851I NEIGHBOR SUMMARY 834 
 NEIGHBOR ADDR   NEIGHBOR ID     STATE  LSRXL DBSUM LSREQ HSUP IFC 
 10.1.3.1        9.2.3.4           128      0     0     0  OFF JFPORTCP4 
 10.1.1.1        9.2.3.4           128      0     0     0  OFF ETH1
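The neighbor summary is a simple whitespace-separated table, so it can be turned into one dict per neighbor. This is a sketch assuming the eight columns shown above; the short field names in FIELDS are my own labels for those columns.

```python
# Sample text copied from the neighbor display above.
SAMPLE = """\
EZZ7851I NEIGHBOR SUMMARY 834
 NEIGHBOR ADDR   NEIGHBOR ID     STATE  LSRXL DBSUM LSREQ HSUP IFC
 10.1.3.1        9.2.3.4           128      0     0     0  OFF JFPORTCP4
 10.1.1.1        9.2.3.4           128      0     0     0  OFF ETH1
"""

# My own labels for the eight columns in the display.
FIELDS = ["addr", "id", "state", "lsrxl", "dbsum", "lsreq", "hsup", "ifc"]

def parse_neighbors(text):
    """Return one dict per data row of the neighbor summary."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows have exactly eight columns and start with an address.
        if len(parts) == len(FIELDS) and parts[0][0].isdigit():
            rows.append(dict(zip(FIELDS, parts)))
    return rows

neighbors = parse_neighbors(SAMPLE)
for n in neighbors:
    print(n["ifc"], n["addr"], n["id"], n["state"])
```

The message heading and the column heading are skipped because they do not have eight whitespace-separated fields.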

F OMP1,ospf,routers

EZZ7855I OSPF ROUTERS 836 
DTYPE RTYPE DESTINATION       AREA           COST       NEXT HOP(S) 
  NONE

F OMP1,ospf,statistics

EZZ7856I OSPF STATISTICS 838 
                 OSPF ROUTER ID:         10.1.1.2 (*OSPF) 
                 EXTERNAL COMPARISON:    TYPE 2 
                 AS BOUNDARY CAPABILITY: NO 
                                                                          
 ATTACHED AREAS:                  1  OSPF PACKETS RCVD:             3336 
 OSPF PACKETS RCVD W/ERRS:        0  TRANSIT NODES ALLOCATED:         84 
 TRANSIT NODES FREED:            78  LS ADV. ALLOCATED:                1 
 LS ADV. FREED:                   1  QUEUE HEADERS ALLOC:             32 
 QUEUE HEADERS AVAIL:            32  MAXIMUM LSA SIZE:               512 
 # DIJKSTRA RUNS:                 4  INCREMENTAL SUMM. UPDATES:        0 
 INCREMENTAL VL UPDATES:          0  MULTICAST PKTS SENT:           3371 
 UNICAST PKTS SENT:               7  LS ADV. AGED OUT:                 1 
 LS ADV. FLUSHED:                 1  PTRS TO INVALID LS ADV:           0 
 INCREMENTAL EXT. UPDATES:        0
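The statistics are printed two to a line as NAME: value pairs. If you want to compare the counters from two displays, a sketch like the following collects them into a dict. It assumes the layout above; parse_statistics is a hypothetical helper name, and only whole-number values are picked up, so the router ID line is ignored.

```python
import re

# A few lines copied from the statistics display above.
SAMPLE = """\
                 OSPF ROUTER ID:         10.1.1.2 (*OSPF)
 ATTACHED AREAS:                  1  OSPF PACKETS RCVD:             3336
 OSPF PACKETS RCVD W/ERRS:        0  TRANSIT NODES ALLOCATED:         84
 # DIJKSTRA RUNS:                 4  INCREMENTAL SUMM. UPDATES:        0
 INCREMENTAL EXT. UPDATES:        0
"""

# A name up to a colon, then a whole number followed by a blank or end of line.
PAIR = re.compile(r"([^:]+?):\s+(\d+)(?=\s|$)")

def parse_statistics(text):
    """Return {counter name: integer value} for every pair found."""
    stats = {}
    for line in text.splitlines():
        for name, value in PAIR.findall(line):
            stats[name.strip()] = int(value)
    return stats

stats = parse_statistics(SAMPLE)
for name, value in stats.items():
    print(name, value)
```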

F OMP1,OSPF,LSA,LSTYPE=2,LSID=10.1.0.3

Where

LSTYPE=2 is “Network links: the set of routers attached to a network”.
10.1.0.3 is an LS destination (from F OMP1,ospf,database,areaid=…). It comes from the frr definition below.

interface eno1
   ip address 10.1.0.3 peer 10.1.0.2/24

Only addresses on the Server are accepted. Addresses from the Laptop are not valid.

In the command F OMP1,OSPF,LSA,LSTYPE=1,LSID=1.2.3.4, some of the LINK IDs seem to be valid.

F OMP1,OSPF,LSA,LSTYPE=1,LSID=x.x.x.x

This allows you to see a lot of information about an individual element of the OSPF database.

The LSTYPE is described here. LSTYPE=1 is for Router Links Advertisement.

The LSID is the ID of one of the routers. For example:

F OMP1,ospf,database,areaid=0.0.0.0 displays LS DESTINATION and LS ORIGINATOR
F OMP1,ospf,neighbor displays NEIGHBOR ID

F OMP1,OSPF,LSA,LSTYPE=1,LSID=9.2.3.4 
EZZ7880I LSA DETAILS  
  LS DESTINATION (ID): 9.2.3.4                     
  LS ORIGINATOR:   9.2.3.4 
  ROUTER TYPE:      (0X00)                         
  # ROUTER IFCS:   3                               
     LINK ID:          10.1.0.3               
     LINK DATA:        10.1.0.3               
        INTERFACE TYPE:   2
     LINK ID:          10.1.1.1
     LINK DATA:        10.1.1.1              
        INTERFACE TYPE:   2 
     LINK ID:          10.1.3.1              
     LINK DATA:        10.1.3.1              
        INTERFACE TYPE:   2

F OMP1,RTTABLE

EZZ7847I ROUTING TABLE 842 
 TYPE   DEST NET         MASK      COST    AGE     NEXT HOP(S) 
                                                                        
 STAT*  10.0.0.0         FF000000  0       16079   10.1.1.2 
  SPF   10.1.0.0         FFFFFF00  101     16071   10.1.1.1         (2) 
  SPF*  10.1.1.0         FFFFFF00  1       16078   ETH1 
  SPF*  10.1.3.0         FFFFFF00  1       16078   JFPORTCP4 
  SPF   11.1.0.2         FFFFFFFF  201     4733    10.1.1.1         (2) 
                        0 NETS DELETED, 3 NETS INACTIVE

(2) is the number of equal-cost routes to the destination.
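The MASK column is in hexadecimal, whereas other displays use dotted decimal. As a worked example, FFFFFF00 is 255.255.255.0, which is a /24. The sketch below does the conversion using Python's standard ipaddress module; mask_from_hex and route_network are names I made up, and the code assumes the mask is contiguous.

```python
import ipaddress

def mask_from_hex(hex_mask):
    """Convert a mask such as 'FFFFFF00' to (dotted mask, prefix length)."""
    value = int(hex_mask, 16)
    dotted = str(ipaddress.IPv4Address(value))
    prefix = bin(value).count("1")    # assumes the one bits are contiguous
    return dotted, prefix

def route_network(dest, hex_mask):
    """Combine DEST NET and MASK into a network such as 10.1.0.0/24."""
    _, prefix = mask_from_hex(hex_mask)
    return ipaddress.ip_network(f"{dest}/{prefix}")

# DEST NET and MASK values taken from the routing table above.
for dest, mask in [("10.0.0.0", "FF000000"),
                   ("10.1.0.0", "FFFFFF00"),
                   ("11.1.0.2", "FFFFFFFF")]:
    print(dest, mask, mask_from_hex(mask)[0], route_network(dest, mask))
```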

D TCPIP,,OMPROUTE,RTTABLE,DEST=10.1.0.0

gives

EZZ7874I ROUTE EXPANSION 105                   
DESTINATION:    10.1.0.0                       
MASK:           255.255.255.0                  
ROUTE TYPE:     SPF                            
DISTANCE:       101                            
AGE:            943                            
NEXT HOP(S):    10.1.1.1          (ETH1)       
                10.1.3.1          (JFPORTCP4)

How to debug bash scripts

Editing the script may not be easy

The chicken and the egg problem

Who am I ?

What would I do to make it easier to debug?

Is it as easy(!) as this?

Checking options

Command styles

Defining the parameter string to JCL.

Parsing the data

Parsing with single character options

getsubopt to parse keyword=value

My basic command line parser(101)

Advanced – table – ize it

Create the Certificate Authority

Export the CA certificate to a file, so clients can access it

Create the keyring for user START1.

Connect the CA to every keyring that needs to use it

Create a user certificate and sign it on z/OS

Give a user access to the keyring

Import the client’s CA’s used to sign the client certificates

Use it!

What is the environment?

What levels of software

How do things run?

Who is responsible for the system?

How are systems changes managed?

How to write installation instructions

Some tasks for the different roles

z/OS system programmers

Security team

Storage team

Database team

MQ

Networking

Automation

Operations

Monitoring

Making your product secure

TLS and SSL

Samples

Change the source

Compile problems I stumbled upon

Execution problems

EDC5000I No error occurred. (errno2=0x05620062)

Purpose

EZZ7853I AREA LINK STATE DATABASE

EZZ0318I HOST WAS FOUND ON LINE 8 AND FIRST HOP ADDRESS OR AN = WAS EXPECTED

EZZ7904I Packet authentication failure, from 10.1.1.1, type 2

EZZ7921I OSPF adjacency failure, neighbor 10.1.1.1, old state 128, new state 4, event 10

EZZ7905I No matching OSPF neighbor for packet from 10.1.1.1, type 4

BPXF024I

TELNET and AT-TLS

EZZ6035I TN3270 DEBUG CONN DETAIL 1035-00 Policy is invalid for the conntype specified.

POLICY NOT APPLCNTRL

PARM2: SECURE PARM3: NO POLICY

EZZ6035I TN3270 RCODE: 1030-01 TTLS Ioctl failed for query or init HS.

EZZ6035I TN3270 DEBUG CONFIG EXCEPTION RCODE: 600F-00 System SSL initiation failed.

EZZ6035I TN3270 DEBUG TASK EXCEPTION TASK: MAIN MOD: EZBTZMST RCODE: 1016-01 Port Task setup failed. PARM1: 0000102B PARM2: 00000BCF PARM3: 00000000 EZZ6006I TN3270 CANNOT LISTEN ON PORT 3023, CONNECTION MANAGER TERMINATED, RSN =102B

OMPRoute

EZZ7815I Socket 11 bind to port 521, address :: failed, errno=111:EDC5111I Permission denied., errno2=74637246

EDC5111I Permission denied. errno2=0x744C7246.

EZZ7811I COULD NOT ESTABLISH AFFINITY WITH INET, ERRNO=1011:

EDC8011I A NAME OF A PFS WAS SPECIFIED THAT EITHER IS NOT CONFIGURED OR IS NOT A SOCKETS PFS., ERRNO2=11B3005A

EZZ7937I THE IPV6 OSPF ROUTING PROTOCOL IS DISABLED

EZZ7956I OSPF area 0.0.0.3 not configured, interface JFPORTCP6 not installed

EZZ8125I IPV6 OSPF ROUTERS NONE

EZZ7886I NOT CONNECTED TO AREA SPECIFIED ON … DISPLAY COMMAND

OMP1

F OMP1,OSPF,areasum

F OMP1,OSPF,EXTERNAL

F OMP1,ospf,list,areas

F OMP1,ospf,list,ifs

F OMP1,ospf,list,nbma

F OMP1,ospf,list,nbrs

F OMP1,ospf,list,vlinks

F OMP1,ospf,database,areaid=0.0.0.0

F OMP1,OSPF,LSA,LSTYPE=1,LSID=1.2.3.4

F OMP1,OSPF,LSA,LSTYPE=2,LSID=10.1.0.3

F OMP1,ospf,if

EZZ6035I TN3270 DEBUG TASK EXCEPTION TASK: MAIN MOD: EZBTZMST
RCODE: 1016-01 Port Task setup failed.
PARM1: 0000102B PARM2: 00000BCF PARM3: 00000000
EZZ6006I TN3270 CANNOT LISTEN ON PORT 3023, CONNECTION MANAGER TERMINATED, RSN =102B