z/OS System SSL: strange behaviour with environment variables

I was trying to use System SSL to write a program that uses the native z/OS TLS facilities. I wasted a couple of hours because it said it could not find my keyring. Then, when I collected a trace, it sometimes did not find the file, which did exist as I could list it.

When I used

//START1   EXEC PGM=GSKMAIN,REGION=0M, 
//*  PARM='4000' 
// PARM=('ENVAR("_CEE_ENVFILE=DD:STDENV")/4000') 
//STDENV DD PATH='/u/ibmuser/gskparms'

and the USS file contained

GSK_TRACE_FILE=/tmp/zzztrace.file 
GSK_TRACE=0xff 
GSK_KEYRING_FILE=START1/TN3270

This worked fine.

When I used

//START1   EXEC PGM=GSKMAIN,REGION=0M, 
//*  PARM='4000' 
// PARM=('ENVAR("_CEE_ENVFILE=DD:STDENV")/4000') 
//STDENV  DD * 
GSK_TRACE_FILE=/tmp/zzztrace.file 
GSK_TRACE=0xff 
GSK_KEYRING_FILE=START1/TN3270 
/*

This failed to work.

If I looked in the trace file I had

ENTRY gsk_open_keyring(): ---> Keyring 'START1/TN3270                       '

It had taken the whole length of the line as the value, so it looked for a keyring called START1/TN3270 padded with blanks, which was not found.

The trace file was not /tmp/zzztrace.file, it was /tmp/zzztrace.file padded with lots of blanks!

The answer is to use an environment file in USS, not inline data in the JCL or a data set.
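The padding problem can be illustrated with a small Python sketch. It assumes (my reading of the trace, not something I have confirmed in the documentation) that inline JCL data is presented as fixed length 80 byte records and that the trailing blanks are kept as part of the value. The function names are made up for the illustration.

```python
RECORD_LENGTH = 80  # assumption: inline DD * data arrives as 80 byte records


def as_card_image(line):
    """Pad a line with blanks to a fixed length record."""
    return line.ljust(RECORD_LENGTH)


def parse_env_record(record, strip_padding):
    """Split NAME=value, optionally removing the trailing blanks first."""
    if strip_padding:
        record = record.rstrip(" ")
    name, _, value = record.partition("=")
    return name, value


record = as_card_image("GSK_KEYRING_FILE=START1/TN3270")
_, padded = parse_env_record(record, strip_padding=False)
_, clean = parse_env_record(record, strip_padding=True)

print(repr(padded))  # the keyring name followed by blanks
print(repr(clean))   # the keyring name on its own
```

A file in the Unix file system has variable length lines, so there is no padding to strip.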

One minute MVS: catalogs and data sets

In days of old, when 64KB was a lot of real storage, to reference a data set you had to specify the data set name and the DASD volume the data set was on, for example DSN=MY.JCL,VOL=SER=USER00

After this, the idea of a catalog was developed. Just like the catalog in a library, it told you where things were stored. When you created a data set, and specified DISP=(NEW,CATLG), the data set name and volume were stored in the catalog. When you wanted to use a data set, and did not specify the volume, then the catalog was used to find the volume for the data set.

As systems grew and the number of data sets grew, the catalog grew and quickly became difficult to manage. For example if you deleted data sets, the entry was logically removed from the catalog, resulting in gaps in the catalog.

After this a multi level catalog was developed. You have one master catalog. You can have many user catalogs. You define an alias in the master catalog saying for data sets starting with a specific high level qualifier, use that user catalog.

When a userid was created, most system programmers would also create an alias pointing to a user catalog. They may define a user catalog for each user, or a user catalog could be shared by many aliases.
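The lookup through the master catalog and a user catalog can be sketched in Python as below. This is only a model of the idea, not how catalog management is implemented. The alias and catalog names are borrowed from the IDCAMS examples later in this post; the volume for COLIN.USERS and the SYS1.PARMLIB entry are made up for the illustration.

```python
# Master catalog: aliases (high level qualifiers) pointing to user catalogs
master_catalog_aliases = {
    "COLIN": "A4USR1.ICFCAT",
    "BACKUP": "A4USR1.ICFCAT",
}
# Master catalog: data sets cataloged directly (hypothetical entry)
master_catalog_entries = {"SYS1.PARMLIB": "SYSRES"}
# User catalogs: data set name -> volume (hypothetical volume)
user_catalogs = {"A4USR1.ICFCAT": {"COLIN.USERS": "A4USR1"}}


def locate_volume(dsn):
    """Return the volume for a data set, or None if it is not cataloged."""
    hlq = dsn.split(".")[0]
    user_catalog = master_catalog_aliases.get(hlq)
    if user_catalog is None:
        # No alias for this high level qualifier: look in the master catalog
        return master_catalog_entries.get(dsn)
    return user_catalogs[user_catalog].get(dsn)


print(locate_volume("COLIN.USERS"))
print(locate_volume("SYS1.PARMLIB"))
```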

The catalogs are managed by the VSAM component of z/OS.

Entity naming

PDS and sequential files are referred to as data sets. VSAM provides simple database objects:

Relative record (where you say get me the n’th record)
Key sequenced. You define a primary index, and you can define an index on different columns using an ALTERNATE INDEX.

VSAM uses the term cluster for the object you refer to in your JCL or application. A cluster has a data component and, for key sequenced data sets, an index component.

Moving systems.

I have been running on a self contained ADCD system at z/OS level 2.4. I have recently installed a self contained system at z/OS 2.5. How do I get my data sets into the new system?
You can import connect a user catalog into a new (master) catalog, and define an alias in the new master catalog pointing to the user catalog.

When I did this I could then see my COLIN.* data sets. To be able to use the data sets, I also needed the volumes to be attached to the z/OS system.

Useful IDCAMS commands

In batch you use the IDCAMS program (IDC = prefix for VSAM, AMS is for Access Method Services!)

If you do not specify a catalog, it defaults to the master catalog.

Create a user catalog

//IBMDF  JOB 1,MSGCLASS=H                                   
//S1  EXEC PGM=IDCAMS                                       
//SYSPRINT DD SYSOUT=*                                      
//SYSIN DD *                                                
 DEFINE USERCATALOG -                                       
      ( NAME('A4USR1.ICFCAT') -                             
        MEGABYTES(15 15) -                                  
        VOLUME(A4USR1) -                                    
        ICFCATALOG  -                                       
        FREESPACE(10 10) -                                  
        STRNO(3  ) ) -                                      
        DATA( CONTROLINTERVALSIZE(4096) -                   
              BUFND(4) ) -                                  
        INDEX(BUFNI(4) )                                    
/*

Creating an alias to use the catalog

The JCL below creates two aliases in the master catalog. They both point to a catalog called A4USR1.ICFCAT (which is in the master catalog)

//IBMUSERT JOB 1,MSGCLASS=H                                     
//S1  EXEC PGM=IDCAMS,REGION=0M                                 
//SYSPRINT DD SYSOUT=*                                          
//SYSIN DD *                                                    
   DEFINE ALIAS (NAME(BACKUP) RELATE('A4USR1.ICFCAT') )         
   DEFINE ALIAS (NAME(COLIN ) RELATE('A4USR1.ICFCAT') )         
/*

To import an existing user catalog into a master catalog

//IBMUSERT JOB 1,MSGCLASS=H                                   
//S1  EXEC PGM=IDCAMS,REGION=0M                               
//SYSPRINT DD SYSOUT=*                                        
//SYSIN DD *                                                  
   IMPORT CONNECT -                                           
      OBJECTS -                                               
        (('A4USR1.ICFCAT' VOLUME(A4USR1) DEVICETYPE(3390)  -  
        ))                                                    
/*

From the information provided, the master catalog knows the name, and location of the user catalog.

List an entry

You can list information about an entry, such as a data set, or a catalog using the LISTCAT command

LISTCAT ENT(COLIN.USERS) ALL

Listing aliases

You can use the IDCAMS command LISTCAT with alias

LISTCAT ALIAS

which gives a one line per entry list of all of the aliases

ALIAS --------- ADCDA       
ALIAS --------- ADCDB       
ALIAS --------- ADCDC       
ALIAS --------- ADCDD       
...

LISTCAT ALIAS ALL

gives

ALIAS --------- ADCDA                                                       
     ...          
     ENCRYPTIONDATA                                                         
       DATA SET ENCRYPTION-----(NO)                                         
     ASSOCIATIONS                                                           
       USERCAT--USERCAT.Z24C.USER                                           
ALIAS --------- ADCDB                                                       
    ...      
     ENCRYPTIONDATA                                                         
       DATA SET ENCRYPTION-----(NO)                                         
     ASSOCIATIONS
       USERCAT--USERCAT.Z24C.USER

So we can see that the alias ADCDA maps to user catalog USERCAT.Z24C.USER
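If you have many aliases, you can pull the mapping out of the LISTCAT ALIAS ALL output with a few lines of Python. This is a rough sketch which assumes the output looks like the extract above (an ALIAS line followed later by a USERCAT-- line); the function name is mine.

```python
import re


def alias_to_catalog(listing):
    """Map each alias to its user catalog from LISTCAT ALIAS ALL output."""
    mapping = {}
    current_alias = None
    for line in listing.splitlines():
        text = line.strip()
        match = re.match(r"ALIAS -+ (\S+)", text)
        if match:
            current_alias = match.group(1)
            continue
        match = re.match(r"USERCAT--(\S+)", text)
        if match and current_alias:
            mapping[current_alias] = match.group(1)
    return mapping


listing = """\
ALIAS --------- ADCDA
     ENCRYPTIONDATA
       DATA SET ENCRYPTION-----(NO)
     ASSOCIATIONS
       USERCAT--USERCAT.Z24C.USER
ALIAS --------- ADCDB
     ENCRYPTIONDATA
       DATA SET ENCRYPTION-----(NO)
     ASSOCIATIONS
       USERCAT--USERCAT.Z24C.USER
"""
print(alias_to_catalog(listing))
```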

Listing a catalog

The command

LISTCAT CATALOG(USERCAT.Z24C.USER)

gives

CLUSTER ------- 00000000000000000000000000000000000000000000       
   DATA ------- USERCAT.Z24C.USER                                  
   INDEX ------ USERCAT.Z24C.USER.CATINDEX                         
NONVSAM ------- ADCDA.S0W1.ISPF.ISPPROF                            
NONVSAM ------- ADCDA.S0W1.SPFLOG1.LIST                            
NONVSAM ------- ADCDB.S0W1.ISPF.ISPPROF                            
...                           
CLUSTER ------- SYS1.VVDS.VC4CFG1                                  
   DATA ------- SYS1.VVDS.VC4CFG1

This shows that the catalog has a data component called USERCAT.Z24C.USER, and an index component called USERCAT.Z24C.USER.CATINDEX.

Within the catalog are entries for data sets such as ADCDA.S0W1.ISPF.ISPPROF, and the system data set SYS1.VVDS.VC4CFG1 (a VSAM volume data set, or VVDS), which contains information about what is on the SMS DASD volume C4CFG1.

Where’s my packet gone? How can I see it being discarded.

I was testing out tracing TCP/IP on z/OS and wondered why I was not seeing packets with a bad destination address in the trace. I’ve now found out WHY I cannot see them, it’s just a little extra complexity of TCP/IP.

I had configured my z/OS to have an IP address of 7.168.1.74, and I could trace packets sent to it. When I defined a route to z/OS for 7.168.1.75 (note the 75), packets for that address did not appear in my trace, not even as undeliverable packets.

My configuration was

  INTERFACE TAP1 
     DEFINE IPAQENET 
     CHPIDTYPE OSD 
     IPADDR 7.168.1.74 
;     PRIROUTER 
     PORTNAME PORTB 
 START TAP1

When I added the PRIROUTER option, my packet appeared in the trace. The documentation says

The primary router stack is the only stack to which OSA forwards packets when the destination IP address has not been previously registered with OSA. If no active TCP/IP instance using this device is defined as the primary router (PRIROUTER) or secondary router (SECROUTER), the device discards the datagram.

I think of an OSA as a clever hardware socket where you plug your Ethernet cable into the side of the z/OS box. Amongst other things, it drops traffic not destined for the z/OS image(s).

The quoted text means the OSA is doing some of the network work by dropping packets not destined for the z/OS image(s), so there is less work for z/OS.

With PRIROUTER I had a trace record, and the drop reason code was 4167, IP_NO_FWD: the packet cannot be forwarded, as expected.
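My understanding of the OSA decision, based on the quoted documentation, can be sketched in Python as below. This is a toy model, not the OSA logic itself. The stack name TCPIP is made up, and treating the secondary router as a simple fallback is my assumption.

```python
def osa_route_inbound(dest_ip, registered, primary_router=None,
                      secondary_router=None):
    """Return the stack which gets the packet, or None if it is discarded.

    registered maps each IP address registered with the OSA to a stack name.
    """
    if dest_ip in registered:
        return registered[dest_ip]
    # Unregistered address: only a router stack can receive it
    return primary_router or secondary_router


registered = {"7.168.1.74": "TCPIP"}  # hypothetical stack name

# Without PRIROUTER the packet for .75 never reaches z/OS, so no trace record
print(osa_route_inbound("7.168.1.75", registered))
# With PRIROUTER the stack sees the packet, and can report it cannot forward it
print(osa_route_inbound("7.168.1.75", registered, primary_router="TCPIP"))
```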

Can I see this in NETSTAT output?

When I use NETSTAT STATS, it now has

Received Address Errors = 5

When I use NETSTAT DEVLINKS, the interface now has

Inbound Packets In Error = 5

When PRIROUTER is not specified, these values are 0.

MFA messages

AZF2406E Error from R_factor (AZFTOTP1) (length=0)

The RACF definitions have been done, but you still need to do the AZFEXEC processing for the factor (AZFTOTP1)

From Curl on Linux

curl: (35) error:0A00010B:SSL routines::wrong version number

The policy agent was not started on z/OS

ANZ#IN01 : IO error while writing, errno=140 reason=76697242

The policy agent was not running.

Browser: Secure Connection Failed An error occurred during a connection to … SSL received a record that exceeded the maximum permissible length.

Error code: SSL_ERROR_RX_RECORD_TOO_LONG

The policy agent was not started on z/OS

AZF2606E Failed to listen on loopback address (port:6792. rc:1115, rsn:0x744c7247)

There were two instances of AZF#IN00 active at the time (I was shutting one down, and restarting it)

Action: Wait till the first instance has stopped and then restart it.

z/OS health checker – understanding and configuring it

The z/OS Health Checker has many checks for the configuration of your z/OS system and its components. For example, it checks for the existence of expired certificates, and for z/OS parameters which are not best practice. The checks are configured into Health Checker, and you can have one or more policies which change these checks. You can make checks inactive if they are not applicable, for example when they refer to a sysplex function and your system is not in a sysplex.

Some checks are shipped as inactive, but most are active.

There are different sorts of check

local – these run in the Health Checker address space
remote – these run in other address spaces and report into the Health Checker
Rexx – these run in an address space and report to the Health Checker
global – these run once in a sysplex

Some of the concepts and commands are not very intuitive (for example the documentation is not very clear on how policy and checks are connected), but on the whole it is pretty easy to understand and use.

There is a policy which allows you to change the operational aspects of individual checks, for example make a check inactive, or change the description so it provides information specific to your configuration.

You can have multiple policies, so you could have one configuration and policies for different systems (for example different levels of z/OS), or different shifts.

Within each policy each statement has an identifier (it defaults to a sequence number), or you can give each statement a label using the STATEMENT(…) option. Personally I would have used the term “label”, to avoid descriptions like “the STATEMENT parameter is used to identify the statement”. If you specify a STATEMENT it is easy to find. A check might report “changed by COLINZFS” or “changed by 17”; it is easier to find COLINZFS than to find the 17th statement.

When do checks run?

The checks are run when

the HZSPROC is started
as specified in the configuration (parameter INTERVAL=ONETIME|hhh:mm)
when the refresh command is issued, such as F HZSPROC,refresh,check(*,*)

What output do checks provide?

A successful check writes nothing to the logs. Each exception writes a few lines to the system log. You can run a print job to get a fuller description, including the full message text, operator actions etc. For example one RACF check gave a “one line” entry on syslog, but the print job listed all the digital certificates which had expired, or were due to expire. The output format is what you typically see in a z/OS messages manual.

You can use the SDSF CK command to display the checks, the status, including when they last ran, and take action on the checks; such as temporarily disable or delete a check.

There are various operator display commands you can issue

A summary of the Health Checker configuration
Policy information
Checks and their status etc.

A summary of the Health Checker configuration

The commands are described here.

f hzsproc,display,status

gave me

HZS0203I  16.02.33 HZS INFORMATION                          
POLICY(DEFAULT)                                                 
OUTSTANDING EXCEPTIONS: 12                                      
  (SEVERITY NONE: 0   LOW: 3   MEDIUM: 8   HIGH: 1)             
ELIGIBLE CHECKS: 155  (CURRENTLY RUNNING: 0)                    
INELIGIBLE CHECKS: 67   DELETED CHECKS: 0                       
ASID: 0041   LOG STREAM: NOT DEFINED                            
  LOG STREAM WRITES PER HOUR: 1327                              
  LOG STREAM AVERAGE BUFFER SIZE: 2364 BYTES                    
HZSPDATA DSN: ADCD.S0W1.HZSPDATA                                
HZSPDATA RECORDS: 828                                           
PARMLIB: AD,CP                                                  
ORIGINAL PARMLIB SOURCE: <USER>                                 
OPTIONS: NONE

where

PARMLIB: AD,CP means members HZSPRMAD and HZSPRMCP were used from the SYS1.PARMLIB concatenation
ORIGINAL PARMLIB SOURCE: <USER> means the definitions were read from SYS1.PARMLIB. You can specify PREV, which means use the same as last time, but the use of this is unclear to me.

Display the status of the individual checks

F HZSPROC,DISPLAY,CHECKS                                               
HZS0200I 16.03.07 CHECK SUMMARY     580                                
CHECK OWNER      CHECK NAME                      STATE STATUS          
IBMCS            ZOSMIGV2R4PREV_CS_IWQSC_TCPIP    IE   INACTIVE        
IBMCS            CSTCP_IWQ_IPSEC_TCPIP            AE   SUCCESSFUL 
IBMCS            CSTCP_CINET_PORTRNG_RSV_TCPIP    AE   EXCEPTION-MED   
IBMCS            CSTCP_SYSPLEXMON_RECOV_TCPIP     AE   SUCCESSFUL 
... 
A - ACTIVE          I - INACTIVE                                  
E - ENABLED         D - DISABLED                                  
G - GLOBAL CHECK    + - CHECK ERROR MESSAGES ISSUED

This shows

All of these checks came from the IBMCS (TCPIP) component
Check ZOSMIGV2R4PREV_CS_IWQSC_TCPIP is inactive
CSTCP_CINET_PORTRNG_RSV_TCPIP has detected a medium level exception
State: Indicates whether a check runs at the next specified interval. For example IE is INACTIVE(ENABLED) and AE is ACTIVE(ENABLED)
Status: Describes the output of the check when it last ran. For example INACTIVE and EXCEPTION-MED
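If you capture the check summary from syslog, the state letters can be decoded and the exceptions picked out with a short Python sketch like the one below. It assumes each data line has four blank separated fields, as in the output above; the function name is mine.

```python
# Meaning of each letter in the STATE column, from the legend in the output
STATE_FLAGS = {
    "A": "ACTIVE",
    "I": "INACTIVE",
    "E": "ENABLED",
    "D": "DISABLED",
    "G": "GLOBAL CHECK",
    "+": "CHECK ERROR MESSAGES ISSUED",
}


def parse_check_line(line):
    """Split one line of the check summary into its four fields."""
    owner, name, state, status = line.split()
    return {
        "owner": owner,
        "name": name,
        "state": [STATE_FLAGS[flag] for flag in state],
        "status": status,
    }


summary = """\
IBMCS            ZOSMIGV2R4PREV_CS_IWQSC_TCPIP    IE   INACTIVE
IBMCS            CSTCP_IWQ_IPSEC_TCPIP            AE   SUCCESSFUL
IBMCS            CSTCP_CINET_PORTRNG_RSV_TCPIP    AE   EXCEPTION-MED
IBMCS            CSTCP_SYSPLEXMON_RECOV_TCPIP     AE   SUCCESSFUL
"""
checks = [parse_check_line(line) for line in summary.splitlines() if line.strip()]
exceptions = [c["name"] for c in checks if c["status"].startswith("EXCEPTION")]
print(exceptions)
```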

You can display an individual check, or a group of similar checks using a wildcard.

F HZSPROC,DISPLAY CHECKS,check=(IBMRACF,RACF_I*)
F HZSPROC,DISPLAY CHECKS,check=(IBMRACF,RACF_GRS_RNL)

F HZSPROC,DISPLAY CHECKS,CHECK=(IBMRACF,RACF_I*)                    
HZS0200I 14.41.21 CHECK SUMMARY     812                             
CHECK OWNER      CHECK NAME                      STATE STATUS       
IBMRACF          RACF_ICHAUTAB_NONLPA             AE   SUCCESSFUL   
IBMRACF          RACF_IBMUSER_REVOKED             IE   INACTIVE     
 ...

Using F HZSPROC,DISPLAY CHECKS,…DETAIL gives a lot of information about each check, such as state, status, and when it last ran.

Display a policy

You can display all the items in a policy, or details about a statement in a policy. A policy is a group of changes you make to checks. This could be to make a check inactive, or to change the description(reason) to provide more site specific information.


F HZSPROC,DISPLAY,POLICY,STATEMENT=COLINS                         
HZS0204I 17.29.33 POLICY SUMMARY    988                           
POLICY(DEFAULT)                                                   
STMT NAME       TYPE  CHECK OWNER      CHECK NAME                 
COLINS           UPD  IBMRACF          RACF_SYSPLEX_COMMUNICATION

F HZSPROC,DISPLAY,POLICY,STATEMENT=COLINS,DETAIL                 
HZS0202I 17.30.16 POLICY DETAIL     992                          
POLICY(DEFAULT) STATEMENT: COLINS                                
 ORIGIN: HZSPRMUS        DATE: 20240120                          
 UPDATE CHECK(IBMRACF,RACF_SYSPLEX_COMMUNICATION)                
 REASON: Colins - Test/Development env                             
 INACTIVE

Update a policy

You can use the command interface to temporarily update the checks and policy, or you can update the HZSPRMxx members.

For example I updated USER.*.PARMLIB(HZSPRMUS) with

ADDREPLACE POLICY STATEMENT(COLINS) 
UPDATE   CHECK(IBMRACF,RACF_SYSPLEX_COMMUNICATION) 
         DATE(20240120) 
         INACTIVE 
         REASON('COLIN - Test/Development env') 
                                                                    
ADDREPLACE POLICY 
UPDATE   CHECK(IBMRACF,RACF_PROTECTALL_FAIL) 
         DATE(20240120) 
         INACTIVE 
         REASON('COLIN2- do not want in one person system')

Note:

Each update needs an ADDREPLACE POLICY… statement
If you do not provide a STATEMENT, a numerical one is generated for you
You need a date (see dates in policies). Each check has a default date specified. If you specify a date which is before the check's default date, the policy statement is not applied to the check. I specified the date I changed it to inactive so I have an audit trail!
I specified a reason why I made it inactive. This reason is displayed when you display the policy details.
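The date rule can be sketched in Python as below. This is my understanding of the rule (a policy statement dated before the check's own date is ignored, and an equal date is accepted), not a statement of how Health Checker implements it. The check date 20111010 is the one shown for RACF_CERTIFICATE_EXPIRATION in the print output, used here purely as an illustration.

```python
from datetime import datetime


def parse_date(yyyymmdd):
    """Convert a DATE(yyyymmdd) value into a date object."""
    return datetime.strptime(yyyymmdd, "%Y%m%d").date()


def policy_statement_applies(policy_date, check_date):
    """Assumed rule: the statement is used unless it is older than the check."""
    return parse_date(policy_date) >= parse_date(check_date)


print(policy_statement_applies("20240120", "20111010"))
print(policy_statement_applies("20100101", "20111010"))
```

The idea behind the rule is that if IBM changes a check and gives it a newer date, your old policy statement may no longer be appropriate, so you are forced to review it.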

I then used the command

F hzsproc,ADD,PARMLIB=(US,CHECK)

To check the syntax and validity, where US is the suffix of the HZSPRM source (above). I then used

F hzsproc,ADD,PARMLIB=(US)

To activate the definition.

You can specify which parmlib members are used at startup in the HZSPROC JCL.

How do I check the status of the checks?

In SDSF you can use the CK option to display the checks. There are many columns. The interesting columns (to me) are

Name: like RACF_PROTECTALL_FAIL
Owner: (what I would call a component) IBMRACF
State: Indicates whether a check runs at the next specified interval. For example INACTIVE(ENABLED) and ACTIVE(ENABLED)
Status: Describes the output of the check when it last ran. For example INACTIVE and EXCEPTION-MEDIUM
Run count: such as 9
ModifiedBy: such as STMT(37) in the policy. This allows you to find your update statement. Specifying your own STATEMENT(…) makes it easier to use SRCHFOR to find the member with the update.
Reason: the description from the check such as PROTECTALL(FAIL) should be enabled.
Update reason: any update from the policy statement, for example COLIN- do not want in one person system

In SDSF CK, you can use the DL line command to display the information in one screen. Sometimes the line command SV (display in ISPF View) works. You can use the DP line command to display the active policy (if any) for the check.

You can also use the standard SDSF commands sort, arrange and filter (such as FILTER STATUS EQ "INACTIVE") to limit and change the data displayed.

Printing the full health check log

I used

//IBMHZSPR JOB 1,MSGCLASS=H 
//HZSPRINT EXEC PGM=HZSPRNT,TIME=1440,REGION=0M,PARMDD=SYSIN 
//SYSIN DD * 
CHECK(*,*) 
,EXCEPTIONS 
//SYSOUT   DD SYSOUT=*,DCB=(LRECL=256)

After I cleaned my system I had two exceptions, and a total of 150 lines of output.

One minute MVS: Health checker

The z/OS Health checker is a great facility, and makes the systems programmer’s job much easier. z/OS provides a set of configuration guidelines, such as the value for … should be …. At IPL and periodically, it checks the system and reports anything which is out of line. This allows you to check your configuration is consistent with best practice, and may identify problems you were not aware of.

For example it reported

I had some digital certificates about to expire or had already expired – whoops.
Some OMVS mounts had failed (because the entries in the BPXPRM… file were not active on the system)
Some storage allocations were not as recommended.

When I printed out the full report, it told me what the recommended values were, and what values I had in my configuration, so it was easy to change.

You can have different sorts of checks

local – these run in the Health Checker address space
remote – these run in other address spaces and report into the Health Checker
Rexx – these run in an address space and report to the Health Checker

You can print out the full list of problems, and this comes with comprehensive help information and instructions on what to do about the problem.

Example output in syslog

HZS0001I CHECK(IBMCS,CSVTAM_CSM_STG_LIMIT): 442                       
ISTH017E Communications storage manager (CSM) storage allocation      
definitions might not be optimal                                      
HZS0002E CHECK(IBMRACF,RACF_JESJOBS_ACTIVE): 443                      
IRRH229E The class JESJOBS is not active.                             
HZS0001I CHECK(IBMOCE,OCE_XTIOT_CHECK): 444                           
IECH0101E OPEN macro support for XTIOT, uncaptured UCBs and DSAB      
above the line is not enabled for non-VSAM. IBM recommends setting    
NON_VSAM_XTIOT=YES in the DEVSUPxx member of PARMLIB.                 
HZS0001I CHECK(IBMRACF,RACF_PASSWORD_CONTROLS): 445                   
IRRH283E The RACF_PASSWORD_CONTROLS check found an exception          
with one or more password control settings.                           
HZS0002E CHECK(IBMXCF,XCF_TCLASS_CLASSLEN): 446                       
IXCH0420E The XCF transport class size segregation configuration on   
system S0W1 is inconsistent with the owner specification.

You can disable health checks which you do not want, so after cleaning your system, you should aim to have no health check exceptions.

What do these mean?

You can run a print job

//IBMHZS   JOB 1,MSGCLASS=H 
//HZSPRINT EXEC PGM=HZSPRNT,TIME=1440,REGION=0M,PARMDD=SYSIN
//SYSIN DD * 
CHECK(*,*) 
,EXCEPTIONS 
//SYSOUT   DD SYSOUT=*,DCB=(LRECL=256)

Example output of print job

Certificates Expiring within 60 Days

CHECK(IBMRACF,RACF_CERTIFICATE_EXPIRATION)                                 
SYSPLEX:    ADCDPL    SYSTEM: S0W1                                         
START TIME: 01/19/2024 07:14:39.529686                                     
CHECK DATE: 20111010  CHECK SEVERITY: MEDIUM                               
                                                                           
                  Certificates Expiring within 60 Days                     
                                                                           
S Cert Owner   Certificate Label                End Date   Trust Rings     
- ------------ -------------------------------- ---------- ----- -----     
  CERTAUTH     Verisign Class 1 Individual CA   2008-05-12 No        0
E ID(START1)   JES2 CLIENT EDS                  2019-03-21 Yes       1     
  CERTAUTH     GTE CyberTrust Root CA           2006-02-23 No        0 
...
Only certificates that are marked as trusted result in exceptions.              
Exceptions are indicated by an "E" or an "M" in the "S" (Status)                
column.  An "E" indicates that the certificate has expired within               
time period examined by the check. An "M" indicates that the                    
certificate has no end date in the certificate profile. The trust               
status of the certificate is shown in the "Trust" column. The number            
of key rings to which the certificate is connected (other than the              
virtual key ring) is shown in the "Rings" column. A value of "99999"            
in the "Rings" column indicates that the certificate is connected to            
99999 or more rings.  
Use the RACDCERT LIST command to list complete information about any    
certificate. The RACDCERT command syntax is:                            
                                                                        
        RACDCERT CERTAUTH     LIST(LABEL('label-name'))                 
                               or                                       
        RACDCERT SITE         LIST(LABEL('label-name'))                 
                               or                                       
        RACDCERT ID(user-id)  LIST(LABEL('label-name')) 
...

BPXH061E One or more file systems specified in the BPXPRMxx parmlib
members are not mounted.

* High Severity Exception *      

BPXH059I The following file systems are not active:                        
-----------------------------------------------------------                
File System: ZWE200.ZFS                                                    
 Parmlib Member: BPXPRMZW                                                  
 Path: /usr/lpp/zowe                                                       
 Return Code: 00000099                                                     
 Reason Code: EF096150                                                     
                                                                           
File System: ZWE200.CONFIG.ZFS                                             
 Parmlib Member: BPXPRMZW                                                  
 Path: /apps/zowe/v20                                                      
 Return Code: 00000099                                                     
 Reason Code: EF096150

Whoops, I missed that one due to a finger problem

CSFH0042I Check for weak CCA cryptographic keys in the PKDS

CHECK(IBMICSF,ICSF_WEAK_CCA_KEYS)                                                   
SYSPLEX:    ADCDPL    SYSTEM: S0W1                                                  
START TIME: 01/19/2024 07:15:00.161074                                              
CHECK DATE: 20181101  CHECK SEVERITY: LOW                                           
                                                                                    
CSFH0042I Check for weak CCA cryptographic keys in the PKDS                         
                                                                                    
Active PKDS: CSF.CSFPKDS.NEW                                                        
---------------------------------------------------------                           
COLIN                                                                               
COLIN2                                                                              
                                                                                    
* Low Severity Exception *                                                          
                                                                                    
CSFH0044E Weak CCA cryptographic keys in the PKDS were found. 
....

EZBH008E The port range defined for CINET use has not been reserved for
OMVS on this stack.

CHECK(IBMCS,CSTCP_CINET_PORTRNG_RSV_TCPIP)                                        
SYSPLEX:    ADCDPL    SYSTEM: S0W1                                                
START TIME: 01/19/2024 07:14:59.665575                                            
CHECK DATE: 20070901  CHECK SEVERITY: MEDIUM                                      
                                                                                  
* Medium Severity Exception *                                                     
                                                                                  
EZBH008E The port range defined for CINET use has not been reserved for           
OMVS on this stack.                                                               
                                                                                  
  Explanation:  The port range defined for CINET use in the BPXPRMxx              
    parmlib member is not reserved for OMVS on this stack.  
...

Should I use tar or pax to backup my Unix files?

I am running this on z/OS and want to copy Unix files from one z/OS image to another.

The tar command is very popular, and works for most people.

The pax command is similar to tar, but it can also save and restore file attributes that cannot be handled by any other format, such as files greater than 8 GB, large UID and GID values, large time values, and z/OS-specific attributes like user audit and auditor audit flags and file format.

You create a file using

pax -o saveext -wf pax_file_name files_to_add

and

pax -ppx -rf pax_file_name

to extract the files.

Thanks to Gwydion Tudur for the pointers about extended attributes.

Using a data set

You can use a data set as the output file, for example you specify "//'COLIN.PAX.HTTP2'"

pax -E -f "//'COLIN.PAX.HTTP2'"

This will display the contents of the file, for example it gives

drwxrwxrwx        1 OMVSKERN SYS1           0 Aug  4 10:03 ./                                  
drwxrwxrwx        1 OMVSKERN SYS1           0 Nov 15  2021 ./images/                           
drwxrwxrwx        1 OMVSKERN SYS1           0 Nov 15  2021 ./images/ihs/                       
-rwxrwxrwx  --s-  1 OMVSKERN SYS1         223 Nov 15  2021 ./images/ihs/administration.gif     
                 
-rwxrwxrwx  --s-  1 COLIN    SYS1         373 Jun 17  2023 ./colin.html

Using a data set makes the archive more portable, because it is an ordinary data set rather than a file in a Unix file system.

File owners

Within the .pax file the file owner is a name. When the file is unpacked, you can use the -p o option to preserve the user ID and group information. On my system userid OMVSKERN has uid 0, and group SYS1 has gid 0. On my newer z/OS system the file got the uid of COLIN on the new system, not from the old system.

Without the -po option, the files get the uid from the userid executing the pax command.

One minute MVS: What is IBM Multi Factor Authentication on z/OS?

Most people are familiar with Multi Factor Authentication (MFA). For example when accessing a banking site through the internet, you have a digital code sent to your phone which you enter in the web page.

There is a phrase associated with MFA: something you know, something you have. When using internet banking, you use a userid and password (something you know) and the 6 digit code sent to your phone (something you have). At airports, the staff use a badge to get access to secure areas. They swipe the badge (something you have) and have to enter a 4 digit code (something you know).

In-band and out-of-band

With some applications you enter two factors to logon to the application. For example, I can logon to TSO with a “password” 983211:passw0rd - where 983211 is a one time code (which changes every minute) and passw0rd is my password. This is in-band (you enter the combined password >IN<to the application).

I can use a certificate to logon to a web page, get a one time password and enter that into the TSO logon screen. This is indirect, or out-of-band authentication, you set up the password >OUT<side of the application.

What is available on z/OS?

You can set up one-time-codes, or password (or pass phrase), or one-time-code and password (or pass phrase). A password can be up to 8 characters. A pass phrase must be between 14 and 100 characters in length (inclusive).

You can get a one-time-code from several sources:

A small hardware device, which you can hang on your keyring
Generated from software. I use the IBM Security Verify application on my Android phone. There are other applications, such as Google Authenticator and Duo Mobile, but the code generated by these was not accepted by z/OS. See below.

To set up the mobile phone application you log on to the IBM MFA web page on your z/OS system and a QR code is displayed. This code contains a secret, and other information such as algorithm=SHA256, period=60 seconds, digits=6. The app on your phone reads this and stores the information. Whenever you use the app, it displays a code which you enter on z/OS. The value is time limited and expires after a short interval, typically 30 or 60 seconds.
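To show how the secret, algorithm, period and digits fit together, here is a minimal sketch of the standard time-based one-time password calculation (RFC 6238). It is not taken from IBM MFA or from any of the phone apps. It assumes the secret is already available as raw bytes (a QR code would normally carry it in an encoded form), and the function name totp is just for illustration.

```python
import hmac
import struct
import time


def totp(secret, at_time=None, period=60, digits=6, algorithm="sha256"):
    """Return the time-based one-time code (RFC 6238) for a shared secret.

    secret is the raw secret as bytes; period, digits and algorithm
    correspond to the values carried in the QR code.
    """
    if at_time is None:
        at_time = time.time()
    # Number of whole periods since the Unix epoch, as an 8-byte counter.
    counter = struct.pack(">Q", int(at_time) // period)
    digest = hmac.new(secret, counter, algorithm).digest()
    # Dynamic truncation (RFC 4226): the low 4 bits of the last byte
    # select where to take 4 bytes from the digest.
    offset = digest[-1] & 0x0F
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    # Keep the requested number of decimal digits, padded with zeros.
    return str(value % 10 ** digits).zfill(digits)


# The published RFC 6238 test secret, not a real one.
print(totp(b"12345678901234567890", at_time=59, period=30,
           digits=8, algorithm="sha1"))
```

Because the algorithm and the period both feed into the calculation, the phone and the server only produce the same code if they agree on all of these values as well as on the secret.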

I configured MFA to use just the TOTP (Time-based One-Time Password). When I logged on to TSO with userid TOTP and the code I got

ICH70008I IBM MFA Message:
AZF1105I TOTP PASSCODE ACCEPTED
ICH70001I TOTP LAST ACCESS AT 07:37:37 ON SATURDAY, JANUARY 6, 2024

When I configured MFA to require TOTP and password, I had to enter a password like 345112:PASSW0RD, where 345112 was the one time code from the application.
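The shape of the combined in-band credential can be illustrated with a short sketch. It only demonstrates the code:password format and the length limits mentioned earlier (password up to 8 characters, pass phrase 14 to 100). How IBM MFA actually parses the field is not shown here, and both function names are hypothetical.

```python
def split_in_band(credential):
    """Split 'code:secret' into the one-time code and the password or pass phrase."""
    code, separator, secret = credential.partition(":")
    if not separator or not code.isdigit():
        raise ValueError("expected <one-time-code>:<password>")
    return code, secret


def classify_secret(secret):
    """Classify by length: password is 1..8 characters, pass phrase is 14..100."""
    if 1 <= len(secret) <= 8:
        return "password"
    if 14 <= len(secret) <= 100:
        return "pass phrase"
    return "invalid"


code, secret = split_in_band("345112:PASSW0RD")
print(code, secret, classify_secret(secret))
```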

You configure the MFA on a per userid basis. I set up MFA for a new userid called TOTP, and this has to logon with two factors. Another userid only has to logon with the password.

The IBM Security Verify application worked out of the box.

With the Duo Mobile and Google Authenticator applications I got the message

AZF5042E Preflight saw invalid account metadata

because they provided an invalid code. The applications only worked with the following system wide options

Digest Algorithm . . . . . 1 (SHA-1)
Token Code Length  . . . . 1 (6-digit)
Token Period . . . . . . . 2 (30 seconds)

See this page for more information.

Yubikey

A Yubikey is a small USB device from which you can get a one time password. I found the site and what you need to order confusing, and purchased the wrong device. On one of the Yubikey pages it compares the different devices. I needed a Yubikey 5 series; I had (wrongly) purchased a Yubikey Security key Series. When the new key arrives I’ll write up how to use it.

Using a certificate

To make your certificate known to the MFA instance, you logon to a web page using TLS. The web server port has been secured using AT-TLS. When you logon to the web page, the web browser displays the list of valid certificates for you to choose from. After you have selected one, the application running in the web server can extract information from the certificate, and update the userid information in the MFA profile for the userid.

You set up an “out-of-band policy” saying which authentication method to use (certificates) and how long the password is valid for (60 seconds).

You configure the userid to be able to use the policy.

To be able to logon, the userid logs on to a different web page ../mfa/mypolicy using the same certificate and enters the userid. A TLS handshake is done to the server (validating the certificate), and a password is returned. You enter the password in your application.

One minute MVS – Using individual data set encryption on z/OS.

Overview

You can have full disk encryption. This prevents the disk from being read if it is removed from the environment. The disk subsystem requests the keys from a key manager, not z/OS, as the disk subsystem is doing the encryption and decryption. The keys are requested at power on of the disk subsystem.

On z/OS you can have data set encryption. The data set contents are encrypted on disk. Each data set could have a unique encryption key. Users on the system need permission to read the data set, and need access to the encryption key.

If a userid is permitted to the data set, and to the encryption key, the userid has access to the data and can read and write it, the same way as if the data set was not encrypted.

Once you have set up the definitions, they are used when the data set is created. To encrypt a data set, you can…

Create a new (encrypted) dataset
Copy the old to the new.
Delete the old, and rename the new to old.

If you delete the key, then the data is not accessible unless you have a backup of the key – or you have a copy of the key on another system.

This encryption does not apply to files in Unix System Services, because these are not RACF protected.

MQ 9.2 and later supports encryption, including for page sets and log data sets. See here. DB2 can use data set encryption for its page sets and logs, see here.

Topics

Implementation
- ISPF interface
- Batch interface
Security profiles
- Create and use the encryption key profiles
- Use the definitions with a dataset
Doing interesting things with encrypted data sets
Other questions

Implementation

You create an encryption key using the ICSF component on z/OS.

ISPF interface

If you are using the ICSF ISPF interface, use options 5 Utility, 5 CKDS Keys, 7 Generate AES DATA keys. In the field “Enter the CKDS record label for the new AES DATA key”, enter a memorable name. The Redbook uses a prefix of DATASET.name; I used COLINAES.

In AES key bit length: select 256 – other values give errors.

Batch interface

Use the operator command d icsf,kds to display the current datasets being used by ICSF. It gave me CSF.CSFCKDS.NEW .

The JCL below deletes the key, and creates a new key. It then refreshes the in memory data. (Once you delete the key, any data sets which used it cannot be read).

//IBMICSF  JOB 1,MSGCLASS=H 
//STEP10 EXEC PGM=CSFKGUP 
//  SET CKDS=CSF.CSFCKDS.NEW 
//CSFCKDS DD DISP=OLD,DSN=&CKDS 
//* LENGTH(32) GENERATES A 256 BIT KEY 
//CSFIN DD *,LRECL=80 
DELETE TYPE(DATA) LABEL(COLINBATCHAES ) 
ADD TYPE(DATA) ALGORITHM(AES), 
LABEL(COLINBATCHAES          ) LENGTH(32) 
/* 
//CSFDIAG DD SYSOUT=*,LRECL=133 
//CSFKEYS DD SYSOUT=*,LRECL=1044 
//CSFSTMNT DD SYSOUT=*,LRECL=80 
//* Refresh the in memory data
//REFRESH  EXEC PGM=CSFEUTIL,PARM='&CKDS,REFRESH'

This gave

CSFG0321 STATEMENT SUCCESSFULLY PROCESSED.
CSFG0780 A REFRESH OF THE IN-STORAGE CKDS IS NECESSARY TO ACTIVATE CHANGES MADE BY KGUP.

and the refresh gave

CSFU002I CSFEUTIL COMPLETED, RETURN CODE = 0, REASON CODE = 0

Security profiles

The encryption information is used when the data set is created. This can be specified in JCL, VSAM DEFINE, or in the DFP extension of a dataset RACF profile.

Create and use the encryption key profiles

Use batch TSO. The statements below:

Use SET to define the variable, as it is used in several places
Delete the old profile (there is no define replace)
Create the profile
Give userid IBMUSER read access to the profile
Refresh the RACLIST information
Alter the data set profile to set the DFP segment to use the key just defined

//IBMRACF2 JOB 1,MSGCLASS=H 
//       EXPORT SYMLIST=* 
// SET KEY=COLINAES 
//S1  EXEC PGM=IKJEFT01,REGION=0M 
//SYSPRINT DD SYSOUT=* 
//SYSTSPRT DD SYSOUT=* 
//SYSTSIN DD *,SYMBOLS=JCLONLY   
RDELETE CSFKEYS &KEY 
RDEFINE CSFKEYS &KEY                          + 
  ICSF(SYMCPACFWRAP(YES) SYMCPACFRET(YES))    + 
  UACC(NONE) 
PERMIT  &KEY                                  + 
  CLASS(CSFKEYS) ID(IBMUSER      )            + 
  ACCESS(READ) 
SETROPTS RACLIST(CSFKEYS) REFRESH 
                                                       
RLIST  CSFKEYS &KEY      AUTHUSER         ICSF 

                                             
ALTDSD 'COLIN.ENCR.*' UACC(NONE)    + 
   DFP(DATAKEY(&KEY)) 

/*                   
//* LISTCAT ENTRIES('COLIN.ENCR.DSN') ALL

This encryption information is only used when a data set is created.

If you use LISTCAT, it will show old information, until the data set is recreated.

More set up

When I tried creating a data set with the encryption label I got

IEF344I IBMRACF2 S3 DD2 - ALLOCATION FAILED DUE TO DATA FACILITY SYSTEM ERROR           
IGD17157I DSNTYPE BASIC IS NOT A SUPPORTED DATA SET TYPE FOR ENCRYPTION                 
BECAUSE STGADMIN.SMS.ALLOW.DATASET.SEQ.ENCRYPT IS NOT DEFINED                           
IGD17151I ALLOCATION FAILED FOR DATA SET                                                
COLIN.ENCR.DSN BECAUSE A KEY LABEL IS                                                   
SPECIFIED FOR AN UNSUPPORTED DATA SET TYPE.

See Specifying a key label for a non-extended format data set.

You can either use an SMS data class (DC) with Extended Format specified, or define the RACF resource

STGADMIN.SMS.ALLOW.DATASET.SEQ.ENCRYPT.

TSO RDEFINE FACILITY STGADMIN.SMS.ALLOW.DATASET.SEQ.ENCRYPT
TSO SETR RACLIST(FACILITY) REFRESH

Use the definitions with a dataset

You can specify the encryption key reference in

JCL using DSKEYLBL
Via a RACF data set profile and the DFP extension
IDCAMS DEFINE, with KEYLABEL(MYLABEL)
SMS definitions

If there is no DFP segment to the RACF profile you can use

//SYSUT2 DD   DSN=COLIN.ENCR.DSN,SPACE=(CYL,(1,1)), 
//       DSKEYLBL=COLINBATCHAES, 
//       DISP=(MOD,CATLG), 
//       DCB=(RECFM=FB,LRECL=80,BLKSIZE=800)

In the JCL output it has

IGD17150I DATA SET COLIN.ENCR.DSN IS ELIGIBLE FOR ACCESS METHOD ENCRYPTION. KEY LABEL IS (COLINBATCHAES)

LISTCAT output gave

LISTCAT ENTRIES('COLIN.ENCR.DSN') ALL                           
NONVSAM ------- COLIN.ENCR.DSN                                  
     IN-CAT --- A4USR1.ICFCAT                                   
     HISTORY                                                    
       ...  
     SMSDATA                                                    
      ... 
     ENCRYPTIONDATA                                             
       DATA SET ENCRYPTION----(YES)
       DATA SET KEY LABEL-----COLINBATCHAES

Doing interesting things with encrypted data sets

You can use DFDSS to copy the encrypted dataset, without decrypting it. Any encryption parameters are copied to the new data set.

You need access to the CSFKEYS profile.

The JCL below

Deletes the old data set
Copies COLIN.ENCR.DSN, renaming the high level qualifier of the output from COLIN to ADCD
Lists the catalog entry for the output data set

//IBMDFDSS JOB 1,MSGCLASS=H                                       
//S1 EXEC PGM=IEFBR14,REGION=0M                                  
//SYSPRINT DD SYSOUT=*                                            
//DDOLD DD DSN=ADCD.ENCR.DSN,SPACE=(CYL,(1,1)),DISP=(MOD,DELETE) 
//* 
//S1  EXEC PGM=ADRDSSU,REGION=0M PARM='TYPRUN=NORUN'              
//SYSPRINT DD SYSOUT=*                                            
//SYSIN DD *                                                      
 COPY  -                                                          
    DATASET(INCLUDE(COLIN.ENCR.DSN))       -               
    REPLACE  -                                                    
    RENUNC(ADCD )                                                 
/*                                                                
//S1  EXEC PGM=IKJEFT01,REGION=0M                                 
//SYSPRINT DD SYSOUT=*                                            
//SYSTSPRT DD SYSOUT=*                                            
//SYSTSIN DD *                                                    
LISTCAT ENTRIES('ADCD.ENCR.DSN') ALL                              
/*

The userid (COLIN) that ran this job had permission to read the data set, and had access to the key.

The output data set has

ENCRYPTIONDATA                      
  DATA SET ENCRYPTION----(YES)      
  DATA SET KEY LABEL-----COLINBATCHAES

The data set has been copied encrypted with the same key as the original data set.

You can print the encrypted data in the data set using the DFDSS (ADRDSSU) PRINT DATASET(..) command.

Configuring sshd server on z/OS

SSH is Secure SHell. It allows you to securely logon to a remote Unix-like shell using OpenSSH.

SSH has little in common with SSL or TLS. For example you cannot keep “certificates” in z/OS keyrings. (The documentation says you can – but it is talking about something else).

SSH uses a different protocol and certificate format to TLS – you cannot use a TLS certificate for SSH encryption and authentication because they have different formats.

The IBM documentation for sshd starts here.

To connect to a server, the server needs to be running a daemon.

I’ve written a blog post on using a client to connect to SSH.

Setting up the SSH Daemon

The SSH daemon runs by default as started task SSHD. I changed the PARM in the JCL to be

//SSHD    PROC 
//SSHD    EXEC PGM=BPXBATCH,REGION=0M,TIME=NOLIMIT, 
//             PARM='PGM /usr/sbin/sshd -f /etc/ssh/sshd_config ' 
//*            PARM='PGM /bin/sh -c /etc/ssh/sshd.sh' 
//* STDIN AND STDOUT ARE BOTH DEFAULTED TO /dev/null 
//STDERR DD PATH='/tmp/sshd.stderr',PATHOPTS=(OWRONLY,OCREAT,OAPPEND), 
//         PATHMODE=(SIRWXU) 
//STDOUT DD PATH='/tmp/sshd.stdout',PATHOPTS=(OWRONLY,OCREAT,OAPPEND), 
//         PATHMODE=(SIRWXU)

The original PARM statement attaches the daemon as SSHD3. With my way, the started task is SSHD.

With the original PARM, the WLM classification came up as Workload SERVERS, SvrClass SRVOMVS; with my change the WLM classification was Workload STARTED, SvrClass STCLOM.

General setup

You can specify attributes that apply to all logons, and use the Match statement to specify attributes which apply to a subset of logons. For example match on server userid, or match on client IP address.

Start the Daemon

S SSHD

Stop the Daemon

Either cancel SSHD, or cancel SSHD3, depending on how you started it. It may not respond to the Stop command (P SSHD).

Basic configuration

You can display a logon message using

Banner /S0W1/var/log/banner.txt

You can specify a command that runs when the user logs on.

 ForceCommand  echo "HI ADCDA"

Listen address and port

You can specify

Port 22
Port 222
ListenAddress host
ListenAddress host:port

How to authenticate

AuthenticationMethods publickey,password publickey,keyboard-interactive
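As described in the OpenSSH sshd_config documentation, the value is one or more space-separated lists of comma-separated method names, and a logon succeeds once every method in at least one list has been completed, in the order given. The sketch below models that rule. It is an illustration rather than sshd's own code, the function names are hypothetical, and the method is written with the OpenSSH spelling keyboard-interactive.

```python
def parse_authentication_methods(value):
    """Turn the option value into a list of method lists."""
    return [entry.split(",") for entry in value.split()]


def next_methods(method_lists, completed):
    """Methods that could be attempted next, given those completed so far (in order)."""
    offered = []
    for methods in method_lists:
        # A list is still in play only if what was completed is its prefix.
        if methods[:len(completed)] == completed and len(methods) > len(completed):
            candidate = methods[len(completed)]
            if candidate not in offered:
                offered.append(candidate)
    return offered


def is_authenticated(method_lists, completed):
    """True once every method of at least one list has been completed in order."""
    return completed in method_lists


lists = parse_authentication_methods(
    "publickey,password publickey,keyboard-interactive")
print(next_methods(lists, []))
print(next_methods(lists, ["publickey"]))
print(is_authenticated(lists, ["publickey", "password"]))
```

With this setting, public key authentication has to succeed first, and only then is password or keyboard-interactive authentication attempted.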

Limit/allow userids or groups

AllowGroups  sys1
DenyGroups   OTHERS
AllowUsers   ADCDA ADCDB
DenyUsers    ADCDC ADCDC

Examples of match

If a keyword appears in several Match blocks that all apply to a logon, only the first instance of the keyword is used.

Match user  ADCDA 
    AuthenticationMethods  publickey 
    Banner /S0W1/var/log/banner.txt 
#   ForceCommand  echo "HI ADCDA" 
Match Address 10.1.0.3 
    AuthenticationMethods  publickey 
    Banner /S0W1/var/log/banner.txt 

Match Address 10.1.0.2 
    AllowUsers IBMUSER
    AuthenticationMethods  password 
    Banner /S0W1/var/log/banner.txt2 
#   ForceCommand  echo "HI 10.1.0.2 IBMUSER"
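The way these blocks combine can be sketched as follows. This follows the OpenSSH sshd_config documentation as I understand it: every Match block whose criteria are all satisfied is applied on top of the global settings, and if a keyword appears in more than one applicable block, the first instance is used. The function name and data layout are hypothetical, and criteria are compared as exact strings, whereas real sshd also accepts patterns and address ranges.

```python
def effective_settings(global_settings, match_blocks, connection):
    """Merge global settings with every Match block that applies to a connection.

    match_blocks is a list of (criteria, settings) pairs in file order.
    """
    effective = dict(global_settings)
    already_set = set()
    for criteria, settings in match_blocks:
        # A block applies only if all of its criteria are satisfied.
        if all(connection.get(key) == value for key, value in criteria.items()):
            for keyword, value in settings.items():
                # Only the first instance of a keyword in applicable blocks is used.
                if keyword not in already_set:
                    effective[keyword] = value
                    already_set.add(keyword)
    return effective


# The three Match blocks from the example above.
blocks = [
    ({"user": "ADCDA"}, {"AuthenticationMethods": "publickey",
                         "Banner": "/S0W1/var/log/banner.txt"}),
    ({"address": "10.1.0.3"}, {"AuthenticationMethods": "publickey",
                               "Banner": "/S0W1/var/log/banner.txt"}),
    ({"address": "10.1.0.2"}, {"AllowUsers": "IBMUSER",
                               "AuthenticationMethods": "password",
                               "Banner": "/S0W1/var/log/banner.txt2"}),
]
print(effective_settings({"Port": "22"}, blocks,
                         {"user": "ADCDA", "address": "10.1.0.2"}))
```

For userid ADCDA connecting from 10.1.0.2, both the first and the third block apply: AuthenticationMethods and Banner come from the first block, and only AllowUsers is picked up from the third.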

Debugging startup problems

The SSHD server writes to syslogd. Check the SYSLOGD daemon is active.

Look at the config file for

Problems

I got message

EZYFT16E accept error : EDC5122I Input/output error. (errno2=0x74687308)

The Unix command BPXMTEXT 74687308 gave

JrNoDuAvailable: TCP/IP cannot create a dispatchable unit to process the request. Either TCP/IP is not active or there is insufficient common storage available.

I think the error message means the port is in use, so SSHD was unable to use it. Check /S0W1/etc/ssh/sshd_config and find the port. It defaults to 22. Check to see if this port is active

TSO NETSTAT allcon (port 22
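If Python is available, a similar check can be made from any machine that can reach the server. This is a minimal sketch that simply tries to open a TCP connection. The function name is hypothetical, and unlike NETSTAT it only tells you that something is listening on the port, not which job owns it.

```python
import socket


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port is accepted."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 127.0.0.1 and port 22 are placeholders; use your own host and port.
    print(is_listening("127.0.0.1", 22))
```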