The z/OS Health Checker is a great facility that makes the systems programmer’s job much easier. z/OS provides a set of configuration guidelines, such as the value for … should be …. At IPL, and periodically afterwards, Health Checker checks the system against these guidelines and reports anything which is out of line. This lets you confirm that your configuration is consistent with best practice, and may identify problems you were not aware of.
For example, it reported:
- I had some digital certificates which were about to expire, or had already expired – whoops.
- Some OMVS mounts had failed (because the entries in the BPXPRM… file were not active on the system)
- Some storage allocations were not as recommended.
When I printed out the full report, it told me what the recommended values were, and what values I had in my configuration, so it was easy to change.
There are different sorts of checks:
- local – these run in the Health Checker address space
- remote – these run in other address spaces and report into the Health Checker
- Rexx – these are written as REXX execs, run in a System REXX address space, and report to the Health Checker
You can print out the full list of problems, and this comes with comprehensive help information and instructions on what to do about the problem.
Example output in syslog
HZS0001I CHECK(IBMCS,CSVTAM_CSM_STG_LIMIT): 442
ISTH017E Communications storage manager (CSM) storage allocation
definitions might not be optimal
HZS0002E CHECK(IBMRACF,RACF_JESJOBS_ACTIVE): 443
IRRH229E The class JESJOBS is not active.
HZS0001I CHECK(IBMOCE,OCE_XTIOT_CHECK): 444
IECH0101E OPEN macro support for XTIOT, uncaptured UCBs and DSAB
above the line is not enabled for non-VSAM. IBM recommends setting
NON_VSAM_XTIOT=YES in the DEVSUPxx member of PARMLIB.
HZS0001I CHECK(IBMRACF,RACF_PASSWORD_CONTROLS): 445
IRRH283E The RACF_PASSWORD_CONTROLS check found an exception
with one or more password control settings.
HZS0002E CHECK(IBMXCF,XCF_TCLASS_CLASSLEN): 446
IXCH0420E The XCF transport class size segregation configuration on
system S0W1 is inconsistent with the owner specification.
You can disable health checks which you do not want, so after cleaning up your system you should aim to have no health check exceptions.
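If you want a quick summary of which checks raised exceptions, the syslog header lines shown above are regular enough to pick out with a short script. The following is only a sketch: it assumes you have already copied the relevant syslog lines into a list of strings, and the names `HEADER`, `summarize_syslog` and `SAMPLE` are made up for this illustration.

```python
import re
from collections import defaultdict

# Header lines look like:
#   HZS0002E CHECK(IBMRACF,RACF_JESJOBS_ACTIVE): 443
# (assumption: the trailing number is ignored by this sketch)
HEADER = re.compile(r"^(HZS\d{4}[A-Z])\s+CHECK\(([^,]+),([^)]+)\):")


def summarize_syslog(lines):
    """Group check headers by check owner.

    Returns {owner: [(hzs_message_id, check_name), ...]}.
    Lines that are not HZS check headers are skipped.
    """
    by_owner = defaultdict(list)
    for line in lines:
        match = HEADER.match(line.strip())
        if match:
            msg_id, owner, name = match.groups()
            by_owner[owner].append((msg_id, name))
    return dict(by_owner)


# A few lines taken from the syslog output above.
SAMPLE = """\
HZS0001I CHECK(IBMCS,CSVTAM_CSM_STG_LIMIT): 442
ISTH017E Communications storage manager (CSM) storage allocation
definitions might not be optimal
HZS0002E CHECK(IBMRACF,RACF_JESJOBS_ACTIVE): 443
IRRH229E The class JESJOBS is not active.
HZS0001I CHECK(IBMRACF,RACF_PASSWORD_CONTROLS): 445
""".splitlines()

if __name__ == "__main__":
    for owner, checks in summarize_syslog(SAMPLE).items():
        print(owner, checks)
```

Grouping by owner (IBMRACF, IBMCS and so on) makes it easier to hand each set of exceptions to whoever looks after that component.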
What do these mean?
To see the detail behind each exception, you can run a print job using the HZSPRNT program
//IBMHZS JOB 1,MSGCLASS=H
//HZSPRINT EXEC PGM=HZSPRNT,TIME=1440,REGION=0M,PARMDD=SYSIN
//SYSIN DD *
CHECK(*,*)
,EXCEPTIONS
//SYSOUT DD SYSOUT=*,DCB=(LRECL=256)
Example output of print job
Certificates Expiring within 60 Days
CHECK(IBMRACF,RACF_CERTIFICATE_EXPIRATION)
SYSPLEX: ADCDPL SYSTEM: S0W1
START TIME: 01/19/2024 07:14:39.529686
CHECK DATE: 20111010 CHECK SEVERITY: MEDIUM
Certificates Expiring within 60 Days
S Cert Owner   Certificate Label                End Date   Trust Rings
- ------------ -------------------------------- ---------- ----- -----
  CERTAUTH     Verisign Class 1 Individual CA   2008-05-12 No    0
E ID(START1)   JES2 CLIENT EDS                  2019-03-21 Yes   1
  CERTAUTH     GTE CyberTrust Root CA           2006-02-23 No    0
...
Only certificates that are marked as trusted result in exceptions.
Exceptions are indicated by an "E" or an "M" in the "S" (Status)
column. An "E" indicates that the certificate has expired within
time period examined by the check. An "M" indicates that the
certificate has no end date in the certificate profile. The trust
status of the certificate is shown in the "Trust" column. The number
of key rings to which the certificate is connected (other than the
virtual key ring) is shown in the "Rings" column. A value of "99999"
in the "Rings" column indicates that the certificate is connected to
99999 or more rings.
Use the RACDCERT LIST command to list complete information about any
certificate. The RACDCERT command syntax is:
RACDCERT CERTAUTH LIST(LABEL('label-name'))
or
RACDCERT SITE LIST(LABEL('label-name'))
or
RACDCERT ID(user-id) LIST(LABEL('label-name'))
...
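The certificate table above can be scanned for flagged rows with a small script. This is a rough sketch rather than a robust parser: it assumes each row has an end date in YYYY-MM-DD form, so rows with status "M" (no end date) may not match, and the names `ROW` and `flagged_certificates` are invented for the example.

```python
import re

# One table row: optional status flag, owner, label, end date, trust, rings.
# Assumption: the owner contains no blanks, and the label is everything
# between the owner and the end date.
ROW = re.compile(
    r"^(?:(?P<status>[EM])\s+)?"
    r"(?P<owner>\S+)\s+"
    r"(?P<label>.+?)\s+"
    r"(?P<end>\d{4}-\d{2}-\d{2})\s+"
    r"(?P<trust>Yes|No)\s+"
    r"(?P<rings>\d+)$"
)


def flagged_certificates(lines):
    """Return a dict for every row whose status column is E or M."""
    flagged = []
    for line in lines:
        match = ROW.match(line.strip())
        if match and match.group("status"):
            flagged.append(match.groupdict())
    return flagged


# Rows taken from the report above.
REPORT = [
    "S Cert Owner Certificate Label End Date Trust Rings",
    "CERTAUTH Verisign Class 1 Individual CA 2008-05-12 No 0",
    "E ID(START1) JES2 CLIENT EDS 2019-03-21 Yes 1",
    "CERTAUTH GTE CyberTrust Root CA 2006-02-23 No 0",
]

if __name__ == "__main__":
    for cert in flagged_certificates(REPORT):
        print(cert["owner"], cert["label"], cert["end"])
```

The owner and label it prints are the values you would then plug into the RACDCERT LIST command shown in the report.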
BPXH061E One or more file systems specified in the BPXPRMxx parmlib
members are not mounted.
* High Severity Exception *
BPXH059I The following file systems are not active:
-----------------------------------------------------------
File System: ZWE200.ZFS
Parmlib Member: BPXPRMZW
Path: /usr/lpp/zowe
Return Code: 00000099
Reason Code: EF096150
File System: ZWE200.CONFIG.ZFS
Parmlib Member: BPXPRMZW
Path: /apps/zowe/v20
Return Code: 00000099
Reason Code: EF096150
Whoops – I missed that one due to a finger problem
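With more than a couple of failed mounts, it can help to turn the BPXH059I listing above into records you can sort or filter. The sketch below assumes every entry starts with a "File System:" line followed by "Name: value" lines, as in the output above. `FIELDS` and `parse_unmounted` are hypothetical names for this illustration.

```python
# Field names as they appear in the BPXH059I listing above.
FIELDS = {"File System", "Parmlib Member", "Path", "Return Code", "Reason Code"}


def parse_unmounted(lines):
    """Turn the 'Name: value' lines into one dict per file system."""
    entries = []
    current = None
    for line in lines:
        key, sep, value = line.strip().partition(":")
        if not sep:
            continue  # no colon, for example the row of dashes
        key, value = key.strip(), value.strip()
        if key == "File System":
            # A new entry starts at each "File System:" line.
            current = {key: value}
            entries.append(current)
        elif current is not None and key in FIELDS:
            current[key] = value
    return entries


# Lines taken from the report above.
LISTING = """\
BPXH059I The following file systems are not active:
-----------------------------------------------------------
File System: ZWE200.ZFS
Parmlib Member: BPXPRMZW
Path: /usr/lpp/zowe
Return Code: 00000099
Reason Code: EF096150
File System: ZWE200.CONFIG.ZFS
Parmlib Member: BPXPRMZW
Path: /apps/zowe/v20
Return Code: 00000099
Reason Code: EF096150
""".splitlines()

if __name__ == "__main__":
    for entry in parse_unmounted(LISTING):
        print(entry["File System"], entry["Path"], entry["Reason Code"])
```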
CSFH0042I Check for weak CCA cryptographic keys in the PKDS
CHECK(IBMICSF,ICSF_WEAK_CCA_KEYS)
SYSPLEX: ADCDPL SYSTEM: S0W1
START TIME: 01/19/2024 07:15:00.161074
CHECK DATE: 20181101 CHECK SEVERITY: LOW
CSFH0042I Check for weak CCA cryptographic keys in the PKDS
Active PKDS: CSF.CSFPKDS.NEW
---------------------------------------------------------
COLIN
COLIN2
* Low Severity Exception *
CSFH0044E Weak CCA cryptographic keys in the PKDS were found.
....
EZBH008E The port range defined for CINET use has not been reserved for
OMVS on this stack.
CHECK(IBMCS,CSTCP_CINET_PORTRNG_RSV_TCPIP)
SYSPLEX: ADCDPL SYSTEM: S0W1
START TIME: 01/19/2024 07:14:59.665575
CHECK DATE: 20070901 CHECK SEVERITY: MEDIUM
* Medium Severity Exception *
EZBH008E The port range defined for CINET use has not been reserved for
OMVS on this stack.
Explanation: The port range defined for CINET use in the BPXPRMxx
parmlib member is not reserved for OMVS on this stack.
...
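To decide what to fix first, you can list the exceptions in the printed report by severity. This is a minimal sketch which assumes each check's section has a CHECK(owner,name) line before its "* ... Severity Exception *" line, as in the sections above. Where that header is missing, the owner and name come back as None. The names `exceptions_by_check`, `CHECK_LINE` and `SEVERITY_LINE` are made up for the example.

```python
import re

CHECK_LINE = re.compile(r"^CHECK\(([^,]+),([^)]+)\)$")
SEVERITY_LINE = re.compile(r"^\*\s*(Low|Medium|High)\s+Severity Exception\s*\*$")


def exceptions_by_check(lines):
    """Return a list of (owner, name, severity) for each exception found."""
    found = []
    owner = name = None
    for raw in lines:
        line = raw.strip()
        check = CHECK_LINE.match(line)
        if check:
            # Remember which check the following lines belong to.
            owner, name = check.groups()
            continue
        severity = SEVERITY_LINE.match(line)
        if severity:
            found.append((owner, name, severity.group(1)))
    return found


# Lines taken from the report above.
REPORT = """\
CHECK(IBMICSF,ICSF_WEAK_CCA_KEYS)
SYSPLEX: ADCDPL SYSTEM: S0W1
CHECK DATE: 20181101 CHECK SEVERITY: LOW
* Low Severity Exception *
CSFH0044E Weak CCA cryptographic keys in the PKDS were found.
CHECK(IBMCS,CSTCP_CINET_PORTRNG_RSV_TCPIP)
CHECK DATE: 20070901 CHECK SEVERITY: MEDIUM
* Medium Severity Exception *
""".splitlines()

if __name__ == "__main__":
    order = {"High": 0, "Medium": 1, "Low": 2}
    for item in sorted(exceptions_by_check(REPORT), key=lambda e: order[e[2]]):
        print(item)
```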