Collecting and understanding a RACF GTF trace output

A RACF request can be

  • A callable service, such as IRRSDL00, which is used to extract information about keyrings. Callable services can be used from high-level languages. These have a trace type of OMVS.
  • A RACROUTE request, an assembler macro interface, such as FASTAUTH. These have a trace type of RACF.

You get a trace entry PRE the call, and a trace entry POST the call. A callable service would have an OMVSPRE and an OMVSPOST entry.

When setting up your RACF trace you need to define which callable service, or which RACROUTE requests you want to trace.

Example of a callable service

Entry trace

Trace Identifier:             00000036 
Record Eyecatcher: RTRACE
Trace Type: OMVSPRE
Ending Sequence: ........
Calling address: 00000000 20801A0F
Requestor/Subsystem: ........ ........
Primary jobname: IBMINC5
Primary asid: 00000029
Primary ACEEP: 00000000 008FA8A8
Home jobname: IBMINC5
Home asid: 00000029
Home ACEEP: 00000000 008FA8A8
Task address: 00000000 008D6A88
Task ACEEP: 00000000 00000000
Time: DFF36A27 5460CC00
Error class: ........
Service number: 00000029
RACF Return code: 00000000
RACF Reason code: 01000000
Return area address: 00000000 00000000
Parameter count: 00000021

Interesting information

  • Trace Type: OMVSPRE. This is an entry trace.
  • Primary jobname: IBMINC5
  • Home jobname: IBMINC5
  • The RACF return and reason codes have no meaning on entry.
  • Service number: 00000029 is in hex, so the service is 41 decimal. The RACF commands documentation says that, for a callable service, 41 is IRRSDL00.
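
The trace shows the service number in hex, while the documentation lists it in decimal. A minimal Python sketch of the conversion follows. The lookup table and the function name are my own illustration and contain only the two numbers mentioned in this post; the full tables are in the RACF documentation.

```python
# Only the two service numbers mentioned in this post; not a complete table.
KNOWN_SERVICES = {
    ("OMVS", 41): "IRRSDL00",   # callable service
    ("RACF", 2): "FASTAUTH",    # RACROUTE request
}

def describe_service(trace_type: str, service_hex: str) -> str:
    """Convert a hex service number from the trace to decimal and look it up."""
    number = int(service_hex, 16)
    family = trace_type[:4]     # OMVSPRE/OMVSPOST -> OMVS, RACFPRE/RACFPOST -> RACF
    name = KNOWN_SERVICES.get((family, number), "unknown (check the documentation)")
    return f"{family} service {number}: {name}"

print(describe_service("OMVSPRE", "00000029"))
```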

Exit trace

Trace Identifier:             00000036 
Record Eyecatcher: RTRACE
Trace Type: OMVSPOST
Ending Sequence: ........
Calling address: 00000000 20801A0F
Requestor/Subsystem: ........ ........
Primary jobname: IBMINC5
Primary asid: 00000029
Primary ACEEP: 00000000 008FA8A8
Home jobname: IBMINC5
Home asid: 00000029
Home ACEEP: 00000000 008FA8A8
Task address: 00000000 008D6A88
Task ACEEP: 00000000 00000000
Time: DFF36A27 5468DC40
Error class: ........
Service number: 00000029
RACF Return code: 00000008
RACF Reason code: 00000014
Return area address: 00000000 00000000
Parameter count: 00000021

Interesting fields

  • Trace Type: OMVSPOST. This is the exit trace.
  • As above, service number 41 is IRRSDL00.
  • The RACF return code is 8.
  • The RACF reason code is hex 14, which is 20 decimal.
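
Each line of a formatted record is a "name: value" pair, so it is easy to pick the fields out with a script. This is a small sketch, assuming the record has been copied into a string; the function name is hypothetical. The codes are converted from hex because the trace displays them in hex.

```python
def parse_record(text: str) -> dict:
    """Split each 'name: value' line of a formatted trace record into a dict."""
    fields = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            fields[name.strip()] = value.strip()
    return fields

sample = """\
Trace Type: OMVSPOST
Service number: 00000029
RACF Return code: 00000008
RACF Reason code: 00000014
"""

record = parse_record(sample)
return_code = int(record["RACF Return code"], 16)
reason_code = int(record["RACF Reason code"], 16)
print(record["Trace Type"], return_code, reason_code)
```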

Example of a RACROUTE trace

Entry

Trace Identifier:             00000036 
Record Eyecatcher: RTRACE
Trace Type: RACFPRE
Ending Sequence: ........
Calling address: 00000000 A096A08A
Requestor/Subsystem: CRYPTO CRYPTO
Primary jobname: CSF
Primary asid: 00000038
Primary ACEEP: 00000000 008F6B98
Home jobname: IBMEXP
Home asid: 00000029
Home ACEEP: 00000000 008FA8A8
Task address: 00000000 008D6A88
Task ACEEP: 00000000 00000000
Time: DFF37DD1 AB7FF040
Error class: ........
Service number: 00000002
RACF Return code: 00000000
RACF Reason code: 00000000
Return area address: 00000000 00000000
Parameter count: 00000009

This trace entry was from a batch job IBMEXP, using an ICSF service to access an encryption key.

Interesting fields

  • RACFPRE: this is the trace entry written before a RACROUTE request.
  • Requestor/Subsystem: CRYPTO shows which z/OS component made the request.
  • Primary jobname: CSF. This was the address space which issued the RACROUTE request.
  • Home jobname: IBMEXP. This was the job requesting an ICSF service.
  • Service number: 2. The RACF commands documentation says that, for a RACROUTE service, 2 is FASTAUTH.
  • Ignore the RACF return and reason codes. They are set on exit.

Exit

Trace Identifier:             00000036 
Record Eyecatcher: RTRACE
Trace Type: RACFPOST
Ending Sequence: ........
Calling address: 00000000 A096A08A
Requestor/Subsystem: CRYPTO CRYPTO
Primary jobname: CSF
Primary asid: 00000038
Primary ACEEP: 00000000 008F6B98
Home jobname: IBMEXP
Home asid: 00000029
Home ACEEP: 00000000 008FA8A8
Task address: 00000000 008D6A88
Task ACEEP: 00000000 00000000
Time: DFF37DD1 AB85DE40
Error class: ........
Service number: 00000002
RACF Return code: 00000008
RACF Reason code: 00000000
Return area address: 00000000 00000000
Parameter count: 00000009

This trace entry was from a batch job IBMEXP, using an ICSF service to access an encryption key.

Interesting fields

  • RACFPOST: this is the trace entry written after a RACROUTE request.
  • Requestor/Subsystem: CRYPTO shows which z/OS component made the request.
  • Primary jobname: CSF. This was the address space which issued the RACROUTE request.
  • Home jobname: IBMEXP. This was the job requesting an ICSF service.
  • Service number: 2. The RACF commands documentation says that, for a RACROUTE service, 2 is FASTAUTH.
  • RACF Return code: 00000008, RACF Reason code: 00000000. This shows there was a problem. The documentation for FASTAUTH shows
    • RC=8 -> the user or group is not authorized to use the resource.
    • RS=0 -> The invoker does not need to log the request.
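
The PRE and POST records also let you estimate how long the request took. The sketch below assumes that the Time field is a 64-bit TOD clock (STCK) value, in which bit 51 ticks once per microsecond; that interpretation is my assumption and is not stated in the trace itself. The subtraction is done on integers first to avoid losing precision.

```python
def tod_units(time_field: str) -> int:
    """Turn a 'DFF37DD1 AB7FF040' style field into a 64-bit integer."""
    return int(time_field.replace(" ", ""), 16)

def elapsed_microseconds(pre_time: str, post_time: str) -> float:
    """Assumes TOD format: bit 51 is 1 microsecond, so 4096 units = 1 microsecond."""
    return (tod_units(post_time) - tod_units(pre_time)) / 4096

# Time fields from the RACFPRE and RACFPOST records above.
elapsed = elapsed_microseconds("DFF37DD1 AB7FF040", "DFF37DD1 AB85DE40")
print(f"About {elapsed} microseconds between RACFPRE and RACFPOST")
```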

The RACF trace command for this was

#SET TRACE(JOBNAME(csf),callable(none),RACROUTE(type(2)))

and I started GTF with the following commands

S GTF.GTF
R 1,trace=usrp
R 2,USR=(F44)
R 3,END
R 4,U

Fast start GTF

I can use

S GTF.GTF,M=GTFRACF

Where my GTF proc has

//GTFNEW  PROC M=GTFPARM,ID=SYS1 
//DELETE EXEC PGM=IEFBR14
//IEFRDER DD DSNAME=&ID..TRACE,UNIT=SYSDA,SPACE=(TRK,20),
// DISP=(MOD,DELETE)
//IEFPROC EXEC PGM=AHLGTF,PARM='MODE=EXT,DEBUG=NO,TIME=YES,NP',
// TIME=1440,REGION=2880K
//IEFRDER DD DSNAME=&ID..TRACE,UNIT=SYSDA,SPACE=(TRK,20),
// DISP=(NEW,KEEP)
//SYSLIB DD DSNAME=USER.Z24C.PROCLIB(&M),DISP=SHR

I have a member in USER.Z24C.PROCLIB(GTFRACF)

TRACE=USRP 
USR=(F44)
END

Debugging the “you do not have access to something, but I’m not telling you what” problem

The problem: I had the message

SSL Handshake Failed, ICSF error. Review ‘RACF CSFSERV Resource Requirements’ of the z/OS documentation.
Reason: The webservers userid does not have access to CSFSERV resource classes required for SSL.

But it does not tell me what it does not have access to.

When an application tries to access a resource, and the userid is not authorised to that resource, RACF can produce an error message, which tells you the resource name.

If the application asks “does this application have access to this resource”, then RACF produces no error message, and it is up to the application to provide a sensible and useful message.

Collecting a RACF trace

You can use a command

#SET TRACE(CLASS(CSFSERV),RACROUTE(ALL))

to turn on the trace for that class. You can also use USERID(…) and jobname(…) to further restrict what is traced.

The output goes to GTF.

s gtf.gtf
01 AHL125A  RESPECIFY TRACE OPTIONS OR REPLY U 
 1,trace=usrp                                                                                 
IEE600I REPLY TO 01 IS;TRACE=USRP                                                              
    09.53.19 STC00315  TRACE=USRP                                                                                     
02 AHL101A  SPECIFY TRACE EVENT KEYWORDS --USR=                                                
  - 09.53.27           r 2,usr=(F44),end                                                                              
    09.53.27 STC00315  IEE600I REPLY TO 02 IS;USR=(F44),END                                                           
    09.53.27 STC00315  USR=(F44),END                                                                                  
    09.53.27 STC00315  AHL103I  TRACE OPTIONS SELECTED --USR=(F44)                                                    
  | 09.53.27 STC00315 *03 AHL125A  RESPECIFY TRACE OPTIONS OR REPLY U                                                 
00- 09.53.30           r 3,u                                                                                          
    09.53.30 STC00315  IEE600I REPLY TO 03 IS;U                                                                       
    09.53.30 STC00315  U                                                                                              

Run your work.

P GTF
AHL006I GTF ACKNOWLEDGES STOP COMMAND
AHL904I THE FOLLOWING TRACE DATASETS CONTAIN TRACE DATA :
SYS1.TRACE

If you do not get “THE FOLLOWING TRACE DATASETS CONTAIN TRACE DATA…”, it means you did not collect any data.

Use IPCS to look at it

  • =0 and specify the trace data set name.
  • If you change the scope to BOTH, it will remember the data for next time.
  • =6 to get to the IPCS Subcommand Entry panel.
  • If this is the first time you have used this instance of the data set, issue the dropd command to get IPCS to forget about previous usage.
  • gtf usr(all) displays the data.
  • You can process this:
    • Type M and press PF8 to get to the bottom of the data.
    • report view displays the data in ISPF edit (view mode).
    • You can now issue commands like
    • x all
    • f code all to look for non-zero return codes.
    • del all x
    • sort 30 50 to display all return codes in numerical order. You need to look at the top and the bottom.
    • Make a note of the return code (copy the line to your clipboard).
    • quit
    • report view
    • find return code

To get rid of some of the forest of unhelpful data

  • x all
  • find ' ' 1 20 all
  • find '+' 1 2 all
  • delete all nx

#SET TRACE(NOCLASS,RACROUTE(ALL))
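
If you save the formatted report to a file, the search for non-zero return codes can be scripted. This is a minimal sketch, not the method I used above. It assumes every record starts with a "Trace Identifier:" line and that the field names are as shown in the examples; the function names are hypothetical.

```python
def split_records(trace_text: str) -> list:
    """Start a new record at every 'Trace Identifier:' line."""
    records, current = [], []
    for line in trace_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("Trace Identifier:") and current:
            records.append(current)
            current = []
        current.append(stripped)
    if current:
        records.append(current)
    return records

def field(record: list, name: str) -> str:
    """Return the value of the first 'name: value' line, or '' if absent."""
    for line in record:
        if line.startswith(name + ":"):
            return line.split(":", 1)[1].strip()
    return ""

def failing_records(trace_text: str) -> list:
    """Keep the POST records whose RACF return code is not zero."""
    hits = []
    for record in split_records(trace_text):
        if field(record, "Trace Type").endswith("POST"):
            rc = int(field(record, "RACF Return code") or "0", 16)
            if rc != 0:
                hits.append((field(record, "Home jobname"), rc))
    return hits

sample = """\
Trace Identifier:             00000036
Trace Type: RACFPRE
Home jobname: IBMEXP
RACF Return code: 00000000
Trace Identifier:             00000036
Trace Type: RACFPOST
Home jobname: IBMEXP
RACF Return code: 00000008
"""
print(failing_records(sample))
```

Only POST records are checked, because the return and reason codes have no meaning in the PRE records.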

How do I test my RACF updates or recover from a boo-boo?

I am running z/OS on zPDT on my Linux server. I want to test out some RACF commands, but want to be able to recover if I get it wrong. What can I do?

The zSecure Admin product offers RACF Offline, which can be used by those who have the license.

I do not have this product so I need a different way.

I am running a single user z/OS image. For anything more complex than this, talk to the experts.

Background to RACF

RACF has a database, typically SYS1.RACFDS. For availability you can have a backup RACF database, typically SYS1.RACFDS.BACKUP. Typically these are both updated when changes are made to the RACF environment. I would call this a duplexed dataset.

You can make a copy of a RACF database using the RACF utility IRRUT200. This takes a lock on the database for the duration of the copy and ensures the copy is consistent. If you use other tools to back up the primary database, and there were concurrent database updates, the backup may not be consistent. You risk a partial update, and an inconsistent database.

Changing your RACF

You can use the RVARY command (TSO and operator) to change the RACF data sets. You can set up controls for the RVARY command.

If you use SETROPTS LIST it will list the RACF options being used. Mine has (near the bottom)

DEFAULT RVARY PASSWORD IS IN EFFECT FOR THE SWITCH FUNCTION.
...
DEFAULT RVARY PASSWORD IS IN EFFECT FOR THE STATUS FUNCTION.

In this case the default password is YES.

See SETROPTS RVARY(…)

You will get a prompt

ICH703A ENTER PASSWORD TO SWITCH RACF DATASETS JOB=IBMUSER USER=IBMUSER

If you have an I/O problem with the SYS1.RACFDS database

You can

  • Switch to use the backup, and stop duplex updates.
  • Fix the problem. Perhaps delete the data set and recreate it on a different device.
  • Copy the backup dataset using IRRUT200 to the newly allocated dataset.
  • Activate the newly allocated dataset and reactivate the duplex updates.

Updates to the RACF database are written to SMF 80 records.

If you make mistakes with your commands

Some RACF command changes you can undo, for example, you can use an alter command to set a new value. To be able to undo the change you need to know what the original value was, so it is always a good idea to display a resource before you change or delete it.

If you delete a record there is no easy way of undoing it.

Rob van Hoboken suggested the following.

Plan A – online switch

Display the status

#rvary list

This gave me

ICH15013I RACF DATABASE STATUS:                          
ACTIVE USE NUM VOLUME DATASET
------ --- --- ------ -------
YES PRIM 1 A3CFG1 SYS1.RACFDS
YES BACK 1 A3CFG1 SYS1.RACFDS.BACKUP

This shows both data sets are active, and which is the primary and which is the backup.
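
If you capture this output, for example from the system log, a short script can confirm the state before you make changes. This is a sketch that assumes the five-column layout shown above, with YES or NO in the ACTIVE column; the function name is hypothetical.

```python
def parse_rvary_list(output: str) -> list:
    """Turn the data rows of the ICH15013I output into dictionaries."""
    rows = []
    for line in output.splitlines():
        parts = line.split()
        # Data rows have five columns and start with YES or NO (assumed layout).
        if len(parts) == 5 and parts[0] in ("YES", "NO"):
            active, use, num, volume, dataset = parts
            rows.append({"active": active == "YES", "use": use,
                         "num": int(num), "volume": volume, "dataset": dataset})
    return rows

sample = """\
ICH15013I RACF DATABASE STATUS:
ACTIVE USE NUM VOLUME DATASET
------ --- --- ------ -------
YES PRIM 1 A3CFG1 SYS1.RACFDS
YES BACK 1 A3CFG1 SYS1.RACFDS.BACKUP
"""

rows = parse_rvary_list(sample)
for row in rows:
    print(row["use"], row["dataset"], "active" if row["active"] else "inactive")
```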

Switch off your RACF DUPLEX (backup) data set

#RVARY INACTIVE,DATASET(SYS1.RACFDS.BACKUP)

It prompts with

ICH702A ENTER PASSWORD TO DEACTIVATE RACF JOB=RACF USER=START1

I did not know the password, but I replied r xx,YES and it worked!

Run the commands that you hope will work.

If the commands do not work successfully

If the commands do not work successfully, make the backup active and switch to it.

#RVARY ACTIVE,DATASET(SYS1.RACFDS.BACKUP)
#RVARY SWITCH

See the command RVARY (Change status of RACF database).

After you have switched to the backup, make the primary inactive

#RVARY INACTIVE,DATASET(SYS1.RACFDS)

and copy the backup into the primary using IRRUT200

//IBMUSRAC  JOB 1,MSGCLASS=H                              
//S1 EXEC PGM=IRRUT200,PARM=ACTIVATE
//SYSRACF DD DISP=SHR,DSN=SYS1.RACFDS.BACKUP
//SYSUT1 DD DISP=SHR,DSN=SYS1.RACFDS
//SYSUT2 DD SYSOUT=*
//SYSIN DD *
//SYSPRINT DD SYSOUT=*

First run it without PARM=ACTIVATE. If this works successfully, rerun it with PARM=ACTIVATE. This takes a lock during the copy, and then activates the SYSUT1 data set, so both are now active.

This gave the output

IRR62005I - IDCAMS REPRO copied SYSRACF to the work data set SYSUT1                                         
ICH15013I RACF DATABASE STATUS:
ACTIVE USE NUM VOLUME DATASET
------ --- --- ------ -------
YES PRIM 1 A3CFG1 SYS1.RACFDS.BACKUP
YES BACK 1 A3CFG1 SYS1.RACFDS
ICH15020I RVARY COMMAND HAS FINISHED PROCESSING.

The #RVARY LIST gave, as expected

ICH15013I RACF DATABASE STATUS:                                                  
ACTIVE USE NUM VOLUME DATASET
------ --- --- ------ -------
YES PRIM 1 A3CFG1 SYS1.RACFDS.BACKUP
YES BACK 1 A3CFG1 SYS1.RACFDS

Which shows it has worked.

If the commands do work successfully

If the commands do work successfully, you need to bring the backup up to date with the primary.

//IBMUSRAC  JOB 1,MSGCLASS=H                              
//S1 EXEC PGM=IRRUT200,PARM=ACTIVATE
//SYSRACF DD DISP=SHR,DSN=SYS1.RACFDS
//SYSUT1 DD DISP=SHR,DSN=SYS1.RACFDS.BACKUP
//SYSUT2 DD SYSOUT=*
//SYSIN DD *
//SYSPRINT DD SYSOUT=*

The #rvary active command.

I would be careful about using the #RVARY ACTIVE command. If you copy a RACF data set, then use the RVARY ACTIVE command to make it available, there is a small window where changes made to the active RACF database are not made to the offline one. You could tell people “do not issue any commands while this work is going on”, but the RACF database is updated during normal work, for example with profile use counts, and last used date.

Plan B – REIPL with different data sets.

  • Convert to using IRRPRMxx PARMLIB member definitions to specify the data set names, instead of using a load module. See RACF setup, and changing the RACF dataset
  • Copy the RACF database to a recovery RACF data set using IRRUT200
  • Create a new IRRPRMxx member pointing to your recovery data set.
  • Create a new IEASYSxx member to use the RACF=xx member
  • You may have to create a new LOADxx member in SYS1.IPLPARM to point to the new IEASYSxx member.
  • REIPL, and specify the appropriate LOADxx suffix.
  • Fix the problem; copy from the recovery into the SYS1.RACFDS dataset and SYS1.RACFDS.BACKUP, so the primary and backup are the same.
  • REIPL using the normal IPL parameters.

You could use the emergency stand-alone one pack system SARES1, but this may not have access to all of the data sets due to SMS and catalogs.

RACF setup, and changing the RACF dataset

I used the ADCD-supplied version of z/OS, and needed to change which data set I used, so it could be used on z/OS 3.1 or earlier (but not both at the same time).

What options are active?

You can use the operator command

#rvary

where # is the RACF subsystem character (defined in INITPARM('#') in the IEFSSN definition for RACF).

Before z/OS 2.3

You had to define RACF data set parameters using a load module.

  • The databases are defined in a module ICHRDSNT.
  • If you use SDSF and issue the LOAD ICHRDSNT command, it will display the data.
  • The source is in ADCD.LIB.JCL(ICHRDSNT).
  • If you want to change it, you need to generate the module, and REIPL

Parmlib definitions

In z/OS 2.3 and later you can specify parameters in the parmlib concatenation.

In your IEASYS member you can specify RACF=(xx,yy,zz) and specify up to three parameters. These parameters take precedence over the ICHRDSNT table. I specified RACF=(00)

My equivalent definition in USER.Z25D.PARMLIB(IRRPRM00) is

DATABASE_OPTIONS 
DATASETNAMETABLE
ENTRY
PRIMARYDSN(SYS1.RACFDS)
BACKUPDSN(SYS1.RACFDS.BACKUP)

You can have one primary data set and up to one backup data set. If you want to partition your database by key, you can have more primary and backup data set entries, and you need to specify the key range to data set mapping.

The syntax is defined here.

The datasets have to be cataloged. I have used a data set cataloged in a user catalog; they do not need to be SYS1.xxxx.

If you get it wrong and the member is invalid, at IPL RACF prompts

 *IRRY115I RACF IS NOT USING PARMLIB FOR INITIALIZATION.                     
*01 ICH502A SPECIFY NAME FOR PRIMARY RACF DATA SET SEQUENCE 001 OR 'NONE'
IRA600I SRM CHANNEL DATA NOW AVAILABLE FOR ALL SRM FUNCTIONS
IRA860I HIPERDISPATCH MODE IS NOW ACTIVE
R 01,SYS1.RACFDS
*02 ICH502A SPECIFY NAME FOR BACKUP RACF DATA SET SEQUENCE 001 OR 'NONE'
R 2,SYS1.RACFDS.BACKUP

Changing parmlib

If you change the IEASYS RACF=, or change an IRRPRMxx member, you have to REIPL to pick up the changes. Stopping and restarting RACF does not pick up the changes, because the RACF started task handles operator commands and other admin work; it does not process the data sets. See here.

Copying the RACF database

You can copy a RACF database and use that. For example

//IBMCRACO  JOB 1,MSGCLASS=H 
//STEP EXEC PGM=IRRUT200
//SYSRACF DD DSN=SYS1.RACFDS,DISP=SHR
//SYSUT1 DD UNIT=SYSDA,SPACE=(CYL,(20)),
// DCB=(LRECL=4096,RECFM=F),DSN=COLIN.RACFDS,DISP=(MOD,CATLG)
//SYSUT2 DD SYSOUT=A
//SYSPRINT DD SYSOUT=A
//SYSIN DD *
INDEX
MAP
END
/*

IRRUT200 is the RACF database verification utility program. It can also be used to make an exact copy of a RACF data set. It takes a lock on the RACF database for the duration of the copy to ensure the data is consistent.

You should check the size of the source dataset, and make the copy the same size.

RACF – I cannot delete a certificate!

I made a mistake creating a certificate, and could not delete it, because I got a message

IRRD109I The certificate cannot be added. Profile … is already defined.

In my JCL I had

RACDCERT GENCERT  -                                           
CERTAUTH -
SUBJECTSDN(CN('DocZosCADSA')-
O('COLIN') -
OU('CA')) -
NOTAFTER( DATE(2027-07-02 ))-
KEYUSAGE( CERTSIGN ) -
DSA
SIZE(1024) -
WITHLABEL('DocZosCADSA')

This created a certificate.

I was missing the continuation character (-) after DSA, so the lines after it, SIZE(1024) and importantly WITHLABEL(), were not part of the command.

When I tried to recreate this certificate I got message

IRRD109I The certificate cannot be added. Profile 00.CN=DocZosCADSA.OU=CA.O=COLIN is already defined.

My problem was, what do I delete?

The following command listed all of the certificates owned by CERTAUTH.

RACDCERT certauth LIST

I searched for CN=DocZosCADSA and found it. (Search for the CN=… value, not the whole string.)

Label: LABEL00000002 
...
Subject's Name:
>CN=DocZosCADSA.OU=CA.O=COLIN<
...

Note the label.

The command

RACDCERT CERTAUTH DELETE(LABEL('LABEL00000002'))

deleted the certificate created in error. Problem solved.

Note: For a personal certificate the reported certificate was like

02AB.CN=SSCA256.OU=CA.O=SSS.C=GB

If you use the list command the certificate sequence number 02AB is on a different line to the remainder of the label.
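
To get the search string from the profile name in the IRRD109I message, you can pull out the CN= part. A small sketch follows; the function name is hypothetical, and it assumes the CN value itself does not contain a period, because the profile name uses periods as separators.

```python
def cn_from_profile(profile: str) -> str:
    """Return the CN=... part of a profile name, or '' if there is none."""
    for part in profile.split("."):
        if part.startswith("CN="):
            return part
    return ""

print(cn_from_profile("00.CN=DocZosCADSA.OU=CA.O=COLIN"))
print(cn_from_profile("02AB.CN=SSCA256.OU=CA.O=SSS.C=GB"))
```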

Using RACF callable services including from a 64-bit program

You can use RACF callable services to programmatically get and set RACF information, for example to list and display digital certificates, and objects.

There is a C interface to these services. These interfaces are easy to use as long as you are careful with your data types, and get your compile JCL right. You can use 31-bit and 64-bit mode programs with these services.

JCL to compile a 64 bit program

Below is the JCL I use to compile programs which use gskit and RACF callable services.

//COLINC5    JOB 1,MSGCLASS=H,COND=(4,LE) 
//S1 JCLLIB ORDER=CBC.SCCNPRC
// SET LOADLIB=COLIN.LOAD
//*OMPILE EXEC PROC=EDCCB,
//COMPILE EXEC PROC=EDCQCB,
// LIBPRFX=CEE,
// CPARM='OPTFILE(DD:SYSOPTF),LSEARCH(/usr/include/)',
// BPARM='SIZE=(900K,124K),RENT,LIST,RMODE=ANY,AMODE=64,AC=1'
//COMPILE.SYSOPTF DD *
...
/*
//COMPILE.SYSIN DD DISP=SHR,DSN=COLIN.C.SOURCE(...)
//BIND.SYSLMOD DD DISP=SHR,DSN=COLIN.LOAD
//BIND.OBJLIB DD DISP=SHR,DSN=COLIN.OBJLIB
//BIND.GSK DD DISP=SHR,DSN=SYS1.SIEALNKE
//BIND.CSS DD DISP=SHR,DSN=SYS1.CSSLIB
//BIND.SYSIN DD *
INCLUDE GSK(GSKCMS64)
INCLUDE GSK(GSKSSL64)
INCLUDE CSS(IRRSDL64)

NAME AMSCHE64(R)

Note the 64 bit specific items

  • PROC=EDCQCB
  • RMODE=ANY,AMODE=64
  • The includes of the GSK*64 stubs
  • The include of the 64 bit RACF callable stub IRRSDL64

The 31 bit equivalents are

  • PROC=EDCCB
  • RMODE=ANY,AMODE=31
  • The includes of the GSK*31 stubs: GSKCMS31,GSKSSL
  • The include of the 31 bit RACF callable stub IRRSDL00

The source is specified via //COMPILE.SYSIN

SYSOPTF

For both 64 bit and 31 bit programs

//COMPILE.SYSOPTF DD * 
LIST,SOURCE
aggregate(offsethex) xref
SEARCH(//'COLIN.C.H',//'SYS1.SIEAHDR.H')
TEST
RENT LO
OE
INFO(PAR,USE)
NOMARGINS EXPMAC SHOWINC XREF
LANGLVL(EXTENDED) sscom dll
DEFINE(_ALL_SOURCE)
DEBUG

Skeleton of C program

#pragma linkage(IRRSFA64 ,OS) 

The pragma is needed for the bind operation. It says the module is a z/OS callable service type of module (and not a C program).

#ifdef _LP64 
#include <irrpcomy.h>
#else
#include <irrpcomx.h>
#endif

You need a different copy book for the RACF constants depending on the 31/64 bit mode.

IRRPCOMY contains definitions for 64 bit programs, IRRPCOMX is for 31 bit programs.

char * workarea ; 
workarea = (char *) malloc(1024);
int ALET1= 0;
int parmAlet = 0;
int numParms =11;
short function_code = 1;
int ALET2= 0;
int ALET3= 0;
int SAF_RC = 0;
int RACF_RC = 0;
int RACF_RS = 0;

The variables have to be “int”, not “long”, as they are 4 bytes long. In a 64 bit program, a long is 8 bytes long. See here for a table about the types and lengths in 31 bit and 64 bit programs. A short is 2 bytes long.
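
To see why the width matters, here is an illustration using Python's struct module rather than z/OS C. It is only a sketch of the storage layout, assuming big-endian integers as on z/OS. If an 8-byte value is passed where a 4-byte value is expected, the first four bytes are all zero.

```python
import struct

# "=" selects standard sizes, independent of the machine running Python.
assert struct.calcsize("=h") == 2   # short, for example function_code
assert struct.calcsize("=i") == 4   # int, for example numParms, SAF_RC, RACF_RC
assert struct.calcsize("=q") == 8   # 8-byte integer, the size of a long in a 64 bit program

as_int = struct.pack(">i", 11)      # numParms as a 4-byte int
as_long = struct.pack(">q", 11)     # the same value as an 8-byte long
print(as_int.hex(), as_long.hex())

# Code that expects a 4-byte int reads only the first four bytes of the long.
misread = struct.unpack(">i", as_long[:4])[0]
print(misread)
```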

Set up the parameter list 

The macro IRRPCOM? provides header files for some definitions.

For example

char * pSTC = "AZFTOTP1"; 
char area[1000];

struct fact_getf_plist pl;
pl.fact_getf_options = 0;
pl.fact_getf_factor_length = 8;
pl.fact_getf_factor_a = pSTC;
pl.irrpcomy_dummy_34 = 0;
pl.fact_getf_af_length = sizeof(area);
pl.fact_getf_af_a = & area;

where pl is used below.

Call the function

 IRRSFA64( workarea, // WORKAREA 
&ALET1 , // ALET
&SAF_RC, // SAF RC
&ALET2, // ALET
&RACF_RC,// RACF RC
&ALET3 , // ALET
&RACF_RS,// RACF Reason
&numParms,
&parmAlet, //
&function_code,
&pl );

The irrpcomx has a structure definition for the parameter list, but I could not get it to work in these programs, as it passes the address of the data, instead of the data itself.

How to take (and process) a RACF GTF trace with Java

When trying to resolve a certificate problem in a Java program, see here, I tried unsuccessfully to take a RACF trace to see what calls were being issued, and what reason codes were being returned.

The RACF GTF had no entries for the Java program!

Start RACF trace

My started task was called OZUSVR4. I had to specify a jobname of OZUSVR4* on the RACF trace because Java spawns address spaces, and it was a spawned address space that did all of the Java work. If your started task name is 8 characters long, just specify the 8 character name.

The trace command was the RACF SET TRACE command, where # is my RACF subsystem recognition character.

#SET TRACE(CALLABLE(TYPE(41))JOBNAME(OZUSVR4*))

where TYPE(41) is IRRSDL00, the R_datalib callable service, which performs the keyring processing.
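
The reason for the trailing * can be shown with a small sketch. Python's fnmatch stands in for RACF's jobname matching here, which is my assumption for illustration only. A spawned address space gets a name such as OZUSVR44, which an exact jobname of OZUSVR4 would not match.

```python
from fnmatch import fnmatchcase

def jobname_matches(jobname: str, pattern: str) -> bool:
    """Stand-in for the JOBNAME filter; '*' matches the rest of the name."""
    return fnmatchcase(jobname, pattern)

print(jobname_matches("OZUSVR44", "OZUSVR4*"))   # spawned address space
print(jobname_matches("OZUSVR44", "OZUSVR4"))    # exact name misses it
```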

Start GTF

S GTF.GTF
R 1,trace=usrp
R 2,USR=(F44) 
R 3,END
R 4,U 

Run the test

I ran my started task, and stopped the RACF trace

#SET TRACE(CALLABLE(NONE)JOBNAME(OZUSVR4*))
#set list

The output of the #set list command included

TRACE OPTIONS                   - NOIMAGE                                    
                                - NOAPPC                                     
                                - NOSYSTEMSSL                                
                                - NORRSF                                     
                                - NORACROUTE                                 
                                - NOCALLABLE                                 
                                - NOPDCALLABLE                               
                                - NODATABASE                                 
                                - NOGENERICANCHOR                            
                                - NOASID                                     
                                - JOBNAME                                    
                                   OZUSVR4*                                  
                                - NOCLASS                                    
                                - NOUSERID                                   
SUBSYSTEM USERID                - START1                                     

So the traces are off…. but it still has a reference to OZUSVR4 – strange.

Process the GTF file.

I used IPCS to look at the GTF file

  • =0 and specify the GTF file name
  • =6 dropd to drop any saved status from the last time that data set was used
  • gtf usr(all) displays the output in an editor-like window.
  • report view displays it in the ISPF editor, in view mode.
  • You can then do things like
    • x all
    • f 'RACF Reason code' all

to help find the records with non-zero return codes.

The output is very chatty – and it was hard to find the data I wanted from data with a hex dump of the string “OFFSET” etc. For example

Trace Identifier:             00000036                           
Record Eyecatcher:            RTRACE                             
Trace Type:                   OMVSPRE                            
Ending Sequence:              ........                           
Calling address:              00000000  79403A2D                 
Requestor/Subsystem:          ........  ........                 
Primary jobname:              OZUSVR44                           
Primary asid:                 00000035                           
Primary ACEEP:                00000000  008FC8A0                 
Home jobname:                 OZUSVR44                           
Home asid:                    00000035                           
Home ACEEP:                   00000000  008FC8A0                 
Task address:                 00000000  008CF298                 
Task ACEEP:                   00000000  00000000                 
Time:                         DDD4C11D  776E2A40                 
Error class:                  ........                           
Service number:               00000029                           
RACF Return code:             00000000                           
RACF Reason code:             00000000                           
Return area address:          00000000  00000000                 
Parameter count:              0000002B    
...                       
Area length:                  00000008                                                                                
                                                                                                                      
Area value:                                                                                                  
D6C6C6E2  C5E30050                               | OFFSET.&                         |  
                                                                                                                      
Area length:                  00000007                                                                                
                                                                                                                      
Area value:                                                                                                           
06E2E3C1  D9E3F1                                 | .START1                          |  

I wrote a REXX exec which post processes the output and removes what I think is irrelevant data.

An example of what I think is useful is below. Non zero return codes have ! in column 1

! Return code: 00000008 8 
! Reason code: 00000004 4  4 Parameter list error occurred. 
-  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  - 
! Return code: 00000008 8 
! Reason code: 0000002C 44 44 No certificate found with the specified status 
-  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  - 
Area value: 
00000050  10AFC67C  ...
...
  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  - 
Area value:          | .START1                          | 
06E2E3C1  D9E3F1                                                
Area value:          | .MQRING                          | 
06D4D8D9  C9D5C7                                                

You can download the rexx exec from

You need to upload it to a CLIST available to ISPF.
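
If you process the trace off the mainframe, the hex “Area value” data can be turned back into text. This is a sketch in Python, not the REXX exec. It assumes the data is EBCDIC and uses code page 037, which is enough for letters and digits; the function name is hypothetical.

```python
def decode_area(hex_words: str) -> str:
    """Decode the hex words of an 'Area value' as EBCDIC text (code page 037)."""
    data = bytes.fromhex(hex_words.replace(" ", ""))
    text = data.decode("cp037")
    # Show non-printable characters as '.', the way the dump does.
    return "".join(ch if ch.isprintable() else "." for ch in text)

print(decode_area("06E2E3C1 D9E3F1"))
print(decode_area("D6C6C6E2 C5E30050"))
```

In the first example the leading byte 06 matches the length of the six-character name that follows it.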

Solving certificate problems in Java on z/OS

I spent many an hour trying to understand why z/OSMF was getting a message saying certificate not found in keyring, when it was always there when I checked it.

I tried Java trace options but they did not help. I have my own Java program, and that gave me a message from IRRSDL00 (the callable service to access keyrings). But when I did a RACF GTF trace to see what was going on, I got no entries in the trace. Weird. Once I solved the problems, the solution was obvious.

My Java program reported

java.io.IOException: The private key of NEWTECCTEST is not available or no authority to access the private key

z/OSMF reported

[ERROR ] CWPKI0024E: The NISTECCTEST certificate alias specified by the attribute serverKeyAlias is either not found in KeyStore safkeyring://START1/MQRING or it is invalid.

The problem and the solution

The message “The private key … is not available or no authority to access the private key” has a hint as to the problem. The documentation is hidden away. It was not as bad as

It was on display in the bottom of a locked filing cabinet stuck in a disused lavatory with a sign on the door saying ‘Beware of the Leopard’.

but it is not easy to find. It says

Applications can call the R_datalib callable service (IRRSDL00) to extract the private keys from certain certificates after they have access to the key ring. A private key is returned only when the following conditions are met:

  1. For RACF real key rings:
    • User certificates An application can extract the private key from a user certificate if the following conditions are met:
      • The certificate is connected to the key ring with the PERSONAL usage option.
      • One of the following two conditions is true:
        • The caller’s user ID is the user ID associated with the certificate if the access to the key ring is through the checking on IRR.DIGTCERT.LISTRING in the FACILITY CLASS, or
        • The caller’s user ID has READ or UPDATE authority to the <ringOwner>.<ringName>.LST resource in the RDATALIB class. READ access enables retrieving one’s own private key, UPDATE access enables retrieving other’s.

I had a keyring START1.MQRING and the started task userid had READ access to it. Within the keyring was the certificate NISTECCTEST, owned by userid START1. The started task userid needs UPDATE access to the keyring to be able to access the private key belonging to a different userid.
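
The RDATALIB rule quoted above can be written down as a small sketch. The function names and the userid MYSTCID are hypothetical; this only restates the documented rule and is not how RACF implements the check.

```python
def rdatalib_resource(ring_owner: str, ring_name: str) -> str:
    """Build the <ringOwner>.<ringName>.LST resource name in the RDATALIB class."""
    return f"{ring_owner}.{ring_name}.LST"

def access_needed_for_private_key(caller: str, cert_owner: str) -> str:
    """READ is enough for your own private key; UPDATE is needed for someone else's."""
    return "READ" if caller.upper() == cert_owner.upper() else "UPDATE"

# MYSTCID is a made-up started task userid; START1 owns the certificate.
print(rdatalib_resource("START1", "MQRING"),
      access_needed_for_private_key("MYSTCID", "START1"))
```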

Reasons for “not found” reason code

Under the covers the callable service IRRSDL00 is called. The reason codes are documented here. You might get SAF return code 8, RACF return code 8, RACF reason code 44 for any of the following reasons:

  • The certificate was not in the keyring
  • It was NOTRUST
  • It had expired
  • The CA for the certificate was not in the keyring,
  • The userid did not have update access to the keyring when there are private certificates from other userids.

Certificate dates with ICSF and PKISERV

Digging into certificates provided by ICSF, I got confused with the dates.

There are three places where start/end dates can be held, and they have different meanings and uses.

  1. The validity period you see in the certificate is part of the certificate itself. That is added at certificate creation and enforced by the application (such as System SSL).
  2. Within ICSF, the START DATE/END DATE fields you see in the panels are CKA_START_DATE and CKA_END_DATE. They are defined in the PKCS#11 standards but are not enforced.
  3. Within record metadata for a KDSR format record, you will see Cryptoperiod start date/Cryptoperiod end date. This is enforced by ICSF. Usage outside this time frame is not permitted.

Note that PKCS#11 services are the only place you can see all three of these. Neither CKDS nor PKDS can hold certificates, nor do they support PKCS#11 attributes.

Official document

The standards document PKCS #11 Cryptographic Token Interface Base Specification says

CKA_START_DATE – Start date for the certificate (default empty)
CKA_END_DATE – End date for the certificate (default empty)

Section 4.6.2 (Certificate objects Overview):
The CKA_START_DATE and CKA_END_DATE attributes are for reference only; Cryptoki does not attach any special meaning to them. When present, the application is responsible to set them to values that match the certificate’s encoded “not before” and “not after” fields (if any).

Section 4.7.2 (Key Objects Overview) has similar wording:
Note that the CKA_START_DATE and CKA_END_DATE attributes are for reference only; Cryptoki does not attach any special meaning to them. In particular, it does not restrict usage of a key
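Because the application is responsible for setting these attributes, here is a minimal sketch of producing a value. It assumes the CK_DATE layout as I read it in the PKCS #11 general data types (4 characters of year, 2 of month, 2 of day, as ASCII digits). The helper name to_ck_date and the example date are mine, not from ICSF or the specification.

```python
from datetime import date


def to_ck_date(d):
    """Encode a date as the 8 ASCII bytes YYYYMMDD (assumed CK_DATE layout)."""
    return d.strftime("%Y%m%d").encode("ascii")


# Illustrative "not before" date; a real application would take it from the certificate.
not_before = date(2022, 10, 9)
print(to_ck_date(not_before))
```

Nothing checks this value afterwards, which is the point of the quoted text: it is for reference only.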

Thanks to Eric Rossman, for helping me understand this.

Setting up Linux to z/OS certificates

Several times I have had to set up certificates between Linux and z/OS and struggled for a day to get them working. Once you are familiar with doing it, it is easy. As the last time I needed to do this was over a year ago, I’ve forgotten some of the details. This blog post is to help me remember what I need to do, and to help others who struggle with this.

I’m ignoring self-signed certificates.

Basic TLS

A certificate contains

  • who it belongs to, such as CN=COLIN,O=SSS
  • the date range the certificate is valid
  • a public key
  • meta data about the key: What algorithm does the public key use, what parameters were used in the key generation, for example, algorithm=RSA, Keysize=2048.

There is a private key.

  • If you encrypt using the private key, you can use the public key to decrypt it.
  • If you encrypt using the public key, you can use the private key to decrypt it.
  • If you encrypt something with my public key, and then encrypt it with your private key, I know it came from you (or someone with your private key) and only I (or someone with my private key) can decrypt it.

Anyone can have the public key. You keep the private key secure.
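The two directions can be shown with a toy RSA key. This is a sketch with textbook-sized numbers that are far too small to be secure, and the variable and function names are mine. Elliptic Curve keys, which I use later, work differently in detail, but the public/private idea is the same.

```python
# Toy RSA key pair: illustration only, never use numbers this small.
p, q = 61, 53
n = p * q                          # modulus, part of both keys
e = 17                             # public exponent
d = pow(e, -1, (p - 1) * (q - 1))  # private exponent (needs Python 3.8+)


def apply_key(value, exponent):
    """Encrypting and decrypting are the same operation with a different exponent."""
    return pow(value, exponent, n)


message = 65

# Encrypt with the public key, decrypt with the private key.
secret = apply_key(message, e)
print(apply_key(secret, d) == message)

# Encrypt with the private key, decrypt with the public key.
signed = apply_key(message, d)
print(apply_key(signed, e) == message)
```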

Certificate Authority (CA). This is used in validating the trust of certificates. You send your certificate to the CA. The CA calculates a checksum of your data and encrypts this checksum with the CA private key. It returns your original data with the encrypted checksum appended, along with information about the CA and what was used to calculate the checksum. This is known as signing the certificate. Someone who has the CA public key can do the reverse: calculate the checksum themselves, decrypt the checksum value in the certificate using the CA public key, and compare the two. If they match, you know the certificate was signed by the CA.
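The sign and check flow can be sketched with a toy CA key. This is an illustration under loud assumptions: the key is built from two well-known Mersenne primes and is not secure, the checksum is reduced to fit under the toy modulus, and real certificates use padded signatures over DER-encoded data. The names ca_sign and verify, and the MALLORY example, are made up.

```python
import hashlib

# Toy CA key pair (2**61 - 1 and 2**89 - 1 are both prime). Illustration only.
p, q = 2**61 - 1, 2**89 - 1
n = p * q
ca_public = 65537
ca_private = pow(ca_public, -1, (p - 1) * (q - 1))  # needs Python 3.8+


def checksum(data):
    """Hash the data, then reduce it so it fits under the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def ca_sign(data):
    """CA side: encrypt the checksum with the CA private key."""
    return pow(checksum(data), ca_private, n)


def verify(data, signature):
    """Other end: decrypt with the CA public key and compare with a fresh checksum."""
    return pow(signature, ca_public, n) == checksum(data)


cert_data = b"CN=COLIN,O=SSS plus the public key"
signature = ca_sign(cert_data)

print(verify(cert_data, signature))                              # untouched data
print(verify(b"CN=MALLORY,O=SSS plus the public key", signature))  # altered data
```

Changing any byte of the data changes the checksum, so the signature no longer matches.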

To be able to validate a certificate sent to it, the client end needs the CA certificate of the server end. The server needs the CA certificate of the client end to be able to validate the client’s certificate.

During the handshake to establish the TLS connection there is a flow like

  • Establish the cipher spec to use
  • Server sends down its certificate; the client checks it.
  • Server sends down a “Certificate request”, with the list of certificates (CAs) it knows about.
  • The client goes through its list of certificates (usually only one) to find the first certificate signed by a CA in the list sent from the server.
  • The client sends its certificate to the server.
    • The server checks the certificate. For example, the server may be set up to accept only a subset of valid algorithms, such as TLS 1.2 and Elliptic Curve. If the certificate sent up uses RSA, it is not accepted.
    • The server checks the signature of the certificate: it finds the CA name, looks in the trust store for this CA, and validates the signature. Depending on the application, it may check all the CAs in the CA chain.
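The client's choice of certificate in the flow above can be sketched like this. It is a simplification: a real TLS stack compares encoded distinguished names and may also filter on key type. The function name, the dictionary layout, and the second (RSA) certificate are hypothetical.

```python
def select_client_certificate(client_certs, acceptable_cas):
    """Return the first client certificate whose issuer is in the server's CA list."""
    for cert in client_certs:
        if cert["issuer"] in acceptable_cas:
            return cert
    return None  # nothing suitable, so no client certificate is sent


# CA names sent by the server in its "Certificate request".
server_ca_list = ["CN=SSCA256,OU=CA,O=DOC,C=GB"]

# The client's key store. The first entry is a made-up certificate from another CA.
client_certs = [
    {"label": "oldrsa", "issuer": "CN=SOMEOTHERCA,O=DOC,C=GB"},
    {"label": "docecadcd", "issuer": "CN=SSCA256,OU=CA,O=DOC,C=GB"},
]

chosen = select_client_certificate(client_certs, server_ca_list)
print(chosen["label"] if chosen else "no matching certificate")
```

If the server does not list the CA that signed your certificate, nothing is selected, which is why the client CA has to be connected to the server's keyring.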

What do you need for the handshake to work

  • You need to have a Certificate Authority to sign certificates. In the CA certificate are some flags that say this is a CA.
  • You need to send the public key of each CA to the other end. You normally need to do this just once, and keep using the same certificates for all your TLS work.
  • You need to have a key store/trust store/keyring to hold certificates.
  • On z/OS
    • you may have a keyring for different projects, for example MQ, and TN3270.
    • You need to connect the client CA into each keyring where it will be used.
  • You need to check that the certificates are compatible with the remote end, such as Algorithm etc.

Openssl files

When using openssl, you can store common information in a configuration file. See here. This configuration file has some required options, and some optional options where you can specify common options you frequently use.

If you are using the openssl req command (for example), by default it will look for a section called [req]. This can in turn point to other sections. Using this file you can specify most of your fields in one place, and just override the specific ones.

Create a CA certificate on Linux

I have a bash script, docca.sh, on Linux.

CA="docca256"
casubj="-subj /C=GB/O=DOC/OU=CA/CN=SSCA256"
days="-days 1095"
rm -f $CA.pem $CA.key.pem

# (1) create the private key
openssl ecparam -name prime256v1 -genkey -noout -out $CA.key.pem

# (2) self-sign using the key, (3) write the public certificate
openssl req -x509 -sha384 -config caca.config -key $CA.key.pem -keyform pem -nodes $casubj -out $CA.pem -outform PEM $days

# (4) display the certificate
openssl x509 -in $CA.pem -text -noout|less

This

  1. creates a private key (docca256.key.pem)
  2. self signs it. For any parameters not specified, it uses the configuration file caca.config and section “req” (signing request) within it.
  3. produces a public certificate in docca256.pem. This file will need to be sent to the backend servers. You can use cut and paste or FTP as ASCII.
  4. displays the x509 data

The caca.config file has

[ req ]
distinguished_name = ca_distinguished_name
x509_extensions = ca_extensions

prompt = no

authorityKeyIdentifier = keyid:always,issuer:always

[ca_distinguished_name ]
# C=GB
# O=DOC
# OU=Stromness
# CN=SSSCA4

####################################
[ ca_extensions ]

subjectKeyIdentifier = hash
authorityKeyIdentifier = keyid:always
basicConstraints = critical,CA:TRUE, pathlen:0
keyUsage = keyCertSign, digitalSignature,cRLSign

The distinguished_name = ca_distinguished_name says go and look in the file for a section [ca_distinguished_name], and x509_extensions = ca_extensions says go and look for a section called [ca_extensions]. You can specify your own names, for example I could have used section1, and s2.

When prompt = yes, openssl takes the values in the distinguished_name section as defaults. When prompt = no, the distinguished_name section is still required, but in this script its contents are not used: every entry is commented out, and the subject comes from -subj on the command line.

The values in the x509_extensions are defined here.
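To show how one section points at another, here is a very small parser run against a cut-down copy of caca.config. It is a sketch for illustration, not how OpenSSL parses its configuration: it ignores variable expansion, quoting and line continuation, and the function name is mine.

```python
def parse_openssl_config(text):
    """Tiny parser: [ section ] headers, name = value lines, # comments."""
    sections = {"default": {}}
    current = "default"
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and padding
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1].strip()      # "[ req ]" and "[req]" both give "req"
            sections.setdefault(current, {})
        elif "=" in line:
            name, value = line.split("=", 1)
            sections[current][name.strip()] = value.strip()
    return sections


CONFIG = """
[ req ]
distinguished_name = ca_distinguished_name
x509_extensions = ca_extensions
prompt = no

[ca_distinguished_name ]
# C=GB

[ ca_extensions ]
basicConstraints = critical,CA:TRUE, pathlen:0
keyUsage = keyCertSign, digitalSignature,cRLSign
"""

sections = parse_openssl_config(CONFIG)
pointer = sections["req"]["x509_extensions"]   # the name of another section
print(pointer)
print(sections[pointer])
```

The value of x509_extensions is just a section name, which is why you can call the sections anything you like.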

Creating an Elliptic Curve certificate on Linux

I used another bash script, docecadad.sh, to create an Elliptic Curve certificate for userid ADCD. It uses the CA defined above.

name="docecadcd"
key="$name.key.pem"
cert="$name.pem"
subj="-subj /C=GB/O=Doc/CN=$name"
CA="docca256"
cafiles="-cert $CA.pem -keyfile $CA.key.pem "

enddate="-enddate 20240130164600Z"
passin="-passin file:password.file"
passout="-passout file:password.file"

rm $name.key.pem
rm $name.csr
rm $name.pem

# create an elliptic curve private key with size 256

openssl ecparam -name prime256v1 -genkey -noout -out $name.key.pem
# create a certificate request (i.e. hello CA, please sign this)
openssl req -config openssl.config -new -key $key -out $name.csr -outform PEM $subj $passin $passout

# sign it.

caconfig="-config ca2.config"
policy="-policy signing_policy"
extensions="-extensions clientServer"

md="-md sha384"

openssl ca $caconfig $policy $md $cafiles -out $cert -in $name.csr $enddate $extensions

# display it 
openssl x509 -in $name.pem -text -noout|less

Where the openssl.config file has

[ req ]
default_bits       = 2048

distinguished_name = server_distinguished_name
req_extensions     = server_req_extensions
string_mask        = utf8only
subjectKeyIdentifier   = hash
#extendedKeyUsage     = critical, codeSigning


[ server_req_extensions ]

subjectKeyIdentifier = hash
# subjectAltName       = DNS:localhost, IP:127.0.0.1, IP:127.0.0.6
# nsComment            = "OpenSSL"
keyUsage             = critical, nonRepudiation, digitalSignature
# extendedKeyUsage     = critical, OCSPSigning, codeSigning
subjectKeyIdentifier   = hash 

[ server_distinguished_name ]
#c=GB
#o=SSS
#cn=mqweb

  • See above for the distinguished_name value.
  • req_extensions says use the section [server_req_extensions]

The ca2.config file used to sign it has

HOME            = .
RANDFILE        = $ENV::HOME/.rnd

####################################################################
[ ca ]
default_ca    = CA_default      # The default ca section
####################################################################
[ CA_default ]
default_days     = 1000         # How long to certify for
default_crl_days = 30           # How long before next CRL
#default_md       = sha1       # Use public key default MD
default_md       = sha256       # Use public key default MD
preserve         = no           # Keep passed DN ordering

x509_extensions = ca_extensions # The extensions to add to the cert

email_in_dn     = no            # Don't concat the email in the DN
copy_extensions = copy          # Required to copy SANs from CSR to cert

#defaults
base_dir      = .
certificate   = $base_dir/cacert.pem   # The CA certificate
private_key   = $base_dir/cakey.pem    # The CA private key
new_certs_dir = $base_dir              # Location for new certs after signing
database      = $base_dir/index.txt    # Database index file
serial        = $base_dir/serial.txt   # The current serial number

unique_subject = no  # Set to 'no' to allow creation of
                     # several certificates with same subject.


####################################################################
[ ca_extensions ]

subjectKeyIdentifier   = hash
authorityKeyIdentifier = keyid:always, issuer:always
basicConstraints       = critical,CA:TRUE, pathlen:0
keyUsage               = nonRepudiation

####################################################################

[ signing_policy ]
countryName            = optional
stateOrProvinceName    = optional
localityName           = optional
organizationName       = optional
organizationalUnitName = optional
commonName             = supplied

[ clientServer ]

keyUsage               = digitalSignature, keyAgreement, digitalSignature, nonRepudiation, keyEncipherment, dataEncipherment
subjectAltName         = DNS:localhost, IP:127.0.0.1, 
extendedKeyUsage       = serverAuth,clientAuth
subjectKeyIdentifier   = hash
authorityKeyIdentifier = keyid:always, issuer:always
nsComment  = "clientserver"

The policy (my [ signing_policy] ) must have entries in it to create a valid Subject Distinguished name. Without it, I got a strange RACF code (0x0be8044d).

Send the CA to z/OS and import it

You need to send the CA public certificate to z/OS. This file looks like

-----BEGIN CERTIFICATE-----                                      
MIIFbTCCA1WgAwIBAgIUJw+gLLSFxqCyTdIyEUWyQ/g9JnEwDQYJKoZIhvcNAQEL.
...
-----END CERTIFICATE----- 

You can FTP the file, or create the file and use cut and paste. The file needs to be a sequential dataset with format VB. My file is VB, lrecl=256,blksize=6233. For the FTP I used

put docca256.pem 'colin.docca256.pem'

You need to import this into RACF, and connect it to the keyrings.

//IBMRACF  JOB 1,MSGCLASS=H                                     
//S1  EXEC PGM=IKJEFT01,REGION=0M                               
//SYSPRINT DD SYSOUT=*                                          
//SYSTSPRT DD SYSOUT=*                                          
//SYSTSIN DD *                                                  
RACDCERT CHECKCERT('COLIN.DOCCA256.PEM') 
/*                       

The CHECKCERT gave me

Certificate 1:                                                          
                                                                        
  Start Date: 2022/10/09 11:45:43                                       
  End Date:   2025/10/08 11:45:43                                       
  Serial Number:                                                        
       >782A62948699FF3FB00238FB296E4A647B7DF07C<                       
  Issuer's Name:                                                        
       >CN=SSCA256.OU=CA.O=DOC.C=GB<                                    
  Subject's Name:                                                       
       >CN=SSCA256.OU=CA.O=DOC.C=GB<                                    
  Signing Algorithm: sha384ECDSA                                        
  Key Usage: HANDSHAKE, CERTSIGN                                        
  Key Type: NIST ECC                                                    
  Key Size: 256                                                         
                                                                        

Which matches what I expected, and gave me information about the certificate – ECC, 256, and signed with SHA 384 ECDSA, (from the -sha384 parameter above).
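The dates can be cross-checked as well. A small sketch, assuming the end date is simply the start date plus the -days 1095 value used in docca.sh:

```python
from datetime import datetime, timedelta

start = datetime(2022, 10, 9, 11, 45, 43)   # Start Date from the CHECKCERT output
end = start + timedelta(days=1095)          # -days 1095 from docca.sh

print(end.strftime("%Y/%m/%d %H:%M:%S"))    # 2025/10/08 11:45:43
```

The period includes 29 February 2024, which is why the end date is 8 October rather than 9 October.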

Define it to RACF and connect it to the user’s keyring

RACDCERT LIST (LABEL('LINUXDOCCA256')) CERTAUTH
RACDCERT DELETE  (LABEL('LINUXDOCCA256'))    CERTAUTH             
              
RACDCERT ADD('COLIN.DOCCA256.PEM') -                            
          CERTAUTH  WITHLABEL('LINUXDOCCA256') TRUST 

RACDCERT ID(START1) CONNECT(RING(MQRING)-        
                            CERTAUTH     -       
                            LABEL('LINUXDOCCA256'))                     

If you delete a certificate, it is removed from all keyrings. Once you have re-added it, you need to reconnect it to every keyring. If you list the label before deleting it (RACDCERT LIST (LABEL('LINUXDOCCA256')) CERTAUTH), the output shows which keyrings it is connected to, so you know where to reconnect it.

Download the z/OS CA certificate

I downloaded the z/OS exported certificate in .pem format. It looks like

-----BEGIN CERTIFICATE-----                                       
MIIDYDCCAkigAwIBAgIBADANBgkqhkiG9w0BAQsFADAwMQ4wDAYDVQQKEwVDT0xJ  
... 
+9TRng==                                                          
-----END CERTIFICATE-----

You can use ftp or cut and paste. I created doczosca.pem.

Use it!

Before I could use it, I had to set up the server’s certificate, and download the z/OS CA certificate.

I set up a bash script scl2.sh

name=docecadcd
host="-connect 10.1.1.2:4000"
CA="-CAfile doczosca.pem "

openssl s_client $host  $CA -cert $name.pem -certform PEM -key $name.key.pem -keyform PEM