Using and debugging RACF CLASS(APPL) and pthread_security_np

I’ve blogged I want to be someone else – or using pthread_security_np, how a server can be passed a userid or password to logon to a server to do some work. For example I was logging on from a web server to the RMF GPMSERVE for displaying RMF reports in a a web browser.

I had followed the documentation, but all my userids had access, when none of them should have done. This blog post covers the steps I used to dig into this.

The RACF calls used, have a flag set “Suppress any RACF Messages”, which makes it harder to diagnose problems.

With most RACF calls you can define a profile

ADDSD 'COLIN.PROT.DATASET' UACC(NONE) WARNING

where the WARNING option says “If the userid does not have access – give the userid access, but write a message on the console”. This allows you to find when you need to grant access. Once the profile is established, you can use the ALTSD … NOWARNING. Userids requesting access, but which are not permitted will now fail.

Defining a profile with CLASS(APPL), the warning has no effect. I had to use NOTIFY(COLIN) to get notified of problems.

Tracing the requests

To trace all of the RACF calls made by my job COLINNT I issued

#set trace(callable(all),racroute(all),jobname(COLINNT))      

This gave me many hundreds of calls.

Once I had been through the whole process, I found the trace for the pthread_security_np etc calls was just

set trace(callable(type(38)),jobname(COLINNT))

Collect a trace

See Collecting and understanding a RACF GTF trace output.

Looking at the trace output

There is a trace record “before” (PRE) matching “after” (POST). Many parameters are the same (as you might expect).

The output of these traces is verbose, with unnecessary information and with extra blanks lines. For example for one parameter

   Area length:                  00000008 

Area value:
D6C6C6E2 C5E30000 | OFFSET.. |

Area length: 00000004

Area value:
00000000 | .... |

Area length: 00000008

In the description below, I’ve squashed this down to a single line

area length  8  OFFSET0000 length 4 00000000 

It looks like the field OFFSET0000 has the offset in hex from somewhere. I didn’t find this information useful.

A compressed trace record is given below. For the interpretation of the fields see the following trace record.

RTRACE
OMVSPRE
Service number 00000026
Parameters
area length x6c
area length 8 OFFSET0000 length 4 00000000
area length 8 OFFSET0004 length 4 00000000
area length 8 OFFSET0008 length 4 00000000
area length 8 OFFSET000C length 4 00000000
area length 8 OFFSET0010 length 4 40404040
area length 8 OFFSET0014 length 4 00000000
area length 8 OFFSET0018 length 4 40404040
area length 8 OFFSET001C length 1 01
area length 8 OFFSET0020 length 4 C4800000
area length 8 OFFSET0024 length 6 05c1c2c3c4C2 = .ABCDB
area length 8 OFFSET0028 length 4 00000000
area length 8 OFFSET002C length 9 08C7D7D4 E2C5D9E5 C5 = .GPMSERVE
Internal information
area length 8 OFFSET0034 length x37 = "Server Userid=IBMUSER created a full ACEE"
area length 8 OFFSET0048 length 4 00000000
area length 8 OFFSET0050 length 9 08000000 00000000 00
area length 8 OFFSET0054 length 1 00
area length 8 OFFSET0018 length 4 40404040
Internal data
area length xc0 ACEE
area length x50 userid information
area length x90 ACEX
area length x50 USP

hex dump of record

The RACF commands documentation says for a callable service, 0x00000026 is function number for IRRSIA00. This page gives the name of the function name IRRSIA00 and the description z/OS kernel on behalf of servers that use pthread_security_np servers or __login, or MVS servers that do not use z/OS UNIX services.

The parameters are for the initACEE (IRRSIA00) call.

The matching “after” record, OMVSPOST was (with the parameters of the the IRRSIA00 call format, parameters) are

RTRACE
OMVSPOST
Service number: 00000026
RACF Return code: 00000008
RACF Reason code: 00000020

Parameters
area length 8 OFFSET0000 length 4 00000000 >work_area<
area length 8 OFFSET0004 length 4 00000000 >ALET<
area length 8 OFFSET0008 length 4 00000008 >SAF_return_code<
area length 8 OFFSET000C length 4 00000000 >ALET<
area length 8 OFFSET0010 length 4 00000008 >RACF_return_code<
area length 8 OFFSET0014 length 4 00000000 >ALET<
area length 8 OFFSET0018 length 4 00000020 >RACF_reason_code<
area length 8 OFFSET001C length 1 01 >Function_code 1 = Create an ACEE<
area length 8 OFFSET0020 length 4 C4800000 >Attributes - see below<
area length 8 OFFSET0024 length 6 05c1c2c3c4C2 >Userid .ABCDB<
area length 8 OFFSET0028 length 4 00000000 >ACEE_pointer<
area length 8 OFFSET002C length 9 08C7D7D4 E2C5D9E5 C5 >APPLID = .GPMSERVE<
Internal information
area length 8 OFFSET0034 length x37 = "Server Userid=IBMUSER created a full ACEE"
area length 8 OFFSET0048 length 4 00000000
area length 8 OFFSET0050 length 9 08000000 00000000 00
area length 8 OFFSET0054 length 1 00
area length 8 OFFSET0018 length 4 40404040
Internal data
area length xc0 ACEE
area length x50 userid information
area length x90 ACEX
area length x50 USP

There the attributes C480000 mean

  • X80000000 – Create the ACEE
  • X40000000 – Createthe USP for the userid
  • X04000000 – Suppress any RACF Messages
  • X00800000 – Return an OUSP in the output area

Note the flag:X04000000 – Suppress any RACF Messages.

The return code

area length  8  OFFSET0008 length 4 00000008 >SAF_return_code<
...
area length 8 OFFSET0010 length 4 00000008 >RACF_return_code<
area length 8 OFFSET0014 length 4 00000000 >ALET<
area length 8 OFFSET0018 length 4 00000020 >RACF_reason_code<

8,8,32 means The user does not have appropriate RACF access to either the SECLABEL, SERVAUTH profile, or APPL specified in the parmlist.

For other RACF services the trace entries follow a similar format.

Collecting and understanding a RACF GTF trace output

A RACF request can be

  • A callable service, such as IRRSDL00 which is used to extract information about keyrings. This can be used by High Level Languages. These have a trace type of OMVS
  • A RACROUTE request, an assembler macro interface, such as FASTAUTH. These have a trace type of RACF.

You get a trace entry PRE the call, and and a trace entry POST the call. A callable service would have an OMVSPRE, and an OMVSPOST entry.

When setting up your RACF trace you need to define which callable service, or which RACROUTE requests you want to trace.

Example of a callable server

Entry trace

Trace Identifier:             00000036 
Record Eyecatcher: RTRACE
Trace Type: OMVSPRE
Ending Sequence: ........
Calling address: 00000000 20801A0F
Requestor/Subsystem: ........ ........
Primary jobname: IBMINC5
Primary asid: 00000029
Primary ACEEP: 00000000 008FA8A8
Home jobname: IBMINC5
Home asid: 00000029
Home ACEEP: 00000000 008FA8A8
Task address: 00000000 008D6A88
Task ACEEP: 00000000 00000000
Time: DFF36A27 5460CC00
Error class: ........
Service number: 00000029
RACF Return code: 00000000
RACF Reason code: 01000000
Return area address: 00000000 00000000
Parameter count: 00000021

Interesting information

  • Trace Type: OMVSPRE this is an entry trace
  • Primary jobname: IBMINC5
  • Home jobname: IBMINC5
  • The RACF return and reason code have no meaning on entry
  • Service number: 00000029 is hex so the service is 41. The RACF commands documentation says for a callable service, 41 is function number IRRSDL00.

Exit trace

Trace Identifier:             00000036 
Record Eyecatcher: RTRACE
Trace Type: OMVSPOST
Ending Sequence: ........
Calling address: 00000000 20801A0F
Requestor/Subsystem: ........ ........
Primary jobname: IBMINC5
Primary asid: 00000029
Primary ACEEP: 00000000 008FA8A8
Home jobname: IBMINC5
Home asid: 00000029
Home ACEEP: 00000000 008FA8A8
Task address: 00000000 008D6A88
Task ACEEP: 00000000 00000000
Time: DFF36A27 5468DC40
Error class: ........
Service number: 00000029
RACF Return code: 00000008
RACF Reason code: 00000014
Return area address: 00000000 00000000
Parameter count: 00000021

Interesting fields

  • Trace Type: OMVSPOST this is the exit trace
  • As above service number 41 is function
  • The RACF Return code is 8
  • The RACF reason code is 14

Example of a RACROUTE trace

Entry

Trace Identifier:             00000036 
Record Eyecatcher: RTRACE
Trace Type: RACFPRE
Ending Sequence: ........
Calling address: 00000000 A096A08A
Requestor/Subsystem: CRYPTO CRYPTO
Primary jobname: CSF
Primary asid: 00000038
Primary ACEEP: 00000000 008F6B98
Home jobname: IBMEXP
Home asid: 00000029
Home ACEEP: 00000000 008FA8A8
Task address: 00000000 008D6A88
Task ACEEP: 00000000 00000000
Time: DFF37DD1 AB7FF040
Error class: ........
Service number: 00000002
RACF Return code: 00000000
RACF Reason code: 00000000
Return area address: 00000000 00000000
Parameter count: 00000009

This trace entry was from a batch job IBMEXP, using an ICSF service to access an encryption key.

Interesting fields

  • RACFPRE this is a RACROUTE before entry
  • Requestor/Subsystem: CRYPTO which z/OS component
  • Primary jobname: CSF This was the address space which issued the RACROUTE request
  • Home jobname: IBMEXP This was the job requesting an ICSF service
  • Service number: 2. The RACF commands documentation says for a RACROUTE service, 2 is FASTAUTH
  • Ignore RACF return and RACF reason codes. They are set on exit,

Exit

Trace Identifier:             00000036 
Record Eyecatcher: RTRACE
Trace Type: RACFPOST
Ending Sequence: ........
Calling address: 00000000 A096A08A
Requestor/Subsystem: CRYPTO CRYPTO
Primary jobname: CSF
Primary asid: 00000038
Primary ACEEP: 00000000 008F6B98
Home jobname: IBMEXP
Home asid: 00000029
Home ACEEP: 00000000 008FA8A8
Task address: 00000000 008D6A88
Task ACEEP: 00000000 00000000
Time: DFF37DD1 AB85DE40
Error class: ........
Service number: 00000002
RACF Return code: 00000008
RACF Reason code: 00000000
Return area address: 00000000 00000000
Parameter count: 00000009

This trace entry was from a batch job IBMEXP, using an ICSF service to access an encryption key.

Interesting fields

  • RACFPOST this is a RACROUTE after exit entry
  • Requestor/Subsystem: CRYPTO which z/OS component
  • Primary jobname: CSF This was the address space which issued the RACROUTE request
  • Home jobname: IBMEXP This was the job requesting an ICSF service
  • Service number: 2. The RACF commands documentation says for a RACFROUTE service, 2 is FASTAUTH
  • RACF Return code: 00000008, RACF Reason code: 00000000 This shows there was a problem. The documentation for FASTAUTH shows
    • RC=8 -> the user or group is not authorized to use the resource.
    • RS=0 -> The invoker does not need to log the request.

The RACF trace command for this was

#SET TRACE(JOBNAME(csf),callable(none),RACROUTE(type(2)))

and the following commands

S GTF.GTF
R 1,trace=usrp
R 2,USR=(F44)
R 3,END
R 4,U

Fast start GTF

I can use

S GTF.GTF,M=GTFRACF

Where my GTF proc has

//GTFNEW  PROC M=GTFPARM,ID=SYS1 
//DELETE EXEC PGM=IEFBR14
//IEFRDER DD DSNAME=&ID..TRACE,UNIT=SYSDA,SPACE=(TRK,20),
// DISP=(MOD,DELETE)
//IEFPROC EXEC PGM=AHLGTF,PARM='MODE=EXT,DEBUG=NO,TIME=YES,NP',
// TIME=1440,REGION=2880K
//IEFRDER DD DSNAME=&ID..TRACE,UNIT=SYSDA,SPACE=(TRK,20),
// DISP=(NEW,KEEP)
//SYSLIB DD DSNAME=USER.Z24C.PROCLIB(&M),DISP=SHR

I have a member in USER.Z24C.PROCLIB(GTFRACF)

TRACE=USRP 
USR=(F44)
END

Debugging the “you do not have access to something, but I’m not telling you what” problem

The problem, I had a message

SSL Handshake Failed, ICSF error. Review ‘RACF CSFSERV Resource Requirements’ of the z/OS documentation.
Reason: The webservers userid does not have access to CSFSERV resource classes required for SSL.

But it does not tell me what it does not have access to.

When an application tries to access a resource, and the userid is not authorised to that resource, RACF can produce an error message, which tells you the resource name.

If the application asks “does this application have access to this resource”, then RACF produces no error message, and it is up to the application to provide a sensible and useful message.

Collecting a RACF trace

You can use a command

#SET TRACE(CLASS(CSFSERV),RACROUTE(ALL))

to turn on the trace for that class. You can also use USERID(…) and jobname(…) to further restrict what is traced.

The output goes to GTF.

s gtf,gtf
01 AHL125A  RESPECIFY TRACE OPTIONS OR REPLY U 
 1,trace=usrp                                                                                 
IEE600I REPLY TO 01 IS;TRACE=USRP                                                              
    09.53.19 STC00315  TRACE=USRP                                                                                     
02 AHL101A  SPECIFY TRACE EVENT KEYWORDS --USR=                                                
  - 09.53.27           r 2,usr=(F44),end                                                                              
    09.53.27 STC00315  IEE600I REPLY TO 02 IS;USR=(F44),END                                                           
    09.53.27 STC00315  USR=(F44),END                                                                                  
    09.53.27 STC00315  AHL103I  TRACE OPTIONS SELECTED --USR=(F44)                                                    
  | 09.53.27 STC00315 *03 AHL125A  RESPECIFY TRACE OPTIONS OR REPLY U                                                 
00- 09.53.30           r 3,u                                                                                          
    09.53.30 STC00315  IEE600I REPLY TO 03 IS;U                                                                       
    09.53.30 STC00315  U                                                                                              

Run your work.

P GTF
AHL006I GTF ACKNOWLEDGES STOP COMMAND
AHL904I THE FOLLOWING TRACE DATASETS CONTAIN TRACE DATA :
SYS1.TRACE

if you do not get “THE FOLLOWING TRACE DATASETS CONTAIN TRACE DATA…” it means you did not collect any data.

Use IPCS to look at it

  • =0 and specify the trace data set name
  • if you change scope to both it will remember the data for next time
  • =6 to get you to IPCS Subcommand Entry panel
  • if this is the first time you have used this instance of the data set, you should issue the dropd command to get IPCS to forget about previous usage
  • gtf usr(all) This displays the data
  • You can process this
    • type M and press PF8 to get to the bottom of the data
    • report view will display the data in ISPF edit (view mode)
    • You can now issue commands like
    • x all
    • f code all and look for non zero return codes.
    • del all x
    • sort 30 50 to display all return codes in numerical order. You need to look at the top and the bottom.
    • make a note of the return code ( copy the line to your clipboard)
    • quit
    • report view
    • find return code

To get rid of some of the forest of unhelpful data

  • x all
  • find ‘ ‘ 1 20 all
  • find ‘+’ 1 2 all
  • delete all nx

#SET TRACE(NOCLASS,RACROUTE(ALL))

How do I test my RACF updates or recover from a boo-boo?

I am running z/OS on zPDT on my Linux server. I want to test out some RACF commands, but want to be able to recover if I get it wrong. What can I do?

The product zSecure Admin offers RACF Offline can be used by those who have the license.

I do not have this product so I need a different way.

I am running a single user z/OS image. For anything more complex than this, talk to the experts.

Background to RACF

RACF has a database, typically SYS1.RACFDS. For availability you can have a backup RACF database, typically SYS1.RACFDS.BACKUP. Typically these are both updated when changes are made to the RACF environment. I would call this a duplexed dataset.

You can make a copy of a RACF database using the RACF utility IRRUT200. This takes a lock on database for the duration of the copy and ensures the copy is consistent. If you use other tools to backup the primary database, and there were concurrent database updates, the backup may not be consistent. You risk a partial update, and an inconsistent database.

Changing your RACF

You can use the RVARY command (TSO and operator) to change the RACF data sets. You can setup controls for the RVARY command.

If you use SETROPTS LIST it will list the RACF options being used. Mine has (near the bottom)

DEFAULT RVARY PASSWORD IS IN EFFECT FOR THE SWITCH FUNCTION.
...
DEFAULT RVARY PASSWORD IS IN EFFECT FOR THE STATUS FUNCTION.

In this case the default password is YES

See SETROPTS RVARY(…)

You will get a prompt

ICH703A ENTER PASSWORD TO SWITCH RACF DATASETS JOB=IBMUSER USER=IBMUSER

If you have an I/O problem with the SYS1.RACFDS database

You can

  • switch to use the backup, and stop duplex updates.
  • fix the problem. Perhaps delete the data set and recreate it on a different device.
  • copy the backup dataset using IRRUT200 to the newly allocated dataset
  • activate the newly allocated dataset activate the duplex updates.

Updates to the RACF database are written to SMF 80 records.

If you make mistakes with your commands

Some RACF command changes you can undo, for example, you can use an alter command to set a new value. To be able to undo the change you need to know what the original value was, so it is always a good idea to display a resource before you change or delete it.

If you delete a record there is no easy way of undoing it.

Rob van Hoboken suggested the following.

Plan A – online switch

Display the status

#rvary list

This gave me

ICH15013I RACF DATABASE STATUS:                          
ACTIVE USE NUM VOLUME DATASET
------ --- --- ------ -------
YES PRIM 1 A3CFG1 SYS1.RACFDS
YES BACK 1 A3CFG1 SYS1.RACFDS.BACKUP

This shows both data sets are active and which is the primary and which ise the backup.

Switch off your RACF DUPLEX (backup) data set

#RVARY INACTIVE,DATASET(SYS1.RACFDS.BACKUP)

It prompts with

ICH702A ENTER PASSWORD TO DEACTIVATE RACF JOB=RACF USER=START1

I didnt know the password, but I replied r xx,YES and it worked!

Run the commands that you hope will work.

If the commands do not work successfully

If the commands do not work successfully, make the backup active and switch to it.

#RVARY ACTIVE,DATASET(SYS1.RACFDS.BACKUP)
#RVARY SWITCH

See the command RVARY (Change status of RACF database) .

After you have switched to the backup, make the primary inactive

#RVARY INACTIVE,DATASET(SYS1.RACFDS)

and copy the backup into the primary using IRRUT200

//IBMUSRAC  JOB 1,MSGCLASS=H                              
//S1 EXEC PGM=IRRUT200 PARM=ACTIVATE
//SYSRACF DD DISP=SHR,DSN=SYS1.RACFDS.BACKUP
//SYSUT1 DD DISP=SHR,DSN=SYS1.RACFDS
//SYSUT2 DD SYSOUT=*
//SYSIN DD *
//SYSPRINT DD SYSOUT=*

First run it without PARM=ACTIVATE. If this works successfully, then use PARM=ACTIVATE, this take a lock during the copy, and then activates the SYSUT1 data set, so both are now active.

This gave the output

IRR62005I - IDCAMS REPRO copied SYSRACF to the work data set SYSUT1                                         
ICH15013I RACF DATABASE STATUS:
ACTIVE USE NUM VOLUME DATASET
------ --- --- ------ -------
YES PRIM 1 A3CFG1 SYS1.RACFDS.BACKUP
YES BACK 1 A3CFG1 SYS1.RACFDS
ICH15020I RVARY COMMAND HAS FINISHED PROCESSING.

The #RVARY LIST gave, as expected

ICH15013I RACF DATABASE STATUS:                                                  
ACTIVE USE NUM VOLUME DATASET
------ --- --- ------ -------
YES PRIM 1 A3CFG1 SYS1.RACFDS.BACKUP
YES BACK 1 A3CFG1 SYS1.RACFDS

Which shows it has worked.

If the commands do work successfully

If the commands do work successfully, you need to bring the backup up to date with the primary.

//IBMUSRAC  JOB 1,MSGCLASS=H                              
//S1 EXEC PGM=IRRUT200,PARM=ACTIVATE
//SYSRACF DD DISP=SHR,DSN=SYS1.RACFDS
//SYSUT1 DD DISP=SHR,DSN=SYS1.RACFDS.BACKUP
//SYSUT2 DD SYSOUT=*
//SYSIN DD *
//SYSPRINT DD SYSOUT=*

The #rvary active command.

I would be careful about using the #RVARY ACTIVE command. If you copy a RACF data set, then use the RVARY ACTIVE command to make it available, there is a small window where changes made to the active RACF data base are not made to the offline one. You could tell people “do not issue any commands while this work is going on”, but the RACF database is updated during normal work, for example with profile use counts, and last used date.

Plan B – REIPL with different data sets.

  • Convert to using IRRPRMxx PARMLIB member definitions to specify the data set names, instead of using a load module. See RACF setup, and changing the RACF dataset
  • Copy the RACF database to a recovery RACF data set using IRRUT200
  • Create a new IRRPRMxx member pointing to your recovery data set.
  • Create a new IEASYSxx member to use the RACF=xx member
  • You may have to create a new LOADxx member in SYS1.IPLPARM to point to the new IEASYSxx member.
  • REIPL, and specify the appropriate LOADxx suffix.
  • Fix the problem; copy from the recovery into the SYS1.RACFDS dataset and SYS1.RACFDS.BACKUP, so the primary and backup are the same.
  • Reipl using the normal IPL parameters

You could use the emergency stand-alone one pack system SARES1, but this may not have access to all of the data sets due to SMS and catalogs.

RACF setup, and changing the RACF dataset

I used ADCD supplied version of z/OS, and needed to change which data set I used, so it could be used on z/OS 3.1 or earlier (but not both at the same time).

What options are active?

You can use the operator command

#rvary

where # is the RACF subsystem character (defined in INITPARM(‘#’) in the IEFSSN definition for RACF).

Before z/OS 2.3

You had to define RACF data set parameters using a load module.

  • The databases are defined in a module ICHRDSNT.
  • If you use SDSF and issue the LOAD ICHRDSNT command, it will display the data.
  • The source is in ADCD.LIB.JCL(ICHRDSNT).
  • If you want to change it, you need to generate the module, and REIPL

Parmlib definitions

In z/OS 2.3 and later you can specify parameters in the parmlib concatenation.

In your IEASYS member you can specify RACF=(xx,yy,zz) and specify up to three parameters. These parameters take precedence over the ICHRDSNT table. I specified RACF=(00)

My equivalent definition in USER.Z25D.PARMLIB(IRRPRM00) is

DATABASE_OPTIONS 
DATASETNAMETABLE
ENTRY
PRIMARYDSN(SYS1.RACFDS)
BACKUPDSN(SYS1.RACFDS.BACKUP)

You can have one primary data set and up to one backup data set. If you want to partition your database by key, you can have more primary and backup data sets entries, and you need to specify the key range to data set mapping.

The syntax is defined here.

The datasets have to be cataloged. I have used a data set catataloged in a user catalog, they do not need to be SYS1.xxxx.

If you get it wrong and the member is invalid, at IPL RACF prompts

 *IRRY115I RACF IS NOT USING PARMLIB FOR INITIALIZATION.                     
*01 ICH502A SPECIFY NAME FOR PRIMARY RACF DATA SET SEQUENCE 001 OR 'NONE'
IRA600I SRM CHANNEL DATA NOW AVAILABLE FOR ALL SRM FUNCTIONS
IRA860I HIPERDISPATCH MODE IS NOW ACTIVE
R 01,SYS1.RACFDS
*02 ICH502A SPECIFY NAME FOR BACKUP RACF DATA SET SEQUENCE 001 OR 'NONE'
R 2,SYS1.RACFDB.BACKUP

Changing parmlib

If you change the IEASYS RACF=, or change an IRRPMRxx member, you have to REIPL to pick up the changes. Stopping and restarting RACF does not pick up the changes, because the RACF started task handles operator commands and other admin stuff, it does not process the data sets. See here.

Copying the RACF database

You can copy a RACF database and use that. For example

//IBMCRACO  JOB 1,MSGCLASS=H 
//STEP EXEC PGM=IRRUT200
//SYSRACF DD DSN=SYS1.RACFDS,DISP=SHR
//SYSUT1 DD UNIT=SYSDA,SPACE=(CYL,(20)),
// DCB=(LRECL=4096,RECFM=F),DSN=COLIN.RACFDS,DISP=(MOD,CATLG)
//SYSUT2 DD SYSOUT=A
//SYSPRINT DD SYSOUT=A
//SYSIN DD *
INDEX
MAP
END
/*

IRRUT200 is called a RACF database verification utility program (IRRUT200), and to make an exact copy of a RACF data set. It takes a lock on the RACF database for the duration of the copy to ensure the data is consistent.

You should check the size of the source dataset, and make the copy the same size.

RACF – I cannot delete a certificate!

I made a mistake creating a certificate, and could not delete it, because I got a message

IRRD109I The certificate cannot be added. Profile … is already defined.

In my JCL I had

RACDCERT GENCERT  -                                           
CERTAUTH -
SUBJECTSDN(CN('DocZosCADSA')-
O('COLIN') -
OU('CA')) -
NOTAFTER( DATE(2027-07-02 ))-
KEYUSAGE( CERTSIGN ) -
DSA
SIZE(1024) -
WITHLABEL('DocZosCADSA')

This created a certificate

I was missing the – after DSA, so the DSA, SIZE(1024) and importantly the WITHLABEL() was missing.

When I tried to recreate this certificate I got message

IRRD109I The certificate cannot be added. Profile 00.CN=DocZosCADSA.OU=CA.O=COLIN is already defined.

My problem was, what do I delete ?

The following command listed all of the certificate owned by certauth.

RACDCERT certauth LIST

I searched for CN=DocZosCADSA and it found. (Use the CN=… not the whole string)

Label: LABEL00000002 
...
Subject's Name:
>CN=DocZosCADSA.OU=CA.O=COLIN<
...

Note the label.

The command

RACDCERT CERTAUTH DELETE(LABEL('LABEL00000002'))

Deleted the certificate in error – problem solved.

Note: For a personal certificate the reported certificate was like

02AB.CN=SSCA256.OU=CA.O=SSS.C=GB

If you use the list command the certificate sequence number 02AB is on a different line to the remainder of the label.

Using RACF callable services including from a 64bit bit program

You can use RACF callable services to programatically get and set RACF information, for example to list and display digital certificates, and objects.

There is a C interface to these services. These interfaces are easy to use as long as you are careful with your data types, and get your compile JCL right. You can use 31 and 64 mode programs with these services.

JCL to compile a 64 bit program

Below is the JCL I use for compile programs which use gskit and RACF callable services.

//COLINC5    JOB 1,MSGCLASS=H,COND=(4,LE) 
//S1 JCLLIB ORDER=CBC.SCCNPRC
// SET LOADLIB=COLIN.LOAD
//*OMPILE EXEC PROC=EDCCB,
//COMPILE EXEC PROC=EDCQCB,
// LIBPRFX=CEE,
// CPARM='OPTFILE(DD:SYSOPTF),LSEARCH(/usr/include/)',
// BPARM='SIZE=(900K,124K),RENT,LIST,RMODE=ANY,AMODE=64,AC=1'
//COMPILE.SYSOPTF DD *
...
/*
//COMPILE.SYSIN DD DISP=SHR,DSN=COLIN.C.SOURCE(...)
//BIND.SYSLMOD DD DISP=SHR,DSN=COLIN.LOAD
//BIND.OBJLIB DD DISP=SHR,DSN=COLIN.OBJLIB
//BIND.GSK DD DISP=SHR,DSN=SYS1.SIEALNKE
//BIND.CSS DD DISP=SHR,DSN=SYS1.CSSLIB
//BIND.SYSIN DD *
INCLUDE GSK(GSKCMS64)
INCLUDE GSK(GSKSSL64)
INCLUDE CSS(IRRSDL64)

NAME AMSCHE64(R)

Note the 64 bit specific items

  • PROC=EDCQCB
  • RMODE=ANY,AMODE=64
  • The includes of the GSK*64 stubs
  • The include of the 64 bit RACF callable stub IRRSDL64

The 31 bit equivilants are

  • PROC=EDCCB
  • RMODE=ANY,AMODE=31
  • The includes of the GSK*31 stubs: GSKCMS31,GSKSSL
  • The include of the 31 bit RACF callable stub IRRSDL00

The source is specified via //COMPILE.SYSIN

SYSOPTF

For both 64 bit and 31 bit programs

//COMPILE.SYSOPTF DD * 
LIST,SOURCE
aggregate(offsethex) xref
SEARCH(//'COLIN.C.H',//'SYS1.SIEAHDR.H')
TEST
RENT LO
OE
INFO(PAR,USE)
NOMARGINS EXPMAC SHOWINC XREF
LANGLVL(EXTENDED) sscom dll
DEFINE(_ALL_SOURCE)
DEBUG

Skeleton of C program

#pragma linkage(IRRSFA64 ,OS) 

The pragma is needed for the bind operation. It says the module is a z/OS callable service type of module (and not a C program).

#ifdef _LP64 
#include <irrpcomy.h>
#else
#include <irrpcomx.h>
#endif

You need a different copy book for the RACF constants depending on the 31/64 bit mode.

IRRPCOMY contains definitions for 64 bit programs, IRRCOMX is for 31 bit programs.

char * workarea ; 
workarea = (char *) malloc(1024);
int ALET1= 0;
int parmAlet = 0;
int numParms =11;
short function_code = 1;
int ALET2= 0;
int ALET3= 0;
int SAF_RC = 0;
int RACF_RC = 0;
int RACF_RS = 0;

The variables have to be “int”, not “long”, as they are 4 bytes long. With 64 bit program, a long is 8 bytes long. See here for a table about the types and lengths in 31 bit and 64 bit programs. A short is 2 bytes long.

Set up the parameter list 

The macro IRRPCOM? provides header files for some definitions.

For example

char * pSTC = "AZFTOTP1"; 
char area[1000];

struct fact_getf_plist pl;
pl.fact_getf_options = 0;
pl.fact_getf_factor_length = 8;
pl.fact_getf_factor_a = pSTC;
pl.irrpcomy_dummy_34 = 0;
pl.fact_getf_af_length = sizeof(area);
pl.fact_getf_af_a = & area;

where pl is used below.

Call the function

 IRRSFA64( workarea, // WORKAREA 
&ALET1 , // ALET
&SAF_RC, // SAF RC
&ALET2, // ALET
&RACF_RC,// RACF RC
&ALET3 , // ALET
&RACF_RS,// RACF Reason
&numParms,
&parmAlet, //
&function_code,
&pl );

The irrpcomx has a structure definition for the parameter list, but I could not get it to work in these programs, as it passes the address of the data, instead of the data itself.

How to take (and process) a RACF GTF trace with Java

When trying to resolve a certificate problem in a Java program, see here, I tried unsuccessfully to take a RACF trace to see what calls were being issued, and what reason codes were being returned.

The RACF GTF had no entries for the Java program!

Start RACF trace

My started task was called OZUSRV4. I had to specify a jobname to RACF trace of OZUSRV4* because Java spawns address spaces, and it was a spawned address space that did all of the Java work. If your started task is 8 characters long – just specify the 8 character name.

The trace command was the RACF SET TRACE command, where # is my RACF subsystem recognition character.

#SET TRACE(CALLABLE(TYPE(41))JOBNAME(OZUSVR4*))

Where type(41) is for IRRSDL00 which performs the R_datalib, keyring processing.

Start GTF

S GTF.GTF
R 1,trace=usrp
R 2,USR=(F44) 
R 3,END
R 4,U 

Run the test

I ran my started task, and stopped the RACF trace

#SET TRACE(CALLABLE(NONE))JOBNAME(OZUSVR4*)) 
#set list

The output of the #set list command included

TRACE OPTIONS                   - NOIMAGE                                    
                                - NOAPPC                                     
                                - NOSYSTEMSSL                                
                                - NORRSF                                     
                                - NORACROUTE                                 
                                - NOCALLABLE                                 
                                - NOPDCALLABLE                               
                                - NODATABASE                                 
                                - NOGENERICANCHOR                            
                                - NOASID                                     
                                - JOBNAME                                    
                                   OZUSVR4*                                  
                                - NOCLASS                                    
                                - NOUSERID                                   
SUBSYSTEM USERID                - START1                                     

So the traces are off…. but it still has a reference to OZUSVR4 – strange.

Process the GTF file.

I used IPCS to look at the GTF file

  • =0 and specify the GTF file name
  • =6 dropd to drop any saved status from last time that dataset was used
  • gtf usr(all) It displays the output in an editor like window.
  • report view displays it in ISPF editor, view mod.
  • You can the do things like
    • x all
    • f ‘RACF Reason code’ all

To display the records with non zero return codes.

The output is very chatty – and it was hard to find the data I wanted from data with a hex dump of the string “OFFSET” etc. For example

Trace Identifier:             00000036                           
Record Eyecatcher:            RTRACE                             
Trace Type:                   OMVSPRE                            
Ending Sequence:              ........                           
Calling address:              00000000  79403A2D                 
Requestor/Subsystem:          ........  ........                 
Primary jobname:              OZUSVR44                           
Primary asid:                 00000035                           
Primary ACEEP:                00000000  008FC8A0                 
Home jobname:                 OZUSVR44                           
Home asid:                    00000035                           
Home ACEEP:                   00000000  008FC8A0                 
Task address:                 00000000  008CF298                 
Task ACEEP:                   00000000  00000000                 
Time:                         DDD4C11D  776E2A40                 
Error class:                  ........                           
Service number:               00000029                           
RACF Return code:             00000000                           
RACF Reason code:             00000000                           
Return area address:          00000000  00000000                 
Parameter count:              0000002B    
...                       
Area length:                  00000008                                                                                
                                                                                                                      
Area value:                                                                                                  
D6C6C6E2  C5E30050                               | OFFSET.&                         |  
                                                                                                                      
Area length:                  00000007                                                                                
                                                                                                                      
Area value:                                                                                                           
06E2E3C1  D9E3F1                                 | .START1                          |  

I wrote a REXX exec which post processes the output and removes what I think is irrelevant data.

An example of what I think is useful is below. Non zero return codes have ! in column 1

! Return code: 00000008 8 
! Reason code: 00000004 4  4 Parameter list error occurred. 
-  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  - 
! Return code: 00000008 8 
! Reason code: 0000002C 44 44 No certificate found with the specified status 
-  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  - 
Area value: 
00000050  10AFC67C  ...
...
  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  - 
Area value:          | .START1                          | 
06E2E3C1  D9E3F1                                                
Area value:          | .MQRING                          | 
06D4D8D9  C9D5C7                                                

You can download the rexx exec from

You need to upload it to a CLIST available to ISPF.

Solving certificate problems in Java on z/OS

I spent many any hour trying to understand why z/OSMF was getting a message saying certificate not found in keyring, when it was always there when I checked it.

I tried Java trace options but they did not help. I have my own Java program, and that gave me a message from IRRSDL00 (the callable service to access keyrings). But when I did a RACF GTF trace to get see what was going on I got no entries in the trace. Weird. Once I solved the problems, the solution was obvious.

My Java program reported

java.io.IOException: The private key of NEWTECCTEST is not available or no authority to access the private key

z/OSMF report

[ERROR ] CWPKI0024E: The NISTECCTEST certificate alias specified by the attribute serverKeyAlias is either not found in KeyStore safkeyring://START1/MQRING or it is invalid.

The problem and the solution

The message The private key … is not available or no authority to access the private key. Has a hint as to the problem. The documentation is hidden away. It was not as bad as

It was on display in the bottom of a locked filing cabinet stuck in a disused lavatory with a sign on the door saying ‘Beware of the Leopard.”

but it is not easy to find. It says

Applications can call the R_datalib callable service (IRRSDL00) to extract the private keys from certain certificates after they have access to the key ring. A private key is returned only when the following conditions are met:

  1. For RACF real key rings:
    • User certificates An application can extract the private key from a user certificate if the following conditions are met:
      • The certificate is connected to the key ring with the PERSONAL usage option.
      • One of the following two conditions is true:
        • The caller’s user ID is the user ID associated with the certificate if the access to the key ring is through the checking on IRR.DIGTCERT.LISTRING in the FACILITY CLASS, or
        • The caller’s user ID has READ or UPDATE authority to the <ringOwner>.<ringName>.LST resource in the RDATALIB class. READ access enables retrieving one’s own private key, UPDATE access enables retrieving other’s.

I had a keyring START1.MQRING and the start task userid had read access to it. Within the keyring was the certificate NISTECCTEST owner by userid START1. The started task userid needs UPDATE access to the keyring to be able to access the private key belonging to a different userid.

Reasons for “not found” reason code

Under the covers the callable server IRRSDL00 is called. The reason code are documented here. You might get SAF return code 8, RACF return code 8, RACF reason code 44.

  • The certificate was not in the keyring
  • It was NOTRUST
  • It had expired
  • The CA for the certificate was not in the keyring,
  • The userid did not have update access to the keyring when there are private certificates from other userids.

Certificate dates with ICSF and PKISERV

Digging into certificates provided by ICSF, I got confused with the dates.

There are 3 places that start/end dates can be, and they have different meanings and uses.

  1. The validity period you see in the certificate is part of the certificate itself. That is added at certificate creation and enforced by the application (such as System SSL).
  2. Within ICSF there are fields START DATE/END DATE you see in the panels are CKA_START_DATE and CKA_END_DATE. They are defined in the PKCS#11 standards but are not enforced.
  3. Within record metadata for a KDSR format record, you will see Cryptoperiod start date/Cryptoperiod end date. This is enforced by ICSF. Usage outside this time frame is not permitted.

Note that PKCS#11 services are the only place you can see all three of these. Neither CKDS nor PKDS can hold certificates, nor do they support PKCS#11 attributes.

Official document

The standards document PKCS #11 Cryptographic Token Interface Base Specification says

CKA_START_DATE – Start date for the certificate (default empty)
CKA_END_DATE – End date for the certificate (default empty

Section 4.6.2 (Certificate objects Overview):
The CKA_START_DATE and CKA_END_DATE attributes are for reference only; Cryptoki does not attach any special meaning to them. When present, the application is responsible to set them to values that match the certificate’s encoded “not before” and “not after” fields (if any).

Section 4.7.2 (Key Objects Overview) has similar wording:
Note that the CKA_START_DATE and CKA_END_DATE attributes are for reference only; Cryptoki does not attach any special meaning to them. In particular, it does not restrict usage of a key

Thanks to Eric Rossman, for helping me understand this.