What’s the best way of connecting to an HTTPS server. Pass ticket or JWT?

This blog post was written as background to some blog posts on Zowe API-ML. It provides back ground knowledge for HTTPS servers running on z/OS, and I think it is useful on its own. Ive written about an MQWEB server – because I have configured this on my system.

The problem

I want to manage my z/OS queue manager from my Linux machine.I have several ways of doing it.

Which architecture?

  • Use an MQ client. Establish a client connect to the CHINIT, and use MQPUT and MQGET administration messages to the queue manager.
    • You can issue a command string, and get back a response string which you then have to parse
    • You can issue an MQINQ API request to programmatically query attributes, and get the values back in fields. No parsing, but you have to write a program to do the work.
  • Use the REST API. This is an HTTP request in a standard format into the MQWEB server.
    • You can issue a command string, and get back a response string which you then have to parse to extract the values.
    • You can issue a JSON object where the request is encoded in a URL, and get the response back in JSON format. It is trivial to extract individual fields from the returned data.

Connecting to the MQWEB server

If you use REST (over HTTPS) there are several ways of doing this

  • You can connect using userid and password. It may be OK to enter your password when you are at the keyboard, but not if you are using scripts and you may be away from your keyboard. If hackers get hold of the password, they have weeks to use it, before the password expires. You want to give your password once per session, not for every request.
  • You can connect using certificates, without specifying userid and password.
    • It needs a bit of set up at the server to map your certificate to a userid.
    • It takes some work to set up how to revoke your access, if you leave the company, or the certificate is compromised.
    • Your private key could be copied and used by hackers. There is discussion about reducing the validity period from over a year to 47 days. For some people this is still too long! You can have your private certificate on a dongle which you have to present when connecting to a back end. This reduces the risk of hackers using your private key.
  • You can connect with a both certificate and userid and password. The certificate is used to establish the TLS session, and the userid and password are used to logon to the application.
  • You can use a pass ticket. You issue a z/OS service which, if authorised, generates a one time password valid for 10 minutes or less. If hackers get hold of the pass ticket, they do not have long to be able to exploit it. The application generating the pass ticket, does not need the password of the userid, because the application has been set up as trusted.
  • You can use a JSON Web Token (JWT). This has some similarities with certificates. In the payload is a userid value and issuer value . I think of issuer as the domain the JWT has come from – it could be TEST or a company name. From the issuer value, and IP address range, you configure the server to specify a realm value. From the userid and realm you can map this to a userid on the server. This JWT can be valid from minutes to many hours (but under a day). The userid and realm mapping to a userid is different to certificate mapping to a userid.

Setting up a pass ticket

The passticket is used within the sysplex. It cannot be used outside of a sysplex. The pass ticket is a password – so needs to be validated against the RACF database.

The application that generates the pass ticket must be authorised to a profile for the application. For example, define the profile for the application TSO on system S0W1, the profile is TSOS0W1.

 RDEFINE PTKTDATA TSOS0W1 

and a profile to allow a userid to create a pass ticket for the application

RDEFINE PTKTDATA   IRRPTAUTH.TSOS0W1.*  UACC(NONE) 

PERMIT IRRPTAUTH.TSOS0W1.* CLASS(PTKTDATA) ID(COLIN) ACCESS(UPDATE)
PERMIT IRRPTAUTH.TSOS0W1.* CLASS(PTKTDATA) ID(IBMUSER)ACCESS(UPDATE)

Userids COLIN and IBMUSER can issue the callable service IRRSPK00 to generate a pass ticket for a user for the application TSOS0W1.

The output is a one-use password which has a validity of up to 10 minutes.

As an example, you could configure your MQWEB server to use profile name MQWEB, or CSQ9WEB.

How is it used

A typical scenario is for an application running on a work station to issue a request to an “application” on z/OS, like z/OSMF, to generate a pass ticket for a userid and application name.

The client on the work station then issues a request to the back end server, with the userid and pass ticket. If the back end server matches the application name then the pass ticket will be accepted as a password. The logon will fail if a different application is used, so a pass ticket for TSO cannot be used for MQWEB.
This is more secure than sending a userid and password up with every back end request, but there is additional work in creating the pass ticket, and two network flows.

This solution scales because very little work needs to be done on the work station, and there is some one-off work for the setup to generate the pass tickets.

JSON Web Tokens

See What are JSON Web Tokens and how do they work?

The JWT sent from the client has an expiry time. This can be from seconds to hours. I think it should be less than a day – perhaps a couple of hours at most. If a hacker has a copy of the JWT, they can use it until it expires.

The back end server needs to authenticate the token. It could do this by having a copy of the public certificate in the server’s keyring, or send a request down to the originator to validate it.

If validation is being done with public certificates, because the client’s private key is used to generate the JWT, the server needs a copy of the public certificate in the server’s keyring. This can make it hard to manage if there are many clients.

The Liberty web server has definitions like

<openidConnectClient id="RSCOOKIE" 
clientId="COLINCOO2"
realmName="zOSMF"
inboundPropagation="supported"
issuerIdentifier="zOSMF"
mapIdentityToRegistryUser="false"
signatureAlgorithm="RS384"
trustAliasName="CONN1.IZUDFLT"
trustStoreRef="defaultKeyStore"
userIdentifier="sub"
>
<authFilter id="afint">
<remoteAddress id="myAddress" ip="10.1.0.2" matchType="equals" />
</authFilter >

</openidConnectClient>

For this entry to be used various parameters need to match

  • The issuerIdentifier. This string identifies the client. It could be MEGABANK, TEST, or another string of your choice. It has to match what is in the JWT.
  • signatureAlgorithm. This matches the incoming JWT.
  • trustAliasName and trustStoreRef. These identify the certificate used to validate the certificate
  • remoteAddress. This is the address, or address range of the client’s IP addresses.

If you have 1000 client machines, you may need 1000 <openidConnectClient…/> definitions, because of the different certificate and IP addresses.

You may need 1000 entries in the RACMAP mapping of userid + realm to userid to be used on the server.

How is it used

You generate the JWT. There are different ways of doing this.

  • Use a service like z/OSMF
  • Use a service on your work station. I have used Python to do this. The program is 30 lines long and uses the Python jwt package

You get back a long string. You can see what is in the string by pasting the JWT in to jwt.io.
You pass this to the backend as a cookie. The cookie name depends on what the server is expecting. For example

'Authorization': "Bearer " + token

The JWT has limited access

For the server to use the JWT, it needs definitions to recognise it. If you have two back end servers

  • Both servers could be configured to accept the JWT
    • If the server specified a different REALM, then the mapped userid from the JWT could be different for each server because the userid/realm to userid mapping can be different.
  • One server is configured to accept the JWT
    • If only one server has the definitions for the JWT, then trying to use the JWT to logon to another server will fail.

RACF: Processing audit records

RACF can write to SMF data information about which userid logged on, what resources it accessed etc.. This can be used to check there are no unexpected accesses, and any violations are actioned.

The data tends to be “this userid had access to that resource”. It does not contain numeric values, such as response time.

Overview of SMF data

SMF data is a standard across z/OS. Each product has an SMF record type, and record subtypes are used to provide granularity within a product’s records. It is common for an SMF record to have sections within it. There may be 0 or more sections, and the sections can be of varying length. A SMF formatting program needs to build and report useful information from these sections.

There are many tools or products to process SMF records. Individual products may produce tools for formatting records, and there are external tools available to process the records.

Layout of RACF SMF records

The layout of the RACF SMF records are described in the publications. Record type 80: RACF processing record. It describes the field names, at which offsets, and how to interpret the data (what each bit means), this information is sufficient for someone to write a formatting program.


RACF also provides a formatter. The formatter runs as a SORT exit, and expands the data. For example in the SMF data is a bit saying a userid has the SPECIAL attribute. The formatter expands this and creates a column “SPECIAL” with the value YES or NO. This makes it easy to filter and display records, because you do not need to map bits to their meaning – the exit has done it for you. The layout of the expanded records is described here.

What tools format the records?

A common(free) tool for processing the records that RACF produces is an extension to DFSORT called ICETOOL. (The IBM sort modules all begin with ICE… so calling it ICETOOL was natural).

With ICETOOL you can say include rows where…., display and format these fields, count the occurrences of this field, and add page titles. You can quickly generate tabular reports.

The output file of the RACF exit has different format records mixed up. You need to filter by record type and display the subset of records you need.

JCL to extract the RACF SMF record and convert to the expanded format

//* DUMP THE SMF DATASETS 
// SET SMFPDS=SYS1.S0W1.MAN1
// SET SMFSDS=SYS1.S0W1.MAN2
//*
//SMFDUMP EXEC PGM=IFASMFDP,REGION=0M
//DUMPINA DD DSN=&SMFPDS,DISP=SHR,AMP=('BUFSP=65536')
//DUMPINB DD DSN=&SMFSDS,DISP=SHR,AMP=('BUFSP=65536')
//DUMPOUT DD DISP=(NEW,PASS),DSN=&RMF,SPACE=(CYL,(1,1))
//OUTDD DD DISP=(NEW,PASS),DSN=&OUTDD,
// SPACE=(CYL,(1,1)),DCB=(RECFM=VB,LRECL=12288)
//ADUPRINT DD SYSOUT=*
//*XMLFORM DD DSN=COLIN.XMLFORM,DISP=(MOD,CATLG),
//* SPACE=(CYL,(1,1)),DCB=(RECFM=VB,LRECL=12288)
//SYSPRINT DD SYSOUT=*

//SYSIN DD *
INDD(DUMPINA,OPTIONS(DUMP))
INDD(DUMPINB,OPTIONS(DUMP))
OUTDD(DUMPOUT, TYPE(30,80,81,83))
START(0000)
END(2359)
DATE(2025230,2025360)
ABEND(NORETRY)
USER2(IRRADU00)
USER3(IRRADU86)
/*

The RACF exits produce several files.

  • //OUTDD the expanded records are written to this dataset
  • //ADUPRINT contains information on how many of each record type the exit processed
  • //XMLFORM you can have it write data in XML format – for post processing

JCL to process the expanded records

The JCL below invokes the ICETOOL processing.

//S1      EXEC  PGM=ICETOOL,REGION=0M 
//DFSMSG DD SYSOUT=*
//TOOLMSG DD SYSOUT=*
//IN DD DISP=(SHR,PASS,DELETE),DSN=*.SMFDUMP.OUTDD
//TEMP DD DSN=&&TEMP3,DISP=(NEW,PASS),SPACE=(CYL,(1,1))
//PRINT DD SYSOUT=*

Where

  • //IN refers to the //OUTDD statement in the earlier step
  • //TEMP is an intermediate dataset. The sort program writes filtered records to this data set.
  • //PRINT is where the formatted output goes

ICETOOL Processing

The whole job is

//IBMJOBI  JOB 1,MSGCLASS=H RESTART=PRINT 
// JCLLIB ORDER=COLIN.RACF.ICETOOL
// INCLUDE MEMBER=RACFSMF
// INCLUDE MEMBER=PRINT
// INCLUDE MEMBER=ICETOOL
//TOOLIN DD *
COPY FROM(IN) TO(TEMP) USING(TEMP)
DISPLAY FROM(TEMP) LIST(PRINT) -
BLANK -
ON(5,8,CH) HEADER('EVENT') -
ON(63,8,CH) HEADER('USER ID') -
ON(14,8,CH) HEADER('RESULT') -
ON(23,8,CH) HEADER('TIME') -
ON(175,8,CH) HEADER('TERMINAL') -
ON(184,8,CH) HEADER('JOBNAME') -
ON(286,8,CH) HEADER('APPL ')
//TEMPCNTL DD *
INCLUDE COND=(5,8,CH,EQ,C'JOBINIT ')
OPTION VLSHRT
//

You have to be careful about the offsets. The record has a 4 byte length field on the front of each record. So the field in the layout of the expanded records described here is column 1 for length 8, in the JCL you specify column 5 of length 8. In the documentation the userid is columns 59 of length 8, in the JCL it is ON(63,8,CH).

The processing is ….

  • Copy the data from the dataset in //IN and copy it to the dataset in //TEMP. Using the sort instructions in TEMPCNTL. You take name name in USING(TEMP) and put CNTL on the end to locate the DDname.
  • The sort instructions say include only those records where columns 5 of length 8 of the record are the string ‘JOBINIT ‘ ( so columns 1 for length 8 in the mapping description).
  • The DISPLAY step copies record from the //TEMP dataset to the //PRINT DDNAME.
  • The ON() selects the data from the record, giving start column, length and formatting. For each field, it uses the specified column heading.

The output

In the //PRINT is

EVENT      USER ID    RESULT     TIME       TERMINAL   JOBNAME    APPL    
-------- -------- -------- -------- -------- -------- --------
JOBINIT START1 SUCCESS 09:54:11 SMFCLEAR
JOBINIT START1 TERM 09:54:20 SMFCLEAR
JOBINIT START1 TERM 09:55:13 CSQ9WEB
JOBINIT IBMUSER SUCCESS 10:12:14 LCL702 IBMUSER
JOBINIT IBMUSER SUCCESS 10:14:18 IBMJOBI
JOBINIT IBMUSER TERM 10:14:18 IBMJOBI
JOBINIT IBMUSER SUCCESS 10:21:39 IBMACCES
JOBINIT IBMUSER TERM 10:21:40 IBMACCES
JOBINIT IBMUSER SUCCESS 10:22:10 IBMACCES
JOBINIT IBMUSER TERM 10:22:11 IBMACCES
JOBINIT IBMUSER SUCCESS 10:23:01 IBMPASST
JOBINIT IBMUSER TERM 10:23:05 IBMPASST

Extending this

Knowing the format of the RACF extend record, you can add more fields to the reports.

You can filter which records you want. For example all records for userid START1. You can link filters with AND and OR statements.

Of course – JCL subroutines is the answer

I was processing RACF SMF records to report how clients were logging into MQ. This as a multi step job, and with each report I added, the JCL got more and more messy.
The requirements were simple

  • JCL to copy the dump the SMF data sets to a temporary data set
  • Run a tool against this data set to product the reports
    • There were reports for logon and logoff, and pass tickets, and access to profiles and…
  • I wanted it to be easy to use – and the JCL to fit on one screen!

All of this was easy except my JCL file got bigger with every report I wanted, and I spent a lot of time scrolling up and down, and changing the wrong file!

The solution was to use JCL subroutines – or INCLUDE JCL.

Examples

JCL to process the SMF data sets

You do not need to know what the JCL does – but you need to know it was in COLIN.JCL(RACFSMF)

//* DUMP THE SMF DATASETS 
// SET SMFPDS=SYS1.S0W1.MAN1
// SET SMFSDS=SYS1.S0W1.MAN3
//*
//SMFDUMP EXEC PGM=IFASMFDP,REGION=0M
//DUMPINA DD DSN=&SMFPDS,DISP=SHR,AMP=('BUFSP=65536')
//DUMPINB DD DSN=&SMFSDS,DISP=SHR,AMP=('BUFSP=65536')
//DUMPOUT DD DISP=(NEW,PASS),DSN=&RMF,SPACE=(CYL,(1,1))
//OUTDD DD DISP=(NEW,PASS),DSN=&OUTDD,
// SPACE=(CYL,(1,1)),DCB=(RECFM=VB,LRECL=12288)
//ADUPRINT DD SYSOUT=*
//SYSPRINT DD SYSOUT=*
//SYSIN DD *
//SYSIN DD *
INDD(DUMPINA,OPTIONS(DUMP))
INDD(DUMPINB,OPTIONS(DUMP))
OUTDD(DUMPOUT, TYPE(30,80,81,83))
START(1040)
END(2359)
DATE(2025229,2025360)
ABEND(NORETRY)
USER2(IRRADU00)
USER3(IRRADU86)

/*

Use it

//IBMJOBI  JOB 1,MSGCLASS=H RESTART=PRINT 
// JCLLIB ORDER=COLIN.JCL
// INCLUDE MEMBER=RACFSMF

//S1 EXEC PGM=ICETOOL,REGION=1024K
//DFSMSG DD SYSOUT=*
//TOOLMSG DD SYSOUT=*
//IN DD DISP=(SHR,PASS,DELETE),DSN=*.SMFDUMP.OUTDD
//JOBI DD DSN=&&TEMPJOBI,DISP=(NEW,PASS),SPACE=(CYL,(1,1))
//PJOBI DD SYSOUT=*
//TOOLIN DD *
COPY FROM(IN) TO(JOBI) USING(JOBI)
DISPLAY FROM(JOBI) LIST(PJOBI) -
...
//JOBICNTL DD *
INCLUDE COND=(5,8,CH,EQ,C'JOBINIT ')
//

The clever bits are the JCLLIB which gives the JCL library, and the INCLUDE MEMBER=RACFSMF which copies in the JCL.

To use the JOBI content, I needed to specify JOBI, PJOBI and JOBICTL, and similarly for each data component. 10 components meant 30 data sets – all with similar content and names, this lead to a mess of JCL.

Going further, I could use a template with the same data set names, (TEMP, PRINT etc) and just change the content.

I coverted the above JCL to create a member ICETOOL

//S1      EXEC  PGM=ICETOOL,REGION=1024K 
//DFSMSG DD SYSOUT=*
//TOOLMSG DD SYSOUT=*
//IN DD DISP=(SHR,PASS,DELETE),DSN=*.SMFDUMP.OUTDD
//TEMP DD DSN=&&TEMP3,DISP=(NEW,PASS),SPACE=(CYL,(1,1))
//PRINT DD SYSOUT=*

and use it with

//IBMJOBI  JOB 1,MSGCLASS=H RESTART=PRINT 
// JCLLIB ORDER=COLIN.JCL
// INCLUDE MEMBER=RACFSMF
//* INCLUDE MEMBER=PRINT
// INCLUDE MEMBER=ICETOOL

//TOOLIN DD *
COPY FROM(IN) TO(TEMP) USING(TEMP)
DISPLAY FROM(TEMP) LIST(PRINT) -

...
//TEMPCNTL DD *
INCLUDE COND=(5,8,CH,EQ,C'JOBINIT ')
//

Where I just had to change the data in italics – and not the boiler plate.

For each RACF record type, I had a different JCL member, based on the above file.
To select SMF records with a date and time range, I just edited member RACFSMF, and submitted the jobs, and they all used it.

This was easy to do and it let me focus on the problem – rather than on the JCL.

Why did my certificate mapping go wrong?

I had a working mapping for a Linux generated certificate to a z/OS userid. And then it wasn’t working. It took me 2 days before I had enlightenment. Although I had undone all of the changes I had made – well all but one.

I had defined

//IBMRACF  JOB 1,MSGCLASS=H 
//S1 EXEC PGM=IKJEFT01,REGION=0M
//SYSPRINT DD SYSOUT=*
//SYSTSPRT DD SYSOUT=*
//SYSTSIN DD *
RACDCERT DELMAP(LABEL('colinpaice'))ID(IBMUSER)
RACDCERT MAP ID(IBMUSER) -
WITHLABEL('colinpaice') -
SDNFILTER('CN=colinpaice.O=cpwebuser.C=GB')
SETROPTS RACLIST(DIGTNMAP, DIGTCRIT) REFRESH
racdcert listMAP id(IBMUSER)
/*

Which says it the certificate with Subject: C = GB, O = cpwebuser, CN = colinpaice come in, then it maps to IBMUSER. Yes, the terms are in a different order, and there are “.” instead of “.” but it worked.

I started working with JSON Web Tokens (JWT), and it stopped working. The userid was coming out as IZUSVR – which is the userid of z/OSMF. I struggled with traces, and wrote my own little program to map the certificate to a userid – but still it was IZUSVR.

The enlightenment.

With JWT they are signed by a private key, and the public key is used to check the signature (that is check the checksum of the data is valid). To do this, the keyring needs the certificate in the keyring.
I was lazy and used the same certificate to sign the JWT, as I used to do certificate logon to z/OSMF.

To put the certificate in the keyring you need to import the certificate. I copied the certificate from Linux, using cut and paste and imported it

I used

//IBMRACF2 JOB 1,MSGCLASS=H 
//S1 EXEC PGM=IKJEFT01,REGION=0M
//SYSPRINT DD SYSOUT=*
//SYSTSPRT DD SYSOUT=*
//SYSTSIN DD *
RACDCERT CHECKCERT('COLIN.COLIN.PAICE.PEM')
RACDCERT DELETE (LABEL('COLINPAICE')) ID(IZUSVR)
RACDCERT ADD('COLIN.COLIN.PAICE.PEM') -
ID(IZUSVR) WITHLABEL('COLINPAICE') TRUST


RACDCERT ID(IZUSVR) CONNECT(RING(CCPKeyring.IZUDFLT) -
USAGE(CERTAUTH) -
LABEL('COLINPAICE') -
id(IZUSVR))

SETROPTS RACLIST(DIGTCERT,DIGTRING ) refresh
/*

This imports the certificate and associates it with the specified userid, ID(IZUSVR).
Now, when the certificate arrives as part of the certificate logon to z/OSMF, it checks to see if it is in the RACF data base – yes it is – under userid IZUSVR. It does not use the RACDCERT MAP option.

I reran this job with userid ADCDB – and the JWT had ADCDB in the definition.

To make it more complex, the Liberty Web Server within z/OSMF caches some information, and this complicated the diagnosis. In the evening it worked – next morning after IPL – it didn’t!

Lesson learned

Use one certificate for certificate logon, and another certificate for JWT.

MQWEB and passtickets

The RACF PassTicket is a (one-time-only/short duration) password that is generated by a requesting product or function. It is an alternative to the RACF password.
You create a passticket specifying the userid and the application, and a one off password is generated. You can specify a validity period.

By default the passticket has replay protection – in that once used, the passticket cannot be used again, and so prevent replay. You can allow a passticket to be used more than once either by specifying APPLDATA(‘NO REPLAY PROTECTION’) for basic pass tickets, or REPLAY(YES) for enhanced pass tickets.

The server can use the function __login__applid() (or similar function) to run a thread as the specified userid. You pass the userid, password (pass ticket) and the application to use.

The MQWeb server is code running on top of Liberty Web server.

For my MQWeb server, running as started task CSQ9WEB, it was configured so my mqweb/mqwebuser.xml configuration file had <safCredentials profilePrefix=”MQWEB“…./>

I created a passticket for my userid COLIN, and application MQWEB, and I was able to logon to the the MQWEB server using userid COLIN and with the pass ticket as my password.

Creating and using pass tickets on z/OS.

As part of looking into secure way of logging on to z/OS, I looked into pass tickets (because Zowe can generate a pass ticket to connect to other sub-systems). I set up the simplest (and oldest) pass ticket configuration. The best practice is to use enhanced pass tickets, and store values encrypted. With enhanced pass tickets you can specify the validity period of the pass ticket – it defaults to 10 minutes. I wanted the easiest way, so I used the older technique.

With thanks to Philippe Richard for his many comments, I’ve incorporated them in the post.

I had the usual struggles with getting the C program to work, but overall it was quite easy.

I successfully used the RACF callable function IRRSPK00 R_ticketserv (IRRSPK00).

You pass a userid and an application name and the service returns a temporary, time limited, password for that userid and application.

The application name depends on what system you are logging on to. I submitted a job from TSO on system with SYSID S0W1, and the application name was TSOS0W1. You cannot use a pass ticket for TSO on a CICS system, because the application names will not match.

When you are under TSO and enter submit the application is still TSO, so use TSOS0W1.

If, for instance, you try to submit a job through the internal reader, then it will use application MVSS0W1.

For example:

//INTRDRS1 EXEC PGM=IEBGENER 
//SYSUT1 DD DSN=PASSTIKT.ENHC.JCL(REFRESH),
// DISP=SHR
//SYSUT2 DD SYSOUT=(,INTRDR)
//*
//SYSPRINT DD SYSOUT=*
//SYSIN DD DUMMY

and in member REFRESH, you have a job with userid=racf admin user, password= where you substitute the passticket, like:

//SYSADMX JOB 30000000,’MVS JOB CARD ‘,MSGLEVEL=(1,1), 
// CLASS=A,MSGCLASS=Q,NOTIFY=&SYSUID,TIME=1440,REGION=0M,
// USER=SYSADM,PASSWORD=PSEG7TXM,
// JOBRC=MAXRC
//IEFPROC EXEC PGM=IKJEFT01,REGION=4M,DYNAMNBR=10
//SYSTSPRT DD SYSOUT=*
//SYSTSIN DD *
SETROPTS RACLIST(PTKTDATA) REFRESH

It will use MVSS0W1 as the application ID.

Andrew Mattingly has written a very well detailed blog on passtickets which is well worth a read.

It described with ample details the algorithm and the various techniques to generate pass-tickets.

Security definitions

The security definitions are in two parts

  • The profile for using a pass ticket, for example, who can use a ticket for logging on to TSO,
  • The profile for which userids can create a pass ticket.

Who can use a pass ticket with which application

You can limit who can use the application, for example

  • a profile just TSOS0W1,
  • or members of group SYS1 profile TSOS0W1.SYS1,
  • or a userid COLIN in group SYS1, profile TSOS0W1.SYS1.COLIN

Example definitions for TSOS0W1 profile

RDEFINE PTKTDATA TSOS0W1  SSIGNON(KEYMASKED(7E4304D681920260)) - 
APPLDATA('NO REPLAY PROTECTION')

SETROPTS RACLIST(FACILITY,PTKTDATA) REFRESH

The server, TSO in this case, can use the function __login__applid() to run the thread as the specified userid. You pass the userid, password (pass ticket) and the application to use (TSOS0W1).

Who can define which pass tickets?

You have to define a RACF profile for the application name, and a profile for userids than can generate a pass ticket for that application.

RDEFINE PTKTDATA   IRRPTAUTH.TSOS0W1.*  UACC(NONE)
PERMIT IRRPTAUTH.TSOS0W1.* CLASS(PTKTDATA) ID(COLIN) ACCESS(UPDATE)
PERMIT IRRPTAUTH.TSOS0W1.* CLASS(PTKTDATA) ID(IBMUSER)ACCESS(UPDATE)
SETROPTS RACLIST(PTKTDATA) REFRESH

The above statements define a profile for defining pass ticket with the TSOS0W1 application.

Userids COLIN and IBMUSER can define pass tickets for this application.

What can you use to generate a pass ticket?

My application code

See C calling a function setting the high order bit on, and passing parameters for a discussion about calling the callable service, and passing the parameters.

 //   Code to generate a pass ticket 
#pragma linkage(IRRSPK00 ,OS)
#pragma runopts(POSIX(ON))
/*Include standard libraries */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdarg.h>
#include <iconv.h>

int main( int argc, char *argv??(??))
{
if (argc != 3)
{
printf("Syntax is %s userid applid\n",argv[0]);
return 12 ;
}
if (strlen(argv[1]) > 8)
{
printf("length of userid must be <= 8\n");
return 12;
}
if (strlen(argv[2]) > 8)
{
printf("length of applid must be <= 8\n");
return 12;
}

char work_area[1024];
int Option_word = 0;
int rc;
long SAF_RC,RACF_RC,RACF_RS;
SAF_RC=0 ;
long ALET = 0;
short Function_code= 3;
struct {
short length;
char value[8];
} appl;
struct {
short length;
char value[8];
} userid;
struct {
short length;
char value[20];
} ticket;
ticket.length=20;
char * u= argv[1] ;
strncpy(&userid.value[0],u,8);
userid.length =strlen(u);
char * pAppl = argv[2];
strncpy(&appl.value[0],pAppl,8);
appl.length =strlen(pAppl);


int Ticket_options = 1;
int * pTO = & Ticket_options;

rc=IRRSPK00(
&work_area,
&ALET , &SAF_RC,
&ALET , &RACF_RC,
&ALET , &RACF_RC,
&ALET , &RACF_RS,
&ALET ,&Function_code,
&Option_word,
&ticket, // length followed by area
&pTO,
&userid,
&appl
);

printf("return code SAF %d RACF %d RS %d \n",
SAF_RC,RACF_RC,RACF_RS );
if (SAF_RC == 0)
{
int l = ticket.length;
printf("Pass ticket:%*.*s\n",l,l,ticket.value);
}
return SAF_RC;

}

The compile JCL was

//IBMPASST   JOB 1,MSGCLASS=H,COND=(4,LE) 
//S1 JCLLIB ORDER=CBC.SCCNPRC
// SET LOADLIB=COLIN.LOAD
//DOCLG EXEC PROC=EDCCB,INFILE='COLIN.C.SOURCE(TICKET)',
// CPARM='OPTF(DD:COPTS)'
//COMPILE.ASMLIB DD DISP=SHR,DSN=SYS1.MACLIB
//COMPILE.COPTS DD *
LIST,SOURCE
aggregate(offsethex) xref
SEARCH(//'ADCD.C.H',//'SYS1.SIEAHDR.H')
TEST
ASM
RENT ILP32 LO
OE
NOMARGINS EXPMAC SHOWINC XREF
LANGLVL(EXTENDED) sscom dll
DEFINE(_ALL_SOURCE)
DEBUG
/*
//BIND.SYSLMOD DD DISP=SHR,DSN=&LOADLIB.
//*IND.SYSLIB DD DISP=SHR,DSN=&LIBPRFX..SCEELKED
//*IND.OBJLIB DD DISP=SHR,DSN=COLIN.OBJLIB
//BIND.CSS DD DISP=SHR,DSN=SYS1.CSSLIB
//BIND.SYSIN DD *
INCLUDE CSS(IRRSPK00)
NAME TICKET(R)
/*
//START1 EXEC PGM=TICKET,REGION=0M,PARM='ADCDB TSOS0W1'
//STEPLIB DD DISP=SHR,DSN=&LOADLIB
//SYSERR DD SYSOUT=*,DCB=(LRECL=200)
//SYSERROR DD SYSOUT=*,DCB=(LRECL=200)
//SYSOUT DD SYSOUT=*,DCB=(LRECL=200)
//SYSPRINT DD SYSOUT=*,DCB=(LRECL=200)
//CEEDUMP DD SYSOUT=*,DCB=(LRECL=200)
/&

Problems

I could not get R_GenSec (IRRSGS00 or IRRSGS64): Generic security API interface RACF callable services to work because of the 31 bit program, and the service expecting 64 bit addresses.

This blog post has code which uses R_GenSec in 64 bit C.

Mapping a certificate to a userid and so avoid needing a password is good – but…

You can use the RACDCERT MAP command to map a certificate to a userid, and so avoid the need for specifying a password. Under the covers code uses the pthread_security_np and pass a certificate, or a userid and password, and if validated, the thread becomes that userid, just the same as if the userid was logged on.

Is this secure?

If you store a userid and password on your laptop, even though the data may be “protected” someone who has access to your machine may be able to copy the file and so impersonate you.

With a public certificate and private key, if someone can access your machine, they may be able to copy these files and so impersonate you.

You can get dongles which you plug into your laptop on which you can store protected data. In order to use the data, you need the physical device.

You need to protect the RACF command

Because the RACFCERT command has the power to be dangerous, you need to protect it.

You do not want someone to specify their certificate maps to a powerful userid, such as SYS1. The documentation says

To issue the RACDCERT MAP command, you must have the SPECIAL attribute or sufficient authority to the IRR.DIGTCERT.MAP resource in the FACILITY class for your intended purpose.

For a general user to create a mapping associated with their own user ID they need READ access to IRR.DIGTCERT.MAP.

For a general user to create a mapping associated with another user ID or MULTIID, they need need UPDATE access to IRR.DIGTCERT.MAP.

What’s the best way to set this up?

I think that as part of your process for setting up userids, the process should create the mapping for the certificate to a userid. This way you do not have people creating the mapping. If a mapping already exists, you cannot create another mapping.

You may want an automated process which checks the approval, and issues the commands, and so you do not have humans with the authority to issue the commands.

Of course you’ll have a break-glass all powerful userid in case of emergencies.

But….


Even though the password had expired, I could logon using the certificate. If I revoked the userid the logon failed.

I used certificate logon from z/OSMF and issued console commands. The starts a TSO address space, and z/OSMF passes the commands and responses to the tso address space.

Once a TSO address space has been started, there are no more checks to see if the userid is still valid.

If you want to inactivate the userid, you’ll need to revoke it, and then cancel all the TSO address spaces running on behalf of the userid. Walking someone off site is not good enough. There may be scripts which are automated, and will logon with no human intervention.
TSO address spaces may be configured to be cancelled if there is no activity. If the TSO address space is kept busy, (for example by sending it requests) it may never be forced off.

Giving a started task userid a password should be a sackable offence.

Last week I was going through some product documentation, and I got to the part where it said “Now change the started task userid, and give it a password”. I made a note to raise a defect on this, because this is a no-no.
This week someone asked on IBM-MAIN, the impact of giving a started task userid a password. There were many well informed comments covering things I didn’t know about, so I thought I’d make a blog post of of the comments.

Best practices dictate that passwords only be provided when a userid will be used by a specific person and that person needs a password for that userid. All other userids should never have a password.

A userid can be revoked because of too many invalid password attempts, or revoked because of inactivity. If a userid is revoked it cannot logon. If the userid of a started task is revoked, the started task may start, but it may be restricted as to what it can do – because it cannot logon.

You can alter a userid to have NOPASSWORD (and have no PHRASE). This means started tasks can start, but the userid cannot be used to logon to the system. This is known as making the userid PROTECTED.

Often started task userids have special capabilities, such as running as different userids, being able to set security options, or modify system storage. This means you do not want your Help Desk staff from adding or resetting passwords for protected userid. Changes to these protected userids should be done from a secure, limited access userid.

If you think through the impact of using started tasks. It may be better for all routine production jobs to be run as started tasks. This has the advantage that there are no passwords involved, and you can use automation to issue the start command based on a timer.

You might have a CICS started task userid for all CICS regions or just for a subset of regions. You might have one started task userid for all started tasks, or a started task userid for each logical instance, eg CICS, MQ, DB2, Zowe, TCP/IP etc.

System jobs

System jobs should be run as started tasks, and the started tasks should be protected

Personal jobs

You can submit jobs from your userid, (and not specify a password) and the job will run under your userid.

You can put USER=name,PASSWORD=… on a job card, and if these validate the job will run with the specified userid. This is not a good idea, as the password may be visible in the dataset.

Departmental userids

You can put USER=name and omit the password, and use surrogate checking. The documentation says

You can allow the use of surrogate users. A surrogate user is a RACF-defined user who has been authorized to submit jobs on behalf of another user (the execution user) without specifying the execution user’s password. Jobs submitted by a surrogate user run with the identity of the execution user. For example, if user JOE submits a job with the following JOB statement, JOE is the surrogate user and TOM is the execution user:

JOE can submit userid containing

//jobname JOB 'accounting-information',USER=TOM

You set up a security profile (see the documentation) to control which userid can specify a userid on the JOB USER= statement.

All access checks are done with TOM’s user ID.

The TOM userid can be a protected userid – without a password, if surrogates are used.

To set up surrogates, defined the profile, and give a group access to the profile, rather than give userids access. You are likely to have a group already defined. Administration, such as when someone leaves the department, is much easier, as you just need to remove the persons userid from the group.

Thanks to

Robert S. Hansel, Seymour J Metz, Steve Beaver, Jack Zukt, Mike Schwab, Jon Perryman for their comments.

Jump up and down: Do not give userids access to resources!

I am doing some work connecting subsystems together. The documentation for all the products involved, describe giving a userid permission to access to a resource.

This is not best practice. Like many things it will be obvious once you understand it. You need to look at the bigger pictures.
The documentation says for userids who want to use “this”, give them access to the profile “…”. What’s wrong with that?

  • If you have a department of 1000 people, giving them all access to the resource will be tedious
  • There are likely to be several resources people need access to, so connecting these 1000 userids to multiple resources will be even more tedious.
  • Someone joins your department – so you need to connect their userid to the long list of groups.
  • Someone leaves your department – you cannot trivially ask what resources can this userid access – you have to look at the access list for each resource and remove the userid from the group.

You most probably have groups set up already. Rather than give the userid access, give the group access to the resource.

If someone joins your group – you connect their userid to the groups – and they have access. If someone leaves your department, remove them from the groups – and they no longer have access to the resources.

You my want to support a new product… That’s easy – give the group(s) access to the resources – and the people will auto-magically get access.

As I said it is obvious once you see it.

Do not give userids access to resources – give groups access, and connect the userid to the group. I’ll go and raise documentation comments on the products’ documentation.

Zowe: Planning – certificates

A private key is used for encryption and should be kept private. If someone has access to the private key they can impersonate you.

A public key is the opposite of the private key. It is used for decrypting data encrypted with the private key. The public key can (and should) be generally available.

Certificates are used in TLS for authentication and encryption, and can be used for identification. They include a public key.

A Certificate Authority(CA) certificate is used to validate other certificates. It involves doing a checksum of a certificate, and encrypting it with the CA private key – a process known as signing the certificate

To validate a certificate, the recipient needs a copy of the CA which was used to sign the original certificate. It can the decrypt the encrypted checksum, and compare it with the certificates checksum.

If you create a Certificate Authority certificate you will need to distribute this to all machines that might communicate with your server(possibly thousands of machines), and installed into the keystore on those machines. This can be a big task. Most system have one site wide CA certificate which is distributed to all machines. You might have a second CA to limit access to a system. This CA is used to sign the server certificates.

Creating certificates

As part of creating a certificate you create the private/public keys. There are different algorithms for these keys. Some are stronger than others (it takes more time and CPU to break them) Keys with elliptic curve algorithms are generally stronger than using RSA techniques, and there other techniques resistant to Quantum Computing.

You have to use a server certificate with RSA and keysize 2048.

I found that authentication with JWT (Java Web Tokens) only worked with RSA keys and not Elliptic Curves. This is because of encryption with JWT.

Key stores

You should use keyrings rather than .pem files, as they are more secure. .pem files can be copied, and anyone with authority can copy them. You have to give explicit permission to be allowed to access a keyring. Certificate and the private keys within a keyring can be stored in cryptographic hardware, and the private keys are never exposed in clear text.

Many systems uses a keystore for storing the private key used by the server, and a trust store for the Certificate Authority keys needed to validate any client certificate sent to the server. This can, but is not recommended, be the same as the keystore.

The keystore will need the server key. You can specify which key should be used.

  • the Certificate Authority keys needed to validate the server key.
  • the server userid needs access to the keyring. If the private key belongs to the server’s userid, then the server’s userid needs read access to the keyring. If the private key belongs to a different userid, the server’s userid needs update access to the keyring. See here for more information.

The trust store needs all the CA certificates which may be needed to verify a client certificate.

The client machine will have one or more certificates. A copy of the CA used to create these needs to be installed in the trust store on the server.

I understand that if you add a certificate to the trust store, you need to restart Zowe for it to be picked up, so try to get a list of all the CAs you will need before you start.

Validating certificates

The trust store needs the certificates to validate any client certificates sent to the server. This will usually just be Certificate Authority certificates.

Zowe works with z/OSMF. They communicate with certificates. The z/OSMF trust store keyring needs the CA of the Zowe server certificate, and the Zowe trust store keyring needs the CA of the z/OSMF server key. If they are not set up properly you can get messages “LoadBalancer does not contain an instance for the service zaas”.

If you have a site wide CA, which is the same for both of the server keys, then you do not need to do any more work, as both trust store keyrings will already have the CA. If Zowe and z/OSMF have different CA certificates, the CA certificates need to be connected to the other keyrings.

Subject Alternative Name

The Subject Alternative Name within a certificate provides the IP addresses, or IP Names of the server. A client can check this address with the IP address of the session, and terminate the session if they do not match. This is considered best practice. This check can be disabled in Zowe by using verifyCertificates NOSTRICT in the zowe.yaml file.

RACF allows one name or address when the certificate is defined, for example ALTNAME(IP(127.0.0.1))

On my z/OS system the command TSO NETSTAT HOME gives three addresses 127.0.0.1 and 10.1.1.2 and 10.1.2.6.

You can configure your sysplex, so all systems in the sysplex have the same IP address, and traffic gets routed internally to the correct system. Without this, if you start a server on a different LPAR it will have a different IP address, and so the validation will fail.

On my system, the zOSMF certificate did not have an ALTNAME specified, and so failed the Zowe checks. I had to set the Zowe option verifyCertificates NOSTRICT for it to work, until I fixed the certificate.

If the z/OSMF certificate has an ALTNAME(IP…) specified, use the IP address value when you configure zOSMF for example

zOSMF: 
host: 127.0.0.1
port: 10443
applId: IZUDFLT

Mapping certificates

If you are using client certificate to authenticate rather than a userid and password, then you’ll need to map certificates to userids, for example with the RACDCERT MAP command. You can specify which CA the certificates were signed with, and fields from the subject Distinguished Name. The question of which is better: using userid and password, or client certificate to logon has no easy answer

  • It is easy for a hacker to get a password (lost handbag, yellow sticky stuck to the screen, a couple of pints down the pub) It is easy to change a password once is is compromised.
  • It is harder for a hacker to get a certificate – but it is harder to change and re-issue a certificate. You have to get the updated certificate down to the client’s machine. It could be stolen if a hacker has access to the machine.
  • Using Biometric data to logon is the ultimate limit in this area. Hackers could steal it – but there is no way of changing it if it is compromised!

Decisions

You need to decide

  • Are you going to have a CA just for Zowe? or reuse the site CA.
    • If you are going to have a Zowe specific CA – how are you going to distribute it to all the client machines.
    • You’ll need to ensure Zowe and z/OSMF have the other’s CA certificates in their trust store.
  • Are you going to use Subject Alternative Name ALTNAME(IP(10.1.1.2))
    • What value of verifyCertificates STRICT|NOSTRICT are you going to specify in the zowe.yaml file.
  • Are you going to authenticate using certificates. You will need to set up mapping from certificate to userid. RACDCERT MAP
  • If you will be using JWT, and so need an RSA key in the server certificate