Defibrillators in Orkney.

Someone having a heart attack does not necessarily need a defibrillator; one is normally needed when someone suffers a cardiac arrest. If they are breathing and conscious they do not require a defibrillator.

The usual process, if someone is suspected of having a cardiac event, is to phone 999. The call centre can advise you where the nearest defibrillator is located and supply the unlock code if one is required. Calling 999 also ensures medical assistance is on the way.

Once switched on, the defibrillator will talk you through the whole process and, if needed, guide you through CPR. There is no need to be concerned about wrongly shocking a patient, as the defibrillator analyses the situation and will not allow a shock to be administered if it is not required. Depending on the type of defibrillator you may be asked to press a button to administer the shock (which tends to be the most common and preferred option) or it will happen automatically.

Getting a machine.

The defib store has been used by people in Orkney. They are used to sending the machines with the lithium batteries to Orkney.

Defibs recently purchased (April 2023) are the iPAD SP1 semi-automatic units by CU Medical Systems, and the cabinet was the defibstore 4000.

The prices quoted on defib store are: iPAD SP1 £1134, and 4000 cabinet £598.80. Replacement pads cost £66 and a battery £246. All prices include VAT. Defibstore are offering free shipping at the moment.

One advantage of the iPAD SP1 is that you can switch between infant (under 8 years old) and adult modes.

The defib should be registered with “The Circuit” which allows the emergency services to know its location and availability and they can allow public access when required. This does require a responsible person to log on to the web site at regular intervals to confirm the defib is in good order and available.

Maintenance

A weekly health check of the defibrillator (usually indicated by a green light).

A monthly check to make sure the key-press combination lock is in working order, and a light spray of WD40 to the combination lock and hinges.

Replace the pads every 2 years (£65).

Replace the battery – about 4-5 years (£100 – £200).

The external cabinet should have a mains supply so a small thermostatically controlled heater can maintain good storage conditions for the defibrillator.

zPDT and Ubuntu 22.04 - it worked!

I installed Ubuntu 22.04 on an isolated hard disk drive and installed IBM Z Development and Test Environment Personal Edition 14.0 on it.

I followed the documentation. The executable was sudo ./zdt-install-pe

The first time it ran, I allowed it to configure the network. It then prompted me with

Preconfiguration steps …

32 bit support not installed
Some of the above software dependencies are not installed

Do you want the necessary Linux dependencies for the product IBM® ZD&T Personal Edition to be installed? By entering y, all required dependencies will be installed. The list of dependencies are mentioned in the Prerequisites. You need to have access to internet and software repository to install the dependencies. Otherwise, installation will complete without dependencies, and you need to install the dependencies manually. For more information about linux prerequisites, see: https://ibm.biz/zdt_prerequisites

Y

It then hung.

When I reran it, but did not allow it to configure the network, it ran successfully to completion in under a minute. It could have been a wi-fi problem.

I had to install x3270 (sudo apt install x3270) before starting my z/OS system. My zPDT environment was on a removable hard disk drive, so I plugged it in, started it up, and my system came up with no problems.

Performance tuning at the instruction level is weird

This post came out of a presentation on performance which I gave to some Computer Science students.

When I first joined the IBM MQ performance team, I was told that it was obvious that register/register instructions were fast, and storage/register instructions were slow. Over time I found this is not always true. I’ll explain why below…

For application performance there are some rules of thumb:

  • Use the most modern compilers, as they will have the best optimisation and use newer, faster instructions.
  • Most single-threaded applications will gain from using the best algorithms and storage structures, but they may gain very little from trying to tune which instructions to use.
  • Multi-threaded programs may benefit from being designed so that the threads do not interact at the instruction level.

You may do a lot of work to tune the code – and find it makes no difference. You change one field, and it can make a big difference. You think you understand it – and find you do not.

Background needed to understand the rest of the post.

Some of what I say is true. Some of what I say may be false – but it will help you understand, even though it is false. For example, the table in front of me is solid and made out of wood. That is not strictly accurate. Atoms are mainly empty space. Saying the table is solid is a good picture, but strictly inaccurate.

I spent my life working with the IBM 390 series and the description below is based on it – but I’ve changed some of it to keep it simple.

Physical architecture

The mainframe has an instruction “Move Character Long” (MVCL). If you get a microscope and look at the processor chips, you will not find any circuitry which implements this instruction. This instruction is implemented in the microcode.

Your program running on a processor is a bit like Java byte codes. The microcode reads storage and finds an instruction and executes it.

For an instruction “move data from a virtual address into a register”, the execution can be broken down into steps

  1. Read memory and copy the instruction and any parameters
  2. Parse the data into operation-code, registers, and virtual storage address
  3. Jump to the appropriate code for the operation code
  4. Convert the virtual storage address into a real page address (in RAM). This is complex code. Every thread can have its own address, say 4 000 000, so you need the thread's look-up tables to get the “real address” for that thread. Your machine may be running virtualised, so the “real address” needs a further calculation for the next level of indirection.
  5. “Lock” the register and “lock” the real address of the data
  6. Go and get the data from storage
  7. Move the data into the register
  8. Unlock the register, unlock the real address of the data
  9. End.

Where is the data?

There is a large amount (TB) of RAM in the physical box. The processor “chips” are in books (think pluggable boards). The “chips” are about the size of my palm, one per book. There is cache in the books. Within the “chips” are multiple CPUs, storage management processors and more cache.

The speed of data access depends on the speed of light.

  • To get from the RAM storage to the CPU, the distance could be 100 cm – or 3 nanoseconds
  • To get from the book’s cache storage to the CPU this could be 10 cm or about 0.3 nanoseconds
  • The time for the CPU to access the memory on the chip is about 0.03 nanoseconds.

The time for “Go and get the data from storage” (above) depends on where the data is. The first access may take 3 nanoseconds when the data is read from RAM; if a following instruction uses the same data, it is already in the CPU cache, and so takes 0.03 nanoseconds (100 times faster).
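The figures above follow from the speed of light, which is roughly 30 cm per nanosecond in a vacuum. Here is a minimal sketch of that arithmetic. The function name is my own, and signals in real wiring travel more slowly than light in a vacuum, so treat the results as best-case lower bounds.

```c
#include <assert.h>

/* Speed of light in a vacuum, in centimetres per nanosecond. */
#define LIGHT_CM_PER_NS 29.9792458

/* Best-case one-way travel time, in nanoseconds, over a distance in cm. */
double light_delay_ns(double distance_cm)
{
    assert(distance_cm >= 0.0);
    return distance_cm / LIGHT_CM_PER_NS;
}
```

Calling light_delay_ns(100.0) gives a little over 3.3 ns for the RAM case, and each factor of ten in distance gives a factor of ten in time.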

How is the program executed?

In the flow of an instruction (above) each stage is executed in a production line known as a pipeline. There are usually multiple instructions being processed at the same time.

While one instruction is in the stage “Convert the virtual storage address into a real page address”, another instruction is being parsed and so on.
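To get a feel for why a pipeline helps, here is a rough sketch of the cycle counts. It assumes an idealised pipeline where every stage takes exactly one cycle and nothing ever stalls, which is not true of real hardware; the function names are my own.

```c
#include <assert.h>

/* Cycles needed if each instruction passes through every stage
   before the next instruction is allowed to start. */
unsigned long sequential_cycles(unsigned long instructions, unsigned long stages)
{
    return instructions * stages;
}

/* Cycles needed with an ideal pipeline: the first instruction takes
   `stages` cycles, and each later one completes one cycle after it. */
unsigned long pipelined_cycles(unsigned long instructions, unsigned long stages)
{
    assert(stages > 0);
    if (instructions == 0)
        return 0;
    return stages + (instructions - 1);
}
```

With the 9 steps listed earlier and 4 instructions, the one-at-a-time figure is 36 cycles and the ideal pipelined figure is 12.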

If we had instructions

  1. Load register 5 from 4 000 000
  2. Load register 4 from register 5
  3. Clear register 6
  4. Clear register 7

Instruction 2 (Load register 4 from register 5) needs the value in register 5, and so cannot execute until instruction 1 has finished loading it. This shows that a register to register instruction may not be the fastest; it may have to wait for a previous instruction to finish.

A clever compiler can reorder the code

  1. Load register 5 from 4 000 000
  2. Clear register 6
  3. Clear register 7
  4. Load register 4 from register 5

and so this code may execute faster, because the clear register instructions can execute while instruction 1 is still in progress, without changing the logic of the program. By the time the clear register instructions have finished, register 5 may be available.

If you look at code generated from a compiler, a register may be initialised many instructions away from where it is next used.

The hardware may be able to reorder these instructions, as long as they end in the correct order!

Smarter use of the storage and cache

Data is read from and written to storage in “cache lines”; these may be blocks of 256, 512 or 1024 bytes.

If you have a program with a big control block, you may benefit from putting hot fields together. If your structure is spread across two cache lines, you may have processing like:

  • Load register 5 from cache block 1. This takes 3 ns.
  • Load register 6 from cache block 2. This takes 3 ns.

If the fields are adjacent you might get

  • Load register 5 from cache block 1. This takes 3 ns.
  • Load register 6 from cache block 1. This takes 0.03 ns because the cache block is already in the CPU’s cache.
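A small sketch of the two layouts, to make the idea concrete. The structure names and field names are invented for illustration, the 256 byte line size is just one of the sizes mentioned above, and the check assumes the control block itself starts on a cache line boundary.

```c
#include <assert.h>
#include <stddef.h>

/* Layout 1: the two hot fields are separated by a large cold area. */
struct spread_block {
    long hot_count;
    char cold_area[512];
    long hot_total;
};

/* Layout 2: the hot fields sit next to each other at the front. */
struct packed_block {
    long hot_count;
    long hot_total;
    char cold_area[512];
};

/* Return 1 if two offsets within a line-aligned block fall in the
   same cache line, 0 otherwise. */
int same_cache_line(size_t offset_a, size_t offset_b, size_t line_size)
{
    assert(line_size > 0);
    return (offset_a / line_size) == (offset_b / line_size);
}
```

Using offsetof() on the two hot fields shows that they land in different lines in the first layout and in the same line in the second.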

Use faster instructions

Newer generations of CPUs can have newer and faster instructions. Consider loading a register with a constant value, say 5. In the old days, it had to read this value from storage. Newer instructions may have the value (5) as part of the instruction, so no storage access is required, and so the instruction should be faster.

The second time round a loop may be faster than the first time around a loop.

The stages of the production line may cache data, for example the result of converting a virtual address to a real page address. The stage may look in its cache; if the value is not found it does the expensive calculation. If it is found it uses the value directly.

If your program is using the same address (same page) the second time round the loop, the real address of the data may already be in the CPU cache. The first time may have had to go to RAM; the second time the data is in the CPU cache.
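The “look in the cache, otherwise do the expensive calculation” idea can be sketched in a few lines. This is a toy model and not how any real processor does it: the page size, the cache size, and the made-up mapping in walk_page_tables() are all assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE     4096u
#define CACHE_ENTRIES 16u

static_assert((PAGE_SIZE & (PAGE_SIZE - 1)) == 0, "page size must be a power of two");

struct xlate_entry {
    int      valid;
    uint64_t virtual_page;
    uint64_t real_page;
};

static struct xlate_entry xlate_cache[CACHE_ENTRIES];
unsigned long slow_lookups = 0;   /* counts how often the expensive path ran */

/* Stand-in for the expensive table walk; the mapping itself is invented. */
static uint64_t walk_page_tables(uint64_t virtual_page)
{
    slow_lookups++;
    return virtual_page + 1000;
}

/* Translate a virtual address, using the cached translation when possible. */
uint64_t translate(uint64_t virtual_address)
{
    uint64_t virtual_page = virtual_address / PAGE_SIZE;
    struct xlate_entry *e = &xlate_cache[virtual_page % CACHE_ENTRIES];

    if (!e->valid || e->virtual_page != virtual_page) {
        /* Not cached: do the expensive calculation and remember it. */
        e->real_page = walk_page_tables(virtual_page);
        e->virtual_page = virtual_page;
        e->valid = 1;
    }
    return e->real_page * PAGE_SIZE + virtual_address % PAGE_SIZE;
}
```

Translating 4 000 000 and then 4 000 004 only takes the slow path once, because both addresses are in the same page.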

This can all change

Consider the scenario where the first time round the loop was 100 times slower than later loop iterations – it may all suddenly change. Your program is un-dispatched to let someone else’s program run. When your program is re-dispatched, the cached values may no longer be available, so your program has a slow iteration while the real address of your virtual page is recalculated, and the data is read in from RAM.

Multi programming interactions.

If you have multiple instances of your program running accessing shared fields, you can get interference at the instruction level.

Consider a program executing

  • Add value 1 to address 4 000 000

Part of the execution of this is to take a lock on the cache line with address 4 000 000. If another CPU executes the same instruction, the second CPU will have to wait until the first CPU has finished with it. If both CPUs are on the same chip the delay may be small (0.03 ns). If the CPUs are in different chips (in different books) it will take 0.3 nanoseconds to notify the second CPU.

If lots of CPUs are trying to access this field there will be a long access time.

You should design your program so each instance has its own cache line, so the CPUs do not compete for storage. I know of someone who did this and got 30% throughput improvement!
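One way to give each instance its own cache line is to align each instance's data to the line size. A minimal C11 sketch follows, assuming a 256 byte cache line and at most 4 workers; both numbers, and all the names, are illustrative only. No threads are created here; the point is the layout.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define CACHE_LINE  256   /* assumed cache line size */
#define MAX_WORKERS 4

/* Each counter is aligned to, and therefore padded out to, a whole cache line. */
struct padded_counter {
    alignas(CACHE_LINE) unsigned long value;
};

static struct padded_counter counters[MAX_WORKERS];

/* Each worker updates only its own slot, so no two workers write to the same line. */
void count_event(int worker)
{
    assert(worker >= 0 && worker < MAX_WORKERS);
    counters[worker].value++;
}

/* Reading the total touches every line, but this should happen rarely. */
unsigned long total_events(void)
{
    unsigned long total = 0;
    for (int i = 0; i < MAX_WORKERS; i++)
        total += counters[i].value;
    return total;
}
```

The cost is memory: each 8 byte counter now occupies a full line. That is usually a good trade for a handful of very hot fields, and a poor one for large arrays.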

Configuring the hardware for virtual machines.

You should also consider how you configure your hardware. If you give each virtual machine CPUs on the same chip, then any interference should be small. If a virtual machine has CPUs in different books (so it takes 0.3 nanoseconds to talk to the other CPU) the interference will be larger because the requests take longer. I’ve seen performance results vary by a couple of percent because the CPUs allocated to a virtual machine were different on the second run.

Going deeper into the murk

If you have virtual machines sharing the same CPUs, this may affect your results, because your virtual machine may be un-dispatched, and another virtual machine dispatched on the processor(s). The cached values for your virtual machine may have been overwritten.

Improving application performance – why, how ?

I’m working on a presentation on performance, for some university students, and I thought it would be worth blogging some of the content.

I had presented on what it was like working in industry, compared to working in a university environment. I explained what it is like working in a financial institution, where you have 10,000 transactions a second, transaction response time is measured in 10s of milliseconds, and if you are down for a day you are out of business. After this they asked how you tune the applications and systems at this level of work.

Do you need to do performance tuning?

Like many questions about performance, the answer is “it depends”. It comes down to cost benefit analysis: how much CPU (or money) will you save if you do analysis and tuning? You could work for a month and save a couple of hundred pounds. You could work for a day and find CPU savings which mean you do not need to upgrade your systems, and so save lots of money.

It is not usually worth doing performance analysis on programs which run infrequently, or are of short duration.

Obvious statements

When I joined the performance team, the previous person in the role had left a month before, and the handover documentation was very limited. After a week or so making tentative steps into understanding the work, I came to realise the following (obvious once you think about it) statements:

  • A piece of work is either using CPU or is waiting.
  • To reduce the time a piece of work takes you can either reduce the CPU used, or reduce the waiting time.
  • To reduce the CPU you need to reduce the CPU used.
  • The best I/O is no I/O
  • Caching of expensive operations can save you a lot.

Scenario

In the description below I’ll cover the moderately simple case, and also the case where there are concurrent threads accessing data.

Concurrent activity

When you have more than one thread in your application you will need to worry about data consistency. There are locks and latches:

  • Locks tend to be “long running” – from milliseconds to seconds. For example you lock a database record while updating it
  • Latches tend to be held across a block of code, for example manipulation of lists and updating pointers.

Storing data in memory

There are different ways of storing data in memory, from arrays, hash tables to binary trees. Some are easy to use, some have good performance.

Consider having a list of 10,000 names, which you have to maintain.

Array

An array is a contiguous block of memory with elements of the same size. To locate an element you calculate the offset “number of element” * size of element.

If the list is not sorted, you have to iterate over the array to find the element of interest.

If the list is sorted, you can do a binary search. For example, if the array has 1000 elements, first check element 500 and see if the value you want is higher or lower, then check element 750 or 250, and so on.
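A minimal sketch of a binary search over a sorted array of names. The function name and the choice of strcmp() for ordering are mine; the array must already be sorted in the same order that strcmp() uses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the index of `name` in the sorted array, or -1 if it is absent. */
long find_name(const char *names[], size_t count, const char *name)
{
    assert(names != NULL && name != NULL);
    size_t low = 0, high = count;          /* search the range [low, high) */

    while (low < high) {
        size_t mid = low + (high - low) / 2;
        int cmp = strcmp(name, names[mid]);
        if (cmp == 0)
            return (long)mid;
        if (cmp < 0)
            high = mid;                    /* look in the lower half */
        else
            low = mid + 1;                 /* look in the upper half */
    }
    return -1;
}
```

Each pass halves the range, so a list of 10,000 names needs at most 14 comparisons, compared with up to 10,000 for a sequential scan.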

An array is easy to use, but the size is inflexible; to change the size of the array you have to allocate a new array, copy old to new, release old.

Single Linked list

This is a chain of elements, where each element points to the next. There is a pointer to the start of the chain, and something to say end of chain (often “next” is 0).

This is flexible, in that you can easily add elements, but to find an element you have to search along the chain and so this is not suitable for long chains.

You cannot easily delete an element from the chain.

If you have A->B->D->Q, you can add a new element G by setting G->Q and D->G. If there are multiple threads you need to do this under a latch.
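The A->B->D->Q example can be sketched as follows. The one-letter key and the function names are just for illustration, and the sketch is single threaded; it does not take the latch a multi-threaded program would need.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    char name;            /* one-letter key, as in the A->B->D->Q example */
    struct node *next;    /* NULL marks the end of the chain */
};

/* Insert new_node directly after prev. Only two pointers change. */
void insert_after(struct node *prev, struct node *new_node)
{
    assert(prev != NULL && new_node != NULL);
    new_node->next = prev->next;   /* G -> Q */
    prev->next = new_node;         /* D -> G */
}

/* Walk the chain looking for a key; returns NULL if it is not present. */
struct node *find_node(struct node *head, char name)
{
    for (struct node *p = head; p != NULL; p = p->next)
        if (p->name == name)
            return p;
    return NULL;
}
```

The order of the two assignments matters: setting D->G first would lose the only pointer to Q.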

Doubly linked lists

This is like a single linked list, but you have a back chain as well. This allows you to easily delete an element. To add an element you have to update 4 pointers.

This is a flexible list where you can add and remove elements, but you have to scan it sequentially to find the element of interest, and so it is not suitable for long chains.

If there are multiple threads you need to do this under a latch.
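A sketch of the insert and delete operations on a doubly linked list, showing the four pointer updates. The names are mine, the caller is assumed to keep track of the head of the list, and no latch is taken, so this is the single-threaded version.

```c
#include <assert.h>
#include <stddef.h>

struct dnode {
    int value;
    struct dnode *prev;   /* back chain, NULL at the start of the list */
    struct dnode *next;   /* forward chain, NULL at the end of the list */
};

/* Insert new_node after pos: up to four pointers are updated. */
void dlist_insert_after(struct dnode *pos, struct dnode *new_node)
{
    assert(pos != NULL && new_node != NULL);
    new_node->prev = pos;              /* 1 */
    new_node->next = pos->next;        /* 2 */
    if (pos->next != NULL)
        pos->next->prev = new_node;    /* 3 */
    pos->next = new_node;              /* 4 */
}

/* Unlink a node using its back chain; no scan of the list is needed. */
void dlist_remove(struct dnode *node)
{
    assert(node != NULL);
    if (node->prev != NULL)
        node->prev->next = node->next;
    if (node->next != NULL)
        node->next->prev = node->prev;
    node->prev = node->next = NULL;
}
```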

Hash tables

Hash tables are a combination of array and linked lists.

You allocate an array of suitable size, for example 4096. You hash the key to a value between 0 and 4095 and use this as the index into the array. The value of the array is a linked list of elements with the same hash value, which you scan to find the element of interest.

You need to choose a hash table size so that there are only a few elements (at most 10 to 50) in each linked list. The hash function needs to produce a wide spread of values. A hash function which returned only one value would give you one long linked list.
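A minimal sketch of a hash table built from an array of 4096 linked lists. The hash function (the well-known djb2 string hash) is my choice for the example, not a recommendation; the table does not copy keys, never deletes, and is not latched, so it is single threaded as written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define TABLE_SIZE 4096

struct entry {
    const char *key;       /* not copied: the caller keeps the string alive */
    int value;
    struct entry *next;    /* next element with the same hash value */
};

static struct entry *table[TABLE_SIZE];

/* djb2 string hash, reduced to an index between 0 and 4095. */
static unsigned hash_key(const char *key)
{
    unsigned long h = 5381;
    while (*key != '\0')
        h = h * 33 + (unsigned char)*key++;
    return (unsigned)(h % TABLE_SIZE);
}

/* Scan the chain in this key's slot; returns NULL if not found. */
struct entry *table_find(const char *key)
{
    assert(key != NULL);
    for (struct entry *e = table[hash_key(key)]; e != NULL; e = e->next)
        if (strcmp(e->key, key) == 0)
            return e;
    return NULL;
}

/* Add or update a key. Returns 0 on success, -1 if out of memory. */
int table_put(const char *key, int value)
{
    struct entry *e = table_find(key);
    if (e == NULL) {
        unsigned slot = hash_key(key);
        e = malloc(sizeof *e);
        if (e == NULL)
            return -1;
        e->key = key;
        e->next = table[slot];     /* push on the front of the chain */
        table[slot] = e;
    }
    e->value = value;
    return 0;
}
```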

Binary trees

Binary trees are an efficient way of storing data. If there are any updates, you need to latch the tree while updates are made, which may slow down multi threaded programs.

Each node of a tree has 4 parts

  • The value of this node such as “COLIN PAICE”
  • A pointer to a node for values less than “COLIN PAICE”
  • A pointer to a node for values greater than “COLIN PAICE”
  • A pointer to the data record for this node.

If the tree is balanced the number of steps from the start of the tree to the element of interest is approximately the same for all elements.

If you add lots of elements you can get an unbalanced tree where the tree looks like a flag pole – rather than an apple tree. In this case you need to rebalance the tree.

You do not need to worry about the size of the tree because it will grow as more elements are added.

If you rebalance the tree, this will require a latch on the tree, and the rebalancing could be expensive.

There are C run time functions such as tsearch, which walks the tree and, if the element exists in the tree, returns the node. If the element does not exist in the tree, tsearch adds it to the tree and returns the new node.

This is not trivial to code – (but is much easier than coding a tree yourself).

You need to latch the tree when using multiple threads, which can slow down your access.
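A small sketch of using the POSIX tsearch() and tfind() functions from <search.h> to keep a tree of names. The wrapper function names are mine. The tree stores the pointers you give it and does not copy the strings, and the sketch has no latch, so it is single threaded as written.

```c
#include <assert.h>
#include <search.h>
#include <stddef.h>
#include <string.h>

static void *name_tree = NULL;   /* root of the tree, managed by tsearch */

/* Comparison routine: it is passed two keys, here two C strings. */
static int compare_names(const void *a, const void *b)
{
    return strcmp((const char *)a, (const char *)b);
}

/* Find the name, adding it if absent. Returns the string stored in the
   tree, or NULL if tsearch could not allocate a new node. */
const char *find_or_add(const char *name)
{
    assert(name != NULL);
    void *node = tsearch(name, &name_tree, compare_names);
    if (node == NULL)
        return NULL;
    return *(const char **)node;   /* the node begins with the key pointer */
}

/* Look up only; returns NULL if the name is not in the tree. */
const char *find_only(const char *name)
{
    assert(name != NULL);
    void *node = tfind(name, &name_tree, compare_names);
    return node == NULL ? NULL : *(const char **)node;
}
```

If find_or_add() returns a pointer different from the one you passed in, the name was already in the tree and your copy can be released.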

Optimising your code

Take the scenario where you write an application which is executed 1000 times a second.

int myfunc(char * name, int cost, int discount)
{
  int rc = 0;
  printf("Values passed to myfunc %s cost %i discount %i\n",name,cost,discount);
  rc = dosomething();
  printf("exit from myfunc %i\n",rc);
  return rc;
}

Note: This is based on a real example. I went to a customer to help with a performance problem, and found the top user was printf(), printing out logging information. They commented this code out in all of their functions and it went 5 times faster.

You can make this go faster by having a flag you set to produce trace output, so

if (global.trace)
    printf("Values passed to myfunc %s cost %i discount %i\n",name,cost,discount);

You could do the same for the exit printf, but you may want to be more subtle, and use

if (global.traceNZonexit && rc != 0)
   printf("exit from myfunc %i\n",rc);

This is useful when the return code is 0 most of the time. It is useful if someone reports problems with the application – and you can say “there is a message access-denied” at the time of your problem.

FILE * hFile = 0;
for (i = 0; i < 100; i++)
{
    /* create a buffer with our data in it */
    lenData = sprintf(buffer,"userid %s, parm %s\n", getid(), inputparm);
    error = ….()
    if (error > 0)
    {
      /* the file is opened and closed every time there is an error */
      hFile = fopen("outputfile","a");
      fwrite(buffer,1,lenData,hFile);
      fclose(hFile);
    }
…
}

This can be improved

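  • by moving the getid() out of the loop – it does not change within the loop
  • by moving the lenData = sprintf.. inside the error block, so the buffer is only built when it is needed
  • by changing the error block so the file is opened once, on the first error, and closed after the loop

{
  ...
  if (error > 0)
  {
     if (hFile == 0)
     {
        /* first error: open the file and get the userid, once only */
        hFile = fopen("outputfile","a");
        pUserid = strdup(getid());
     }
     lenData = sprintf(buffer,"userid %s, parm %s\n", pUserid, inputparm);
     fwrite(buffer,1,lenData,hFile);
  }
...
}
if (hFile != 0)
   fclose(hFile);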

You can take this further, and have the file handle passed in to the function, so it is only opened once, rather than every time the function is invoked.

typedef struct {
   FILE * hFile;
   …
} threadblock;

main()
{
   threadblock threadBlock = {0};
   for (i = 1; i < 9999; i++)
   {
      myprog(&threadBlock ...);
   }
   /* close the file once, after all of the calls */
   if (threadBlock.hFile != 0)
      fclose(threadBlock.hFile);
}

// subroutine
myprog(threadblock * pt ...)
{
...
  if (error > 0)
  {
     if (pt->hFile == 0)
     {
        pt->hFile = fopen("outputfile","a");
     }
     fwrite(buffer,1,lenData,pt->hFile);
  }
...
}
   

Note: If this is a long running “production” system you may want to open the file as part of application startup to ensure the file can be opened etc, rather than find this out two days later.

Programming using AT-TLS

My mini project was to connect an openssl client to z/OS with AT-TLS only using a certificate. This was a challenging project partly because of the lack of a map and a description of what to do.

Overview

The usual way a server works with TCP/IP is using socket calls: socket(), bind(), listen(), accept(), recv() and send(). You control the socket using ioctl().

This does not work with AT-TLS because the C ioctl() does not support the AT-TLS request SIOCTTLSCTL; PL/I, REXX and Assembler support it, but C does not. (See here for a list of supported requests in C.) I had to use a lower level set of interfaces (z/OS callable services): BPX1SOC(), BPX1BND(), BPX1LSN(), BPX1ACP(), BPX1RCV(), BPX1SND() and BPX1IOC().

The documentation says

The application must have the ApplicationControlled parameter set to ON in their TTLSEnvironmentAdvancedParms or TTLSConnectionAdvancedParms statement. This causes AT-TLS to postpone the TLS handshake. After the connection is established, the application can issue the SIOCTTLSCTL IOCTL to get the current AT-TLS connection status and determine whether or not AT-TLS support is available on this connection.

Once the TLS session has been established, you can retrieve the userid associated with the certificate, or you can extract the client’s certificate and process it.

Once you have the userid or certificate you can use the pthread_security_applid_np(__CREATE_SECURITY_ENV…) to change the thread to a different userid. Note you have to run this as a thread – not as the main task.

The application flow

The application has the following logic

Main program

  • create a thread using pthread_create
  • wait for the thread to end – using rc = pthread_join
  • return

Thread subtask

  • Allocate a socket using bpx1soc.
  • Set the socket so it can quickly be reused. By default a port cannot be reused for a period of minutes, while waiting for a response from the client.
  • Bind the port to listen on to this socket using bpx1bnd.
  • Listen (wait) for a connection request on this socket using bpx1lsn.
  • Accept the request, using bpx1acp.
  • Issue the ioctl request using bpx1ioc to query information about the connection (TTLS_QUERY_ONLY). It returned:
    • Policy:Policy defined for connection – AT-TLS enabled and Application Controlled.
    • Type :Connection is not secure.
    • SSL Protocol Version 0 – because the session has not been established.
    • SSL Protocol Modifier 0 – because the session has not been established.
    • Rule name COLATTLJ.
    • Group Action TNGA.
    • Environment TNEA.
    • Connection TNCA.
    • Note: asking for TTLSK_Host_Status gave me “EDC5121I Invalid argument.” because this request is meaningless at this time.
  • Issue the ioctl request using BPX1IOC to start the connection (TTLS_INIT_CONNECTION). This initiates the TLS handshake. When this was successful, it returned the following in the ioc control block:
    • Policy:Policy defined for connection – AT-TLS enabled and Application Controlled
    • Type :Connection is secure
    • SSL Protocol Version 3
    • SSL Protocol Modifier 3
    • SSL Cipher 4X. 4X means look at the 4 byte field
    • SSL Cipher C02C. C02C is
      • TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384, which is “256-bit AES in Galois Counter Mode encryption with 128-bit AEAD message authentication and ephemeral ECDH key exchange signed with an ECDSA certificate”. I can’t tell where the 128 bit AEAD is in the short description.
    • userid COLIN
  • Receive the data BPX1RCV(). You can peek to see how much data is available using flags = MSG_PEEK
  • Send a response using BPX1SND()
  • Close the remote session using BPX1CLO
  • Close the server’s socket using BPX1CLO
  • Exit the thread using pthread_exit()
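For comparison, here is the same allocate, bind, listen, accept, receive, send, close sequence written with the ordinary POSIX socket calls. This is a generic sketch and not my AT-TLS program: it has no SIOCTTLSCTL ioctl and no TLS, the function names are mine, and it listens only on the loopback address. The comments show which callable service each call corresponds to in the flow above.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a TCP socket listening on the loopback address. Port 0 lets the
   system choose a free port. Returns the descriptor, or -1 on error. */
int make_listener(unsigned short port)
{
    int sd = socket(AF_INET, SOCK_STREAM, 0);                 /* cf. BPX1SOC */
    if (sd < 0)
        return -1;

    int on = 1;
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    if (setsockopt(sd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) < 0 || /* cf. BPX1OPT */
        bind(sd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||          /* cf. BPX1BND */
        listen(sd, 5) < 0) {                                             /* cf. BPX1LSN */
        close(sd);
        return -1;
    }
    return sd;
}

/* Accept one client, read up to len bytes, send a fixed reply, and close
   the client. Returns the number of bytes received, or -1 on error. */
long serve_one(int listener, char *buf, size_t len)
{
    assert(buf != NULL);
    int client = accept(listener, NULL, NULL);                /* cf. BPX1ACP */
    if (client < 0)
        return -1;
    long got = (long)recv(client, buf, len, 0);               /* cf. BPX1RCV */
    if (got >= 0) {
        static const char reply[] = "hello\n";
        send(client, reply, sizeof(reply) - 1, 0);            /* cf. BPX1SND */
    }
    close(client);                                            /* cf. BPX1CLO */
    return got;
}
```

In the AT-TLS version the two BPX1IOC requests sit between the accept and the first receive.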

Mapping certificate to userid

You can use AT-TLS to use the certificate and return the userid associated with the certificate; or you can use the pthread_security_applid_np to pass a certificate and change the thread to be the certificate owner.

You map a certificate to a userid with commands like

//COLRACF  JOB 1,MSGCLASS=H 
//S1  EXEC PGM=IKJEFT01,REGION=0M 
//SYSPRINT DD SYSOUT=* 
//SYSTSPRT DD SYSOUT=* 
//SYSTSIN DD * 
RACDCERT LISTMAP ID(COLIN) 
RACDCERT DELMAP(LABEL('CP'))  ID(COLIN) 
RACDCERT MAP ID(COLIN  )  - 
   SDNFILTER('CN=docec256.O=Doc.C=GB')                  - 
   IDNFILTER('CN=SSCA256.OU=CA.O=DOC.C=GB')             - 
   WITHLABEL('CP') 
RACDCERT LISTMAP ID(COLIN) 
SETROPTS RACLIST(DIGTNMAP, DIGTCRIT) REFRESH 
/* 

This associates the certificate with the given Subject Name (SN) value, and the Issuer’s value, with the userid COLIN. See Using certificates to logon to z/OS for a high level perspective.

AT-TLS definitions

TTLSRule                      COLATTLJ 
{ 
  LocalPortRange              4000 
# Jobname                     COLCOMPI 
# Userid                      COLIN 
  Direction                   BOTH 
  TTLSGroupActionRef          TNGA 
  TTLSEnvironmentActionRef    TNEA 
  TTLSConnectionActionRef     TNCA 
} 

TTLSRule                      COLATTLS 
{ 
  LocalPortRange              4000 
# Jobname                     COLATTLS 
  Userid                      START1 
  Direction                   BOTH 
  TTLSGroupActionRef          TNGA 
  TTLSEnvironmentActionRef    TNEA 
  TTLSConnectionActionRef     TNCA 
} 

TTLSConnectionAction              TNCA 
{ 
  TTLSCipherParmsRef              TLS13TLS12 
  TTLSSignatureParmsRef           TNESigParms 
  TTLSConnectionAdvancedParmsRef  TNCOonAdvParms 
  CtraceClearText                 Off 
  Trace                           255 
} 

TTLSConnectionAdvancedParms       TNCOonAdvParms 
{ 
 ServerCertificateLabel  NISTECC521 
#ServerCertificateLabel  RSA2048 
#ccp this was added 
  ApplicationControlled         On 
  SSLv3          OFF 
  TLSv1          OFF 
  TLSv1.1        OFF 
  TLSv1.2        ON 
  TLSv1.3        OFF 
  SecondaryMap   OFF 
  HandshakeTimeout 3 
} 

TTLSSignatureParms                TNESigParms 
{ 
   CLientECurves Any 
} 

TTLSEnvironmentAction                 TNEA 
{ 
  HandshakeRole                       ServerWithClientAuth 
# HandshakeRole                       Server 
  TTLSKeyringParms 
  { 
    Keyring                   start1/TN3270 
  } 
  TTLSSignatureParmsRef       TNESigParms 
  TTLSCipherParmsRef  TLS13 
} 

TTLSCipherParms             TLS13TLS12 
{ 
#TLS 1.3 
 V3CipherSuites      TLS_CHACHA20_POLY1305_SHA256 
#V3CipherSuites      TLS_AES_256_GCM_SHA384 
#V3CipherSuites      TLS_AES_128_GCM_SHA256 
#TLS 1.2 
# NSTECC 
 V3CipherSuites      TLS_RSA_WITH_AES_256_CBC_SHA256 
 V3CipherSuites   TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384 
#RSA 
}
 
TTLSCipherParms             TLS13 
{ 
 V3CipherSuites      TLS_CHACHA20_POLY1305_SHA256 
}
 
TTLSGroupAction             TNGA 
{ 
  TTLSEnabled               ON 
  trace                     255 
} 

When I submitted a job from userid COLIN, it got the definition Rule name COLATTLJ.

When I used a started task COLATTLS with userid START1, I was expecting Rule name COLATTLS but got Rule name COLATTLJ. It looks like PAGENT uses the first matching rule; it matches rule PORT 4000, so used COLATTLJ.

My program printed out

Group Action TNGA.         
Environment  TNEA.         
Connection   TNCA.         

which matches the definitions above.

The program printed out

BPX1SOC rv 0                                                                                         
BPX1SOC socket value 0  
                                                                             
BPX1OPT SET SO_REUSEADDR rv 0    
                                                                    
BPX1BND rv 0  
                                                                                       
BPX1LSN rv 0                                                                                         

BPX1ACP rv 1                                                                                         
After IOC   
                                                                                         
BPX1IOC rv 0                                                                                         
printATTLS rv 0                                                                                      
query.header.TTLSHdr_BytesNeeded 136                                                                 
BPX1IOC Policy:Policy defined for connection - AT-TLS enabled and Application Controlled             
BPX1IOC Conn  :Connection is not secure                                                              
BPX1IOC Type  :Server with client authentication ClientAuthType = Required                           
SSL protocol version TTLS_PROT_UNKNOWN  
Rule name    COLATTLJ.                                                                          
Group Action TNGA.                                                                              
Environment  TNEA.                                                                              
Connection   TNCA.                                                                              

TTLS_INIT_CONNECTION                                                   
BPX1SOC TTLS_INIT_CONNECTION  rv 0                                                              
After INIT_CONNECTION                                                                           
printATTLS rv 0                                                                                 
query.header.TTLSHdr_BytesNeeded 128                                                            
BPX1IOC Policy:Policy defined for connection - AT-TLS enabled and Application Controlled        
BPX1IOC Conn  :Connection is secure                                                             
BPX1IOC Type  :Server with client authentication ClientAuthType = Required                      
SSL protocol version TTLS_PROT_TLSV1_2                                                          
SSL Cipher  C02C 
                                                                               
ioc.TTLSi_Cert_Len 1080                                                                         
get cert  IOC                                                                                   
BPX1IOC Get cert rv 0   
userid 8     ADCDA                                                                
pthread_s... applid ZZZ rc = 0 userid    ADCDA.                                   
                                                                                  
BPX1RCV Peek rv 4  
                                                                
BPX1RCV bytes 4 
                                                         
BPX1SND rv 48                                                                                                                                                                                                          

AT-TLS programming

In my program I had

Accept the session and invoke ATTLS

struct sockaddr_in client; /* client address information          */ 

BPX1SOC()...
BPX1OPT()... // Set SO_REUSEADDR
BPX1BND()...
BPX1LSN().. // listen on the socket; the BPX1ACP below waits for a connection

int lClient = sizeof(client); 
BPX1ACP(&Socket_vector[0], 
        &lClient, 
        &client, 
        &rv, 
        &rc, 
        &rs); 
if (check("BPX1ACP",rv,rc,rs) < 0 ) // -1 is error 
  exit(4); 
int sd = rv; // save the returned value 

#include <attls.h> 
#include <attlssta.h> 
                                                                           

Issue ATTLS query before initial TLS handshake

Member attls.h had

// AT-TLS 
struct TTLS_IOCTL ioc;            // ioctl data structure 
memset(&ioc,0,sizeof(ioc));     //* set all unused fields to zero 
ioc.TTLSi_Req_Type = TTLS_QUERY_ONLY ; 
int command; 
                                                                     
command = SIOCTTLSCTL; 
ioc.TTLSi_Ver = TTLS_VERSION2; 
int lioc; 
lioc = sizeof(ioc); 
// 
// this is used for getting data from ATTLS 
// a header and a number of quads 
// 
memset(&query,0,sizeof(query)); 
// move the eye catcher 
memcpy(&query.header.TTLSHeaderIdent[0],TTLSHeaderIdentifier,8); 
query.header.TTLSHdr_BytesNeeded = 128; 
query.header.TTLSHdr_SetCount =  0; 
query.header.TTLSHdr_GetCount =  4;                                                                  
query.q1.TTLSQ_Key = TTLSK_TTLSRuleName  ;                                                                  
query.q2.TTLSQ_Key = TTLSK_TTLSGroupActionName;                                                                  
query.q3.TTLSQ_Key = TTLSK_TTLSEnvironmentActionName ;                                                                  
query.q4.TTLSQ_Key = TTLSK_TTLSConnectionActionName ; 
                                                                 
ioc.TTLSi_BufferPtr = (char *) &query; 
ioc.TTLSi_BufferLen = sizeof(query); 

ioc.TTLSi_Ver = TTLS_VERSION2; 
                                                        
BPX1IOC(&sd, 
        &command, 
        &lioc, 
        &ioc , 
        &rv, 
        &rc, 
        &rs); 
 
 if (check("BPX1IOC",rv,rc,rs) != 0) 
    exit(1); 
 printATTLS( &ioc,rv,rc,rs); 

Issue the start connection

Member attlssta.h had

command = SIOCTTLSCTL; 
ioc.TTLSi_Ver = TTLS_VERSION2; 
lioc = sizeof(ioc); 
ioc.TTLSi_Req_Type = TTLS_INIT_CONNECTION ; 
printf("TTLS_INIT_CONNECTION\n"); 
// 
// 
memset(&query,0,sizeof(query)); 
// move the eye catcher 
memcpy(&query.header.TTLSHeaderIdent[0],TTLSHeaderIdentifier,8); 
query.header.TTLSHdr_BytesNeeded = 128; 
query.header.TTLSHdr_SetCount =  0; 
query.header.TTLSHdr_GetCount =  0; 
ioc.TTLSi_BufferPtr = 0; 
ioc.TTLSi_BufferLen = 0; 
//printHex(stdout,&query,256); 
ioc.TTLSi_Ver = TTLS_VERSION2; 
BPX1IOC(&sd, 
          &command, 
          &lioc, 
          &ioc , 
          &rv, 
          &rc, 
          &rs); 
if (check("BPX1SOC TTLS_INIT_CONNECTION ",rv,rc,rs) != 0) 
    exit(1); 
printATTLS( &ioc,rv,rc,rs); 
if (rv >= 0) 
  { // get the certificate 
    #include <ATTLSGC.h> 
  } 

Get the certificate from AT-TLS

ATTLSGC.h (get certificate from TCPIP) had

// AT-TLS 
// 
//  Get the certificate 
// 
char * applid = "ZZZ"; 
memset(&ioc,0,sizeof(ioc));     //* set all unused fields to zero 
ioc.TTLSi_Req_Type = TTLS_QUERY_ONLY ; 
int command; 
                                                                     
command = SIOCTTLSCTL; 
ioc.TTLSi_Ver = TTLS_VERSION2; 
int lioc; 
lioc = sizeof(ioc); 
// 
// this is used for getting data from ATTLS 
// a header and a number of quads 
// 
memset(&query,0,sizeof(query)); 
// move the eye catcher 
memcpy(&query.header.TTLSHeaderIdent[0],TTLSHeaderIdentifier,8); 
query.header.TTLSHdr_BytesNeeded = 128; 
query.header.TTLSHdr_SetCount =  0; 
query.header.TTLSHdr_GetCount =  1; 

query.q1.TTLSQ_Key = TTLSK_Certificate   ; 
                                                                   
ioc.TTLSi_BufferPtr = (char *) &query; 
ioc.TTLSi_BufferLen = sizeof(query); 
ioc.TTLSi_Ver = TTLS_VERSION2;
BPX1IOC(&sd, 
          &command, 
          &lioc, 
          &ioc , 
          &rv, 
          &rc, 
          &rs); 
 printf("ioc.TTLSi_Cert_Len %d \n",ioc.TTLSi_Cert_Len); 
 printf("get cert  IOC\n"); 
 if (check("BPX1IOC Get cert",rv,rc,rs) != 0) 
    exit(1);

AT-TLS will return a userid for application OMVSAPPL, which may not be what was wanted.

Use the certificate to change the userid of the thread.

This uses pthread_security_applid_np to change the thread to the userid determined from the certificate, using the specified applid.

Using userid and password the code is

rc = pthread_security_applid_np(__CREATE_SECURITY_ENV, 
           __USERID_IDENTITY, 
           5,         // length of userid
           "COLIN",   // userid
           "PASSWORD",  // password- null terminated
           0,"OMVSAPPL"); 

Using a certificate is a little more complicated.

// use certificate to change userid 

char * applid = "ZZZ";                                                             
struct __certificate  ct; 
ct.__cert_type = __CERT_X509; 
char * pData = (char *)  ioc.TTLSi_BufferPtr; 
           // offsets are from start of header 
ct.__cert_length = ioc.TTLSi_Cert_Len; 
ct.__cert_ptr    =& pData[query.q1.TTLSQ_Offset] ; 
//printHex(stdout,ct.__cert_ptr, 66); 
rc = pthread_security_applid_np(__CREATE_SECURITY_ENV, 
         __CERTIFICATE_IDENTITY, 
         sizeof(ct),  // size of object
         &ct,         // address of object
         "xxxxxxxx",  // not used with certificate
         0,           // options
         applid); // this controls which applid security checks are done.
if ( rc != 0) 
{ 
  perror("pthread security"); 
  switch (errno)   // errno is only meaningful when the call failed 
  { 
  case ESRCH : 
    printf("ESRCH:" 
    "The user ID provided as input is not defined to the " 
    "security product or does not have an OMVS segment defined" 
    "\n"); 
    break; 
  } 
} 
if (rc != 0) 
{ 
lOutBuff = sprintf(&outBuff[0], 
"pthread_s... applid %s  rc = %d errno %d %s errno2 %8.8x\n\n", 
        applid, 
        rc,errno,strerror(errno),__errno2()); 
} 
else 
{ 
  userlen = 0;  
  rc = __getuserid(&userid[0], userlen); 
  if (rc != 0) 
     printf("getuser rc %d\n",rc); 
  printf("userid %d  %*.*s\n",userlen,userlen,userlen,userid); 
    lOutBuff = sprintf(&outBuff[0], 
          "pthread_s... applid %s rc = 0 userid %*.*s.\n", 
          applid, 
          userlen,userlen,userid); 
} 
printf("%s\n",outBuff); 

This certificate was mapped to userid ADCDA, and the userid ADCDA was printed. See Using certificates to logon to z/OS-Use a subject DN for the mapping.

Routine to print out the IOC and its data

int  printATTLS(struct  TTLS_IOCTL * pioc, 
                 int rv, int rc, int rs)
{ 
    if (check("printATTLS",rv,rc,rs) != 0) // check the return code
       return(8); 
    printf("query.header.TTLSHdr_BytesNeeded %d\n", 
        query.header.TTLSHdr_BytesNeeded); 
    printf("BPX1IOC Policy:%s\n",Stat_Policy[pioc->TTLSi_Stat_Policy]); 
    printf("BPX1IOC Conn  :%s\n", Stat_Conn[ pioc->TTLSi_Stat_Conn]); 
    printf("BPX1IOC Type  :%s\n", Set_type[ pioc->TTLSi_Sec_Type]); 
    char * pProt = "Unknown Protocol"; 
    switch( pioc->TTLSi_SSL_Prot) 
    { 
      case TTLS_PROT_UNKNOWN: pProt = "TTLS_PROT_UNKNOWN";break; 
      case TTLS_PROT_SSLV2  : pProt = "TTLS_PROT_SSLV2  ";break; 
      case TTLS_PROT_SSLV3  : pProt = "TTLS_PROT_SSLV3  ";break; 
      case TTLS_PROT_TLSV1  : pProt = "TTLS_PROT_TLSV1  ";break; 
      case TTLS_PROT_TLSV1_1: pProt = "TTLS_PROT_TLSV1_1";break; 
      case TTLS_PROT_TLSV1_2: pProt = "TTLS_PROT_TLSV1_2";break; 
      case TTLS_PROT_TLSV1_3: pProt = "TTLS_PROT_TLSV1_3";break; 
    }
//  printf("SSL Protocol Version  %u\n",  pioc->TTLSi_SSL_ProtVer); 
//  printf("SSL Protocol Modifier %hhu\n",  pioc->TTLSi_SSL_ProtMod); 
    printf("SSL protocol version %s\n",pProt); 
    if (pioc->TTLSi_Neg_Cipher[0] != 0 ) 
    { 
      if ( memcmp(&pioc-> TTLSi_Neg_Cipher[0] ,"4X",2) == 0) 
      printf("SSL Cipher  %4.4s\n",pioc->TTLSi_Neg_Cipher4   ); 
      else 
      printf("SSL Cipher    %2.2s\n",pioc->TTLSi_Neg_Cipher   ); 
    } 
    if (pioc->TTLSi_Neg_KeyShare[0]!= 0) 
      printf("SSL key share   %4.4s\n",pioc->TTLSi_Neg_KeyShare   ); 
    int lUserid = pioc->TTLSi_UserID_Len; 
    if (lUserid >0 ) 
    { 
      printf("userid %*.*s\n",lUserid,lUserid,&pioc->TTLSi_UserID[0]); 
    }
    if (pioc->TTLSi_BufferLen > 0 
       && pioc->TTLSi_BufferPtr > 0) 
    { 
      int len = 256; 
      if (pioc->TTLSi_BufferLen < len) 
      len = pioc->TTLSi_BufferLen; 
      //printHex(stdout,pioc->TTLSi_BufferPtr,len ); 
      char * pData = (char *) pioc->TTLSi_BufferPtr; 
                    // offsets are from start of header 
      if (query.q1.TTLSQ_Offset > 0) 
        printf("Rule name    %s.\n",&pData[query.q1.TTLSQ_Offset]); 
      else 
        printf("Rule name missing\n"); 
      if (query.q2.TTLSQ_Offset > 0) 
        printf("Group Action %s.\n",&pData[query.q2.TTLSQ_Offset]); 
      else 
        printf("Group Action missing\n"); 
      if (query.q3.TTLSQ_Offset > 0) 
        printf("Environment  %s.\n",&pData[query.q3.TTLSQ_Offset]); 
      else 
        printf("Environment  missing\n"); 
      if (query.q4.TTLSQ_Offset > 0) 
        printf("Connection   %s.\n",&pData[query.q4.TTLSQ_Offset]); 
      else 
        printf("Connection   missing\n"); 
   } 
   return 0;   // the function is declared int, so return a value on success 
}    
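
The comment "offsets are from start of header" is the key to reading the returned data. Here is a minimal, portable sketch of that idea. The mock_query and mock_quad structures and both functions are hypothetical stand-ins invented for illustration; they are not the real AT-TLS layouts.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the AT-TLS header and quadruplet.          */
/* The real layouts come from the z/OS headers and are not copied here.  */
struct mock_quad  { int key; int offset; int length; };
struct mock_query {
    char             ident[8];    /* eye catcher                         */
    struct mock_quad q1;          /* describes one requested value       */
    char             buffer[64];  /* returned values land in here        */
};

/* Return a pointer to the value a quadruplet describes, or NULL if      */
/* nothing was returned. The offset is relative to the start of the      */
/* whole structure, not to the start of the data buffer.                 */
const char *mock_value(const struct mock_query *q,
                       const struct mock_quad *quad)
{
    if (quad->offset <= 0 || (size_t)quad->offset >= sizeof(*q))
        return NULL;
    return (const char *)q + quad->offset;
}

/* Simulate what a successful query might leave behind for one key.      */
void mock_fill(struct mock_query *q, const char *value)
{
    memset(q, 0, sizeof(*q));
    strncpy(q->buffer, value, sizeof(q->buffer) - 1);
    q->q1.offset = (int)offsetof(struct mock_query, buffer);
    q->q1.length = (int)strlen(q->buffer);
}
```

After mock_fill(&q, "TNGA"), mock_value(&q, &q.q1) points at "TNGA". An offset of 0 is treated as "missing", which mirrors the tests in printATTLS.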

Header file

// used to query data
struct  { 
  struct TTLSHeader header; 
  struct TTLSQuadruplet q1; 
  struct TTLSQuadruplet q2; 
  struct TTLSQuadruplet q3; 
  struct TTLSQuadruplet q4; 
  struct TTLSQuadruplet q5; 
  char buffer[4096]; 
} query; 
 
// used in printing IOC                                                              
 char * Stat_Policy[]={ 
    "reserved", 
    "AT-TLS function is off", 
    "No policy defined for connection", 
    "Policy defined for connection - AT-TLS not enabled", 
    "Policy defined for connection - AT-TLS enabled", 
    "Policy defined for connection - AT-TLS enabled and " 
    "Application Controlled"};
char * Stat_Conn[] = { 
      "reserved", 
      "Connection is not secure", 
      "Connection handshake in progress", 
      "Connection is secure"}; 
char * Set_type[] = { 
    "reserved", 
    "Client", 
    "Server", 
    "Server with client authentication " 
    "ClientAuthType = PassThru", 
    "Server with client authentication " 
    "ClientAuthType = Full", 
    "Server with client authentication " 
    "ClientAuthType = Required ", 
    "Server with client authentication " 
    "ClientAuthType = SAFCheck"}; 
struct TTLS_IOCTL ioc;            // ioctl data structure 
char buff[1000];                  // buffer for certificate  

Aside on ClientHandshakeSNI

I spent a couple of hours trying to get this to work. I got ServerHandshakeSNI to work.

The documentation says

ClientHandshakeSNI


For TLSv1.0 protocol or later, this keyword specifies whether a client can specify a list of server names. The server chooses a certificate based on that server name list for this connection. For System SSL, the extension ID is set to GSK_TLS_SET_SNI_CLIENT_SNAMES and a flag is set in the gsk_tls_extension structure if it is required. Valid values are:

  • Required: Specifies that server name indication support must be accepted by the server. Connections fail if the server does not support server name indication.
    • Tip: When you specify ClientHandshakeSNI as required, specify SSLv3 as Off.
  • Optional: Specifies that server name indication negotiation is supported, but allows connections with servers that do not support server name indication negotiation.
  • Off: Specifies that server name indication is not supported. The function is not enabled. Connections fail if the server requires support for server name indication. This is the default.

I think this only applies when the program on z/OS is running as a client.

Using ServerHandshakeSNI

For example

ServerHandshakeSNI Required
ServerHandshakeSNIMatch Required
ServerHandshakeSNIList COLINCLIENZ/NISTECC521
ServerHandshakeSNIList CLIENT2/BB
  • If my client uses -servername COLINCLIENZ then the certificate with label NISTECC521 will be used.
  • If my client uses -servername CLIENT2 then the BB certificate will be used
  • With any other server name (or if it is omitted), the connection will fail.
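
The selection rule in the bullets above can be modelled as a simple table lookup. This is only a sketch of the behaviour described here, not how AT-TLS implements it. The function name sni_select_label is hypothetical, and the exact-match, case-sensitive comparison is an assumption.

```c
#include <stddef.h>
#include <string.h>

struct sni_entry { const char *server_name; const char *cert_label; };

/* The two ServerHandshakeSNIList entries from the example.              */
static const struct sni_entry sni_list[] = {
    { "COLINCLIENZ", "NISTECC521" },
    { "CLIENT2",     "BB"         },
};

/* Return the certificate label for a server name, or NULL when the name */
/* is omitted or not in the list. With ServerHandshakeSNI Required and   */
/* ServerHandshakeSNIMatch Required, NULL means the connection fails.    */
const char *sni_select_label(const char *server_name)
{
    if (server_name == NULL)
        return NULL;
    for (size_t i = 0; i < sizeof(sni_list) / sizeof(sni_list[0]); i++)
        if (strcmp(server_name, sni_list[i].server_name) == 0)
            return sni_list[i].cert_label;
    return NULL;
}
```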

Verify the sender

The client can use -verify_hostname ZZZZZZ to verify the name of the host.

Write instructions for your target audience – not for yourself.

Over the last couple of weeks, I’ve been asked questions about installing two products on z/OS. I looked at the installation documentation, and it was written the way I would write it for myself – it was not written for other people to follow.

I sent some comments to one of the developers, and as the comments mainly apply to the other products as well, I thought I would write them down – for when another product comes along.

I’ve been writing some documentation for AT-TLS, which lets you give applications TLS support without changing the application, so I’ll focus on a product using TCP/IP.

What is the environment?

The environment can range from one person running z/OS on a laptop, to running a Parallel Sysplex where you have multiple z/OS instances running as a Single System Image; and taking it further, you can have multiple sites.

What levels of software

Within a Sysplex you can have different levels of software, for example one image at z/OS 2.4 and another image at z/OS 2.5. You tend to upgrade one system to the next release, then when this has been demonstrated to be stable, migrate the other systems in turn.

Within one z/OS image you can have multiple levels of products, for example MQ 9.2.3 and MQ 9.1. People may have multiple levels so they test the newer level, and when it looks stable, they switch to the newer level and later remove the older level. If the newer level does not work in production – they can easily switch back to the previous level.

Each version may have specific requirements.

  • If your product has an SVC, you may need an SVC for each version, unless the higher level SVC supports the lower level code.
  • If your product uses a TCP/IP port, you will need a port for each instance.

You need to ensure your product can run in this environment, with more than one version installed on an image.
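
One way to allow a port per instance is to take the port from configuration rather than hard-coding it. Here is a minimal sketch, assuming a hypothetical environment variable MYPROD_PORT and a hypothetical default of 4000; neither comes from any real product.

```c
#include <errno.h>
#include <stdlib.h>

/* Parse a port number from text.                                        */
/* Returns the fallback when no value was supplied, and -1 when a value  */
/* was supplied but is not a valid port, so a typo is not silently       */
/* replaced by the default.                                              */
int parse_port(const char *text, int fallback)
{
    char *end = NULL;
    long  value;

    if (text == NULL || *text == '\0')
        return fallback;
    errno = 0;
    value = strtol(text, &end, 10);
    if (errno != 0 || *end != '\0' || value < 1 || value > 65535)
        return -1;
    return (int)value;
}

/* Each instance is started with its own MYPROD_PORT (hypothetical name). */
int instance_port(void)
{
    return parse_port(getenv("MYPROD_PORT"), 4000);
}
```

Two instances on the same image can then be started with, say, MYPROD_PORT=4001 and MYPROD_PORT=4002 without rebuilding anything.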

How do things run?

Often z/OS images and programs run for many months. For example, a site may IPL every three months to pick up the latest fixes, so your product instance may run for three months before restarting. If you write messages to the joblog, or have output going to the JES2 spool, you want to be able to purge old output without shutting down your instance. You can specify options to “spin” off output and make the file purge-able.

Your instance may need to be able to refresh its parameters. For example, if a key in a keyring changes, you need to close and reopen the keyring. This implies a refresh command, or the keyring is opened for each request.
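
A refresh can be sketched as a flag which is set asynchronously and checked before each request. In this sketch a SIGHUP signal stands in for whatever refresh command the product provides, and reopen_keyring is a hypothetical placeholder which just counts how often it was called.

```c
#include <signal.h>

/* Set asynchronously when a refresh has been requested.                 */
static volatile sig_atomic_t refresh_requested = 0;

static void on_refresh(int signo)
{
    (void)signo;
    refresh_requested = 1;        /* do only the minimum in the handler  */
}

/* Returns 0 if the handler was installed, -1 otherwise.                 */
int install_refresh_handler(void)
{
    return signal(SIGHUP, on_refresh) == SIG_ERR ? -1 : 0;
}

/* Hypothetical stand-in for closing and reopening the keyring.          */
static int keyring_generation = 0;
static void reopen_keyring(void)
{
    keyring_generation++;
}

/* Called once per request: pick up a pending refresh before doing any   */
/* work. Returns the keyring generation in use for this request.         */
int service_one_request(void)
{
    if (refresh_requested)
    {
        refresh_requested = 0;
        reopen_keyring();
    }
    return keyring_generation;
}
```

The design choice is that the expensive work happens in the normal request path, not in the signal handler, and only when a refresh was actually asked for.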

Who is responsible for the system?

For me – I am the only person using the system and I am responsible for everything.

For big systems there will be functions allocated to different departments:

  • Installation of software (getting the libraries and files to the z/OS image)
  • The z/OS systems team – creating and updating the base z/OS system
  • The Security team – this may be split into platform security (RACF), and network security
  • Data management – responsible for data, backup (and restore), migration of unused data sets to tape, ensuring there is enough disk space available.
  • Communications team – responsible for TCPIP connectivity, DNS, firewalls etc.
  • Database team – responsible for DB2 and other products
  • Liberty and z/OSMF etc built on top of Liberty.
  • MQ – responsible for MQ, and MQ to MQ connectivity.

Some responsibilities could be done by different teams, for example creating the security profile when creating a started task. This is a “security” task – but the z/OS systems programmer will usually do it.

How are systems changes managed?

Changes are usually made on a test system and migrated into production. I’ve seen a rule “nothing goes into production which has not been tested”. Some implications of this are

  • No changes are typed into production. A file can be copied into production, and a file may have symbolic substitution, such as SYSTEM=&SYSNAME. You can use cut and paste, but no typing. This eliminates problems like 0 being misread as O, and 1,i,l looking similar.
  • Changes are automated.
  • Every change needs a back-out process – and this back-out has been tested.
    • Delete is a two-phase operation. Today you do a rename rather than a delete; next week you do the actual delete. If there is a problem with the change you can just rename it back again. Some objects have non-obvious attributes, and if you recreate an object, it may be different, and not work the same way as it used to.
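
The two-phase delete can be sketched for a Unix file as follows. The ".deleted" suffix and the function names are hypothetical conventions chosen for this illustration; data sets and other objects would need their own rename mechanism.

```c
#include <stdio.h>

#define PARKED_SUFFIX ".deleted"   /* hypothetical naming convention     */

/* Build "<path>.deleted"; returns 0 on success, -1 if it does not fit.  */
static int parked_name(const char *path, char *out, size_t outlen)
{
    int n = snprintf(out, outlen, "%s%s", path, PARKED_SUFFIX);
    return (n > 0 && (size_t)n < outlen) ? 0 : -1;
}

/* Phase 1 (today): rename instead of deleting.                          */
int delete_phase1(const char *path)
{
    char parked[1024];
    if (parked_name(path, parked, sizeof(parked)) != 0)
        return -1;
    return rename(path, parked);
}

/* Back-out: rename the object back, so it keeps all its attributes.     */
int delete_backout(const char *path)
{
    char parked[1024];
    if (parked_name(path, parked, sizeof(parked)) != 0)
        return -1;
    return rename(parked, path);
}

/* Phase 2 (next week): do the actual delete.                            */
int delete_phase2(const char *path)
{
    char parked[1024];
    if (parked_name(path, parked, sizeof(parked)) != 0)
        return -1;
    return remove(parked);
}
```

Because the back-out is a rename and not a recreate, the object comes back exactly as it was.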

There are usually change review meetings. You have to write a change request, outlining

  • the change description
  • the impact on the existing system
  • the back-out plan
  • dependencies
  • which areas are affected.

You might have one change request for all areas (z/OS, security, networking), or a change request for each area, one for z/OS, one for security, one for networking.

Affected areas have to approve changes in their area.

How to write installation instructions

You need to be aware of differences between installing a product first time, and successive times. For example creating a security definition. It is easy to re-test an install, and not realise you already have security profiles set up. A pristine new image is great for testing installation because it is clean, and you have to do everything.

Instructions like

  • Task 1 – create sys1.proclib member
  • Task 2 – define security profile
  • Task 3 – allocate disk storage
  • Task 4 – define another security profile
  • Task 5 – update parmlib

may make sense when one person is doing the work, but not if there are many teams.

It is better to have a summary by role like

  • z/OS systems programmer
    • create proclib member
    • update parmlib
  • Security team
    • Define security profile 1
    • Define security profile 2
  • Storage management team
    • Allocate disk space

and have links to the specific topics. This way it is very clear what a team’s responsibilities are, and you can raise one change request per team.

This summary also gives a good road map so you can see the scale of the installation task.

It is also good to indicate if this needs to be done once only per z/OS image, or for every instance. For example

  • APF authorise the load libraries – once per z/OS image
  • Create a JCL procedure in SYS1.PROCLIB – once per instance

Some tasks for the different roles

z/OS system programmers

  • Create alias for MYPROD.* to a user catalog
  • APF authorise MYPROD…. datasets
  • Create PARMLIB entries
  • Update LNKLST and LPA
  • Update PROCLIB concatenation with product JCL
  • Create security profiles for any started tasks; which userid should be used?
  • WLM classification of the started task or job.
  • Schedule deletion of any log files older than a specified age
  • When multiple instances per LPAR, decide whether to use S MYSTASK1, S MYSTASK2, or S MYSTASK.T1, S MYSTASK.T2
  • Do you need to specify JESLOG SPIN to allow JES2 logs to be spun regularly, or when they are greater than a certain size, or any DD SYSOUT with SPIN?
  • ISPF
    • Add any ISPF Panels etc into logon procedures, or provide CLIST to do it.
    • Update your ISPF “extras” panel to add product to page.
  • Try to avoid SVCs. There are better ways, for example using authorized services.
  • Propagate the changes to all systems in the Sysplex.
  • What CF structures are needed. Do they have any specific characteristics, such as duplexed?
  • How much (e)CSA is needed, for each product instance.
  • Does your product need any Storage Class Memory (SCM).

Security team

  • Create groups as needed eg MYPRODSYS, MYPRODRO, and make requester’s userid group special, so they can add and remove userids to and from the groups.
  • Create a userid for the started task. Create the userid with NOPASSWORD, to prevent people logging on with the userid and password.
  • Protect the MYPROD.* datasets, for example members of group MYPRODSYS can update the datasets, members of group MYPRODRO only have read-only access.
  • Create any other profiles.
  • Create any certificate or keyrings, and give users access to them.
  • Set up profiles for who can issue operator commands against the jobs or procedures.
  • Does the product require an “applid”? For example, users must have access to a specific APPL to be able to use the facilities. An application can use pthread_security_applid_np to change the userid a thread is running under, but the users must have access to an applid. The default applid is OMVSAPPL.
  • Do users needing to use this product need anything specific? Such as id(0), needing a Unix Segment, or access to any protected resources? See below for id(0).
  • If a client authenticates to the server, the server needs access to BPX.SERVER in the RACF FACILITY.
  • The started task userid may need access to BPX.DAEMON.
  • If a userid needs access to another user’s keyring, the requester needs read access to user.ring.LST in CLASS(RDATALIB) or access to IRR.DIGTCERT.LISTRING.
  • If a userid needs access to a private key in a keyring, the requester needs control access to user.ring.LST in CLASS(RDATALIB).
  • You might need to program control data sets, for example RDEF PROGRAM * ADDMEM('SYS1.LINKLIB'//NOPADCHK) UACC(READ).
  • Users may need access to ICSF class CSFSERV and CSFKEYS.
  • Use of CLASS(SURROGAT) BPX.SRV.<userid> to allow one userid to be a surrogate for another userid.
  • Use of CLASS(FACILITY) BPX.CONSOLE to remove the generation of BPXM023I messages on the syslog.

Storage team

  • How much disk space is needed once the product has been installed, for data sets, and Unix file systems. This includes product libraries and instance data, and logs which can grow without limit.
  • How much temporary space is needed during the install.
  • Where do Unix files for the product go? for example /opt/ or /var….
  • Where do instance files go? For example on image local disks, or sysplex shared disks. If you have an instance on every member of the Sysplex, where do you put the instance files?
  • How much data will be produced in normal running – for example traces or logs.
  • When can the data be pruned?
  • Does the product need its own ZFS for instance data, to keep it isolated so it cannot impact other products?
  • Do any additional Storage Classes etc. need to be defined? These determine if and when datasets are migrated to tape, or get deleted.
  • Are any other definitions needed. For example for datasets MYPROD.LOG*, they need to go on the fastest disks, MYPROD.SAMPLES* can go on any disks, and could be migrated.

Database team

  • What databases, tables, indexes etc are required?
  • How much disk space is needed.
  • What volume of updates per second. Can the existing DB2 instances sustain the additional throughput?
  • What security and protection is needed at the table level and at the field level.
  • What groups are permitted to access which fields?
  • What auditing is needed?
  • Is encryption needed?

MQ

  • Do you need to use MQ Shared Queue between queue managers?
  • How much data will be logged per second?
  • What is the space needed for the message storage, disk space, buffer pool and Coupling Facility?
  • Product specific definitions.
  • Security protection of any product specific definitions.

Networking

  • Which port(s) to use?
    • Do you need to control access to ports with the SAF resource on the PORT entry, and permit access to profile EZB.PORTACCESS.sysname.tcpname.resname
    • Use of SHAREPORT and SHAREPORTWLM
  • Use of Sysplex Distributor to route work coming in to a Sysplex to any available system?
  • Update the port list – so only specific jobs can use it
  • RACF profile for port?
  • Which cipher specs
  • Which level of TLS
  • Which certificates
  • Any AT-TLS profile?
  • Any firewall changes?
  • Any class of service?
  • Any changes to syslogd profile?
  • Are there any additional sites that will be accessed, and so need adding to the “allow” list.

Automation

  • If the started tasks, or jobs need to be started at IPL, create the definitions. Do they have any pre-reqs, for example requiring DB2 to be active.
  • If the jobs are shutdown during the day, should they be automatically restarted?
  • Add automation to shut down any jobs or started tasks, when the system is shutdown
  • Which product messages need to be managed – such as events requiring operator action, or events reported to the site wide monitoring team.

Operations

  • Play book for product, how to start, and stop it
  • Are there any other commands?

Monitoring

  • Any SMF data needed to be collected.
  • Any other monitoring.
  • How much additional CPU will be needed – at first, and in the long term.

Making your product secure

Many sites are ultra careful about keeping their system secure. The philosophy is to give a user access to what they need to do, but no more. For example

  • They will not be comfortable installing a non-IBM SVC into their system. An SVC can be used from any address space, so if there is any weakness in the SVC it could be exploited.
  • Using id(0) (superuser) in Unix Services is not allowed. The userid needs to be given specific permission. If the code “changes userid” then services like pthread_security_applid_np() should be used; where the applid is part of the configuration. Alternatives include __login_applid. End users of this facility will need read access to the specific applid.

TLS and SSL

If you are using TLS there are other considerations

  • Any certificate you generate needs a long validity date, and JCL to recreate it when it expires.
  • If you create a Certificate Authority you need to document how to export it and distribute it to other platforms.
  • Browsers and applications may verify the host name, so you need to generate a certificate with a valid name. The external z/OS name may be different from the internal name.
  • You should support TLS 1.2 and TLS 1.3. Other TLS and SSL versions are deprecated.
  • It is good practice to have one keyring with the server certificate with its private key, and a “common” trust store keyring which has the Certificate Authorities for all the sites connecting to the z/OS image. If you connect to a new site, you update the common keyring, and all applications pick up the new CA. If you have one keyring just for your instance, you need to maintain multiple keyrings when a new certificate is added, one for each application.

Migrating from cc to xlc is like playing twister

I needed to compile a file in Unix System Services; I took an old make file, changed cc to xlc expecting it to compile and had lots of problems.

It feels like the documentation was well written in the days of the cc and c89 compiler, and has had a different beast inserted into it.

As I started to write this blog post, I learned even more about compiling in Unix Services on z/OS!

Make file using cc

cparmsa= -Wc,"SSCOM,DEBUG,DEF(MVS),DEF(_OE_SOCKETS),UNDEF(_OPEN_DEFAULT),NOOE 
cparmsb= ,SO,SHOW,LIST(),XREF,ILP32,DLL,SKIPS(HIDE)" 
syslib= -I'/usr/include' -I'/usr/include/sys'  -I"//'TCPIP.SEZACMAC'" -I"//'TCPIP.SEZANMAC'" 
all: main 
parts =  tcps.o 
main: $(parts)
  cc -o tcps  $(parts) 
                                                                                                                            
%.o: %.c 
 cc  -c -o $@   $(syslib) $(cparmsa)$(cparmsb)    -V          $< 
 
clean: 
 rm  *.o 

The generated compile statement is

cc -c -o tcps.o -I'/usr/include' -I'/usr/include/sys' -I"//'TCPIP.SEZACMAC'" -I"//'TCPIP.SEZANMAC'" -Wc,"SSCOM,DEBUG,DEF(MVS),DEF(_OE_SOCKETS),UNDEF(_OPEN_DEFAULT),NOOE,SO,SHOW,LIST(),XREF,ILP32,DLL,SKIPS(HIDE)" -V tcps.c

Note the following

  • the -V option generates the listing. “-V produces all reports for the compiler, and binder, or prelinker, and directs them to stdout”. If you do not have -V you do not get a listing.
  • -Wc,list() says generate a listing with a name like tcps.lst, based on the name of the file being compiled. If you use list(x.lst) it does not produce any output! This is contrary to what the documentation says. (Possibly a compiler bug when specifying NOOE.)
  • SHOW lists the included files
  • SKIPS(HIDE) omits the stuff which is not used – see below.

Make using xlc

I think the xlc compiler has bits from z/OS and bits from AIX (sensibly sharing code!). On AIX some parameters are passed using -q. You might use -qSHOWINC or -qNOSHOWINC instead of -Wc,SHOWINC

cparmsx= -Wc,"SO,SHOW,LIST(lst31),XREF,ILP32,DLL,SSCOM, 
cparmsy= DEBUG,DEF(MVS),DEF(_OE_SOCKETS),UNDEF(_OPEN_DEFAULT),NOOE" 
cparms3= -qshowinc -qso=./lst.yy  -qskips=hide -V 
syslib= -I'/usr/include' -I'/usr/include/sys'  -I"//'TCPIP.SEZACMAC'" -I"//'TCPIP.SEZANMAC'" 
all: main 
parts =  tcps.o 
main: $(parts) 
  cc -o tcps  $(parts) 
                                                                                                      
%.o: %.c 
 xlc -c -o $@   $(syslib) $(cparmsx)$(cparmsy) $(cparms3)     $< 
                                                                                                      
clean: 
 rm  *.o 

This generates a statement

xlc -c -o tcps.o -I'/usr/include' -I'/usr/include/sys' -I"//'TCPIP.SEZACMAC'" -I"//'TCPIP.SEZANMAC'" -Wc,"SO,SHOW,LIST(lst31),XREF,ILP32,DLL,SSCOM,DEBUG,DEF(MVS),DEF(_OE_SOCKETS),UNDEF(_OPEN_DEFAULT),NOOE" -qshowinc -qso=./lst.yy -qskips=hide tcps.c

Note the -q options. You need -qso=…. to get a listing.

Any -V option is ignored, and LIST(…) is not used.

Note: There is a buglet in the compiler, specifying nooe does not always produce a listing. The above xlc statement gets round this problem.

SKIPS(SHOW|HIDE)

The SKIPS(HIDE) also known as SKIPSRC shows you what is used, and suppresses text which is not used. I found this useful trying to find the combination of #define … to get the program to compile.

For example with SKIPS(SHOW)

170 |#if 0x42040000 >= 0X220A0000                               | 672     4      
171 |    #if defined (_NO_PROTO) &&  !defined(__cplusplus)      | 673     4      
172 |        #define __new210(ret,func,parms) ret func ()       | 674     4      
173 |    #else                                                  | 675     4      
174 |        #define __new210(ret,func,parms) ret func parms    | 676     4      
175 |    #endif                                                 | 677     4      
176 |#elif !defined(__cplusplus) && !defined(_NO_NEW_FUNC_CHECK)| 678     4      
177 |       #define __new210(ret,func,parms) \                  | 679     4      
178 |        extern struct __notSupportedBeforeV2R10__ func     | 680     4      
179 |   #else                                                   | 681     4      
180 |     #define __new210(ret,func,parms)                      | 682     4      
181 |#endif                                                     | 683     4      

With SKIPS(HIDE) the lines which are not used (174, 177, 178 and 180 above) are not displayed:

170 |#if 0x42040000 >= 0X220A0000                              | 629     4 
171 |    #if defined (_NO_PROTO) &&  !defined(__cplusplus)     | 630     4 
172 |        #define __new210(ret,func,parms) ret func ()      | 631     4 
173 |    #else                                                 | 632     4 
175 |    #endif                                                | 633     4 
176 |                                                          | 634     4 
179 |   #else                                                  | 635     4 
181 |#endif                                                    | 636     4 
182 |#endif                                                    | 637     4 

This shows

  • 170 The line number within the included file
  • 629 The line number within the file
  • 4 is the 4th included file. In the “I N C L U D E S” section it says 4 /usr/include/features.h
  • row 174 is missing … this is the #else text which was not included
  • rows 177, 178 and 180 are omitted.

This makes it much easier to browse through the includes to find why you have duplicate definitions and other problems.

Tracing AT-TLS on z/OS

AT-TLS on z/OS provides TLS support for applications by magically inserting itself into an application using TCP/IP, without changing the application.

You can collect a trace of AT-TLS starting up, but I was interested in tracing the handshake.

  • If syslogd (the system-wide program for collecting log data) is active, then the trace is written to the Unix file system.
  • If syslogd is not active, then the data is written to syslog.
  • You can configure it so errors get written to both syslog and syslogd.

My server COLATTLS started task is a program acting as a TCP/IP program, with Program Control, so my application gets to interface with AT-TLS, extract information and control the connection.

AT-TLS definitions

In my AT-TLS definitions I had

TTLSConnectionAction             TNCA 
{ 
  TTLSCipherParmsRef              TLS13TLS12 
  TTLSSignatureParmsRef           TNESigParms 
  TTLSConnectionAdvancedParmsRef  TNCOonAdvParms 
  CtraceClearText                 Off 
  Trace                          255 
} 

This trace statement traces everything. See below for a description of what is traced.

Using syslogd

Syslogd is a logging daemon. Applications write data to syslogd, and you configure syslogd to define where the output goes.

My syslog JCL started task procedure is:

//SYSLOGD PROC 
//* Read parms from /etc/syslog.conf 
//CONFHFS EXEC PGM=SYSLOGD,REGION=4096K,TIME=NOLIMIT, 
//         PARM='ENVAR("_CEE_ENVFILE_S=DD:STDENV")/-c -i       ' 
//STDENV   DD DUMMY 
//SYSPRINT DD SYSOUT=* 
//SYSIN    DD DUMMY 
//SYSERR   DD SYSOUT=* 
//SYSOUT   DD SYSOUT=* 
//CEEDUMP  DD SYSOUT=* 

This reads its control statements from /etc/syslog.conf (the default). See Configuring the syslog daemon. My file has

*.INETD*.*.*       /var/log/inetd 
auth.* /var/log/auth 
mail.* /var/log//mail -F 640 -D 770 
local1.err       /var/log/local1 
*.err            /var/log/errors 
*.CPAGENT.*.*       /var/log/CPAGENT.%Y.%m.%d  
*.TTLS*.*.*          /var/log/TTLS.%Y.%m.%d  
*.Pagent.*.*        /var/log/Pagent.%Y.%m.%d  
*.TCPIP.*.debug     /var/log/TCPIPdebug.%Y.%m.%d  
*.TCPIP.*.warning   /var/log/TCPIP.%Y.%m.%d  
*.TCPIP.*.err       /var/log/TCPIPerr.%Y.%m.%d  
*.TCPIP.*.info      /var/log/TCPIPinfo.%Y.%m.%d  
*.SYSLOGD*.*.*      /var/log/syslogd.%Y.%m.%d  
*.TN3270*.*.*       /var/log/tn3270 
*.SSHD*.*.*         /var/log/SSHD 

The output for *.TCPIP.*.debug goes to a file like /var/log/TCPIPdebug.2023.04.03

The configuration says, for example,

  • the output from TCPIP, with priority code debug or less goes to file /var/log/TCPIPdebug…
  • the output from TCPIP, with priority code info or less goes to file /var/log/TCPIPinfo…
  • the output from TN3270 goes to /var/log/tn3270 – for all priorities.

Because “debug” is debug or lower, the file will also contain the “info” messages. Some messages are written to multiple files.
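The way one message ends up in several files can be sketched in a few lines of Python. This is a simplified model, not how syslogd is implemented: it only looks at the job name and priority parts of each rule (ignoring userid and facility), it assumes the standard syslog priority ordering, and the file names drop the date suffix.

```python
from fnmatch import fnmatchcase

# Standard syslog priorities, most severe first (0) to least severe (7).
PRIORITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]

# A few rules from the configuration above: (jobname pattern, priority, file).
RULES = [
    ("TCPIP", "debug", "/var/log/TCPIPdebug"),
    ("TCPIP", "warning", "/var/log/TCPIP"),
    ("TCPIP", "err", "/var/log/TCPIPerr"),
    ("TCPIP", "info", "/var/log/TCPIPinfo"),
    ("TN3270*", "*", "/var/log/tn3270"),
]

def files_for(jobname, priority):
    """Return the files a message from jobname at priority is written to."""
    level = PRIORITIES.index(priority)
    matched = []
    for pattern, rule_priority, path in RULES:
        if not fnmatchcase(jobname, pattern):
            continue
        # Assumption: a rule selects its own priority and anything more severe.
        if rule_priority == "*" or level <= PRIORITIES.index(rule_priority):
            matched.append(path)
    return matched

print(files_for("TCPIP", "info"))  # the debug file and the info file
print(files_for("TCPIP", "err"))   # all four TCPIP files
```

In this model an err message from TCPIP lands in all four TCPIP files, which is why the debug file grows so quickly.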

Note: although my application started task was called COLATTLS, the ATTLS trace came out from job TCPIP, not COLATTLS.

AT-TLS trace

The trace for an application is configured with the Trace option in the AT-TLS definitions. The documentation says (for TTLSGroupAction and TTLSEnvironmentAction):

Trace

Specifies the level of AT-TLS tracing. The valid values for n are in the range 0 – 255. The sum of the numbers associated with each level of tracing selected is the value that should be specified as n. If n is an odd number, errors are written to joblog and all other configured traces are sent to syslogd. If this value is specified on the TTLSEnvironmentAction statement, it is used instead of the value from the TTLSGroupAction statement referenced by the same TTLSRule statement.

  • 0 – No tracing is enabled.
  • 1 (Error) – Errors are traced to the TCP/IP joblog
  • 2 (Error) – Errors are traced to syslogd. The messages are issued with syslogd priority code err.
  • 4 (Info) – Tracing of when a connection is mapped to an AT-TLS rule and when a secure connection is successfully initiated is enabled. The messages are issued with syslogd priority code info.
  • 8 (Event) – Tracing of major events is enabled. The messages are issued with syslogd priority code debug.
  • 16 (Flow) – Tracing of system SSL calls is enabled. The messages are issued with syslogd priority code debug.
  • 32 (Data) – Tracing of encrypted negotiation and headers is enabled. This traces the negotiation of secure sessions. The messages are issued with syslogd priority code debug.

This means that if you trace the negotiation, the records are written with priority debug. From the *.TCPIP.*.debug statement in my syslogd definitions, the output is written to /var/log/TCPIPdebug… .
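Because the Trace value is a sum of the individual levels, it can be split back into its parts with bit tests. The sketch below only knows the levels listed in the documentation extract above; any other bits in the value are reported as a leftover number rather than guessed at.

```python
# Trace levels listed in the documentation extract above.
LEVELS = {
    1: "Error (joblog)",
    2: "Error (syslogd)",
    4: "Info",
    8: "Event",
    16: "Flow",
    32: "Data",
}

def decode_trace(value):
    """Split a Trace value into the listed levels plus any leftover bits."""
    if not 0 <= value <= 255:
        raise ValueError("Trace must be in the range 0 - 255")
    names = [name for bit, name in LEVELS.items() if value & bit]
    leftover = value & ~sum(LEVELS)  # bits not covered by the list above
    return names, leftover

print(decode_trace(6))    # Error (syslogd) and Info only
print(decode_trace(255))  # every listed level, with 192 left over
```

To trace just the handshake to syslogd, a value such as 2 + 4 + 32 = 38 would select Error, Info and Data without the Flow and Event records.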

Info output

The information in the info output looks like (two records for one connection):

Apr 2 17:25:53 S0W1 TTLS[16842781]: 17:25:53 TCPIP
EZD1281I TTLS Map CONNID: 00000032 LOCAL: 10.1.1.2..4000
REMOTE: 10.1.0.2..60742 JOBNAME: COLATTLS USERID: START1
TYPE: InBound STATUS: Appl Control RULE: COLATTLJ ACTIONS:
TNGA TNEA TNCA

This gives information on which rule was selected. For example it gives the local and remote ip address and port; job name and userid. It shows that rule COLATTLJ was used with group TNGA, environment TNEA, and connection TNCA .
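If you want to pull these fields out of the log with a script, a regular expression split on the keywords works. This is a sketch which assumes the record has been joined into one line, and that the keyword list below (taken from the record above) is complete.

```python
import re

# The EZD1281I record from the log above, joined into one string.
RECORD = (
    "EZD1281I TTLS Map CONNID: 00000032 LOCAL: 10.1.1.2..4000 "
    "REMOTE: 10.1.0.2..60742 JOBNAME: COLATTLS USERID: START1 "
    "TYPE: InBound STATUS: Appl Control RULE: COLATTLJ ACTIONS: "
    "TNGA TNEA TNCA"
)

KEYWORDS = ["CONNID", "LOCAL", "REMOTE", "JOBNAME", "USERID",
            "TYPE", "STATUS", "RULE", "ACTIONS"]

def parse_map_record(record):
    """Return a dict of keyword -> value for an EZD1281I message."""
    # Split on "KEYWORD:" so multi-word values such as "Appl Control" survive.
    pattern = r"\b(" + "|".join(KEYWORDS) + r"):"
    parts = re.split(pattern, record)
    # parts is [prefix, key1, value1, key2, value2, ...]
    fields = {key: value.strip() for key, value in zip(parts[1::2], parts[2::2])}
    # Addresses are written as ip..port.
    for side in ("LOCAL", "REMOTE"):
        ip, _, port = fields[side].partition("..")
        fields[side] = (ip, int(port))
    return fields

info = parse_map_record(RECORD)
print(info["RULE"], info["ACTIONS"].split(), info["REMOTE"])
```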

Apr 2 17:25:53 S0W1 TTLS[16842781]: 17:25:53 TCPIP
EZD1283I TTLS Event GRPID: 00000007 ENVID: 00000003 CONNID: 00000032
RC: 0 Initial Handshake 0000005011440BB0
0000005011422870 TLSV1.2 C02C

This shows that for the same session (TTLS[16842781]) the initial handshake agreed on TLS V1.2 and cipher spec C02C.
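The four hex digits are the two-byte TLS cipher suite code. A small lookup turns the code into its IANA name. The table below is only a handful of common suites, not the full registry, and the function assumes the protocol and cipher are the last two words of the record.

```python
# IANA names for a few two-byte TLS cipher suite codes.
CIPHER_SUITES = {
    "C02B": "TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256",
    "C02C": "TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384",
    "C02F": "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256",
    "C030": "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384",
}

def describe_handshake(line):
    """Pull the protocol and cipher name from the tail of an EZD1283I record."""
    protocol, code = line.split()[-2:]
    return protocol, CIPHER_SUITES.get(code.upper(), "unknown cipher " + code)

print(describe_handshake("0000005011422870 TLSV1.2 C02C"))
```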

Debug output

For one connection, there were over 130 lines of output in the file.

Some example lines are

EZD1283I TTLS Event GRPID: 00000007 ENVID: 00000000 CONNID: 00000032
RC: 0 Connection Init

EZD1284I TTLS Flow GRPID: 00000007 ENVID: 00000004 CONNID: 00000032
RC: 0 Set GSK_KEYRING_FILE(201) start1/TN3270

EZD1282I TTLS Start GRPID: 00000007 ENVID: 00000003 CONNID:
00000032 Initial Handshake ACTIONS: TNGA TNEA TNCA
HS-ServerWithClientAuth

EZD1285I TTLS Data CONNID: 00000032 RECV CIPHER 1603010116
EZD1285I TTLS Data CONNID: 00000032 RECV CIPHER 0100011203031FDDC…
EZD1285I TTLS Data CONNID: 00000032 SEND CIPHER16030309BC0200005…

… RC: 0 Call GSK_SECURE_SOCKET_INIT – 0000005011440BB0
… RC: 0 Get GSK_CONNECT_SEC_TYPE(208) – TLSV1.2
… RC: 0 Get GSK_CONNECT_CIPHER_SPEC(207) – C02C

You get

  • Events – (trace 8 event)
  • the traffic data flowing up and down (trace 32 data)
  • the System SSL calls (with return code) (trace 16 flow)
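The hex string in the EZD1285I data records is the raw TLS traffic, so the first five bytes are a standard TLS record header: content type, protocol version and length. A minimal sketch to decode it, using only the content types defined in the TLS specification:

```python
CONTENT_TYPES = {20: "change_cipher_spec", 21: "alert",
                 22: "handshake", 23: "application_data"}

def parse_record_header(hex_text):
    """Decode the 5-byte TLS record header at the start of a hex string."""
    raw = bytes.fromhex(hex_text[:10])
    if len(raw) < 5:
        raise ValueError("need at least 5 bytes of record header")
    content_type = CONTENT_TYPES.get(raw[0], "unknown")
    version = (raw[1], raw[2])              # (3, 1) is the TLS 1.0 record version
    length = int.from_bytes(raw[3:5], "big")
    return content_type, version, length

print(parse_record_header("1603010116"))
```

For the first RECV record above this gives a handshake record of 278 bytes. The next record starts 01 000112 0303, which is a ClientHello (type 01) of 274 bytes asking for TLS 1.2 (0303).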

Trace output on syslog – when syslogd not active

Having AT-TLS writing to syslog is not a good idea – it can produce a lot of output. It may be acceptable on a small, low activity, system, tracing the minimum amount of data.

IEF403I COLATTLS - STARTED - TIME=17.16.51                             
BPXF024I (TCPIP) Apr  2 17:17:03 TTLS 16842781 : 17:17:03 TCPIP 
EZD1281I TTLS Map   CONNID: 0000002F LOCAL: 10.1.1.2..4000 REMOTE:     
10.1.0.2..43012 JOBNAME: COLATTLS USERID: START1 TYPE: InBound         
STATUS: Appl Control RULE: COLATTLJ ACTIONS: TNGA TNEA TNCA            
BPXF024I (TCPIP) Apr  2 17:17:03 TTLS 16842781 : 17:17:03 TCPIP 
EZD1283I TTLS Event GRPID: 00000007 ENVID: 00000000 CONNID: 0000002F   
RC:    0 Connection Init                                               
BPXF024I (TCPIP) Apr  2 17:17:03 TTLS 16842781 : 17:17:03 TCPIP 
EZD1282I TTLS Start GRPID: 00000007 ENVID: 00000001 CONNID: 00000000   
Environment Create ACTIONS: TNGA TNEA **N/A**                          
BPXF024I (TCPIP) Apr  2 17:17:03 TTLS 16842781 : 17:17:03 TCPIP 
EZD1283I TTLS Event GRPID: 00000007 ENVID: 00000002 CONNID: 00000000   
RC:    0 Environment Master Create 00000001                            
BPXF024I (TCPIP) Apr  2 17:17:03 TTLS 16842781 : 17:17:03 TCPIP    
EZD1284I TTLS Flow  GRPID: 00000007 ENVID: 00000002 CONNID: 0000002F   
RC:    0 Call GSK_ENVIRONMENT_OPEN - 0000005011421D10
...                  
                   

The output was produced when AT-TLS trace was enabled, and AT-TLS was not using the syslogd daemon.

The text in bold is the initial trace entry.

  • BPXF024I (TCPIP) Apr 2 17:17:03 TTLS 16842781 : 17:17:03 TCPIP is written because syslogd is not being used.
  • EZD1281I TTLS Map CONNID: 0000002F LOCAL: 10.1.1.2..4000 REMOTE: 10.1.0.2..43012 JOBNAME: COLATTLS USERID: START1 TYPE: InBound provides information about which AT-TLS rule is being used for the connection.
  • EZD1284I TTLS Flow GRPID: 00000007 ENVID: 00000002 CONNID: 0000002F RC: 0 Call GSK_ENVIRONMENT_OPEN – 0000005011421D10 shows you information about the System SSL call being used.

Using certificates to logon to z/OS.

I was testing certificate access to log on to z/OS using the pthread_security_applid_np() service. It took a little while to get my program working, but once it was working I tried to be “customer like” rather than doing a simple unit test. The documentation is not very clear, but it does work.

You can create a RACF definition for the simple case

  • For a certificate where the Subject Distinguished Name(DN) matches the definition’s Subject DN, then use userid xyz.
  • For a certificate where part of the Subject DN matches the definition’s Subject DN, then use userid ijklm.

These both default to using an APPLID OMVSAPPL

I wanted to say

  • for application id ZZZ, if a certificate with this subject DN, then use userid TESTZ
  • for application id YYY if a certificate with the same subject DN, then use userid TESTY

Topics covered in this post:

How to set up your enterprise certificates

If you want to control certificate to userid mapping, at an individual level

You can use the following, where SDN is the Subject Distinguished Name from the certificate,

RACDCERT MAP ID(....  ) SDN(...)

for each individual.

To remove access you just delete the profile (and refresh RACLIST).

You can also specify the Issuers Distinguished Name (IDN) so you refer to the correct certificate.

If you want to control certificate to userid mapping, at the group level

You can use a Subject DN filter. This may mean you need to define your Subject DN with Organisation Units(OU), as

  • CN=COLIN,OU=TEST,O=MYORG
  • CN=MARY,OU=SALES,O=MYORG

You can then have a filter SDN(‘OU=TEST,O=MYORG’) to allow just those in the OU=TEST group to logon, and sales people will not get access.

If you then want to prevent individuals from getting access you need to define a specific profile, and point it to a userid which can do nothing.
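The group filter and the "specific profile for one individual" idea can be modelled as below. This is a sketch of the logic, not of RACF itself. It assumes a filter matches when it equals the trailing components of the Subject DN, that the most specific matching filter wins, and that matching is case sensitive. The userids TESTUSER and NOACCESS, and the individual FRED, are made up for the example.

```python
def components(dn):
    """Split a DN such as 'CN=COLIN,OU=TEST,O=MYORG' into its parts."""
    return [part.strip() for part in dn.split(",")]

def filter_matches(subject_dn, dn_filter):
    """True if the filter equals the trailing components of the subject DN."""
    subject, wanted = components(subject_dn), components(dn_filter)
    return subject[-len(wanted):] == wanted

def pick_userid(subject_dn, mappings):
    """Choose the userid of the most specific (longest) matching filter."""
    matches = [(len(components(f)), userid)
               for f, userid in mappings.items() if filter_matches(subject_dn, f)]
    return max(matches)[1] if matches else None

# Hypothetical mappings: a group filter, plus one blocked individual.
MAPPINGS = {
    "OU=TEST,O=MYORG": "TESTUSER",
    "CN=FRED,OU=TEST,O=MYORG": "NOACCESS",
}

print(pick_userid("CN=COLIN,OU=TEST,O=MYORG", MAPPINGS))   # group filter applies
print(pick_userid("CN=MARY,OU=SALES,O=MYORG", MAPPINGS))   # no filter matches
print(pick_userid("CN=FRED,OU=TEST,O=MYORG", MAPPINGS))    # specific profile wins
```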

You could also have your certificates issued by different CAs. For example, have a certificate with an Issuer Distinguished Name(IDN) including the department name.

  • Subject DN(CN=COLIN,OU=TEST,O=MYORG) Issuer DN(OU=TEST,OU=CA,O=MYORG)

and specify IDN(OU=TEST,OU=CA,O=MYORG).

Note: with this, if someone changes department, and moves from Sales to Test, they will need a new certificate.

Some people may require more than one certificate. For example someone who has normal access for their day to day job, and super powers for emergencies.

Note: In addition to getting access with a certificate you may still want to use a password.

Define a certificate to user mapping (simple case)

Use a fully qualified Subject DN filter

For example, a certificate has SDN CN=docec256.O=Doc.C=GB .

//COLRACF  JOB 1,MSGCLASS=H 
//S1  EXEC PGM=IKJEFT01,REGION=0M 
//STEPLIB  DD  DISP=SHR,DSN=SYS1.MIGLIB 
//SYSPRINT DD SYSOUT=* 
//SYSTSPRT DD SYSOUT=* 
//SYSTSIN DD * 

RACDCERT DELMAP(LABEL('CP'))  ID(COLIN) 
RACDCERT MAP ID(COLIN  )  - 
   SDNFILTER('CN=docec256.O=Doc.C=GB')                  - 
   IDNFILTER('CN=SSCA256.OU=CA.O=DOC.C=GB')             - 
   WITHLABEL('CP') 
RACDCERT LISTMAP ID(COLIN) 
SETROPTS RACLIST(DIGTNMAP, DIGTCRIT) REFRESH 
/* 

This uses both Subject Distinguished Name, and Issuer Distinguished Name. You can use either or both filters.

With this definition I used openssl s_client to talk to my server application running on z/OS. When I specified the client certificate with the matching Subject DN and signer, I could use either:

  • AT-TLS, use BPX1IOC() and retrieve the userid. Internally this uses applid=OMVSAPPL
  • do it myself
    • retrieve the certificate with the BPX1IOC() TTLS_QUERY_ONLY call;
    • use rc = pthread_security_applid_np( __CREATE_SECURITY_ENV,…. “OMVSAPPL”) passing the certificate and the applid OMVSAPPL

The code checks that the extracted userid (COLIN) has read access to the profile OMVSAPPL in class(APPL) and, if it has access, it returns the userid to the application.

See below if you do not want to use applid OMVSAPPL

Use a partially qualified Subject DN filter

Instead of a fully qualified Subject DN such as CN=docec256.O=Doc.C=GB, you can use a partially qualified DN, for example O=Doc.C=GB. Any certificate which has a DN including O=Doc.C=GB will be accepted.

RACDCERT DELMAP(LABEL('CPGEN'))  ID(ADCDF) 
RACDCERT MAP ID(ADCDF  )  - 
   SDNFILTER('O=Doc.C=GB')                  - 
   WITHLABEL('CPGEN') 
RACDCERT LISTMAP ID(ADCDF) 
SETROPTS RACLIST(DIGTNMAP, DIGTCRIT) REFRESH 

With this definition I used openssl s_client to talk to my server application running on z/OS. When I specified a client certificate whose Subject DN matched the filter, it worked the same way as the simple case above.

The code checks that the extracted userid (ADCDF) has read access to the profile OMVSAPPL in class(APPL) and, if it has access, it returns the userid to the application.

Define a certificate with multiid mapping

A specific Subject DN gets a different userid depending on the application id.

As above you can specify an Issuer Distinguished Name or a Subject Distinguished Name.

Use an Issuer Distinguished Name

Use the Issuer DN of the CA certificate, so any certificate issued by this DN will be covered.

I defined the profile for my Issuer CA certificate. The definition below says – any certificate issued by this CA.

RACDCERT DELMAP(LABEL('CPMULTI')) MULTIID 

RACDCERT MULTIID MAP WITHLABEL('CPMULTI') TRUST - 
  IDNFILTER('CN=SSCA256.OU=CA.O=DOC.C=GB')      - 
  CRITERIA(ZZAPPLID=&APPLID) 

Define which userid to use based on the CRITeria

RDEFINE DIGTCRIT ZZAPPLID=ZZZ  APPLDATA('ADCDZ') 
RDEFINE DIGTCRIT ZZAPPLID=AAA  APPLDATA('ADCDF') 

In my program I had

struct __certificate  ct; 
ct.__cert_type = __CERT_X509; 
char * pData = (char *)  ioc.TTLSi_BufferPtr; 
       
ct.__cert_length = ...
ct.__cert_ptr    = ...;
rc = pthread_security_applid_np(__CREATE_SECURITY_ENV,
         __CERTIFICATE_IDENTITY, 
         sizeof(ct), 
         &ct, 
         "", 
         0,
        "AAA"); 

With this, the CRITERIA(ZZAPPLID=&APPLID) becomes CRITERIA(ZZAPPLID=AAA), which maps to CRITeria

RDEFINE DIGTCRIT ZZAPPLID=AAA  APPLDATA('ADCDF'), 

and so maps to userid ADCDF.

When applid ZZZ was used instead of AAA, the matching definition was

RDEFINE DIGTCRIT ZZAPPLID=ZZZ  APPLDATA('ADCDZ')

and so the connection maps to userid ADCDZ.
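The substitution and lookup described above can be modelled in a few lines. This is only a sketch of the logic as I understand it, using a Python dictionary in place of the DIGTCRIT class. It is not how RACF implements it.

```python
# The DIGTCRIT profiles defined above: profile name -> APPLDATA (the userid).
DIGTCRIT = {
    "ZZAPPLID=ZZZ": "ADCDZ",
    "ZZAPPLID=AAA": "ADCDF",
}

def resolve_userid(criteria, applid):
    """Substitute the applid into the criteria and look up the profile."""
    profile = criteria.replace("&APPLID", applid)
    return DIGTCRIT.get(profile)  # None when there is no matching profile

print(resolve_userid("ZZAPPLID=&APPLID", "AAA"))  # ADCDF
print(resolve_userid("ZZAPPLID=&APPLID", "ZZZ"))  # ADCDZ
print(resolve_userid("ZZAPPLID=&APPLID", "BBB"))  # None, no profile defined
```

The None case corresponds to using an applid which has no matching RDEFINE DIGTCRIT.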

AT-TLS only seems to be able to use the APPL OMVSAPPL (the default). I could not find a way of getting it to use a different APPLID, so I had to use pthread_security_applid_np to be able to use an applid different from the default.

Use a subject DN

You can use an explicit Subject DN

 RACDCERT MULTIID MAP WITHLABEL('CPMULTIU') TRUST - 
     SDNFILTER('CN=docecgen.O=Doc2.C=GB')            - 
   CRITERIA(UAPPLID=&APPLID) 

 RDEFINE DIGTCRIT UAPPLID=AAA  APPLDATA('ADCDA') 
 RDEFINE DIGTCRIT UAPPLID=BBB  APPLDATA('ADCDB') 

The userid used depends on which APPLID was specified in my application.

I could use a subset of the SDN

RACDCERT DELMAP(LABEL('CPMULTIU')) MULTIID 
RDELETE DIGTCRIT APPLID=OMVSAPPL 
RDELETE DIGTCRIT APPLID=ZZZ 
                                                                   
RACDCERT MULTIID MAP WITHLABEL('CPMULTIU') TRUST - 
    SDNFILTER('O=Doc2.C=GB')            - 
  CRITERIA(UAPPLID=&APPLID) 

RDEFINE DIGTCRIT UAPPLID=AAA  APPLDATA('ADCDD') 
RDEFINE DIGTCRIT UAPPLID=BBB  APPLDATA('ADCDE') 

Creating definitions

What the documentation does not say is that you can have any keyword in the criteria, as long as there is a DIGTCRIT profile for the keyword with the substituted value.

This means you can have

RACDCERT MULTIID MAP ...    CRITERIA(ZORK=&APPLID) 

and have statements like

RDEFINE DIGTCRIT ZORK=AAA  APPLDATA('IBMUSER')

Controlling access to the applid

If there is a profile for the applid in CLASS(APPL) the userid will need read access to the profile.

If the profile is not defined, then anyone can use the applid.

Problems I experienced

ESRCH: errno 143 EDC5143I No such process. errno2 0be8044c

I got messages

ESRCH:The user ID provided as input is not defined to the security product or does not have an OMVS segment defined
pthread_s… rc = -1 errno 143 EDC5143I No such process. errno2 0be8044c

Where TSO BPXMTEXT 0be8044c gives

BPXPTSEC 01/05/18
JRNoCertforUser: There is no userid defined for this certificate. Action: Ensure the userid is known to the SAF service.

Cause 1:

I used an APPLID which did not have a criteria

I also got this when I used pthread_security_applid_np(…) with an applid which did not have a matching RDEFINE DIGTCRIT

For example the following worked

  • applid AAA and RDEFINE DIGTCRIT UAPPLID=AAA APPLDATA(‘ADCDA’)

when I used a different applid I got the “EDC5143I No such process. errno2 0be8044c” message, along with the following on the joblog.

ICH408I USER(COLIN ) GROUP(SYS1 ) NAME(COLIN PAICE )
DIGITAL CERTIFICATE IS NOT DEFINED. CERTIFICATE SERIAL NUMBER(027C)
SUBJECT(CN=docecgen.O=Doc2.C=GB) ISSUER(CN=SSCA256.OU=CA.O=DOC.C=GB).

Cause 2:

In the definitions I had the wrong case: c=GB and c=gb.

Cause 3:

NOTRUST was specified

RACDCERT MULTIID MAP WITHLABEL('CPMULTIU') noTRUST -
SDNFILTER('CN=docecgen.O=Doc2.C=GB') -
CRITERIA(UAPPLID=&APPLID)

This also had a message on the job log:

ICH408I USER(START1 ) GROUP(SYS1 ) NAME(####################)
DIGITAL CERTIFICATE DEFINED TO A RESERVED USER ID. CERTIFICATE SERIAL
NUMBER(027C) SUBJECT(CN=docecgen.O=Doc2.C=GB) ISSUER(CN=SSCA256.OU=C
A.O=DOC.C=GB).

Cause 4:

I had specified the certificate’s DN in Issuers DN. It should have been SDNFILTER(‘CN=rsarsa.O=cpwebuser.C=GB’)

EMVSSAFEXTRERR 163 EDC5163I SAF/RACF extract error. errno2 0be8081c

pthread_security_applid_np() rc = -1 errno 163 EDC5163I SAF/RACF extract error. errno2 0be8081c

TSO BPXMTEXT 0be8081c , just gives

BPXPTSEC 01/05/18

with no message description

Cause 1:

I got this when I used

RACDCERT MULTIID MAP WITHLABEL('CPMULTIU') TRUST - 
    SDNFILTER('CN=docecgen.O=Doc2.C=GB')            - 
  CRITERIA(UAPPLID=&APPLID) 

and the SDNFILTER value was in upper case. When I corrected it (to the above) it worked.

It looks like the certificate filter was not found.

Cause 2:

I had a lower case keyword such as SDNFILTER(‘CN=docecgen.O=Doc2.c=GB’)

EMVSSAFEXTRERR 163 EDC5163I SAF/RACF extract error. errno2 0be80820

TSO BPXMTEXT 0be80820 gave

BPXPTSEC 01/05/18

and no other information.

I had

RACDCERT MULTIID MAP WITHLABEL('CPMULTIU') TRUST - 
    SDNFILTER('CN=docecgen.O=Doc2.C=GB')            - 
  CRITERIA(UAPPLID=&APPLID) 

RDEFINE DIGTCRIT UAPPLID=ZZZ  APPLDATA('ADCDB') 

and userid ADCDB had no access to the RACF resource class(APPL) ZZZ.

IBM reason code 0x0BE80820 is related to a SAF (System Authorization Facility) authorization error encountered during a session switch. This error indicates that the system believes the user lacks the necessary authorization to switch to the target user ID, even though the profile is intended to handle this authorization.

Local fix: Give users and groups READ access to profile OMVSAPPL in class APPL.

EMVSERR 157 EDC5157I An internal error has occurred. errno2 0be800fc.

TSO BPXMTEXT 0be800fc gave

BPXPTSEC 01/05/18
JRSAFNoUID: The user ID has no UID

Action: Create an OMVS segment with a UID.

Cause:

When I displayed the userid with LU ADCDG OMVS it gave UID= NONE.

I had created this situation using ALU ADCDG OMVS(NOUID). I reset it back using tso ALU ADCDG OMVS(UID(990098)).
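As a quick reference, the errno2 values and the causes I found can be kept in a small lookup table. The table only contains what I hit in my own testing, so a code may well have other causes.

```python
# errno2 values seen in this post, with the causes I found for each.
ERRNO2_CAUSES = {
    "0be8044c": ("ESRCH 143, JRNoCertforUser", [
        "applid has no matching RDEFINE DIGTCRIT",
        "wrong case in the DN in the definitions",
        "NOTRUST specified on the mapping",
        "the certificate's DN was specified as the Issuer's DN",
    ]),
    "0be8081c": ("EMVSSAFEXTRERR 163", [
        "SDNFILTER value in upper case",
        "lower case keyword in the SDNFILTER, such as c=GB",
    ]),
    "0be80820": ("EMVSSAFEXTRERR 163", [
        "mapped userid has no access to the applid profile in class(APPL)",
    ]),
    "0be800fc": ("EMVSERR 157, JRSAFNoUID", [
        "mapped userid has no UID in its OMVS segment",
    ]),
}

def explain(errno2):
    """Return (error, list of causes) for an errno2, or None if not seen."""
    key = errno2.lower()
    if key.startswith("0x"):
        key = key[2:]
    return ERRNO2_CAUSES.get(key)

print(explain("0x0BE80820"))
```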

How to import a certificate from openssl into z/OS

This question came up in an email exchange after someone had upgraded openssl from 1.1.1 to v3.

There are two certificate formats: text and binary.

Text certificate

The text certificate looks like

Certificate:                                        
    Data:                                           
        Version: 3 (0x2)                            
        Serial Number: 633 (0x279)                  
        Signature Algorithm: ecdsa-with-SHA384      
...
-----BEGIN CERTIFICATE-----                                      
MIICgTCCAiegAwIBAgICAnkwCgYIKoZIzj0EAwMwOjELMAkGA1UEBhMCR0IxDDAK 
BgNVBAoMA0RPQzELMAkGA1UECwwCQ0ExEDAOBgNVBAMMB1NTQ0EyNTYwHhcNMjMw 
MzE5MTIzODA1WhcNMjQwMTMwMTY0NjAwWjAuMQswCQYDVQQGEwJHQjEMMAoGA1UE 
CgwDRG9jMREwDwYDVQQDDAhkb2NlYzI1NjBZMBMGByqGSM49AgEGCCqGSM49AwEH 
...
-----END CERTIFICATE-----    

This has both a text version, and a base 64 encoded version within it.
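If you want to work with just the encoded part, you can pick out the block between the BEGIN and END lines and base64 decode it. A minimal sketch using only the Python standard library is below. The sample data is a stand-in and NOT a real certificate, because the certificate above is truncated.

```python
import base64
import re

PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----\s*(.*?)\s*-----END CERTIFICATE-----",
    re.DOTALL,
)

def extract_der(text):
    """Return the decoded bytes of the first PEM certificate block in text."""
    match = PEM_BLOCK.search(text)
    if match is None:
        raise ValueError("no PEM certificate block found")
    # Remove line breaks and padding blanks before decoding.
    body = "".join(match.group(1).split())
    return base64.b64decode(body)

# Stand-in data: the base64 body here is NOT a real certificate.
SAMPLE = """Certificate:
    Data:
        Version: 3 (0x2)
-----BEGIN CERTIFICATE-----
aGVsbG8g
d29ybGQ=
-----END CERTIFICATE-----
"""
print(extract_der(SAMPLE))
```

With a real certificate the decoded data is the binary (DER) form, which starts with the byte 0x30. That is why the base64 text of a certificate usually begins with MII.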

On z/OS create a dataset with this file. Then use JCL like

//S1  EXEC PGM=IKJEFT01,REGION=0M 
//SYSPRINT DD SYSOUT=* 
//SYSTSPRT DD SYSOUT=* 
//SYSTSIN DD * 
RACDCERT CHECKCERT('COLIN.DOCEC256.NEW.PEM')           

This checks the file is usable, and you can check the contents before you install it on your system.

It produced output like

Certificate 1:                                               
                                                             
  Start Date: 2023/03/19 12:38:05                            
  End Date:   2024/01/30 16:46:00                            
  Serial Number:                                             
       >0279<                                                
  Issuer's Name:                                             
       >CN=SSCA256.OU=CA.O=DOC.C=GB<                         
  Subject's Name:                                            
       >CN=docec256.O=Doc.C=GB<                              
  Subject's AltNames:                                        
    IP: 127.0.0.1                                            
    Domain: localhost                                        
  Signing Algorithm: sha384ECDSA                             
  Key Usage: HANDSHAKE, DATAENCRYPT, DOCSIGN, KEYAGREE       
  Key Type: NIST ECC                                         
  Key Size: 256    
...                                         

It also gave me

Chain information:                                           
  Chain contains 1 certificate(s), chain is incomplete       

This message is because “A certificate chain is considered incomplete if RACF is unable to follow the chain back to a self-signed ‘root’ certificate”.

This was true: RACF already had the CA in its database, and the CA certificate was not in the file.

When the file had already been imported I also got

Certificate 1:                                                     
Digital certificate information for CERTAUTH:
  Label: Linux-CARSA                                               

So I know it had already been imported into RACF.

I also got

Chain contains expired certificate(s)         

So I could tell I needed to get a newer certificate.

Add it to the RACF key store

There is no add-replace, so you have to delete it, then add it

RACDCERT DELETE - 
  (LABEL('DOCEC256')) ID(COLIN) 
RACDCERT ID(COLIN)    ADD('COLIN.DOCEC256.NEW.PEM') - 
  WITHLABEL('DOCEC256') 

SETROPTS RACLIST(DIGTNMAP, DIGTCRIT) REFRESH

The first time I ran this I got

RACDCERT DELETE   (LABEL('DOCEC256')) ID(COLIN)                                                           
IRRD107I No matching certificate was found for this user.                                                 
READY                                                                                                     
RACDCERT ID(COLIN)    ADD('COLIN.DOCEC256.NEW.PEM')   WITHLABEL('DOCEC256')                               
IRRD199I Certificate with label 'DOCEC256' is added for user COLIN.                                       
IRRD175I The new profile for DIGTCERT will not be in effect until a SETROPTS REFRESH has been issued.     

The second time I ran it I got

RACDCERT DELETE   (LABEL('DOCEC256')) ID(COLIN)                                                                   
IRRD176I RACLISTed profiles for DIGTCERT will not reflect the deletion(s) until a SETROPTS REFRESH is issued.     
READY                                                                                                             
RACDCERT ID(COLIN)    ADD('COLIN.DOCEC256.NEW.PEM')   WITHLABEL('DOCEC256')                                       
IRRD199I Certificate with label 'DOCEC256' is added for user COLIN.                                               
IRRD175I The new profile for DIGTCERT will not be in effect until a SETROPTS REFRESH has been issued.             

Refresh the RACLISTed data

If the classes are RACLISTed (cached in memory), you need

SETROPTS RACLIST(DIGTNMAP, DIGTCRIT) REFRESH

What happens if it is already in the key store under a different userid?

I tried adding it for a different userid and got

RACDCERT ID(ADCDA) ADD('COLIN.DOCEC256.NEW.PEM') WITHLABEL('DOCEC256')
IRRD109I The certificate cannot be added. Profile 0279.CN=SSCA256.OU=CA.O=DOC.C=GB is already defined.

Unfortunately it does not tell you the id it has already been defined for.

The only way I found of finding this information is

  • displaying all resources under class(DIGTCERT).
  • find the resource with the matching name.

For example

//COLRACF  JOB 1,MSGCLASS=H 
//S1  EXEC PGM=IKJEFT01,REGION=0M 
//SYSPRINT DD SYSOUT=* 
//SYSTSPRT DD SYSOUT=* 
//SYSTSIN DD * 
RLIST DIGTCERT  * 
/* 
// 

gave me output like

CLASS      NAME                                                  
-----      ----                                                  
DIGTCERT   0279.CN=SSCA256.OU=CA.O=DOC.C=GB                      
                                                                 
...                                                           
                                                                 
APPLICATION DATA                                                 
----------------                                                 
ADCDF                                                            

This shows the current owner is ADCDF.

The serial number of the certificate is 0279, and the components are separated with periods ‘.’.

If the name has a blank in it you will get a value like

COLIN4Certification¢Authority.OU=TEST.O=COLIN

where the blank is replaced with the cent sign(¢).
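To search the RLIST output for a particular certificate, it helps to build the profile name you expect to see. The sketch below is based only on the examples above: it assumes the name is the serial number, a period, then the issuer's DN, with any blank replaced by the cent sign. The function names are my own.

```python
CENT = "\u00a2"  # the cent sign, used in place of a blank

def digtcert_profile_name(serial, issuer_dn):
    """Build serial.issuerDN, with blanks shown as the cent sign."""
    return (serial + "." + issuer_dn).replace(" ", CENT)

def split_profile_name(name):
    """Split a profile name back into (serial, issuer DN with blanks)."""
    serial, _, issuer = name.partition(".")
    return serial, issuer.replace(CENT, " ")

print(digtcert_profile_name("0279", "CN=SSCA256.OU=CA.O=DOC.C=GB"))
print(split_profile_name("0279.CN=SSCA256.OU=CA.O=DOC.C=GB"))
```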

Binary certificate, PKCS12, .P12

You should be able to upload the .p12 file to z/OS as a binary file and import it using the same JCL as above.

However in openssl V3 the packaging of the certificate has changed, and RACF does not yet support it.

The easiest way of getting a certificate into z/OS is to use the .pem file (grin)