What’s hammering my Linux Ethernet and how do I stop it?

I was downloading some stuff on one machine, and noticed that my Ethernet connection had a very high throughput – but it was doing nothing useful. This blog post gives some of the things I did to identify and resolve the problem.

Mount the file system

I used the command

sshfs colin@10.1.0.3:/home/zPDT/ ~/mountpoint

to mount the file system from 10.1.03 on my local machine.

Identify the problem

I used the Linux command nload to show the network activity.

For my wireless link (downloading a big file) the output was

I cannot currently reproduce the sustained Ethernet usage problem.

Wireshark showed my a lot of activity for SSH from port 55401 to port 22.

If you do not have access to Wireshark, the following command show all the socket activity which may help.

ss -t -a -i -O |grep delivery|awk '{print $4,$5, " ", $30,$31 }'

To find the owner of port 55401 I used the show socket command

ss -p |grep 55104
tcp ESTAB 0 0 10.1.0.2:55104 10.1.0.3:ssh users:(("ssh",pid=7258,fd=3))

This gave me the process id of the owner of the port. The ps command gives more information

ps -ef |grep 7258
colinpa+ 7258 ... ssh -x -a -oClearAllForwardings=yes -2 colin@10.1.0.3 -s sftp

Showing the sftp to 10.1.0.3.

How to stop the sftp?

The documentation for sshfs says use the fusermount3 command.

$fusermount3 -u ~/mountpoint 
fusermount3: failed to unmount /home/colinpaice/mountpoint: Device or resource busy

I needed to use the lazy unmount option -z

 fusermount3 -z  -u ~/mountpoint

and this successfully unmounted the remote file system

Chaff

I found out that information can be obtained from the profile of key strokes, and so chaff has been added to the SSH flow.

I fixed it by using setting ObscureKeystrokeTiming no in /etc/ssh/ssh_config. The documentation says

Specifies whether ssh(1) should try to obscure inter-keystroke timings from passive observers of network traffic. If enabled, then for interactive sessions, ssh(1) will send keystrokes at fixed intervals of a few tens of milliseconds and will send fake keystroke packets for some time after typing ceases. The argument to this keyword must be yes, no or an interval specifier of the form interval:milliseconds (e.g. interval:80 for 80 milliseconds). The default is to obscure keystrokes using a 20ms packet interval. Note that smaller intervals will result in higher fake keystroke packet rates.

Whoops I just used the wrong 3270 window.

For my z/OS system, I have multiple 3270 session. For example one has an all powerful userid, one has my normal userid, and one has a userid with no authority. I usually position them left to right so it is obvious which session I am using.

I recently had an incident where I disconnected my external monitor, used z/OS, then reconnected my external monitor. The 3270 sessions were not in their usual places, and I used the right hand session to do something, to find I was using the all powerful userid.

I’ve now fixed this by using

x3270 -model 5 -bd red tso@localhost:3270 &

where -bd red says give the window a red border. Of course if I do not look at the border, it will not help – but I hope it will.

My session now looks like

If you display the x3270 options, you will not find -bd mentioned. x3270 uses some of xterm, which has options which include:

  • -bd color This option specifies the color to use for the border of the window. The default is “black.”
  • -bg color This option specifies the color to use for the background of the window. The default is “white.”
  • -bw number This option specifies the width in pixels of the border surrounding the window.
  • -fg color This option specifies the color to use for displaying text. The default is “black.”
  • -fn font This option specifies the font to be used for displaying normal text. The default is fixed.
  • -name name This option specifies the application name under which resources are to be obtained, rather than the default executable file name. Name should not contain “.” or “*” characters.
  • -rv This option indicates that reverse video should be simulated by swapping the foreground and background colors.
  • +rv Disable the simulation of reverse video by swapping foreground and background colors.
  • -title string This option specifies the window title string, which may be displayed by window managers if the user so chooses. The default title is the command line specified after the -e option, if any, otherwise the application name.

Using and debugging RACF CLASS(APPL) and pthread_security_np

I’ve blogged I want to be someone else – or using pthread_security_np, how a server can be passed a userid or password to logon to a server to do some work. For example I was logging on from a web server to the RMF GPMSERVE for displaying RMF reports in a a web browser.

I had followed the documentation, but all my userids had access, when none of them should have done. This blog post covers the steps I used to dig into this.

The RACF calls used, have a flag set “Suppress any RACF Messages”, which makes it harder to diagnose problems.

With most RACF calls you can define a profile

ADDSD 'COLIN.PROT.DATASET' UACC(NONE) WARNING

where the WARNING option says “If the userid does not have access – give the userid access, but write a message on the console”. This allows you to find when you need to grant access. Once the profile is established, you can use the ALTSD … NOWARNING. Userids requesting access, but which are not permitted will now fail.

Defining a profile with CLASS(APPL), the warning has no effect. I had to use NOTIFY(COLIN) to get notified of problems.

Tracing the requests

To trace all of the RACF calls made by my job COLINNT I issued

#set trace(callable(all),racroute(all),jobname(COLINNT))      

This gave me many hundreds of calls.

Once I had been through the whole process, I found the trace for the pthread_security_np etc calls was just

set trace(callable(type(38)),jobname(COLINNT))

Collect a trace

See Collecting and understanding a RACF GTF trace output.

Looking at the trace output

There is a trace record “before” (PRE) matching “after” (POST). Many parameters are the same (as you might expect).

The output of these traces is verbose, with unnecessary information and with extra blanks lines. For example for one parameter

   Area length:                  00000008 

Area value:
D6C6C6E2 C5E30000 | OFFSET.. |

Area length: 00000004

Area value:
00000000 | .... |

Area length: 00000008

In the description below, I’ve squashed this down to a single line

area length  8  OFFSET0000 length 4 00000000 

It looks like the field OFFSET0000 has the offset in hex from somewhere. I didn’t find this information useful.

A compressed trace record is given below. For the interpretation of the fields see the following trace record.

RTRACE
OMVSPRE
Service number 00000026
Parameters
area length x6c
area length 8 OFFSET0000 length 4 00000000
area length 8 OFFSET0004 length 4 00000000
area length 8 OFFSET0008 length 4 00000000
area length 8 OFFSET000C length 4 00000000
area length 8 OFFSET0010 length 4 40404040
area length 8 OFFSET0014 length 4 00000000
area length 8 OFFSET0018 length 4 40404040
area length 8 OFFSET001C length 1 01
area length 8 OFFSET0020 length 4 C4800000
area length 8 OFFSET0024 length 6 05c1c2c3c4C2 = .ABCDB
area length 8 OFFSET0028 length 4 00000000
area length 8 OFFSET002C length 9 08C7D7D4 E2C5D9E5 C5 = .GPMSERVE
Internal information
area length 8 OFFSET0034 length x37 = "Server Userid=IBMUSER created a full ACEE"
area length 8 OFFSET0048 length 4 00000000
area length 8 OFFSET0050 length 9 08000000 00000000 00
area length 8 OFFSET0054 length 1 00
area length 8 OFFSET0018 length 4 40404040
Internal data
area length xc0 ACEE
area length x50 userid information
area length x90 ACEX
area length x50 USP

hex dump of record

The RACF commands documentation says for a callable service, 0x00000026 is function number for IRRSIA00. This page gives the name of the function name IRRSIA00 and the description z/OS kernel on behalf of servers that use pthread_security_np servers or __login, or MVS servers that do not use z/OS UNIX services.

The parameters are for the initACEE (IRRSIA00) call.

The matching “after” record, OMVSPOST was (with the parameters of the the IRRSIA00 call format, parameters) are

RTRACE
OMVSPOST
Service number: 00000026
RACF Return code: 00000008
RACF Reason code: 00000020

Parameters
area length 8 OFFSET0000 length 4 00000000 >work_area<
area length 8 OFFSET0004 length 4 00000000 >ALET<
area length 8 OFFSET0008 length 4 00000008 >SAF_return_code<
area length 8 OFFSET000C length 4 00000000 >ALET<
area length 8 OFFSET0010 length 4 00000008 >RACF_return_code<
area length 8 OFFSET0014 length 4 00000000 >ALET<
area length 8 OFFSET0018 length 4 00000020 >RACF_reason_code<
area length 8 OFFSET001C length 1 01 >Function_code 1 = Create an ACEE<
area length 8 OFFSET0020 length 4 C4800000 >Attributes - see below<
area length 8 OFFSET0024 length 6 05c1c2c3c4C2 >Userid .ABCDB<
area length 8 OFFSET0028 length 4 00000000 >ACEE_pointer<
area length 8 OFFSET002C length 9 08C7D7D4 E2C5D9E5 C5 >APPLID = .GPMSERVE<
Internal information
area length 8 OFFSET0034 length x37 = "Server Userid=IBMUSER created a full ACEE"
area length 8 OFFSET0048 length 4 00000000
area length 8 OFFSET0050 length 9 08000000 00000000 00
area length 8 OFFSET0054 length 1 00
area length 8 OFFSET0018 length 4 40404040
Internal data
area length xc0 ACEE
area length x50 userid information
area length x90 ACEX
area length x50 USP

There the attributes C480000 mean

  • X80000000 – Create the ACEE
  • X40000000 – Createthe USP for the userid
  • X04000000 – Suppress any RACF Messages
  • X00800000 – Return an OUSP in the output area

Note the flag:X04000000 – Suppress any RACF Messages.

The return code

area length  8  OFFSET0008 length 4 00000008 >SAF_return_code<
...
area length 8 OFFSET0010 length 4 00000008 >RACF_return_code<
area length 8 OFFSET0014 length 4 00000000 >ALET<
area length 8 OFFSET0018 length 4 00000020 >RACF_reason_code<

8,8,32 means The user does not have appropriate RACF access to either the SECLABEL, SERVAUTH profile, or APPL specified in the parmlist.

For other RACF services the trace entries follow a similar format.

Python how do I convert a STCK to readable time stamp?

As part of writing a GTF trace formatter in Python I needed to covert a STCK value to a printable value. I could do it in C – but I did not find a Python equivalent.

from datetime import datetime
# Pass in a 8 bytes value
def stck(value):
value = int.from_bytes(value)
t = value/4096 # remove the bottom 12 bits to get value in micro seconds
tsm = (t /1000000 ) - 2208988800 # // number of seconds from Jan 1 1970 as float
ts = datetime.fromtimestamp(tsm) # create the timestamp
print("TS",tsm,ts.isoformat()) # format it

it produced

TS 1735804391.575975 2025-01-02T07:53:11.575975

Tracing encrypted data to z/OS

I had blogged Collecting a wire-shark trace with TLS active for a browser where you could specify an environment variable export SSLKEYLOGFILE=$HOME/sslkeylog.log. OpenSSL would write the key to this file, and Wireshark could decrypt the traffic using this data.

Unfortunately this only worked with RSA keys. I could not get it to work with modern Elliptic Curve keys.

I’ve updated my zWireshark program to capture AT-TLS application data in clear text from the z/OS side. It uses an IBM provided API, and captures the traffic between AT-TLS and the application.

You need to set up security profiles, for example

permit  EZB.TRCSEC.*.*.ATTLS            - 
CL(SERVAUTH) id(ADCDB) access(READ)
permit EZB.TRCCTL.S0W1.TCPIP.DATTRACE -
CL(SERVAUTH) id(ADCDB) access(READ)
permit EZB.TRCCTL.S0W1.TCPIP.OPEN -
CL(SERVAUTH) id(ADCDB) access(READ)
permit EZB.TRCCTL.*.*.* CL(SERVAUTH) id(ADCDB) access(READ)
SETROPTS RACLIST(SERVAUTH) refresh

and change the AT-TLS configuration to include CtraceClearText On

For my web browser traffic it produced (printed with the ASCII switch)

Data trace.  Data length 345.  ATTLS Clear Text.                             
<GPMSERVE 11:01:57.258946 Src 10.1.1.2 Port 8803 Dst 10.1.0.2 Port 45168
Warning: 199 RMF-DDS-Server SeverityCode(03) Data(0)
Content-Location: perform_20250102110157_20250102110157.xml
Cache-Control: max-age=30
Date: Thu, 02 Jan 2025 11:01:57 GMT
Connection: close
Content-Length: 1468
Content-Type: application/xml
X-UA-Compatible: IE=edge
X-Content-Type-Options: nosniff
X-XSS-Protection: 1; mode=block

When the AT-TLS option CtraceClearText was Off, the output was

Data trace.  Data length 0.  PTH_Flag.Confidential.                           
<GPMSERVE 11:27:48.733616 Src 10.1.1.2 Port 8803 Dst 10.1.0.2 Port 52860

So no confidential data was displayed.

The JCL is

//COLINC5    JOB 1,MSGCLASS=H,COND=(4,LE) 
// SET LOADLIB=COLIN.ZWIRESHA.LOAD
//RUN EXEC PGM=TCPDATA,REGION=0M,PARMDD=MYPARMS
//STEPLIB DD DISP=SHR,DSN=&LOADLIB
//SYSPRINT DD SYSOUT=*,DCB=(LRECL=200)
//SYSOUT DD SYSOUT=*
//SYSERR DD SYSOUT=*
//* MYPARMS needs a / to split LE parms from program parms
//* needs a blank before each parm, because trailing blanks removed
//MYPARMS DD *
/
--IP 10.1.0.2
--WAIT 60
--DEBUG 0
--DISCARD 1
--PRINT A
/*

Please let me know of any problems or suggestions

Collecting and understanding a RACF GTF trace output

A RACF request can be

  • A callable service, such as IRRSDL00 which is used to extract information about keyrings. This can be used by High Level Languages. These have a trace type of OMVS
  • A RACROUTE request, an assembler macro interface, such as FASTAUTH. These have a trace type of RACF.

You get a trace entry PRE the call, and and a trace entry POST the call. A callable service would have an OMVSPRE, and an OMVSPOST entry.

When setting up your RACF trace you need to define which callable service, or which RACROUTE requests you want to trace.

Example of a callable server

Entry trace

Trace Identifier:             00000036 
Record Eyecatcher: RTRACE
Trace Type: OMVSPRE
Ending Sequence: ........
Calling address: 00000000 20801A0F
Requestor/Subsystem: ........ ........
Primary jobname: IBMINC5
Primary asid: 00000029
Primary ACEEP: 00000000 008FA8A8
Home jobname: IBMINC5
Home asid: 00000029
Home ACEEP: 00000000 008FA8A8
Task address: 00000000 008D6A88
Task ACEEP: 00000000 00000000
Time: DFF36A27 5460CC00
Error class: ........
Service number: 00000029
RACF Return code: 00000000
RACF Reason code: 01000000
Return area address: 00000000 00000000
Parameter count: 00000021

Interesting information

  • Trace Type: OMVSPRE this is an entry trace
  • Primary jobname: IBMINC5
  • Home jobname: IBMINC5
  • The RACF return and reason code have no meaning on entry
  • Service number: 00000029 is hex so the service is 41. The RACF commands documentation says for a callable service, 41 is function number IRRSDL00.

Exit trace

Trace Identifier:             00000036 
Record Eyecatcher: RTRACE
Trace Type: OMVSPOST
Ending Sequence: ........
Calling address: 00000000 20801A0F
Requestor/Subsystem: ........ ........
Primary jobname: IBMINC5
Primary asid: 00000029
Primary ACEEP: 00000000 008FA8A8
Home jobname: IBMINC5
Home asid: 00000029
Home ACEEP: 00000000 008FA8A8
Task address: 00000000 008D6A88
Task ACEEP: 00000000 00000000
Time: DFF36A27 5468DC40
Error class: ........
Service number: 00000029
RACF Return code: 00000008
RACF Reason code: 00000014
Return area address: 00000000 00000000
Parameter count: 00000021

Interesting fields

  • Trace Type: OMVSPOST this is the exit trace
  • As above service number 41 is function
  • The RACF Return code is 8
  • The RACF reason code is 14

Example of a RACROUTE trace

Entry

Trace Identifier:             00000036 
Record Eyecatcher: RTRACE
Trace Type: RACFPRE
Ending Sequence: ........
Calling address: 00000000 A096A08A
Requestor/Subsystem: CRYPTO CRYPTO
Primary jobname: CSF
Primary asid: 00000038
Primary ACEEP: 00000000 008F6B98
Home jobname: IBMEXP
Home asid: 00000029
Home ACEEP: 00000000 008FA8A8
Task address: 00000000 008D6A88
Task ACEEP: 00000000 00000000
Time: DFF37DD1 AB7FF040
Error class: ........
Service number: 00000002
RACF Return code: 00000000
RACF Reason code: 00000000
Return area address: 00000000 00000000
Parameter count: 00000009

This trace entry was from a batch job IBMEXP, using an ICSF service to access an encryption key.

Interesting fields

  • RACFPRE this is a RACROUTE before entry
  • Requestor/Subsystem: CRYPTO which z/OS component
  • Primary jobname: CSF This was the address space which issued the RACROUTE request
  • Home jobname: IBMEXP This was the job requesting an ICSF service
  • Service number: 2. The RACF commands documentation says for a RACROUTE service, 2 is FASTAUTH
  • Ignore RACF return and RACF reason codes. They are set on exit,

Exit

Trace Identifier:             00000036 
Record Eyecatcher: RTRACE
Trace Type: RACFPOST
Ending Sequence: ........
Calling address: 00000000 A096A08A
Requestor/Subsystem: CRYPTO CRYPTO
Primary jobname: CSF
Primary asid: 00000038
Primary ACEEP: 00000000 008F6B98
Home jobname: IBMEXP
Home asid: 00000029
Home ACEEP: 00000000 008FA8A8
Task address: 00000000 008D6A88
Task ACEEP: 00000000 00000000
Time: DFF37DD1 AB85DE40
Error class: ........
Service number: 00000002
RACF Return code: 00000008
RACF Reason code: 00000000
Return area address: 00000000 00000000
Parameter count: 00000009

This trace entry was from a batch job IBMEXP, using an ICSF service to access an encryption key.

Interesting fields

  • RACFPOST this is a RACROUTE after exit entry
  • Requestor/Subsystem: CRYPTO which z/OS component
  • Primary jobname: CSF This was the address space which issued the RACROUTE request
  • Home jobname: IBMEXP This was the job requesting an ICSF service
  • Service number: 2. The RACF commands documentation says for a RACFROUTE service, 2 is FASTAUTH
  • RACF Return code: 00000008, RACF Reason code: 00000000 This shows there was a problem. The documentation for FASTAUTH shows
    • RC=8 -> the user or group is not authorized to use the resource.
    • RS=0 -> The invoker does not need to log the request.

The RACF trace command for this was

#SET TRACE(JOBNAME(csf),callable(none),RACROUTE(type(2)))

and the following commands

S GTF.GTF
R 1,trace=usrp
R 2,USR=(F44)
R 3,END
R 4,U

To trace a callable I used

#SET TRACE(JOBNAME(colinc5),callable(type(53)),RACROUTE(none))

Fast start GTF

I can use

S GTF.GTF,M=GTFRACF

Where my GTF proc has

//GTFNEW  PROC M=GTFPARM,ID=SYS1 
//DELETE EXEC PGM=IEFBR14
//IEFRDER DD DSNAME=&ID..TRACE,UNIT=SYSDA,SPACE=(TRK,20),
// DISP=(MOD,DELETE)
//IEFPROC EXEC PGM=AHLGTF,PARM='MODE=EXT,DEBUG=NO,TIME=YES,NP',
// TIME=1440,REGION=2880K
//IEFRDER DD DSNAME=&ID..TRACE,UNIT=SYSDA,SPACE=(TRK,20),
// DISP=(NEW,KEEP)
//SYSLIB DD DSNAME=USER.Z24C.PROCLIB(&M),DISP=SHR

I have a member in USER.Z24C.PROCLIB(GTFRACF)

TRACE=USRP 
USR=(F44)
END

I want to be someone else – or using pthread_security_np

There are times when you are running a job, and you want to do some work as a different userid.

You can configure a userid so it acts as a surrogate – I can submit jobs with your userid, without knowing your password.

You can have a server and have a thread within the server run as different userid by specifying a userid and password, using the pthread_security_applid_np function. You can also have it logon using a certificate, and it looks up the userid from the certificate.

You can extend/restrict this by specifying an RACF APPL which the user must have access to.

RDEFINE APPL YYYY 
PERMIT  CLASS(APPL) YYYY ID(COLIN) ACCESS(READ) 
SETROPTS RACLIST(appl ) REFRESH

My job running with userid IBMUSER specifies a userid COLIN, and COLIN’s password. If applid YYYY is specified, userid COLIN must have read access to the YYYY in CLASS(APPL).

If you use pthread_security_np() is the same as pthread_security_applid_np() and uses the default OMVSAPPL.

My example C program is

Main

int main( int argc, char *argv[]) 
{ 
  pthread_t thid; 
  int rc; 
  void *ret; 
  if (pthread_create(&thid, NULL, thread, "thread 1") != 0) { 
    perror("pthread_create() error"); 
    exit(1); 
  } 
  rc =pthread_join(thid, &ret); 
  printf("Pthread join %d\n",rc); 
  if (rc  != 0) { 
 // perror("pthread_create() error"); 
 // return(3); 
  } 
  printf("thread exited with '%s'\n", ret); 
  return 0  ; 
} 

  #pragma runopts(POSIX(ON)) 
  /*Include standard libraries */ 
  #include <stdio.h> 
  #include <stdlib.h> 
  #include <string.h> 
  #include <stdarg.h> 
  #include <errno.h> 
  #define _OPEN_SYS 1 
  #include <pthread.h> 
                                                                    
  void *thread(void *arg) { 
  char *ret = ""; 
  int rc; 
  printf("thread() entered with argument '%s'\n", arg); 
  rc = pthread_security_applid_np(__CREATE_SECURITY_ENV, 
              __USERID_IDENTITY, 
              5, 
              "COLIN", 
              "PASSW0RD", 
              0,"AAAA"); 
  perror("perror"); 
  printf("colin rc = %d errno %d\n",rc,errno); 
  if (rc == 0) 
  { 
     FILE * f2  ; 
     f2 = fopen("DD:TEST","r                      ") ; 
     printf("DD:TEST  %d\n",f2); 
   } 
  pthread_exit(ret); 
  } 



int main( int argc, char *argv[]) 
{ 
  pthread_t thid; 
  int rc; 
  void *ret; 
  if (pthread_create(&thid, NULL, thread, "thread 1") != 0) { 
    perror("pthread_create() error"); 
    exit(1); 
  } 
  rc =pthread_join(thid, &ret); 
  printf("Pthread join %d\n",rc); 
  if (rc  != 0) { 
    perror("pthread_create() error"); 
   return(3); 
  } 
  printf("thread exited with '%s'\n", ret); 
  return 0  ; 
} 

When the above program ran it gave me

ICH408I USER(COLIN   ) GROUP(SYS1    ) NAME(COLIN PAICE         ) 
  IBMUSER.TRY.PEM CL(DATASET ) VOL(C4USR1)                        
  INSUFFICIENT ACCESS AUTHORITY                                   
  ACCESS INTENT(READ   )  ACCESS ALLOWED(NONE   )                 

which shows that the userid COLIN was used.

EDC5167I Access to the UNIX System Services version of the C RTL is denied. (errno2=0xC10F0001 C10F0001) .

You cannot use the pthread_security… functions in the main thread. You have to attach a subtask using pthread_create().

EDC5143I No such process. (errno2=0x0BE80000 0BE80000)

Invalid userid specified.

EDC5111I Permission denied. (errno2=0x0BE80000 0BE80000)
Invalid password.

EDC5163I SAF/RACF extract error. (errno2=0x0BE8081C 0BE8081C)

Revoked userid.

EDC5168I Password has expired. (errno2=0x0BE80000 0BE80000)

Obvious.

EDC5163I SAF/RACF extract error. (errno2=0x0BE80820 0BE80820)

No OMVS segment, or the new user does not have access to the specified applid class.

rc = pthread_security_np() defaults to pthread_security_applid_np() with an applid of OMVSAPPL

ICH70004I 
ATTEMPTED 'READ' ACCESS OF USER(ADCDE) GROUP(TEST) NAME(ADCDE) ENTITY 'OMVSAPPL' IN CLASS 'APPL' 

EDC5139I Operation not permitted. (errno2=0x0BE802AF 0BE802AF)

ICH420I PROGRAM CERT FROM LIBRARY COLIN.LOAD CAUSED THE ENVIRONMENT TO
BECOME UNCONTROLLED.
BPXP014I ENVIRONMENT MUST BE CONTROLLED FOR SERVER (BPX.SERVER)
PROCESSING.

Needed

RALTER PROGRAM CERT ADDMEM('COLIN.LOAD'//NOPADCHK)
SETROPTS WHEN(PROGRAM) REFRESH

Easy AT-TLS configuration and reporting

This blog post discusses AT-TLS configuration, and my EasyAT-TLS code on GitHub to make it easy to configure AT-TLS and give an compact and useful configuration report.

A lot of documentation is written by experts for experts, from the developer’s viewpoint; rather than by experts for people who just want to get the job done. It feels that AT-TLS exposes to much of the structure of how data is held internally. I found it was very easy to get lost, and configuring it is hard. z/OSMF has a “configure AT-TLS” workflow – but it just creates a workflow of the steps. It did not make it easier to configure AT-TLS.

As a general philosophy, rather than the traditional approach of “here are all the keywords in alphabetical order”, I would provide information in different topics.

  • Keywords that everyone uses, and what you will need to get started,
  • More advanced keywords that most people might use,
  • Even more advanced keywords that only experts would use.

and then list the keywords alphabetically within topic. For an inexperienced user this makes it very clear what options they need to specify to get started. If you have a configuration tool (or online documentation) have a button which allows you to specify what your experience level is, and display or configure the information at the appropriate level.

The problem

What does an AT-TLS configuration file look like?

Part of the definition for one rule is

TTLSRule                      COLATTLJ 
{
LocalPortRange 4000
Jobname COLCOMPI
Userid COLIN
Direction BOTH
RemoteAddr 10.1.2.2/32
TTLSGroupActionRef TNGA
TTLSEnvironmentActionRef TNEA
TTLSConnectionActionRef TNCA
}
TTLSGroupAction TNGA
{
TTLSEnabled ON
}
TTLSEnvironmentAction TNEA
{
HandshakeRole ServerWithClientAuth
TTLSKeyringParms
{
Keyring start1/TN3270
}
TTLSSignatureParmsRef TNESigParms
}
TTLSSignatureParms TNESigParms
{
CLientECurves Any
SignaturePairs 060305030403

}......

There is a lot of structure, with values beginning with TTLS.

Without the structure the data looks like

policyRule : COLATTLJ
LocalAddr : All
RemoteAddr : '10.1.1.2/32'
LocalPortRange : 4000-4000
JobName : COLCOMPI
UserId : COLIN
Direction : Both
TTLSEnabled : On
Trace : 255
HandshakeRole : ServerWithClientAuth
Keyring : start1/TN3270
TLSv1.1 : Off
TLSv1.2 : On
TLSv1.3 : Off
HandshakeTimeout : 3
ClientECurves : Any
ServerCertificateLabel : NISTECCTEST
V3CipherSuites : [
003D TLS_RSA_WITH_AES_256_CBC_SHA256,
C02C TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,
]

Which I find much clearer.

Once you have configured AT-TLS, you can use the Unix command pasearch to report the configuration. This gives output like

policyRule:             AZFClientRule                                  
Rule Type: TTLS
...
Time Periods:
Day of Month Mask:
First to Last: 1111111111111111111111111111111
Last to First: 1111111111111111111111111111111
Month of Yr Mask: 111111111111
Day of Week Mask: 1111111 (Sunday - Saturday)
Start Date Time: None
End Date Time: None
...
TTLS Condition Summary: NegativeIndicator: Off
Local Address:
FromAddr: All
ToAddr: All
Remote Address:
FromAddr: 0.0.26.137
ToAddr: 0.0.26.137
LocalPortFrom: 0 LocalPortTo: 0
RemotePortFrom: 0 RemotePortTo: 0
JobName: AZF* UserId:
ServiceDirection: Outbound
...

The output for one rule is over 180 lines long, and gives the configuration, including any defaults; not what is actually used. (Some fields can be configured in more than one place, so you need to know which is actually used). Compare this with my tool’s output above.

Easy AT-TLS on GitHub

Displaying a concise view of the data.

On GitHub is code which removes all of the “structure” from the pasearch report to leave just the non default data. You get, for one rule

policyRule : COLATTLJ
LocalAddr : All
RemoteAddr : '10.1.1.2/32'
LocalPortRange : 4000-4000
JobName : COLCOMPI
UserId : COLIN
Direction : Both
TTLSEnabled : On
Trace : 255
HandshakeRole : ServerWithClientAuth
Keyring : start1/TN3270
TLSv1.1 : Off
TLSv1.2 : On
TLSv1.3 : Off
HandshakeTimeout : 3
ClientECurves : Any
ServerCertificateLabel : NISTECCTEST
V3CipherSuites : [
003D TLS_RSA_WITH_AES_256_CBC_SHA256,
C02C TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,
]

formatted as a YAML file, which I find much easier to use!

The output of the pasearch file was 2296 lines, the output file from my Python program was 184 lines!

Generating AT-TLS definitions

I found it hard to generate AT-TLS definitions because you need to know the AT-TLS structure; or have an existing entry to copy. For example a keyring is defined in a TTLSKeyringParms definition in TTLSEnvironmentAction statement which is pointed to by a high level TTLSRule. See the partial example above. You should not need to know this information.

As an end user I just want to define a keyring, and let the computer put it into the “internal format”.

With AT-TLS, you can group common information into a unit, and refer to that. I found I got lost trying to work out what I was using, and what else used it. Making one small change is complex as you had to do lots of copying to generate new groups, and then you have these groups lying around which may not be used again.

Generating AT-TLS definitions the easy way

The code to extract the key information from a pasearch output produces a YAML file. The genATTLS code takes a YAML file and generates the AT-TLS input file with all of the structure that AT-TLS requires. The output file can then be used by the Policy Agent on z/OS.

I’ve written the code from an end user perspective. For example I have a YAML file which defines two rules.

---
policyRule : Rule1
Priority : 255
LocalAddr : All
RemoteAddr : All
LocalPortRange : 6794-6794
Direction : Inbound
TTLSEnabled : On
Trace : 255
HandshakeRole : ServerWithClientAuth
Keyring : start1/TN3270
TLSv1.1 : Off
TLSv1.2 : On
TLSv1.3 : Off
HandshakeTimeout : 120
ServerCertificateLabel : RSA2048
---
policyRule : Rule2
Priority : 255
LocalAddr : All
RemoteAddr : All
LocalPortRange : 6793-6793
Direction : Inbound
TTLSEnabled : On
Trace : 255
HandshakeRole : Server
Keyring : start1/TN3270
TLSv1.1 : Off
TLSv1.2 : On
TLSv1.3 : Off
CertificateLabel : RSA2048
ServerCertificateLabel : RSA2048

This can be simplified by using “BasedOn” to refer to other sections, such as the “common” rule

---
# This first section is common to the others
policyRule : common
TLSv1.1 : Off
TLSv1.2 : On
TLSv1.3 : Off
Keyring : start1/TN3270
Priority : 255
LocalAddr : All
RemoteAddr : All
Direction : Inbound
TTLSEnabled : On
Trace : 255
CertificateLabel : RSA2048
ServerCertificateLabel : RSA2048
---
policyRule : Rule1
BasedOn : common
LocalPortRange : 6794-6794
HandshakeRole : ServerWithClientAuth
HandshakeTimeout : 120
---
policyRule : Rule2
BasedOn : common
LocalPortRange : 6793-6793
HandshakeRole : Server
---

The Python script generates the definitions from this file, 110 lines of output from a 28 line input file!

It is easy to extend. You just specify the overrides

policyRule : Rule2A
BasedOn : Rule2
TLSv1.3 : On
Priority: 300

The priority:300 says this definition should be used over Rule2, because Rule2 only has priority 255.

Testing multi system ICSF on a single system.

I only have a single z/OS image, and as part of testing ICSF in a multi system environment, I needed a second ICSF environment. I am the only user on my system, so I can stop and start ICSF as I please.

I set up a second set of ICSF datasets, and a second CSPRMxx parmlib member. I then used

P CSF
S CSF,PRM=C2
D ICSF,KDS

The D ICSF,KDS command, tells you what ICSF data sets are being used.

When using CSFKGUP I was careful to specify the correct dataset

//STEP10 EXEC PGM=CSFKGUP
// SET CKDS=COLIN.SCSFCKDS.NEW

To go back to my primary system.

P CSF
S CSF

What ICSF APIs do I need to use for AES CIPHER keys

I’ve done a lot of work using ICSF to generate keys programmatically for data set encryption, and I found it hard to remember what the different APIs are,and what options to specify. This blog post covers the main APIs.

AES keys are better than DES keys, because they are more secure (take longer to break) and use less CPU.

There are different key tokens formats within ICSF. AES uses a variable key format. Within AES keys you can have types

  • CIPHER used for encrypting data – such as data set encryption
  • EXPORTER and IMPORTER, these are used to encrypt keys for sending to remote systems. They are known as Key Encrypting Keys.
  • DATA – CIPHER keys are better than DATA because they have more granular control. You can say a CIPHER key can be used to encrypt an AES key – but not a DES key. With DATA it is all or nothing.
  • Others – I haven’t played with these

For many APIs you can give the name of a token in the CKDS data set. The names are 64 characters long. You can also use a copy of a token in memory, for example your program reads a token from the CKDS, or you have been sent a token from a remote systems. These will have a length typically shorter than 64 characters. ICSF uses the length you specified to determine what information is being passed in.

Generating a key token

You can use CSNBKGN2 (Key Generate2) to generate a key token.

You can specify type CIPHER. This generates a key token using some defaults. The defaults are not suitable for generating keys for data set encryption.

You can specify type TOKEN. You pass a skeleton token key with some of the options specified. You could generate this yourself, setting all of the bits you need, but it is easier to use the Key Token Build2 (CSNBKTB2) API to generate the key token. For example you specify rule options as character strings such as “CIPHER”,”XPRTCPAC”,”ANY-MODE”, which are needed for data set encryption.

You can specify the hex string to be used as the key value, by using the skeleton. You specify clear_key_bit_length and the clear_key_value. If clear_key_bit_length is zero in the skeleton, no key value is used and the system generates a random value. You will need to specify a valid value in the Key Generate2 field clear_key_bit_length. If you want to specify the hex value, set the two clear_key_bit_length fields to the same value in CSNBKTB2 and CSNBKGN2.

You can create a key token for use

  • on the current system. Once you have successfully generated your key token, you can use it to encrypt data, but you are more likely to store it in the local PKDS service using the PKDS Key Record Create (CSNDKRC) API. You might need to use the Key Record Delete (CSNDKRD) API if the key already exits in the PKDS. There is no write with replace service.
    • At a later date you can use the ICSF API to extract the key and encrypt with an EXPORTER or IMPORTER key to send to a remote site
  • and/or you can get a copy encrypted with a Key Exporting Key (EXPORTER or IMPORTER) so it can be securely sent to another site. You can
    • export it using an EXPORTER Key Encrypting Key to send to a remote site.
    • export it using an IMPORTER key, so it can be imported into the local system at a later date. I do not think this is a common scenario.

List the CKDS

You can search the CKDS passing various criteria using function Key Data Set List (CSFKDSL). It returns the label of each record matching the criteria. For each record you can use Key Data Set Metadata Read (CSFKDMR) which displays information such as create time, and if it is archived or not. You can read the record from the CKDS using CKDS Key Record Read2 (CSNBKRR2). The actual key value is encrypted, but there are many fields which are available, such as this is an AES CIPHER key.

You can get the size of the key value within the token using the Key Test2 (CSNBKYT2) API, with the KEY-LEN option.

Get the checksum of the token

There is an API to get the checksum of a token. (I do not think this checksum is stored in the token, as you can specify different algorithms.) If you send the token to a remote system, you can check the checksum is the same as the original, and that the token has not been changed. The Key Test2 (CSNBKYT2) API, with the GENERATE option, returns the checksum value. You can specify which algorithm to use, for example SHA-256 or ENC-ZERO. There is a VERIFY option which seems to compare the value you pass with the calculated value, and if they do not match gives reason code 9000.

Reading a key from the CKDS

You can use the CKDS Key Record Read2 (CSNBKRR2) API to read a record from the CKDS into memory. There is an API CKDS Key Record Read (CSNBKRC2) which reads old format records from the CKDS.