With a traditional TLS application, the application code has to drive TLS itself: for example, it specifies the keystore and which cipher specs to use, and it encrypts and decrypts the data. The application then issues TCP send and receive requests as usual.
With AT-TLS, the TLS work is moved out of the application and into the TCPIP subsystem. The application just does the normal sends and receives, and TCPIP does the work of establishing the session and handling the encryption. Rules and policies define how the session should be established. AT-TLS uses the PAGENT (Policy Agent) address space to manage the configuration.
Is it easier than having MQ or WAS Liberty do the TLS stuff? – I don’t think so. When it works it is fine. Getting it working is a challenge, because the trace and diagnostics are poor.
My other blog posts on PAGENT and AT-TLS
- Setting up syslogd on z/OS
- Configuring PAGENT for AT-TLS
- Trace PAGENT and AT-TLS
- Netstat, TTLS and AT-TLS
What is PAGENT?
Having used PAGENT to configure AT-TLS with TCPIP, I see PAGENT as a program which reads configuration information from a file and gives the configuration to TCPIP. TCPIP then does the work.
It feels as if the PAGENT setup and configuration were not designed with the z/OS environment in mind. It “breaks” so many things.
- You can have only one PAGENT running per LPAR – even with a different name. This means you cannot have a “test” and a “production” PAGENT in the same LPAR.
- PAGENT can be configured to have information on:
  - Common Intrusion detection services (IDS).
  - Common IP filtering, and manual and dynamic virtual private network (VPN) tunnels (IPSEC).
  - Common Routing (Policy-based routing enables the TCP/IP stack to make routing decisions that take into account criteria other than just the destination IP address. The additional criteria can include job name, source port, destination port, protocol type (TCP or UDP), source IP address, NetAccess security zone, and multilevel secure environment security label).
  - AT-TLS Common definitions.
  - AT-TLS for TCPIP Image level which can have sections on
- As there is only one active PAGENT allowed per LPAR, you have to make your configuration changes to the production PAGENT, refresh it, and fix any configuration errors. The documentation says “make a change to production, if it doesn’t work back out the changes”!
- There is one initial configuration file per PAGENT, which can “include” other files. You cannot have a concatenated list of files.
- You cannot validate definitions before making them active. The configuration is processed only when the referenced TCPIP stack is active.
- Error messages do not have message numbers, so you cannot look up the error messages.
- It lacks good diagnostics. For example
  - I got error message “Resource temporarily unavailable” when it could not find the security profile “EZB.INITSTACK.*.TCPIP2” on my system. The PAGENT code checks to see if the profile exists and, if not, it dies quietly. It does not actually use the security profile; doing so would cause RACF to produce a message about the missing profile.
  - I deliberately configured a statement to point to a file that does not exist. It just reported …processing_Stmt_TTLSConfig: processing: ' TTLSConfig //'USER.Z24C.TCPPARMS(BLAHBLAH)'. It should report file not found. Some missing files get “Cannot get FILE handle for information.”
My set up
I could not find any good guidance on setting up PAGENT and AT-TLS, so I’ve documented what I did. It may not be correct…
It took about a day to understand the AT-TLS setup. I was a typical user, and my typos and similar mistakes slowed me down.
I naively assumed errors would be reported in //SYSPRINT. On my system they were in /tmp/pagent.log. This file location can be configured with an environment variable.
The output can be verbose, so I use oedit and the ISPF find command
f err 15 25
to find the errors. This looks for “err” in columns 15 to 25, which matches the SYSERR and OBJERR fields.
When errors occur, you do not get file and line number of the error. You have to hunt around. Invalid statements are often just ignored.
With a configuration error the PAGENT job gave me a message on syslog
EZZ8438I PAGENT POLICY DEFINITIONS CONTAIN ERRORS FOR TCPIP : TTLS
In the /tmp/pagent.log file I had
05/30 07:21:12 EVENT :005: pinit_fetch_TTLS_policy_profile: Processing Image TTLS config file: '//'USER.Z24C.TCPPARMS(TTLS)'' for image 'TCPIP'
05/30 07:21:12 OBJERR :005: process_TTLS_attribute_table: Unknown attribute 'ZocalAddr' for TTLSRule
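Because the log is verbose and gives no file or line number for the failing statement, it can help to pull out just the error lines. The following is a minimal Python sketch, not part of PAGENT: the line layout is inferred only from the two log lines shown above, and the names `find_errors` and `SAMPLE` are made up for illustration.

```python
import re

# Layout inferred from the log excerpt in this post (an assumption):
#   MM/DD HH:MM:SS LEVEL :nnn: message text
LOG_LINE = re.compile(
    r"^(\d{2}/\d{2}) (\d{2}:\d{2}:\d{2}) ([A-Z]+)\s*:(\d+):\s*(.*)$"
)

# The two field values the post says to look for.
ERROR_LEVELS = {"SYSERR", "OBJERR"}


def find_errors(lines):
    """Return (line_number, level, message) for each SYSERR or OBJERR line."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = LOG_LINE.match(line.rstrip("\n"))
        if match and match.group(3) in ERROR_LEVELS:
            hits.append((number, match.group(3), match.group(5)))
    return hits


# Sample input based on the excerpt above (first line shortened).
SAMPLE = [
    "05/30 07:21:12 EVENT :005: pinit_fetch_TTLS_policy_profile: Processing Image TTLS config file",
    "05/30 07:21:12 OBJERR :005: process_TTLS_attribute_table: Unknown attribute 'ZocalAddr' for TTLSRule",
]

for number, level, message in find_errors(SAMPLE):
    print(f"line {number}: {level} {message}")
```

To run it against a real log, pass the lines of the file (for example the contents of /tmp/pagent.log) to `find_errors`. How the file has to be opened depends on its encoding on your system, which is not covered here.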
My common mistakes were
- spelling errors for example TLSConfig instead of TTLSConfig. (I commented, then uncommented a line and lost the initial T)
- incorrect dataset names, either the data set, or the member.
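Since misspelled statements are often just ignored, a rough pre-check of the configuration text can catch the first kind of mistake before a refresh. This is a sketch under stated assumptions, not a validator: `KNOWN_STATEMENTS` holds only the three statement names mentioned in this post (PAGENT accepts many more), statement names are assumed to be case-insensitive because the post uses both tcpImage and TcpImage, and `#` is assumed to start a comment line.

```python
import difflib

# Only the statement names that appear in this post; deliberately incomplete.
KNOWN_STATEMENTS = ["TcpImage", "TTLSConfig", "CommonTTLSConfig"]


def suspicious_statements(lines):
    """Flag first words that are close to, but not equal to, a known statement."""
    lowered = {name.lower(): name for name in KNOWN_STATEMENTS}
    findings = []
    for number, line in enumerate(lines, start=1):
        words = line.split()
        # Skip blank lines and (assumed) comment lines.
        if not words or words[0].startswith("#"):
            continue
        first = words[0].lower()
        if first in lowered:
            continue  # spelled correctly
        close = difflib.get_close_matches(first, list(lowered), n=1, cutoff=0.8)
        if close:
            findings.append((number, words[0], lowered[close[0]]))
    return findings


SAMPLE = [
    "TcpImage TCPIP2 //'USER.Z24C.TCPPARMS(PAGENTT2)'",
    "TLSConfig //'USER.Z24C.TCPPARMS(TTLS2)' FLUSH PURGE",  # lost the initial T
]

for number, found, expected in suspicious_statements(SAMPLE):
    print(f"line {number}: '{found}' looks like '{expected}'")
```

The check only compares the first word of each line against the short list, so it says nothing about attributes inside a rule (such as the ZocalAddr typo above) or about whether a data set or member exists.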
In the PAGENT configuration file, the AT-TLS specific stuff is like
tcpImage TCPIP //'USER.Z24C.TCPPARMS(PAGENT)'
TcpImage TCPIP2 //'USER.Z24C.TCPPARMS(PAGENTT2)'
The common AT-TLS definitions are in //'USER.Z24C.TCPPARMS(TTLSCOM)' (named on a CommonTTLSConfig statement), and each TCPIP image has its specific definitions in its own file.
The TCPIP specific file has
TTLSConfig //'USER.Z24C.TCPPARMS(TTLS2)' FLUSH PURGE
This says the TTLS stuff is in the member TTLS2.
You can have the entry without a file or dataset name.
TTLSConfig FLUSH PURGE
This says use the definition in the CommonTTLSConfig.
You need a TTLSConfig statement to get AT-TLS definitions configured in the LPAR.
How to update definitions
So that I did not break “production”, I created a second TCPIP stack (TCPIP2), and created a configuration within PAGENT for the TCPIP2 stack. (This seems a lot of work just to validate some definitions. I raised an RFE on this, but it was declined).
When I was happy with the definitions, I merged them with the common/production ones.
When I defined a second TCPIP (TCPIP2), the configuration statements were parsed only when TCPIP2 was started, so PAGENT produced the error messages only once TCPIP2 was active.
PAGENT has started – what next?
PAGENT operator commands
You can “modify” the PAGENT address space
- f pagent,loglevel,level=n
- f pagent,trace,level=m
- f pagent,debug,level=d
- f pagent,query
- f pagent,update
What is my configuration?
Once you have configured PAGENT you can use the Unix command
pasearch -c 1>a
to give output like
TCP/IP pasearch CS V2R4 Image Name: TCPIP1
Date: 05/23/2022 Time: 17:34:44
PAPI Version: 14 DLL Version: 14
TTLS Policy Object:
ConfigLocation: Local LDAPServer: False
CommonFileName: //'USER.Z24C.TCPPARMS(TTLSCOM)'
ImageFileName:
TCP/IP pasearch CS V2R4 Image Name: TCPIP2
Date: 05/23/2022 Time: 17:34:44
PAPI Version: 14 DLL Version: 14
TTLS Policy Object:
ConfigLocation: Local LDAPServer: False
CommonFileName: //'USER.Z24C.TCPPARMS(TTLSCOM)'
ImageFileName: //'USER.Z24C.TCPPARMS(TTLS2)'
pasearch -p TCPIP2 1>a
gave the configuration for just the TCPIP stack TCPIP2, including
...
policyRule: TLSCOM
Rule Type: TTLS
...
policyRule: TLSCP3
Rule Type: TTLS
...
policyRule: TLSCP4
Rule Type: TTLS
...
You get the definitions – but you do not know where they came from. I happen to know that TLSCOM comes from the common definition.
A definition can be in both Common and TCPIP Image files.
Instead of relying on PAGENT to report configuration errors I used the Unix command pasearch to display the configuration.
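Because an invalid statement may simply be ignored, one way to use the pasearch output is to compare the rule names it reports with the rule names you meant to define. The following Python sketch assumes only that each rule appears as `policyRule:` followed by its name, as in the output above. The function names and the sample text are made up for illustration.

```python
import re


def policy_rule_names(pasearch_output):
    """Return every name that follows 'policyRule:' in the captured output."""
    return re.findall(r"policyRule:\s*(\S+)", pasearch_output)


def missing_rules(expected, pasearch_output):
    """Return the expected rule names that pasearch did not report, sorted."""
    reported = set(policy_rule_names(pasearch_output))
    return sorted(set(expected) - reported)


# Hypothetical capture in which TLSCP4 was silently dropped.
SAMPLE = """\
policyRule:             TLSCOM
  Rule Type:            TTLS
policyRule:             TLSCP3
  Rule Type:            TTLS
"""

print(policy_rule_names(SAMPLE))
print(missing_rules(["TLSCOM", "TLSCP3", "TLSCP4"], SAMPLE))
```

The input would be the contents of the file written by a command such as pasearch -p TCPIP2 1>a. A rule that is missing from the output is a hint to go back to the log and look for an OBJERR or SYSERR line.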
Display the configuration for a TCPIP image
Use the Unix command pasearch to display the configuration.
pasearch -p TCPIP2 >a
Display the object types configured to PAGENT
pasearch -c 1>a
TCP/IP pasearch CS V2R4 Image Name: TCPIP
Qos Policy Object:…
Ids Policy Object:…
IPSec Policy Object:…
IpFilter Policy Object:…
KeyExchange Policy Object:…
LocalDynVpn Policy Object:…
Routing Policy Object:…
TTLS Policy Object:…
TCP/IP pasearch CS V2R4 Image Name: TCPIP2
TTLS Policy Object:
ConfigLocation: Local LDAPServer: False
ApplyFlush: True PolicyFlush: True
ApplyPurge: True PurgePolicies: True
AtomicParse: True DeleteOnNoflush: False
DummyOnEmptyPolicy: True ModifyOnIDChange: False
Configured: True UpdateInterval: 1800
TTLS Enabled: True
LastPolicyChanged: Tue May 24 07:55:46 2022
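With several stacks, a quick summary of which policy object types each image has can be easier to read than the full listing. This Python sketch assumes the layout shown above: a header line containing `Image Name:` for each image, followed by lines such as `TTLS Policy Object:`. The name `objects_by_image` and the sample text are illustrative only.

```python
import re

IMAGE = re.compile(r"Image Name:\s*(\S+)")
POLICY_OBJECT = re.compile(r"^\s*(\w+) Policy Object:")


def objects_by_image(pasearch_output):
    """Map each image name to the policy object types listed under it."""
    result = {}
    current = None
    for line in pasearch_output.splitlines():
        image = IMAGE.search(line)
        if image:
            current = image.group(1)
            result.setdefault(current, [])
            continue
        found = POLICY_OBJECT.match(line)
        if found and current is not None:
            result[current].append(found.group(1))
    return result


# Shortened sample modelled on the output above.
SAMPLE = """\
TCP/IP pasearch CS V2R4 Image Name: TCPIP
Qos Policy Object:
Ids Policy Object:
TTLS Policy Object:
TCP/IP pasearch CS V2R4 Image Name: TCPIP2
TTLS Policy Object:
  ConfigLocation: Local LDAPServer: False
"""

for image, objects in objects_by_image(SAMPLE).items():
    print(image, objects)
```

An image that should have AT-TLS but shows no TTLS entry is worth checking for a missing TTLSConfig statement.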
PAGENT does not feel like it is of the standard that I would expect z/OS products to have. For example, you cannot validate changes before making them live, and the changes are only validated when the TCPIP stack is active.
This means you are making unvalidated changes to your production system!