Easy AT-TLS configuration and reporting

This blog post discusses AT-TLS configuration, and my EasyAT-TLS code on GitHub to make it easy to configure AT-TLS and give an compact and useful configuration report.

A lot of documentation is written by experts for experts, from the developer’s viewpoint; rather than by experts for people who just want to get the job done. It feels that AT-TLS exposes to much of the structure of how data is held internally. I found it was very easy to get lost, and configuring it is hard. z/OSMF has a “configure AT-TLS” workflow – but it just creates a workflow of the steps. It did not make it easier to configure AT-TLS.

As a general philosophy, rather than the traditional approach of “here are all the keywords in alphabetical order”, I would provide information in different topics.

  • Keywords that everyone uses, and what you will need to get started,
  • More advanced keywords that most people might use,
  • Even more advanced keywords that only experts would use.

and then list the keywords alphabetically within topic. For an inexperienced user this makes it very clear what options they need to specify to get started. If you have a configuration tool (or online documentation) have a button which allows you to specify what your experience level is, and display or configure the information at the appropriate level.

The problem

What does an AT-TLS configuration file look like?

Part of the definition for one rule is

TTLSRule                      COLATTLJ 
{
LocalPortRange 4000
Jobname COLCOMPI
Userid COLIN
Direction BOTH
RemoteAddr 10.1.2.2/32
TTLSGroupActionRef TNGA
TTLSEnvironmentActionRef TNEA
TTLSConnectionActionRef TNCA
}
TTLSGroupAction TNGA
{
TTLSEnabled ON
}
TTLSEnvironmentAction TNEA
{
HandshakeRole ServerWithClientAuth
TTLSKeyringParms
{
Keyring start1/TN3270
}
TTLSSignatureParmsRef TNESigParms
}
TTLSSignatureParms TNESigParms
{
CLientECurves Any
SignaturePairs 060305030403

}......

There is a lot of structure, with values beginning with TTLS.

Without the structure the data looks like

policyRule : COLATTLJ
LocalAddr : All
RemoteAddr : '10.1.1.2/32'
LocalPortRange : 4000-4000
JobName : COLCOMPI
UserId : COLIN
Direction : Both
TTLSEnabled : On
Trace : 255
HandshakeRole : ServerWithClientAuth
Keyring : start1/TN3270
TLSv1.1 : Off
TLSv1.2 : On
TLSv1.3 : Off
HandshakeTimeout : 3
ClientECurves : Any
ServerCertificateLabel : NISTECCTEST
V3CipherSuites : [
003D TLS_RSA_WITH_AES_256_CBC_SHA256,
C02C TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,
]

Which I find much clearer.

Once you have configured AT-TLS, you can use the Unix command pasearch to report the configuration. This gives output like

policyRule:             AZFClientRule                                  
Rule Type: TTLS
...
Time Periods:
Day of Month Mask:
First to Last: 1111111111111111111111111111111
Last to First: 1111111111111111111111111111111
Month of Yr Mask: 111111111111
Day of Week Mask: 1111111 (Sunday - Saturday)
Start Date Time: None
End Date Time: None
...
TTLS Condition Summary: NegativeIndicator: Off
Local Address:
FromAddr: All
ToAddr: All
Remote Address:
FromAddr: 0.0.26.137
ToAddr: 0.0.26.137
LocalPortFrom: 0 LocalPortTo: 0
RemotePortFrom: 0 RemotePortTo: 0
JobName: AZF* UserId:
ServiceDirection: Outbound
...

The output for one rule is over 180 lines long, and gives the configuration, including any defaults; not what is actually used. (Some fields can be configured in more than one place, so you need to know which is actually used). Compare this with my tool’s output above.

Easy AT-TLS on GitHub

Displaying a concise view of the data.

On GitHub is code which removes all of the “structure” from the pasearch report to leave just the non default data. You get, for one rule

policyRule : COLATTLJ
LocalAddr : All
RemoteAddr : '10.1.1.2/32'
LocalPortRange : 4000-4000
JobName : COLCOMPI
UserId : COLIN
Direction : Both
TTLSEnabled : On
Trace : 255
HandshakeRole : ServerWithClientAuth
Keyring : start1/TN3270
TLSv1.1 : Off
TLSv1.2 : On
TLSv1.3 : Off
HandshakeTimeout : 3
ClientECurves : Any
ServerCertificateLabel : NISTECCTEST
V3CipherSuites : [
003D TLS_RSA_WITH_AES_256_CBC_SHA256,
C02C TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,
]

formatted as a YAML file, which I find much easier to use!

The output of the pasearch file was 2296 lines, the output file from my Python program was 184 lines!

Generating AT-TLS definitions

I found it hard to generate AT-TLS definitions because you need to know the AT-TLS structure; or have an existing entry to copy. For example a keyring is defined in a TTLSKeyringParms definition in TTLSEnvironmentAction statement which is pointed to by a high level TTLSRule. See the partial example above. You should not need to know this information.

As an end user I just want to define a keyring, and let the computer put it into the “internal format”.

With AT-TLS, you can group common information into a unit, and refer to that. I found I got lost trying to work out what I was using, and what else used it. Making one small change is complex as you had to do lots of copying to generate new groups, and then you have these groups lying around which may not be used again.

Generating AT-TLS definitions the easy way

The code to extract the key information from a pasearch output produces a YAML file. The genATTLS code takes a YAML file and generates the AT-TLS input file with all of the structure that AT-TLS requires. The output file can then be used by the Policy Agent on z/OS.

I’ve written the code from an end user perspective. For example I have a YAML file which defines two rules.

---
policyRule : Rule1
Priority : 255
LocalAddr : All
RemoteAddr : All
LocalPortRange : 6794-6794
Direction : Inbound
TTLSEnabled : On
Trace : 255
HandshakeRole : ServerWithClientAuth
Keyring : start1/TN3270
TLSv1.1 : Off
TLSv1.2 : On
TLSv1.3 : Off
HandshakeTimeout : 120
ServerCertificateLabel : RSA2048
---
policyRule : Rule2
Priority : 255
LocalAddr : All
RemoteAddr : All
LocalPortRange : 6793-6793
Direction : Inbound
TTLSEnabled : On
Trace : 255
HandshakeRole : Server
Keyring : start1/TN3270
TLSv1.1 : Off
TLSv1.2 : On
TLSv1.3 : Off
CertificateLabel : RSA2048
ServerCertificateLabel : RSA2048

This can be simplified by using “BasedOn” to refer to other sections, such as the “common” rule

---
# This first section is common to the others
policyRule : common
TLSv1.1 : Off
TLSv1.2 : On
TLSv1.3 : Off
Keyring : start1/TN3270
Priority : 255
LocalAddr : All
RemoteAddr : All
Direction : Inbound
TTLSEnabled : On
Trace : 255
CertificateLabel : RSA2048
ServerCertificateLabel : RSA2048
---
policyRule : Rule1
BasedOn : common
LocalPortRange : 6794-6794
HandshakeRole : ServerWithClientAuth
HandshakeTimeout : 120
---
policyRule : Rule2
BasedOn : common
LocalPortRange : 6793-6793
HandshakeRole : Server
---

The Python script generates the definitions from this file, 110 lines of output from a 28 line input file!

It is easy to extend. You just specify the overrides

policyRule : Rule2A
BasedOn : Rule2
TLSv1.3 : On
Priority: 300

The priority:300 says this definition should be used over Rule2, because Rule2 only has priority 255.

Using Activity Trace to show a picture of which queues and queue managers your application used.

I used the midrange MQ activity trace to show what my simple application, putting a request to a cluster server queue and getting the reply, was doing.  As a proof of concept (200 lines of Python), I  produced the following

This output is a .png format.   You can create it as an HTML image, and have the nodes and links as clickable html links.

Ive ignored any SYSTEM.* queues, so the SYSTEM.CLUSTER.TRANSMIT.QUEUE does not appear.

The red arrows show the “high level” flow between queue managers at the “architectural”, hand waving level.

  • The application oemput on QMA did a put to a clustered queue CSERVER, there is an instance of the queue on QMB and QMC.   There is a red line from QMA.oemput to the queue CSERVER on QMB and QMC
  • The server programs, server running on QMB and QMC put the reply message to queue CP0000 on queue manager A

The blue arrows show puts to the application specified queue name – even though this may map to the S.C.T.Q.  There are two blue lines from QMA.oemput because one message went to QMC.CSERVER, and another went to QMB.CSERVER

The yellow lines show the path the message took between queue managers.  The message was put by QMA.oemput to queue CSERVER; under the covers this was put to the SCTQ.  From the accounting trace record this shows the remote queue manager and queue name:  the the yellow line links them.

The black line is getting from the local queue

The green line is the get from the underlying queue.  So if I had a reply queue called CP0000, with a QAlias of QAREPLY. If the application does a get from QAREPLY,  There would be a black line to CP0000, and a green line to QAREPLY

How did I get this?

I used the midrange activity trace.

On QMA I had in mqat.ini

applicationTrace:
ApplClass=USER # Application type
ApplName=oemp* # Application name (may be wildcarded)
Trace=ON # Activity trace switch for application
ActivityInterval=30 # Time interval between trace messages
ActivityCount=10 # Number of operations between trace msgs
TraceLevel=MEDIUM  # Amount of data traced for each operation
TraceMessageData=0 # Amount of message data traced

I turned on the activity trace using the runmqsc command

ALTER QMGR ACTVTRC(ON)

I ran some work load, and turned the trace off few seconds later.

I processed the trace data into a json file using

/opt/mqm/samp/bin/amqsevt -m QMA -q SYSTEM.ADMIN.TRACE.ACTIVITY.QUEUE -w 1 -o json > aa.json

I captured the trace on QMB, then on QMC, so I had three files aa.json, bb.json, cc.json.  Although I captured these at different times, I could have collected them all at the same time.

jq is a “sed” like processor for processing json data.   I used it to process these json files and produce one output file which the Python json support can handle.

jq . --slurp aa.json bb.json cc.json  > all.json

The two small python files are zipped here. AT.

I used ATJson.py python script to process the all.json file and extract out key data in the following format:

server,COLIN,127.0.0.1,QMC,Put1,CP0000,SYSTEM.CLUSTER.TRANSMIT.QUEUE,QMC,CP0000,QMA, 400

  • server, the name of the application program
  • COLIN, the channel name, or “Local”
  • 127.0.0.1, the IP address, or “Local”
  • QMC, on this queue manager
  • Put1, the verb
  • CP0000, the name of the object used by the application
  • SYSTEM.CLUSTER.TRANSMIT.QUEUE, the queue actually used, under the covers
  • QMC, which queue manager is the SCTQ on
  • CP0000, the destination (remote) queue name
  • QMA, the destination queue manager
  • 400 the number of times this was used, so 400 puts to this queue.

I had another python program Process.py which took this table and used python graphviz to draw the graph of the contents.  This produces a file with DOT  (graph descriptor language)parameters, and used one of the many programs to draw the chart.

This shows you what can be done, it is not a turn-key solution, but I am willing to spend a bit of time making it easier to use, so you can automate it.  If so please send me your Activity Trace data, and I’ll see what I can do.