Setting mqweb trace on z/OS and other useful hints on tracing

I spent time trying to track down in the MQWEB server, why my JSON Web Token was not working as I expected . I found it hard to trace the problem, and tried many trace parameters till I found some which provided the information I needed. It was not an easy journey.

This blog entry covers

How to set the trace
The trace files did not have enough information in them
What level of trace do you need – too much info?
Putting your definitions in a separate file
Displaying the trace record so it fits in the window
How I do tracing

How to set the trace

I tried the trace

setmqweb properties -k traceSpec -v ...

described here, but this was slow and had the side effect in that it inserted blank lines to the mqwebuser.xml file, and made each xml entry one long line, and lost all of my nice formatting!

Alternatives

I used the z/OS console command
I edited the mqwebuser.xml file and this did all I wanted.

Using the z/OS console command

For example

f csq9web,LOGGING='*=info:zos.*=finest'

Issuing the command from the console means that the mqwebuser.xml file is not changed, and when you restart the MQWEB server it comes up with what you specified – rather than the last setmqweb command.

Editing the mqwebuser.xml manually

For MQ the trace definition is taken from the variable traceSpec.

For example

<variable 
name="traceSpec" 
value="*=info" 
/>

I just added some more definitions

<variable 
name="traceSpec" 
value="*=info" 
/> 

 <variable 
 name="traceSpec" 
 value="*=info:com.ibm.ws.webcontainer.security.ProviderAuthenticationResult=all" 
 />

The last value is used – so my trace definition was used.

I could easily move these around to get different traces.

When the MQWEB server is restarted, the last definition will be used – so remember to move the normal entry to the end of the definitions (with value=”=info” or similar).

Using a more complex value

The trace entry I was given was over 150 characters long, and was difficult to enter into the file or on the command line. I had to enter it in pieces, and kept getting blanks in the wrong place. You can use symbol substitution

<variable 
name="t1" 
value="*=info" 
/> 
                                                      
<variable 
name="t2" 
value="zos.*=all" 
/> 
<variable 
name="traceSpec" 
value="${t1}:${t2}" 
/>

This produced a trace *=info:zos.=all

Whenever you change the trace, check in the message.log or trace.log file, and fix any problems.

If the update is successful, there should be an entry in the job log, such as

[AUDIT ] CWWKG0017I: The server configuration was successfully updated in 0.265 seconds.

If you do not get this, then check the log files (again).

The trace files did not have enough information in them.

Some of the traces I was given to solve my problem produced thousands of lines of output. It was hard to find the records of interest.

One record was

∇8/9/25, 16:18:52:470 GMT   ∆ 00000102 Authenticatio <  getStatus Exit 
                                 FAILURE

Where Authenticatio is a small part of the trace id.

I specified

 <logging traceFormat="ENHANCED" />

See traceFormat and value ENHANCED.

This gave me

8/9/25, 16:22:37:516 GMT ∆ 000000f1 id=dd803e1d com.ibm.ws.webcontainer.security.AuthenticationResult < getSta…
FAILURE

You can see a more complete trace entry (com.ibm.ws.webcontainer.security.AuthenticationResult ) for the record.

You can now specify this trace entry in the <variable name=”t2″ value=”…” . And restrict which entries you want, for example value=”com.ibm.ws.webcontainer.security.*=all” .

Once I had found which trace records I wanted, I went back traceFormat=”SIMPLE” because the output was easier to read.

Useful trace entries

To get JWT information

io.openliberty.security.*=all

Why JWT faildation failed

org.apache.http.client.*=all

My definitions for these were

name="t1" 
value="*=info" 
/> 

<variable 
name="t3" 
value="org.apache.http.client.*=all" 
/> 
                                                  
<variable 
name="t4" 
value="io.openliberty.security.*=all" 
/> 
                                                  
<variable 
name="traceSpec" 
value="${t1}:${t3}:${t4}" 
/> 

<logging traceFormat="SIMPLE" />

What level of trace do you need – too much info?

A trace entry with value=”io.openliberty.security.*=all” can produce a lot of output.

You may get enough to debug your problem using value=”io.openliberty.security.*=fine“, value=”io.openliberty.security.*=finer“, or value=”io.openliberty.security.*=finest“

Putting your definitions in a separate file

I put some of my definitions in a separate file trace.xml

<server> 
    <!-- always specify this one  --> 
    <variable 
    name="t1" 
    value="*=info" 
    /> 
   ...
</server>

You can incorporate these changes using

<include location="trace.xml"/> 

<!-- and use the definitions -->
<variable 
name="traceSpec2" 
value="${t1}:${t2}:${t3}" 
/>

If you change the trace.xml file, it will not cause MQWEB to reprocess it. You need to make a change (such as change a blank to a blank) to mqwebuser.xml for MQWEB to notice and process the file.

Displaying the trace record so it fits in the window

A trace record can be hundred of characters long – and requires to scroll sideways many pages.

The blog post Processing lines in ASCII files in ISPF edit macros has a useful ISPF edit macro which displays a row from the file, flowed so it fits into the screen width – and as the trace files are in ASCII, converts the output to displayable EBCDIC.

This made looking at the data much easier.

How I do tracing

I am the only person on my z/OS machine, which make debugging problems much easier. With all my attempts to resolve problems, I found my trace log and message log files were getting too large for me to use ISPF edit on them.
Below is how I use the trace files, it generally works.

I edit the file, delete most of the record to leave one or two records, then save it.
I run my test
I edit the file again and it should have only the entries added since the first step.

Note, if you delete the file the logging code, detects the file is deleted, and stops writing to it. If you delete rows, then records are written to the end of the current file.

z/OSMF: what traces are available

I could find no z/OSMF documentation on how to turn on traces. The documentation says “Contact IBM”. Someone pointed out the z/OSMF Diagnostic Assistant. This is an icon on the z/OSMF main screen (https://10.1.1.2:10443/zosmf/ for me).

You need to click on “Add Service” to list the components, before you can change the log level.

Fill in the details, use the Tick box at the start of the line, and the Log Level pull down. Once you have chose them, click on the “Set” button.

You can issue an operator command.

f server-name,logging='trace_specification'

Below is a list of the trace commands from the diagnostic Assistant . The granularity of the trace is warning, info, fine, finer, finest

For SAF security calls use zos.native=finest.

Ive seen this as zos.native.03=…

I also found these – but I dont know what they mean!

com.ibm.ccc.=ALL:
com.ibm.crypto.=all:
com.ibm.websphere.security.*=all:
com.ibm.ws.security.=ALL:
com.ibm.ws.webcontainer.=all:
com.ibm.wsspi.webcontainer.*=all:
HTTPChannel=all:
GenericBNF=all:
zos.native.03=all – this give information about a subset of z/OS calls
com.ibm.websphere.security.jwt=all

SAF security calls: zos.native=finest

Produces output like (formatted for display)

6/23/25, 16:39:20:192 GMT?] 00000028 id=00000000 zos.native.03.001 
 Trace: 6/23/25, 16:39:20:192 GMT? t=8c8140 key=S2 (300100f) 
  Description: RACROUTE REQUEST=FASTAUTH call 
  racrouteArea_p: 000000007e4b0c10 
6/23/25, 16:39:20:193 GMT?] 00000028 id=00000000 zos.native.03.001 
 Trace: 6/23/25, 16:39:20:193 GMT? t=8c8140 key=S2 (3001010) 
  Description: RACROUTE REQUEST=FASTAUTH return 
  returnCode: 0 
  safReturnCode: 0 
  racfReturnCode: 0 
  racfReasonCode: 4 
6/23/25, 16:39:20:194 GMT?] 00000028 id=00000000 zos.native.03.001 
 Trace: 6/23/25, 16:39:20:194 GMT? t=8c8140 key=S2 (3001012) 
  Description: Exit: checkAuthorizationFast 
  returnCode: 0 
6/23/25, 16:39:20:195 GMT?] 00000028 id=00000000 zos.native.02.008 
 Trace: 6/23/25, 16:39:20:195 GMT? t=8c8140 key=S2 (2008005) 
  Description: Entry: registrySetUnused 
  alreadyVerified: true 
  token:   data_address=00000051_2a6103b8, data_length=64 
 +--------------------------------------------------------------------------+ 
 |OSet| A=000000512a6103b8 Length=0000040 |     EBCDIC     |     ASCII      | 
 +----+-----------------------------------+----------------+----------------+ 
 |0000|C2C2C7E9 D9C5C7E3 00000001 00000040|BBGZREGT....... |...............@| 
 |0010|00000000 7DC72100 00000007 00000000|....'G..........|....}.!.........| 
 |0020|01000000 00000000 00000000 00000000|................|................| 
 |0030|00000000 00000000 00000000 00000000|................|................| 
 +--------------------------------------------------------------------------+

It is not very well written, because there are “safReturnCode:”, “SAF return code:”,”return code:” and “returnCode:”, and “rc:”. I found it easiest to use the following in an edit session to find non zero return codes.

x all
f ‘code: 0’ all
del all x
f ‘code:’ all

The trace entry ” token: data_address=00000051_2a6103b8, data_length=64 ” shows the value with name “token” at the specified address, and length. It is displayed in hex, EBCDIC and ASCII.

Trace points.

Taken from a trace file

zos.native.02.002 getConsoleCommand
zos.native.02.006 CommandProcessor.ntv_issueCommandResponse
zos.native.02.00d DeleteWorkUnit*
zos.native.02.00e wlm_enclave_*
zos.native.02.00e wlm_enclave_create leave
zos.native.03.001 RACROUTE REQUEST=FASTAUTH
zos.native.03.003 CertificateCredential…
zos.native.03.004 ntv_createCertificateCredential
zos.native.03.005 checkAccess
zos.native.03.006 ntv_checkAccess
zos.native.03.007 invokeIRRSIA00 createACEE
zos.native.03.008 PenaltyBox
zos.native.03.00b getGroupsForUser
zos.native.04.002 write_to_operator_response
zos.native.04.004 Latch
zos.native.04.007 getStck

Using Java internal trace

I was trying to understand a class load problem, and needed to use Java internal virtual machine trace to see what was happening.

You specify -Xtrace options at start up. Even with limiting options, I had over 2 millions lines of trace output – and the Java application had not fully started!

To get all of the parameters in a trace, you need to specify -Xtrace:maximal=..
To get the trace points, and not all of the parameters you need to specify -Xtrace:minimal=…
The trace output went to a file -Xtrace:…,output=/u/tmp/java/trace.bin

There are lots of components you can specify see here. I used

 -Xtrace:maximal={j9jcl,j9shr,j9prt,omrport},output=/u/tmp/java/trace.bin

where

j9jcl is Java class libraries
j9shr is shared classes libraries
j9prt is VM port libraries… the platform specific (ported) code
omrport is the OMR ported code. (OMR is a set of common components, such as threads and garbage collection)

I haven’t used it,but you can say do something when this trace point is triggered, stop (suspend) tracing.

-Xtrace:trigger=tpnid{abc.123,suspend}

In the example trace output

07:49:24.699384689 0x000000002167f200 omrport.657 Event J9 VM NLS message: Failed to find class org/springframework/boot/loader/launch/PropertiesLauncher in shared cache for class-loader id 0.

where the trace point is omrport.657

You can put the definitions in a file -Xtrace:properties=<filename>

I have a trace file – now what?

You can format the trace file on z/OS, but I downloaded the file (in binary) to my Linux machine, because a) I have more disk space, 2) the java formatting is faster, 3) I have better tools to look at the data. You need the save level of Java (or better) to format all the fields. If there is a mismatch, some fields will not be formatted. The traceformat command is described here.

Format the data

/home/colinpaice/Downloads/jdk-21.0.6+7/bin/traceformat -verbose trace.bin trace.txt

It produced

Writing formatted trace output to file trace.txt
Processing 245.45312Mb of binary trace data
Processed 10.0Mb (4%), burst speed: 1.9344624Mb/s, average: 1.9344624Mb/s
Processed 20.0Mb (8%), burst speed: 2.4675941Mb/s, average: 2.1687446Mb/s
Processed 30.0Mb (12%), burst speed: 3.4868424Mb/s, average: 2.4814205Mb/s
...

I wrote the blog post Finding the needle in a field of hay, to help extract the records of interest.

On the traceformat command you can specify

-indent: Indents trace messages at each Entry trace point and outdents trace messages at each Exit trace point. The default is not to indent the messages.
-threads: Filters the output for the given thread IDs only. thread id is the ID of the thread, which can be specified in decimal or hex (0x) format. Any number of thread IDs can be specified, separated by commas.

Looking at the data

I used the commands grep -n …., tail -n +999, head -n 500, less as described here. But for detailed analysis I piped the output into a file, and used an editor on the output. This meant the lines did not wrap, I could annotate (and save) the trace, and highlight interesting records.

Because some of the lines are very long, I split interesting lines so all of the data was displayed in the window, and I did not need to scroll.

Using and debugging RACF CLASS(APPL) and pthread_security_np

I’ve blogged I want to be someone else – or using pthread_security_np, how a server can be passed a userid or password to logon to a server to do some work. For example I was logging on from a web server to the RMF GPMSERVE for displaying RMF reports in a a web browser.

I had followed the documentation, but all my userids had access, when none of them should have done. This blog post covers the steps I used to dig into this.

The RACF calls used, have a flag set “Suppress any RACF Messages”, which makes it harder to diagnose problems.

With most RACF calls you can define a profile

ADDSD 'COLIN.PROT.DATASET' UACC(NONE) WARNING

where the WARNING option says “If the userid does not have access – give the userid access, but write a message on the console”. This allows you to find when you need to grant access. Once the profile is established, you can use the ALTSD … NOWARNING. Userids requesting access, but which are not permitted will now fail.

Defining a profile with CLASS(APPL), the warning has no effect. I had to use NOTIFY(COLIN) to get notified of problems.

Tracing the requests

To trace all of the RACF calls made by my job COLINNT I issued

#set trace(callable(all),racroute(all),jobname(COLINNT))

This gave me many hundreds of calls.

Once I had been through the whole process, I found the trace for the pthread_security_np etc calls was just

set trace(callable(type(38)),jobname(COLINNT))

Collect a trace

See Collecting and understanding a RACF GTF trace output.

Looking at the trace output

There is a trace record “before” (PRE) matching “after” (POST). Many parameters are the same (as you might expect).

The output of these traces is verbose, with unnecessary information and with extra blanks lines. For example for one parameter

   Area length:                  00000008 
                                                                                                                         
   Area value: 
   D6C6C6E2  C5E30000                                | OFFSET..                         | 
                                                                                                                         
   Area length:                  00000004 
                                                                                                                         
   Area value: 
   00000000                                         | ....                             | 
                                                                                                                         
   Area length:                  00000008

In the description below, I’ve squashed this down to a single line

area length  8  OFFSET0000 length 4 00000000

It looks like the field OFFSET0000 has the offset in hex from somewhere. I didn’t find this information useful.

A compressed trace record is given below. For the interpretation of the fields see the following trace record.

RTRACE
OMVSPRE
Service number 00000026
Parameters
area length x6c  
area length  8  OFFSET0000 length 4 00000000
area length  8  OFFSET0004 length 4 00000000
area length  8  OFFSET0008 length 4 00000000
area length  8  OFFSET000C length 4 00000000
area length  8  OFFSET0010 length 4 40404040
area length  8  OFFSET0014 length 4 00000000
area length  8  OFFSET0018 length 4 40404040
area length  8  OFFSET001C length 1 01
area length  8  OFFSET0020 length 4 C4800000
area length  8  OFFSET0024 length 6 05c1c2c3c4C2 =  .ABCDB
area length  8  OFFSET0028 length 4 00000000 
area length  8  OFFSET002C length 9 08C7D7D4  E2C5D9E5  C5   = .GPMSERVE
Internal information
area length  8  OFFSET0034 length x37 = "Server Userid=IBMUSER created a full ACEE"
area length  8  OFFSET0048 length 4 00000000
area length  8  OFFSET0050 length 9 08000000  00000000  00
area length  8  OFFSET0054 length 1 00 
area length  8  OFFSET0018 length 4 40404040
Internal data
area length xc0 ACEE
area length x50 userid information
area length x90 ACEX
area length x50 USP

hex dump of record

The RACF commands documentation says for a callable service, 0x00000026 is function number for IRRSIA00. This page gives the name of the function name IRRSIA00 and the description z/OS kernel on behalf of servers that use pthread_security_np servers or __login, or MVS servers that do not use z/OS UNIX services.

The parameters are for the initACEE (IRRSIA00) call.

The matching “after” record, OMVSPOST was (with the parameters of the the IRRSIA00 call format, parameters) are

RTRACE
OMVSPOST
Service number:               00000026 
RACF Return code:             00000008 
RACF Reason code:             00000020 
Parameters
area length  8  OFFSET0000 length 4 00000000 >work_area<
area length  8  OFFSET0004 length 4 00000000 >ALET<
area length  8  OFFSET0008 length 4 00000008 >SAF_return_code<
area length  8  OFFSET000C length 4 00000000 >ALET< 
area length  8  OFFSET0010 length 4 00000008 >RACF_return_code<
area length  8  OFFSET0014 length 4 00000000 >ALET<
area length  8  OFFSET0018 length 4 00000020 >RACF_reason_code<
area length  8  OFFSET001C length 1 01       >Function_code 1 = Create an ACEE<
area length  8  OFFSET0020 length 4 C4800000 >Attributes - see below<
area length  8  OFFSET0024 length 6 05c1c2c3c4C2  >Userid .ABCDB<
area length  8  OFFSET0028 length 4 00000000 >ACEE_pointer<
area length  8  OFFSET002C length 9 08C7D7D4  E2C5D9E5  C5  >APPLID = .GPMSERVE<
Internal information
area length  8  OFFSET0034 length x37 = "Server Userid=IBMUSER created a full ACEE"
area length  8  OFFSET0048 length 4 00000000
area length  8  OFFSET0050 length 9 08000000  00000000  00
area length  8  OFFSET0054 length 1 00 
area length  8  OFFSET0018 length 4 40404040
Internal data
area length xc0 ACEE
area length x50 userid information
area length x90 ACEX
area length x50 USP

There the attributes C480000 mean

X80000000 – Create the ACEE
X40000000 – Createthe USP for the userid
X04000000 – Suppress any RACF Messages
X00800000 – Return an OUSP in the output area

Note the flag:X04000000 – Suppress any RACF Messages.

The return code

area length  8  OFFSET0008 length 4 00000008 >SAF_return_code<
...
area length  8  OFFSET0010 length 4 00000008 >RACF_return_code<
area length  8  OFFSET0014 length 4 00000000 >ALET<
area length  8  OFFSET0018 length 4 00000020 >RACF_reason_code<

8,8,32 means The user does not have appropriate RACF access to either the SECLABEL, SERVAUTH profile, or APPL specified in the parmlist.

For other RACF services the trace entries follow a similar format.

Collecting and understanding a RACF GTF trace output

A RACF request can be

A callable service, such as IRRSDL00 which is used to extract information about keyrings. This can be used by High Level Languages. These have a trace type of OMVS
A RACROUTE request, an assembler macro interface, such as FASTAUTH. These have a trace type of RACF.

You get a trace entry PRE the call, and and a trace entry POST the call. A callable service would have an OMVSPRE, and an OMVSPOST entry.

When setting up your RACF trace you need to define which callable service, or which RACROUTE requests you want to trace.

Example of a callable server

Entry trace

Trace Identifier:             00000036 
Record Eyecatcher:            RTRACE 
Trace Type:                   OMVSPRE 
Ending Sequence:              ........ 
Calling address:              00000000  20801A0F 
Requestor/Subsystem:          ........  ........ 
Primary jobname:              IBMINC5 
Primary asid:                 00000029 
Primary ACEEP:                00000000  008FA8A8 
Home jobname:                 IBMINC5 
Home asid:                    00000029 
Home ACEEP:                   00000000  008FA8A8 
Task address:                 00000000  008D6A88 
Task ACEEP:                   00000000  00000000 
Time:                         DFF36A27  5460CC00 
Error class:                  ........ 
Service number:               00000029 
RACF Return code:             00000000 
RACF Reason code:             01000000 
Return area address:          00000000  00000000 
Parameter count:              00000021

Interesting information

Trace Type: OMVSPRE this is an entry trace
Primary jobname: IBMINC5
Home jobname: IBMINC5
The RACF return and reason code have no meaning on entry
Service number: 00000029 is hex so the service is 41. The RACF commands documentation says for a callable service, 41 is function number IRRSDL00.

Exit trace

Trace Identifier:             00000036 
Record Eyecatcher:            RTRACE 
Trace Type:                   OMVSPOST 
Ending Sequence:              ........ 
Calling address:              00000000  20801A0F 
Requestor/Subsystem:          ........  ........ 
Primary jobname:              IBMINC5 
Primary asid:                 00000029 
Primary ACEEP:                00000000  008FA8A8 
Home jobname:                 IBMINC5 
Home asid:                    00000029 
Home ACEEP:                   00000000  008FA8A8 
Task address:                 00000000  008D6A88 
Task ACEEP:                   00000000  00000000 
Time:                         DFF36A27  5468DC40 
Error class:                  ........ 
Service number:               00000029 
RACF Return code:             00000008 
RACF Reason code:             00000014 
Return area address:          00000000  00000000 
Parameter count:              00000021

Interesting fields

Trace Type: OMVSPOST this is the exit trace
As above service number 41 is function
The RACF Return code is 8
The RACF reason code is 14

Example of a RACROUTE trace

Entry

Trace Identifier:             00000036 
Record Eyecatcher:            RTRACE 
Trace Type:                   RACFPRE 
Ending Sequence:              ........ 
Calling address:              00000000  A096A08A 
Requestor/Subsystem:          CRYPTO    CRYPTO 
Primary jobname:              CSF 
Primary asid:                 00000038 
Primary ACEEP:                00000000  008F6B98 
Home jobname:                 IBMEXP 
Home asid:                    00000029 
Home ACEEP:                   00000000  008FA8A8 
Task address:                 00000000  008D6A88 
Task ACEEP:                   00000000  00000000 
Time:                         DFF37DD1  AB7FF040 
Error class:                  ........ 
Service number:               00000002 
RACF Return code:             00000000 
RACF Reason code:             00000000 
Return area address:          00000000  00000000 
Parameter count:              00000009

This trace entry was from a batch job IBMEXP, using an ICSF service to access an encryption key.

Interesting fields

RACFPRE this is a RACROUTE before entry
Requestor/Subsystem: CRYPTO which z/OS component
Primary jobname: CSF This was the address space which issued the RACROUTE request
Home jobname: IBMEXP This was the job requesting an ICSF service
Service number: 2. The RACF commands documentation says for a RACROUTE service, 2 is FASTAUTH
Ignore RACF return and RACF reason codes. They are set on exit,

Exit

Trace Identifier:             00000036 
Record Eyecatcher:            RTRACE 
Trace Type:                   RACFPOST 
Ending Sequence:              ........ 
Calling address:              00000000  A096A08A 
Requestor/Subsystem:          CRYPTO    CRYPTO 
Primary jobname:              CSF 
Primary asid:                 00000038 
Primary ACEEP:                00000000  008F6B98 
Home jobname:                 IBMEXP 
Home asid:                    00000029 
Home ACEEP:                   00000000  008FA8A8 
Task address:                 00000000  008D6A88 
Task ACEEP:                   00000000  00000000 
Time:                         DFF37DD1  AB85DE40 
Error class:                  ........ 
Service number:               00000002 
RACF Return code:             00000008 
RACF Reason code:             00000000 
Return area address:          00000000  00000000 
Parameter count:              00000009

This trace entry was from a batch job IBMEXP, using an ICSF service to access an encryption key.

Interesting fields

RACFPOST this is a RACROUTE after exit entry
Requestor/Subsystem: CRYPTO which z/OS component
Primary jobname: CSF This was the address space which issued the RACROUTE request
Home jobname: IBMEXP This was the job requesting an ICSF service
Service number: 2. The RACF commands documentation says for a RACFROUTE service, 2 is FASTAUTH
RACF Return code: 00000008, RACF Reason code: 00000000 This shows there was a problem. The documentation for FASTAUTH shows
- RC=8 -> the user or group is not authorized to use the resource.
- RS=0 -> The invoker does not need to log the request.

The RACF trace command for this was

#SET TRACE(JOBNAME(csf),callable(none),RACROUTE(type(2)))

and the following commands

S GTF.GTF
R 1,trace=usrp
R 2,USR=(F44) 
R 3,END
R 4,U

Fast start GTF

I can use

S GTF.GTF,M=GTFRACF

Where my GTF proc has

//GTFNEW  PROC M=GTFPARM,ID=SYS1 
//DELETE  EXEC PGM=IEFBR14 
//IEFRDER DD   DSNAME=&ID..TRACE,UNIT=SYSDA,SPACE=(TRK,20),           
//             DISP=(MOD,DELETE) 
//IEFPROC EXEC PGM=AHLGTF,PARM='MODE=EXT,DEBUG=NO,TIME=YES,NP', 
//  TIME=1440,REGION=2880K 
//IEFRDER DD   DSNAME=&ID..TRACE,UNIT=SYSDA,SPACE=(TRK,20),     
//             DISP=(NEW,KEEP) 
//SYSLIB  DD   DSNAME=USER.Z24C.PROCLIB(&M),DISP=SHR

I have a member in USER.Z24C.PROCLIB(GTFRACF)

TRACE=USRP 
USR=(F44) 
END

Tracing AT-TLS on z/OS

AT-TLS on z/OS provides TLS support for applications by magically inserting itself into an application using TCP/IP, without changing the application.

You can collect a trace of AT-TLS starting up, but I was interested in tracing the handshake.

If syslogd (system wide program for collecting log data) is active, then trace will be written to the Unix file system.
if syslogd is not active then the data is written to syslog.
You can configure it so errors get written to syslog and syslogs.

My server COLATTLS started task is a program acting as a TCP/IP program, with Program Control, so my application gets to interface with AT-TLS, extract information and control the connection.

AT-TLS definitions

In my AT-TLS definitions I had

TLSConnectionAction              TNCA 
{ 
  TTLSCipherParmsRef              TLS13TLS12 
  TTLSSignatureParmsRef           TNESigParms 
  TTLSConnectionAdvancedParmsRef  TNCOonAdvParms 
  CtraceClearText                 Off 
  Trace                          255 
}

This trace statement traces everything. See below for a description of what is traced.

Using syslogd

Syslogd is a daemon for applications, they write data to syslogd, and you configure syslogd to define where the output goes to.

My syslog JCL started task procedure is:

//SYSLOGD PROC 
//* Read parms from /etc/syslog.conf 
//CONFHFS EXEC PGM=SYSLOGD,REGION=4096K,TIME=NOLIMIT, 
//         PARM='ENVAR("_CEE_ENVFILE_S=DD:STDENV")/-c -i       ' 
//STDENV   DD DUMMY 
//SYSPRINT DD SYSOUT=* 
//SYSIN    DD DUMMY 
//SYSERR   DD SYSOUT=* 
//SYSOUT   DD SYSOUT=* 
//CEEDUMP  DD SYSOUT=*

This reads its control statements from /etc/syslog.conf (the default). See Configuring the syslog daemon. My file has

*.INETD*.*.*       /var/log/inetd 
auth.* /var/log/auth 
mail.* /var/log//mail -F 640 -D 770 
local1.err       /var/log/local1 
*.err            /var/log/errors 
*.CPAGENT.*.*       /var/log/CPAGENT.%Y.%m.%d  
*.TTLS*.*.*          /var/log/TTLS.%Y.%m.%d  
*.Pagent.*.*        /var/log/Pagent.%Y.%m.%d  
*.TCPIP.*.debug     /var/log/TCPIPdebug.%Y.%m.%d  
*.TCPIP.*.warning   /var/log/TCPIP.%Y.%m.%d  
*.TCPIP.*.err       /var/log/TCPIPerr.%Y.%m.%d  
*.TCPIP.*.info      /var/log/TCPIPinfo.%Y.%m.%d  
*.SYSLOGD*.*.*      /var/log/syslogd.%Y.%m.%d  
*.TN3270*.*.*       /var/log/tn3270 
*.SSHD*.*.*         /var/log/SSHD

The output for *.TCPIP.*.debug goes to a file like /var/log/TCPIPdebug.2023.04.03

The configuration says, for example,

the output from TCPIP, with priority code debug or less goes to file /var/log/TCPIPdebug…
the output from TCPIP, with priority code info or less goes to file /var/log/TCPINFO…
the output from TN3270 goes to /var/log/tn3270 – for all priorities.

Because “debug” is debug or lower, the file will also contain the “info” messages. Some messages are written to multiple files.

Note: although my application started task was called COLATTLS, the ATTLS trace came out from job TCPIP, not COLATTLS.

AT-TLS trace

The trace for application is configured with the TRACE option in definitions. The documentation says (TTLSGroupAction, and TLSEnvironmentAction):

Trace

Specifies the level of AT-TLS tracing. The valid values for n are in the range 0 – 255. The sum of the numbers associated with each level of tracing selected is the value that should be specified as n. If n is an odd number, errors are written to joblog and all other configured traces are sent to syslogd. If this value is specified on the TTLSEnvironmentAction statement, it is used instead of the value from the TTLSGroupAction statement referenced by the same TTLSRule statement.

0 – No tracing is enabled.
1 (Error) – Errors are traced to the TCP/IP joblog
2 (Error) – Errors are traced to syslogd. The messages are issued with syslogd priority code err.
4 (Info) – Tracing of when a connection is mapped to an AT-TLS rule and when a secure connection is successfully initiated is enabled. The messages are issued with syslogd priority code info.
8 (Event) – Tracing of major events is enabled. The messages are issued with syslogd priority code debug.
16 (Flow) – Tracing of system SSL calls is enabled. The messages are issued with syslogd priority code debug.
32 (Data) – Tracing of encrypted negotiation and headers is enabled. This traces the negotiation of secure sessions. The messages are issued with syslogd priority code debug.

This means that if tracing the negotiation, it will be written with priority debug. From the *.TCPIP.debug statement in my syslogd definitions, the output will be written to /var/log/TCPIPdebug… .

Info output

The information in the info output looks like (two records for one connections):

Apr 2 17:25:53 S0W1 TTLS[16842781]: 17:25:53 TCPIP
EZD1281I TTLS Map CONNID: 00000032 LOCAL: 10.1.1.2..4000
REMOTE: 10.1.0.2..60742 JOBNAME: COLATTLS USERID: START1
TYPE: InBound STATUS: Appl Control RULE: COLATTLJ ACTIONS:
TNGA TNEA TNCA

This gives information on which rule was selected. For example it gives the local and remote ip address and port; job name and userid. It shows that rule COLATTLJ was used with group TNGA, environment TNEA, and connection TNCA .

Apr 2 17:25:53 S0W1 TTLS[16842781]: 17:25:53 TCPIP
EZD1283I TTLS Event GRPID: 00000007 ENVID: 00000003 CONNID: 00000032
RC: 0 Initial Handshake 0000005011440BB0
0000005011422870 TLSV1.2 C02C

This shows that for the same session (TTLS[16842781]) the initial handshake agreed on the TLS level conversation was at TLS V1.2 and the cipher spec(C02C).

Debug output

For one connection, there were over 130 lines out output in the file.

Some example lines are

EZD1283I TTLS Event GRPID: 00000007 ENVID: 00000000 CONNID: 00000032
RC: 0 Connection Init

EZD1284I TTLS Flow GRPID: 00000007 ENVID: 00000004 CONNID: 00000032
RC: 0 Set GSK_KEYRING_FILE(201) start1/TN3270

…

EZD1282I TTLS Start GRPID: 00000007 ENVID: 00000003 CONNID:
00000032 Initial Handshake ACTIONS: TNGA TNEA TNCA
HS-ServerWithClientAuth

EZD1285I TTLS Data CONNID: 00000032 RECV CIPHER 1603010116
EZD1285I TTLS Data CONNID: 00000032 RECV CIPHER 0100011203031FDDC…
EZD1285I TTLS Data CONNID: 00000032 SEND CIPHER16030309BC0200005…

… RC: 0 Call GSK_SECURE_SOCKET_INIT – 0000005011440BB0
… RC: 0 Get GSK_CONNECT_SEC_TYPE(208) – TLSV1.2
… RC: 0 Get GSK_CONNECT_CIPHER_SPEC(207) – C02C

You get

Events – (trace 8 event)
the traffic data flowing up and down (trace 32 data)
the System SSL calls (with return code) (trace 16 flow)

Trace output on syslog – when syslogd not active

Having AT-TLS writing to syslog is not a good idea – it can produce a lot of output. It may be acceptable on a small, low activity, system, tracing the minimum amount of data.

IEF403I COLATTLS - STARTED - TIME=17.16.51                             
BPXF024I (TCPIP) Apr  2 17:17:03 TTLS 16842781 : 17:17:03 TCPIP 
EZD1281I TTLS Map   CONNID: 0000002F LOCAL: 10.1.1.2..4000 REMOTE:     
10.1.0.2..43012 JOBNAME: COLATTLS USERID: START1 TYPE: InBound         
STATUS: Appl Control RULE: COLATTLJ ACTIONS: TNGA TNEA TNCA            
BPXF024I (TCPIP) Apr  2 17:17:03 TTLS 16842781 : 17:17:03 TCPIP 
EZD1283I TTLS Event GRPID: 00000007 ENVID: 00000000 CONNID: 0000002F   
RC:    0 Connection Init                                               
BPXF024I (TCPIP) Apr  2 17:17:03 TTLS 16842781 : 17:17:03 TCPIP 
EZD1282I TTLS Start GRPID: 00000007 ENVID: 00000001 CONNID: 00000000   
Environment Create ACTIONS: TNGA TNEA **N/A**                          
BPXF024I (TCPIP) Apr  2 17:17:03 TTLS 16842781 : 17:17:03 TCPIP 
EZD1283I TTLS Event GRPID: 00000007 ENVID: 00000002 CONNID: 00000000   
RC:    0 Environment Master Create 00000001                            
BPXF024I (TCPIP) Apr  2 17:17:03 TTLS 16842781 : 17:17:03 TCPIP    
EZD1284I TTLS Flow  GRPID: 00000007 ENVID: 00000002 CONNID: 0000002F   
RC:    0 Call GSK_ENVIRONMENT_OPEN - 0000005011421D10
...

The output was produced with AT-TLS trace was enable, and ATTLS was not using the syslogd daemon.

The text in bold is the initial trace entry.

BPXF024I (TCPIP) Apr 2 17:17:03 TTLS 16842781 : 17:17:03 TCPIP is written because syslogd is not being used.
EZD1281I TTLS Map CONNID: 0000002F LOCAL: 10.1.1.2..4000 REMOTE: 10.1.0.2..43012 JOBNAME: COLATTLS USERID: START1 TYPE: InBound provides information about which AT-TLS rule is being used for the connection.
EZD1284I TTLS Flow GRPID: 00000007 ENVID: 00000002 CONNID: 0000002F RC: 0 Call GSK_ENVIRONMENT_OPEN – 0000005011421D10 shows you information about the system ssl call being used.

mqweb – how to get a chrome browser trace

How to get a chrome trace

See Troubleshooting Chrome network issues and the description here on how to collect a trace.

Open a tab with the chrome://net-export/ url.
Click start logging to disk
Select a file location
In another tab select the mqweb url
Click on the “stop” button in the window
If you select show file – it opens the json file. This has all the information you need to process the file, but it is much easier to use the provided tools
The filename is given for example “FILE: /home/colinpaice/Downloads/chrome-net-export-log.json“
Click on “The log file can be loaded using the netlog_viewer.” link. This gets you to a page which says
This app loads NetLog files generated by Chromium’s chrome://net-export. Log data is processed and visualized entirely on the client side (your browser). Data is never uploaded to a remote endpoint.
Select https://netlog-viewer.appspot.com/ to invoke the formatter.
Drag your netlog file, or use “choose file”
Select events, and this displays all of the traffic
In the search bar at the top enter your port 9443, or error
You get a list like

NONE	HOST_RESOLVER_IMPL_REQUEST
1083	URL_REQUEST	https://127.0.0.1:9443/ibmmq/console/
1084	DISK_CACHE_ENTRY
1085	HTTP_STREAM_JOB_CONTROLLER	https://127.0.0.1:9443/
1086	HTTP_STREAM_JOB	https://127.0.0.1:9443/

If the background is pale green – it is good. If it is pink (pale red) there was a problem.

Click on a line and it displays trace information in a window. For example the first URL_REQUEST gave

t= 8 [st= 8]        HTTP_STREAM_JOB_CONTROLLER_BOUND
                    --> source_dependency = 1089 (HTTP_STREAM_JOB_CONTROLLER)
t=65 [st=65]        HTTP_STREAM_REQUEST_BOUND_TO_JOB
                    --> source_dependency = 1090 (HTTP_STREAM_JOB)
t=65 [st=65]     -HTTP_STREAM_REQUEST
t=65 [st=65]      URL_REQUEST_DELEGATE_SSL_CERTIFICATE_ERROR  [dt=1]
t=66 [st=66]      CANCELLED
                  --> net_error = -200 (ERR_CERT_COMMON_NAME_INVALID)
t=66 [st=66]   -URL_REQUEST_START_JOB
                --> net_error = -200 (ERR_CERT_COMMON_NAME_INVALID)
t=66 [st=66]    URL_REQUEST_DELEGATE_RESPONSE_STARTED  [dt=0]
t=66 [st=66] -REQUEST_ALIVE

SSL_CONNECT_JOB gave me

1087: SSL_CONNECT_JOB
ssl/127.0.0.1:9443
Start Time: 2020-01-29 08:41:25.699
t= 1 [st= 0] +CONNECT_JOB  [dt=64]
t= 1 [st= 0]    SOCKET_POOL_CONNECT_JOB_CREATED
                --> backup_job = false
                --> group_id = "ssl/127.0.0.1:9443"
t= 1 [st= 0]   +SSL_CONNECT_JOB_CONNECT  [dt=64]
t= 1 [st= 0]     +TRANSPORT_CONNECT_JOB_CONNECT  [dt=0]
t= 1 [st= 0]        HOST_RESOLVER_IMPL_REQUEST  [dt=0]
                    --> address_family = 0
                    --> allow_cached_response = true
                    --> host = "127.0.0.1:9443"
                    --> is_speculative = false
t= 1 [st= 0]        CONNECT_JOB_SET_SOCKET
t= 1 [st= 0]     -TRANSPORT_CONNECT_JOB_CONNECT
t=65 [st=64]      CONNECT_JOB_SET_SOCKET
t=65 [st=64]   -SSL_CONNECT_JOB_CONNECT
                --> net_error = -200 (ERR_CERT_COMMON_NAME_INVALID)
t=65 [st=64] -CONNECT_JOB

Understanding Chromium trace and performance data

I found this link very useful to explain the developer information, such as trace, performance etc.

Can I dynamically turn on JMS trace? Yes!

I needed to do this, and Paul Titheridge from the MQ change team pointed me to some information on how to do this – thank you Paul!

From MQ V8 onwards it is possible to turn on trace dynamically using a utility called traceControl. This is documented here.

Paul has also written a blog post on this, to show how it is used in practice. See here.