Adding a data set to an existing DDNAME in TSO.

I wanted to add a data set to the already allocated ISPTLIB concatenation. You can use the TSO ALLOCate command to allocate a list of data sets, but not to add a data set to an existing definition.

Lionel B. Dyck pointed me to the TSO function bpxwdyn.

When I log on to TSO, I invoke my userid.ZLOGON.REXX data set, which contains

/* Rexx */                                                              

address TSO
userid = userid()
dsn= userid".S0W1.ISPF.ISPPROF"
req = "ALLOC FI(tmp) DA('"dsn"') SHR "
if bpxwdyn(req ) =0 then
call bpxwdyn "concat ddlist(ISPTLIB,tmp) "

"ispf"

  • bpxwdyn(req) allocates the data set to the DDNAME TMP.
  • call bpxwdyn "concat ddlist(ISPTLIB,tmp)" adds the data set(s) in the TMP DDNAME to the end of the ISPTLIB concatenation.
  • "ispf" starts ISPF.

The TSO ISRDDN command gave me

                          Current Data Set Allocations           Row 68 of 122
Command ===> Scroll ===> CSR

Volume Disposition Act DDname Data Set Name Actions: B E V M F C I Q
B3RES1 SHR,KEEP > ISPTLIB ISP.SISPTENU
...
A4USR1 SHR,KEEP > COLIN.S0W1.ISPF.ISPPROF

Easy once you know how.

On the CBT Tape there are KONCAT and CONCAT, which perform a similar function.

Using the Java Health centre to look into z/OSMF, MQWEB and other Liberty products.

The Java Health centre has an agent running in the JVM of interest, and there is an Eclipse plug-in to display the data.

A Java server such as Liberty (as used in z/OSMF and MQWEB) can provide information on how the server is running. I was running MQWEB with OpenJ9, Java 21 (Semeru).

You need to configure the Liberty server and have something to process the data such as Health Center running on Eclipse.

You can display information in graphical time line format, such as

  • CPU used, system and application as used by the JVM
  • Which classes are being used
  • The environment – such as the parameters used to start the JVM
  • Garbage collection activity
  • I/O – number of files open, and open activity
  • Method profiling
  • Threads in use.

Configure Eclipse

I installed Health Center from the Eclipse Marketplace.

How to collect the data

You can configure the JVM in different modes:

  • headless – data is collected and written to the local file system
  • collect from the start – and view in Eclipse, this means you get all of the Java class loading activity
  • start collecting only after Eclipse has started, and connected to the JVM. I use this method. I start my server, and run a workload to “warm up the JVM” then use Eclipse to show the activity due to my testing.

Configure the JVM server

The options are listed here.

You can specify the JVM options on the command line or in the jvm.options file.

You can specify them on the -Xhealthcenter:… statement, or as

-Dcom.ibm.diagnostics.healthcenter...=... 

values. For example

-Xhealthcenter:level=off,readonly=off,jmx=on,port=1972 

or

-Xhealthcenter:level=off
-Dcom.ibm.java.diagnostics.healthcenter.agent.port=1972
-Dcom.ibm.diagnostics.healthcenter.jmx=on
-Dcom.ibm.diagnostics.healthcenter.readonly=on
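
The comma-separated form is a list of name=value pairs after -Xhealthcenter:. A minimal Python sketch that splits a statement into its settings, which can be handy for checking what a jvm.options line actually sets. The name parse_healthcenter is made up for this example, and the code does not check whether the agent accepts an option.

```python
def parse_healthcenter(option: str) -> dict[str, str]:
    """Split an -Xhealthcenter:name=value,... statement into a dict."""
    prefix = "-Xhealthcenter"
    if not option.startswith(prefix):
        raise ValueError(f"not an -Xhealthcenter option: {option!r}")
    rest = option[len(prefix):].lstrip(":").strip()
    settings = {}
    # Empty items (for example from a bare -Xhealthcenter) are skipped.
    for item in filter(None, rest.split(",")):
        name, _, value = item.partition("=")
        settings[name.strip()] = value.strip()
    return settings


opts = parse_healthcenter("-Xhealthcenter:level=off,readonly=off,jmx=on,port=1972")
print(opts)
```

The values are kept as strings, so the port comes back as "1972" rather than a number.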

To run headless

In the server

I added the following to my jvm.options

-Xhealthcenter:level=headless 
-Dcom.ibm.java.diagnostics.healthcenter.headless.delay.start=2
-Dcom.ibm.diagnostics.healthcenter.headless=on
-Dcom.ibm.java.diagnostics.healthcenter.data.collection.level=headless
-Dcom.ibm.java.diagnostics.healthcenter.headless.output.directory=/u/tmp/zowec/
-Dcom.ibm.diagnostics.healthcenter.readonly=on

Download the files to your workstation, and use File -> Load Data in Eclipse to process them.

To run the Health centre in real time

In the server

-Xhealthcenter:level=off,readonly=off,jmx=on,port=1972 
-Dcom.ibm.diagnostics.healthcenter.logging.level=debug

Note the jmx=on and the port number. You need this for the Eclipse configuration. The level=off means do not start collecting data until the Health centre agent connects.

In Eclipse

File -> New Connection… -> Enable an application for monitoring -> Next.

On the Select connector panel I used

Once it worked, I enabled security.

Click Next

The Health Centre then starts searching at the specified port. I disabled the “Scan next 100 ports” option. When it manages to connect to the port, click Finish.

I initially had problems connecting to the server, see Why can’t I connect to a z/OS port?

It takes a few seconds to start the data collection, and start downloading the data.

Let the JVM warm up

The image below shows the CPU usage from the start of the server.

For the first 5 minutes the JVM is starting up, with no workload. Afterwards the CPU used drops to a low value.

After 5 minutes, I started my workload. For the first 12 or so minutes of the workload the CPU usage is high, and after about 13 minutes it levels out. If you want to measure the cost per transaction, take the measurements from the period after the CPU usage has levelled out. During the “warm up” period, the JVM is optimising the code.

The green line shows the system CPU usage. The red line (and grey area) shows the Application usage. We can see most of the CPU used is application usage.

The number of methods profiled is the JVM optimising the code. It takes the “hottest” classes and does those first… until all (most) of the classes are optimised.

Long term monitoring.

From this diagram you can see the JVM startup, the initial part of my test where the JVM was warming up, the remainder of the test, and the JVM overhead after the test.

You need to take all of these into consideration when running performance tests.

Running performance tests

I set up my Workload Manager (WLM) configuration to record the number of MQ transactions, and defined a report class for the MQWEB server. From this I can calculate the cost per transaction.
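
The cost per transaction is the CPU used in a measurement interval divided by the number of transactions in that interval. A minimal sketch of the arithmetic; the function name and all of the figures are invented for illustration and are not taken from my WLM reports.

```python
def cost_per_transaction(cpu_seconds: float, transactions: int) -> float:
    """CPU seconds used per transaction over one measurement interval."""
    if transactions <= 0:
        raise ValueError("need at least one transaction in the interval")
    return cpu_seconds / transactions


# Hypothetical intervals: (label, CPU seconds, transactions).
intervals = [
    ("warm up", 90.0, 30_000),
    ("steady state", 45.0, 30_000),
]
for label, cpu, count in intervals:
    ms = cost_per_transaction(cpu, count) * 1000
    print(f"{label}: {ms:.2f} ms of CPU per transaction")
```

With made-up numbers like these, the same workload looks twice as expensive during the warm up, which is why the interval you measure matters.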

Health centre agent logging

With

-Dcom.ibm.diagnostics.healthcenter.logging.level=finest

I got the following in STDERR

[06:51:52] com.ibm.diagnostics.healthcenter.Agent FINE: System receiver, version 1.0 
[06:51:52] com.ibm.diagnostics.healthcenter.Agent FINE: /usr/lpp/java/J21.0_64//lib/libhcapiplugin.so, version 1.0
[06:51:52] com.ibm.diagnostics.healthcenter.java FINE: Health Center Agent 4.0.7
06:51:53com.ibm.java.diagnostics.healthcenter.agent.mbean.HCLaunchMBean <init>
INFO: Agent version "3.0.21.202109031203"
06:51:56 com.ibm.java.diagnostics.healthcenter.agent.mbean.HCLaunchMBean startAgent
INFO: Health Center agent running in off mode.
06:51:56 com.ibm.java.diagnostics.healthcenter.agent.mbean.HCLaunchMBean startAgent
INFO: Health Center agent started on port 1972.

and in STDOUT many

com.ibm.lang.management.OperatingSystemMXBean.getTotalPhysicalMemory() 

I can’t automatically allocate a data set, and my SMS set up is not helping.

I’m running my little zD&T z/OS system on my laptop. I am the only person on this system, so I have to do everything myself.

I started my MQ system last week, and now it is complaining that it cannot allocate archive logs. From my experience with MQ, I know this is serious. I know I have lots of space on my disks, so why can’t MQ use it?

I’ll go through the diagnostic path I took, which shows the SMS commands I used, and give the solution.

The blog post One minute SMS covers many of the concepts (and commands used).

The error messages

CSQJ072E %CSQ9 ARCHIVE LOG DATA SET 'CSQARC2.CSQ9.B0000002' HAS BEEN ALLOCATED TO NON-TAPE DEVICE AND CATALOGUED, OVERRIDING CATALOG PARAMETER                                    
IGD17272I VOLUME SELECTION HAS FAILED FOR INSUFFICIENT SPACE FOR DATA SET CSQARC2.CSQ9.A0000002 JOBNAME (CSQ9MSTR) STEPNAME (CSQ9MSTR) PROGNAME (CSQYASCP)
REQUESTED SPACE QUANTITY = 120960 KB
STORCLAS (SCMQS) MGMTCLAS ( ) DATACLAS ( )
STORGRPS (SGMQS SGBASE SGEXTEAV )
IKJ56893I DATA SET CSQARC2.CSQ9.A0000002 NOT ALLOCATED+
IGD17273I ALLOCATION HAS FAILED FOR ALL VOLUMES SELECTED FOR DATA SET
CSQARC2.CSQ9.A0000002
IGD17277I THERE ARE (247) CANDIDATE VOLUMES OF WHICH (7) ARE ENABLED OR
QUIESCED
IGD17290I THERE WERE 3 CANDIDATE STORAGE GROUPS OF WHICH THE FIRST 3 814
WERE ELIGIBLE FOR VOLUME SELECTION.
THE CANDIDATE STORAGE GROUPS WERE:SGMQS SGBASE SGEXTEAV
IGD17279I 240 VOLUMES WERE REJECTED BECAUSE THEY WERE NOT ONLINE
IGD17279I 240 VOLUMES WERE REJECTED BECAUSE THE UCB WAS NOT AVAILABLE
IGD17279I 7 VOLUMES WERE REJECTED BECAUSE THEY DID NOT HAVE SUFFICIENT
SPACE (041A041D)

Why is it using the storage class SCMQS?

From the ISMF panels,

  • option 7 Automatic Class Selection
  • option 5 Display – Display ACS Object Information

Gives a panel

   Panel  Utilities  Help                                                       
──────────────────────────────────────────────────────────────────────────────
ACS OBJECT DISPLAY
Command ===>

CDS Name : ACTIVE

ACS Rtn   Source Data Set ACS      Member    Last Trans  Last Date   Last Time
Type      Routine Translated from  Name      Userid      Translated  Translated
--------  -----------------------  --------  ----------  ----------  ----------
DATACLAS  SYS1.S0W1.DFSMS.CNTL     DATACLAS  IBMUSER     2019/12/17  15:21
MGMTCLAS  -----------------------  --------  --------    ----------  -----
STORCLAS  SYS1.S0W1.DFSMS.CNTL     STORCLAS  IBMUSER     2020/12/02  11:23
STORGRP   SYS1.S0W1.DFSMS.CNTL     STORGRP   IBMUSER     2019/12/17  15:23

So the ACS routine is in SYS1.S0W1.DFSMS.CNTL(STORCLAS)

This file has

PROC STORCLAS 
FILTLIST MQS_HLQ INCLUDE(CSQ*.**,
CSQ.**,
MQS.**,
MQS*.**)
...
SELECT
...
WHEN (&DSN = &MQS_HLQ)
DO
SET &STORCLAS = 'SCMQS'
EXIT CODE(0)
END
...
END
END

This says that for any data set name (&DSN) that matches the list (&MQS_HLQ), which has the CSQ* and MQS* patterns, the storage class is set to ‘SCMQS’.

What storage groups are connected with the MQ data set?

Member SYS1.S0W1.DFSMS.CNTL(STORGRP) has

...
WHEN (&STORCLAS= 'SCMQS')
DO
SET &STORGRP = 'SGMQS','SGBASE','SGEXTEAV'
EXIT CODE(0)
END
...

so these are the storage groups that MQ data sets will use.

What DASD volumes are in the storage group?

D SMS,SG(SGbase)                             
IGD002I 13:34:38 DISPLAY SMS 699

STORGRP TYPE SYSTEM= 1
SGBASE POOL +
SPACE INFORMATION:
TOTAL SPACE = 29775MB USAGE% = 98 ALERT% = 0
TRACK-MANAGED SPACE = 29775MB USAGE% = 98 ALERT% = 0

This shows there is 29775 MB of space in the storage group, and it is 98% full.

D SMS,SG(SGMQS)                                                        
IGD002I 13:31:33 DISPLAY SMS 678

STORGRP TYPE SYSTEM= 1
SGMQS POOL +
SPACE INFORMATION:
NOT AVAILABLE TO BE DISPLAYED
***************************** LEGEND *****************************
. THE STORAGE GROUP OR VOLUME IS NOT DEFINED TO THE SYSTEM
+ THE STORAGE GROUP OR VOLUME IS ENABLED
- THE STORAGE GROUP OR VOLUME IS DISABLED
* THE STORAGE GROUP OR VOLUME IS QUIESCED
D THE STORAGE GROUP OR VOLUME IS DISABLED FOR NEW ALLOCATIONS ONLY
Q THE STORAGE GROUP OR VOLUME IS QUIESCED FOR NEW ALLOCATIONS ONLY
> THE VOLSER IN UCB IS DIFFERENT FROM THE VOLSER IN CONFIGURATION
SYSTEM 1 = S0W1

There are no volumes allocated to this storage group.

What volumes are in the storage group?

D SMS,SG(SGBASE),LISTVOL                                             
IGD002I 13:39:07 DISPLAY SMS 705

STORGRP TYPE SYSTEM= 1
SGBASE POOL +
SPACE INFORMATION:
TOTAL SPACE = 29775MB USAGE% = 98 ALERT% = 0
TRACK-MANAGED SPACE = 29775MB USAGE% = 98 ALERT% = 0

VOLUME UNIT MVS SYSTEM= 1 STORGRP NAME
B3USR1 0ADA ONRW + SGBASE
USER0A + SGBASE
USER0B + SGBASE
USER0C + SGBASE
USER0D + SGBASE
USER0E + SGBASE
USER0F + SGBASE
USER00 0A9C ONRW + SGBASE
USER01 + SGBASE
USER02 0AB0 ONRW + SGBASE
USER03 0ACE ONRW + SGBASE
USER04 0AB2 ONRW + SGBASE
USER05 0AB5 ONRW + SGBASE
USER06 0A83 ONRW + SGBASE
...
+ THE STORAGE GROUP OR VOLUME IS ENABLED

How do I see how much space is available in my disks?

ISMF,

  • option 2 – Volume
  • option 1 – DASD

This gives a panel

                          VOLUME SELECTION ENTRY PANEL              Page 1 of 3
Command ===>

Select Source to Generate Volume List . . 2 (1 - Saved list, 2 - New list)
1 Generate from a Saved List Query Name To
List Name . . COLIN Save or Retrieve
2 Generate a New List from Criteria Below
Specify Source of the New List . . 1 (1 - Physical, 2 - SMS)
Optionally Specify One or More:
Enter "/" to select option Generate Exclusive list
Type of Volume List . . . 1 (1-Online,2-Not Online,3-Either)
Volume Serial Number . . USER* (fully or partially specified)
Device Type . . . . . . . (fully or partially specified)
Device Number . . . . . . (fully specified)
To Device Number . . . (for range of devices)
Acquire Physical Data . . Y (Y or N)
Acquire Space Data . . . Y (Y or N)
Storage Group Name . . . (fully or partially specified)
CDS Name . . . . . . .
(fully specified or 'Active')
Use ENTER to Perform Selection; Use DOWN Command to View next Selection Panel;
Use HELP Command for Help; Use END Command to Exit.

or

        Enter "/" to select option      Generate Exclusive list                 
Type of Volume List . . . 1 (1-Online,2-Not Online,3-Either)
Volume Serial Number . . * (fully or partially specified)
Device Type . . . . . . . (fully or partially specified)
Device Number . . . . . . (fully specified)
To Device Number . . . (for range of devices)
Acquire Physical Data . . Y (Y or N)
Acquire Space Data . . . Y (Y or N)
Storage Group Name . . . SGBASE (fully or partially specified)
CDS Name . . . . . . . 'ACTIVE'
(fully specified or 'Active')

You can specify a Volume Serial prefix, a Storage Group Name, or a combination of both.

You need to select Acquire Physical Data, and Acquire Space Data.

You get output like

LINE       VOLUME FREE      %    ALLOC     FRAG  LARGEST   FREE
OPERATOR   SERIAL SPACE     FREE SPACE     INDEX EXTENT    EXTENTS ... ...
---(1)---- -(2)-- ---(3)--- (4)- ---(5)--- -(6)- ---(7)--- --(8)--
           B3USR1   149186K    2  8165315K   375    34032K      36
           USER00    67067K    1  8247434K   718     2490K     133
           USER02    30601K    1  2740899K   412    11621K      31
           USER03     3209K    0  2768291K   333     2213K       6
           USER04   146198K    5  2625302K   280    42332K      19
           USER05    64466K    2  2707034K     9    63802K       3
           USER06   273304K   10  2498196K   177   105581K      14

This shows that I do not have much free space.

Add more space

As it looks like my storage group pools are low on disk space, I need to allocate more volumes.

See Adding more disk space to z/OS, creating volumes and adding them to SMS.

Once I added the volume to the SGBASE storage group, its usage went from

TOTAL SPACE = 29775MB USAGE% = 98 ALERT% = 0                      
TRACK-MANAGED SPACE = 29775MB USAGE% = 98 ALERT% = 0

to

TOTAL SPACE = 32482MB USAGE% = 89 ALERT% = 0                      
TRACK-MANAGED SPACE = 32482MB USAGE% = 89 ALERT% = 0
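
The free space can be estimated from TOTAL SPACE and USAGE%. A rough Python calculation using the figures above; USAGE% is displayed as a whole number, so the results are approximate, and free_mb is a name made up for this example.

```python
def free_mb(total_mb: int, usage_percent: int) -> float:
    """Estimate the free space in MB from TOTAL SPACE and USAGE%."""
    return total_mb * (100 - usage_percent) / 100


before = free_mb(29775, 98)   # before the volume was added
after = free_mb(32482, 89)    # after the volume was added
request_mb = 120960 / 1024    # REQUESTED SPACE QUANTITY = 120960 KB

print(f"free before: about {before:.0f} MB")
print(f"free after:  about {after:.0f} MB")
print(f"archive log request: about {request_mb:.0f} MB")
```

On these figures there was nominally more free space in SGBASE than the archive log needed even before the extra volume was added. That total is spread over several volumes, so it does not follow that any one volume could satisfy the request.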

What CEA TSO operator commands are there?

Part of the CEA service on z/OS provides the capability for an application to start TSO address spaces, send them TSO commands, and receive the responses. This is used by products like z/OSMF. A user can have CEA TSO address spaces as well as a “normal” TSO session, where you log on and use ISPF.

More information about the commands

Change the CEA parameters F CEA,CEA=(x1,x2,…xN)

Display the CEA configuration parameters F CEA,D,P

STATUS: ACTIVE-FULL      CLIENTS: 0  INTERNAL: 0            
CEA = (00)
SNAPSHOT = N
HLQLONG = CEA HLQ =
BRANCH = COUNTRYCODE =
CAPTURE RANGE FOR SLIP DUMPS:
LOGREC = 01:00:00 LOGRECSUMMARY= 04:00:00
OPERLOG = 00:30:00
CAPTURE RANGE FOR ABEND DUMPS:
...
CAPTURE RANGE FOR CONSOLE DUMPS:
...
TSOASMGR:
RECONSESSIONS = 0 RECONTIME = 00:00:00
MAXSESSIONS = 50 MAXSESSPERUSER= 10

Display a summary of CEA TSO regions F CEA,D,S

STATUS: ACTIVE-FULL      CLIENTS: 0  INTERNAL: 0         
EVENTS BY TYPE: #WTO: 0 #ENF: 0 #PGM: 0
TSOASMGR: ALLOWED: 50 IN USE: 1 HIGHCNT: 0

Display client summary F CEA,D,CLIENTSUMMARY and D CEA,CLIENT=*

STATUS: ACTIVE-FULL      CLIENTS: 0  INTERNAL: 0                   
EVENTS BY TYPE: #WTO: 0 #ENF: 0 #PGM: 0
TSOASMGR: ALLOWED: 50 IN USE: 1 HIGHCNT: 0
NO CLIENTS KNOWN TO CEAS AT THIS TIME
12I CN=L700 DEVNUM=0700 SYS=S0W1

Display the session information F CEA,DIAG,SESSTABLE

INDEX=0001 USERID=COLIN    APPID=IZUCONAP ASID=004E MSGQID=00060018                       
COUNT=0001 ASCBADDR=FC3B80 STOKEN=0000013800000009 STTIME=15:34:43.966
LRTIME=15:34:43.967 LOGONPROC=IZUFPROC GROUP= REGION=50000
CODEPG=1047 CHARSET=697 ROWS=204 COLS=160 RECONN=N RCTIME=00:00:00.000
ACCT=ACCT#
HOST REMOTESYS= REMOTEQID=00000000 CALLERSYS=

This shows information such as the TSO LOGON procedure used, the screen size, the region size and the account number.
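
The SESSTABLE output is a series of KEYWORD=value pairs, so it is easy to pick apart if you capture it. A minimal Python sketch using the output above; parse_session is a name made up for this example, and it assumes that values never contain blanks.

```python
import re

SESSTABLE = """\
INDEX=0001 USERID=COLIN    APPID=IZUCONAP ASID=004E MSGQID=00060018
COUNT=0001 ASCBADDR=FC3B80 STOKEN=0000013800000009 STTIME=15:34:43.966
LRTIME=15:34:43.967 LOGONPROC=IZUFPROC GROUP= REGION=50000
CODEPG=1047 CHARSET=697 ROWS=204 COLS=160 RECONN=N RCTIME=00:00:00.000
ACCT=ACCT#
HOST REMOTESYS= REMOTEQID=00000000 CALLERSYS=
"""


def parse_session(text: str) -> dict[str, str]:
    """Return the KEYWORD=value pairs; a keyword with no value maps to ''."""
    return dict(re.findall(r"(\w+)=(\S*)", text))


session = parse_session(SESSTABLE)
for key in ("USERID", "LOGONPROC", "REGION", "ROWS", "COLS", "ACCT"):
    print(f"{key:10} {session[key]}")
```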

Mapping a certificate to a userid and so avoid needing a password is good – but…

You can use the RACDCERT MAP command to map a certificate to a userid, and so avoid the need to specify a password. Under the covers, code uses the pthread_security_np function and passes a certificate, or a userid and password. If these are validated, the thread becomes that userid, just as if the userid had logged on.

Is this secure?

If you store a userid and password on your laptop, even though the data may be “protected” someone who has access to your machine may be able to copy the file and so impersonate you.

With a public certificate and private key, if someone can access your machine, they may be able to copy these files and so impersonate you.

You can get dongles which you plug into your laptop on which you can store protected data. In order to use the data, you need the physical device.

You need to protect the RACF command

Because the RACDCERT command has the power to be dangerous, you need to protect it.

You do not want someone to specify that their certificate maps to a powerful userid, such as SYS1. The documentation says

To issue the RACDCERT MAP command, you must have the SPECIAL attribute or sufficient authority to the IRR.DIGTCERT.MAP resource in the FACILITY class for your intended purpose.

For a general user to create a mapping associated with their own user ID they need READ access to IRR.DIGTCERT.MAP.

For a general user to create a mapping associated with another user ID or MULTIID, they need UPDATE access to IRR.DIGTCERT.MAP.

What’s the best way to set this up?

I think that as part of your process for setting up userids, the process should create the mapping for the certificate to a userid. This way you do not have people creating the mapping. If a mapping already exists, you cannot create another mapping.

You may want an automated process which checks the approval, and issues the commands, and so you do not have humans with the authority to issue the commands.

Of course you’ll have a break-glass all powerful userid in case of emergencies.

But….


Even though the password had expired, I could logon using the certificate. If I revoked the userid the logon failed.

I used certificate logon from z/OSMF and issued console commands. This starts a TSO address space, and z/OSMF passes the commands and responses to the TSO address space.

Once a TSO address space has been started, there are no more checks to see if the userid is still valid.

If you want to inactivate the userid, you’ll need to revoke it, and then cancel all the TSO address spaces running on behalf of the userid. Walking someone off site is not good enough. There may be scripts which are automated, and will logon with no human intervention.
TSO address spaces may be configured to be cancelled if there is no activity. If the TSO address space is kept busy (for example by sending it requests), it may never be forced off.

Getting a CTRACE

Component TRACE (CTRACE) is the z/OS system trace capability for z/OS components. Most z/OS components use it.

From a “capturing a trace” perspective, there are two aspects.

  • Capturing the trace data
    • The trace can be an in-memory trace, which is available when a dump is taken. This is often the default. For example, by default a trace is enabled to capture errors, and the in-memory trace is used.
    • You can have a trace writer started task which writes to a data set. When you start the trace you give the name of the started task. Data is passed to the trace writer job. You can then use the trace data set in IPCS.
  • Enabling the trace for the component. Usually there are options you can specify, for example all entries, or just error entries and how big the in-memory trace should be.

To trace a z/OS component, you need to know the CTRACE component name, and what you want to trace.

I tried to capture a CTRACE of a z/OS component, and struggled, because I didn’t know the name of the component.

What are the trace component names?

The z/OS command

TRACE STATUS         

gave

IEE843I 16.10.22  TRACE DISPLAY 940                               
SYSTEM STATUS INFORMATION
ST=(ON,0001M,00005M) AS=ON BR=OFF EX=ON MO=OFF MT=(ON,064K)
COMPONENT MODE COMPONENT MODE COMPONENT MODE COMPONENT MODE
--------------------------------------------------------------
CSF ON NFSC ON SYSGRS MIN SYSANT00 MIN
SYSJES2 SUB SYSRRS MIN SYSIEAVX MIN SYSSPI OFF
SYSJES SUB SYSHZS MIN SYSSMS OFF SYSAXR MIN
SYSDLF MIN SYSOPS MIN SYSXCF MIN SYSDUMP ON
SYSLLA MIN SYSXES ON SYSUNI OFF SYSCATLG MIN
SYSTTRC OFF SYSTCPDA SUB SYSRSM SUB SYSAOM MIN
SYSVLF MIN SYSTCPIP SUB SYSLOGR ON SYSOMVS MIN
SYSCEA MIN SYSWLM MIN SYSTCPIS SUB SYSTCPRE SUB
SYSIOS MIN SYSANTMN MIN SYSDMO MIN SYSIEFAL ON
SYSTCPOT SUB

I was after a CEA trace, and from the above, the name is SYSCEA. It is MIN, so is already active.
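
If you capture the TRACE STATUS output, the component names and modes are simply alternating words, so finding a component can be scripted. A minimal Python sketch using a few rows of the display above; component_modes is a name made up for this example.

```python
TRACE_STATUS = """\
CSF ON NFSC ON SYSGRS MIN SYSANT00 MIN
SYSJES2 SUB SYSRRS MIN SYSIEAVX MIN SYSSPI OFF
SYSCEA MIN SYSWLM MIN SYSTCPIS SUB SYSTCPRE SUB
"""


def component_modes(text: str) -> dict[str, str]:
    """Pair up alternating COMPONENT and MODE words from the display."""
    words = text.split()
    return dict(zip(words[0::2], words[1::2]))


modes = component_modes(TRACE_STATUS)
cea = [name for name in modes if "CEA" in name]
print(cea, modes["SYSCEA"])
```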

What is the trace’s status?

d trace,comp=SYSCEA

gave me

COMPONENT     MODE BUFFER HEAD SUBS                           
-------------------------------------------------------------
SYSCEA MIN 0002M
ASIDS *NONE*
JOBNAMES *NONE*
OPTIONS ERROR
WRITER *NONE*

So it is active, capturing errors, and writing to the in-memory trace (because there is no WRITER). I recognised the options as the defaults in parmlib member CTICEA00.

I had my own trace writer started task

Member CTWTR in proclib

//CTWTR PROC                                                                  
//DELETE EXEC PGM=IEFBR14
//TRCOUT01 DD DSNAME=IBMUSER.CTRACE1,
// SPACE=(CYL,(10),,CONTIG),DISP=(MOD,DELETE)
//*
//IEFPROC EXEC PGM=ITTTRCWR,TIME=999
//TRCOUT01 DD DSNAME=IBMUSER.CTRACE1,
// SPACE=(CYL,(10),,CONTIG),DISP=(NEW,CATLG)
//SYSPRINT DD SYSOUT=*

I started my CTRACE writer

TRACE CT,WTRSTART=CTWTR          

I created my own member CTICEACP in parmlib

TRACEOPTS 
ON
BUFSIZE(20m)
OPTIONS('ALL')
WTR(CTWTR)

The WTR ties up with my CTRACE writer started task name.

Stop the current trace

TRACE CT,OFF,COMP=SYSCEA

Start the CEA trace using my member

TRACE CT,ON,COMP=sysCEA,PARM=CTICEACP

Run the test

Stop the CEA trace

TRACE CT,OFF,COMP=SYSCEA

Stop the trace writer

TRACE CT,WTRSTOP=CTWTR     
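
The sequence of commands is the same for any component, so it can be generated. A minimal Python sketch which only builds the command text in the order I used; it does not issue anything, and ctrace_commands is a name made up for this example.

```python
def ctrace_commands(component: str, writer: str, member: str) -> dict[str, list[str]]:
    """Build the operator commands to issue before and after the test."""
    return {
        "before": [
            f"TRACE CT,WTRSTART={writer}",
            f"TRACE CT,OFF,COMP={component}",
            f"TRACE CT,ON,COMP={component},PARM={member}",
        ],
        "after": [
            f"TRACE CT,OFF,COMP={component}",
            f"TRACE CT,WTRSTOP={writer}",
        ],
    }


commands = ctrace_commands("SYSCEA", "CTWTR", "CTICEACP")
for phase in ("before", "after"):
    print(f"{phase} the test:")
    for command in commands[phase]:
        print("  " + command)
```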

The output from the CTWTR task gave me

IEF196I IEF142I CTWTR CTWTR - STEP WAS EXECUTED - COND CODE 0000         
IEF196I IGD104I IBMUSER.CTRACE1 RETAINED,
IEF196I DDNAME=TRCOUT01

which gives me the name of the data set IBMUSER.CTRACE1.

Use IPCS to look at the trace

  • option =0 to specify the name of the data set
  • =6 to get to the IPCS command panel
  • dropd – this clears out any old information about the data set
  • CTRACE COMP(SYSCEA) FULL

I had some data in the trace – but not for the problem I had…. so I need to try something else.

The advanced class.

You do not need to have a member in parmlib. You can use

TRACE CT,ON,COMP=SYSCEA

and do not specify a PARM. This then prompts for the parameters: ASID, jobname, writer and options.

Oh p*x, it didn’t copy across some files.

I had managed to mess up the files for a product, so I wanted to copy them across from an older system.

This worked for some of the files – but when I came to start the subsystem – it was missing some files! For example /u/my/zosmf/liberty/lib/native/zos/s390x/bbgzsrv

I copied the files across again – and they were still not there!

Once you know the answer it is obvious…

There is a directory /usr/lpp/zosmf/liberty – and it was this directory that was missing.

Once I looked into it more carefully – this was not a directory, but a symbolic link to another directory liberty -> ../liberty_zos/current

To fix this I used

# go to my version of zosmf
cd /u/my/zosmf
# remove the symbolic link
rm liberty
# make the new link
ln -s /usr/lpp/liberty_zos/current liberty

and now I could use ls /u/tmp/zosmfp/liberty/lib/native/zos/s390x/bbgzsrv and it found the file.

If I had checked this before I started, I would have saved myself half a day of IPLing older systems!
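
A check like this can be automated. A minimal Python sketch which walks a directory tree and lists the symbolic links in it, so you can see what a copy might leave behind. The directory names in the demonstration are made up and are created in a temporary directory, and find_symlinks is a name invented for this example.

```python
import os
import tempfile


def find_symlinks(root: str) -> list[tuple[str, str]]:
    """Return (path relative to root, link target) for each symbolic link."""
    links = []
    for dirpath, dirnames, filenames in os.walk(root):
        # os.walk does not descend into linked directories by default,
        # but it still reports them in dirnames.
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                links.append((os.path.relpath(path, root), os.readlink(path)))
    return sorted(links)


# Demonstration with a made-up layout similar to the one described above.
with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, "liberty_zos", "current", "lib"))
    os.makedirs(os.path.join(top, "zosmf"))
    os.symlink(os.path.join("..", "liberty_zos", "current"),
               os.path.join(top, "zosmf", "liberty"))
    found = find_symlinks(os.path.join(top, "zosmf"))

for path, target in found:
    print(f"{path} -> {target}")
```

Running something like this against the source directory before copying shows which entries are links and where they point.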

How to get a file from z/OS to a different z/OS without using FTP

I have a userid on a z/OS production system, which does not support FTP. To run my tests, I needed to get some files on to this system. Getting the files there was a challenge.

The 3270 emulator has support for transferring files. It uses the IND$FILE TSO command to send data packaged as a 3270 datastream. As far as I can tell, this only works with data sets, not Unix files.

Creating a portable file from a data set.

You can package a data set into a FB Lrecl 80 dataset using the TSO XMIT (TRANSMIT) command.

Create a portable dataset from a Unix file.

On my home system I created a PAX dataset from a file in a Unix directory.

Use cd to get into the directory you want to package. If you specify an absolute file name like /tmp/mypackage, then when you unpack it, pax will store the output in /tmp/mypackage, which may not be where you want to store the data.

If you use relative names such as ‘.’, it will unpack relative to the directory you are in at the time. I used the cd command to get into my working directory

pax -W "seqparms='space=(cyl,(10,10))'" -wzvf  "//'COLIN.ZOWE.PAX'" -x os390  myfile

You need both the single and double quotes around the data set name.

This created a data set with record format FB, and Lrecl 80.

A 360 MB file became a 426 CYL data set.

If you run out of space (a B37-04 abend), increase the amount of space in the pax command, and delete the data set before you reissue the command; otherwise the space parameters on the pax command are ignored.

I FTPed this down to my Linux machine in binary mode.

Send the file to the remote z/OS over 3270 emulator

Because FTP was not available I had to use the TSO facility IND$FILE. One of the options from the “file” menu was “File Transfer”.

You fill in details of the local file name, the remote data set name, and data set attributes.

In theory you need to be in TSO option 6 – where you can enter TSO commands, but when I tried this I kept getting “input field too small”. I had to exit ISPF and get into native TSO before the command worked.

The transfer rate is very slow. It sends one block at a time, and waits for the acknowledgement. With TCP/IP you can send multiple blocks before waiting for the ack, and use big blocks. For a 300MB file, I achieved 47KB per second with a 16000 block size – so not very high.

With IND$FILE, pick the biggest block size you can. I think it supports a maximum size of 32767. I got 86 KB/second with a 32767 block size with DFT mode.
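
To put those rates in context, the elapsed time is the file size divided by the transfer rate. A rough Python calculation, assuming 1 MB = 1024 KB and a steady rate for the whole transfer; transfer_minutes is a name made up for this example.

```python
def transfer_minutes(size_mb: float, rate_kb_per_second: float) -> float:
    """Elapsed minutes to send size_mb at a steady rate_kb_per_second."""
    return size_mb * 1024 / rate_kb_per_second / 60


slow = transfer_minutes(300, 47)   # 16000 block size
fast = transfer_minutes(300, 86)   # 32767 block size, DFT mode
print(f"at 47 KB/second: about {slow:.0f} minutes")
print(f"at 86 KB/second: about {fast:.0f} minutes")
```

For a 300 MB file this is the difference between nearly two hours and about one hour.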

For a dataset packaged with TSO XMIT

Use the TSO command RECEIVE INDSN(…) to restore the data set.

Un PAX the file to recreate it

On the production system, I went into Unix, and used the cd command to get to the destination directory.

pax -ppx -rf  "//'COLIN.ZOWE.PAX'"      

Programming shared memory – more head banging.

I was trying to use shared memory (to look at Java Shared Classes), and it took me a day to get it working – better documentation would have helped.

I had two basic problems

  1. Using shmctl to display information about the shared memory gave the size as 0 bytes, even though the ipcs command showed me there were megabytes of data in the shared memory area.
  2. Trying to attach the shared memory gave me “invalid parameters” return code – even though the documented reasons for this error code did not apply to my program.

I tried many things, from using different userids, to running with a different storage key, running APF authorised….

I eventually got it to work by compiling my C program in 64 bit mode rather than 31 bit mode. There is no discussion about 31 bit/64 bit in the documentation. If the shared memory is in 64 bit mode, you will need 64 bit addressability, so you need a 64 bit program. But there is no way of determining that the shared memory is 64 bit!

My basic program

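#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* printHex is my own routine to dump storage in hex. It is not shown */
/* here, and this declaration is an assumption about its parameters.  */
void printHex(FILE * f, void * p, long len);

int main(void)
{
  //struct shmid_ds buf;                  /* 31 bit version */
  struct shmid_ds64 buf;                  /* 64 bit version */
  memset(&buf ,0,sizeof(buf));
  int shmid = 8197;                       /* id of the shared memory */
  int rc = 0;
  long l;
  int cmd = IPC_STAT;
  // rc =shmctl(shmid, cmd, &buf);        /* 31 bit version */
  rc =shmctl64(shmid, cmd, &buf);         /* 64 bit version */
  if (rc != 0) perror("shmctl ");
  printf("shmctl rc %i\n",rc);
  l = buf.shm_segsz;
  printf("size %ld\n",l);
  printHex(stdout,&buf,sizeof(buf));
  ///////////////////////////////////////////////
  // shmat
  ///////////////////////////////////////////////
  char * pData = shmat(shmid, NULL , 0 );
  int e = errno;          /* save errno before another call changes it */
  if (pData == (char *) -1)
  {
    printf("shmat failed. Errno: %s\n",strerror(e));
    return 8;
  }
  printf("Address %p\n",(void *) pData);
  printHex(stdout,pData+4096*1024,1024*1024);
  return 0;
}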

Originally I was using EDCCB to compile and bind this.

The EINVAL error return code was (from the documentation) for cases where the pointer in shmat was non NULL. I was passing NULL – so none of this made sense.

The reason code 0717014A was

JRInvalidAmode: An incorrect access mode was specified on the access service
Action: The access mode specified on the access service has unsupported bits turned on. Reissue the request and specify a valid access mode.
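
My understanding of the z/OS UNIX documentation is that a reason code such as 0717014A is made up of two halfwords: the first is a qualifier which identifies where the code came from, and the second is the reason code itself, which is the part to look up. A minimal Python sketch of the split; split_reason_code is a name made up for this example.

```python
def split_reason_code(reason: str) -> tuple[str, str]:
    """Split an 8 hex digit reason code into (qualifier, reason)."""
    value = int(reason, 16)
    return f"{value >> 16:04X}", f"{value & 0xFFFF:04X}"


qualifier, reason = split_reason_code("0717014A")
print(f"qualifier {qualifier}, reason {reason}")
```

The bpxmtext command in the z/OS UNIX shell displays the text for a reason code, for example bpxmtext 0717014A.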

It turned out that my program was 31 bit. When I used 64 bit – it magically started working.

I compiled it with EDCQCB, and had to change a few things to be 64 bit mode.

  • shmid_ds buf -> shmid_ds64
  • shmctl -> shmctl64

When I ran it in 31 bit mode, the length of the storage returned was 0. In 64 bit mode, it gave the correct length. This looks like a way of telling what mode the shared memory is!

Understanding spawn and _BPX_SHAREAS

You can use spawn() to create another process to do work. It may be able to run in the same address space as the originator, or it may run in its own address space.

It is cheaper to run in the requester’s address space, as it just creates a new TCB. If it runs in a different address space, in one of the pool of OMVS BPXAS address spaces, there is additional overhead.

I set up a shell script to call a Rexx script which did a spawn of another shell script.

I used the Rexx script to display information about the threads.

With _BPX_SHAREAS=YES – share the address space

the output was

jobname  asid    ppid    pid    threadid  tcb  cmdline
COLIN 21 1 50397218 212A80003 8BEA50 OMVS
COLIN 21 50397218 16842787 212A68002 8B9C90 -sh
COLIN2 4C 16842787 50397295 212AA8000 8FB2F8 sh kk.sh
COLIN2 4C 50397295 33620080 212AB0000 8D6A88 ./r.rexx YES

We can see the following

  1. The top level process in OMVS (parent process id) 1 invoked a program OMVS with process id(pid) 50397218, in address space 0x21
  2. This process invoked a shell (-sh) with pid 16842787 in address space 0x21
  3. This executed a command “sh kk.sh” (my test script) with a process id in address space 0x4c, jobname COLIN2, and TCB 8FB2F8.
  4. This shell script invoked a command “./r.rexx YES” in the same address space 0x4c, jobname COLIN2, with a different TCB 8D6A88. This is sharing the address space.

With _BPX_SHAREAS=NO – do not share the address space

the output was similar to the _BPX_SHAREAS=YES case, but with different address spaces

jobname  asid    ppid    pid    threadid  tcb    cmdline
COLIN 21 1 50397218 212A80003 8BEA50 OMVS
COLIN 21 50397218 16842787 212A68002 8B9C90 -sh
COLIN9 4B 16842787 83951726 212AA8000 8FB380 sh kk.sh
COLIN1 4D 83951726 16842864 212AB0000 8FB2F8 ./r.rexx NO
  • The sh kk.sh ran in a different address space 0x4B, with a different jobname COLIN9.
  • Because _BPX_SHAREAS=NO, the command “./r.rexx NO” executed in a different address space 0x4D, jobname COLIN1.

Comparison between the two scenarios

  1. With _BPX_SHAREAS=YES, one address space was shared, with two (lightweight) TCBs in it.
  2. With _BPX_SHAREAS=NO, the address spaces were not shared, and address spaces from the pool of BPXAS address spaces were used.
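
The difference is easier to see if you group the rows of the two displays by address space. A minimal Python sketch using the output above; by_address_space is a name made up for this example, and it assumes the columns are always in the order shown.

```python
from collections import defaultdict

SHARED = """\
COLIN 21 1 50397218 212A80003 8BEA50 OMVS
COLIN 21 50397218 16842787 212A68002 8B9C90 -sh
COLIN2 4C 16842787 50397295 212AA8000 8FB2F8 sh kk.sh
COLIN2 4C 50397295 33620080 212AB0000 8D6A88 ./r.rexx YES
"""

NOT_SHARED = """\
COLIN 21 1 50397218 212A80003 8BEA50 OMVS
COLIN 21 50397218 16842787 212A68002 8B9C90 -sh
COLIN9 4B 16842787 83951726 212AA8000 8FB380 sh kk.sh
COLIN1 4D 83951726 16842864 212AB0000 8FB2F8 ./r.rexx NO
"""


def by_address_space(text: str) -> dict[str, list[str]]:
    """Group the command lines by ASID."""
    groups = defaultdict(list)
    for line in text.splitlines():
        # Six fixed columns, then the command line, which may contain blanks.
        jobname, asid, ppid, pid, threadid, tcb, cmdline = line.split(None, 6)
        groups[asid].append(cmdline)
    return dict(groups)


for label, text in (("YES", SHARED), ("NO", NOT_SHARED)):
    print(f"_BPX_SHAREAS={label}")
    for asid, commands in by_address_space(text).items():
        print(f"  ASID {asid}: {commands}")
```

With YES, the script and the Rexx program are both in address space 4C. With NO, they are in 4B and 4D.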

When do you get not shared, even when _BPX_SHAREAS=YES was specified?

There are several cases when the system will not run a program in a shared address space.

Integrity

If there is a mismatch between APF states, for example

  • the caller is APF authorised; you do not want an unauthorised program to access the memory in the shared address space of the APF authorised thread.
  • the caller is not APF authorised, but you are calling an APF authorised program.

The called program may change the userid or the group.

Your program may change the userid or group it is running under, for example a web server doing work for different userids. You can use chmod to set a flag (s) on the file to indicate that the program may change userid or group. See set-user-ID and set-group-ID in chmod. The ls -ltr command shows the flags

-rwsr-sr-x   1 OMVSKERN ZWEADMIN    1336 Feb 26 16:41 r.rexx       

Where the first s is for set-user-ID. The second s is for set-group-ID.
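
The characters in the ls output come straight from the file mode bits. A short Python sketch, using a made-up mode value chosen to match the listing above, shows which bits produce the two s flags.

```python
import stat

# Regular file, set-user-ID and set-group-ID on, permissions rwxr-xr-x.
mode = stat.S_IFREG | stat.S_ISUID | stat.S_ISGID | 0o755

listing = stat.filemode(mode)
print(listing)

has_setuid = bool(mode & stat.S_ISUID)   # the first s
has_setgid = bool(mode & stat.S_ISGID)   # the second s
print(f"set-user-ID {has_setuid}, set-group-ID {has_setgid}")

# Turning the two bits off gives ordinary x flags again.
plain = stat.filemode(mode & ~(stat.S_ISUID | stat.S_ISGID))
print(plain)
```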

When the file had u+s, I got the following error message

FSUM9209 cannot execute: reason code = 0b1b0473: EDC5157I An internal error has occurred.