Some of the mysteries of Java shared classes

When Java executes a program, it reads in the jar file, breaks it into the individual classes, converts the byte codes into instructions, and while executing may replace instructions with more efficient instructions (JITting). It can also convert the byte codes into instructions ahead of time, so-called Ahead Of Time (AOT) compilation.

With shared classes, the converted byte codes, any JITted code, and any AOT code can be saved in a data space.

  • When the Java program runs a second time, it can reuse the data in the data space, avoiding the overhead of reading the jar file from the file system and converting the byte codes into instructions.
  • The data space can be hardened to a file, and restored to a data space, so it can be used across system IPLs.

Using this reduced the start-up time of my program by over 20 seconds on my slow zPDT system. The default size of the cache is 16MB – one of my applications needed 100 MB, so most of the benefits of the shared classes could not be exploited if the defaults were used.

This blog post gives more information about shared classes, and describes what tuning you can do.

Issuing commands to manage the shared classes cache

Commands to manage the shared classes cache are issued like

java -Xshareclasses:cacheDir=/tmp,name=client6,printStats

which can be done using JCL

// SET V='listAllCaches'
// SET V='printStats'
// SET C='/tmp'
// SET N='client6'
//S1 EXEC PGM=BPXBATCH,REGION=0M,
// PARM='SH java -Xshareclasses:cacheDir=&C,name=&N,verbose,&V'
//STDERR DD SYSOUT=*
//STDOUT DD SYSOUT=*

Enabling shared classes

You specify the -Xshareclasses information as a parameter to the JVM, for example on the command line or in a jvm.options file.

To use the shared classes capability you have to specify all of the parameters on one line, like

-Xshareclasses:verbose,name=client6,cacheDirPerm=0777,cacheDir=/tmp

Having it like

-Xshareclasses:name=client6,cacheDirPerm=0777,cacheDir=/tmp
-Xshareclasses:verbose

means the name etc. all take their defaults; only the -Xshareclasses:verbose would be used.
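Note that the -Xsc… size options described below are separate JVM options (they are not sub-options of -Xshareclasses), so they do not have to be on the same line. A sketch, using values from this post:

-Xshareclasses:verbose,name=client6,cacheDirPerm=0777,cacheDir=/tmp
-Xscmx100m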

Changing shared classes parameters

You can have more than one cache; each cache has a name. You also specify a directory where an image is stored when the cache is hardened to disk.

Some of the options, like name= and cacheDir=, are picked up every time the JVM starts. Other parameters, like cacheDirPerm, are only used when the cache is (re-)created.
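For example (a sketch using the destroy command shown below), to pick up a change to cacheDirPerm you would destroy the cache and let the next run of the program recreate it with the new value:

java -Xshareclasses:cacheDir=/tmp,name=client6,destroy

then restart the Java program with the new -Xshareclasses:…,cacheDirPerm=… value.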

You can delete the cache in two ways.

Delete the cache from your Java program

When you are playing around, you can add reset to the end of the -Xshareclasses string to cause the cache to be deleted and recreated. This gives output like

JVMSHRC010I Shared cache "client6" is destroyed
JVMSHRC158I Created shared class cache "client6"
JVMSHRC166I Attached to cache "client6", size=20971328 bytes

This was especially useful when tuning the storage allocations.

Delete the cache independently

java -Xshareclasses:cacheDir=/tmp,name=client6,destroy

How to allocate the size of the cache

You specify the storage allocations using -Xsc… options (where sc stands for shared classes).

If you have -Xshareclasses:verbose… specified, then when the JVM shuts down you get

JVMSHRC168I Total shared class bytes read=11660. Total bytes stored=5815522
JVMSHRC818I Total unstored bytes due to the setting of shared cache soft max is 0.
Unstored AOT bytes due to the setting of -Xscmaxaot is 1139078.
Unstored JIT bytes due to the setting of -Xscmaxjitdata is 131832.

This shows the values of maxaot and maxjitdata were too small. They were

-Xscmx20m
-Xscmaxaot2k
-Xscmaxjitdata2k

When the values were big enough I got

JVMSHRC168I Total shared class bytes read=12960204. Total bytes stored=8885038
JVMSHRC818I Total unstored bytes due to the setting of shared cache soft max is 0.
Unstored AOT bytes due to the setting of -Xscmaxaot is 0.
Unstored JIT bytes due to the setting of -Xscmaxjitdata is 0.
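Putting the size options together, a jvm.options sketch might be (the 200m cache size is the value that worked for me below; the AOT and JIT limits are illustrative – they need to be large enough for the numbers reported in the messages above):

-Xscmx200m
-Xscmaxaot4m
-Xscmaxjitdata4m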

How big a cache do I need?

If you use -Xshareclasses:verbose… it will display messages

for example

JVMSHRC166I Attached to cache "client6", size=2096960 bytes
JVMSHRC269I The system does not support memory page protection

JVMSHRC096I Shared cache "client6" is full. Use -Xscmx to set cache size.
JVMSHRC168I Total shared class bytes read=77208. Total bytes stored=2038042

Message JVMSHRC096I tells you the cache is full – but gives no information about how big it needs to be.

You can use

java -Xshareclasses:cacheDir=/tmp,name=client6,printStats

to display statistics like

[-Xshareclasses persistent cache disabled]
[-Xshareclasses verbose output enabled]                                            
JVMSHRC159I Opened shared class cache "client6"                                    
JVMSHRC166I Attached to cache "client6", size=2096960 bytes                        
JVMSHRC269I The system does not support memory page protection                     
JVMSHRC096I Shared cache "client6" is full. Use -Xscmx to set cache size.          
                                                                                   
Current statistics for cache "client6": 
cache size                           = 2096592                       
softmx bytes                         = 2096592                       
free bytes                           = 0                             
ROMClass bytes                       = 766804                        
AOT bytes                            = 6992                          
Reserved space for AOT bytes         = -1                            
Maximum space for AOT bytes          = 1048576                       
JIT data bytes                       = 212                           
Reserved space for JIT data bytes    = -1                            
Maximum space for JIT data bytes     = 1048576                       
Zip cache bytes                      = 1131864                       
Startup hint bytes                   = 0                             
Data bytes                           = 13904                         
Metadata bytes                       = 12976                         
Metadata % used                      = 0%                            
Class debug area size                = 163840                        
Class debug area used bytes          = 119194                        
Class debug area % used              = 72%

Cache is 100% full  
                                                                             

This shows the cache is 100% full, and how much space is used for AOT and JIT. The default value of -Xscmx I had was almost 16MB. I made it 200MB and this was large enough.

I could not find a way of getting my program to issue printStats.

How do I harden the cache?

You can use the

java -Xshareclasses:cacheDir=/tmp,name=zosmf,verbose,snapshotCache

command to create the cache on disk. Afterwards the listAllCaches command gave

Cache name level        cache-type     feature 
client6    Java8 64-bit non-persistent cr
client6    Java8 64-bit snapshot       cr

This shows the non-persistent data space, and the snapshot file.

You can use restoreFromSnapshot to restore from the file to the data space before you start your Java program. You would typically do this after an IPL.
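For example, a sketch following the same pattern as the commands above:

java -Xshareclasses:cacheDir=/tmp,name=zosmf,restoreFromSnapshot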

How can I tell what is going on and if shared classes is being used?

The Java option -verbose:dynload,class

reports on the

  • dynamic loading of the files, and processing them,
  • what classes are being processed.

For example

<Loaded java/lang/reflect/AnnotatedElement from /Z24A/usr/lpp/java/J8.0_64/lib/rt.jar>
< Class size 3416; ROM size 2672; debug size 0>
< Read time 1196 usec; Load time 330 usec; Translate time 1541 usec>
class load: java/lang/reflect/AnnotatedElement from: /Z24A/usr/lpp/java/J8.0_64/lib/rt.jar
class load: java/lang/reflect/GenericDeclaration from: /Z24A/usr/lpp/java/J8.0_64/lib/rt.jar

dynload gave

<Loaded java/lang/reflect/AnnotatedElement from /Z24A/usr/lpp/java/J8.0_64/lib/rt.jar>
< Class size 3416; ROM size 2672; debug size 0>
< Read time 1196 usec; Load time 330 usec; Translate time 1541 usec>

This tells you a jar file was read from the file system, and how long it took to process it.

class gave

class load: java/lang/reflect/AnnotatedElement from: /Z24A/usr/lpp/java/J8.0_64/lib/rt.jar
class load: java/lang/reflect/GenericDeclaration from: /Z24A/usr/lpp/java/J8.0_64/lib/rt.jar

This shows two classes were extracted from the jar file.

In a perfect system you will get the class load entries, but not the <Loaded…> entries.

Even when I had a very large cache size, I still got dynload entries. These tended to be loading class files rather than jar files.

For example there was a dynload entry for com/ibm/tcp/ipsec/CaApplicationProperties. This was the file /usr/lpp/zosmf/installableApps/izuCA.ear/izuCA.war/WEB-INF/classes/com/ibm/tcp/ipsec/CaApplicationProperties.class

If you can make these into a .jar file you may get better performance. (But you may not get better performance, as it may take more time to load a large jar file).
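As a sketch (the jar tool's -C option packages the contents of a directory; the output jar name here is hypothetical, and you would also have to change the class path to use the jar, which may not be practical for a product like z/OSMF):

jar -cf /tmp/izuCAclasses.jar -C /usr/lpp/zosmf/installableApps/izuCA.ear/izuCA.war/WEB-INF/classes .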

I also noticed that there was dynload for com/ibm/xml/crypto/IBMXMLCryptoProvider which is in /Z24A/usr/lpp/java/J8.0_64/lib/ext/ibmxmlcrypto.jar, so shared classes has some deeper mysteries!

What happens if the .jar file changes?

As part of the class load, it checks that the signature of the file on disk matches the signature in the data space. If they are different, the data space will be updated.

“Why were the options I passed to Java ignored” – or “how to tell what options were passed to my Java?”

I was struggling to understand a problem with shared classes in Java and I found the options being used by my program were not as I expected.

I thought it would be a very simple task to display the options used at start-up. It may be, but I could not find how to do it. If anyone knows the simple answer please tell me.

I found one way – take a dump! This seems a little extreme, but it was all I could find. With Liberty you can take a javacore dump (F IZUSVR1,JAVACORE) and display it, or you can take a dump at start up.

In the jvm.options I specified

-Xdump:java:events=vmstart

This gave me in //STDERR

JVMDUMP039I Processing dump event "vmstart", detail "" at 2021/05/20 13:19:06 - please wait.
JVMDUMP032I JVM requested Java dump using '/S0W1/var/…/javacore.20210520.131906.65569.0001.txt'
JVMDUMP010I Java dump written to /S0W1/var…/javacore.20210520.131906.65569.0001.txt
JVMDUMP013I Processed dump event "vmstart", detail "".

In this file was a list of all the options passed to the JVM,

1CIUSERARGS UserArgs:
2CIUSERARG -Xoptionsfile=/usr/lpp/java/J8.0_64/lib/s390x/compressedrefs/options.default
2CIUSERARG -Xlockword:mode=default,noLockword=java/lang/String,noLockword=java/util/Ma
2CIUSERARG -Xjcl:jclse29
2CIUSERARG -Djava.home=/usr/lpp/java/J8.0_64
2CIUSERARG -Djava.ext.dirs=/usr/lpp/java/J8.0_64/lib/ext
2CIUSERARG -Xthr:tw=HEAVY
2CIUSERARG -Xshareclasses:name=liberty-%u,nonfatal,cacheDirPerm=1000,cacheDir=…
2CIUSERARG -XX:ShareClassesEnableBCI
2CIUSERARG -Xscmx60m
2CIUSERARG -Xscmaxaot4m
2CIUSERARG -Xdump:java:events=vmstart
2CIUSERARG -Xscminjitdata5m
2CIUSERARG -Xshareclasses:nonFatal
2CIUSERARG -Xshareclasses:groupAccess
2CIUSERARG -Xshareclasses:cacheDirPerm=0777
2CIUSERARG -Xshareclasses:cacheDir=/tmp,name=zosmf2
2CIUSERARG -Xshareclasses:verbose
2CIUSERARG -Xscmx100m

the storage limits

1CIUSERLIMITS User Limits (in bytes except for NOFILE and NPROC)
NULL ------------------------------------------------------------------------
NULL         type            soft limit  hard limit
2CIUSERLIMIT RLIMIT_AS       1831837696   unlimited
2CIUSERLIMIT RLIMIT_CORE        4194304     4194304
2CIUSERLIMIT RLIMIT_CPU       unlimited   unlimited
2CIUSERLIMIT RLIMIT_DATA      unlimited   unlimited
2CIUSERLIMIT RLIMIT_FSIZE     unlimited   unlimited
2CIUSERLIMIT RLIMIT_NOFILE        10000       10000
2CIUSERLIMIT RLIMIT_STACK     unlimited   unlimited
2CIUSERLIMIT RLIMIT_MEMLIMIT 4294967296  4294967296

and environment variables used.

1CIENVVARS Environment Variables
2CIENVVAR LOGNAME=IZUSVR
2CIENVVAR _CEE_ENVFILE_S=DD:STDENV
2CIENVVAR _EDC_ADD_ERRNO2=1
2CIENVVAR _EDC_PTHREAD_YIELD=-2
2CIENVVAR WLP_SKIP_MAXPERMSIZE=true
2CIENVVAR _BPX_SHAREAS=NO
2CIENVVAR JAVA_HOME=/usr/lpp/java/J8.0_64

All interesting stuff including the -X.. parameters. I could see that the parameters I had specified were not being picked up because they were specified higher up! Another face palm moment.

There was a lot more interesting stuff in the file, but this was not relevant to my little problems.

Once z/OSMF was active I took a dump using the f izusvr1,javacore command and looked at the information on the shared classes cache

1SCLTEXTCMST Cache Memory Status
1SCLTEXTCNTD Cache Name      Cache path
2SCLTEXTCMDT sharedcc_IZUSVR /tmp/javasharedresources/..._IZUSVR_G37
2SCLTEXTCPF Cache is 85% full
2SCLTEXTCSZ Cache size = 104857040
2SCLTEXTSMB Softmx bytes = 104857040
2SCLTEXTFRB Free bytes = 14936416

This is where I found the shared cache was not what I was expecting! I originally spotted that the cache was too small – and made it bigger.

Afterwards…

Remember to delete the javacore files.

I removed the -Xdump:java:events=vmstart statement, because I found it more convenient to use the f izusvr1,javacore command to take a dump when needed.

Looking at the performance of Liberty products and what you can do with the information

The Liberty Web Server on z/OS can produce information on the performance of “transactions”. It may not be as you expect, and it may not be worth collecting it. I started looking into this to see why certain transactions were slow on z/OSMF.

This blog post covers

  • How does the web server work?
  • How can I see what is going on?
  • Access logs
  • SMF 120
  • WLM
  • Do I want to use this in production?

How does it work?

A typical web server transaction might use a database to debit your account, and to credit my account. I think of this as a fat transaction because the application does a lot of work.

z/OSMF runs on top of Liberty, allows you to run SDSF and ISPF within a web browser, and has tools to help automate work. You can also use REST services, so you can send a self-contained HTTP request to z/OSMF, for example to request information about data sets, or jobs running on the system. Both of these send a request to z/OSMF, which might send a request to a TSO userid, get the response back, and pass the response back to the requester. I think of this as a thin transaction because the application running on the web server is mainly acting as a router or broker. What the end user sees as a “transaction” may be many micro services – each of which is a REST request.

How can I see what is going on?

You can see what “transactions” or requests have been issued from

  • the access log – a flat file of requests
  • SMF 120 records
  • WLM reports.

Access logs

In the <httpEndpoint … accessLoggingRef="accessLogging"…>, the accessLogging is a reference to an <httpAccessLogging… statement. This creates a flat file containing a record of all inbound HTTP client requests. If you have multiple httpEndpoint statements, you can have multiple accessLogging files. You can control the number and size of the files.

This has information with fields like

  • 10.1.0.2 – the IP address the request came in from
  • COLIN – the userid
  • {16/May/2021:14:47:40 +0000} – the date, time and time zone of the event
  • PUT /zosmf/tsoApp/tso/ping/COLIN-69-aabuaaac – the request
  • HTTP/1.1 – the type of HTTP request
  • 200 – the HTTP return code
  • 78 – the number of bytes sent back.

You can have multiple httpEndpoint definitions, for example specifying different IP address and port combinations. These definitions can point to a shared or individual httpAccessLogging statement, and so can share (or not share) the flat files. This allows you to specify that one port uses access logging, and another port does not.
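As a sketch of the server.xml statements involved (check the attribute names against the Liberty httpAccessLogging documentation; the file path and file count here are illustrative):

<httpEndpoint id="defaultHttpEndpoint" host="*" httpsPort="10443" accessLoggingRef="accessLogging"/>
<httpAccessLogging id="accessLogging" filePath="${server.output.dir}/logs/http_access.log" maxFiles="4"/>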

The SMF 120 records.

The request logging feature writes a record to SMF for each request. You enable it with

<server>
  <featureManager>
    <feature>zosRequestLogging-1.0</feature>
  </featureManager>
</server>

I have not been able to find a formatter for these records from IBM, so I have written my own; it is available on GitHub.

It produces output like

Server version      :2                                                              
System              :S0W1                                                           
Syplex              :ADCDPL                                                         
Server job id       :STC02771                                                       
Server job name     :IZUSVR1                                                        
config dir          :/var/zosmf/configuration/servers/zosmfServer/                  
codelevel           :19.0.0.3  
                                                     
Start time          :2021/05/16 14:42:33.955288                                     
Stop  time          :2021/05/16 14:42:34.040698                                     
Duration in secs    : 0.085410                                                      
WLMTRan             :ZCI4                                                           
Enclave CPU time    : 0.042570                                                      
Enclave ZIIP time   : 0.042570                                                      
WLM Resp time ratio :10.000000                                                      
userid long         :COLIN                                                          
URI                 :/zosmf/IzuUICommon/externalfiles/sdsf/index.html               
CPU Used Total      : 0.040111                                                      
CPU Used on CP      : 0.000000                                                      
CPU Delta Z**P      : 0.040111 
                                                     
WLM Classify type   :URI                                                            
WLM Classify by     :/zosmf/IzuUICommon/externalfiles/sdsf/index.html
WLM Classify type   :Target Host                            
WLM Classify by     :10.1.1.2                               
WLM Classify type   :Target Port                            
WLM Classify by     :10443   
                               
Response bytes      :7003                                   

Origin port         :57706                                  
Origin              :10.1.0.2                                              

This has information about

  • The Liberty instance, where it is running and configuration information
  • Information about the transaction, start time, stop time, duration and CPU time used
  • WLM information about the request. It was classified as
    • URI:…index.html
    • Target Host:10.1.1.2
    • Target port:10443
  • 7003 bytes were sent to the requester
  • the request came from 10.1.0.2 port 57706

The SMF formatter program also summarises the records and this shows there are records for

  • /zosmf/IzuUICommon/externalfiles/sdsf/js/ui/theme/images/zosXcfsCommand_enabled_30x30.png
  • /zosmf/IzuUICommon/externalfiles/sdsf/js/ui/theme/sdsf.css
  • /zosmf/IzuUICommon/externalfiles/sdsf/sdsf.js
  • /zosmf/IzuUICommon/externalfiles/sdsf/IzugDojoCommon.js
  • /zosmf/IzuUICommon/persistence/user/com.ibm.zos.sdsf/JOBS/JOBS

This shows there is a record for each part of the web page: icons, JavaScript files and style sheets.

Starting up SDSF within z/OSMF created 150 SMF records! Refreshing the data just created 1 SMF record. The overhead of creating all the SMF records for one “business transaction” may be too high for production use.

As far as I can tell this configuration is server wide. You cannot enable it for a specific IP address and port combination.

WLM reports

Much of the data produced in the records above can be passed to WLM. This can be used to give threads appropriate priorities, and can produce reports.

You enable WLM support using

<featureManager>
  <feature>zosWlm-1.0</feature>
</featureManager>
<zosWorkloadManager collectionName="MOPZCET"/>
<wlmClassification>
  <httpClassification transactionClass="ZCI6" resource="/zosmf/desktop/"/>
  <httpClassification transactionClass="ZCI1" resource="/**/sdsf/**/*"/>
  <httpClassification transactionClass="ZCI3" resource="/zosmf/webispf//"/>
  <httpClassification transactionClass="ZCI4" resource="/**/*"/>
  <httpClassification transactionClass="ZCI2" resource="IZUGHTTP"/>
  <httpClassification transactionClass="ZCI5" port="10443"/>
</wlmClassification>

The httpClassification statements map a z/OSMF resource to an 8-character transaction class. The records are processed from the top until there is a match. For example, port=10443 would not be used because of the generic resource="/**/*" definition above it.

These need to be mapped into the WLM configuration…

WLM configuration

You can configure WLM through the WLM configuration panels:

Option 6. Classification rules.

Option 3 to modify subsystem type CB (Component Broker).

          -------Qualifier--------                 -------Class--------  
 Action   Type      Name     Start                  Service     Report   
                                          DEFAULTS: STCMDM      TCI2
  ____  1 CN        MOPZCET  ___                    ________    THRU
  ____  2   TC        ZCI1     ___                  SCI1        RCI1
  ____  2   TC        ZCI2     ___                  SCI2        RCI2
  ____  2   TC        ZCI3     ___                  SCI3        RCI3
  ____  2   TC        ZCI4     ___                  SCI4        RCI4
  ____  2   TC        ZCI5     ___                  SCI5        RCI5
  ____  2   TC        ZCI6     ___                  SCI6        RCI6
  ____  2   TC        THRU     ___                  THRU        THRU

For the Type=CN, Name=MOPZCET line, this value ties up with the <zosWorkloadManager collectionName="MOPZCET"/> above. Type=CN is for Collection Name.
For the subtype 2 TC Name ZCI4, this is the Transaction Class, which ties up with an httpClassification transactionClass statement.

The service class SCI* controls the priority of the work; the report class RCI* allows you to produce a report under this name.

If you make a change to the WLM configuration you can save it from the front page of the WLM configuration panels, Utilities-> 1. Install definition, and activate it using Utilities-> 3. Activate service policy.

If you change the statements in Liberty or z/OSMF I believe you have to restart the server.

How to capture the data

The data is written out to SMF records on a timer, or on the SMF end-of-interval broadcast. If you change the interval, SMF sends an end-of-interval broadcast and writes the records to SMF. For example, on my dedicated test system I use the operator command setsmf intval(10) to change the interval to 10 minutes. After the next test, I use setsmf intval(15), etc.

The data is kept in SMF buffers, and you may have to wait for a time before the data is written out to external storage. If SMF data is being produced on a regular basis, it will be flushed out.
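As a sketch of the sequence I used (operator commands; the interval values are just examples):

setsmf intval(10)    set a 10 minute SMF interval before the test
... run the test ...
setsmf intval(15)    changing the interval again drives the end-of-interval broadcast and writes the records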

How to report the data

I copy the SMF data to a temporary data set

//IBMURMF  JOB 1,MSGCLASS=H,RESTART=POST 
//* DUMP THE SMF DATASETS 
// SET SMFPDS=SYS1.S0W1.MAN1 
// SET SMFSDS=SYS1.S0W1.MAN2 
//* 
//SMFDUMP  EXEC PGM=IFASMFDP 
//DUMPINA  DD   DSN=&SMFPDS,DISP=SHR,AMP=('BUFSP=65536') 
//DUMPINB  DD   DSN=&SMFSDS,DISP=SHR,AMP=('BUFSP=65536') 
//* MPOUT  DD   DISP=(NEW,PASS),DSN=&RMF,SPACE=(CYL,(1,1)) 
//DUMPOUT  DD   DISP=SHR,DSN=IBMUSER.RMF,SPACE=(CYL,(1,1)) 
//SYSPRINT DD   SYSOUT=* 
//SYSIN  DD * 
  INDD(DUMPINA,OPTIONS(DUMP)) 
  INDD(DUMPINB,OPTIONS(DUMP)) 
  OUTDD(DUMPOUT,TYPE(70:79)) 
  DATE(2020316,2021284) 
  START(1539) 
  END(2359) 
/* 

and display the report classes

//POST EXEC PGM=ERBRMFPP 
//* PINPUT DD DISP=SHR,DSN=*.SMFDUMP.DUMPOUT 
//MFPINPUT DD DISP=SHR,DSN=IBMUSER.RMF 
//SYSIN DD * 
SYSRPTS(WLMGL(RCPER)) 
/* 

The output in //PPXSRPTS was

REPORT CLASS=RCI4                                                      
-TRANSACTIONS--  TRANS-TIME HHH.MM.SS.FFFFFF  
AVG        0.01  ACTUAL                69449  
MPL        0.01  EXECUTION             68780  
ENDED       160  QUEUED                  668  
END/S      0.13  R/S AFFIN                 0  
#SWAPS        0  INELIGIBLE                0  
EXCTD         0  CONVERSION                0  
                 STD DEV              270428  
                                              
----SERVICE----   SERVICE TIME  ---APPL %---  
IOC           0   CPU    2.977  CP      0.05  
CPU        2551   SRB    0.000  IIPCP   0.05  
MSO           0   RCT    0.000  IIP     0.19  
SRB           0   IIT    0.000  AAPCP   0.00  
TOT        2551   HST    0.000  AAP      N/A  
/SEC          2   IIP    2.338                
ABSRPTN     232   AAP      N/A                
TRX SERV    232                               
                                              

There were 160 “transactions” within the time period, or 0.13 per second. The average response time was 69449 microseconds, with a standard deviation of 270428. This is a very wide standard deviation, so there was a mixture of long and short response times.

The total CPU for this report class was 2.977 seconds of CPU, or 0.019 seconds per “transaction”.

Do I want to use this in production?

I think that the amount of data produced is manageable for a low-usage system. For a production environment where there is a lot of activity, the amount of data produced, and the cost of producing the data, may be excessive. This could be an example where the cost of collecting the data is much larger than the cost of running the workload.

As z/OSMF acts as a broker, passing requests between end points, you may wish just to use your existing reporting structures.

I used z/OSMF to display SDSF data, and set up an ISPF session within the web browser. This created two TSO tasks for my userid. If you include my traditional TSO session and the two from z/OSMF, this is three TSO sessions in total running on the LPAR.

Using jConsole with the z/OS Liberty web server

I wanted to get some monitoring information out of z/OSMF using jConsole on my Ubuntu machine. Eventually this worked, but I had a few problems on the way. The same technique can be used for base Liberty, MQWeb, z/OSMF and Zowe, all of which are based on Liberty.

Configuring z/OSMF

I changed the z/OSMF configuration to include

<featureManager> 
  <feature>restConnector-2.0</feature> 
</featureManager> 

and restarted the server.

In the stdout (or message log) will be something like

CWWKX0103I: The JMX REST connector is running and is available at the following service
URL: service:jmx:rest://sss.com:10443/IBMJMXConnectorREST

You need the URL. The message above gave service:jmx:rest://sss.com:10443/IBMJMXConnectorREST, but I needed to use service:jmx:rest://10.1.1.2:10443/IBMJMXConnectorREST .

The port number came from the httpEndpoint with id="defaultHttpEndpoint". I have another httpEndpoint with port 24993, and this also worked with jConsole.

Set up jConsole

I set up a script for jConsole

k1='-J-Djavax.net.ssl.keyStore=/home/colinpaice/ssl/ssl2/adcdc.p12'
k2='-J-Djavax.net.ssl.keyStorePassword=password'
k3='-J-Djavax.net.ssl.keyStoreType=pkcs12'
t1='-J-Djavax.net.ssl.trustStore=/home/colinpaice/ssl/ssl2/zca.jks'
t2='-J-Djavax.net.ssl.trustStorePassword=password'
t3='-J-Djavax.net.ssl.trustStoreType=jks'
d='-J-Djavax.net.debug=ssl:handshake'
d=' '
de='-debug'
de=' '
s='service:jmx:rest://10.1.1.2:10443/IBMJMXConnectorREST'
jconsole $de $s $k1 $k2 $k3 $t1 $t2 $t3 $d

Where

  • the -J .. parameters are passed through to java,
  • the -Djavax.net.ssl… options are the standard set of parameters to define the key stores on the Linux machine.

Running this script gave a pop up window with

Secure connection failed. Retry insecurely?

The connection to service:jmx:rest://10.1.1.2:10443/IBMJMXConnectorREST could not be made using SSL.

Would you like to try without SSL?

This was because of the exception

java.io.IOException: jmx.remote.credentials not provided.

I could not see how to pass userid and password to jConsole.

I then used Ctrl+N to create a new connection and entered the Username: and Password: which jConsole requires. After a short delay of a few seconds jConsole responded with graphs of Heap Memory Usage, and Threads in use. You can then select from the Measurement Beans.

The TLS setup

In the keystore I had a certificate which I had used to talk to a Liberty instance before.

This certificate was signed, and the CA certificate had been imported into the trust keyring on z/OS for that httpEndpoint.

The server responded with a server certificate (“CN=SERVER,O=SSS,C=GB”) which had been signed on z/OS. The signing certificate had been exported from z/OS and downloaded to Linux.

I created a jks key trust store using this certificate, using the command

keytool -importcert -file temp4ca.pem -keystore zca.jks -storetype jks -storepass password

and used this trust store to validate the server certificate sent down from z/OS.

This worked with jConsole.

I created a pkcs12 keystore using keytool

keytool -importcert -file temp4ca.pem -keystore zca2.p12 -storetype pkcs12 -storepass password

Which also worked.
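Before using a trust store with jConsole it is worth checking that Java can actually list the certificate in it, for example (this is what later showed up the problem described below):

keytool -list -keystore zca.jks -storetype jks -storepass password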

Problems using a .p12 trust store

I used

runmqakm -keydb -create -db zca.p12 -type pkcs12 -pw password
runmqakm -cert -add -file temp4ca.pem -db zca.p12 -type pkcs12 -pw password -label tempca
runmqakm -cert -details -db zca.p12 -type pkcs12 -pw password -label tempca

to create a pkcs12 keystore and import the z/OS CA certificate. The -details option displayed it.

When I tried to use it, jConsole produced the message (after the Ctrl+N)

Secure connection failed. Retry insecurely?

The connection to service:jmx:rest://10.1.1.2:10443/IBMJMXConnectorREST could not be made using SSL.

Would you like to try without SSL?

I used Ctrl-N as before, and got the same message.

Using

d='-J-Djavax.net.debug=ssl:handshake'

and rerunning the script, produced a TLS trace. At the bottom was

VMPanel.connect, handling exception: java.lang.RuntimeException: Unexpected error: java.security.InvalidAlgorithmParameterException: the trustAnchors parameter must be non-empty

%% Invalidated:…
VMPanel.connect, SEND TLSv1.2 ALERT: fatal, description = internal_error

Using a trace at the server gave just the unhelpful “SEND TLSv1.2 ALERT”.

Using openssl also failed. I created the .p12 keystore with

openssl pkcs12 -export -out zca.p12 -in temp4ca.pem -name zCA -nokeys

and reran the jConsole script; it failed in the same way.

Unexpected error: java.security.InvalidAlgorithmParameterException: the trustAnchors parameter must be non-empty

It looks like runmqakm and openssl do not create a valid trust store with an imported certificate.

Additional diagnostics

When the trust store created by keytool was used, at the top of the TLS trace output was

System property jdk.tls.client.cipherSuites is set to ‘null’

Ignoring disabled cipher suite: SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA
trustStore is: /home/colinpaice/ssl/ssl2/zca.p12
trustStore type is : pkcs12

trustStore provider is :
init truststore
adding as trusted cert:
Subject: CN=TEMP4Certification Authority, OU=TEST, O=TEMP
Issuer: CN=TEMP4Certification Authority, OU=TEST, O=TEMP
Algorithm: RSA; Serial number: 0x0
Valid from Tue Jul 14 00:00:00 BST 2020 until Fri Jul 02 23:59:59 BST 2021

keyStore is : /home/colinpaice/ssl/ssl2/adcdc.p12
keyStore type is : pkcs12
keyStore provider is :
init keystore
init keymanager of type SunX509

When the runmqakm or openssl trust store was used, the “adding as trusted cert” entries shown above were missing.

When I used runmqakm to create the pkcs12 keystore

runmqakm -cert -details -db zca.p12 -type p12 -pw password -label tempca

listed the certificate successfully.

When I used keytool to list the contents

keytool -list -keystore zca.p12 -storetype pkcs12 -storepass password
Keystore type: PKCS12
Keystore provider: SunJSSE

Your keystore contains 0 entries

When I created the key store with keytool, both runmqakm and keytool displayed the certificate.

The problem looks like Java is only able to process the imported CA certificates when keytool was used to create the trust store.

Why do they ship java products on z/OS with the handbrake on? And how to take the brake off.

I noticed that it takes seconds to start MQ on my little z/OS machine, but minutes (feels like days) to start anything with the Liberty web server. This includes MQWEB, z/OSMF, and z/OS Connect. I mentioned this to an IBM colleague who asked if I was using Java shared classes. These get loaded into z/OS shared pages.

When I implemented it, my Liberty server came up in half the time!

I found this blog post which was very helpful, and showed me where to look for more information.  I subsequently found this document (from 2006!)

The kindergarten overview of how Java works.

  • You start with a program written in the Java language.
  • When you run this, Java converts it into byte codes
  • These byte codes get converted to native instructions – so a byte code “push onto the stack” may become 8 S/390 assembler instructions.
  • This code can be optimised, for example code which is executed frequently can have the assembler instructions rewritten to go faster.  It might put code inline instead of out in a subroutine.
  • If you are using Java shared classes, this code can be written out and reused by other applications, or if you restart the server, it can reuse what it created before. Reusing the shared classes means that programs benefit because the byte codes have already been converted into native code, and optimisations have been done on the hot code.

What happens on z/OS?

By default, z/OS writes the code to virtual memory and does not save anything to disk. If you restart your Java application within the same IPL, it can exploit the shared classes which have been converted to native code, and optimised – great, good design. I found the second time I started the web server it took half the time. However, I IPL once a day, and start my web server once a day. I do not benefit from having it start faster a second time – as I only started it once per session. By default when you re-IPL, the shared classes code is discarded, and so next time you need the code, it has to be converted to native instructions again, and it loses any optimisation which had been done.

What is the solution?

It is two easy steps:

  1. Tell Java to write the information from memory to disk – to take a snaphot.
  2. After IPL tell Java to load memory from the disk image – to restore a snapshot.

It is as simple as that.

Background.

It is all to do with the java -Xshareclasses.

With your application you tell Java where to store information about the shared classes. It defaults to Cache=/tmp/ name=javasharedresources.

In my jvm.options I overrode the defaults and specified

-Xshareclasses:nonFatal 
-Xshareclasses:groupAccess
-Xshareclasses:cacheDirPerm=0777
-Xshareclasses:cacheDir=/tmp,name=mqweb

If you give each application a name (such as mqweb)  you can isolate the cache to an application and not disrupt another JVM if you change the cache.  For example if you restore from a snapshot, only users of that “name” will be affected.

List what is in the cache

You can use the USS command,

java -Xshareclasses:cacheDir=/tmp/,listAllCaches

I used a batch job to do the same thing.

//IBMJAVA  JOB  1 
// SET V='listAllCaches' 
// SET C='/tmp/' 
//S1       EXEC PGM=BPXBATCH,REGION=0M, 
// PARM='SH java -Xshareclasses:cacheDir=&C,&V' 
//STDERR   DD   SYSOUT=* 
//STDOUT   DD   SYSOUT=*            

The output below shows the cache name is mqweb. Once you have created a snapshot, there is an entry for that too.

Listing all caches in cacheDir /tmp/                                                                          
                                                                                                              
Cache name       level         cache-type      feature         OS shmid       OS semid 
mqweb            Java8 64-bit  non-persistent  cr              8197           4101 

For MQWEB the default parameters are -Xshareclasses:cacheDir=/u/mqweb/servers/.classCache,name=liberty-%u where /u/mqweb is the WLP parameter (where my parameters are defined), and %u is the userid the server is running under, so in my case liberty-START1.

When I had /u/mqweb/servers/.classCache, the total command line was too long for BPXBATCH. (Putting it into STDPARM gave me IEC020I 001-4 on the instream STDPARM because the resolved line was greater than 80 characters.) I resolved this by adding -Xshareclasses:cacheDir=/u/mqweb,name=cache to the jvm.options file.

To take a snapshot


//IBMJAVA  JOB  1 
// SET C='/tmp/' 
// SET N='mqweb' 
// SET V='restoreFromSnapshot' 
// SET V='listAllCaches'
// SET V='snapshotCache' 
//S1       EXEC PGM=BPXBATCH,REGION=0M, 
// PARM='SH java -Xshareclasses:cacheDir=&C,name=&N,&V' 
//STDERR   DD SYSOUT=* 
//STDOUT   DD SYSOUT=* 
//

This job took a few seconds to run.

I believe you have to take the snapshot while your java application is executing – but I do not know for definite.

Restore a snapshot

To restore a snapshot just use restoreFromSnapshot in the above JCL. This took a few seconds to run. 

How to use it.

If you put the restoreFromSnapshot JCL at the start of the web server JCL, it will preload the cache whenever you start your server.

If you take a snapshot every day before shutting down your server, you will get a copy with the latest optimisations.  If you do not take a new snapshot it continues to use the old one.

If you no longer want to use the snapshot you can get rid of it using the destroySnapshot command (destroy gets rid of the cache itself).
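For example, a sketch following the same pattern as the earlier commands:

java -Xshareclasses:cacheDir=/tmp,name=mqweb,destroySnapshot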

Is my cache big enough?

If you use the printStats request you get information like

Current statistics for cache "mqweb":                                                
...                                                                                     
cache size                           = 104857040                                     
softmx bytes                         = 104857040                                     
free bytes                           = 70294788 
...
Cache is 32% full                                     
                                                      
Cache is accessible to current user = true                                                 

The documentation says

When you specify -Xshareclasses without any parameters and without specifying either the -Xscmx or -XX:SharedCacheHardLimit options, a shared classes cache is created with a default size, as follows:

  • For 64-bit platforms, the default size is 300 MB, with a “soft” maximum limit for the initial size of the cache (-Xscmx) of 64MB, …

I had specified -Xscmx100m  which matches the value reported.

What is in the cache?

You can use the printAllStats command.  This displays information like

Classpath

1: 0x00000200259F279C CLASSPATH
/usr/lpp/java/J8.0_64/lib/s390x/compressedrefs/jclSC180/vm.jar
/usr/lpp/java/J8.0_64/lib/se-service.jar
/usr/lpp/java/J8.0_64/lib/math.jar

Methods for a class
  • 0x00000200259F24A4 ROMCLASS: java/util/HashMap at 0x000002001FF7AEB8.
  • ROMMETHOD: size Signature: ()I Address: 0x000002001FF7BA88
  • ROMMETHOD: put Signature: (Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object; Address: 0x000002001FF7BC50

This shows

  • there is a class HashMap. 
  • It has a method size() with no parameters returning an Int.  It is at…. in memory
  • There is another method put(Object o1, Object o2)  returning an Object.  It is at … in memory
Other stuff

There are sections with JITHINTS and other performance related data.

 

Should I Use MQ workflow in z/OSMF or not? Not.

It was a bit windy up here in Orkney, so instead of going out for a blustery walk, I thought I would have a look at the MQ Workflow in z/OSMF; in theory it makes it easier to deploy a queue manager on z/OS. It took a couple of days to get z/OSMF working, a few hours to find there were no instructions on how to use the MQ workflow in z/OSMF, and after a few false starts, I gave up because there were so many problems! I hate giving up on a challenge, but I could not find solutions to the problems I experienced.

I’ve written some instructions below on how I used z/OSMF and the MQ workflow. They may be inaccurate or missing information as I was unable to successfully deploy a queue manager.

I think that the workflow technique could be useful, but not with the current implementation. You could use the MQ stuff as an example of how to do things.

I’ve documented below some of the holes I found.

Overall this approach seems broken.

I came to this subject knowing nothing about it. Maybe I should have gone on a two-week course to find out how to use it. Even if I had had the education, I think it is broken. It may be that better documentation is needed.

For example

  • Rather than simplifying the process, the project owner may have to learn about creating and maintaining the workflows and the auxiliary files the flows use – for example loops in the JCL files.
  • I have an MQ fix in a library, so I have a QFIX library in my steplib.   The generated JCL has just the three SCSQLOAD,SCSQAUTH, SCSQANLE in STEPLIB. I cannot see how to get the QFIX library included. This sort of change should be trivial.
  • If I had upgraded from CSQ913 to CSQ920, I do not see how the workflows get refreshed to point to the new libraries.
  • There are bits missing – setting up the STARTED profile for the queue manager, and specifying the userid.

Documentation and instructions to tell you how to use it.

The MQ documentation points to the program directory, which just lists the contents of the ../zosmf directory. Using your psychic powers you are meant to know that you have to go to the z/OSMF web console and use the Workflows icon. There is a book, the z/OSMF Programming Guide, which tells you how to create a workflow – but not how to use one!

The z/OSMF documentation (which you can access from z/OSMF) is available here. It is a bit like “here is all we know about how to use it”, rather than “baby steps for the new user”. There is a lot to read, and it looks pretty comprehensive. My eyes glazed over after a few minutes reading and so it was hard to see how to get started.

Before you start

You need access to the files in the ZFS.

  • For example /usr/lpp/mqm/V9R1M1/zosmf/provision.xml; this is used without change. It has all of the instructions that the system needs to create the workflow.
  • Copy /usr/lpp/mqm/V9R1M1/zosmf/workflow_variables.properties  to your directory.  You need to edit it and fill in all of the variables between <…>.   There are a lot of parameters to configure!

You need a working z/OSMF where you can logon, use TSO etc.

Using z/OSMF to use the work flow

When I logged on to z/OSMF the first time, I had a blank  web page. I played around with the things at the bottom left of the screen, and then I got a lot of large icons.  These look a bit of a mess – rather than autoflow them to fill the screen, you either need to scroll sideways or make your screen very wide (18 inches wide).  These icons do not seem to be in any logical order.  It starts with “Workflows”; “SDSF” is a distance away from “SDSF settings”.  I could not see how to make small icons, or to sort them.

I selected “Classic” from the option by my userid – this was much more usable; it had a compact list of actions, grouped sensibly, down the side of the screen.

Using “workflow” to define MQ queue managers.

Baby steps instructions to get you started are below.

  • Click on the Workflows icon.
  • From the actions pull down, select Create Workflow.
  • In the workflow definition file enter /usr/lpp/mqm/V9R1M1/zosmf/provision.xml  or what workflow file you plan to use.
  • In the Workflow variable input file, enter the name of your edited workflow_variables.properties file.  Once you have selected this the system copies the content. To use a different file, or update the file,  you have to create a new Workflow.  If you update it, any workflows based on it do not get updated.
  • From the System pull down, select the system this is for.
  • Click ‘next’.   This will validate the variables it will be using.  Variables it does not use, do not get checked.
    • It complained that CSQ_ARC_LOG_PFX = <MQ.ARC.LOG.PFX> had not been filled in
    • I changed it to CSQ_ARC_LOG_PFX = MQ.ARCLOG – and it still complained
    • It would only allow CSQ_ARC_LOG_PFX = MQ; a pity, as my standards were MQDATA.ARCHIVE.queue_manager etc.
  • Once the input files have been validated it displays a window “Create Workflow”.  Tick the box “Assign all steps to owner userid”.  You can (re-)assign them to people later.
  • Click Finish.
  • It displays “Procedure to provision a MQ for zOS Queue manager 0 Workflow_o” and lists all of the steps – all 22 of them.
  • You are meant to do the steps in order.   The first step has State “Ready”, the rest are “Not ready”.
  • I found the interface unreliable.  For example
    • Right click on the Title of the first item.  Choose “Assignment And Ownership”.  All of the items are greyed out and cannot be selected.
    • If you click the tick box in the first column, click on “Actions” pull down above the workflow steps.  Select “Assignment And Ownership”.  You can now  assign the item to someone else.
      • If you select this, you get the “Add Assignees” panel.  By default it lists groups.  If you go to the “Actions” pull down, you can add a SAF userid or group.
      • Select the userids or groups and use the “Add >” to assign the task to them.
  • With the list of tasks, you can pick “select all”, and assign them;  go to actions pull down, select Assignment And Ownership, and add the userids or groups.
  • Once you are back at the workflow you have to accept the task(s).   Select those you are interested in, and pick Actions -> Accept. 
  • Single left click on the Title – “Specify Queue Manager Criteria”.  It displays a tabbed pane with tabs
    • General – a description of the task
    • Details – it says who the task has been assigned to  and its status.
    • Notes – you can add items to this
    • Perform – this actually does the task.
  • Click on the “Perform” tab.  If this is greyed out, it may not have been assigned to you, or you may not have accepted it.
    • It gives a choice of environments, eg TEST, Production.  Select one
    • The command prefix eg !ZCT1.
    • The SSID.
    • Click “Next”.  It gives you Review Instructions; click Finish.
  • You get back to the list of tasks.
    • Task 1 is now marked as Complete.
    • Task 2 is now ‘ready’.
    • Left single click task 2 “Validate the Software Service Instance Name Length”.
    • The Dependencies tab now has an item “Task 1 complete”.  This is all looking good.
  • Note: Instead of going into each task, sometimes you can use the Action -> perform to go straight there – but usually not.
  • Click on Perform
    • Click next
    • It displays some JCL which you can change, click Next. 
    • It displays “review JCL”.
      • It displays Maximum Record Length 1024.  This failed for me – I changed it to 80, and it still used 1024!
    • Click Next..  When I clicked Finish, I got “The request cannot be completed because an error occurred. The following error data is returned: “IZUG476E:The HTTP request to the secondary z/OSMF instance “S0W1” failed with error type “HttpConnectionFailed” and response code “0” .”  The customising book mentions this, and you get it if you use AT-TLS – which I was not using.  It may be caused by not having the IP address in my Linux /etc/hosts file.  Later, I added the address to the /etc/hosts file on my laptop, restarted z/OSMF and it worked.
    • I unticked “Submit JCL”, and ticked “Save JCL”.  I saved it in a Unix file, and clicked finish.  It does not save the location so you have to type it in every time (or keep it in your clipboard), so not very usable.
    • Although I had specified  Maximum Record Length of 80, it still had 1024.  I submitted the job and it complained with “IEB311I CONFLICTING DCB PARAMETERS”.   I edited to change 1024 to 80 in the SPACE and DCB, and submitted it.  The JCL then worked.
    • When the REXX ran, it failed with …The workflow Software Service Instance Name (${_workflow-softwareServiceInstanceName})…. The substitution had not been done. I don’t know why – but it means you cannot do any work with it.
    • When I tried another step, this also had no customisation done, so I gave up.
    • Sometimes when I ran this the “Finish” button stayed greyed out, so I was unable to complete the step.  After I shut down z/OSMF and restarted it, it usually fixed it.
  • I looked at the job “Define MQ Queue Manager Security Permissions” – this job creates a profile to disable MQ security – it did not define the security permissions for normal use.
  • I tried the step to Dynamically allocate a port for the MQ chin.  I got the same IZUG476E error as before.   I fixed my IP address, and got another error. It received status 404 from the REST request.   In the /var/zosmfcp/data/logs/zosmfServer/logs/messages.log  I had  SRVE0190E: File not found: /resource-mgmt/rest/1.0/rdp/network/port/actions/obtain.  For more information on getting a port see here.

Many things did not work, so I gave up.

Put messages to a queue workflow. 

I tried this, and had a little (very little) more success.

As before I did

  • Workflows
  • Actions pull down
  • Create workflow.   I used
    • /usr/lpp/mqm/V9R1M1/zosmf/putQueue.xml
    • and the same variable input file
  • Lots of clicks – including  Assign all steps to owner userid
  • Click Finish.   This produced a workflow with one step!
  • Left click on the step.  Go to Perform.  This lists
    • Subsystem ID
    • Queue name
    • Message_data
    • Number of messages.
  • Click Next.
  • Click Next, this shows the job card
  • Click Next,  this shows the job.
  • Click Next.  It has “Submit JCL” ticked.  Click Finish.   This time it managed to submit the JCL successfully!
  • After several seconds it displays the “Status” page, and after some more seconds, it displays the job output.
  • There is a tabbed panel with tabs for JESMSGLG, JESJCL, JESYSMSG,SYSPRINT.
  • I had a JCL error – I had made a mistake in the  MQ libraries High level qualifier.
  • I updated my workflow_variables.properties file, but I could not find a way of telling the workflow to use the updated variable properties file. To solve this I had to
    • Go back to the Workflows screen where it lists the workflows I have created. 
    • Delete the workflow instance.
    • Create a new workflow instance, which picks up the changed file
    • I then deployed it and “performed” the change, and the JCL worked.
    • I would have preferred a quick edit to some JCL and resubmit the job, rather than the relatively long process I had to follow.
  • If this had happened during the deploy a queue manager workflow this would have been really difficult.   There is no “Undo Step”, so I think you would have had to create  the De-provision workflow – which would have failed because many of the steps would not have been done, or you delete the provision workflow, fix the script and redo all the steps (or skip them).

If this all worked, would I use it?

There were too many clicks for my liking.  It feels like trying to simplify things has made it more complex.  There are many more things that could go wrong – and many did go wrong, and it was hard to fix.  Some problems I could not fix.  I think these work flows are provided as an example to the customer.  Good luck with it!

A practical guide to getting z/OSMF working.

Every product using Liberty seems to have a different way of configuring the product. At first I thought specifying parameters to z/OSMF was great, as you do it via a SYS1.PARMLIB member. Then I found you have other files to configure; then I found that I could not reuse my existing definitions from z/OS Connect and MQWEB. Then I found it erases any local copy of the server.xml file each time, and recreates it from the configuration files, pointing at the server.xml shipped with the product. Later I used this to my advantage. Once again I seemed to be struggling against the product to do simple things. Having gone through the pain, and learnt how to configure z/OSMF, its configuration is OK.

You specify some parameters in the SYS1.PARMLIB(IZUPRMxx) concatenation. See here for the syntax and the list of parameters. In mid-2020 there was an APAR, PH24088, which allowed you to change these parameters dynamically, using a command such as:

SETIZU ILUNIT=SYSDA

Before you start.

I had many problems with certificates before I could successfully logon to z/OSMF.

I initially found it hard to understand where to specify configuration options, as I was expecting to specify them in the server.xml file. See z/OSMF configuration options mapping for the list of options you can specify.

If you change the options you have to restart the server.   Other systems that use Liberty have a refresh option which tells the system to reread the server.xml file.   z/OSMF stores the variables in variable strings in the bootstrap.options file, and I could not find a refresh command which refreshed the data.   (There is a refresh command which does not refresh.)  See z/OSMF commands.

 Define the userid

I used a userid with a home directory /var/zosmfcp/data/home/izusvr.   I had to issue

mkdir /var/zosmfcp
chown izusvr /var/zosmfcp

mkdir -p /var/zosmfcp/data/home/izusvr
chown izusvr /var/zosmfcp/data/home/izusvr

touch /var/zosmfcp/configuration/local_override.cfg 
chmod g+r /var/zosmfcp/configuration/local_override.cfg
chown :IZUADMIN /var/zosmfcp/configuration/local_override.cfg

Getting the digital certificate right.

I had problems using the provided certificate definitions.

  1. The CN did not match what I expected.
  2. There was no ALTNAME specified. For example ALTNAME(IP(10.1.1.2)) (where 10.1.1.2 was the external IP address of my z/OS). The browser complained because it was not acceptable. An ALTNAME must match the IP address or host name the data came from. Without a valid ALTNAME you can get into a logon loop. Using Chrome I got
    1. “Your connection is not private”. Common name invalid.
    2. Click on Advanced and “proceed to..  (unsafe)”
    3. Enter userid and password.
    4. I had the display with all of the icons. Click on the pull-up by the userid and switch to “classic interface”
    5. I got “Your connection is not private”. Common name invalid, and round the loop again.
  3. The keystore is also used as the trust store, so it needs the Certificate Authority’s certificates. Other products using Liberty use a separate trust store. (The keystore contains the certificate the server uses to identify itself. The trust store contains the certificates, such as Certificate Authority certificates, to validate certificates sent from clients to the server.) With z/OSMF there is no definition for the trust store. To make the keystore work as a trust store, the keystore needs:
    1. the CA for the server (z/OSMF talks to itself over TLS); each end of the conversation within the server needs it to validate the server’s certificate.
    2. the CA for any certificates in any browsers being used.
    3. I had to use the following statements to convert my keystore to add the trust store entries.
      RACDCERT ID(IZUSVR) CONNECT(CERTAUTH -
      LABEL('MVS-CA') RING(KEY) )

      RACDCERT ID(IZUSVR) CONNECT(CERTAUTH -
      LABEL('Linux-CA2') -
      RING(KEY ) USAGE(CERTAUTH))

Reusing my existing keyring

Eventually I got this to work.  I had to…

    1. Connect the CA of the z/OS server into the keyring.
    2. Update /var/zosmfcp/configuration/local_override.cfg for ring //START1/KEY2:

KEYRING_NAME=KEY2
KEYRING_OWNER_USERID=START1
KEYRING_TYPE=JCERACFKS
KEYRING_PASSWORD=password

The z/OSMF started task userid requests CONTROL access to the keyring. 

It requests CONTROL access (RACF reports this!), but UPDATE access seems to work. See RACF: Sharing one certificate among multiple servers.

With only READ access I got:

CWWKO0801E: Unable to initialize SSL connection. Unauthorized access was denied or security settings have expired. Exception is javax.net.ssl.SSLException: Received fatal alert: certificate_unknown

If it does not have UPDATE access, then z/OSMF cannot see the private certificate.

Use the correct keystore type. 

My RACF KEYRING keystore only worked when I had a keystore type of JCERACFKS.  I specified it in /var/zosmf/configuration/local_override.cfg

KEYSTORE_TYPE=JCERACFKS 

Before starting the server the first time

If you specify TRACE='Y', either in the procedure or as part of the start command, it traces the creation of the configuration file, and turns on the debug script. TRACE='X' gives a bash trace as well.

It looks like the name of the server is hard coded internally as zosmfServer, and the value in the JCL PARMS= is ignored.

Once it has started

If you do not get the logon screen, you may have to wait. Running z/OS under Ubuntu on my laptop is normally fine for editing etc., but it takes about 10 minutes to start z/OSMF. If you get problems, with incomplete data displayed, or messages saying resources were not found, wait and see if they get resolved. Sometimes I had to close and restart my browser.

Useful to know…

  1. Some, but not all, error messages are in the STDERR output from the started task.
  2. The logs and trace are in /var/zosmfcp/data/logs/zosmfServer/logs/. Other products using Liberty keep a server's logs under the server directory itself, so all of that server's files are in one place.
  3. The configuration is in /var/zosmfcp/configuration and /var/zosmfcp/configuration/servers/zosmfServer.   This is a different directory tree layout from other Liberty components.
  4. If you want to add one JVM option, edit the local_override.cfg and add   JVM_OPTIONS=-Doption=value.  See here if you want to add more options.  I used JVM_OPTIONS=-Djavax.net.debug=ssl:handshake to give me the trace of the TLS handshake.
  5. If you have problems with a certificate not being found on z/OS, you might issue the SETROPTS command to be 100% sure that what is defined to RACF is active in RACF. Use SETROPTS RACLIST(DIGTCERT,DIGTRING,RDATALIB) REFRESH.
  6. Check keyring contents using racdcert listring(KEY2) id(START1)
  7. Check certificate status using RACDCERT LIST(LABEL('IZUZZZZ2')) ID(START1) and check it has status TRUST and the dates are valid.
  8. If your browser is not working as expected – restart it.

Tailoring your web browser environment.

Some requests, for example workflow, use a REST request to perform an action. Although I had specified a HOSTNAME of 10.1.1.2, z/OSMF used an internal address of S0W1.DAL-EBIS.IHOST.COM. When the REST request was issued from my browser, it could not find the back-end z/OS. I had to add this site to my Linux /etc/hosts:

10.1.1.2   S0W1.DAL-EBIS.IHOST.COM 

Even after I had resolved the TCPIP problem on z/OS which caused this strange name to be used, z/OSMF continued to use it. When I recreated the environment the problem was resolved. I could not see this value in any of the configuration files.

Getting round the configuration problems

I initially found it hard to specify additional configuration options to override what was provided by z/OSMF, for example to reuse what I had from other Liberty servers.

I changed the server.xml file to include

<include optional="false" location="${server.config.dir}/colin.xml"/> 

This allowed me to put things in the colin.xml file. I’m sure this solution is not recommended by IBM, but it is a practical solution. This file may be read-only to you.
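As a sketch of what such an include file can contain (the content is purely illustrative – any Liberty configuration elements you want to add or override can go inside the <server> element):

<server>
    <!-- illustrative only: for example, extra trace -->
    <logging traceSpecification="*=info"/>
</server>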

You should not need this solution if you can use the z/OSMF configuration options mapping.