Why do they ship Java products on z/OS with the handbrake on? And how to take the brake off.

I noticed that it takes seconds to start MQ on my little z/OS machine, but minutes (feels like days) to start anything with the Liberty web server.  This includes MQWEB, z/OSMF, and z/OS Connect.  I mentioned this to an IBM colleague who asked if I was using Java shared classes.  These get loaded into z/OS shared pages.

When I implemented it, my Liberty server came up in half the time!

I found this blog post which was very helpful, and showed me where to look for more information.  I subsequently found this document (from 2006!)

The kindergarten overview of how Java works.

  • You start with a program written in the Java language.
  • The Java compiler converts this into byte codes.
  • When you run the program, these byte codes get converted to native instructions, so a byte code “push onto the stack” may become 8 390 assembler instructions.
  • This code can be optimised, for example code which is executed frequently can have the assembler instructions rewritten to go faster.  It might put code inline instead of out in a subroutine.
  • If you are using Java shared classes, this code can be written out and reused by other applications, or if you restart the server, it can reuse what it created before.  Reusing the shared classes means that programs benefit because the byte codes have already been converted into native code, and optimisations have been done on the hot code.

What happens on z/OS?

By default on z/OS, the code is written to virtual memory and nothing is saved to disk.  If you restart your Java application within the same IPL, it can exploit the shared classes which have been converted to native code and optimised. Great, good design.  I found the second time I started the web server it took half the time.  However, I IPL once a day and start my web server once a day, so I do not benefit from it starting faster a second time, as I only start it once per IPL.  By default when you re-IPL, the shared classes code is discarded, so the next time you need the code it has to be converted to native instructions again, and it loses any optimisation which had been done.

What is the solution?

It is two easy steps:

  1. Tell Java to write the information from memory to disk, to take a snapshot.
  2. After IPL tell Java to load memory from the disk image – to restore a snapshot.

It is as simple as that.

Background.

It is all to do with the Java -Xshareclasses option.

With your application you tell Java where to store information about the shared classes.  By default this goes under /tmp, in a directory called javasharedresources.

In my jvm.options I overrode the defaults and specified

-Xshareclasses:nonFatal 
-Xshareclasses:groupAccess
-Xshareclasses:cacheDirPerm=0777
-Xshareclasses:cacheDir=/tmp,name=mqweb

If you give each application a name (such as mqweb)  you can isolate the cache to an application and not disrupt another JVM if you change the cache.  For example if you restore from a snapshot, only users of that “name” will be affected.

List what is in the cache

You can use the USS command,

java -Xshareclasses:cacheDir=/tmp/,listAllCaches

I used a batch job to do the same thing.

//IBMJAVA  JOB  1 
// SET V='listAllCaches' 
// SET C='/tmp/' 
//S1       EXEC PGM=BPXBATCH,REGION=0M, 
// PARM='SH java -Xshareclasses:cacheDir=&C,&V' 
//STDERR   DD   SYSOUT=* 
//STDOUT   DD   SYSOUT=*            

The output below shows the cache name is mqweb.  Once you have created a snapshot, the listing has an entry for that as well.

Listing all caches in cacheDir /tmp/                                                                          
                                                                                                              
Cache name       level         cache-type      feature         OS shmid       OS semid 
mqweb            Java8 64-bit  non-persistent  cr              8197           4101 
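
A script can use the same listing to check whether the cache already exists, for example before deciding whether a restore is needed. The function below is only a sketch: has_live_cache is a made-up name, and it assumes the listing keeps the column layout shown above, with the cache name in the first column and the word non-persistent on the same line.

```shell
# Hypothetical helper, not part of Java or z/OS.
# Reads listAllCaches output on standard input and succeeds (exit 0)
# if a non-persistent cache with the given name is listed.
has_live_cache() {
    awk -v name="$1" '
        $1 == name && /non-persistent/ { found = 1 }
        END { exit !found }'
}

# Example use (needs a JVM, so it is left as a comment here):
#   java -Xshareclasses:cacheDir=/tmp/,listAllCaches 2>&1 | has_live_cache mqweb \
#       && echo "cache mqweb is already in memory"
```

Both standard output and standard error are piped into the function, so it does not matter which of the two the JVM writes the listing to.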

For MQWEB the default parameters are -Xshareclasses:cacheDir=/u/mqweb/servers/.classCache,name=liberty-%u where /u/mqweb is the WLP parameter, where my parameters are defined, and %u is the userid the server is running under, so in my case the name is liberty-START1.

When I had /u/mqweb/servers/.classCache, the total command line was too long for BPXBATCH.  (Putting it into STDPARM gave me IEC020I 001-4 on the instream STDPARM because the resolved line was greater than 80 characters.)  I resolved this by adding -Xshareclasses:cacheDir=/u/mqweb,name=cache to the jvm.options file.

To take a snapshot


//IBMJAVA  JOB  1 
// SET C='/tmp/' 
// SET N='mqweb' 
// SET V='restoreFromSnapshot' 
// SET V='listAllCaches'
// SET V='snapshotCache'
//S1       EXEC PGM=BPXBATCH,REGION=0M, 
// PARM='SH java -Xshareclasses:cacheDir=&C,name=&N,&V' 
//STDERR   DD   SYSOUT=* 
//STDOUT   DD   SYSOUT=* 
//

This job took a few seconds to run.

I believe you have to take the snapshot while your java application is executing – but I do not know for definite.

Restore a snapshot

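To restore a snapshot, use restoreFromSnapshot in the above JCL.  The last SET V statement is the one that takes effect, so make SET V='restoreFromSnapshot' the last one.  This took a few seconds to run.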

How to use it.

If you put the restoreFromSnapshot JCL as a step at the start of the web server, the cache will be preloaded whenever you start your server.

If you take a snapshot every day before shutting down your server, you will get a copy with the latest optimisations.  If you do not take a new snapshot it continues to use the old one.

If you no longer want to use the snapshot, you can get rid of it using the destroySnapshot option.
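
The daily cycle described above can be sketched as a small USS shell helper. This is an illustration only: shareclasses_cmd and daily_cycle are made-up names, the directory and cache name are the ones used earlier in this post, and the functions print the java command rather than run it, so you can check it before using it.

```shell
#!/bin/sh
# Assumed values: they must match cacheDir and name in your jvm.options.
CACHE_DIR=/tmp
CACHE_NAME=mqweb

# Hypothetical helper: print the java command for one -Xshareclasses sub-option.
shareclasses_cmd() {
    printf 'java -Xshareclasses:cacheDir=%s,name=%s,%s\n' \
        "$CACHE_DIR" "$CACHE_NAME" "$1"
}

# Hypothetical helper: print the command for one phase of the daily cycle.
daily_cycle() {
    case "$1" in
        after-ipl)       shareclasses_cmd restoreFromSnapshot ;;  # load the cache from disk
        before-shutdown) shareclasses_cmd snapshotCache ;;        # save the cache to disk
        *) echo "usage: daily_cycle after-ipl|before-shutdown" >&2
           return 1 ;;
    esac
}
```

Running daily_cycle after-ipl prints the restore command. Piping the output into sh (daily_cycle after-ipl | sh) would execute it, in the same way as the BPXBATCH job.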

Is my cache big enough?

If you use the printStats request you get information like

Current statistics for cache "mqweb":                                                
...                                                                                     
cache size                           = 104857040                                     
softmx bytes                         = 104857040                                     
free bytes                           = 70294788 
...
Cache is 32% full                                     
                                                      
Cache is accessible to current user = true                                                 

The documentation says

When you specify -Xshareclasses without any parameters and without specifying either the -Xscmx or -XX:SharedCacheHardLimit options, a shared classes cache is created with a default size, as follows:

  • For 64-bit platforms, the default size is 300 MB, with a “soft” maximum limit for the initial size of the cache (-Xscmx) of 64MB, …

I had specified -Xscmx100m, which matches the value reported (100 MB is 104857600 bytes, and the reported size is 560 bytes less).
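
The "32% full" figure can be reproduced from the cache size and free bytes lines. The sketch below assumes the printStats output has the "name = value" layout shown above. cache_pct_full is a made-up name, and truncating rather than rounding is an assumption that happens to match the figures here.

```shell
# Hypothetical helper: read printStats output on standard input and
# print the percentage of the cache that is in use, truncated to an integer.
cache_pct_full() {
    awk -F= '
        /^[ \t]*cache size/ { size = $2 + 0 }   # total bytes in the cache
        /^[ \t]*free bytes/ { free = $2 + 0 }   # bytes not yet used
        END {
            if (size > 0) printf "%d\n", (size - free) * 100 / size
        }'
}

# Example use (needs a JVM, so it is left as a comment here):
#   java -Xshareclasses:cacheDir=/tmp/,name=mqweb,printStats 2>&1 | cache_pct_full
```

With the numbers above, (104857040 - 70294788) * 100 / 104857040 is 32.96, which truncates to the 32 that printStats reports.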

What is in the cache?

You can use the printAllStats command.  This displays information like

Classpath

1: 0x00000200259F279C CLASSPATH
/usr/lpp/java/J8.0_64/lib/s390x/compressedrefs/jclSC180/vm.jar
/usr/lpp/java/J8.0_64/lib/se-service.jar
/usr/lpp/java/J8.0_64/lib/math.jar

Methods for a class
  • 0x00000200259F24A4 ROMCLASS: java/util/HashMap at 0x000002001FF7AEB8.
  • ROMMETHOD: size Signature: ()I Address: 0x000002001FF7BA88
  • ROMMETHOD: put Signature: (Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object; Address: 0x000002001FF7BC50

This shows

  • there is a class HashMap. 
  • It has a method size() with no parameters returning an int.  It is at…. in memory
  • There is another method put(Object o1, Object o2)  returning an Object.  It is at … in memory

Other stuff

There are sections with JITHINTS and other performance related data.

 

2 thoughts on “Why do they ship Java products on z/OS with the handbrake on? And how to take the brake off.”

  1. “then the total command line was too long for BPXBATCH” If you can break the long string up, there are two techniques for handling this:

    1) use a concatenation of USS variables
    //*
    // EXPORT SYMLIST=*
    // SET P1='/high-dir'
    // SET P2='/next-dir'
    //*
    //BPX EXEC PGM=BPXBATCH
    //STDOUT DD SYSOUT=*,LRECL=134,RECFM=VB
    //PP DD SYSOUT=*
    //STDPARM DD *,SYMBOLS=(JCLONLY,PP)
    SH set -vx;
    A='&P1';
    B='&P2';
    ls -lR $A$B;
    /*

    2) use a concatenation in a PARMDD
    //*
    //BPX EXEC PGM=BPXBATCH,PARMDD=SYSIN
    //STDOUT DD SYSOUT=*,LRECL=134,RECFM=VB
    //SYSIN DD *,SYMBOLS=JCLONLY
    SH set -vx; ls -lR
    &P1
    &P2;
    /*

    The ‘1’ in &P1 is in col 72 – this is to demonstrate that the symbolic is expanded – even though it expands past col 72 (the limit for the PARMDD) and is pre-pended to the expansion of &P2.
