I was running z/OSMF and saw that the CPU costs were high when it was sitting there doing nothing. I managed to reduce the CPU costs by more than half. This would apply to other Liberty-based web servers, such as MQWEB and z/OS Connect.
I could see from the MVS system trace that there was a lot of activity creating and deleting threads, and a lot of cost associated with these activities, such as allocating and freeing storage.
I increased the number of threads so that this thread creation and deletion activity disappeared.
In the XML configuration file (based on server.xml) the default was
<executor name="LargeThreadPool" id="default" coreThreads="100"
maxThreads="0" keepAlive="60s" stealPolicy="STRICT"
I changed this to
<executor name="LargeThreadPool" id="default"
coreThreads="300" maxThreads="600" keepAlive="60s"
stealPolicy="STRICT" rejectedWorkPolicy="CALLER_RUNS" />
and restarted the server.
The options are documented here. There is an option, keepAlive, which defaults to 60 seconds. If a thread has been idle for this long, it becomes a candidate to be freed, to reduce the pool back to the coreThreads size.
I was alerted to this problem when I looked at an MVS system trace. This is described here.
There is a discussion of how Sun thread pools work in this post. It is not obvious. This may or may not be how this executor works.
What value should you use?
This is a hard question, as Liberty does not provide this information directly.
I used the Health Checker, which connects from Eclipse to the JVM and extracts information about the JVM and applications.
This showed that at rest there was a lot of activity. I increased coreThreads to 250, restarted the server, and got
So better … but still some activity. I increased it to 300 threads, and the graph was flat.
I set up USER.Z24A.PROCLIB(CEEOPT) with
in my z/OSMF job I had
//CEEOPTS DD DISP=SHR,DSN=USER.Z24A.PROCLIB(CEEOPT)
This printed out a lot of useful information about the stack and heap usage. At the bottom it said
Largest number of threads concurrently active: 397
The number of threads includes threads from the pool I had specified, plus other threads that z/OSMF creates. The health check showed there were 372 threads, even though coreThreads was set to "300".
I also used jconsole to display information about the highest thread usage. The URL was service:jmx:rest://10.1.1.2:10443/IBMJMXConnectorREST. It displays peak threads and live threads.
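The peak and live thread figures that jconsole displays come from the JVM's standard Threading MXBean. As a minimal sketch, the same two numbers can be read from inside a JVM with the java.lang.management API. The class name ThreadPeakProbe and the method liveAndPeak are made up for illustration, and this reads the local JVM only; it does not connect through the REST connector URL above.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class ThreadPeakProbe {

    /** Returns {live thread count, peak thread count} for the JVM this code runs in. */
    public static int[] liveAndPeak() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        int live = threads.getThreadCount();      // threads alive right now
        int peak = threads.getPeakThreadCount();  // highest count since JVM start (or last reset)
        return new int[] { live, peak };
    }

    public static void main(String[] args) {
        int[] counts = liveAndPeak();
        System.out.println("live threads: " + counts[0] + ", peak threads: " + counts[1]);
    }
}
```

If the peak is well above coreThreads, as with the 372 and 397 figures above, the pool has probably been growing beyond its core size at some point.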
I found the security of both jconsole and the health check was weak (userid and password). I was unable to set up a TLS certificate logon to the server.
The information from rptstg was only available at shutdown.
Why does increasing the number of threads reduce the CPU when idle?
The thread pool has logic to remove unused threads and shrink it to the coreThreads size. If the pool size is too small it has to create threads and delete threads according to the load. See here. The keepAlive mentioned at the top is how long a thread can be idle for, before it can be considered a candidate for deletion.
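The create-and-delete cycle described above can be sketched with the standard java.util.concurrent.ThreadPoolExecutor. This is a stand-in, not the Liberty executor, which may or may not behave the same way. The class ThreadChurnDemo, the method threadsCreated, and all the sizes and timings are assumptions chosen for illustration. The sketch sends bursts of short tasks with idle gaps longer than the keep-alive time, and counts how many threads the pool had to create.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadChurnDemo {

    /** Runs bursts of short tasks and returns how many threads the pool created in total. */
    public static int threadsCreated(int coreThreads, int maxThreads, long keepAliveMs,
                                     int bursts, int tasksPerBurst, long idleGapMs) {
        AtomicInteger created = new AtomicInteger();
        // Count every thread the pool asks for.
        ThreadFactory countingFactory = runnable -> {
            created.incrementAndGet();
            Thread t = new Thread(runnable);
            t.setDaemon(true);
            return t;
        };
        // SynchronousQueue hands work directly to an idle thread; if none is idle,
        // the pool creates a new thread (up to maxThreads), else the caller runs the task.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                coreThreads, maxThreads, keepAliveMs, TimeUnit.MILLISECONDS,
                new SynchronousQueue<>(), countingFactory,
                new ThreadPoolExecutor.CallerRunsPolicy());
        try {
            for (int b = 0; b < bursts; b++) {
                CountDownLatch done = new CountDownLatch(tasksPerBurst);
                for (int i = 0; i < tasksPerBurst; i++) {
                    pool.execute(() -> {
                        try {
                            Thread.sleep(20); // simulate a short piece of work
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        } finally {
                            done.countDown();
                        }
                    });
                }
                done.await();
                // Idle for longer than keepAlive so threads above the core size are freed.
                Thread.sleep(idleGapMs);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while running bursts", e);
        } finally {
            pool.shutdown();
        }
        return created.get();
    }

    public static void main(String[] args) {
        // Core size smaller than the burst: threads are created and freed on every burst.
        System.out.println("core=2: threads created = " + threadsCreated(2, 8, 50, 4, 8, 200));
        // Core size covers the burst: threads are created once and then reused.
        System.out.println("core=8: threads created = " + threadsCreated(8, 8, 50, 4, 8, 200));
    }
}
```

With the core size below the burst size, the count keeps growing with each burst, because the extra threads are freed in every idle gap and have to be created again. With the core size covering the burst, the count stops at the core size.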
Monitor the CPU used when idle and see if increasing the threadpool to 300 helps.