I’m sorry I haven’t a clue…

As well as being a very popular British comedy, it is how I sometimes feel about what is happening inside the Liberty Web servers, and products like z/OSMF, z/OS Connect and MQWEB. It feels like a spacecraft in cartoons – there are usually only two controls – start and stop.

One reason for this is that the developers often do not have to use the product in production, and have not sat there, head in hands, asking “what is going on?”.

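In this post I’ll cover

  • What data values to expose
  • What attributes to expose
  • How to expose data
  • How do I build up the list of what is needed?
  • What can I use to display the data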

What data values to expose

As a concept, if you give someone a lever to pull – you need to give them a way of seeing the effect of pulling the lever.

If you give someone a tuning parameter, they need to know the impact of using the tuning parameter. For example

  • you implement a pool of blocks of storage.
  • you can configure the maximum number of blocks in the pool
  • if a thread needs some storage, and there is a free block in the pool, then assign the block to the thread. When the thread has finished with it, the block goes back into the pool.
  • if all the blocks in the pool are in use, allocate a new block. When the thread has finished with the block – free it.
  • if you specify a very large number of blocks it could cause a storage shortage

The big question with this example is “how big do you make the pool?”

To be able to specify the correct pool size you need to know information like

  • What was the maximum number of blocks used – in total
  • How many times were additional blocks allocated (and freed)
  • What was the total number of blocks requested.

You might decide that the pool is big enough if less than 1% of requests had to allocate a block.

If you find that the maximum value used was 1% of the size of the pool, you can make the pool much smaller.

If you find that 99% of the requests had to allocate and free a block, this indicates the pool is much too small and you need to increase the size.
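The pool and the three counters above could be sketched in Java as follows. This is only an illustration: the class name BlockPool, its method names, and the choice to pre-allocate the blocks are assumptions, not taken from any product.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical pool of storage blocks that counts how it is used. */
class BlockPool {
    private final int maxPooled;   // configured maximum number of pooled blocks
    private final int blockSize;
    private final Deque<byte[]> free = new ArrayDeque<>();

    private long requests;             // total number of blocks requested
    private long overflowAllocations;  // requests that found the pool empty
    private int inUse;                 // blocks currently handed out
    private int maxInUse;              // high-water mark of inUse

    BlockPool(int maxPooled, int blockSize) {
        this.maxPooled = maxPooled;
        this.blockSize = blockSize;
        for (int i = 0; i < maxPooled; i++) {
            free.push(new byte[blockSize]);
        }
    }

    synchronized byte[] acquire() {
        requests++;
        inUse++;
        maxInUse = Math.max(maxInUse, inUse);
        byte[] block = free.poll();
        if (block == null) {
            // All pooled blocks are in use: allocate an additional block.
            overflowAllocations++;
            block = new byte[blockSize];
        }
        return block;
    }

    synchronized void release(byte[] block) {
        inUse--;
        if (free.size() < maxPooled) {
            free.push(block);
        }
        // Otherwise the pool is full; the block is dropped and garbage collected.
    }

    synchronized long getRequests() { return requests; }
    synchronized long getOverflowAllocations() { return overflowAllocations; }
    synchronized int getMaxInUse() { return maxInUse; }

    /** Percentage of requests that had to allocate an additional block. */
    synchronized double getOverflowPercent() {
        return requests == 0 ? 0.0 : 100.0 * overflowAllocations / requests;
    }
}
```

With these values you can apply the rules of thumb above: compare getOverflowPercent() with 1%, and compare getMaxInUse() with the configured pool size.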

For other areas you could display

  • The number of authentication requests that used userid + password, and the number that used a certificate.
  • The number of authentication requests which failed.
  • The list of userid names in the userid cache.
  • How many times each application was invoked.
  • The number of times a thread had to wait for a resource.
  • The elapsed time waiting for a resource, and what the resource was.
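As an illustration of the last two items, here is a minimal sketch that counts how often a thread had to wait for a resource and for how long. The class TimedResource and its method names are hypothetical, and a Semaphore stands in for whatever the real resource is.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical guard around a shared resource that records waits. */
class TimedResource {
    private final String name;        // which resource this is
    private final Semaphore permits;
    private final AtomicLong acquires = new AtomicLong();
    private final AtomicLong waits = new AtomicLong();
    private final AtomicLong waitNanos = new AtomicLong();

    TimedResource(String name, int permitCount) {
        this.name = name;
        this.permits = new Semaphore(permitCount);
    }

    void acquire() throws InterruptedException {
        acquires.incrementAndGet();
        if (permits.tryAcquire()) {
            return; // the resource was free, so there was no wait
        }
        waits.incrementAndGet();
        long start = System.nanoTime();
        try {
            permits.acquire(); // block until another thread releases
        } finally {
            // Record the elapsed time even if the wait was interrupted.
            waitNanos.addAndGet(System.nanoTime() - start);
        }
    }

    void release() {
        permits.release();
    }

    String getName() { return name; }
    long getAcquires() { return acquires.get(); }
    long getWaits() { return waits.get(); }
    long getWaitNanos() { return waitNanos.get(); }
}
```

The fast path costs one extra counter update, so collecting the numbers should not noticeably change the behaviour being measured.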

What attributes to expose

You look at the data to ask

  • Do I have a problem now?
  • Will I have a problem in the future? You need to collect information over time and look at trends.
  • When we had a problem yesterday, did this component contribute to it? You need to have historical data.

It is not obvious what data attributes you should display.

  • The “value now” is easy to understand.
  • The “average value” is harder. Is this from the start of the application (6 months ago), or a weighted average such as (99 * previous average + current value)/100? With a weighted average, the change from the previous value indicates the trend.
  • The maximum value is hard – from when? There may have been a peak at startup, and small peaks since then will not show up. Having a “reset command” can be useful, or have it reset on a timer – such as display and reset every 10 minutes.
  • If you “reset” the values and display the value before any activity, what do you display? “0”s for all of the values, or the values when the reset command was issued?

Resetting values can make it easier to understand the data. Comparing two 8 digit numbers is much harder than comparing two 2 digit numbers.
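The three attributes could be kept together as in the sketch below. The class GaugeStats is hypothetical. It uses the (99 * previous average + current value)/100 formula from the list above, and on reset it restarts the maximum from the current value, which is just one of the two choices discussed.

```java
/** Hypothetical holder for "value now", weighted average and resettable maximum. */
class GaugeStats {
    private long current;
    private double weightedAverage;
    private long max;
    private boolean hasSample;

    synchronized void record(long value) {
        current = value;
        if (!hasSample) {
            // The first sample seeds both the average and the maximum.
            weightedAverage = value;
            max = value;
            hasSample = true;
        } else {
            weightedAverage = (99.0 * weightedAverage + value) / 100.0;
            max = Math.max(max, value);
        }
    }

    /** Returns the maximum since the last reset, then restarts it from the current value. */
    synchronized long displayAndResetMax() {
        long previous = max;
        max = current;
        return previous;
    }

    synchronized long getCurrent() { return current; }
    synchronized double getWeightedAverage() { return weightedAverage; }
    synchronized long getMax() { return max; }
}
```

Calling displayAndResetMax() from a timer gives the “display and reset every 10 minutes” behaviour.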

How to expose data

Java has Java Management Extensions (JMX) for reporting management information. It looks very well designed, is easy to use, and very compact! There is an extensive document from Oracle here.

I found Basic Introduction to JMX by Baeldung was an excellent article, with code samples on GitHub. I got these working in Eclipse within an hour!

The principle behind JMX is …

For each field you want to expose you have a get… method.

You define an interface whose name is the class name followed by “MBean”. It declares all of the methods for displaying the data.

public interface myClassMBean {
    public String getOwner();
    public int getMaxSize();
}

You define the class and the methods to expose the data.

public class myClass implements myClassMBean {

    // the data being exposed
    private String fileOwner;
    private int fileSize;

    // and the methods to expose the data

    public String getOwner() {
        return fileOwner;
    }

    public int getMaxSize() {
        return fileSize;
    }

}

And you register the instance with JMX

myClass myClassInstance = new myClass(); // create the instance of myClass

MBeanServer server = ManagementFactory.getPlatformMBeanServer();
ObjectName objectName =….
server.registerMBean(myClassInstance, objectName);

Where myClassInstance is a class instance. The JMX code extracts the name of the class from the object, and can then identify all the methods defined in the matching interface (the class name followed by “MBean”). Tools like jconsole can then query these methods, and invoke them.

ObjectName is an object like

ObjectName objectName = new ObjectName("ColinJava:type=files,name=onefile");

Where “ColinJava” is a high level element (the domain), “type” is a category, and “name” identifies the instance.

That’s it.
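Putting the pieces together, a self-contained version might look like the sketch below. The names JmxFileDemo, FileInfo and FileInfoMBean, and the values "colin" and 4096, are made up for illustration. The interface and class are nested so that everything fits in one file. The interface is declared public because current JDKs reject a non-public MBean interface by default.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

class JmxFileDemo {

    /** Management interface: the implementing class name plus "MBean". */
    public interface FileInfoMBean {
        String getOwner();   // exposed as attribute "Owner"
        int getMaxSize();    // exposed as attribute "MaxSize"
    }

    public static class FileInfo implements FileInfoMBean {
        private final String fileOwner;
        private final int fileSize;

        public FileInfo(String fileOwner, int fileSize) {
            this.fileOwner = fileOwner;
            this.fileSize = fileSize;
        }

        @Override
        public String getOwner() { return fileOwner; }

        @Override
        public int getMaxSize() { return fileSize; }
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("ColinJava:type=files,name=onefile");
        server.registerMBean(new FileInfo("colin", 4096), name);

        // Read the attributes back through the MBean server, as a tool would.
        System.out.println("Owner   = " + server.getAttribute(name, "Owner"));
        System.out.println("MaxSize = " + server.getAttribute(name, "MaxSize"));
    }
}
```

The attribute names come from the getter names with the “get” removed. To look at the bean with jconsole, the program needs to stay running long enough for jconsole to attach.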

When you use jconsole (or other tools) to display it, the MBean appears under ColinJava, then files, then onefile, with the attributes Owner and MaxSize.

You could have

MBeanServer server = ManagementFactory.getPlatformMBeanServer();

ObjectName bigPoolName = new ObjectName("ColinJava:type=threadpool,name=BigPool");
server.registerMBean(bigpoolInstance, bigPoolName);

ObjectName medPoolName = new ObjectName("ColinJava:type=threadpool,name=MedPool");
server.registerMBean(medpoolInstance, medPoolName);

ObjectName smPoolName = new ObjectName("ColinJava:type=threadpool,name=SmallPool");
server.registerMBean(smallpoolInstance, smPoolName);

This would display the stats data for three pools

  • ColinJava
    • threadpool
      • BigPool …
      • MedPool …
      • SmallPool …

And so build up a tree like

  • ColinJava
    • threadpool
      • BigPool …
      • MedPool …
      • SmallPool …
    • Userids
      • Userid+password
      • Certificate
    • Applications
      • Application 1
      • Application 2
    • Errors
      • Applications
      • Authentication

You can also have set…() methods to set values, but you need to be more careful; checking authorities, and possibly synchronising updates with other concurrent activity.

You can also have methods like resetStats() which show up within jconsole as Operations.
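A minimal sketch of an MBean with a settable attribute and a reset operation follows. The names JmxPoolDemo, PoolStats and MaxBlocks are assumptions for illustration, and the setter only validates its input; real code would also need the authority checks mentioned above.

```java
import java.util.concurrent.atomic.AtomicLong;

class JmxPoolDemo {

    public interface PoolStatsMBean {
        long getRequests();                 // read-only attribute "Requests"
        int getMaxBlocks();                 // getter plus setter makes
        void setMaxBlocks(int maxBlocks);   // "MaxBlocks" a writable attribute
        void resetStats();                  // neither get nor set: an operation
    }

    public static class PoolStats implements PoolStatsMBean {
        private final AtomicLong requests = new AtomicLong();
        private volatile int maxBlocks;

        public PoolStats(int maxBlocks) {
            this.maxBlocks = maxBlocks;
        }

        /** Called by the application each time a block is requested. */
        public void countRequest() {
            requests.incrementAndGet();
        }

        @Override
        public long getRequests() { return requests.get(); }

        @Override
        public int getMaxBlocks() { return maxBlocks; }

        @Override
        public void setMaxBlocks(int maxBlocks) {
            // Reject values that make no sense before changing anything.
            if (maxBlocks < 1) {
                throw new IllegalArgumentException("maxBlocks must be at least 1");
            }
            this.maxBlocks = maxBlocks;
        }

        @Override
        public void resetStats() {
            requests.set(0);
        }
    }
}
```

An instance is registered with server.registerMBean() in the same way as any other MBean. The volatile field and the AtomicLong are one simple way of coping with updates arriving from a management thread while application threads are running.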

How do I build up the list of what is needed?

It is easy to expose data values which have little value. I remember MQ had a field in the statistics “Number of times the hash table changed”. I never found a use for this. Other times I thought “If only we had a count of ……”

You can collect information from problems reported to you. “It was hard to diagnose because… if we had the count of … the end user could have fixed it without calling us”.

Your performance team is another good source of candidate fields. Part of the performance team’s job is to identify statistics to make it easier to tune the system, and reduce the resources used. It is not just about identifying hot spots.

Before you implement the collection of data, you could present to your team on how the data will be used, and produce some typical graphs. You should get some good feedback, even if it is “I don’t understand it”.

What can I use to display the data?

There are several ways of displaying the data.

  • jconsole – which comes as part of Java, and can display the data in a window
  • python – you can issue a query and capture the data. I have this set up to capture the data every 10 seconds
  • other tools using the standard interfaces.
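As an example of using the standard interfaces, the sketch below polls one attribute from Java. The class name JmxPoller is made up. It reads the JVM’s own java.lang:type=Threading bean in-process so that it runs without any setup. For another JVM, the connection would instead come from javax.management.remote.JMXConnectorFactory with a service URL for that JVM, which is not shown here.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

class JmxPoller {

    /** Reads one attribute; works with a local or a remote connection. */
    static Object read(MBeanServerConnection conn, String objectName, String attribute)
            throws Exception {
        return conn.getAttribute(new ObjectName(objectName), attribute);
    }

    public static void main(String[] args) throws Exception {
        // The platform MBeanServer is itself an MBeanServerConnection.
        MBeanServerConnection conn = ManagementFactory.getPlatformMBeanServer();

        // Three samples, one second apart, to keep the demo short.
        // A real collector might sample every 10 seconds instead.
        for (int i = 0; i < 3; i++) {
            Object threads = read(conn, "java.lang:type=Threading", "ThreadCount");
            System.out.println(System.currentTimeMillis() + " ThreadCount=" + threads);
            Thread.sleep(1_000);
        }
    }
}
```

Writing a timestamp with each sample gives you the historical data needed to look at trends.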
