Why cant I logoff from mqconsole?

If you are using mqweb using certificates to identify yourself, if you logoff, or close the tab, then open a new tab, you will get a session using the same certificate as before.

This little problem has been a tough one to investigate, and turns out to be lack of function in Chromium browser.

The scenario is you connect to mqweb using a digital certificate. You want to logoff and logon again with a different certificate, for example you do most of your work with a read only userid, and want to logon with a more powerful id to make a change.  You click logoff, and your screen flashes and logs you on again with the same userid as before.

At first glance this may look like a security hole, but if someone has access to your web browser, then the can click on the mqweb site, and just pick a certificate – so it is no different.

Under the covers,  the TLS handshake can pass up the previous session ID.   If the server  recognises this, then it is a short handshake instead of a full hand shake, so helping performance.

To reset the certificate if you are using Firefox

To clear your SSL session state in Firefox choose History -> Clear Recent History… and then select “Active Logins” and click “OK”. Then the next time you connect to your SSL server Firefox will prompt for which certificate to use, you may need to reset the URL.

You should check Firefox preferences, certificates, “Ask you every time” is selected, rather than “Select one automatically”.

Chrome does not support this reset of the certificate.

There has been discussion over the last 9 years along the lines of, seeing as Internet Explorer, and Firefox have there, should we do it to met the end user demand?

If you set up an additional browser instance, you get the same problem. With Chrome you have to close down all instances of the browser and restart chrome to be able to select a different certificate.

It looks like there is code which has a cache of url, and certificate to use.   If you open up another tab using the same IP address you will reuse the same certificate.

If you localhost instead of 127.0.0.1 – it will prompt for certificate, and then cache it, so you can have one tab open with one certificate, and another tab, with a different URL and another certificate.

What does mqRestCorsMaxAgeInSeconds in mqweb mean?

Ive blogged about CORS, and how this allows you to list sites that are permitted to use scripts to send request to the mqweb server.

I struggled with understanding what value mqRestCorsMaxAgeInSeconds has, as it did not behave as expected (this was due to a bug).

If you have a CORS transaction there is an OPTIONS request, followed by the actual DELETE/GET/POST requests.
The OPTIONS request checks that the request is coming from an authorised list of URLs, and that the parameters are valid.  The OPTIONS information can be cached.

If the check is successful then the real request can be issued.  If the requests occur in a short time, then you can have OPTIONS, DELETE, DELETE, DELETE, using the cached values.  If there is a “long” time interval between the requests you may get OPTIONS, DELETE, gap, OPTIONS, DELETE.

The OPTIONS information can be cached for up to the minimum of the mqweb mqRestCorsMaxAgeInSeconds  time, and the browser time.

For Chrome the maximum time interval is 600 seconds.  If no time is specified in the OPTIONS response, then 5 seconds is used.

There is a bug in the Liberty base code which sends down the header Access-Control-Allow-Max-Age: …, when the browser is expecting Access-Control-Max-Age.   Because of this, the default time of 5 seconds is used in Chrome.

This should not have a major impact.  For those applications using scripts to send multiple REST API request, there will be more frequent OPTIONS requests – every 5 seconds instead of up to 600 seconds.  These extra flowes are invisible to the scripts and the end user.

What value should I use?

Chrome has a maximum of 600 seconds, with a default of 5 seconds.

Firefox has a maximum of 24 hours (86400 seconds).

Setting it to 600 seconds sounds reasonable to me.

Making changes to mqRestCorsMaxAgeInSeconds

If you change mqRestCorsMaxAgeInSeconds you have to restart the mqweb server.

I do not get caching!

When researching this topic I found every GET request had an OPTIONS request, rather than the OPTIONS, GET, GET.   A quick search on the internet showed me the Chrome option ( F1 -> Settings and preferences) “Disable Cache ( while DevTOOLS is open)” was enabled. I deselected this, and I got the caching.

Can evil websites get to your mqweb – using java script to get to the backend server

With HTML and scripting people would write scripts and get you to execute them, and so access your personal information, and steal your money.  Over time security has been improved to make this harder, and now you have to explicitly say which web sites can run scripts which use the mqweb interface to access MQ to put and get messages.

One way of protecting the access is using Cross Origin Resource Sharing, or CORS.  I explained the basics of CORS here.  I struggled getting it to work with a web browser.

  • browsers have been improved and what worked last year may no longer work now, and the documentation does not reflect the newer browsers.
  • Chrome carefully changes your hand crafted HTTP headers, so what is sent up may not what you expected.

I’ll go through three examples, and then show how it gets more difficult, and what you can do to identify problems.

Note: If you use a web page from a file:// then the origin is ‘null’, and this will fail any checks in the backend, as the checks compare the passed origin to the list of acceptable urls.

I used Dev Tools in Chrome (Alt+Ctrl+i) to display information including the headers flowing.

Simple HTML

With the following within your web page,

<a href=”https://localhost:9445/ibmmq/rest/v1/messaging/qmgr/…/message
> direct link ></a>

It issues the REST request, returns the data from the queue, and displays it.

From the headers we see

Simple HTML: Request headers

The important request headers were

And no origin header.

Simple HTML: Response headers

The response headers have

  • HTTP/1.1 200 OK
  • ibm-mq-md-expiry: unlimited
  • ibm-mq-md-messageId: 414d5120514d412020202020202020204c27165d04a98a25
  • ibm-mq-md-persistence: persistent
  • Content-Length: 1024
  • Set-Cookie: LtpaToken2_1 ….

There are no Allow….  headers, so this indicates the response is not a valid cross origin response.   The request came from one page, so there was no cross origin request.

You can see the MQ attributes returned, ibm-mq*.

Invoking request from java script.

My HTML page had

<html
<head>
<script src=”http://localhost:8884/src.js ” text=”text/javascript”></script >
</head>
<body>
<button onclick=”src()””>Get message</button>
</body>

I had the src() script downloaded from http://localhost:8884/src.js, and inline within the html file.  Both worked.

function src()
{
  fetch("https://localhost:9445/ibmmq/rest/v1/messaging/qmgr/QMA/queue/DEEPQ/message",
  { 
     method :'DELETE',
     headers: {
                'Origin': 'https://localhost:888499'
                ,'Access-Control-Request-Headers' : 'Content-Type' 
                ,'ibm-mq-rest-csrf-token' : '99'
              } 
   }
  ) // end of fetch
   .then((response) => response.text() )
   .then(x => {document.write("OUTPUT:"+x); } 
  ) 
  .catch(error =>  {console.log("Error from src:" + error);});
}

When the button was pressed, the script was executed,  there was an OPTIONS request (and response), and a DELETE request (and response).   It returned a message and displayed it.

In more detail, the flows were:

JavaScript OPTIONS request

The OPTIONS request had headers

This is doing a preflight check, saying it intends to issue a DELETE request from Origin  http://localhost:8889, the url of the web page.

JavaScript OPTIONS response headers

The OPTIONS response headers were the same as before but with additional ones

  • Access-Control-Allow-Credentials: true
  • Access-Control-Allow-Origin: http://localhost:8889
  • Access-Control-Allow-Max-Age: 91
  • Access-Control-Allow-Methods: GET, POST, PATCH, PUT, DELETE
  • Access-Control-Allow-Headers: Accept-Charset, Accept-Language, Authorization, Content-Type, ibm-mq-rest-csrf-token, ibm-mq-md-correlationId, ibm-mq-md-expiry, ibm-mq-md-persistence, ibm-mq-md-replyTo, ibm-mq-rest-mft-total-transfers

This means, the script is allowed to issue a request from http://localhost:8889 using a request method in the list {GET, POST, PATCH, PUT, DELETE } and the valid headers that can be used are in the list of headers.

The Access-Control-Allow-Max-Age: 91 came from my mqwebuser.xml file, <variable name=”mqRestCorsMaxAgeInSeconds” value=”91″/>.

After this there was a DELETE request to get the message.

JavaScript DELETE request headers

JavaScript DELETE response headers

The response included the CORS headers, and included the headers from the non CORS situation

  • HTTP/1.1 200 OK
  • Access-Control-Allow-Credentials: true
  • Access-Control-Allow-Origin: http://localhost:8889
  • Access-Control-Expose-Headers: Content-Language, Content-Length, Content-Type, Location, ibm-mq-qmgrs, ibm-mq-md-messageId, ibm-mq-md-correlationId, ibm-mq-md-expiry, ibm-mq-md-persistence, ibm-mq-md-replyTo, ibm-mq-rest-mft-total-transfers
  • Content-Type: text/plain; charset=utf-8
  • Content-Language: en-GB
  • ibm-mq-md-expiry: unlimited
  • ibm-mq-md-messageId: 414d5120514d412020202020202020204c27165d04a98a25
  • ibm-mq-md-persistence: persistent
  • Content-Length: 1024
  • Set-Cookie: LtpaToken2_….

Because the Access-* headers are present, this is a CORS response.

The message content was displayed in a browser window.

Link to another page

I set up a link <a href=”http://localhost:8884/page.html”/>localhost 8884</a>  to execute a page on another web server.  When this executed, it issued the java script request as before.  The Origin was Origin: http://localhost:8884 – so the page where the script executed.

What happens if CORS is not set up?

If the http://localhost:8889 is not in the list in the mqwebuser.xml file,

No data was  displayed.   The Chrome browser Developer tools ( Alt+Ctrl+i) displays a message

Access to fetch at ‘ https://localhost:9445/ibmmq/rest/v1/messaging/qmgr/QMA/queue/DEEPQ/message ‘ from origin ‘http://localhost:8889 ‘ has been blocked by CORS policy: Response to preflight request doesn’t pass access control check: No ‘Access-Control-Allow-Origin’ header is present on the requested resource. If an opaque response serves your needs, set the request’s mode to ‘no-cors’ to fetch the resource with CORS disabled.

The OPTIONS Request header has, as before

  • HTTP/1.1 200 OK
  • Host: localhost:9445
  • Access-Control-Request-Method: DELETE
  • Origin: http://localhost:8889
  • Access-Control-Request-Headers: ibm-mq-rest-csrf-token

but the OPTIONS Response header has no Access-* headers.

No DELETE request was issued.

The HTTP/1.1 200 OK is the same for all cases – it means the request was successful.

Trying to be clever

I read the documentation on the web, and much of it was very helpful, but some of it is no longer true.  It was hard to get it working, because every things has to be right for it to work.

Unsupported header.

I added an extra header, which is a valid CORS thing to do – but the back end has to support it.  With hindsight it makes no sense to add headers that will be ignored by the server.

headers: {
     'Origin': 'https://localhost:888499'
     , 'Access-Control-Request-Headers' : 'Content-Type' 
     , 'colin':'TRUE' 
     ,'ibm-mq-rest-csrf-token' : '99'
}

This sent up a header with

Access-Control-Request-Headers: colin,ibm-mq-rest-csrf-token

Header “colin” is not in the list of accepted headers, header ibm-mq-rest-csrf-token is in the list:

Access-Control-Allow-Headers: Accept-Charset, Accept-Language, Authorization, Content-Type, ibm-mq-rest-csrf-token, ibm-mq-md-correlationId, ibm-mq-md-expiry, ibm-mq-md-persistence, ibm-mq-md-replyTo, ibm-mq-rest-mft-total-transfers

The Developer tools message was the same as before,

Access to fetch at ‘https://localhost:9445/ibmmq/rest/v1/messaging/qmgr/QMA/queue/DEEPQ/message ‘ from origin ‘http://localhost:8889 ‘ has been blocked by CORS policy: Response to preflight request doesn’t pass access control check: No ‘Access-Control-Allow-Origin’ header is present on the requested resource. If an opaque response serves your needs, set the request’s mode to ‘no-cors’ to fetch the resource with CORS disabled.

I removed the unsupported header and it worked.

Other headers do not work

The request header ‘Access-Control-Allow-Method’ : ‘DELETE’ is a valid header.

When this was used, the request headers included

  • Access-Control-Request-Headers: access-control-allow-method,ibm-mq-rest-csrf-token
  • Access-Control-Request-Method: DELETE

As before  Access-Control-Request-Method is not in the list of Access-Control-Allow-Headers, so the request fails.

The Access-Control-Request-Method: DELETE is not needed, as the method: DELETE defines what will be used.

Using Access-Control-Request-Headers to add more headers does not work, if the header is not in the list of valid parameters.

The true origin is used

I tried using the header ‘Origin’: ‘https://localhost:888499 ‘ – as it did with my curl – and this was ignored.  The true Origin from the web page was used (I am glad to say, otherwise it would make the whole protection scheme worthless)

Some options ignored

I found  the “fetch” options credentials:, redirect: , and cache:, were all ignored by Chrome.

Some  cookies  were used.

The LtpaToken2_… was sent up and down

Problems in mqwebuser.xml file

I misconfigured the configuration file with <variable name=”mqRestCorsMaxAgeInSeconds” value=”AA91″/>

This gave me the familair message

Access to fetch at ‘https://localhost:9445/ibmmq/rest/v1/messaging/qmgr/QMA/queue/DEEPQ/message ‘ from origin ‘http://localhost:8889 ‘ has been blocked by CORS policy: Response to preflight request doesn’t pass access control check: No ‘Access-Control-Allow-Origin’ header is present on the requested resource. If an opaque response serves your needs, set the request’s mode to ‘no-cors’ to fetch the resource with CORS disabled.

So if you get this message it is worth checking the configuration file for problems.

Can I trace it?

I used <variable name=”traceSpec” value=”*=audit:CorsService=finest”/> in the mqwebuser.xml file to provide information about the CORS processing, and what was coming in, and what was being checked.

Can evil websites get to your mqweb – understanding CORS

In the beginning was the html, and the html was good;  then we had html and scripts  which could only do things on the page, which was also good; then we had scripts which could reach out to other websites – and that’s where the problems began.  It was easy for evil developers to get you to click on an innocent looking page, which had a script which jumped into a different tab of your browser where you had your banking window open ,  or  to executed a script ; and steal all your money.

The browsers were improved to stop evil scripts from accessing a server, and then they were improved again so the server could say “stuff coming from this list of web sites is OK, I trust them.   One implementation is called CORS.  There is a good description here.   It says

Cross-Origin Resource Sharing (CORS) is a mechanism that uses additional HTTP headers to tell browsers to give a web application running at one origin, access to selected resources from a different origin. A web application executes a cross-origin HTTP request when it requests a resource that has a different origin (domain, protocol, or port) from its own.

I do not trust things that “just work”, I like to see evidence of it.   This lack of trust comes from working with a group of young testers who came to IBM Hursley to test their code.   All of their tests ran cleanly – and thought they had a few days spare to go to London sightseeing.  I stopped the server they were meant to be using (without telling them) and the tests carried on running successfully!   It turned out they had been testing their code with a dummy program acting as the server.    They removed this code, reran their tests  and most of them failed – and had to stay an extra week.

I also remember changing a config file, and being surprised when my changed worked first time.   After a cup of tea (an invaluable thinking aid) I put some spelling mistakes in the file ; and it carried on running.  Why? I was using the wrong config file.

I played with CORS and wanted to get things to fail, as well as to work.   This was a good choice, as I had many failures.

I’ll document how I got curl to work and demonstrate CORS , and document how I got a web browser to work – a real challenge

mqweb implements CORS, so you can configure mqweb to give a list of websites which may access your server.

The documentation  is not very clear.  It says

where allowedOrigins specifies the origin that you want to allow cross-origin requests from. You can use an asterisk surrounded by double quotation marks, “*”, to allow all cross-origin requests. You can enter more than one origin in a comma-separated list, surrounded by double quotation marks. To allow no cross-origin requests, enter empty quotation marks as the value for allowedOrigins.

My observations,

  • You cannot use generics, so http://127.0.0.1:* is the same as “*” – or allow all cross-origin requests
  • You must specify {scheme:address:port} so http or https,  the url with // at the front, and the specific port number
  • The match is an string equality test, so the case, spacing and values must be the same

How does an HTTP request work?

When you click on a web page, data is sent to the back end server.   The following data is exchanged

  • the request
  • request headers
  • your data going to the server
  • response headers
  • response data – such as the content of web page.

In more detail…

Request

Request Headers

  • accept: for example  text/html, application/xml
  • accept-languages: en-GB
  • dnt:  1  this is “do not track me”
  • user-agent:  Chromium
  • cookie: bcookie=”….”

Response headers

  • status: 200
  • status: 200 OK
  • set-cookie ….
  • server: nginx
  • ….

Response data

This might be the web page to be displayed.  It can include script, images etc.

What is the origin of the page?

If your web page, invoked a script, for example from clicking a button, an “Origin” header is added, for example Origin: http://localhost:8884 which is the address of the web server hosting the page.   The backend server checks to see if this header is present, and looks up site in the authorised list.

If the Origin is acceptable, it sends down additional response headers (CORS Headers), so the browser (or your program) knows and can use the web site.

As part of the handshake, the browser can send up an “OPTIONS” request (instead of a get/delete etc), with a header saying the browser will be doing a GET/DELETE etc from this origin.    If there is a positive response, where the additional CORS headers are send back then the get/delete is allowed.   If the CORS headers are not present in the response, then the request will not be permitted.   This is called a preflight check – just like having your boarding pass checked at the gate before you get on the plane.

Does it work?

If there is no Origin header in the request, the backend server thinks it is all same domain, and does no CORS checks.

This CORS support is really aimed at web browsers, as the web browsers will automatically add headers for you.  If you are using curl or other tools to create your own request, you specify exactly the headers you want, so if you omit the Origins header, the server will not check.

My mqwebuser.xml file had <variable name=”mqRestCorsAllowedOrigins” value=”https://9999.0.0.1:19442,http://localhost:8884”/&gt;

So origins for https://9999.0.0.1:19442 and http://localhost:8884 are permitted.

I used a configuration file for curl (because command parameters did not work passing the headers) and had in curl.parms

cacert ./cacert.pem
cert ./colinpaice.pem:password
key ./colinpaice.key.pem
cookie cookie.jar.txt
cookie-jar cookie.jar.txt
request OPTIONS
header “Access-Control-Request-Method: DELETE”
header “Access-Control-Request-Headers: Content-Type”
header “Origin: https://9999.0.0.1:19442
header “ibm-mq-rest-csrf-token : COLINCSRF”
include

I used the command

curl –verbose –config curl.parms –url https://localhost:9445/ibmmq/rest/v1/messaging/qmgr/QMA/queue/DEEPQ/message

The headers were displayed, and I had

OPTIONS /ibmmq/rest/v1/messaging/qmgr/QMA/queue/DEEPQ/message HTTP/1.1
> Host: localhost:9445
> User-Agent: curl/7.58.0
> Accept: */*
> Cookie: LtpaToken2_….
> Access-Control-Request-Method: DELETE
> Access-Control-Request-Headers: Content-Type
> Origin: https://9999.0.0.1:19442
> ibm-mq-rest-csrf-token : COLINCSRF

The response headers were

< HTTP/1.1 200 OK
< X-Powered-By: Servlet/3.1
< X-XSS-Protection: 1;mode=block
< X-Content-Type-Options: nosniff
< Content-Security-Policy: default-src ‘none’; script-src ‘self’ ‘unsafe-inline’ ‘unsafe-eval’; connect-src ‘self’; img-src ‘self’; style-src ‘self’ ‘unsafe-inline’; font-src ‘self’
< Cache-Control: no-cache, no-store, must-revalidate
< Access-Control-Allow-Credentials: true
< Access-Control-Allow-Origin: https://9999.0.0.1:19442
< Access-Control-Allow-Max-Age: 90
< Access-Control-Allow-Methods: GET, POST, PATCH, PUT, DELETE
< Access-Control-Allow-Headers: Accept-Charset, Accept-Language, Authorization, Content-Type, ibm-mq-rest-csrf-token, ibm-mq-md-correlationId, ibm-mq-md-expiry, ibm-mq-md-persistence, ibm-mq-md-replyTo, ibm-mq-rest-mft-total-transfers

The response header Access-Control-Allow-Origin: https://9999.0.0.1:19442 shows that requests with origin https://9999.0.0.1:19442 is acceptable.

When I used a different “Origin”,   I did not get any Access-Control-Allow-* headers.  So from the absence,  I could tell the request was not support from the different origin.

The Access-Control-Allow-Headers is a list of header names which can be sent, so the header ibm-mq-rest-csrf-token : COLINCSRF is valid, but “COLIN:value”  is not valid, because ibm-mq-rest-csrf-token  is in the Access-Control-Allow-Headers list, and COLIN is not in the list.

The Access-Control-Allow-Methods: GET, POST, PATCH, PUT, DELETE says my application can use any of the methods in the list.

Using a web browser.

If you use HTML in a local file, then the “Origin” is null, and so does not match any elements in the authorised list.  I had to set up my own web server – which was easy to do using Python.

Using the web page at the bottom of this posting, I pointed my web browser at the web server.   It displayed a button.  I pressed it, and it invoked a script. Using the developer mode ( Al+Ctrl +i) in Chrome could see network flows etc.

The request headers had

  • Host: localhost:9445 This is where my webserver is hosted
  • ibm-mq-rest-csrf-token : 99  I specified this header value
  • Origin: http://localhost:8884  this overrode the value I had specified in the headers.

The response headers included

  • Access-Control-Allow-Credentials: true
  • Access-Control-Allow-Origin: http://localhost:8884
  • Access-Control-Expose-Headers: Content-Language, Content-Length, Content-Type, Location, ibm-mq-qmgrs, ibm-mq-md-messageId, ibm-mq-md-correlationId, ibm-mq-md-expiry, ibm-mq-md-persistence, ibm-mq-md-replyTo, ibm-mq-rest-mft-total-transfers
  • ibm-mq-md-expiry: unlimited
  • ibm-mq-md-messageId: 414d5120514d412020202020202020204c27165d04a98a25
  • ibm-mq-md-persistence: persistent

We can see from

  • the Access-Control-* headers we know this has validated for http://localhost:8884
  • the Access-Control-Expose-Headers we can see what headers will be accepted
  • ibm-mq-md-persistence:  persistence. For the returned messages, it was a persistent message

The web page used

<HTML>
<HEAD>
<TITLE>Call a  mqweb rest API </TITLE>
<script>
  function local()
  {
    fetch("https://localhost:9445/ibmmq/rest/v1/messaging/qmgr/QMA/queue/DEEPQ/message" 
      {
         method :'DELETE',
         headers: {
                    'Origin': 'https://localhost:888499'
                    , 'Access-Control-Request-Headers' : 'Content-Type'
                    ,'ibm-mq-rest-csrf-token' : '99'
                  }
     }
    )
    .then((response) => response.text() )
    .then(x => {  document.write("OUTPUT:"+x);    } )
    .catch(error => { console.log("Booo:" + error);});
  }
</script> /head> <body> <button onclick="local()"">press me</button> </body> </html>

 

For information about “fetch(…,…)” see here.

For information about the “.then(…) ” see here.

Getting useful information out of JMX data

The data coming from Liberty WebServer through the JMX interface  provides some data, but it is not very useful, and it may become inaccurate over time.

I’ll cover

  1. Getting a useful mean value
  2. Getting a more accurate long term mean
  3. Data gets more inaccurate over time
  4. Standard deviation (this may only be of interest to a few people)

For example from JMX, the reported  mean time for mqconsole transactions  was 9.9 milliseconds – this is for all requests since the mqweb server was started.   Over the previous minute the average measured time, for a 10 second period was 7, or 8 milliseconds, so well below the 9.9 reported.

This is because the mean time includes any initial start up time.   The maximum transaction time, at the start of the run, was over 2 seconds.   This will bias the mean.

You can process the data to extract some useful information, and I show below how to get out useful response time values.

You get the following data (and example values) from mqweb through the JMX interface.

ResponseTimeDetails/count (Long) = 20
ResponseTimeDetails/description (String) = Average Response Time for servlet
ResponseTimeDetails/maximumValue (Long) = 3060146565 
– in nanoseconds (see below for the unit)
ResponseTimeDetails/mean (Double) = 4.336789965E8
– in nanoseconds
ResponseTimeDetails/minimumValue (Long) = 2474556
– in nanoseconds
ResponseTimeDetails/standardDeviation (Double) = 9.089057964078983E8
– in nanoseconds
ResponseTimeDetails/total (Double) = 8.67357993E9
– used in calculations
ResponseTimeDetails/unit (String) = ns
– the unit ns = nanoseconds
ResponseTimeDetails/variance (Double) = 8.319076762335653
– used in calculations

Getting a useful mean value

To produce these numbers, the count of the response times and the sum of the transaction response times are accumulated within the Liberty Server.  To calculate the mean value you calculate sum/count.   This gives you the overall mean time.  If you obtain the data periodically you can manipulate the data to provide some useful statistics.

Let the count and sum at time T1 be Count1, and Sum1, and at time T2 Count2, and Sum2.
You can now calculate (Sum2- Sum1)/(Count2 – Count1) to get the average for that period.  For example the reported mean was 0.016 ms, but the calculated value gave 0.008 ms.  You can also calculate (Count2 – Count1)/(T2-T1) to give a rate of requests per second.   These are much more useful than the raw data.  I suggest collecting the data every minute.

Getting a more accurate long term mean

The first rest request and console request take a long time because the java code has to be loaded in etc.  In one test the duration of the first request was 50 times the duration of the other requests.  A better “mean” value is to ignore the duration of the first request.

The improved mean is (JMX mean * JMX count  – JMX Maximum value) /(JMX Count-1), or JMXMean – (JMXMaximum/JMXCount) .

Data gets more inaccurate over time

The total time is stored as a floating point double.  As you add small numbers to big numbers, the small numbers may be ignored.  Let me try to explain.

Consider a float which has precision of 3, so you could have 1.23 E2 = 1230.  Add 1 to this, and you get 1231 which is 1.23 E2 with a precision of 3 – the number we started with.

The total time is in nanoseconds so 1 second is stored as 1.0 E9.  With 100 of these a second, and 1 hour( 3600 seconds) for 100 hours is 360,000,000, or 3.6 E8 seconds.  * 1.0 E9 nano seconds. = 3.6E17 nano seconds.   The precision  of most float numbers is 16, so with this 3.6 E17 we have lost the odd nanosecond.    I do not think this is a big enough problem to worry about – unless you are running for years without restarting the server.

The variance uses the time**2 value.  So with the maximum time above 599482097 nano seconds. Time **2 is 3.593787846×10¹⁷ and you are already losing data.  I expect the variance will not be used very often, so I think this can be ignored.

If the times were in microseconds instead of nano seconds, this would not be a problem.

Getting a useful standard deviation (this may only be of interest to a few people)

The standard deviation gives a measure of the spread of the data, a small standard deviation means the data is in a narrow band, a larger standard deviation means it is spread over a wider band.  Often 95% of the values are within plus or minus 3 * standard deviations from the mean, so anything outside this range would considered an outlier, or unusual statistic.

I set up some data, a mixture of  10  values 9, 10, 11,  the standard deviation was 0.73.    I changed one value to 20, and the standard deviation changed to 3.3, indicating a wide spread of values.

With a mixture of 100 values 9,10,11, the standard deviation was 0.71.   I changed one value to 20, and the standard deviation changed to 1.2, so a much smaller value, most of the data was still around 10 – just one outside the range.

With a lot of data, the standard deviation converges on a value, and “unusual” numbers make little difference to the value.  I think that the standard deviation over an extended period is not that useful, especially if you get periodic variations such as busy time, and overnight.

You calculate the standard deviation as the square root of the variance.   The variance is (Sum of (values**2) – (mean ** 2)) /number of measurements.

With data

ResponseTimeDetails/count (Long) = 203
ResponseTimeDetails/mean (Double) = 6420785.187192118 nanoseconds
ResponseTimeDetails/variance (Double) = 1.7113341125320868E15 – used in calculations

Variance  = 1.7113341125320868E15 =  ( (Sum of (values**2) – (6420785.187192118 ** 2)) / 203

So (Sum of (values**2)) =   3.474420513264337e+17

You can now do the sums as with the mean, above:

At time T1, the ssquares1 is the sum of (values**2)   at time T2, the ssquares2 is the sum of (values**2).

You can now calculate ssquares2 – ssquares2, and used that to estimate the variance, and so the standard deviation of the data in the range T1 to T2, I’ll leave the details to the end user.

For the advanced user,  you can use the mean for the interval – rather than the long term mean.  Good luck.

 

mqweb error messages and symptoms of TLS setup problems

I deliberately caused TLS set up errors, and noted the symptoms.  Ive recorded them below; the article is not meant to be read, but indexed by search engines.

There are three sections

  1. Problems with server certificates
  2. Problems with the client certificate
  3. Chrome messages, and possible causes of the problems.

The mqweb messages.log reported problems that the mqweb server saw.   For me this was in file /var/mqm/web/installations/Installation1/servers/mqweb/logs/messages.log

Problems with the server certificate

Problem: mqwebuser.xml serverKeyAlias name not in the keystore

This can be caused by the certificate being in the keyring but not visible, or cannot be validated.

The RACF command RACDCERT LISTRING(KEYRING) ID(IZUSVR) will list the contents of the keyring. For example it gives ZZZZ ID(START1).  You can then use

RACDCERT LIST(LABEL(‘ZZZZ’ )) ID(START1).   This gives output including

Status: TRUST
Start Date: 2020/12/17 00:00:00
End Date: 2021/12/17 23:59:59

Check it has STATUS:TRUST and the dates are valid.  If you make a change, check it afterwards.  Several times I got the change wrong!

Check the CA for the certificate is in the keystore; you need the key, and the CA in the keystore.

Message log:

  • Failed validating certificate paths
  • E CWPKI0024E: The certificate alias mqweb specified by the property com.ibm.ssl.keyStoreServerAlias is not found in KeyStore /home/colinpaice/ssl/ssl2/mqweb.p12.
  • I FFDC1015I: An FFDC Incident has been created: “com.ibm.wsspi.channelfw.exception.ChannelException: java.lang.IllegalArgumentException: CWPKI0024E: The certificate alias mqweb specified by the property com.ibm.ssl.keyStoreServerAlias is not found in KeyStore /home/colinpaice/ssl/ssl2/mqweb.p12. com.ibm.ws.channel.ssl.internal.SSLConnectionLink 238″ at ffdc_….

curl:

* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* curl (35) OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to 127.0.0.1:9443
* stopped the pause stream!
* Closing connection 0

chrome:

This site can’t be reached.  ERR_CONNECTION_CLOSED

Problem:  The host certificate is self signed and not in the client keystore

Problem:  The host certificate is signed but the signer certificate is not in  the client keystore

Message log:

Nothing.

curl:

* TLSv1.2 (OUT), TLS alert, Server hello (2):
* SSL certificate problem: self signed certificate
* stopped the pause stream!
* Closing connection 0
curl: (60) SSL certificate problem: self signed certificate

Chrome: in browser

NET::ERR_CERT_AUTHORITY_INVALID

Click on the Not Secure in the url, to display the certificate which was sent down.

If it is signed, make a note of the “issued by” Common Name(CN), and the  Organisation(0) and look up the value of Organisation in the “Authorities” section of “Manage Certificates”.

Use the chrome url chrome://settings/certificates .  Authorities tab

  1. if it is not present, import it
  2. it it is present and UNTRUSTED, edit it, and tick the “Trust this certificate for identifying web sites”

Chrome log:

ERROR:cert_verify_proc_nss.cc(1011)] CERT_PKIXVerifyCert for localhost failed err=-8179

From here  -8179 is Peer’s certificate issuer is not recognized.

Firefox:  browser

SEC_ERROR_UNKNOWN_ISSUER

Action import the CA signing certificate into the keystore and make it trusted.

Problem: curl: The host certificate is self signed and you use the –insecure option

curl

* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES256-GCM-SHA384
* ALPN, server did not agree to a protocol
* Server certificate:
* subject: C=GB; O=aaaa; CN=testuser
* start date: Jan 20 17:39:37 2020 GMT
* expire date: Feb 19 17:39:37 2020 GMT
* issuer: C=GB; O=aaaa; CN=testuser
* SSL certificate verify result: self signed certificate (18), continuing anyway.

Problem: Chrome:  The host certificate is self signed and is not trusted

Chrome browser

This site can’t be reached
localhost unexpectedly closed the connection.
ERR_CONNECTION_CLOSED

Debugging

  • I could find nothing that told me what certificate was being used.  The Chrome network trace just gave “net_error = -100 (ERR_CONNECTION_CLOSED)“.
  • Use certutil -L $sql  to list the contents of your browsers keystore.   The certificate needs “P,…” permissions.
  •  Or use the chrome url chrome://settings/certificates  and display “your certificates”. Pick the likely one, if it says “UNTRUSTED” then this may be the problem.   View the certificate, and check it, for example under details, there may be a comment describing its use.
  •  Defined the server certificate as trusted using certutil -M $sql -n name -t “P,,” 
  • Restart the web browser.

Problem: The  CA signer server certificate had the wrong subjectAltName

curl:

* subjectAltName does not match 127.0.0.1
* SSL: no alternative certificate subject name matches target host name ‘127.0.0.1’

Chrome:

NET::ERR_CERT_COMMON_NAME_INVALID
From the “Not Secure” in front of the URL, display the certificate, and check the extenstions, especially Certificate Subject Alternative Names.

Chrome log:

ERROR:ssl_client_socket_impl.cc(935)] handshake failed; returned -1, SSL error code 1, net_error -200
From here -200 is  CERT_INVALID

Problem: The mqweb server certifcate has expired

curl:

* TLSv1.2 (OUT), TLS alert, Server hello (2):
* SSL certificate problem: certificate has expired
curl: (60) SSL certificate problem: certificate has expired

chrome:

while Chrome running:   web page reports Lost communication with the server.  Could not establish communication with the server. Check your network connections and refresh your browser

restart browser, get “Your connection is not private NET::ERR_CERT_DATE_INVALID”

message.log.  Chrome session was working, then server certificate expired

  • E CWWKO0801E: Unable to initialize SSL connection. Unauthorized access was denied or security settings have expired. Exception is javax.net.ssl.SSLException: Received fatal alert: certificate_unknown

Problem: The mqweb server certificate is missing extendedKeyUsage = serverAuth

curl:

* SSL certificate problem: unsupported certificate purpose
curl: (60) SSL certificate problem: unsupported certificate purpose

Chrome:

Your connection is not private
Attackers might be trying to steal your information from localhost (for example, passwords, messages or credit cards).
NET::ERR_CERT_INVALID

Chrome log:

CERT_PKIXVerifyCert for localhost failed err=-8101
From here  -8101 is Certificate type not approved for application.

ERROR:ssl_client_socket_impl.cc(935)] handshake failed; returned -1, SSL error code 1, net_error -207
From here -207 is CERT_INVALID

Problems with the server ca certificate

Problem: The trust store has an expired CA.

curl:

* gnutls_handshake() failed: The TLS connection was non-properly terminated.

pycurl.error: (35, ‘gnutls_handshake() failed: The TLS connection was non-properly terminated.’)

Problems with the client certificate

Problem: There is no suitable certificate in the client keystore.

For example

  1. There are no “Your certificates” in the browsers keystore
  2. There is a certificate, but has a CA which was not passed down from the server trust keystore
  3. As part of the TLS handshake any self signed certificates are read from the server trust keystore and sent down.  None were found in the “Your certificates”

Curl:

  • * gnutls_handshake() failed: The TLS connection was non-properly terminated.
  • pycurl.error: (35, ‘gnutls_handshake() failed: The TLS connection was non-properly terminated.’)

These messages basically mean the server just ended the connection

Chrome:

ERR_CONNECTION_CLOSED

For a test site, change <ssl clientAuthentication=”true” to false.  Restart mqweb, restart the web browser.  If it prompts for userid and password, the certificate sent from the server was OK.  It is the certificate sent up to the server that has a problem.

Reset false back to true.

Messages in messages.log:

None.

How to debug it.

Check the logs/ffdc directory.  I found I had an ffdc with Stack Dump = java.security.cert.CertPathValidatorException: The certificate issued by CN=SSCA8, OU=CA, O=SSS, C=GB is not trusted; internal cause is:   java.security.cert.CertPathValidatorException: Signature does not match.

Using Chrome trace

When I repeated the investigations, I got different records in the Chromium trace.  One included

--> net_error = -110 (ERR_SSL_CLIENT_AUTH_CERT_NEEDED)

Using the mqweb server java trace – which traces the whole server

See the Oracle Debugging SSL/TLS Connections page and an IBM page.  I could not see how to trace just “the problem”.

With -Djavax.net.debug=ssl:handshake in the jvm.options file, and restarting the mweb server I got

 *** ServerHelloDone
Default Executor-thread-8, WRITE: TLSv1.2 Handshake, length = 3054
Default Executor-thread-2, READ: TLSv1.2 Handshake, length = 7
*** Certificate chain
***
Default Executor-thread-2, fatal error: 40: null cert chain

When it worked I had

*** ServerHelloDone
Default Executor-thread-7, WRITE: TLSv1.2 Handshake, length = 3054
Default Executor-thread-15, READ: TLSv1.2 Handshake, length = 2433
*** Certificate chain
chain [0] = […. the  certificates

Found trusted certificate:

When there was no certificate sent up,  it reported null cert chain.

Problem: The client certificate is self signed and not in the server’s trust store

curl:

* TLSv1.2 (OUT), TLS handshake, Finished (20):
* OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to 127.0.0.1:9443

Chrome:

ERR_CONNECTION_CLOSED

Messages in messages.log:

  • I FFDC1015I: An FFDC Incident has been created: “java.security.cert.CertPathBuilderException: unable to find valid certification path to requested target com.ibm.ws.ssl.core.WSX509TrustManager checkClientTrusted” at ffdc_20.01.30_08.29.27.0.log
  •  E CWPKI0022E: SSL HANDSHAKE FAILURE: A signer with SubjectDN CN=testuser, O=aaaa, C=GB was sent from the target host. The signer might need to be added to local trust store /home/colinpaice/ssl/ssl2/trust.jks, located in SSL configuration alias defaultSSLConfig. The extended error message from the SSL handshake exception is: PKIX path building failed: java.security.cert.CertPathBuilderException: unable to find valid certification path to requested target
  •  I FFDC1015I: An FFDC Incident has been created: “java.security.cert.CertificateException: unable to find valid certification path to requested target com.ibm.ws.ssl.core.WSX509TrustManager checkClientTrusted” at ffdc_20.01.30_08.29.27.1.log
  • E CWWKO0801E: Unable to initialize SSL connection. Unauthorized access was denied or security settings have expired. Exception is javax.net.ssl.SSLHandshakeException: null cert chain

 

Problem: Invalid cn=, the cn value is not a valid userid.

curl message

{“error”: [{

  • “action”: “Provide credentials using a client certificate, LTPA security token, or username and password via HTTP basic authentication header. On z/OS, if the mqweb server has been configured for SAF authentication, check the messages.log file for messages indicating that SAF authentication is not available. Start the Liberty angel process if it is not already running. You might need to restart the mqweb server for any changes to take effect.”,
  • “completionCode”: 0,
  •  “explanation”: “The REST API request cannot be completed because credentials were omitted from the request. On z/OS, if the mqweb server has been configured for SAF authentication, this can be caused by the Liberty angel process not being active.”,
  • “message”: “MQWB0104E: The REST API request to ‘https://127.0.0.1:9443/ibmmq/rest/v1/login ‘ is not authenticated.”,
  • “msgId”: “MQWB0104E”,
  • “reasonCode”: 0,
  • “type”: “rest”

chrome:

It gives you a window to enter userid and password.   This looks like a bug as I have <webAppSecurity allowFailOverToBasicAuth=”false”/>.  It takes the userid and password.

Messages in  messages.log:

R com.ibm.websphere.security.CertificateMapFailedException
and 100 lines of stack trace

The certificate causing the problems, nor the userid is listed – so pretty useless.

Problem: Client certificate missing “extendedKeyUsage = clientAuth”  during signing.

curl message

* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
curl session hangs…
* Operation timed out after 300506 milliseconds with 0 out of 0 bytes received

Chrome

ERR_CONNECTION_CLOSED

message in messages.log:

  • E CWPKI0022E: SSL HANDSHAKE FAILURE: A signer with SubjectDN CN=colinpaice, O=cpwebuser, C=GB was sent from the target host. The signer might need to be added to local trust store /home/colinpaice/ssl/ssl2/trust.jks, located in SSL configuration alias defaultSSLConfig. The extended error message from the SSL handshake exception is: Extended key usage does not permit use for TLS client authentication
  •  I FFDC1015I: An FFDC Incident has been created: “java.lang.NullPointerException com.ibm.ws.ssl.core.WSX509TrustManager checkClientTrusted” at ffdc_20.01.28_17.11.10.1.log

ffdc in /var/mqm/web/installations/Installation1/servers/mqweb/logs/messages.log/ffdc

Exception = java.lang.NullPointerException
Source = com.ibm.ws.ssl.core.WSX509TrustManager
probeid = checkClientTrusted
Stack Dump = java.lang.NullPointerException
at com.ibm.ws.ssl.core.WSX509TrustManager.checkClientTrusted(WSX509TrustManager.java:202)

Problem: Client certificate missing “keyUsage = digitalSignature”  during signing.

curl message

* TLSv1.2 (OUT), TLS handshake, Finished (20):
* Operation timed out after 300509 milliseconds with 0 out of 0 bytes received

message in messages.log

  • E CWPKI0022E: SSL HANDSHAKE FAILURE: A signer with SubjectDN CN=colinpaice, O=cpwebuser, C=GB was sent from the target host. The signer might need to be added to local trust store /home/colinpaice/ssl/ssl2/trust.jks, located in SSL configuration alias defaultSSLConfig. The extended error message from the SSL handshake exception is: KeyUsage does not allow digital signatures
  • FFDC1015I: An FFDC Incident has been created: “java.lang.NullPointerException com.ibm.ws.ssl.core.WSX509TrustManager checkClientTrusted”
  • E CWWKO0801E: Unable to initialize SSL connection. Unauthorized access was denied or security settings have expired. Exception is javax.net.ssl.SSLHandshakeException: null cert chain

ffdc in /var/mqm/web/installations/Installation1/servers/mqweb/logs/messages.log/ffdc

Exception = java.lang.NullPointerException
Source = com.ibm.ws.ssl.core.WSX509TrustManager
probeid = checkClientTrusted
Stack Dump = java.lang.NullPointerException
at com.ibm.ws.ssl.core.WSX509TrustManager.checkClientTrusted(WSX509TrustManager.java:202)

Chrome:

  • If there is one or more certificates in the keystore, the list of valid certificates does not include the problem one.
  • If there is only the problem certificate in the keystore, you get
    This site can’t be reached.
    localhost unexpectedly closed the connection.
    ERR_CONNECTION_CLOSED

CA Signed client certificate has expired

curl:

* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to 127.0.0.1:9443
* stopped the pause stream!
* Closing connection 0

Chrome:

This site can’t be reached
localhost unexpectedly closed the connection.
ERR_CONNECTION_CLOSED

message in messages.log:

for curl.

  • I FFDC1015I: An FFDC Incident has been created: “java.security.cert.CertPathValidatorException: The certificate expired at Thu Jan 30 16:46:00 GMT 2020; internal cause is:
    java.security.cert.CertificateExpiredException: NotAfter: Thu Jan 30 16:46:00 GMT 2020 com.ibm.ws.ssl.core.WSX509TrustManager checkClientTrusted” at ffdc_20.01.30_17.16.11.0.log
  • E CWPKI0022E: SSL HANDSHAKE FAILURE: A signer with SubjectDN CN=colinpaice, O=cpwebuser, C=GB was sent from the target host. The signer might need to be added to local trust store /home/colinpaice/ssl/ssl2/trust.jks, located in SSL configuration alias defaultSSLConfig. The extended error message from the SSL handshake exception is: PKIX path validation failed: java.security.cert.CertPathValidatorException: The certificate expired at Thu Jan 30 16:46:00 GMT 2020; internal cause is:
    java.security.cert.CertificateExpiredException: NotAfter: Thu Jan 30 16:46:00 GMT 2020
  •  I FFDC1015I: An FFDC Incident has been created: “java.security.cert.CertificateException: The certificate expired at Thu Jan 30 16:46:00 GMT 2020 com.ibm.ws.ssl.core.WSX509TrustManager checkClientTrusted” at ffdc_20.01.30_17.16.11.1.log

for chrome:

  • I FFDC1015I: An FFDC Incident has been created: “java.security.cert.CertificateException: The cer
    tificate expired at Thu Jan 30 16:46:00 GMT 2020 com.ibm.ws.ssl.core.WSX509TrustManager checkClientTrusted” at ffdc_20.01.30_17.16.11.1.log
  • E CWWKO0801E: Unable to initialize SSL connection. Unauthorized access was denied or security settings have expired. Exception is javax.net.ssl.SSLHandshakeException: null cert chain

Bad requests

HTTP request was issued – it should have been HTTPS

curl:

curl:(52) Empty reply from server

messages.log:

E CWWKO0801E: Unable to initialize SSL connection. Unauthorized access was denied or security settings have expired. Exception is javax.net.ssl.SSLException: Unrecognized SSL message, plaintext connection?

The client certificate cannot be verified because it is too weak.

Chrome:  ERR_BAD_SSL_CLIENT_AUTH_CERT

Firefox:  An error occurred during a connection to …  security library: memory allocation failure.  Error code: SEC_ERROR_NO_MEMORY

Reason:

The selected client certificate cannot be validated.  For example it has been created with Elliptic Curve sect409k1.   This is considered weak see here.  The signature is not in the list of acceptable signatures.

Display the certificate and compare it with the list of weak signatures.  A TLS handshake trace may help identify this.  Create a new certificate with a supported signature, and import it.

Problem the CA signing is too weak.

For example signing with sha1RSA, when Chrome expects SHA256RSA or stronger.

Chrome:  NET::ERR_CERT_WEAK_SIGNATURE_ALGORITHM

Firefox: I didnt get the error

Action: Use stronger signing.  For example on z/OS use RSA SIZE(2048)

Firefox errors

Your computer clock is set to … . Make sure your computer is set to the correct date, time, and time zone in your system settings, and then refresh …

If your clock is already set to the right time, the web site is likely misconfigured, and there is nothing you can do to resolve the issue. You can notify the web site’s administrator about the problem.

… uses an invalid security certificate.

The certificate is not trusted because the issuer certificate has expired.

Error code: SEC_ERROR_EXPIRED_ISSUER_CERTIFICATE

Reason:

The CA certificate in the trust store has expired.  The a valid CA certificate may have been sent down with the server’s certificate, but the validation failed.

Action:

  1. From Warning: Potential Security Risk Ahead -> Advanced -> View certificate. It will have the certificate.  Note Issuer -> Organisation and common name
  2. Use Firefox preferences-> view certificates.   Select authorities.  Search for the Organisation from the previous line.  Display the certificate with the matching common name.  Replace it and restart the browser.   Replace the certificate through firefox or use this to locate the directory containing the cert9.db.

Error code: SSL_ERROR_CERTIFICATE_UNKNOWN_ALERT

The backend may get java.security.cert.CertPathValidatorException: signature check failed.

One reason, the certificate being used by firefox was signed by an invalid CA, for example the CA had expired.

Action:

  1. Check Firefox preferences-> certificates, and check “Ask you every time” is selected, repeat the connection and display information about the certificate.  It will give you the issuer, but no more information than that.
  2. Regenerate the certificate, import into Firefox, restart Firefox.

Chrome errors

Chrome has more stricter checks than curl.  These are from Chrome browser.

NET::ERR_CONNECTION_CLOSED

  • mqwebuser.xml serverKeyAlias name not in the keystore
  • The host certificate is self signed and is not trusted
  • The client certificate is self signed and not in the server’s trust store
  • Client certificate missing “extendedKeyUsage = clientAuth”  during signing.
  • CA Signed client certificate has expired
  • Client certificate missing “keyUsage = digitalSignature”  during signing.

NET::ERR_CERT_COMMON_NAME_INVALID

  • missing x509 extensions in the server certificate
  • invalid subjectAltName in x509 extensions, for example IP:127.0.0.11  instead of IP:127.0.0.1

NET::ERR_CERT_INVALID

  • missing extendedKeyUsage = serverAuth in x509 extensions

NET::ERR_CERT_AUTHORITY_INVALID

  • Certificate is not peer.  Need certutil -M $sql -n $name -t “P,,” to change the certificate to be a trusted peer
  • Server’s self signed not found in the browser keystore.
  • The CA from the server does not match the certificate in the browsers’ keystore.  It may have the same name,  but check validity dates, finger prints etc.  Check very carefully.

NET::ERR_CERT_DATE_INVALID

  • The mqweb server certificate has expired.

CWPKI0024E: The certificate alias …  specified by the
property com.ibm.ssl.keyStoreServerAlias is not found in KeyStore safkeyring://…/….

The z/OS certificate is not in the keyring, or it is in the keyring and needs to have TRUST

Make the change, stop and restart the web browser

Firefox:  PR_END_OF_FILE_ERROR

Slow backend server.

MQWEB on z/OS

 CWWKS2932I: The unauthorized version of the SAF user registry is activated.
Authentication will proceed using unauthorized native services.

Check at the top of the message log for.  CWWKB0104I: Authorized service group SAFCRED is not available.

Reason: When the web server was started the SAFCRED service was not available.   This could be caused by security not set up properly.

Fix the security.  For example here

CWWKS2930W: A SAF authentication attempt using authorized SAF services was rejected because the server is not authorized to
authorized to access the APPL-ID MQWEB. Authentication will proceed using unauthorized SAF services.

Problem:  the profile with class(SERVER) and profile(BBG.SECPFX.MQWEB) is missing
Action:  the define profile matching the APPL-ID.

RDEFINE SERVER BBG.SECPFX.MQWEB
PERMIT BBG.SECPFX.MQWEB  CLASS(SERVER) ID(START1) ACC(READ)
SETROPTS RACLIST((SERVER) refresh

Restart MQWEB server.

CWWKS2960W: Cannot create the default credential for SAF authorization of unauthenticated users.

All authorization checks for unauthenticated users will fail.
The default credential could not be created due to the following error:

CWWKS2907E: SAF Service IRRSIA00_CREATE did not succeed because user WSGUEST has insufficient authority to access APPL-ID MQWEB.

SAF return code 0x00000008. RACF return code 0x00000008. RACF reason code 0x00000020.

PERMIT MQWEB CLASS(APPL) ACCESS(READ) ID(MQWSGUEST)
SETROPTS RACLIST(APPL) REFRESH

CWPKI0022E: SSL HANDSHAKE FAILURE:

A signer with SubjectDN CN=colinpaicesECp256r1, O=cpwebuser,
C=GB was sent from the target host. The signer might need to be added to local trust store safkeyring://…/…,
located in SSL configurate on alias defaultSSLConfig.
The extended error message from the SSL handshake exception is:

Unexpected error: java.security. InvalidAlgorithmParameterException:
the trustAnchors parameter must be non-empty

The full error was

CWPKI0022E: SSL HANDSHAKE FAILURE: A signer with Subject DN  CN=colinpaice, O=HW, C=GB was sent from the target host.  The signer might need to be added to local trust store  safkeyring://START1/TRUST, located in SSL configuration alias izuSSLConfig. The extended error message from the SSL  handshake exception is: Unexpected error:  java.security.InvalidAlgorithmParameterException: the  trustAnchors parameter must be non-empt.

The problem was that the started task userid did not have update access to the trust keyring.  There was an FFDC in the log file at startup showing this.  Part of this was I assumed the wrong userid for the started task.  The z/OS Command D A,IZUSVR1 gave me th userid, which I then checked., and found it had no access.

ERROR: SEC_ERROR_REUSED_ISSUER_AND_SERIAL

I got this on a slow backend system.  I shut down the web server and restarted it, and it ran OK without the message.

ICH408I USER( ) GROUP( ) NAME()
DIGITAL CERTIFICATE IS NOT DEFINED. CERTIFICATE SERIAL NUMBER(…)
SUBJECT(CN=.. .O=… C=GB) ISSUER(….)

The certificate came in, but there was no mapping for it.

Use RACDCERT command to map it to a userid.

RACDCERT MAP ID(IBMUSER) –
SDNFILTER(‘CN…. ‘)
SETROPTS RACLIST(DIGTNMAP, DIGTCRIT) REFRESH

Firefox SEC_ERROR_BAD_SIGNATURE

Dont know what caused it.  I deleted the CA and readded it and it worked. 

Others

CWWKO0801E:

Unable to initialize SSL connection. Unauthorized access was denied or security settings have expired. Exception is javax.net.ssl.SSLHandshakeException: no cipher suites in common.

Problem:

There was no serverKeyAlias specified in the <ssl … tag.

CWPKI0024E:

The certificate alias… specified by the property com.ibm.ssl.keyStoreServerAlias is not found in KeyStore safkeyring://…/… .

Problem

  • The certificate was not in the keyring
  • It was NOTRUST
  • It had expired
  • The CA for the certificate was not in the keyring.


mqweb – performance notes

  • I found facilities in Liberty which can improve the performance of your mqweb server by 1% – ish, by using http/2 protocol and ALPN
  • Ive documented where time is spent in the mq rest exchange.

Use of http/2 and ALPN to improve performance.

According to Wikipedia, Application-Layer Protocol Negotiation (ALPN) is a Transport Layer Security (TLS) extension that allows the application layer to negotiate which protocol should be performed over a secure connection in a manner that avoids additional round trips and which is independent of the application-layer protocols. It is needed by secure HTTP/2 connections, which improves the compression of web pages and reduces their latency compared to HTTP/1.x.

mqweb configuration.

This is a liberty web browser configuration, see this page.

For example

 <httpEndpoint id="defaultHttpEndpoint"
   host="${httpHost}" 
   httpPort="${httpPort}"
   httpsPort="${httpsPort}"
   protocolVersion="http/2"
   >
   <httpOptions removeServerHeader="false"/>

</httpEndpoint>

Client configuration

Most web  browsers support this with no additional configuration needed.

With curl you specify ––http2.

With curl, ALPN is enabled by default (as long as curl is built with the ALPN support).

With the curl ––verbose option on a curl request,  you get

  • * ALPN, offering h2 – this tells you that curl has the support for http2.
  • * ALPN, offering http/1.1

and one of

  • * ALPN, server did not agree to a protocol
  • * ALPN, server accepted to use h2

The “* ALPN, server accepted to use h2” says that mqweb is configured for http2.

With pycurl you specify

 c.setopt(pycurl.SSL_ENABLE_ALPN,1)
 c.setopt(pycurl.HTTP_VERSION,pycurl.CURL_HTTP_VERSION_2_0)

Performance test

I did a quick performance test of a pycurl program getting a 1024 byte message (1024 * the character ‘x’) using TLS certificates.

HTTP support Amount of “application data” sent Total data sent.
http/1.1 2414 7151
http/2 2320 7097

So a slight reduction in the number of bytes send when using http/2.

The time to get 10 messages was 55 ms with http/2, and 77ms with http/1.1,  though there was significant variation in repeated measurements, so I would not rely on these measurements.

Where is the time being spent?

cURL and pycurl can report the times from the underlying libcurl package.  See TIMES here.

The times (from the start of the request) are

  • Name lookup
  • Connect
  • Application connect
  • Pre transfer
  • Start transfer
  • Total time

Total time- Start transfer = duration of application data transfer.

Connect duration = Connect Time – Name lookup Time etc.

For a pycurl request getting two messages from a queue the durations were

Duration in microseconds First message Second messages
Name_lookup 4265 32
Connect 53 3
APP Connect 18985 0
Pre Transfer 31 42
Start Transfer 12644 11036
Transfer of application data 264 235

Most of the time is spent setting up the connection, if the same connection can be reused, then the second and successive requests are much faster.

In round numbers, the first message took 50 ms, successive messages took between 10 and 15 ms.

mqweb charts look like “work in progress”

One of the widgets you can add to a dashboard page is a chart.   This can subscribe to the published monitoring data, and displays it as a line chart.

The topics are described in my post  What data is available with the Published Monitoring data, and include

  • Platform central processing units
      • CPU performance – platform wide  …
    • CPU performance – running queue manager …
  • Platform persistent data stores
    • Disk usage – platform wide …
    • Disk usage – running queue managers …
    • Disk usage – queue manager recovery logs …
  • API usage statistics
    • MQCONNs and MQDISCs …
    • etc
  • API per-queue usage statistics
    • MQOPEN and MQCLOSE …
    • etc

The data is published every 10 seconds or so, and the charts are refreshed around every 10 seconds.

These charts seem to be a work in progress or demonstration of technology.

After you’ve added the widget you need to click in the wheel to configure the chart.

You can select the data you want to display from drop downs

  • Select top left, for the “resource class” (major category), top right for the “resource type” (minor category) second row left for the “resource element” (detail), second row right for any object.
  • For those topics that need an object, such as a queue name, you must give  a specific object, such as CP9999, not as CP999*.
  • If you change what is being displayed, you have to select all of the data, for example,  I was collecting API usage statistics, get count per queue, and wanted to collect API usage statistics (on all queues).  I found I was changing the major category, clicking save, and getting the wrong data displayed.
    • Click on the cog
    • Select API resource class: “API usage statistics
    • Resource type: defaults to “MQCONN and MQDISC“, so select “MQGET
    • The Resource element defaults to “Interval total destructive get- count”.  I want this, so select Save.

View finder=select time range

From the cog, you can select or hide the “view finder”.  I would have called “view finder”  “select time range”.  If you “show” view finder, instead of one chart 6 cm high, the main chart is squashed into 5 cm, and there is a 1 cm squashed version below it.  It took me a while to find what this view finder is for.  If you click+hold on the “view finder” graph, and drag left or right,  the mini chart becomes a grey box with a slider.  Drag, right or left,  allows you to select the range which is displayed in the big chart.

  • To reset the window click somewhere else in the view finder chart.
  • As the small chart displays the whole time period available to you, you can drag the slider to an interesting area to allow you to drill into it.
  • You can click and drag the time range bar/gap in the view finder, so you get the same time duration, but at an earlier or later time.
  • As more data is added to the right hand side of the chart,  the time slide moves to the left over time
  • I could not find how to display just the last 5 minutes worth of data, as the window moved.
  • With the <variable name=”ltpaExpiration” value=”30″/> configuration, you get logged off after 30 minutes.  With certificates you get logged on again, but the time interval is reset, and you lose the historical data.  In this case you get no more than 30 minutes worth of data displayed.

Have data for more than one queue manager on a chart

  • If you have more than one queue manager active, you can display data from more than one queue manager.
  • A queue manager has to be active to be able to add it to a chart.   You can then stop the queue manager, and the chart will remember your selection.
  • You can select a colour for each queue manager, so for example have QMA in red, and QMB in blue, on the same chart.
  • On a chart, if you click on the circle in front of a queue manager, the circle changes from solid to empty. Click again and it goes solid.   When circle is solid the data is displayed on the chart, when it is empty, the data for that queue manager is not displayed.
    • The displayed time range may change.
  • If you move your cursor over the chart, a grey line will appear, and jump to data points, so you can see the data at that point.  It only jumped to data for one colour.
  • The number at the top of the y axis is the maximum value  displayed.

Some descriptions could be clearer

  • Some of the descriptions could be clearer, for example “interval total destructive get – count” is clear,  but “Failed MQGET – count”, is presumable for the interval as well.  I think these come from the published data.   (I found it easier to create better descriptions when I processed the published monitoring data)

The data for multiple queue managers may not be synchronised

  • For example “Platform CPU,  CPU performance – platform wide” when I had two queue managers, gave me two lines, which tend to follow each other.   The data is collected at different times (for example 10:00:03, and 10:00:07), so the data is different at each point.
  • The answer is easy – for system wide metrics just select one queue manager.

You can rename charts

  • If you hover over the title, you can click on the pencil to rename the widget.  If you clear the name it resets it to the default.
  • I changed it to “COLINs LAPTOP CPU”, then, later,  changed the chart to disk space usage.  The title “COLINs LAPTOP CPU ” was no longer relevant, so I clicked on the pencil icon, and cleared the title, and got the default chart description back.

Refresh window

  • If you refresh the window, all the charts are reset and historical data may be lost.
  • If a queue manager was stopped and restarted, refreshing the window will cause the subscription for all of the charts in the window to be reissued.

Some “hmm interesting” observations.

  • Some of the y axis data was strange.   When starting to collect some data, I had number of gets 2050, when the number of gets in the last hour was about 5.   This is from the published data.  Since data was published for the queue (over 1 week ago) there were 2050 gets from the queue since them.   The published data reported this value, and reset it.
  • Because I had <variable name=”ltpaExpiration” value=”30″/>, after 30 minutes the windows gets logged off.  Because I was using digital certificates it automatically logs on again.
  • I stopped one queue manager and restarted it.  In some charts, the data for that queue manager stopped being displayed. On other charts it was displayed successfully.
    • You need to go to the settings cog, and click save.   This reissues the subscription.
  • If you click on a the box with an arrow in it (“browse data”), by the cog, you can display the data for that chart.  Select a queue manager from the pull down list.   If you type “a” you can select all of the data – you cannot copy it to the clip board.
    • If you click on the column heading ( Timestamp or Data) you can sort ascending or descending.
  • If you select API stats for a specific queue, it does not display which queue is being displayed.
  • Sometimes data is missing.  I could see a line was missing some data.   Using the “browse data” box, I could see one queue manager had data from 13:14,  another had data from 13:15.   Both queue managers were running while I had my lunch.
  • The chart “MQ trace file system – bytes in use” reported 16KB of data – when I had over 160KB of *.trc data.  If if was for the file system, then it is a very small file system.  I do not understand this metric.

mqweb what’s the difference between the message API and the admin API?

At first glance it looks like the answer is in the question.  You can use

  • the messaging REST API put and get messages
  • the admin REST API to administer queue manager objects

In a couple of places the IBM documentation says you can use the messaging API to administer your objects, which is true at the general sense, but not the specific sense.  Until I hit a problem I thought there was one “messaging REST API” with different flavors of syntax.

Security

The admin API authorisation is managed through <security-role name=”MQWebAdmin”> and <security-role name=”MQWebAdminRO”> sections in the mqwebuser.xml file.

The messaging API authorisation is managed through <security-role name=”MQWebUser”> sections.

Access to resources is done using the Alternate Userid.  I can see in the activity trace that the userid is colinpaice(the id mqweb is running under), but the open of a queue was done with alternate userid testuser.  When I tried to browse messages on a queue, I got a message saying my userid did not have the correct authority. I used setmqaut, and mqsc command refresh security(*) to resolve it.

Cost of the admin interface

The admin interface has a request like

https://127.0.0.1:9443/ibmmq/rest/v1/admin/qmgr/QMA/queue/CP0000?attributes=*

which returns all of the attributes of the queue CP0000.  From the activity trace we can see

  • MQCONN + MQDISC
  • MQOPEN, MQINQ, MQCLOSE of the manager object – twice
  • MQOPEN, MQPUT, MQCLOSE to the SYSTEM.ADMIN.COMMAND.QUEUE
  • MQOPEN, MQGET, MQCLOSE to the SYSTEM.REST.REPLY.QUEUE
  • MQCMIT
  • MQBACK – the JMQI code always does this to be sure that there is no outstanding unit of work,

The most expensive request is the MQCONNect.

Using the admin interface is fine for administration because changes to objects are usually done infrequently.   If you are considering the admin interface to monitor objects, for example plot queue depths over time, the mq rest API may not be the best solution.

Cost of the messaging interface

The messaging API interface uses connection pooling.   When the application does an MQDISC, the connection is returned to a pool, and can be reused if the same userid does an MQCONN.  If the connection is not used for a period, it can be removed from the pool and an MQDISC done to release the connection.    This should eliminate frequent MQCONN and MQDISCs.

From the activity trace we see

MQOPEN, MQGET,MQGET,MQCLOSE of the queue, and no MQCONN.

There will be an MQCONN, is there is no connection available for that userid in the pool, but this should be infrequent.

Python and mq REST api

I found cURL a good way of using the mq REST API, but I wanted to do more.  cURL depends on a package called libcurl, which can be used by other languages.

Python seemed the next obvious place to look.

As I have found out, using digital certificates for authentication is hard to set up, and using signed certificates is even harder.  As I had done the hard work of setting up the certificates before I tried curl and Python, the curl and Python experience was pretty easy.

I looked at using the Python “request” package.   This allows you to specify most of the parameters that libcurl needs, except it does not allow you to specify the password for the user’s keystore.

I then looked at the Python package pycurl package.    This is a slightly lower level API, but got it working in an hour or so.
My whole program is below.

During the testing I got various errors, such as “77”.  These are documented here. 

The messages were clear, for example

CURLE_SSL_CACERT_BADFILE (77) Problem with reading the SSL CA cert (path? access rights?).

Which was enough to tell me where to look.

All the things you can do with curl, you can do with pycurl.

 

# program - based on code in http://pycurl.io/docs/latest/quickstart.html
import sys
import pycurl

from io import BytesIO

# header_function take from http://pycurl.io/docs/latest/quickstart.html
headers = {}
def header_function(header_line):
# HTTP standard specifies that headers are encoded in iso-8859-1.

header_line = header_line.decode('iso-8859-1')

# Header lines include the first status line (HTTP/1.x ...).
# We are going to ignore all lines that don't have a colon in them.
# This will botch headers that are split on multiple lines...
if ':' not in header_line:
  return

# Break the header line into header name and value.
name, value = header_line.split(':', 1)
print("header",name,value)

home = "/home/colinpaice/ssl/ssl2/"
ca=home+"cacert.pem"
cert=home+"testuser.pem"
key=home+"testuser.key.pem"
cookie=home+"cookie.jar.txt"
url="https://127.0.0.1:9443/ibmmq/rest/v1/admin/qmgr/QMA/queue/CP0000?attributes=type"
buffer = BytesIO()
c = pycurl.Curl()
print("C=",c)
try:
  # see option names here https://curl.haxx.se/libcurl/c/curl_easy_setopt.html
  # PycURL option names are derived from libcurl
  # option names by removing the CURLOPT_ prefix. 
  c.setopt(c.URL, url) 
  c.setopt(c.WRITEDATA, buffer) 
  c.setopt(pycurl.CAINFO, ca) 
  c.setopt(pycurl.CAPATH, "") 
  c.setopt(pycurl.SSLKEY, key) 
  c.setopt(pycurl.SSLCERT, cert) 
  c.setopt(pycurl.SSL_ENABLE_ALPN,1)
  c.setopt(pycurl.HTTP_VERSION,pycurl.CURL_HTTP_VERSION_2_0)
  c.setopt(pycurl.COOKIE,cookie) 
  c.setopt(pycurl.COOKIEJAR,cookie) 
  c.setopt(pycurl.SSLKEYPASSWD , "password") 
  c.setopt(c.HEADERFUNCTION, header_function)  
# c.setopt(c.VERBOSE, True)
  c.perform() 
  c.close()
except Exception as e: 
  print("exception :",e ) 
finally: 
  print("done") 
body = buffer.getvalue() # Body is a byte string. 
# We have to know the encoding in order to print it to a text file 
# such as standard output. 
print(body.decode('iso-8859-1'))