mqweb error messages and symptoms of TLS setup problems

I deliberately caused TLS set up errors, and noted the symptoms.  Ive recorded them below; the article is not meant to be read, but indexed by search engines.

There are three sections

  1. Problems with server certificates
  2. Problems with the client certificate
  3. Chrome messages, and possible causes of the problems.

The mqweb messages.log reported problems that the mqweb server saw.   For me this was in file /var/mqm/web/installations/Installation1/servers/mqweb/logs/messages.log

Problems with the server certificate

Problem: mqwebuser.xml serverKeyAlias name not in the keystore

Message log:

  • E CWPKI0024E: The certificate alias mqweb specified by the property com.ibm.ssl.keyStoreServerAlias is not found in KeyStore /home/colinpaice/ssl/ssl2/mqweb.p12.
  • I FFDC1015I: An FFDC Incident has been created: “com.ibm.wsspi.channelfw.exception.ChannelException: java.lang.IllegalArgumentException: CWPKI0024E: The certificate alias mqweb specified by the property com.ibm.ssl.keyStoreServerAlias is not found in KeyStore /home/colinpaice/ssl/ssl2/mqweb.p12. com.ibm.ws.channel.ssl.internal.SSLConnectionLink 238″ at ffdc_….

curl:

* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* curl (35) OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to 127.0.0.1:9443
* stopped the pause stream!
* Closing connection 0

chrome:

This site can’t be reached.  ERR_CONNECTION_CLOSED

Problem:  The host certificate is self signed and not in the client keystore

Message log:

Nothing.

curl:

* TLSv1.2 (OUT), TLS alert, Server hello (2):
* SSL certificate problem: self signed certificate
* stopped the pause stream!
* Closing connection 0
curl: (60) SSL certificate problem: self signed certificate

Chrome: in browser

NET::ERR_CERT_AUTHORITY_INVALID

Click on the Not Secure in the url, to display the certificate which was sent down.

Chrome log:

ERROR:cert_verify_proc_nss.cc(1011)] CERT_PKIXVerifyCert for localhost failed err=-8179

From here  -8179 is Peer’s certificate issuer is not recognized.

Problem: curl: The host certificate is self signed and you use the –insecure option

curl

* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES256-GCM-SHA384
* ALPN, server did not agree to a protocol
* Server certificate:
* subject: C=GB; O=aaaa; CN=testuser
* start date: Jan 20 17:39:37 2020 GMT
* expire date: Feb 19 17:39:37 2020 GMT
* issuer: C=GB; O=aaaa; CN=testuser
* SSL certificate verify result: self signed certificate (18), continuing anyway.

Problem: Chrome:  The host certificate is self signed and is not trusted

Chrome browser

This site can’t be reached
localhost unexpectedly closed the connection.
ERR_CONNECTION_CLOSED

Debugging

  • I could find nothing that told me what certificate was being used.  The Chrome network trace just gave “net_error = -100 (ERR_CONNECTION_CLOSED)“.
  • Use certutil -L $sql  to list the contents of your browsers keystore.   The certificate needs “P,…” permissions.
  •  Or use the chrome url chrome://settings/certificates  and display “your certificates”. Pick the likely one, if it says “UNTRUSTED” then this may be the problem.   View the certificate, and check it, for example under details, there may be a comment describing its use.
  •  Defined the server certificate as trusted using certutil -M $sql -n name -t “P,,” 
  • Restart the web browser.

Problem: The  CA signer server certificate had the wrong subjectAltName

curl:

* subjectAltName does not match 127.0.0.1
* SSL: no alternative certificate subject name matches target host name ‘127.0.0.1’

Chrome:

NET::ERR_CERT_COMMON_NAME_INVALID
From the “Not Secure” in front of the URL, display the certificate, and check the extenstions, especially Certificate Subject Alternative Names.

Chrome log:

ERROR:ssl_client_socket_impl.cc(935)] handshake failed; returned -1, SSL error code 1, net_error -200
From here -200 is  CERT_INVALID

Problem: The mqweb server certifcate has expired

curl:

* TLSv1.2 (OUT), TLS alert, Server hello (2):
* SSL certificate problem: certificate has expired
curl: (60) SSL certificate problem: certificate has expired

chrome:

while Chrome running:   web page reports Lost communication with the server.  Could not establish communication with the server. Check your network connections and refresh your browser

restart browser, get “Your connection is not private NET::ERR_CERT_DATE_INVALID”

message.log.  Chrome session was working, then server certificate expired

  • E CWWKO0801E: Unable to initialize SSL connection. Unauthorized access was denied or security settings have expired. Exception is javax.net.ssl.SSLException: Received fatal alert: certificate_unknown

Problem: The mqweb server certificate is missing extendedKeyUsage = serverAuth

curl:

* SSL certificate problem: unsupported certificate purpose
curl: (60) SSL certificate problem: unsupported certificate purpose

Chrome:

Your connection is not private
Attackers might be trying to steal your information from localhost (for example, passwords, messages or credit cards).
NET::ERR_CERT_INVALID

Chrome log:

CERT_PKIXVerifyCert for localhost failed err=-8101
From here  -8101 is Certificate type not approved for application.

ERROR:ssl_client_socket_impl.cc(935)] handshake failed; returned -1, SSL error code 1, net_error -207
From here -207 is CERT_INVALID

Problems with the client certificate

Problem: The client certificate is self signed and not in the server’s trust store

curl:

* TLSv1.2 (OUT), TLS handshake, Finished (20):
* OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to 127.0.0.1:9443

Chrome:

ERR_CONNECTION_CLOSED

Messages in messages.log:

  • I FFDC1015I: An FFDC Incident has been created: “java.security.cert.CertPathBuilderException: unable to find valid certification path to requested target com.ibm.ws.ssl.core.WSX509TrustManager checkClientTrusted” at ffdc_20.01.30_08.29.27.0.log
  •  E CWPKI0022E: SSL HANDSHAKE FAILURE: A signer with SubjectDN CN=testuser, O=aaaa, C=GB was sent from the target host. The signer might need to be added to local trust store /home/colinpaice/ssl/ssl2/trust.jks, located in SSL configuration alias defaultSSLConfig. The extended error message from the SSL handshake exception is: PKIX path building failed: java.security.cert.CertPathBuilderException: unable to find valid certification path to requested target
  •  I FFDC1015I: An FFDC Incident has been created: “java.security.cert.CertificateException: unable to find valid certification path to requested target com.ibm.ws.ssl.core.WSX509TrustManager checkClientTrusted” at ffdc_20.01.30_08.29.27.1.log
  • E CWWKO0801E: Unable to initialize SSL connection. Unauthorized access was denied or security settings have expired. Exception is javax.net.ssl.SSLHandshakeException: null cert chain

Problem: Invalid cn=, the cn value is not a valid userid.

curl message

{“error”: [{

  • “action”: “Provide credentials using a client certificate, LTPA security token, or username and password via HTTP basic authentication header. On z/OS, if the mqweb server has been configured for SAF authentication, check the messages.log file for messages indicating that SAF authentication is not available. Start the Liberty angel process if it is not already running. You might need to restart the mqweb server for any changes to take effect.”,
  • “completionCode”: 0,
  •  “explanation”: “The REST API request cannot be completed because credentials were omitted from the request. On z/OS, if the mqweb server has been configured for SAF authentication, this can be caused by the Liberty angel process not being active.”,
  • “message”: “MQWB0104E: The REST API request to ‘https://127.0.0.1:9443/ibmmq/rest/v1/login ‘ is not authenticated.”,
  • “msgId”: “MQWB0104E”,
  • “reasonCode”: 0,
  • “type”: “rest”

chrome:

It gives you a window to enter userid and password.   This looks like a bug as I have <webAppSecurity allowFailOverToBasicAuth=”false”/>.  It takes the userid and password.

Messages in  messages.log:

R com.ibm.websphere.security.CertificateMapFailedException
and 100 lines of stack trace

The certificate causing the problems, nor the userid is listed – so pretty useless.

Problem: Client certificate missing “extendedKeyUsage = clientAuth”  during signing.

curl message

* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
curl session hangs…
* Operation timed out after 300506 milliseconds with 0 out of 0 bytes received

Chrome

ERR_CONNECTION_CLOSED

message in messages.log:

  • E CWPKI0022E: SSL HANDSHAKE FAILURE: A signer with SubjectDN CN=colinpaice, O=cpwebuser, C=GB was sent from the target host. The signer might need to be added to local trust store /home/colinpaice/ssl/ssl2/trust.jks, located in SSL configuration alias defaultSSLConfig. The extended error message from the SSL handshake exception is: Extended key usage does not permit use for TLS client authentication
  •  I FFDC1015I: An FFDC Incident has been created: “java.lang.NullPointerException com.ibm.ws.ssl.core.WSX509TrustManager checkClientTrusted” at ffdc_20.01.28_17.11.10.1.log

ffdc in /var/mqm/web/installations/Installation1/servers/mqweb/logs/messages.log/ffdc

Exception = java.lang.NullPointerException
Source = com.ibm.ws.ssl.core.WSX509TrustManager
probeid = checkClientTrusted
Stack Dump = java.lang.NullPointerException
at com.ibm.ws.ssl.core.WSX509TrustManager.checkClientTrusted(WSX509TrustManager.java:202)

Problem: Client certificate missing “keyUsage = digitalSignature”  during signing.

curl message

* TLSv1.2 (OUT), TLS handshake, Finished (20):
* Operation timed out after 300509 milliseconds with 0 out of 0 bytes received

message in messages.log

  • E CWPKI0022E: SSL HANDSHAKE FAILURE: A signer with SubjectDN CN=colinpaice, O=cpwebuser, C=GB was sent from the target host. The signer might need to be added to local trust store /home/colinpaice/ssl/ssl2/trust.jks, located in SSL configuration alias defaultSSLConfig. The extended error message from the SSL handshake exception is: KeyUsage does not allow digital signatures
  • FFDC1015I: An FFDC Incident has been created: “java.lang.NullPointerException com.ibm.ws.ssl.core.WSX509TrustManager checkClientTrusted”
  • E CWWKO0801E: Unable to initialize SSL connection. Unauthorized access was denied or security settings have expired. Exception is javax.net.ssl.SSLHandshakeException: null cert chain

ffdc in /var/mqm/web/installations/Installation1/servers/mqweb/logs/messages.log/ffdc

Exception = java.lang.NullPointerException
Source = com.ibm.ws.ssl.core.WSX509TrustManager
probeid = checkClientTrusted
Stack Dump = java.lang.NullPointerException
at com.ibm.ws.ssl.core.WSX509TrustManager.checkClientTrusted(WSX509TrustManager.java:202)

Chrome:

  • If there is one or more certificates in the keystore, the list of valid certificates does not include the problem one.
  • If there is only the problem certificate in the keystore, you get
    This site can’t be reached.
    localhost unexpectedly closed the connection.
    ERR_CONNECTION_CLOSED

CA Signed client certificate has expired

curl:

* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to 127.0.0.1:9443
* stopped the pause stream!
* Closing connection 0

Chrome:

This site can’t be reached
localhost unexpectedly closed the connection.
ERR_CONNECTION_CLOSED

message in messages.log:

for curl.

  • I FFDC1015I: An FFDC Incident has been created: “java.security.cert.CertPathValidatorException: The certificate expired at Thu Jan 30 16:46:00 GMT 2020; internal cause is:
    java.security.cert.CertificateExpiredException: NotAfter: Thu Jan 30 16:46:00 GMT 2020 com.ibm.ws.ssl.core.WSX509TrustManager checkClientTrusted” at ffdc_20.01.30_17.16.11.0.log
  • E CWPKI0022E: SSL HANDSHAKE FAILURE: A signer with SubjectDN CN=colinpaice, O=cpwebuser, C=GB was sent from the target host. The signer might need to be added to local trust store /home/colinpaice/ssl/ssl2/trust.jks, located in SSL configuration alias defaultSSLConfig. The extended error message from the SSL handshake exception is: PKIX path validation failed: java.security.cert.CertPathValidatorException: The certificate expired at Thu Jan 30 16:46:00 GMT 2020; internal cause is:
    java.security.cert.CertificateExpiredException: NotAfter: Thu Jan 30 16:46:00 GMT 2020
  •  I FFDC1015I: An FFDC Incident has been created: “java.security.cert.CertificateException: The certificate expired at Thu Jan 30 16:46:00 GMT 2020 com.ibm.ws.ssl.core.WSX509TrustManager checkClientTrusted” at ffdc_20.01.30_17.16.11.1.log

for chrome:

  • I FFDC1015I: An FFDC Incident has been created: “java.security.cert.CertificateException: The cer
    tificate expired at Thu Jan 30 16:46:00 GMT 2020 com.ibm.ws.ssl.core.WSX509TrustManager checkClientTrusted” at ffdc_20.01.30_17.16.11.1.log
  • E CWWKO0801E: Unable to initialize SSL connection. Unauthorized access was denied or security settings have expired. Exception is javax.net.ssl.SSLHandshakeException: null cert chain

Bad requests

HTTP request was issued – it should have been HTTPS

curl:

curl:(52) Empty reply from server

messages.log:

E CWWKO0801E: Unable to initialize SSL connection. Unauthorized access was denied or security settings have expired. Exception is javax.net.ssl.SSLException: Unrecognized SSL message, plaintext connection?

 

Chrome errors

Chrome has more stricter checks than curl.  These are from Chrome browser.

NET::ERR_CONNECTION_CLOSED

  • mqwebuser.xml serverKeyAlias name not in the keystore
  • The host certificate is self signed and is not trusted
  • The client certificate is self signed and not in the server’s trust store
  • Client certificate missing “extendedKeyUsage = clientAuth”  during signing.
  • CA Signed client certificate has expired
  • Client certificate missing “keyUsage = digitalSignature”  during signing.

NET::ERR_CERT_COMMON_NAME_INVALID

  • missing x509 extensions in the server certificate
  • invalid subjectAltName in x509 extensions, for example IP:127.0.0.11  instead of IP:127.0.0.1

NET::ERR_CERT_INVALID

  • missing extendedKeyUsage = serverAuth in x509 extensions

NET::ERR_CERT_AUTHORITY_INVALID

  • Certificate is not peer.  Need certutil -M $sql -n $name -t “P,,” to change the certificate to be a trusted peer
  • Server’s self signed not found in the browser keystore.

NET::ERR_CERT_DATE_INVALID

  • The mqweb server certifcate has expired

mqweb – what to do when you cannot get TLS to work?

It is hard to debug setup problems in mqweb.   I found it easiest to not use the mqweb trace, but diagnose problems from the client side.

You need to understand many TLS concepts.  I’ve documented a lot of information here: Understanding the TLS concepts for using certificates to authenticate in mqweb.

I found the easiest way to debug my mqconsole TLS setup, was to use extract the certificates from my browser’s key store and use curl’s verbose, or trace functions.   I’ve documented here how to get a Chrome trace.

I caused all of the common “user errors” and have documented the messages or symptoms I got, these are in this post.

Have you tried turning it off and on again?

The first thing you need to do if you have problems when you are configuring certificates is to restart mqweb, and your browser.   This is because updates to the keystores are not picked up till the mqweb or browser is restarted.  The Chrome and Firefox browsers, remember the certificate used, and logon this on again – so restart the browser to reset every thing.  With Chrome, I set up a bookmark url chrome://restart .

Once you have set up your first connection,  you should not need to change the mqweb server, as you will have set up the mqweb server certificate, and the CA certificate(s) to certify clients.  If you are using self signed,  you will have to import the SS certificate into the trust store, and restart the mqweb server (not good for high availability).

I found if I started chrome from a command window, instead of clicking on an icon, I got out some diagnostic messages to the command window.   These messages were slightly more useful than generic messages like “NET::ERR_CERT_AUTHORITY_INVALID”

Useful Chrome urls

  • chrome://restart
  • chrome://settings/certificates
  • chrome://net-export/ – for collecting a Chrome trace

Getting started

If you are using .pem files (for example openssl) you can use these with no further work.

If you have a .p12 (pkcs12) format keystore, you can use this with no further work.

If you are using a browser with its nssdb database, you need to extract the certificate and private key, and any CA certificates you use.  It is easy to extract a certificate and key  into a .p12 keystore.

Extract the certificate and private key from your browser’s keystore

Curl can use the browser’s key store directly if it has been compiled with NSS (instead of openssl).  “Curl -V”, built with openssl gave me “libcurl/7.58.0 OpenSSL/1.1.1″, someone else’s curl, built with NSS had “libcurl/7.19.7 NSS/3.14.3.0″.  If you do not have curl with NSS support you need to extract the certificate and key from the browsers keystore.

  • Check where your Chrome profile is.  In the Chrome browser, use the url chrome://version .   On one Chrome instance this was  /home/colinpaice/snap/chromium/986/.pki/nssdb .  On a different Chrome instance, the keystore was /home/colinpaice/.pki/nssdb .
  • Export your certificate and keystore
    • pk12util -o colinpaicex.p12 -d sql:/home/colinpaice/snap/chromium/986/.pki/nssdb/ -n colinpaice -W password
    • pk12util – invoke this program
    • -o colinpaicex.p12  – create this pkcs12 store
    • -d sql:/home/colinpaice/snap/chromium/986/.pki/nssdb/  – from this repository
    • -n colinpaice  – with this name
    • -W password  – and give it this password
  • If you have created your own certificate authority, you need to extract the certificate if you do not already have it.  Firstly list the contents to remind yourself what the CA certificate is called, then extract the certificate (‘myCACert’ in my case)
    • certutil -d sql:/home/colinpaice/snap/chromium/986/.pki/nssdb/ -L
      • This gives “Certificate Nickname ” and “Trust Attributes”.   Your CA should have a trust Attribute of “C”.
    • certutil -d sql:/home/colinpaice/snap/chromium/986/.pki/nssdb/ -L -n “myCACert” -a >outcacert.pem
    • certutil – this program
    • -d sql:/home/colinpaice/snap/chromium/986/.pki/nssdb/ – this key store
    • -L  – list
    • -n “myCACert”  – this name
    • -a – ASCII output
    • >outcacert.pem  – create this file

Issue the curl request

You can use the .p12 file, or the certificate.pem and the key.pem file

Example output

If you use the option — verbose  you get a lot of information for example, a successful request has

  • * Trying 127.0.0.1…
  • * TCP_NODELAY set
  • * ALPN, offering h2
  • * ALPN, offering http/1.1
  • * successfully set certificate verify locations:
  • * CAfile: ./outcacert.pem
  • CApath: /etc/ssl/certs
  • * TLSv1.3 (OUT), TLS handshake, Client hello (1):
  • * TLSv1.2 (IN), TLS handshake, Certificate (11):
  • * TLSv1.2 (IN), TLS handshake, Server key exchange (12):
  • * TLSv1.2 (IN), TLS handshake, Request CERT (13):
  • * TLSv1.2 (IN), TLS handshake, Server finished (14):
  • * TLSv1.2 (OUT), TLS handshake, Certificate (11):
  • * TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
  • * TLSv1.2 (OUT), TLS handshake, CERT verify (15):
  • * TLSv1.2 (OUT), TLS change cipher, Client hello (1):
  • * TLSv1.2 (OUT), TLS handshake, Finished (20):
  • * TLSv1.2 (IN), TLS handshake, Finished (20):
  • * SSL connection using TLSv1.2 / ECDHE-RSA-AES256-GCM-SHA384
  • * ALPN, server did not agree to a protocol
  • * Server certificate:
  • *   subject: C=GB; O=cpwebuser; CN=mqweb5
  • *   start date: Jan 20 17:53:59 2020 GMT
  • *   expire date: Oct 16 17:53:59 2022 GMT
  • *   subjectAltName: host “127.0.0.1” matched cert’s IP address!
  • *   issuer: C=GB; O=SSS; OU=CA; CN=SSCA7
  • *  SSL certificate verify ok.
  • > GET /ibmmq/rest/v1/admin/qmgr/QMA/queue/CP0000?attributes=*&status=* HTTP/1.1
  • > Host: 127.0.0.1:9443

See here for an overview of the TLS handshake.   The amount of progress down the list of steps in the hand shake give you a clue as to where the problem may be.  If it is around “TLS handshake, Client Hello (1)”.  This is likely to be a problem with the server certificate.

The numbers as in TLS handshake, CERT verify (15): are the id number of the request, 15 is CERT verify.

A “Finished” message is always sent immediately after a change cipher spec message to verify that the key exchange and authentication processes were successful.  More checks are done after this.

If you use ‑‑trace filename.txt instead of ‑‑verbose you get the same data as displayed as with ‑‑verbose, plus the data flowing up and down the connection.  I found ‑‑verbose had sufficient details to resolve the problems.

mqweb – how to get a chrome browser trace

How to get a chrome trace

See Troubleshooting Chrome network issues  and the description here on how to collect a trace.

  • Open a tab with the chrome://net-export/ url.
  • Click start logging to disk
  • Select a file location
  • In another tab select the mqweb url
  • Click on the “stop” button in the window
  • If you select show file – it opens the json file.   This has all the information you need to process the file, but it is much easier to use the provided tools
  • The filename is given for example “FILE: /home/colinpaice/Downloads/chrome-net-export-log.json
  • Click on “The log file can be loaded using the netlog_viewer.” link.   This gets you to a page which says
  • This app loads NetLog files generated by Chromium’s chrome://net-export. Log data is processed and visualized entirely on the client side (your browser). Data is never uploaded to a remote endpoint.
  • Select  https://netlog-viewer.appspot.com/ to invoke the formatter.
  • Drag your netlog file, or use “choose file”
  • Select events, and this displays all of the traffic
  • In the search bar at the top enter your port 9443, or error
  • You get a list like
  • NONE HOST_RESOLVER_IMPL_REQUEST
    1083 URL_REQUEST https://127.0.0.1:9443/ibmmq/console/
    1084 DISK_CACHE_ENTRY
    1085 HTTP_STREAM_JOB_CONTROLLER https://127.0.0.1:9443/
    1086 HTTP_STREAM_JOB https://127.0.0.1:9443/
  • If the background  is pale green – it is good.  If it is pink (pale red) there was a problem.
  • Click on a line and it displays trace information in a window.  For example the first URL_REQUEST gave
    • t= 8 [st= 8]        HTTP_STREAM_JOB_CONTROLLER_BOUND
                          --> source_dependency = 1089 (HTTP_STREAM_JOB_CONTROLLER)
      t=65 [st=65]        HTTP_STREAM_REQUEST_BOUND_TO_JOB
                          --> source_dependency = 1090 (HTTP_STREAM_JOB)
      t=65 [st=65]     -HTTP_STREAM_REQUEST
      t=65 [st=65]      URL_REQUEST_DELEGATE_SSL_CERTIFICATE_ERROR  [dt=1]
      t=66 [st=66]      CANCELLED
                        --> net_error = -200 (ERR_CERT_COMMON_NAME_INVALID)
      t=66 [st=66]   -URL_REQUEST_START_JOB
                      --> net_error = -200 (ERR_CERT_COMMON_NAME_INVALID)
      t=66 [st=66]    URL_REQUEST_DELEGATE_RESPONSE_STARTED  [dt=0]
      t=66 [st=66] -REQUEST_ALIVE
      
    • SSL_CONNECT_JOB gave me
      1087: SSL_CONNECT_JOB
      ssl/127.0.0.1:9443
      Start Time: 2020-01-29 08:41:25.699
      t= 1 [st= 0] +CONNECT_JOB  [dt=64]
      t= 1 [st= 0]    SOCKET_POOL_CONNECT_JOB_CREATED
                      --> backup_job = false
                      --> group_id = "ssl/127.0.0.1:9443"
      t= 1 [st= 0]   +SSL_CONNECT_JOB_CONNECT  [dt=64]
      t= 1 [st= 0]     +TRANSPORT_CONNECT_JOB_CONNECT  [dt=0]
      t= 1 [st= 0]        HOST_RESOLVER_IMPL_REQUEST  [dt=0]
                          --> address_family = 0
                          --> allow_cached_response = true
                          --> host = "127.0.0.1:9443"
                          --> is_speculative = false
      t= 1 [st= 0]        CONNECT_JOB_SET_SOCKET
      t= 1 [st= 0]     -TRANSPORT_CONNECT_JOB_CONNECT
      t=65 [st=64]      CONNECT_JOB_SET_SOCKET
      t=65 [st=64]   -SSL_CONNECT_JOB_CONNECT
                      --> net_error = -200 (ERR_CERT_COMMON_NAME_INVALID)
      t=65 [st=64] -CONNECT_JOB
      

Understanding Chromium trace and performance data

I found this link very useful to explain the developer information, such as trace, performance etc.

mqweb – performance notes

  • I found facilities in Liberty which can improve the performance of your mqweb server by 1% – ish, by using http/2 protocol and ALPN
  • Ive documented where time is spent in the mq rest exchange.

Use of http/2 and ALPN to improve performance.

According to Wikipedia, Application-Layer Protocol Negotiation (ALPN) is a Transport Layer Security (TLS) extension that allows the application layer to negotiate which protocol should be performed over a secure connection in a manner that avoids additional round trips and which is independent of the application-layer protocols. It is needed by secure HTTP/2 connections, which improves the compression of web pages and reduces their latency compared to HTTP/1.x.

mqweb configuration.

This is a liberty web browser configuration, see this page.

For example

 <httpEndpoint id="defaultHttpEndpoint"
   host="${httpHost}" 
   httpPort="${httpPort}"
   httpsPort="${httpsPort}"
   protocolVersion="http/2"
   >
   <httpOptions removeServerHeader="false"/>

</httpEndpoint>

Client configuration

Most web  browsers support this with no additional configuration needed.

With curl you specify ––http2.

With curl, ALPN is enabled by default (as long as curl is built with the ALPN support).

With the curl ––verbose option on a curl request,  you get

  • * ALPN, offering h2 – this tells you that curl has the support for http2.
  • * ALPN, offering http/1.1

and one of

  • * ALPN, server did not agree to a protocol
  • * ALPN, server accepted to use h2

The “* ALPN, server accepted to use h2” says that mqweb is configured for http2.

With pycurl you specify

 c.setopt(pycurl.SSL_ENABLE_ALPN,1)
 c.setopt(pycurl.HTTP_VERSION,pycurl.CURL_HTTP_VERSION_2_0)

Performance test

I did a quick performance test of a pycurl program getting a 1024 byte message (1024 * the character ‘x’) using TLS certificates.

HTTP support Amount of “application data” sent Total data sent.
http/1.1 2414 7151
http/2 2320 7097

So a slight reduction in the number of bytes send when using http/2.

The time to get 10 messages was 55 ms with http/2, and 77ms with http/1.1,  though there was significant variation in repeated measurements, so I would not rely on these measurements.

Where is the time being spent?

cURL and pycurl can report the times from the underlying libcurl package.  See TIMES here.

The times (from the start of the request) are

  • Name lookup
  • Connect
  • Application connect
  • Pre transfer
  • Start transfer
  • Total time

Total time- Start transfer = duration of application data transfer.

Connect duration = Connect Time – Name lookup Time etc.

For a pycurl request getting two messages from a queue the durations were

Duration in microseconds First message Second messages
Name_lookup 4265 32
Connect 53 3
APP Connect 18985 0
Pre Transfer 31 42
Start Transfer 12644 11036
Transfer of application data 264 235

Most of the time is spent setting up the connection, if the same connection can be reused, then the second and successive requests are much faster.

In round numbers, the first message took 50 ms, successive messages took between 10 and 15 ms.

Rant: I find the IBM Knowledge center on the web runs like a dog with a wooden leg

While playing with the mqweb stuff, I found I was searching for materials on mqweb in the IBM knowledge center.    I got fed up with it being so slow, so I’ve spent some time looking into it.  The slowness may be due to “performance code”  within the page which measures how slowly it goes.  We had a basset hound who had one of its front legs in plaster, and the display of the web pages reminds me of how it used to run.

It is so bad, I see the picture stuttering as it build up.

  • I see the blue header
  • then “Do you want to” which finally ends up at the bottom of the screen.
  • table of contents on the left hand side
  • the page with the content on it appears
  • finally the banner saying “free trial.   Try RESTful APIs to and from your IBM Z mainframe”.

This banner is annoying – I cannot  get rid of it.  It takes up 2cm out of the 15 cm space in my browser – that’s 13% of the real estate!  I keep being asked to give comments on the web site… I do, but I think any comments are being ignored.

I compared the IBM site with the BBC, which has lots of coloured image files,  using the “lighthouse” capability within the Chrome browser.

Site First meaningful paint, seconds Time to interactive, seconds
IBM 9.1 KC page q132130_.htm 0.6 5.8
BBC news page with lots of images 0.3 1.5

Wow, 5.8 seconds – even worse than I thought!

With my broadband, I get download speed of about 53 Mb/Second and upload about 17 Mb/Second.  Ping took about 30 ms to both IBM and to the BBC.  We are on an island, north of Scotland, so I think our response time is typical.

How did I get this data?

In Google Chrome,  Ctrl-shift I, select the Audits tab,  type your URL at the top, press enter

Select “desktop”, Performance, No Throttling.

Click on “Run Audits”.  It runs for a few seconds and stops.

There is a lot of good information.

If you click on “view trace” button, then at the bottom you get a summary chart.

  •      93 ms Loading
  • 3419 ms Scripting
  •   321 ms  Rendering
  •     31 ms Painting
  •  885 ms System

So most of the time is spent scripting!

What sites are used?

I took the trace file, extracted the records with “url” and counted the occurrences.

  • 7357 1.www.s81c.com – an IBM site
  • 5347 http://www.ibm.com
  •   240 tags.tiqcdn.com – Tealium enterprise tag management and marketing software.
  •     42 consent.trustarc.com – TrustArc Cookie Consent Manager
  •    34 9j ?
  •     25 consent.truste.com – TrustArc Cookie Consent Manager
  •     13 consent-st.trustarc.com – TrustArc Cookie Consent Manager
  •    12 js.logentries.com –  Live Log Management and Analytics
  •      7 mapvip.podc.sl.edst.ibm.com
  •      3 www-api.ibm.com
  •      3 idaas.iam.ibm.com

And there was me thinking that the knowlegde center was like a fat pipe squirting down the data, when in fact it sends data down a drop at a time. It also tells other sites what you are looking at.

You can use the “source” tab, and explore all the files which were downloaded.  For example  there is the >V9.1.0  jpg file, along with .js and .css files used in formatting.

What are the top use java script files?

There seem to be a couple of hot java script files, taking over 2 seconds.  (on http://www.s81c.com file  js/www.js… )  The text inside the files begin with IBMPerformance…   I think that a hot function within this,  is the time function, so maybe this code is timing every thing it does, and so slowing it down.

What helps me?

This link explains how to understand the trace and performance data from Chrome.

 

mqweb charts look like “work in progress”

One of the widgets you can add to a dashboard page is a chart.   This can subscribe to the published monitoring data, and displays it as a line chart.

The topics are described in my post  What data is available with the Published Monitoring data, and include

  • Platform central processing units
      • CPU performance – platform wide  …
    • CPU performance – running queue manager …
  • Platform persistent data stores
    • Disk usage – platform wide …
    • Disk usage – running queue managers …
    • Disk usage – queue manager recovery logs …
  • API usage statistics
    • MQCONNs and MQDISCs …
    • etc
  • API per-queue usage statistics
    • MQOPEN and MQCLOSE …
    • etc

The data is published every 10 seconds or so, and the charts are refreshed around every 10 seconds.

These charts seem to be a work in progress or demonstration of technology.

After you’ve added the widget you need to click in the wheel to configure the chart.

You can select the data you want to display from drop downs

  • Select top left, for the “resource class” (major category), top right for the “resource type” (minor category) second row left for the “resource element” (detail), second row right for any object.
  • For those topics that need an object, such as a queue name, you must give  a specific object, such as CP9999, not as CP999*.
  • If you change what is being displayed, you have to select all of the data, for example,  I was collecting API usage statistics, get count per queue, and wanted to collect API usage statistics (on all queues).  I found I was changing the major category, clicking save, and getting the wrong data displayed.
    • Click on the cog
    • Select API resource class: “API usage statistics
    • Resource type: defaults to “MQCONN and MQDISC“, so select “MQGET
    • The Resource element defaults to “Interval total destructive get- count”.  I want this, so select Save.

View finder=select time range

From the cog, you can select or hide the “view finder”.  I would have called “view finder”  “select time range”.  If you “show” view finder, instead of one chart 6 cm high, the main chart is squashed into 5 cm, and there is a 1 cm squashed version below it.  It took me a while to find what this view finder is for.  If you click+hold on the “view finder” graph, and drag left or right,  the mini chart becomes a grey box with a slider.  Drag, right or left,  allows you to select the range which is displayed in the big chart.

  • To reset the window click somewhere else in the view finder chart.
  • As the small chart displays the whole time period available to you, you can drag the slider to an interesting area to allow you to drill into it.
  • You can click and drag the time range bar/gap in the view finder, so you get the same time duration, but at an earlier or later time.
  • As more data is added to the right hand side of the chart,  the time slide moves to the left over time
  • I could not find how to display just the last 5 minutes worth of data, as the window moved.
  • With the <variable name=”ltpaExpiration” value=”30″/> configuration, you get logged off after 30 minutes.  With certificates you get logged on again, but the time interval is reset, and you lose the historical data.  In this case you get no more than 30 minutes worth of data displayed.

Have data for more than one queue manager on a chart

  • If you have more than one queue manager active, you can display data from more than one queue manager.
  • A queue manager has to be active to be able to add it to a chart.   You can then stop the queue manager, and the chart will remember your selection.
  • You can select a colour for each queue manager, so for example have QMA in red, and QMB in blue, on the same chart.
  • On a chart, if you click on the circle in front of a queue manager, the circle changes from solid to empty. Click again and it goes solid.   When circle is solid the data is displayed on the chart, when it is empty, the data for that queue manager is not displayed.
    • The displayed time range may change.
  • If you move your cursor over the chart, a grey line will appear, and jump to data points, so you can see the data at that point.  It only jumped to data for one colour.
  • The number at the top of the y axis is the maximum value  displayed.

Some descriptions could be clearer

  • Some of the descriptions could be clearer, for example “interval total destructive get – count” is clear,  but “Failed MQGET – count”, is presumable for the interval as well.  I think these come from the published data.   (I found it easier to create better descriptions when I processed the published monitoring data)

The data for multiple queue managers may not be synchronised

  • For example “Platform CPU,  CPU performance – platform wide” when I had two queue managers, gave me two lines, which tend to follow each other.   The data is collected at different times (for example 10:00:03, and 10:00:07), so the data is different at each point.
  • The answer is easy – for system wide metrics just select one queue manager.

You can rename charts

  • If you hover over the title, you can click on the pencil to rename the widget.  If you clear the name it resets it to the default.
  • I changed it to “COLINs LAPTOP CPU”, then, later,  changed the chart to disk space usage.  The title “COLINs LAPTOP CPU ” was no longer relevant, so I clicked on the pencil icon, and cleared the title, and got the default chart description back.

Refresh window

  • If you refresh the window, all the charts are reset and historical data may be lost.
  • If a queue manager was stopped and restarted, refreshing the window will cause the subscription for all of the charts in the window to be reissued.

Some “hmm interesting” observations.

  • Some of the y axis data was strange.   When starting to collect some data, I had number of gets 2050, when the number of gets in the last hour was about 5.   This is from the published data.  Since data was published for the queue (over 1 week ago) there were 2050 gets from the queue since them.   The published data reported this value, and reset it.
  • Because I had <variable name=”ltpaExpiration” value=”30″/>, after 30 minutes the windows gets logged off.  Because I was using digital certificates it automatically logs on again.
  • I stopped one queue manager and restarted it.  In some charts, the data for that queue manager stopped being displayed. On other charts it was displayed successfully.
    • You need to go to the settings cog, and click save.   This reissues the subscription.
  • If you click on a the box with an arrow in it (“browse data”), by the cog, you can display the data for that chart.  Select a queue manager from the pull down list.   If you type “a” you can select all of the data – you cannot copy it to the clip board.
    • If you click on the column heading ( Timestamp or Data) you can sort ascending or descending.
  • If you select API stats for a specific queue, it does not display which queue is being displayed.
  • Sometimes data is missing.  I could see a line was missing some data.   Using the “browse data” box, I could see one queue manager had data from 13:14,  another had data from 13:15.   Both queue managers were running while I had my lunch.
  • The chart “MQ trace file system – bytes in use” reported 16KB of data – when I had over 160KB of *.trc data.  If if was for the file system, then it is a very small file system.  I do not understand this metric.

mqweb what’s the difference between the message API and the admin API?

At first glance it looks like the answer is in the question.  You can use

  • the messaging REST API put and get messages
  • the admin REST API to administer queue manager objects

In a couple of places the IBM documentation says you can use the messaging API to administer your objects, which is true at the general sense, but not the specific sense.  Until I hit a problem I thought there was one “messaging REST API” with different flavors of syntax.

Security

The admin API authorisation is managed through <security-role name=”MQWebAdmin”> and <security-role name=”MQWebAdminRO”> sections in the mqwebuser.xml file.

The messaging API authorisation is managed through <security-role name=”MQWebUser”> sections.

Access to resources is done using the Alternate Userid.  I can see in the activity trace that the userid is colinpaice(the id mqweb is running under), but the open of a queue was done with alternate userid testuser.  When I tried to browse messages on a queue, I got a message saying my userid did not have the correct authority. I used setmqaut, and mqsc command refresh security(*) to resolve it.

Cost of the admin interface

The admin interface has a request like

https://127.0.0.1:9443/ibmmq/rest/v1/admin/qmgr/QMA/queue/CP0000?attributes=*

which returns all of the attributes of the queue CP0000.  From the activity trace we can see

  • MQCONN + MQDISC
  • MQOPEN, MQINQ, MQCLOSE of the manager object – twice
  • MQOPEN, MQPUT, MQCLOSE to the SYSTEM.ADMIN.COMMAND.QUEUE
  • MQOPEN, MQGET, MQCLOSE to the SYSTEM.REST.REPLY.QUEUE
  • MQCMIT
  • MQBACK – the JMQI code always does this to be sure that there is no outstanding unit of work,

The most expensive request is the MQCONNect.

Using the admin interface is fine for administration because changes to objects are usually done infrequently.   If you are considering the admin interface to monitor objects, for example plot queue depths over time, the mq rest API may not be the best solution.

Cost of the messaging interface

The messaging API interface uses connection pooling.   When the application does an MQDISC, the connection is returned to a pool, and can be reused if the same userid does an MQCONN.  If the connection is not used for a period, it can be removed from the pool and an MQDISC done to release the connection.    This should eliminate frequent MQCONN and MQDISCs.

From the activity trace we see

MQOPEN, MQGET,MQGET,MQCLOSE of the queue, and no MQCONN.

There will be an MQCONN, is there is no connection available for that userid in the pool, but this should be infrequent.