The ccdt file was exactly where I specified it.

I had been successfully using a JSON CCDT file to run runmqsc as a client.  Because of a “.”, it did not work when I used it from elsewhere.

I had a directory of useful shell scripts.  In this directory I had my mqclient.ini with

CHANNELS:
MQReconnectTimeout=30
ReconDelay=(1000,200)(2000,200)(4000,1000)
ChannelDefinitionDirectory=.
ChannelDefinitionFile=ccdt.json

I had another directory with some shell scripts for some other applications.  I used

export MQCLNTCF=/home/colinpaice/… to point to the mqclient.ini file.

When I used runmqsc -c QMA I got

AMQ8118E: IBM MQ queue manager does not exist.

In /var/mqm/errors/AMQERR01.LOG was the error message AMQ9518E: File ‘./ccdt.json’ not found.

This was because I had specified ChannelDefinitionDirectory=. instead of a fully qualified name.  I had wrongly assumed that “.” meant the same directory as the mqclient.ini file.

The moral of this user error is to

  • use a fully qualified directory instead of “.”
  • or omit this parameter, and put the ccdt.json in the default location /var/mqm – but you may not have write access to this directory.
  • and go to the bottom of the class.
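To see why “.” failed: a relative ChannelDefinitionDirectory is resolved against the current working directory of the process, not the directory holding mqclient.ini. A minimal Python sketch of the same resolution (the /tmp directory is just a hypothetical starting point):

```python
import os

# "." in ChannelDefinitionDirectory is resolved against the process's current
# working directory, not the directory containing mqclient.ini.
# Hypothetical illustration: if runmqsc were started from /tmp, "." means /tmp.
os.chdir("/tmp")
ccdt = os.path.abspath(os.path.join(".", "ccdt.json"))
print(ccdt)  # wherever the program was started from, not the ini directory
```

So the same mqclient.ini "worked" or failed depending purely on which directory the command was issued from.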

Do not panic so much if you are still running MQ V8

I had a question from a customer about being unable to use the REST API because they were still on MQ V8.

The announced end of support date for V8 was the end of April 2020.  This gives them no time to upgrade test systems, pre-production, and production.

Do not panic if you haven’t  upgraded yet – because the page says

This gives you a bit of time to get to MQ 9.1 – don’t delay, raise a change request today.

Using Activity Trace to show a picture of which queues and queue managers your application used.

I used the midrange MQ activity trace to show what my simple application was doing: putting a request to a clustered server queue and getting the reply.  As a proof of concept (200 lines of Python), I produced the following chart.

The output is in .png format.  You can also create it as an HTML image, and have the nodes and links as clickable HTML links.

I’ve ignored any SYSTEM.* queues, so the SYSTEM.CLUSTER.TRANSMIT.QUEUE does not appear.

The red arrows show the “high level” flow between queue managers at the “architectural”, hand waving level.

  • The application oemput on QMA did a put to a clustered queue CSERVER; there is an instance of the queue on QMB and QMC.  There is a red line from QMA.oemput to the queue CSERVER on QMB and QMC.
  • The server program, server, running on QMB and QMC put the reply message to queue CP0000 on queue manager QMA.

The blue arrows show puts to the application-specified queue name – even though this may map to the S.C.T.Q.  There are two blue lines from QMA.oemput because one message went to QMC.CSERVER, and another went to QMB.CSERVER.

The yellow lines show the path the message took between queue managers.  The message was put by QMA.oemput to queue CSERVER; under the covers this was put to the SCTQ.  The trace record shows the remote queue manager and queue name: the yellow line links them.

The black line shows a get from the local queue.

The green line is the get from the underlying queue.  For example, suppose I had a reply queue called CP0000, with a QALIAS of QAREPLY.  If the application does a get from QAREPLY, there would be a black line to CP0000, and a green line to QAREPLY.

How did I get this?

I used the midrange activity trace.

On QMA I had in mqat.ini

applicationTrace:
ApplClass=USER # Application type
ApplName=oemp* # Application name (may be wildcarded)
Trace=ON # Activity trace switch for application
ActivityInterval=30 # Time interval between trace messages
ActivityCount=10 # Number of operations between trace msgs
TraceLevel=MEDIUM  # Amount of data traced for each operation
TraceMessageData=0 # Amount of message data traced

I turned on the activity trace using the runmqsc command

ALTER QMGR ACTVTRC(ON)

I ran some workload, and turned the trace off a few seconds later.

I processed the trace data into a json file using

/opt/mqm/samp/bin/amqsevt -m QMA -q SYSTEM.ADMIN.TRACE.ACTIVITY.QUEUE -w 1 -o json > aa.json

I captured the trace on QMB, then on QMC, so I had three files aa.json, bb.json, cc.json.  Although I captured these at different times, I could have collected them all at the same time.

jq is a “sed”-like processor for JSON data.  I used it to process these JSON files and produce one output file which the Python json support can handle.

jq . --slurp aa.json bb.json cc.json  > all.json

The two small Python files are zipped here: AT.

I used the ATJson.py Python script to process the all.json file and extract the key data in the following format:

server,COLIN,127.0.0.1,QMC,Put1,CP0000,SYSTEM.CLUSTER.TRANSMIT.QUEUE,QMC,CP0000,QMA, 400

  • server, the name of the application program
  • COLIN, the channel name, or “Local”
  • 127.0.0.1, the IP address, or “Local”
  • QMC, on this queue manager
  • Put1, the verb
  • CP0000, the name of the object used by the application
  • SYSTEM.CLUSTER.TRANSMIT.QUEUE, the queue actually used, under the covers
  • QMC, which queue manager is the SCTQ on
  • CP0000, the destination (remote) queue name
  • QMA, the destination queue manager
  • 400, the number of times this was used, so 400 puts to this queue.
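As an illustration, a record in this format can be unpacked with a few lines of Python.  The field names are mine, taken from the bullet list above; this is not the actual ATJson.py.

```python
# Field names follow the bullet list above; this is an illustration,
# not the actual ATJson.py.
line = ("server,COLIN,127.0.0.1,QMC,Put1,CP0000,"
        "SYSTEM.CLUSTER.TRANSMIT.QUEUE,QMC,CP0000,QMA, 400")
(appl, channel, ip, qmgr, verb, appl_queue,
 real_queue, xmit_qmgr, dest_queue, dest_qmgr, count) = [
    f.strip() for f in line.split(",")]
print(appl, verb, real_queue, "->", dest_queue, "on", dest_qmgr, int(count))
```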

I had another Python program, Process.py, which took this table and used Python graphviz to draw a graph of the contents.  This produces a file with DOT (graph description language) statements, and uses one of the many DOT programs to draw the chart.
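A sketch of the idea behind Process.py: each row of the table becomes a DOT edge, coloured by line type.  The rows and colours below are hypothetical examples; the real script derives them from the activity trace.

```python
# Each row of the extracted table becomes a DOT edge, coloured by line type.
# The rows below are hypothetical; the real script derives them from the trace.
rows = [
    ("QMA.oemput", "QMB.CSERVER", "blue"),  # put, queue name as the application used it
    ("QMA.oemput", "QMC.CSERVER", "blue"),
    ("QMB.server", "QMA.CP0000", "red"),    # high-level reply flow
]
lines = ["digraph mq {"]
for src, dst, colour in rows:
    lines.append('  "%s" -> "%s" [color=%s];' % (src, dst, colour))
lines.append("}")
dot_text = "\n".join(lines)
print(dot_text)  # feed this to dot, neato, etc. to draw the chart
```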

This shows you what can be done; it is not a turn-key solution, but I am willing to spend a bit of time making it easier to use, so you can automate it.  If so, please send me your activity trace data, and I’ll see what I can do.

How do I use the mid-range MQ accounting data?

Like many topics I look at, this topic seems much harder than I expected.  I thought about this as I walked the long way to the shops, up a steep hill and down the other side (and came back the short way along the flat).  I wondered if it was because I ignore the small problems (like one queue manager and two queues), and always look at the bigger scale problems, for example with hundreds of queue managers and hundreds of queues.

You have to be careful processing and interpreting the data, as it is easy to give an inaccurate picture – see “Be careful” at the bottom of this post.

Here are some of the things I found out about the data.

Enabling accounting trace

  • You need to alter qmgr ACCTQ(ON) ACCTMQI(ON) before starting your applications (and channels).  Enabling it after the program has started running does not capture the data for existing tasks.  You may want to consider enabling accounting, then shutting down and restarting the queue manager to ensure that all channels are capturing data.
  • The interval ACCTINT is a minimum time.  The accounting records are produced after this time, and after some MQ activity.  If there is no MQ activity, no accounting record is produced, so be careful calculating rates, e.g. messages processed a second.
  • Altering ACCTINT does not take effect until after the accounting record has been produced.  For example if you have ACCTINT(1800) and you change it to ACCTINT(60), the “ACCTINT(1800)” will have to expire, produce the records, then the ACCTINT(60) becomes operative.
  • You can use the queue attribute ACCTQ() to disable the queue accounting information for a queue.

The data

  • The MQI accounting has channel and connection information, the queue accounting does not.  This means you cannot tell the end point from just the queue data.
  • The MQI accounting and queue accounting have a common field, the “connectionId”.  It looks like this is made up of the first 12 characters of the queue manager name, and a unique large number (perhaps a time stamp).  If you are using many machines with similar long queue manager names, you may want to use a combination field of machineName.connectionId to make this truly unique.
  • I had an application using a dynamic reply-to queue.  I ran this application 1000 times, so 1000 dynamic queues were used.  When the application put to the server using the dynamic reply queue, the server had a queue record for each dynamic queue.  There were up to 100 queue sections in each queue record, and 11 accounting queue messages for the server (1000 dynamic queues, plus one server input queue).  These were produced at the end of the accounting interval; they all had the same header information, connectionId, start time, etc.  You do not know in advance how many queue records there will be.
  • Compare this to using a clustered queue on a remote queue manager: the server queue accounting record on the remote system had just two queues, the server input queue and the SYSTEM.CLUSTER.TRANSMIT.QUEUE.
    • The cluster receiver channel on the originator’s queue manager had a queue entry for each dynamic queue.
  • In all my testing the MQI record was produced before the queue accounting record for a program.  If you want to merge the MQI and the queue records, save information from the MQI record in a table, keyed with the connectionId.  When the queue records come along, use the same connectionId key to get the connection information and MQI data.
    • You can get rid of the MQI key data from your table when the queue record has fewer than 100 queues.
    • If the queue record has exactly 100 queues, you do not know if this is the middle of a series or the last of the series.  To prevent a storage leak, you may want to store the time within the table, and have a timer periodically delete these entries after a few seconds – or just restart the program once a day.
  • The header part of the data has a “sequenceNumber” field.    This  is usually incremented with every set of records.
  • On the SYSTEM.ADMIN.ACCOUNTING.QUEUE, messages for different program instances can be interleaved, for example client1 MQI, client2 MQI, client1 Queue, client3 MQI, client2 Queue, client1 Queue, client3 Queue.
  • You do not get information about the queue name as used by the application; the record has the queue name as used by the queue manager (which may be the same as the one the application used).  For example, if your program uses a QALIAS -> clustered queue, the queue record will have the remote queue name used: SYSTEM.CLUSTER.TRANSMIT.QUEUE, not what the application used.
    • You can use the activity trace to get a profile of what queues an application uses, and feed this into your processing
  • You do not get information about topics or subscriptions names.
  • You may want to convert connectionName  from 127.0.0.1 format to a domain name.
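The merge described above can be sketched like this.  Field names such as "queues" are illustrative, not the exact amqsevt JSON layout.

```python
# Hedged sketch of the merge: cache MQI records by connectionId, enrich each
# queue record as it arrives. Field names ("queues" etc.) are illustrative,
# not the exact amqsevt JSON layout.
mqi_by_conn = {}

def on_mqi_record(rec):
    mqi_by_conn[rec["connectionId"]] = {
        "channelName": rec.get("channelName", "Local"),
        "connectionName": rec.get("connectionName", "Local"),
    }

def on_queue_record(rec):
    merged = {**rec, **mqi_by_conn.get(rec["connectionId"], {})}
    # fewer than 100 queue sections => last record for this connection,
    # so the cached MQI data can be discarded
    if len(rec.get("queues", [])) < 100:
        mqi_by_conn.pop(rec["connectionId"], None)
    return merged

on_mqi_record({"connectionId": "QMA-0001", "channelName": "COLIN",
               "connectionName": "127.0.0.1"})
merged = on_queue_record({"connectionId": "QMA-0001", "queues": ["CP0000"]})
print(merged["channelName"], merged["connectionName"])  # COLIN 127.0.0.1
```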

Using the data in your enterprise

You need to capture and reduce the data into a usable form.

From a central machine you can use the amqsevt sample over a client channel for each queue manager and output the data in json format.
I used a python script to process this data. For example:

  • Output the data into a file of format yymmdd.hh.machine.queueManager.json.  You can then use a program like jq to take the JSON data for a particular day (or hour of day) and merge the output from all your queue managers into one stream for reporting.
    • You get a new file every day/hour for each queue manager, which allows you to archive or delete old files.
  • Depending on what you want to do with the data, you need to select different fields.  You may not be able to summarise the data by queue name, as you may find that all applications are using clustered queues, and so it is reported as SYSTEM.CLUSTER.TRANSMIT.QUEUE.  You might consider
    • ConnectionName – the IP address where the client came from
    • Channel Name
    • Userid
    • The queue manager the application connected to
    • The queue manager group (see below)
    • The program name
    • Record start time, and record end time
    • The interval of the record – for a client application, this may be 1 second or less.  For a long running server or channel this will be the accounting interval
    • Number of puts that worked.
    • Number of gets that worked
    • Number of gets that failed.
    • Queue name used for puts
    • Queue name used for gets.
  • You can now display interesting things like
    • Over a day, the average number of queue managers used by an application.  Was this just one (a single point of failure), mostly one (an imbalance), or spread across multiple queue managers (good)?
    • Whether an application processed a few messages, ended, and repeated this frequently.  You can review the application and get it to stay connected for longer, avoiding expensive MQCONNs and MQDISCs, and so save CPU.
    • Did an application stay connected to one queue manager all day?  It is good practice to disconnect and reconnect, perhaps hourly, to spread the connections across the available queue managers.
    • You could charge departments for usage based on userid, or program name, and charge on the number or size of messages processed and connections (remembering that an MQCONN and MQDISC are very expensive).
    • You may want to group queue managers  so the queue managers FrontEnd1, FrontEnd2, FrontEnd3 would be grouped as FrontEndALL.   This provides summary information.
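The queue manager grouping idea might be sketched as follows, assuming the group is simply the name with a trailing instance number replaced by “ALL” – your naming convention may differ.

```python
import re

# Assumed convention: queue manager group = name with a trailing instance
# number replaced by "ALL", so FrontEnd1..FrontEnd3 all report as FrontEndALL.
def qmgr_group(name):
    m = re.match(r"(.+?)\d+$", name)
    return m.group(1) + "ALL" if m else name

print(qmgr_group("FrontEnd2"))  # FrontEndALL
print(qmgr_group("QMA"))        # QMA - no trailing number, so no group
```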

Be careful

The time interval of an accounting record can vary.  For example, suppose you are collecting data at hh:15 and hh:45, and your business hours are 0900 to 1700.  If you have no traffic after 1700, and the next MQ request is at 0900 the next day, the accounting record will be from 1645 today to 0901 tomorrow.

  • If you are calculating rates, e.g. messages per second, the rates will be wrong, as the time interval is too long.
  • The start times of an interval may vary from day to day.  Yesterday it may have been 0910 to 0940, today it is 0929 to 0959.  It makes it hard to compare like for like.
  • If you try to partition the data into hourly buckets (0900 to 1000, 1000 to 1100, etc.), a record from 0950 to 1020 is handled by putting a third of the numbers into the 0900-1000 bucket, and the other two thirds into the 1000-1100 bucket.  A record from 1645 today to 0900 tomorrow will then spread the data across the night, and so give a false picture.
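Apportioning a record across hourly buckets pro rata might look like the sketch below.  As noted above, this only makes sense for records whose interval is close to the accounting interval; an overnight record would smear its counts across hours with no real traffic.

```python
from datetime import datetime, timedelta

# Hedged sketch: split one accounting record's message count across hourly
# buckets, pro rata by how much of the interval falls in each hour.
def apportion(start, end, count):
    buckets = {}
    total = (end - start).total_seconds()
    cur = start
    while cur < end:
        hour = cur.replace(minute=0, second=0, microsecond=0)
        slice_end = min(hour + timedelta(hours=1), end)
        frac = (slice_end - cur).total_seconds() / total
        buckets[hour] = buckets.get(hour, 0) + count * frac
        cur = slice_end
    return buckets

# A record covering 09:50 to 10:20 with 300 messages:
b = apportion(datetime(2020, 4, 17, 9, 50), datetime(2020, 4, 17, 10, 20), 300)
for hour in sorted(b):
    print(hour.strftime("%H:%M"), round(b[hour]))  # 09:00 100, 10:00 200
```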

You may want to set your accounting interval to 30 minutes, and restart your servers at 0830, so they are recording on the hour.  At the end of the day, shut down and restart your servers to capture all of the data in the correct time interval bucket.

When is mid-range accounting information produced?

I was using the mid-range accounting information to draw graphs of usage, and I found I was missing some data.

There is a “Collect Accounting” time for every queue every ACCTINT seconds (default 1800 seconds = 30 minutes).  After this time, any MQ activity will cause the accounting record to be produced.  This does not mean you get records every half hour, as you do on z/OS; it means you get records with a minimum interval of 30 minutes for long running tasks.

Setup

I had a server which got from its input queue and put a reply message to the reply-to-queue.

Once a minute, an application started which put messages to this server, got the replies, and ended.

When are the records produced?

Accounting data is produced (if collecting is enabled) when:

  • an MQDISC is issued, either explicitly or implicitly
  • for long running tasks, the accounting record(s) seem to be produced when the current time is past the “Collect Accounting” time and there has been some MQ activity.  For example, there were accounting records for a server at the following times:
    • The queue manager was started at 12:35:51, and the server started soon afterwards
    • 12:36:04 to 13:06:33.   An application put a message to the server queue and got the response back.   This is 27 seconds after the half hour
    • 13:06:33 to 13:36:42  The application had been putting messages to the server and getting the responses back.   This is 6 seconds after the half hour
    • 13:36:42 to 14:29:48 this interval is 57 minutes.  The server did no work from 1400 to 14:29:48 ( as I was having my lunch).  At 14:29:48 a message arrived, and the accounting record was written for the server.
    • 14:29:48 to 15:00:27 during this time messages were being processed, the interval is just over the 30 minutes.

What does this mean?

  • If you want accounting data with an interval “on the half hour”, you need to start your queue manager “just before the half hour”.
  • Data may not be in the time period you expect.  If you have an accounting record produced at 1645, the data collected between 1645 and 1714 may not appear until the first message is processed the next day.  The record has an interval from 16:45 to 09:00:01 the next day.  You may not want to display averages if the interval is longer than 45 minutes.
  • You may want to stop and restart the servers every night to have the accounting data in the correct day.

 

My mid-range accounting is missing data

Like many of the questions I get asked, it started off as a simple question.

“Over a day, what people and machines are using this queue?”  “Simple,” I said.  “Just use the accounting data.”

Problem 1

I had a program which puts a message to a clustered queue on a remote queue manager.  The message flows to the server and a reply is sent back.  I thought I would check it works.  No it didn’t.

The accounting data has my program, but it recorded a message put to the SYSTEM.CLUSTER.TRANSMIT.QUEUE, and a message got from my reply queue, so I cannot tell which queue was used – but I can tell how many bytes were processed.

Problem 2

My application doing just a publish did not show up in the accounting data.

I used the samples amqspub and amqssub to publish and subscribe respectively.

The accounting data had a record for the amqssub, getting from a queue, but there was no record for the publishing application.

As Morag kindly pointed out, you can tell how many publishes you did from the MQI accounting data, but not to which topic.

I’ve started looking at the activity trace, but this looks like a heavyweight solution to a simple question.

Problem 3

Someone else pointed out that if you want to know the channel name, and IP address of the application using a queue, you have to do some extra work.

  • Collect the MQI accounting
  • Process queue accounting and MQI accounting, and use the common field connectionId to merge the data

Using amqsevt to display the accounting data in json format you get

"eventData" : {
"queueMgrName" : "QMC",
"startDate" : "2020-04-17",
"startTime" : "09.43.23",
"endDate" : "2020-04-17",
"endTime" : "10.18.58",
"commandLevel" : 913,
"connectionId" : "414D5143514D432020202020202020201361995E0A24A923",
"sequenceNumber" : 1,
"applName" : "server",
"processId" : 29103,
"threadId" : 11,
"userIdentifier" : "colinpaice",
   "connDate" : "2020-04-17",
   "connTime" : "09.12.16",
   "connectionName" : "127.0.0.1",
   "channelName" : "COLIN",

Where the indented fields at the end (connDate, connTime, connectionName, channelName) are the data only in the MQI accounting information.

It looks like the MQI accounting record is produced before the Queue accounting record.

You will need to save the connectionId key with the connection-specific information.  When the queue accounting record is processed, use the connectionId to look up the information and append it to the queue record.  If many queues are being used, you may get multiple queue accounting records – so you cannot simply delete the connection key and connection-specific information after the first one.

 

What is in a cipher suite name? or how to tell your RSA from your ephemeral

Why do we need stronger encryption?

  • To make keys more resilient to attack you need longer keys.
  • There are newer ways of providing better private keys than just using large prime numbers.  For example using the equation y**2 = a*x**3 + b*x**2 + c*x + d, which you may recognize as a cubic equation, but which comes under the name of Elliptic Curves (EC).  (For some values of a, b, c, d, if you plot the curve it is an ellipse.)  These Elliptic Curves with small keys are harder to crack than RSA with longer keys.  They also use fewer resources during encryption and decryption.
  • Originally public/private certificates were used for both authentication and encryption.  This has the disadvantage that if I monitor your traffic for a year, and then steal your private key (for example from the corporate backups), I can decrypt all of your traffic.  You need to use a technique called Forward Secrecy to prevent this.  This gives assurances that session keys will not be compromised even if the private key of the server is compromised.  With Forward Secrecy
    • You use the public/private key for authentication, and generate a secret for the encryption.  A technique called Diffie-Hellman (DH) can be used to agree a common secret without a man-in-the-middle being able to determine the secret key.  See Wikipedia for a good description.  This is good – but repeated conversations may use the same common secret, and over repeated use people may be able to guess your key.
    • This problem is fixed by using Ephemeral(E) keys, known as one-time keys, which are valid for just one conversation.  A second conversation will get a different secret key.
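A toy illustration of the Diffie-Hellman idea, with deliberately tiny numbers:

```python
# Toy Diffie-Hellman with deliberately tiny numbers (real key exchanges use
# huge primes or elliptic curves): both sides derive the same secret, and
# the secret itself is never sent over the wire.
p, g = 23, 5        # public: a prime modulus and a generator
a, b = 6, 15        # private values, one per side
A = pow(g, a, p)    # sent over the wire
B = pow(g, b, p)    # sent over the wire
shared = pow(B, a, p)            # my private value + their public value
print(shared == pow(A, b, p))    # True - both sides agree on the secret
```

With ephemeral keys (the E in DHE/ECDHE), the private values a and b are freshly generated for every conversation, so a second conversation gets a different secret.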

You need to support and use ECDHE (Elliptic Curve – Diffie-Hellman – Ephemeral) suites in order to enable forward secrecy (stealing the private key does not let someone decrypt recorded messages) with modern web browsers.  Avoid the RSA key exchange unless absolutely necessary.

What does a cipher spec tell us?

This is a good web site which tells you what the cipher spec means.

A cipher spec describes the techniques to be used for authentication, encryption and hashing the data.  This is negotiated between the two ends when setting up a TLS handshake.  The conversation is like “Client: I support the following cipher specs; Server: I like this one…”,  or “Client: I support the following cipher specs; Server: hmm none match”

If you look at the names of cipher suites available with TLS v1.2 you find names like

  • TLS_RSA_WITH_…  this is for a key with public certificate generated with RSA
  • TLS_ECDH_RSA_WITH… this is for a key with public certificate generated with Elliptic Curve(EC) and uses Diffie-Hellman(DH)
  • TLS_ECDHE_ECDSA_WITH…  this is for a key with public certificate generated with Elliptic Curve(EC) and uses Diffie-Hellman(DH) and Ephemeral key (E)

I found this document which is a good introduction to cipher specs TLS 1.2, TLS 1.3 etc

A cipher spec which I use a lot is TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384

Let me break that down into the components

  • TLS indicates the protocol ( old versions might have SSL)
  • ECDHE: Key Exchange Algorithm.  What is used to generate a secret number.  Think of a telephone conversation between the UK and China: “Should we talk in English or Chinese?”  “I prefer English.”  “OK, my information is…”.  In this example we have ECDHE: Elliptic Curve, Diffie-Hellman, Ephemeral.  This can be
    • ECDHE
    • ECDH
    • RSA
    • DH
  • ECDSA:  Authentication/Digital Signature Algorithm: What sort of certificate can be used.
    • ECDSA: the server public key is an Elliptic Curve key.  ECDSA is Elliptic Curve + Digital Signing Algorithm.  The TLS spec says it should be signed using a CA with an EC public certificate – but it works even if it is signed with an RSA certificate
    • RSA: the server public key is created with an RSA public certificate.  The TLS spec says it should be signed using a CA with an RSA public certificate – but it works even if it is signed with an ECDSA certificate
  • “WITH” splits the authentication and encryption from the encryption of the data itself.
  • AES_256_CBC indicates the bulk encryption algorithm: once the handshake has completed, the encryption of the payload is done using symmetric encryption.  The keys are determined during the handshake.  AES_256 is symmetric encryption with a 256 bit key using Cipher Block Chaining.  (CBC is like using a “running total” of the data encrypted so far as an input to the encryption.)  TLSv1.3 drops support for CBC; GCM can be used instead – it is faster and can exploit pipelined processors.
  • SHA384 indicates the algorithm for hashing the message (MAC =  Message Authentication Code)
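The breakdown above can be done mechanically, since the suite name splits at “WITH”:

```python
# Splitting a TLSv1.2 suite name into the components described above.
name = "TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384"
handshake, bulk = name.split("_WITH_")
proto, kex, auth = handshake.split("_")  # protocol, key exchange, authentication
cipher, mac = bulk.rsplit("_", 1)        # bulk encryption, hashing (MAC)
print(proto, kex, auth, "/", cipher, "/", mac)
```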

Notes:

  • DSS is a different authentication algorithm, for example TLS_DHE_DSS… .  It also stands for Digital Signature Standard, which covers all the algorithms – so a touch confusing.
  • Because RSA tends to be used for authentication and encryption, I think of TLS_RSA_WITH… as TLS_RSA_RSA_WITH.   So the secret number generation algorithm is RSA, and then the certificate with an RSA public key is used.

For TLS 1.3 the cipher specs are like  TLS_AES_256_GCM_SHA384 because the key exchange algorithm will be either ECDHE or RSA.

How to restrict what certificates and algorithms clients can use to connect to java web servers

As part of your regular housekeeping you want to limit connections to your web server from weak keys and algorithms.  Making changes to the TLS configuration can be dangerous, as there is no “warning mode” or statistics to tell you if weak algorithms etc. are being used.  You have to make a change and be prepared to have problems.

In this posting I’ll explain how to do it, then explain some of the details behind it.

How to restrict what certificates and algorithms can be used by web servers and java programs doing TLS.

One way which does not work.

The jvm.options file provided by mqweb includes commented out

-Djdk.tls.disabledAlgorithms=… and -Djdk.certpath.disabledAlgorithms=…

This is the wrong way of specifying the information; you do it via the java.security file, not -D… .

Create an mqweb specific private disabled algorithm file

Java uses a java.security file to define security properties.

On my Ubuntu system, this file is in /usr/lib/jvm/…/jre/lib/security/java.security .

Create a file mqweb.java.security.  It can go anywhere – you pass the name using a java system property.

Copy from the java.security file to your file the lines with jdk.tls.disabledAlgorithms=… and jdk.certpath.disabledAlgorithms=… .

On my system, the lines are (but your security people may have changed them – if so,  you might want to talk to them before making any changes)

jdk.tls.disabledAlgorithms=SSLv3, RC4, DES, MD5withRSA, DH keySize < 1024, EC keySize < 224, 3DES_EDE_CBC, anon, NULL

jdk.certpath.disabledAlgorithms=MD2, MD5, SHA1 jdkCA & usage TLSServer, RSA keySize < 1024, DSA keySize < 1024, EC keySize < 224

The jvm.options file provided by IBM has

-Djdk.tls.disabledAlgorithms=SSLv3, TLSv1, TLSv1.1, RC4, MD5withRSA, DH keySize < 768, 3DES_EDE_CBC, DESede, EC keySize < 224, SHA1 jdkCA & usage TLSServer

So you may want to add this in your override file (without the -D); for example, add “, SHA1 jdkCA & usage TLSServer” to jdk.certpath.disabledAlgorithms.

Tell mqweb to use this file

Create a java system property in the mqweb jvm.options file

“-Djava.security.properties=/home/colinpaice/eclipse-workspace-C/sslJava/bin/serverdisabled.properties”

Restart your web server.  You have not changed anything – just copied some definitions into an mqweb-specific file – so it should work as before.

Limit what can be used

I set up several certificates with combinations of RSA and Elliptic Curve, varying key sizes and signatures, and signed with CAs with RSA and Elliptic Curve keys and different signatures.

For example RSA4096, SHA256withECDSA, /EC256, SHA384withECDSA means

  • RSA4096 certificate is RSA with a key size of 4096
  • SHA256withECDSA signed with this
  • /EC256 the CA has a public key of EC 256
  • SHA384withECDSA and the CA was signed with this

I then specified different options in the server’s override file, and recorded whether the TLS connection worked or not; if not, why not.

certpath: RSA keySize <= 2048

Server EC407, SHA256withRSA, /RSA4096, SHA512withRSA

  • ✅ RSA4096, SHA256withECDSA, /EC256, SHA384withECDSA
  • RSA2048, SHA256withRSA, /RSA4096, SHA512withRSA
  • ✅ EC407, SHA256withRSA, /RSA4096, SHA512withRSA
  • ✅ EC384, SHA256withECDSA, /EC256, SHA384withECDSA

certpath: RSA keySize <= 4096

Server EC407, SHA256withRSA, /RSA4096, SHA512withRSA

  • RSA4096, SHA256withECDSA, /EC256, SHA384withECDSA
  • RSA2048, SHA256withRSA, /RSA4096, SHA512withRSA
  • ❌ EC407, SHA256withRSA, /RSA4096, SHA512withRSA
  • ✅ EC384, SHA256withECDSA, /EC256, SHA384withECDSA

certpath:EC keySize <= 256

Server EC407, SHA256withRSA, /RSA4096, SHA512withRSA

  • ❌ RSA4096, SHA256withECDSA, /EC256, SHA384withECDSA
  • ✅ RSA2048, SHA256withRSA, /RSA4096, SHA512withRSA
  • ✅ EC407, SHA256withRSA, /RSA4096, SHA512withRSA
  • ❌ EC384, SHA256withECDSA, /EC256, SHA384withECDSA

tls:EC keySize <= 256

Server EC407, SHA256withRSA, /RSA4096, SHA512withRSA

  • ✅ RSA4096, SHA256withECDSA, /EC256, SHA384withECDSA
  • ✅ RSA2048, SHA256withRSA, /RSA4096, SHA512withRSA
  • ✅ EC407, SHA256withRSA, /RSA4096, SHA512withRSA
  • ✅ EC384, SHA256withECDSA, /EC256, SHA384withECDSA

certpath: SHA256withRSA

Server EC407, SHA256withRSA, /RSA4096, SHA512withRSA

  • ✅ RSA4096, SHA256withECDSA, /EC256, SHA384withECDSA
  • ❌ RSA2048, SHA256withRSA, /RSA4096, SHA512withRSA
  • ❌ EC407, SHA256withRSA, /RSA4096, SHA512withRSA
  • ✅ EC384, SHA256withECDSA, /EC256, SHA384withECDSA

tls:SHA256withRSA

Server EC407, SHA256withRSA, /RSA4096, SHA512withRSA

  • ✅ RSA4096, SHA256withECDSA, /EC256, SHA384withECDSA
  • ❌ RSA2048, SHA256withRSA, /RSA4096, SHA512withRSA
  • ❌ EC407, SHA256withRSA, /RSA4096, SHA512withRSA
  • ✅ EC384, SHA256withECDSA, /EC256, SHA384withECDSA

either: RSA

Server RSA4096, SHA256withECDSA, /EC256, SHA384withECDSA

All requests failed because the server’s certificate is RSA.  Only 18 out of 50 cipher suites were available.  The server reported javax.net.ssl.SSLHandshakeException: no cipher suites in common

  • ❌ RSA4096, SHA256withECDSA, /EC256, SHA384withECDSA
  • ❌ RSA2048, SHA256withRSA, /RSA4096, SHA512withRSA
  • ❌ EC407, SHA256withRSA, /RSA4096, SHA512withRSA
  • ❌ EC384, SHA256withECDSA, /EC256, SHA384withECDSA

certpath: RSA keySize == 4096

Server RSA4096, SHA256withECDSA, /EC256, SHA384withECDSA

This was a surprise as I did not think this would work!

  • RSA4096, SHA256withECDSA, /EC256, SHA384withECDSA
  • RSA2048, SHA256withRSA, /RSA4096, SHA512withRSA
  • ❌ EC407, SHA256withRSA, /RSA4096, SHA512withRSA
  • ✅ EC384, SHA256withECDSA, /EC256, SHA384withECDSA

Summary of overriding.

You can specify restrictions in the server’s jdk.tls.disabledAlgorithms and jdk.certpath.disabledAlgorithms.  The restrictions apply to how the certificate has been signed, and to the CA certificate.

You should check that the server’s certificate is not affected.

More details and what happens under the covers

The section below may be too much information, unless you are trying to work out why something is not working.

In theory jdk.tls.disabledAlgorithms and jdk.certpath.disabledAlgorithms are used for different areas of checking – reading certificates from key files, and what is passed during the handshake – but this does not seem to be true.  I found that it was best to put restrictions on both lines.

A certificate is of type RSA, EC, or DSA.

A certificate is signed, for example Signature Algorithm: SHA256withECDSA.  This comes from the CA which signed it: the message digest is SHA256, and the CA key is an Elliptic Curve.  See How do I create a certificate with Elliptic Curve (or RSA).

Signature Algorithm is a combination of hash algorithm and signature type.  There are 6 hash algorithms: md5, sha1, sha224, sha256, sha384, sha512, and three types: rsa, dsa, ecdsa.  These can be combined to give 14 combinations of Signature Algorithms used in TLSv1.2.

You can use java.security to control what TLS does.  On my Ubuntu system this file is /usr/lib/jvm/java-8-oracle/jre/lib/security/java.security.

This includes

  • jdk.certpath.disabledAlgorithms: Algorithm restrictions for certification path (CertPath) processing:  In some environments, certain algorithms or key lengths may be undesirable for certification path building and validation. For example, “MD2” is generally no longer considered to be a secure hash algorithm. This section describes the mechanism for disabling algorithms based on algorithm name
    and/or key length. This includes algorithms used in certificates, as well as revocation information such as CRLs and signed OCSP Responses.
  • jdk.tls.disabledAlgorithms: Algorithm restrictions for Secure Socket Layer/Transport Layer Security  (SSL/TLS) processing.  In some environments, certain algorithms or key lengths may be undesirable when using SSL/TLS. This section describes the mechanism for disabling algorithms during SSL/TLS security parameters negotiation, including protocol version negotiation, cipher suites selection, peer authentication and key exchange mechanisms.

I found you get better diagnostics if you put the restrictions on both statements.
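
As a sketch of what this can look like, the two restriction lines below go in a small properties file (the property names are real; the algorithm lists are illustrative examples, not a recommendation).  Rather than editing the JRE’s java.security directly, you can point the JVM at an override file with -Djava.security.properties (this works because security.overridePropertiesFile=true is the default):

```properties
# my.java.security - hypothetical override file, illustrative values only
# Restrict how certificates may be signed (certificates, CRLs, OCSP responses)
jdk.certpath.disabledAlgorithms=MD2, MD5, SHA1, RSA keySize < 2048, EC keySize < 256
# Restrict the TLS handshake: protocols, cipher suite components, key sizes
jdk.tls.disabledAlgorithms=SSLv3, TLSv1, TLSv1.1, RC4, DES, RSA keySize < 2048
```

Then start the JVM with java -Djava.security.properties=/path/to/my.java.security … so the restrictions apply only to that server, and the JRE defaults are left alone.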

The TLS Handshake (relating to java.security)

Server starts  up

  • I had 50 available cipher suites
  • Using -Djdk.tls.server.cipherSuites=…,… you can specify a comma-separated list of which cipher suites you want to make available.  I recommend you do not specify this, and use the defaults.
  • Using jdk.tls.disabledAlgorithms you can specify which handshake information is not allowed.  For example
    •  Any of the following would stop cipher suite TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384 from being used
      • jdk.tls.disabledAlgorithms=AES_256_CBC
      • jdk.tls.disabledAlgorithms=SHA384 – this removes cipher suites such as TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384, TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384 etc
      • jdk.tls.disabledAlgorithms=TLS_ECDHE_ECDSA
    • Elliptic curve names.  Extension elliptic_curves, curve names: {secp256r1, secp384r1, secp521r1, sect283k1, sect283r1, sect409k1, sect409r1, sect571k1, sect571r1, secp256k1}.  Specifying EC keySize <= 521 would only allow {sect571k1, sect571r1} to be used
  • The server builds a supported list of cipher suites, signature algorithm, and elliptic curve names
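
As a quick way of seeing what your JVM actually makes available, you can list the default and supported cipher suites.  This is a minimal sketch (the class name ListCipherSuites is my own); the default list is what remains after jdk.tls.disabledAlgorithms has been applied:

```java
import javax.net.ssl.SSLServerSocketFactory;

public class ListCipherSuites {
    public static void main(String[] args) {
        SSLServerSocketFactory factory =
                (SSLServerSocketFactory) SSLServerSocketFactory.getDefault();
        // Suites enabled by default, after the disabledAlgorithms filtering
        String[] enabled = factory.getDefaultCipherSuites();
        // Everything the JVM could support if you explicitly enabled it
        String[] supported = factory.getSupportedCipherSuites();
        System.out.println("Enabled by default: " + enabled.length);
        System.out.println("Supported in total: " + supported.length);
        for (String suite : enabled) {
            System.out.println("  " + suite);
        }
    }
}
```

Run it once with your java.security restrictions in place and once without, and compare the two lists to see what you have disabled.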

Client starts up

  • As with the server, the client builds up a list of supported cipher suites, signature algorithms, and supported Elliptic curve names.
  • The client sends “ClientHello” and the list to the server.

Server processing

  • The server takes this list and iterates over it to find the first acceptable certificate (or for wlp, if <ssl … serverKeyAlias=”…” /> is specified, then the specified alias is used)
    • if the cipher suite name is like TLS_AAA_BBB_WITH…, BBB must match the server’s certificate type (RSA, Elliptic Curve, DSA)
    • the signature algorithm.   This is the algorithm for encrypting the payload, and the algorithm for calculating the hash of the payload
    • if the certificate is EC, check the elliptic curve name is valid.  A server’s certificate created with openssl ecparam -name prime256v1 would be blocked if EC keySize <= 256 was specified in the client.
  • If no certificate was found in the trust store which passed all of the checks, it throws javax.net.ssl.SSLHandshakeException: no cipher suites in common , and closes the connection
  • The server sends “ServerHello” and the server’s public key to the client.
  • The server sends the types of certificate it will accept.   This is typically RSA, DSA, and EC
  • The server sends down the Elliptic Curve names it will accept, if present – I don’t think this is used by the Java client
  • The server uses the jdk.certpath.disabledAlgorithms to filter the list of Signature Algorithms, and sends this filtered list to the client.
  • The server extracts the CAs and self signed certificates from the trust store and sends them down to the client.
  • The server sends “ServerHelloDone”, saying over to you to respond.

Client processing:

  • Checks the server’s certificate is valid, including
    • Checks the public certificate of the server’s CA chain is allowed according to the client’s jdk.certpath.disabledAlgorithms.  So jdk.certpath.disabledAlgorithms=…, SHA256withECDSA would not allow a server’s certificate with Signature algorithm: SHA256withECDSA.
  • The client takes the list of certificate types (RSA, DSA, EC) and CAs, iterates over the keystore, and selects the records where
    • the certificate type is in the list
    • the signature algorithm is in the list
    • the certificate is signed by one of the CAs in the list
  • Displays this list for the end user to select from.  It looks like the most recently added certificate is first in the list.
  • The client sends “Finished” and the selected certificate to the server

Server processing

  • The server checks the certificate, and any embedded CA certificates, sent from the client.
    • Checks the signature algorithm
    • Checks the constraints, for example RSA keySize < 2048
    • Checks the certificate and CA are valid

End of handshake.

 

This is a useful link for describing the java.security parameters.

This specification  describes the handshake, with the “ClientHello” etc.

How to stop weak people from accessing your web server (liberty) and mqweb

In the ongoing security battle arena, standards of protection are improving.  What may have been an acceptable cipher suite last year may be less acceptable today.  For many years an RSA certificate was considered the best.  Now certificates using Elliptic Curves are considered superior, as they require smaller keys for the same level of protection, are harder to break, and use less CPU.

Google wants digital certificates renewed every year, rather than the two years which tends to be usual today.  (From what I have read, Certificate Authority certificates can be around for years.)

Your security team should be updating the user certificates regularly, so this should not be your problem.  You need to review your servers and do regular “housekeeping” to bring them in line with current best practice.  It is easy to forget a service which just sits there and runs, and so is a weak entry point to your environment.

The housekeeping tasks:

  1. Ensure your Certificate Authority certificates use current options: type of certificate, key size etc.
  2. Ensure the server’s certificate uses current options
  3. Restrict which certificates can be used to connect to the server.  For example, do not allow RSA certificates with a small key size.

Renewing certificates

In some enterprises you have one certificate which identifies you, and then have controls within each server which manage which users are allowed to connect.  In other enterprises you may have a general certificate, plus a certificate for “monitoring” which has been signed by the “monitoring CA”.  This makes it easier to control access – if your certificate has not been signed by the “monitoring CA” you do not get into the server.

If you are an enterprise you are likely to have a process which automatically renews individuals’ certificates.  This is likely to be “go on-site, plug into the ethernet (so people cannot intercept your wireless), and get a new certificate”.   It is harder to renew your servers’ certificates.

You will need steps along the lines of

  • Strengthen your CA if required
    • Generate a new, stronger, CA certificate;   for example using Elliptic Curves instead of RSA
    • Every machine needs to install the new CA certificate in the trust stores/browsers, alongside the old CA certificate.  You need a period when both CA certificates are available to allow a transition, as you need to change both ends of the TLS connection.
    • You can now deploy new individual certificates, with their CA, to your users, which go into their key stores.  Users connecting to a server with the new certificates will be validated with the server’s new CA.  Users with certificates from the old CA will continue to use the old CA.
  • You need a new certificate for your web server.  If this is using the new CA, then all end users will need to have the new CA in their trust store, so the web server certificate can be validated.  If you are not using a new CA, then the existing CA can be used with no change.
  • Once all the users have their new certificates, you can remove the old CA from the server and from the users’ trust stores/browsers.  If a user tries to use an old certificate, it will fail to verify as the CA is no longer present.

Restricting what certificates and cipher suites can be used.

You can configure Java security to restrict which certificates and cipher suites are used.  For example, you can enforce that no RSA certificate, or no RSA certificate with a key size < 2048, can connect to the web server.  You configure this in java.security.  See here.

(Not) Using TLS 1.3

The TLS standards are being improved, for example the TLS 1.3 protocol has improved algorithms which are faster and more secure than those in TLS 1.2.  The handshake has been improved, and weak algorithms have been dropped.   Most browsers already support TLSv1.3.

In theory this is a simple matter of changing the protocol parameter from TLSv1.2 to TLSv1.3 and it will all magically work.  This support is in Java release 11.  Unfortunately the Java used by mqweb is not at this level (it is Java version 8) so cannot support it.   This means you have to make the changes yourself.
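
Before changing the protocol parameter, it is worth checking whether your JVM supports TLSv1.3 at all.  A minimal sketch (the class name CheckTls13 is my own):

```java
import javax.net.ssl.SSLContext;
import java.security.KeyManagementException;
import java.security.NoSuchAlgorithmException;

public class CheckTls13 {
    public static void main(String[] args) {
        try {
            // Throws NoSuchAlgorithmException on JVMs without TLS 1.3 support
            SSLContext ctx = SSLContext.getInstance("TLSv1.3");
            ctx.init(null, null, null);
            System.out.println("TLSv1.3 is available in this JVM");
        } catch (NoSuchAlgorithmException e) {
            System.out.println("TLSv1.3 not available - stay on TLSv1.2 for now");
        } catch (KeyManagementException e) {
            System.out.println("Context initialisation failed: " + e.getMessage());
        }
    }
}
```

Run this with the same JVM that runs your server; if it reports TLSv1.3 is not available, changing the protocol parameter will only give you handshake failures.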

Change it, pray, answer the phone, and backout the change.

In an ideal world your server would produce a report of which certificates were used, with the Distinguished Name(the owner), what algorithms are used, and what CAs are used.  You make a note of all those which need attention, get the certificates upgraded, and when all the certificates are good – you can enable the stronger functions and it will all work first time.

I have not been able to get this list from my servers.

What I expect will happen as you enable the stronger functions is that people who are not ready will stop working, and will report problems.   You turn off the stronger functions, fix the machines which had problems, and repeat until the phone stops ringing.

Some changes, for example changing java -D… options in jvm.options, will require a server restart.  Other changes, within mqwebuser.xml, may get picked up periodically (and so avoid the need to restart the server), but you may need to restart the server.  On my laptop it takes over 10 seconds to stop and restart the server.

This upgrade process is very disruptive and does not provide the highly available server your enterprise needs.  Think how many change requests you will need to submit before it all works!  The best solution is to start with a very secure web server.

Using java -Djavax.net.debug=… to examine data flows, including TLS

There are different levels of detail you can get from the -Djavax.net.debug option.  You can display its options using -Djavax.net.debug=help.

With this you get

all            turn on all debugging
ssl            turn on ssl debugging

The following can be used with ssl:

    record       enable per-record tracing
    handshake    print each handshake message
    keygen       print key generation data
    session      print session activity
    defaultctx   print default SSL initialization
    sslctx       print SSLContext tracing
    sessioncache print session cache tracing
    keymanager   print key manager tracing
    trustmanager print trust manager tracing
    pluggability print pluggability tracing

    handshake debugging can be widened with:
    data         hex dump of each handshake message
    verbose      verbose handshake message printing

    record debugging can be widened with:
    plaintext    hex dump of record plaintext
    packet       print raw SSL/TLS packets

 

and your program exits.

You can use

-Djavax.net.debug=ssl:record or -Djavax.net.debug=ssl:handshake

to display specific levels of detail.