Debugging external smart cards and external pkcs11 keystores.

There is an open source package (opensc) which provides access to smart cards and external keystores. It provides some good tools for diagnosing problems.

There is a wiki with good information.

Opensc return codes are here, and the printable text is here

Monitor traffic to and from the device.

You can monitor the traffic to and from the device by using an intermediate “spy” module which displays the traffic.

In your configuration (for example a CCDT), where you specified the name of the module /usr/lib64/pkcs11/opensc-pkcs11.so, replace this with /usr/lib64/pkcs11/pkcs11-spy.so. Specify the environment variable

export PKCS11SPY=/usr/lib/x86_64-linux-gnu/pkcs11/opensc-pkcs11.so

The spy module is invoked, prints out the parameters, and then invokes the module specified in the environment variable.
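As a rough sketch, the spy settings can be applied to a single command instead of being exported for the whole shell. The function name run_with_spy, the REAL_MODULE and SPY_LOG variables, and the file name spy.log are made up for this illustration. PKCS11SPY is the variable described above, and PKCS11SPY_OUTPUT is, as far as I know, the spy's option for writing the trace to a file rather than to the terminal. The application's configuration must still name pkcs11-spy.so as its module.

```shell
# Sketch: run one command with the spy environment set, and nothing left behind.
# run_with_spy, REAL_MODULE, SPY_LOG and spy.log are illustrative names, not part of OpenSC.
run_with_spy() {
    # PKCS11SPY is the real module that the spy forwards every call to.
    # PKCS11SPY_OUTPUT asks the spy to write its trace to a file.
    PKCS11SPY="${REAL_MODULE:-/usr/lib/x86_64-linux-gnu/pkcs11/opensc-pkcs11.so}" \
    PKCS11SPY_OUTPUT="${SPY_LOG:-spy.log}" \
    "$@"
}

# Example (needs the device and the OpenSC tools; adjust the paths for your system):
# run_with_spy pkcs11-tool --module /usr/lib64/pkcs11/pkcs11-spy.so --login --list-objects
```

Because the variables are set only for the one command, there is nothing to unset afterwards.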

The output is like

19: C_Login
2021-03-10 14:22:47.947
[in] hSession = 0x21fc030
[in] userType = CKU_USER
[in] pPin[ulPinLen] 00000000021fb2a0 / 8
00000000 5B C7 E7 BB E5 FC 6A BE […..j.
Returned: 160 CKR_PIN_INCORRECT
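A spy trace gets long quickly, so it can help to list only the calls that failed. The sketch below assumes the layout shown above: a line such as "19: C_Login" starting each call, and a "Returned:" line with the numeric code and its name ending it, with 0 meaning success. The function name spy_failures is hypothetical.

```shell
# Sketch: list the PKCS#11 calls in a spy trace that did not return 0 (success).
# spy_failures is an illustrative name; it reads the trace on standard input.
spy_failures() {
    awk '
        # Remember the name of the call, for example from "19: C_Login".
        /^[0-9]+: C_/ { call = $2 }
        # Report any non-zero return, for example "Returned: 160 CKR_PIN_INCORRECT".
        /^Returned: / && $2 != 0 {
            print call ": " $3 " (" $2 ")"
        }
    '
}

# Example: spy_failures < spy.log
```

For the trace above this prints C_Login: CKR_PIN_INCORRECT (160).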

Detailed internal trace

You can specify the environment variable OPENSC_DEBUG to give a very detailed trace. The higher the number, the more detailed the trace.

export OPENSC_DEBUG=9

and use unset OPENSC_DEBUG to reset it.

You can use OPENSC_CONF to specify a configuration file with more parameters, such as file name for the output.

The output from this trace (showing a logon with pin number 12345678) is like

0x7f96e2dca740 14:13:16.756 [opensc-pkcs11] framework-pkcs15.c:1494:pkcs15_login: pkcs15-login: userType 0x1, PIN length 8
0x7f96e2dca740 14:13:16.756 [opensc-pkcs11] pkcs15-pin.c:301:sc_pkcs15_verify_pin: called
….
0x7f96e2dca740 14:13:16.757 [opensc-pkcs11] reader-pcsc.c:283:pcsc_transmit: reader ‘Nitrokey Nitrokey HSM (DENK01051600000 ) 00 00’
0x7f96e2dca740 14:13:16.757 [opensc-pkcs11] reader-pcsc.c:284:pcsc_transmit:
Outgoing APDU (13 bytes):
00 20 00 81 08 31 32 33 34 35 36 37 38 . …12345678
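The outgoing APDU in the trace can be read by hand. As I understand the ISO 7816-4 layout, the first four bytes are CLA, INS, P1 and P2, the fifth is the length of the data (Lc), and the rest is the data. INS 20 is the VERIFY command, and here the data is simply the PIN in ASCII. The sketch below, with the hypothetical function name decode_apdu, splits a short command APDU of this shape. It does not handle extended-length APDUs or non-printable data.

```shell
# Sketch: split a short command APDU (hex bytes separated by spaces) into its fields.
# decode_apdu is an illustrative name; it assumes CLA INS P1 P2 Lc followed by Lc data bytes.
decode_apdu() {
    printf '%s\n' "$1" | awk '
        # Convert a hex string such as "3a" to a number, without relying on gawk extensions.
        function hex2dec(h,    i, n) {
            n = 0
            h = tolower(h)
            for (i = 1; i <= length(h); i++)
                n = n * 16 + index("0123456789abcdef", substr(h, i, 1)) - 1
            return n
        }
        {
            lc = hex2dec($5)
            data = ""
            # Turn each data byte into its ASCII character.
            for (i = 6; i < 6 + lc && i <= NF; i++)
                data = data sprintf("%c", hex2dec($i))
            printf "CLA=%s INS=%s P1=%s P2=%s Lc=%d data=%s\n", $1, $2, $3, $4, lc, data
        }'
}

decode_apdu "00 20 00 81 08 31 32 33 34 35 36 37 38"
```

This prints CLA=00 INS=20 P1=00 P2=81 Lc=8 data=12345678. The PIN is in the trace in clear text, so treat the trace file as carefully as the PIN itself.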

GSKIT return codes

If you are using the MQ C Client interface, this uses GSKIT. There is documentation for the z/OS version, and the return codes are here.

Using the runmqakm commands and an HSM (but not strmqikm).

I tried to use strmqikm but it gave an exception.

You can use some of the runmqakm commands you know and love, to access a certificate with an HSM. For example

The command to list the database available to the runmqakm command,

runmqakm -keydb -list -crypto /usr/lib/x86_64-linux-gnu/opensc-pkcs11.so

Gives

/usr/lib/x86_64-linux-gnu/opensc-pkcs11.so : UserPIN (mytoken)

You can then use the token label UserPIN (mytoken) and password to use the key store, for example

runmqakm -cert -list all -crypto /usr/lib/x86_64-linux-gnu/opensc-pkcs11.so \
  -tokenlabel "UserPIN (mytoken)" -pw 12345678

gives

Certificates found
* default, - personal, ! trusted, # secret key
-	my_key3

and

runmqakm -cert -details -crypto /usr/lib/x86_64-linux-gnu/opensc-pkcs11.so \
  -tokenlabel "UserPIN (mytoken)" -pw 12345678 \
  -label my_key3

displays the details of the certificate with label my_key3.

If the -tokenlabel was wrong or the -pw was wrong, I got the unhelpful messages

  • CTGSK3026W The key file “pkcs11” does not exist or cannot be read.
  • CTGSK2137W The label does not exist on the PKCS#11 device.
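Since a mistyped token label produces these unhelpful messages, one way to avoid typing it is to take the label straight from the -keydb -list output. This is only a sketch: the function name token_label is made up, and it assumes the "module : label" layout shown above with a single token listed.

```shell
# Sketch: pull the token label out of "runmqakm -keydb -list" output.
# token_label is an illustrative name; it assumes the "module : label" layout shown above.
token_label() {
    # Drop everything up to and including " : ", and keep the first token only.
    sed -n 's/^.* : //p' | head -n 1
}

# Example (needs runmqakm and the device):
# label=$(runmqakm -keydb -list -crypto /usr/lib/x86_64-linux-gnu/opensc-pkcs11.so | token_label)
# runmqakm -cert -list all -crypto /usr/lib/x86_64-linux-gnu/opensc-pkcs11.so -tokenlabel "$label" -pw 12345678
```

Quoting "$label" matters because the label contains a space and parentheses.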

Create your certificate request

The following command creates a new RSA private-public key pair and a PKCS10 certificate request. The documentation for runmqakm says it supports RSA. If you want to use an Elliptic Curve you will need to use an alternative method, for example openssl.

runmqakm -certreq -create -crypto /usr/lib/x86_64-linux-gnu/opensc-pkcs11.so \
  -tokenlabel "UserPIN (mytoken)" -pw 12345678 \
  -dn "cn=colin,o=SSS" -file runmq.csr -label runmqlab -size 1024

Sign it

openssl ca -config openssl-ca-user.cnf -policy signing_policy -md sha256 -cert carsa1024.pem -keyfile carsa1024.key.pem -out runmq.pem -in runmq.csr

Store it back into the HSM keystore

I could not get the runmqakm command to receive the signed certificate and store it into the HSM keystore.

runmqakm -cert -receive -crypto /usr/lib/x86_64-linux-gnu/opensc-pkcs11.so -tokenlabel "UserPIN (mytoken)" -file runmq.pem -pw 12345678

It failed with

CTGSK3034W The certificate request created for the certificate is not in the key database.

I could use

openssl x509 -inform pem -outform der -in runmq.pem -out runmq.der
pkcs11-tool --write-object runmq.der --type cert --label "runmqlab" -l --pin 12345678

The openssl command converts the file from .pem format, to .der format as .der format is required by pkcs11-tool.

Using strmqikm – the theory

If you want to use the strmqikm GUI, you have to configure the java.security file. For example edit /opt/mqm/java/jre64/jre/lib/security/java.security and add the next security.provider in the list.

security.provider.12=com.ibm.crypto.pkcs11impl.provider.IBMPKCS11Impl /home/colinpaice/mq/nitrokey.cfg

Where /home/colinpaice/mq/nitrokey.cfg is the configuration file, with

name = nitrokey
library = /usr/lib/x86_64-linux-gnu/opensc-pkcs11.so
slot=0

You can then use Ctrl+O, which brings up a pop up with “Key database type”. In this list should be PKCS11Config, if not check your java.security file. Select this, leave File Name and Location empty, and click “OK”. It pops up “Open Cryptographic Token” with the “Token Label” value taken from the configuration file name = nitrokey. This is strange as the runmqakm command uses a TokenLabel of “UserPIN (mytoken)”.

In practice…

I then got an exception java.lang.RuntimeException: PKCS11KeyStore.java: findSigner(): Failure while executing cobj.getX509Certificate(certFactory, session), and strmqikm ended.

Using a hardware security module USB as a keystore for a browser.

Background to certificates and keystores

When using TLS (SSL) you have two keystores

  • A keystore for holding the public part and private key of your certificate
  • A trust store which holds the public parts of the certificates you need in order to authenticate certificates sent to you.

Your certificate has two parts

  • The private key, which contains information needed to encrypt information you send. This needs to be kept private.
  • The public part, which has information that is needed to decrypt information you have encrypted, along with information such as your Distinguished Name (DN), for example CN=ColinPaice,C=GB,O=StromnessSoftware

The process of creating a signed certificate is

  • Create a private key and public key. This can be done using an external Hardware Security Module (HSM), such as the Nitrokey HSM USB device, or in software, for example using openssl. This produces a private key file, and a certificate request file containing the public information.
  • Send the public information to your certificate authority which signs it, and returns it
  • Import the signed public certificate into your keystore.

Creating a certificate using an HSM as the key repository

I used openssl to process my certificates. I’ve discussed the openssl setup here.

I use a bash script because it is easy to parametrize, and makes it easy to rerun until it works. I’ll give the script, then explain what it does

enddate="-enddate 20240130164600Z"
name="hw"
rm $name.key.pem
rm $name.csr
rm $name.pem
ca="carsa1024"
pkcs11-tool --keypairgen --key-type rsa:2048 --login --pin 648219 --label "my_key3"
OPENSSL_CONF=eccert.config openssl req -new -engine pkcs11 -keyform engine -key label_my_key3 -out $name.csr -sha256 -subj "/C=GB/O=HW/CN=colinpaice" -nodes
openssl ca -config openssl-ca-user.cnf -policy signing_policy -md sha256 -cert $ca.pem -keyfile $ca.key.pem -out $name.pem -in $name.csr $enddate
openssl x509 -inform pem -outform der -in $name.pem -out $name.der
pkcs11-tool --write-object $name.der --type cert --label "my_key3" -l --pin 648219

What does the script do ?

enddate="-enddate 20240130164600Z"

This sets the end date for the certificate – the end date is set when it is signed.

name="hw"

This is used within the script to ensure the correct files are being used.

rm $name.key.pem
rm $name.csr
rm $name.pem

Remove old intermediate files.

ca="carsa1024"

Define the name of the CA files to use at signing time. The $ca.pem and $ca.key.pem are both needed.

pkcs11-tool --keypairgen --key-type rsa:2048 --login --pin 648219 --label "my_key3"
  • pkcs11-tool use this tool
  • --keypairgen to create a key pair (private and public pair)
  • --key-type rsa:2048 use this key type and key length
  • --login --pin 648219 login with the pin number
  • --label "my_key3" use this label to identify the key
OPENSSL_CONF=eccert.config openssl req -new -engine pkcs11 -keyform engine -key label_my_key3 -out $name.csr -sha256 -subj "/C=GB/O=HW/CN=colinpaice"
  • OPENSSL_CONF=eccert.config this sets up the openssl config file. Having -config eccert.config does not work. See here.
  • openssl
  • req this is to create a certificate request – create a .csr.
  • -new it is a new request
  • -engine pkcs11 use the named engine, pkcs11, defined to the system
  • -keyform engine this says use the engine (HSM). Other choices are der and pem
  • -key label_my_key3 go to the engine and look for the my_key3 label
  • -out $name.csr create the request file with this name.
  • -sha256 use this digest for the signature
  • -subj "/C=GB/O=HW/CN=colinpaice" the name to go in the certificate. It uses colinpaice as the certificate will be used to authenticate with the mq web server, and this is the userid the mq web server should use.

Send the .csr file to the CA for signing (which is the same machine in my case).

openssl ca -config openssl-ca-user.cnf -policy signing_policy -md sha256 -cert $ca.pem -keyfile $ca.key.pem -out $name.pem -in $name.csr $enddate
  • openssl ca Use this command to sign the certificate
  • -config openssl-ca-user.cnf use this configuration file
  • -policy signing_policy use this policy within the config file
  • -md sha256 use this for the message digest
  • -cert $ca.pem use the public certificate of the CA
  • -keyfile $ca.key.pem use this private key of the CA to encrypt information about the csr request’s certificate
  • -out $name.pem where to store the output
  • -in $name.csr the input .csr request
  • $enddate specify the certificate expiry date – set at the top of the script

Send the signed certificate back to the requester.

openssl x509 -inform pem -outform der -in $name.pem -out $name.der

The pkcs11-tool uses .der files so convert the .pem file to .der format

  • openssl x509
  • -inform pem input format
  • -outform der output format
  • -in $name.pem hw.pem
  • -out $name.der hw.der
pkcs11-tool --write-object $name.der --type cert --label "my_key3" -l --pin 648219

Read the signed certificate and write it to the HSM

  • pkcs11-tool
  • --write-object $name.der write onto the HSM the file hw.der converted above
  • --type cert import type (cert|pubkey|privkey)
  • --label "my_key3" use this name
  • -l --pin 648219 and logon with this pin number

Define the HSM to Chrome browser

Stop the browser because you need to update the keystore.
The command was issued in the home directory, because the key store is in ~/.pki .

modutil -dbdir sql:.pki/nssdb/ -add "my_HSM" -libfile opensc-pkcs11.so

  • modutil use this command
  • -dbdir sql:.pki/nssdb/ to update this keystore (in ~)
  • -add "my_HSM" give it this name
  • -libfile opensc-pkcs11.so and use this file to communicate with it

Display the contents of the browser’s keystore

modutil -dbdir sql:.pki/nssdb/ -list

This gave me

Listing of PKCS #11 Modules
 NSS Internal PKCS #11 Module
...
 Mozilla Root Certs
 library name: /usr/lib/x86_64-linux-gnu/nss/libnssckbi.so
...
my_HSM
 library name: opensc-pkcs11.so
    uri: pkcs11:library-manufacturer=OpenSC%20Project;library-description=OpenSC%20smartcard%20framework;library-version=0.17
  slots: 1 slot attached
 status: loaded
 slot: Nitrokey Nitrokey HSM (DENK01051600000         ) 00 00
 token: UserPIN (SmartCard-HSM)
   uri: pkcs11:token=UserPIN%20(SmartCard-HSM);manufacturer=www.CardContact.de;serial=DENK0105160;model=PKCS%2315%20emulated 

Restart the browser.

Use a URL which needs a certificate for authentication.

The browser prompts for the pin number (twice), and displays the list of valid certificate CNs. Pick one. When I connected to the mqweb server, I had 3 certificates displayed. I had to remember which one I wanted from the Issuer’s CN and serial number. For example

Select a certificate

Subject      Issuer        Serial
colinpaice   SSCARSA1024   019c
ibmsys1      SSCARSA1024   019a
170594       SSCARSA1024   0197

(Having a CA just for HSM keys, such as SSSCAHSM would make it more obvious.)

Setting up digital certificates for identification in your enterprise.

You can use a digital certificate for authentication. For example, you can log on to the MQ Web server using a certificate to identify you, and you do not have to enter a userid or password.

Many systems have Multi Factor Authentication (MFA) to logon which usually means you authenticate with something you have, and with something you know. Something you have is the private certificate, something you know is userid and password.

At the bottom I discuss having an external device for your keystore to make your keystore more secure.

General background and information

  • Your certificate has a private key (which should not leave your machine), and a public part, which anyone can have.
  • You can have a key store which has your private key in it. This is often just a file which could be copied to another machine. This is not a very secure way of keeping your certificates, as there is usually a stash file with the password in it, which could easily be copied along with the keystore.
  • You have a trust store which contains the public part of the certificates you want to validate (demonstrate trust) with. This is usually a set of Certificate Authority public keys, and any self signed certificates. The information in these certificates is commonly available and can be world read. You will want to protect this for write, so people cannot insert CAs from the bad guys.
  • You can use a Hardware Security Module (HSM), a piece of hardware which can store your private keys, and does encryption for you. This is a secure way of keeping your certificates. You need physical access to the machine to be able to physically access the HSM hardware.
  • Certificates are based on trust. When I create a public certificate, I can get this signed by a Certificate Authority. When I send my public certificate to you, and you have the same Certificate Authority, you can check what I sent you using the Certificate Authority. My public certificate gives information on how to decrypt stuff I send you.
  • When a connection is made between a client and a server, the server sends down its certificate for the client to validate and accept, and the client can then send up a certificate for the server to validate and accept. This is known as the handshake.
  • A certificate has a Distinguished Name. This is like “CN=COLIN,OU=TEST,O=SSS.ORG” so my Common Name is COLIN, The Organizational Unit is TEST, and my Organization is SSS.ORG.
    • Some products like the mid-range MQ Web Server map the CN to a userid.
    • As part of the logon a client or server can check the certificate sent to it, for example allow any certificate with OU=TEST, and O=SSS.ORG.

Planning for TLS and certificates

Consider a simple scenario of two MQ Servers, and people from my.org and your.org want to work with MQ. Leaving aside the task of creating the certificate, you need to decide

  • What name hierarchy you want, for example CN=”COLIN PAICE”, OU=TEST, C=GB, O=SSS.ORG,
    • do you want to have a CN with a name in it, or a userid, or a personnel number. This is used by the MQWeb as a userid. You could have CN=MQPROD1, etc to give each server its own CN.
    • Do you want to have the country code in it C=GB? What happens if someone moves country. You might decide to have servers with CN=MQPROD1,OU=PROD… or OU=TEST… .
  • What CA hierarchy do you want. You could have a CA for OU=PROD, O=SSS.ORG at the PROD level, or CN=CA,O=SSS.ORG at the organisation level. Some servers can check the issuer is OU=PROD, O=SSS.ORG and so only allow certificates signed by that CA. Someone connecting with a certificate signed with OU=TEST,O=SSS.ORG would not be allowed access.
  • You could give each server the same DN, for example CN=MQSERVER,OU=PROD,O=SSS.ORG, or individual ones CN=MQSERVER1,OU=PROD,O=SSS.ORG
  • You can have a server check that a certificate is still valid by using Online Certificate Status Protocol (OCSP). After the handshake, a request goes to a remote server asking if the certificate is still valid. I’ve written a blog post, Are my digital certificates still valid and are they slowing down my channel start? z/OS does not support OCSP. MQ on z/OS supports an LDAP repository of Certificate Revocation Lists. If you intend to use OCSP you need to set up the OCSP infrastructure.
  • With the MQ mover, you can set up CHLAUTH records to allow or disallow DNs or CA certificates.
  • The clients from my.org have a DN like CN=COLIN,OU=TEST,O=myorg.com. The clients from your.org have a DN like CN=170594,c=GB,o=your.org. You cannot have one string (SSLPEER) to allow both formats of certificate.
    • For connections to the chinit (mover) you can use CHLAUTH to give fine-grained control.
    • For the MQWeb on z/OS you can control which certificates (or Issuers) map to a userid.
    • For mid-range MQWEB you have no control beyond a successful handshake. CN=COLIN,o=MY.ORG, and CN=COLIN,o=YOUR.ORG would both map to userid COLIN even though they are from different organisations. The CN is used as a userid, and you map userids or groups to security profiles.
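To make the last point concrete, here is a small sketch of what a CN-to-userid mapping sees. The function name cn_of is made up, and this is not how MQ parses DNs internally. It assumes a simple comma-separated DN with no commas inside the values.

```shell
# Sketch: extract the CN from a comma-separated DN, as a CN-to-userid mapping would see it.
# cn_of is an illustrative name; DNs with commas inside a value are not handled.
cn_of() {
    # One attribute per line, then keep the value of the first CN= attribute.
    printf '%s\n' "$1" | tr ',' '\n' | sed -n 's/^ *[Cc][Nn]=//p' | head -n 1
}

cn_of "CN=COLIN,o=MY.ORG"      # prints COLIN
cn_of "CN=COLIN,o=YOUR.ORG"    # prints COLIN as well
```

Both DNs give the same answer, which is why two people from different organisations can end up with the same userid.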

Setting up your certificates

As your private key should not leave your machine, the standard way of generating a certificate is

  • The client machine creates a key pair and a certificate request. The private key stays on the machine, and the request contains the public certificate.
  • The public certificate is sent to the appropriate authority (a department in your organization) which signs the certificate. Signing the certificate consists of doing a check sum of the public certificate, encrypting the check sum value, and packaging the public certificate, the encrypted checksum, and the CA public certificate into one file. This file is sent back to the requester.
  • The originator reads the package, stores it in a keystore, and uses this as its public key.
  • Often this request for a certificate is allowed only when the machine is connected locally to the network, rather than over the internet. This means people need to bring their portable machines into the office to renew a certificate.

If you create the private certificate centrally and email it to the end user, someone who is snooping on the email will get a copy of it!

A machine can have more than one keystore and a keystore can have one or more certificates. With some servers you can configure the default certificate to use. If not, the “best” certificate is chosen. This could depend on the strength and selection of the cipher specs.

What if’s

Once you have set up your certificate strategy it is difficult to change it, so it is worth setting up a prototype to make sure the end to end solutions work, then throwing the prototype away and starting again.

You need to consider how to solve problems like

  • What if someone leaves my organisation, how do I inactivate the certificate
  • What happens if someone loses their laptop, how do I inactivate the certificate
  • Certificates have expiry dates. What do I need to do to renew the certificate before it expires – for example you could email the owner and tell them to bring the laptop to the office to renew the certificate
  • What happens if a CA expires?
  • When someone joins the department, how do I update the access lists? Usually this is done using a repository like LDAP.
  • Are the CHLAUTH records restrictive enough to prevent the wrong people from getting access, but broad enough that you do not need to change them when someone joins the organisation.
  • What if you open up your business to a new organisation with a different standard of DN? What do you need to change to support it.

Use of physical keystores.

You can have a physical keystore to store your private key. This can range from a USB device up to integrated devices.
With these, people cannot just copy the keystore and stash file; they need physical access to the device.

You need to plan how these will be used in your organisation. For example, you have two machines for HA reasons, and each has a USB store. Does each machine need its own private key? How do you handle disaster recovery when someone loses or breaks the keystore?

Physical keystores can have a secure export and import capability. You configure a key onto the device, for example saying it needs 3 partial keys, needing three people to enter their portion of it. When you export the key, it comes out encrypted.

In this scenario the configuration process could be

  • Configure the first device. 3 people enter their password.
  • Create a private key
  • Export the private key and send it to the second machine. It is encrypted so can safely be sent.
  • Go to the second machine, and configure the second device.
  • As before, the three people have to configure the device.
  • Import the encrypted key to the device
  • Go to the next machine, etc.
  • In some cases you can say that n out of m people are required to configure the device. So any 3 out of a team of 6 is enough.

Would you lock your front door and leave the key under the mat? So why do you do it with digital keys?

Where I live it is Island Mentality. Someone said to me that they do not lock their front door. Sometimes, when they come home, they find some eggs or tray-bakes on the kitchen table. They went on a celebration cruise, but could not find the key to the front door, and so left the house unlocked the two weeks they were away.

Digital certificates and keys are used for identification and authentication. Often these are stored in a key store, just a file in Windows or Unix. You typically need a password to be able to read the file. If you got hold of a keystore, you could try “password” with an “o”, “passw0rd” with zero etc. There is no limit to the number of attempts you can have. Don’t worry, the password is stored in a stash file, which is just another file. If you have the key store and the stash file you can open the keystore using standard commands. Having both the keystore and the stash file is like finding the front door unlocked.

If someone is an administrator on the machine, they can access any file and so can get the keystore and the stash file. IBM says you need superuser access to install MQ – so the MQ administrator can access these files. I heard that one enterprise was doing backups from the user’s machines to a remote site. The files were encrypted at the remote site, but not the network link to the remote site – whoops! The files could have been stolen en route.

Use external security devices.

You can get round this problem by using an external Hardware Security Module. Instead of storing the keys in a file, they are stored on an external device. You can get USB-like devices. Some HSMs can store keys, other HSMs can encrypt data. For example my bank gives its users a small machine. You put in your debit card and enter your pin. It encrypts the data and generates a one time key which you enter into the bank’s web site.

To steal the keystore you now need access to the physical machine to be able to unplug the USB.

Built in devices that cannot be removed.

On some machines, such as z hardware, they have a tamper resistant “cryptographic chip” built in. If you remove it from the machine, it is useless. When you configure it you need three keys, so you have three people each with their own key. When you install the backup machine, the three people have to go on site, and re enter their keys. They have mechanisms like three wrong passwords and it self destructs (perhaps in a cloud of smoke, as it does in the movies).

“Cloud”

One of the selling points of cloud is flexibility. You can deploy an image anywhere; you can wheel in new machines, and wheel out old machines; and you can have different “tenants” on the same hardware. This makes it difficult to use an HSM device to store your keys, as each machine needs the same keys, and the HSM could have all the keys from all the tenants. So you have the problem of having your key store as a file with its stash file, and even more people have access to these files.

Would you lock your front door and leave the key under the mat? So why do you do it with digital keys?

It is all down to the management of risk. Digital certificates do not give absolute protection. Strong encryption just means it takes longer to crack!

Certificate logon to MQWEB on z/OS, the hard way.

I described here different ways of logging on to the MQ Web Server on z/OS. This post describes how to use a digital certificate to logon. There is a lot of description, but the RACF statements needed are listed at the bottom.

I had set up my keystore and could logon to MQWEB on z/OS using certificates. I just wanted to not be prompted for a password.

Once it is set up it works well. I thought I would deliberately try to get as many things wrong as possible, so I could document the symptoms and the cure. Despite this, I often had my head in my hands, asking “Why! – it worked yesterday”.

Can I use CHLAUTH ? No – because that is for the CHINIT, and you do not need to have the CHINIT running to run the web server.

Within one MQ Web Server, you can use both “certificate only” logon as well as using “certificate, userid and password” logon.

When using the SAF interface you specify parameters in the mqwebuser.xml file, such as keyrings, and what level of certificate checking you want.

Enable SAF messages.

If you use <safCredentials suppressAuthFailureMessage="false" …> in the mqwebuser.xml then, if a SAF request fails, there will be a message on the z/OS console. You would normally have this value set to "true" because when the browser (or REST client) reauthenticates (it could be every 10 seconds) you will get a message saying a userid does not have access to an APPL, or EJBROLE profile. If you change this (or make any change to the mqwebuser.xml file), issue the command

f CSQ9WEB,refresh,config

To pick up the changes.

Configure the server name

In the mqwebuser.xml file is <safCredentials profilePrefix="MQWEB" …> where MQWEB identifies the server, and is used in the security profiles (see below).

SSL parameters

In the mqwebuser.xml file you specify

  • <ssl …
  • clientAuthenticationSupported="true"|"false". The doc says: The server requests that a client sends a certificate. The client’s certificate is optional.
  • clientAuthentication="true"|"false" if true, then the client must send a certificate.
  • sslProtocol="TLSV1.2"
  • keyStoreRef="…"
  • trustStoreRef="…"
  • id="…"
  • <sslDefault … sslRef="…" this points to a particular <ssl id=…> definition. It allows you to have more than one <ssl definition, and pick one.

I think it would have been clearer if the parameters were clientAuthentication=”yes”|”no”|”optional”. See my interpretation of what these mean here.

Client authentication

The client certificate maps to a userid on z/OS, and this userid is used for access control.

The TLS handshake: You have a certificate on your client machine. There is a handshake with the server, where the certificate from the server is sent to the client, and the client verifies it. With TLS client authentication the client sends a certificate to the server. The server validates it.

If any of the following are false, it drops through to Connecting with a client certificate, and authenticate with userid and password below.

Find the z/OS userid for the certificate

The certificate is looked up in a RACDCERT MAP to get a userid for the certificate (see below for example statements). It could be a one to one mapping, or depending on say OU=TEST or C=GB, it can check on part of the DN. If this fails you get

ICH408I USER(START1 ) GROUP(SYS1 ) NAME(####################)
DIGITAL CERTIFICATE IS NOT DEFINED. CERTIFICATE SERIAL NUMBER(0194)
SUBJECT(CN=ADCDC.O=cpwebuser.C=GB) ISSUER(CN=SSCARSA1024.OU=CA.O=SSS.
C=GB).

Check the userid against the APPL class.

The userid is checked against the MQWEB profiles in the APPL class. (Where MQWEB is the name you configured in the web server configuration files). If this fails you get

ICH408I USER(ADCDE ) GROUP(TEST ) NAME(ADCDE ) MQWEB CL(APPL )
WARNING: INSUFFICIENT AUTHORITY ACCESS INTENT(READ ) ACCESS ALLOWED(NONE )

Pick the EJBROLE for the userid

There are several profiles in the EJBROLE class. If the userid has READ access to a profile, the userid gets the corresponding role. For example for the profile MQWEB.com.ibm.mq.console.MQWebAdmin, if the userid has at least READ access to the profile, it gets MQWEBADMIN privileges.
If these fail you get messages in the MQWEB message log(s).

To suppress the RACF messages use option suppressAuthFailureMessage="true" described above.

The userid needs access to at least one profile to be able to use the MQ Web server.

Use the right URL

The URL is like https://10.1.1.2:9443/ibmmq/console/

No password is needed to logon. If you get this far, displaying the userid information (click on the ⓘ icon) gives you Principal:ADCDE – Read-Only Administrator (Client Certificate Authentication) where ADCDE is the userid from the RACDCERT MAP mapping.

Connecting with a client certificate, and authenticate with userid and password.

The handshake is done as described above. If clientAuthentication="true" is specified, and the handshake fails, then the client gets “This site can’t be reached” or a similar message.

If the site can be reached, and a URL like https://10.1.1.2:9443/ibmmq/console/login.html is used, this displays a userid and password panel.

The password is verified, and if successful the specified userid is looked up in the APPL and EJBROLE profiles as described above.

If you get this far, and have logged on, displaying the userid information (click on the ⓘ icon) gives you Principal:colin – Read-Only Administrator (Client Certificate Authentication) where colin is the userid I entered.

The short solution to implement certificate authentication

If you already have TLS certificates for connecting to the MQ Web Server, you may be able to use a URL like https://10.1.1.2:9443/ibmmq/console/ to do the logon. If you use an invalid URL, it will substitute it with https://10.1.1.2:9443/ibmmq/console/login.html .

My set up.

I set up a certificate on Linux with a DN of C=GB,O=cpwebuser,CN=ADCDC and signed by C=GB,O=SSS,OU=CA,CN=SSCARSA1024. The Linux CA had been added to the trust store on z/OS.

Associate a certificate with a z/OS userid

I set up a RACF MAP of certificate to userid. It is sensible to run these using JCL, and to save the JCL for each definition.

 /*RACDCERT DELMAP( LABEL('ADCDZXX'  )) ID(ADCDE  ) 
 /*RACDCERT DELMAP( LABEL('CA'  )) ID(ADCDZ  )   
RACDCERT MAP ID(ADCDE  )  - 
    SDNFILTER('CN=ADCDC.O=cpwebuser.C=GB') - 
    WITHLABEL('ADCDZXX') 
                                                 
 RACDCERT MAP ID(ADCDZ  )  - 
    IDNFILTER('CN=SSCARSA1024.OU=CA.O=SSS.C=GB') - 
    WITHLABEL('CA       ') 
                                                 
 RACDCERT LISTMAP ID(ADCDE) 
 RACDCERT LISTMAP ID(ADCDZ) 
 SETROPTS RACLIST(DIGTNMAP, DIGTCRIT) REFRESH 

This mapped the certificate CN=ADCDC.OU=cpwebuser.C=GB to userid ADCDE. Note the “.” between the parts, and the order has changed from least significant to most significant. For other certificates coming in with the Issuer CA of CN=SSCARSA1024.OU=CA.O=SSS.C=GB they will get a userid of ADCDZ.

You do not need to refresh anything else; the change becomes visible when the SETROPTS RACLIST REFRESH is issued.

First logon attempt

I stopped and restarted my Chrome browser, and used the URL https://10.1.1.2:9443/ibmmq/console. I was prompted for a list of valid certificates. I chose “Subject:ADCD: Issuer:SSCARSA1024 Serial:0194”.

Sometimes it gave me a blank screen, other times it gave me the logon screen with username and Password fields. It had a URL of https://10.1.1.2:9443/ibmmq/console/login.html.

On the z/OS console I got

ICH408I USER(ADCDE ) GROUP(TEST ) NAME(ADCDE ) MQWEB CL(APPL )
WARNING: INSUFFICIENT AUTHORITY ACCESS INTENT(READ ) ACCESS ALLOWED(NONE )

I could see that the userid (ADCDE) from the RACDCERT MAP was being used (as expected). To give the userid access to the MQWEB resource, I issued the commands

 /* RDEFINE APPL MQWEB UACC(NONE)
PERMIT MQWEB CLASS(APPL) ACCESS(READ) ID(ADCDE)
SETROPTS RACLIST(APPL) REFRESH

And tried again. The web screen remained blank (even with the correct URL). There were no messages on the MQWEB job log. Within the MQWEB stdout (and /u/mqweb/servers/mqweb/logs/messages.log) were messages like

[AUDIT ] CWWKS9104A: Authorization failed for user ADCDE while invoking com.ibm.mq.console on
/ui/userregistry/userinfo. The user is not granted access to any of the required roles: [MQWebAdmin, MQWebAdminRO, MQWebUser].

Give the userid access to the EJBroles

In my mqwebuser.xml I have <safCredentials profilePrefix="MQWEB". The MQWEB is the prefix of the EJBROLE resource name. I had set up a group, MQPA Web Readonly Admin (MQPAWRA), to make the administration easier. Give the group permission, and connect the userid to the group.

 /* RDEFINE EJBROLE MQWEB.com.ibm.mq.console.MQWebAdminRO  UACC(NONE) 
PERMIT MQWEB.com.ibm.mq.console.MQWebAdminRO CLASS(EJBROLE) - 
  ACCESS(READ) ID(MQPAWRA) 
CONNECT ADCDE group(MQPAWRA)
SETROPTS RACLIST(EJBROLE) REFRESH
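
The two checks seen so far (the APPL profile, then the EJBROLE profiles) can be modelled in a few lines of Python. This is a sketch of my understanding, not how SAF or Liberty implement it; the function names are hypothetical, and the role names are the three listed in the CWWKS9104A message.

```python
# Roles named in the CWWKS9104A message
ROLES = ["MQWebAdmin", "MQWebAdminRO", "MQWebUser"]


def ejbrole_profiles(prefix, roles=ROLES):
    """Build the EJBROLE profile names for a given profilePrefix."""
    return ["%s.com.ibm.mq.console.%s" % (prefix, role) for role in roles]


def can_use_console(has_appl_read, permitted_profiles, prefix="MQWEB"):
    """Model: the userid needs READ to the APPL profile, plus READ to
    at least one of the EJBROLE profiles."""
    if not has_appl_read:
        return False  # this is the ICH408I case
    return any(p in permitted_profiles for p in ejbrole_profiles(prefix))


print(ejbrole_profiles("MQWEB"))
print(can_use_console(True, {"MQWEB.com.ibm.mq.console.MQWebAdminRO"}))
```

With access to APPL but to none of the EJBROLE profiles, the model returns False, which corresponds to the blank screen and the CWWKS9104A message.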

Once the security change has been made, it is visible immediately to the MQWEB server. I clicked the browser’s refresh button and successfully got the IBM MQ welcome page (without having to enter a userid or password). When I clicked on the ⓘ icon it said

Principal:ADCDE – Read-Only Administrator (Client Certificate Authentication)

Logoff doesn’t

If you click the logoff icon, you get logged off, but you immediately get logged on again; that is what certificate authentication does for you. You need to go to a different web site. If you come back to the ibmmq/console web site, it will use the same certificate as you used before.

Ways of logging on to MQWEB on z/OS.

There are different ways of connecting to the MQ Web Server on z/OS (this is based on the z/OS Liberty Web server). Some ways use the SAF interface. This is an interface to the z/OS security manager. IBM provides RACF, there are other security managers such as TOP SECRET, and ACF2. Userid information is stored in the security manager database.

The ways of connecting to the MQ Web server on z/OS.

  • No security. Use no_security.xml to set up the MQ Web Server.
  • Hard coded userids and passwords in a file. Use basic_registry.xml. This defines userid information like <user name="mqadmin" password="mqadmin"> . This is suitable only for a sandbox. The password can be obscured or left in plain text.
  • Logon by z/OS userid and password. Use zos_saf_registry.xml. Logon is by userid and password and checked by a SAF call to the z/OS security manager. The userid is checked for access to a resource like MQWEB.com.ibm.mq.console.MQWebAdmin in class(EJBROLE) and MQWEB in class(APPL).
  • Connect with a client certificate, and authenticate using userid and password. This uses zos_saf_registry.xml plus additional configuration. The userid, the password, and access to the EJBROLE and APPL resources are checked by the SAF interface. The certificate is not used to check access; it is used only for the TLS handshake.
  • Certificate authentication; a password is not required. Connect using a client certificate. This uses zos_saf_registry.xml plus additional configuration. Using the SAF interface, the certificate maps to a z/OS userid; this userid is used for checking access to the EJBROLE and APPL resources.

The configuration for using TLS is not clear.

I found the documentation for the TLS configuration to be unclear. Two parameters are <ssl clientAuthentication clientAuthenticationSupported…/> The documentation says

  • If you specify clientAuthentication="true", the server requests that a client sends a certificate. However, if the client does not have a certificate, or the certificate is not trusted by the server, the handshake does not succeed.
  • If you specify clientAuthenticationSupported="true", the server requests that a client sends a certificate. However, if the client does not have a certificate, or the certificate is not trusted by the server, the handshake might still succeed.
  • If you do not specify either clientAuthentication or clientAuthenticationSupported, or you specify clientAuthentication="false" or clientAuthenticationSupported="false", the server does not request that a client send a certificate during the handshake.

I experimented with the different options and the results are below.

  1. I used a web browser with several possible certificates that could be used for authentication. I was given a pop up which listed them. Chrome remembers the choice. With Firefox, you can click an option “set as default“. If this is unticked you get prompted every time.
  2. I used a browser with no certificates for authentication.

When a session was not allowed, I got (from Firefox) Secure Connection Failed. An error occurred during a connection to 10.1.1.2:9443. PR_END_OF_FILE_ERROR

Client Authentication | Client Authentication Supported | Browser with certificates | Browser without certificates
true | ignored | Pick certificate, userid and password NOT required | PR_END_OF_FILE_ERROR
false | true | Pick certificate, userid and password NOT required | A variety of results. One of: 1. PR_END_OF_FILE_ERROR, 2. Blank screen, 3. Userid and password required
false | false | Userid and password required | Userid and password required
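
The results of my experiments can be written down as a small Python lookup, which I find easier to reason about than the documentation. It records only what I observed; the function name and the result strings are my own, and other browsers or server levels may behave differently.

```python
def observed_result(client_auth, client_auth_supported, browser_has_certs):
    """Return the behaviour I observed for a combination of settings."""
    if client_auth:
        # clientAuthenticationSupported is ignored in this case
        if browser_has_certs:
            return "pick certificate, no userid and password"
        return "PR_END_OF_FILE_ERROR"
    if client_auth_supported:
        if browser_has_certs:
            return "pick certificate, no userid and password"
        # one of PR_END_OF_FILE_ERROR, blank screen, userid and password
        return "varies"
    return "userid and password required"


for settings in [(True, False, False), (False, True, True), (False, False, True)]:
    print(settings, "->", observed_result(*settings))
```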

When using certificates, you can choose to specify userid and password instead of client authentication by using the URL https://10.1.1.2:9443/ibmmq/console/login.html instead of https://10.1.1.2:9443/ibmmq/console .

Note well.

The server caches credential information. If you change the configuration and refresh the server, the change may not be picked up immediately.

Once you have logged on successfully, a cookie is stored in your browser. This may be used to authenticate, until the token has expired. To be sure of clearing this token I restarted my browser.

Are you going crazy with Chrome giving fatal: certificate_unknown, bad_certificate NET::ERR_CERT_INVALID? Me too!

I came back to using Chrome with my MQ Web server. It was working last week, but yesterday and today it stopped working. In debugging it, I’ve learned even more ways of checking TLS handshakes to see why they fail!

Using a Wireshark packet trace on Linux, and the TLS trace in the web server, I could see that the Client Hello and Server Hello worked, but then Chrome reported NET::ERR_CERT_INVALID and the traces showed Alert Level: Fatal, Description: Certificate Unknown.

  • I had been through the checks to make sure the CA was in the client key store.
  • I double checked, and tripled checked to make sure it was the right CA.
  • I exported the z/OS CA certificate, downloaded it and imported it.
  • I displayed the z/OS version, and the Chrome keystore’s version and they matched ( ok – the not-after and not-before times were different due to different time zones).
  • I shut everything down and restarted it the next day.
  • I felt like screaming “AHH, it worked last week, it works on Firefox, it should all work on Chrome”.

After a day working in the garden, I had time to consider a different approach.

How I traced the problem down.

From your browser you can display the “problem” certificate by clicking on the icon in front of the URL. You get into “certificate viewer”. There is a “General” display which shows you useful information about the certificate. There is also the “Details”. At the bottom of the Details is a button labelled “Export”. Click it and export the certificate to a file such as SERVER.PEM.

You can now use openssl on this. For example

openssl x509 -in ~/Downloads/SERVER.cert.bad -text -noout|less

This shows you all the details of the certificate, so you can check them again!

You can also export the CA you think is being used from the browser keystore, for example to ca.pem.

You can use the openssl verify command to do its validation of the server certificate and the CA certificate.

openssl verify -CAfile ca.pem -show_chain ~/Downloads/SERVER.cert

worked, it gave

OK
Chain:
depth=0: O = ZZZZ, OU = SSS, CN = SERVER (untrusted)
depth=1: O = TEMP, OU = TEST, CN = TEMP4Certification Authority

This shows that the server certificate and the CA certificate were OK. However, the same command with -x509_strict,

openssl verify -CAfile ca.pem -show_chain -x509_strict ~/Downloads/SERVER.cert

gave

O = ZZZZ, OU = SSS, CN = SERVER
error 24 at 0 depth lookup: invalid CA certificate
error /home/colinpaice/Downloads/SERVER.cert.bad: verification failed

I felt I was on the trail of the problem.
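
To make it easy to repeat the comparison, the two openssl commands can be driven from a small Python script. This is a sketch which assumes openssl is on the PATH; the function names are hypothetical, and the only openssl options used are the ones shown above.

```python
import subprocess


def verify_command(ca_file, cert_file, strict=False):
    """Build the openssl verify command, optionally with -x509_strict."""
    cmd = ["openssl", "verify", "-CAfile", ca_file, "-show_chain"]
    if strict:
        cmd.append("-x509_strict")
    cmd.append(cert_file)
    return cmd


def run_both(ca_file, cert_file):
    """Run the normal and the strict verification and keep both outputs."""
    results = {}
    for strict in (False, True):
        proc = subprocess.run(verify_command(ca_file, cert_file, strict),
                              capture_output=True, text=True)
        results[strict] = (proc.returncode, proc.stdout + proc.stderr)
    return results


print(" ".join(verify_command("ca.pem", "SERVER.cert", strict=True)))
```

If the normal check gives a zero return code and the strict check does not, the certificate passes the basic chain validation but fails one of the stricter consistency checks, which is the situation I was in.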

Depth 0 is the server certificate, depth 1 is the signer’s certificate, and depth 2 is the next level up. It was strange that it said the server certificate was an invalid CA.

Looking in the openssl source, openssl/crypto/x509/x509_vfy.c showed several lines giving the error code X509_V_ERR_INVALID_CA, which is error 24 (invalid CA certificate).

It turns out that the server’s certificate had been configured with a key usage of “certsign” (I had mis-copied a line in the certificate’s definition from an earlier test). Certsign means the certificate is a CA, because it can sign certificates, but the CA flag was not set in the certificate, so it was clearly a bad_certificate. Because it failed this consistency check, the validation failed and it was reported as certificate unknown, and I was getting very frustrated.
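
The same consistency check can be approximated by scanning the output of openssl x509 -text. The Python below is a rough sketch: it assumes the usual layout where the values are on the line after the “X509v3 Key Usage” and “X509v3 Basic Constraints” headings, and the sample text is a made-up fragment, not my real certificate.

```python
def certsign_without_ca(text):
    """Return True if the certificate text shows Certificate Sign key
    usage but no CA:TRUE in the basic constraints."""
    lines = [line.strip() for line in text.splitlines()]
    has_certsign = False
    is_ca = False
    for i, line in enumerate(lines):
        following = lines[i + 1] if i + 1 < len(lines) else ""
        if line.startswith("X509v3 Key Usage"):
            has_certsign = "Certificate Sign" in following
        elif line.startswith("X509v3 Basic Constraints"):
            is_ca = "CA:TRUE" in following
    return has_certsign and not is_ca


# Hypothetical fragment, laid out like typical openssl x509 -text output
SAMPLE = """
        X509v3 extensions:
            X509v3 Key Usage: critical
                Certificate Sign
"""

print(certsign_without_ca(SAMPLE))
```

You could feed it the output of the openssl x509 command shown earlier. It is no substitute for openssl verify -x509_strict, but it points directly at this particular mistake.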

What keyusage is required?

Any keyusage, from no keyusage to KEYUSAGE(DATAENCRYPT, DOCSIGN, HANDSHAKE), works.

Just do not use KEYUSAGE(CERTSIGN) .

Recreate the certificate, and add it to the key store, and use

f CSQ9WEB,refresh,keystore

to get the MQ web server to pick up the change to the keystore. You may need to restart your web browser.

I think this is a useful technique which I will use in the future when I stumble over the next TLS set up problem.

HA Liberty web server – implementing VIPA with distributing connections.

Overview of VIPA solutions

You can implement VIPA, where you give your application its own IP address, across multiple TCPIP images.   This solves the problem of certificates not matching the host IP address.

You can have

  • One TCPIP image processing the connection requests. You have multiple TCPIP images – but only one TCPIP image at a time processes the connections.   If the TCPIP image stops, another can take over.
  • Multiple TCPIP stacks can process connection requests. A front end TCPIP image takes the connection requests and distributes them to TCPIP instances where the application is running.   You can distribute the connections across multiple TCPIP images using techniques such as Round Robin or Hot Standby.  This is based on Sysplex Distributor technology.

This blog post discusses the second case.

To provide background information, I created

Sysplex Distributor background

Sysplex Distributor is like having a router inside your TCPIP on z/OS; it can route traffic transparently to other TCPIP images in the environment.  The Sysplex Distributor can distribute connection requests for a VIPA to TCPIP images where the application is running.    This set up is called Distributed VIPA (DVIPA).

It took me about 2 weeks to get a Sysplex DVIPA working – about a week understanding the documentation on VIPA, and the other week trying to understand why it didn’t work – and the simple configuration error I had.

I’ll break it down into simple stages which should help you understand the documentation.

The scenario

The scenario I used was going from my laptop, to z/OS running under zPDT on my laptop.  In effect there was an OSA connection between Ubuntu and my z/OS LPAR.  If you do not know what an OSA is, think of it as an Ethernet connection which can plug into multiple z/OS LPARs.

  1. My Ubuntu had an address of 10.1.1.1 over the tunnel connection
  2. I had one LPAR  with three TCPIP images
    1. TCPIP, host address 10.1.1.1 for the primary “front end”
    2. TCPIP2, host address 10.1.1.2 for the backup “front end” and where a server instance was running
    3. TCPIP3, host address 10.1.1.3 with a server instance running.
  3. I used a VIPA address of 10.1.3.10

The steps are

  1. Connect from Ubuntu to the z/OS
  2. Configure the LPAR(s)
  3. Define the VIPA configuration
  4. Define the “routing” to where the server was running.
  5. Getting the server to use the VIPA
  6. Commands to see what is going on (or not as the case may be)

Connect from Ubuntu to the z/OS

I had an existing tunnel connection from Ubuntu to z/OS.

I used the ip route command

sudo ip r add 10.1.3.10 dev tap0

to define the route to 10.1.3.10 via the tunnelling device tap0 .  This looks like an OSA connection into z/OS.

Configure the LPAR(s)

The Sysplex Distributor uses XCF communications between LPARs and TCPIP images on the LPARs.

You configure each TCPIP with a statement

IPCONFIG DYNAMICXCF 172.1.1.x

My “frontend” TCP/IP had  IPCONFIG DYNAMICXCF 172.1.1.1, the other two TCP/IP images had 172.1.1.2 and 172.1.1.3.

The 172.1.1.x address can be any address not used by your enterprise.  It is internal to the Sysplex configuration.

Define the VIPA configuration

You define the configuration once, in the front end TCPIP.  It is visible from the other TCP/IP images because the information is shared via the DYNAMICXCF.

You define the VIPA in the front end TCP/IP image with a VIPADEFINE netmask address.  I used

VIPADYNAMIC
  VIPADEFINE 255.255.255.0 10.1.3.10
...
ENDVIPADYNAMIC

You can define VIPABACKUP in another TCPIP image, so if the main front end TCP/IP is not available then a backup can take the traffic and distribute it to the other TCP/IP stacks.

When the main front end TCP/IP image is restarted, you can have it take back the routing.

Define the “routing” to where the server was running.

You can define a variety of ways of routing the work

  • A Hot Standby – where the input to the front end TCP/IP image is routed to a single “backend” application’s TCP/IP image.  If this fails, the work is routed to a running backup application.
  • A Round Robin – where requests are routed in turn to each TCPIP with an active application.
  • Routing depending on WLM or other load characteristics.

This routing is done by the VIPADISTRIBUTE command in the front end TCP/IP image.
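
As a way of thinking about Hot Standby, here is a much simplified Python model: the targets are listed in order of preference and the first one that is available gets the connection. It is only my mental picture, not the Sysplex Distributor algorithm, and the function name is hypothetical.

```python
def hot_standby_target(targets, is_up):
    """Return the first available target, in order of preference,
    or None if no target is available."""
    for target in targets:
        if is_up(target):
            return target
    return None


# TCPIP2 is preferred; TCPIP3 is the backup
available = {"TCPIP2": False, "TCPIP3": True}
print(hot_standby_target(["TCPIP2", "TCPIP3"], lambda t: available[t]))
```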

The definitions for the front end TCPIP

IPCONFIG SYSPLEXROUTING 
    DYNAMICXCF 172.1.1.1 255.255.255.0 3 

VIPADYNAMIC 
   VIPADEFINE 255.255.255.0 10.1.3.10 

   VIPADISTRIBUTE DEFINE DISTM ROUNDROBIN 10.1.3.10 PORT 8443 
      DESTIP 
         172.1.1.2 
         172.1.1.3 
ENDVIPADYNAMIC 

This routes connection requests to the TCPIP images with the DYNAMICXCF of 172.1.1.2 (TCPIP2) and 172.1.1.3 (TCPIP3)

The definitions for TCPIP2

IPCONFIG SYSPLEXROUTING 
DYNAMICXCF 172.1.1.2 255.255.255.0 3 

The definitions for TCPIP3

IPCONFIG SYSPLEXROUTING 
DYNAMICXCF 172.1.1.3 255.255.255.0 3

Use of VIPABACKUP

If the backup TCPIP front end image is used, it can have its own VIPADISTRIBUTE statement, or just use the statement shared from the main front end TCP/IP image.  It is better for the backup to have its own VIPADISTRIBUTE statements, to cover the case when the backup TCPIP is started before the front end TCPIP. (These statements can be put into a PDS, and included using the INCLUDE dataset(member) statement in both primary and backup environments.)

To define TCPIP2 as a backup I used

VIPADYNAMIC 
    VIPABACKUP MOVEABLE IMMEDIATE 255.255.255.0 10.1.3.10 
    VIPADISTRIBUTE DEFINE DISTM ROUNDROBIN 10.1.3.10 PORT 8443 
        DESTIP 
        172.1.1.2 
        172.1.1.3 
ENDVIPADYNAMIC 

Getting the server to use the VIPA

The TCP/IP images hosting the applications just have the IPCONFIG DYNAMICXCF aa.bb.cc.dd statement.  They do not have any VIPADYNAMIC … ENDVIPADYNAMIC statements unless they are the main or backup front end TCP/IP images.

The application can connect using the VIPA address, for example create the SSLSOCKET Listener passing the VIPA address.   You can also configure TCP/IP so when a port is used, it binds to a particular IP address for example

PORT 8443 TCP * BIND 10.1.3.10

So an application using port 8443 to listen, will get IP address 10.1.3.10 – which in my case is a VIPA address.

You can use

PORT
9443 TCP * SHAREPORT BIND 10.1.3.7

to allow the port to be shared by many applications on a TCPIP Instance.

How are the connections distributed?

The VIPADISTRIBUTE  has many routing options. I used Hot Standby and Round Robin.

With RoundRobin, I had

  • the front end TCPIP
  • TCPIP2 with two Liberty servers
  • TCPIP3 with  one Liberty server

I ran some workload and found that the server on TCPIP3 had half the requests, and each of the two servers on TCPIP2 had a quarter of the overall requests.  This shows the routing is done at the TCPIP level – not the number of servers.
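
The split I saw can be reproduced with a small Python simulation. It assumes that the requests are shared round robin between the TCPIP stacks, and then shared evenly between the servers listening on each stack; the second part is my assumption to match what I observed. The Liberty server names are made up.

```python
from collections import Counter
from itertools import cycle


def simulate(stacks, requests):
    """stacks maps a TCPIP stack name to the list of servers on it."""
    stack_cycle = cycle(stacks)  # round robin across the TCPIP stacks
    server_cycles = {name: cycle(servers) for name, servers in stacks.items()}
    counts = Counter()
    for _ in range(requests):
        stack = next(stack_cycle)
        # within a stack, take the servers in turn
        counts[next(server_cycles[stack])] += 1
    return counts


counts = simulate({"TCPIP2": ["LibertyA", "LibertyB"],
                   "TCPIP3": ["LibertyC"]}, 1000)
print(counts)
```

With 1000 requests, the single server on TCPIP3 gets 500, and the two servers on TCPIP2 get 250 each, which matches the half and quarter split above.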

HA Liberty web server – implementing VIPA using the simpler VIPARANGE technology

Overview of VIPA solutions

You can implement VIPA, where you give your application its own IP address, across multiple TCPIP images.   This solves the problem of certificates not matching the host IP address.

  • One TCPIP image processes the connection requests. You have multiple TCPIP images – but only one TCPIP image at a time processes the connections.   If the TCPIP image stops, another can take over.
  • Multiple TCPIP stacks can process connection requests. This uses Sysplex Distributor;  a front end TCPIP image takes the connection requests and distributes them to TCPIP instances where the application is running.   You can use load balancing such as Round Robin, or Hot Standby.

This blog post discusses the first case.

To provide background information, I created

Using VIPARANGE configuration

The technique uses the VIPARANGE configuration statement.

The concept is that many LPARs can be attached to an OSA adapter; one, and only one, TCPIP stack (I don’t know which of the available images) takes the connection requests and passes them on to the application on that TCPIP image.

You allocate a range of TCPIP addresses for your applications, with the same network prefix, for example 9.4.6.x.   Allocate a host id to a Liberty instance, for example 9.4.6.7.   The Liberty instance keeps this address wherever it runs.  You configure your routers so that 9.4.6.* addresses are routed to the OSA adapter.

For each TCPIP image where you want to run Liberty, add to the  TCPIP startup configuration (or to an OBEY file)

VIPADYNAMIC 
   VIPARANGE DEFINE 255.255.255.0 9.4.6.7 
ENDVIPADYNAMIC

The 255.255.255.0 is the  subnet mask.  If your organisation uses a different subnet mask, it affects the IP addresses used.

These instructions say that they are defining a range of IP addresses on this LPAR, for  9.4.6.1 to 9.4.6.254

If an application connects to TCPIP, and the bind specifies a value in this range (9.4.6.1 to 9.4.6.254) then it is considered a VIPA address.  If the application used 9.4.6.7 this would count as a VIPA address.
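
You can check which addresses fall into the range using the Python ipaddress module. This sketch just applies the subnet mask to the address from the VIPARANGE statement; the function name is my own.

```python
import ipaddress


def viparange_network(mask, address):
    """Return the network covered by VIPARANGE DEFINE mask address."""
    # strict=False lets us pass a host address rather than the network address
    return ipaddress.ip_network("%s/%s" % (address, mask), strict=False)


net = viparange_network("255.255.255.0", "9.4.6.7")
hosts = list(net.hosts())
print(net, hosts[0], hosts[-1])
print(ipaddress.ip_address("9.4.6.7") in net)
```

If your organisation uses a different subnet mask, changing the first parameter shows which addresses would then count as being in the range.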

When your application (Liberty) connects to TCP and uses an address in the VIPARANGE,  the TCPIP instance will create a dynamic IP address.   When I started my server application,   I got a z/OS console message

EZD1205I DYNAMIC VIPA 9.4.6.7 WAS CREATED USING BIND BY jobname ON TCPIP2.

When I shut down the server I got

EZD1298I DYNAMIC VIPA  9.4.6.7 DELETED FROM TCPIP2
EZD1207I DYNAMIC VIPA 9.4.6.7 WAS DELETED USING CLOSE API BY jobname ON TCPIP2

If the VIPA address is active on more than one TCPIP image, just one image will get all of the requests.  If you stop this image, another TCPIP image can take over.

If you have a different server using the same IP address but a different port number, the same LPAR will process its requests, because the IP address is the same.

With VIPARANGE you do not get connections distributed to more than one TCPIP image.

In your browser use the address 9.4.6.7:9443. The network router routes this to the OSA, a TCPIP image captures it and passes it to the application (Liberty).   As part of the handshake Liberty sends down its certificate, which has a SAN of 9.4.6.7. This matches the IP address, so this works.

On another day, when a different z/OS image is capturing the VIPA address connections,  the TCPIP address is 9.4.6.7 as before, so this matches the SAN in the certificate.

Testing it

In a test I used “ping -R 9.4.6.7 ” to the VIPA address.
This reported it was sent to TCPIP stack with 10.1.1.2. When I shut this TCPIP image down, ping reported the request was sent to 10.1.1.3.  It did this with no manual intervention.