Customising for MQWEB Liberty on z/OS, things the documentation does not tell you about

This post covers the customising you need to consider for enterprise use of the Liberty MQWEB server. It covers

  • Setting up the USS path and defining an alias for the MQ executables’ directory
  • Do you have common configuration across mqwebuser.xml files?
  • Decide if you want to use setmqweb.
  • Setting up the server’s certificate and keyring
  • Setting up the trust store
  • Setting up the Angel process(es)
  • Customising jvm.options
    • To prevent the web server coming up if the Angel process is missing
    • Setting the time zone
  • Customising the mqwebuser.xml
    • SAF definitions
    • Setting the log sizes so the logs can be viewed
  • Letting requests in from outside of the LPAR
  • dspmqweb/setmqweb – which instance to use?
  • Selecting which IP stack to use
  • Customising ISPF option 3.17 – Unix Directory List

Set up the USS path and define an alias for the MQ executables’ directory

To be able to use the dspmqweb and setmqweb commands, the shell needs to know where the commands are located.

You can add the following statement to the user’s .profile file, or to /etc/profile

export PATH=/usr/lpp/mqm/V9R1M1/web/bin:$PATH

If you have multiple releases of MQ in your environment you could set up shell commands like v913dspmqweb.sh containing

/usr/lpp/mqm/V9R1M3/web/bin/dspmqweb "$@"

But this causes extra work when you need to migrate to a new release. It might be better to set up aliases using symbolic links

ln -s /usr/lpp/mqm/V9R1M3/web/bin /v913
ln -s /usr/lpp/mqm/V9R1M3/web/bin /mqcur

so you just need to type /v913/dspmqweb or /mqcur/setmqweb

As part of the migration to a newer release you just change the alias.
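The migration step can be sketched as follows. This is a minimal sketch that works in a scratch directory rather than the real root; the directory names under $base only mirror the paths above, and the empty dspmqweb file is a stand-in so the result can be checked.

```shell
# Sketch only: a scratch directory stands in for / and /usr/lpp/mqm.
base=/tmp/mqlink.$$
mkdir -p "$base/V9R1M1/web/bin" "$base/V9R1M3/web/bin"
touch "$base/V9R1M3/web/bin/dspmqweb"   # stand-in for the real command

# Initial state: mqcur points at the old release.
ln -s "$base/V9R1M1/web/bin" "$base/mqcur"

# Migration: remove the old link and create it again for the new release.
rm "$base/mqcur"
ln -s "$base/V9R1M3/web/bin" "$base/mqcur"

# Show where the link points now.
ls -l "$base/mqcur"
```

Scripts and users that refer to the mqcur link do not need to change; only the link itself is replaced.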

Do you have common configuration across mqwebuser.xml files?

If you have multiple mqweb instances, either because you have multiple LPARs in a sysplex, or because you have to support different releases of MQ concurrently, you may want to put common configuration in an include file. For example, create a file common.xml to hold the configuration and put

<include location="common.xml" optional="false"/>

in the mqwebuser.xml file.

Decide if you want to use setmqweb.

You can update your *.xml configuration files, or use setmqweb to update mqwebuser.xml for you.

Some organisations do not allow manual changes to configuration. You have to change a configuration file, have it reviewed, and use automation to deploy it.

For test systems it may be ok to use the setmqweb command and change things dynamically.

If you make a change using setmqweb, it updates the mqwebuser.xml file by adding or changing a <variable name="…" value=".."/> statement.

If you are using SAF authentication and certificate authentication

You will need a keyring with the certificate to identify the server (the key store). You will need a keyring to identify the certificates you trust (the trust store). You could use the same keyring for both – but this is not good practice.

The server’s certificate and key store keyring

You need to decide if the MQWEB server uses the same certificate as CICS, WAS and z/OS Connect etc. on the same LPAR.  You could have a common certificate to simplify administration. The certificate needs a Subject Alternative Name, to identify the machine the certificate came from. This can be the DNS name or the dotted address (9.20.4.6) depending on your set up.  It might be easier to define both. Note the RACF command

RACDCERT .. ALTNAME(IP(10.1.1.2) IP(10.1.1.3) DOMAIN('WWW.ME.COM') DOMAIN('WWW.LAST.COM'))…

accepts multiple entries of each type, but only uses the last one. The above command produced a certificate with

Subject's AltNames: 
IP: 10.1.1.3 
Domain: WWW.LAST.COM

This means you may be able to use the certificate only on the LPAR for which it was defined. If you move the server to a different LPAR, it will have a different IP address, and your clients will complain (see below). You may be able to do something clever with VIPA (Virtual IP Addressing), where your sysplex has one IP address and this maps to different IP addresses on each LPAR.

If you have the wrong IP address or domain, the browser gets a message like “Your connection is not private. Attackers may try to steal your information from 10.1.1.2. NET::ERR_CERT_COMMON_NAME_INVALID”

The trust store keyring.

The trust store keyring has the certificates used to authenticate what has been sent from the client. For example, a copy of any self-signed certificate, or the Certificate Authority certificates for the web browser’s certificate.

This keyring could be sysplex wide, and shared by CICS, WAS, z/OS Connect etc – assuming they have the same people connecting to them.

The certificates may have been configured with owner CERTAUTH rather than a userid.

My definitions

<sslDefault sslRef="defaultSSLConfig"/> 
<ssl id="defaultSSLConfig" 
   sslProtocol="TLSv1.2" 
   keyStoreRef="racfKeyStore" 
   trustStoreRef="racfTrustStore" 
   clientAuthenticationSupported="true" 
   clientAuthentication="true" 
   serverKeyAlias="MYMQWEB"/> 

<keyStore filebased="false" id="racfKeyStore" 
   location="safkeyring://START1/KEY" 
   password="password" 
   readOnly="true" 
   type="JCERACFKS"/> 

 <keyStore filebased="false" id="racfTrustStore" 
   location="safkeyring://START1/TRUST" 
   password="password" 
   readOnly="true" 
   type="JCERACFKS"/> 

<webAppSecurity allowFailOverToBasicAuth="false"/>
  • The sslDefault  points to the ssl with the same ID
  • The ssl points to
    • the key store with the server’s certificate, with the id racfKeyStore
    • the trust store to validate connecting clients, with the id racfTrustStore

Create an angel

You need an Angel process to handle the SAF (RACF) security requests – the MQ documentation tells you this.

Typically the Angel started task is started at IPL, and shut down at system shut down.
All instances of the Liberty Web Server running on an LPAR can use the same Angel.

You cannot shut down the Angel process if it is in use, but if you cancel it, the servers using it will stop working (hang) and may abend.

You may want to consider more than one Angel process, and not share it.

When the Angel process has started, it uses no CPU, as the Web Servers execute code within the  Angel address space, on the Web Server’s threads – just like MQ, DB2 etc.

Customise  jvm.options

Stop if there is no Angel  process

If the Angel process is not running at Liberty startup,  then the Web Server may continue to come up.  People will not be authorised to access it, but the Web Server will be running.   This is pretty useless.

You can specify an option so the Liberty server (MQWEB) does not start if the Angel task is not running.

I use

-Dcom.ibm.ws.zos.core.angelRequired=true
#-Dcom.ibm.ws.zos.core.angelName=MYANGEL

-Dcom.ibm.ws.zos.core.angelRequired=true

If the Angel process is not available, MQWEB stops when it detects this.

#-Dcom.ibm.ws.zos.core.angelName=MYANGEL

If you are using a named Angel, uncomment this and specify the Angel name.

If you are using the unnamed Angel, leave this commented.

Set the time zone

The time zone is picked up from TZ in /etc/profile, but you can override it by specifying

-Duser.timezone=Europe/London

This sets the time zone used for the timestamps in the messages.log and trace.log files.

Customise mqwebuser.xml

Message log and trace file settings

If the trace or message files are too big, you cannot view them easily. You have to use edit to look at them, but if the file is too large, browse is substituted. Browse does not do code page conversion, so you are looking at raw ASCII characters in an EBCDIC browser.

<variable name="maxTraceFileSize" value="20"/>
<variable name="maxTraceFiles" value="20"/>
<variable name="maxMsgTraceFileSize" value="20"/>
<variable name="maxMsgTraceFiles" value="20"/>

The file size values are in MB.

You should consider keeping your messages.log files for a week or so, so make the number of files large enough.

SAF – Access to RACF

If you are using SAF (RACF or other z/OS security manager) to manage access and authorisation you will have a default entry like

<!-- 
Example SAF Registry 
--> 
<safAuthorization racRouteLog="NONE" id="saf" /> 

<safRegistry id="saf" /> 
<safCredentials unauthenticatedUser="WSGUEST" profilePrefix="MQWEB" 
suppressAuthFailureMessages="false" /> 

I use <safAuthorization racRouteLog="ASIS"… to get RACF violation messages on the joblog during set up. See here.

<safCredentials suppressAuthFailureMessages="false"… prints out violation messages. See here.

Let requests in from outside z/OS

For this to work you have to edit the mqwebuser.xml file and uncomment

<variable name="httpHost" value="*"/> 
<!-- 
-->

By default it only allows requests from the same z/OS system, so browsers on other machines cannot get access.

dspmqweb/setmqweb – which instance to use?

This page  says you must use

export WLP_USER_DIR=WLP_user_directory

This is fine when you have one mqweb instance on one LPAR.  You might want a shell program to set this every time.  For example,  the program disMQPAweb.sh

export WLP_USER_DIR=/u/mqmweb/MQPA
/usr/lpp/mqm/V9R1M1/web/bin/dspmqweb "$@"

Then you can use disMQPAweb.sh in the same way as you used /usr/lpp/mqm/V9R1M1/web/bin/dspmqweb before.

If you have multiple releases of MQ in your environment you might want to point to the command in the script, so dspMQPA.sh might have

export WLP_USER_DIR=/u/mqmweb/MQPA
/usr/lpp/mqm/V9R1M1/web/bin/dspmqweb "$@"

Though it might be better to have a shell script mq911 with an optional queue manager parameter
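A wrapper of that kind might look like the sketch below. It is only an illustration: the function name mq911, the default of MQPA, and the assumption that every instance keeps its WLP user directory under /u/mqmweb/<queue manager> are taken from the examples above, and MQ_DRY_RUN is an invented switch that prints the command instead of running it.

```shell
# Hypothetical wrapper "mq911". The first argument is the queue manager
# (MQPA if omitted); remaining arguments are passed to dspmqweb.
# Assumes each instance's WLP user directory is /u/mqmweb/<queue manager>.
# MQ_DRY_RUN is an invented switch: when set, print instead of execute.
mq911() {
  qm=${1:-MQPA}
  if [ $# -gt 0 ]; then shift; fi
  WLP_USER_DIR=/u/mqmweb/$qm
  export WLP_USER_DIR
  cmd=/usr/lpp/mqm/V9R1M1/web/bin/dspmqweb
  if [ -n "${MQ_DRY_RUN:-}" ]; then
    echo "WLP_USER_DIR=$WLP_USER_DIR $cmd $*"
  else
    "$cmd" "$@"
  fi
}

# Dry run: show what would be executed for queue manager MQPB.
MQ_DRY_RUN=1
mq911 MQPB status
```

Keeping the release path in one place means that a migration only needs the cmd line to change.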

Selecting which IP stacks to use.

There is an article from IBM, which gives two ways of configuring it.  Changing the httpEndpoint, or specifying an environment variable

Customise ISPF z/OS UNIX Directory List

In the MQWEB directory are message logs and trace logs. When a file fills up, the server renames the old file to include the date and time, for example messages_20.07.29_16.49.29.0.log, and creates a new messages.log or trace.log.

If you are using ISPF 3.17 (z/OS UNIX Directory List) to work with the files, it only displays the first 15 characters of the file name, so you get lots of files with a name like “messages_20.07.” where 20 is the year, and 07 is the month.

By default the z/OS UNIX Directory List displays some unhelpful fields. You can rearrange the fields (but not make the filename field wider).
If you go to OPTIONS on the top line, and select “2. Directory List Column Arrangement… ” you can change which fields are displayed, and their order. I set the widths of all fields to 0, except for

  • Type 04
  • Modified 19 (if you specify a smaller value you only get the YYYY-MM…  not the time)
  • Size 10

The documentation says

  • Modified The date and time the file was last changed.
  • Changed The date and time the status of the file was last changed.

I do not know the difference between these two.

Controlling what is displayed

In the directory list you can use sort commands

  • sort file A
  • sort mod D

Looking at a log or trace file

If you sort by Modified D (descending), the newer files will be at the top, so you can use the “modified” column to see the time of each file, and so get the order of the files.

You can use the line command / to display the options.

You can use E to edit, or V to use edit in browse mode.

Browse displays a mess because it does not do code page conversion.
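Outside ISPF, the full file names and their order can also be seen from the USS shell with ls. The sketch below is an illustration only: it builds a scratch directory with empty stand-in files, and the second rotated file name is invented.

```shell
# Sketch only: a scratch directory stands in for the MQWEB logs directory.
logs=/tmp/mqweblogs.$$
mkdir -p "$logs"

# Stand-in files; the second rotated name is invented for illustration.
touch -t 202007291649 "$logs/messages_20.07.29_16.49.29.0.log"
touch -t 202007301012 "$logs/messages_20.07.30_10.12.00.0.log"
touch -t 202007301500 "$logs/messages.log"

# Long listing, most recently modified first, with full file names.
ls -lt "$logs"

# Name of the most recently modified file.
newest=$(ls -t "$logs" | head -n 1)
echo "newest: $newest"
```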

 


	

Liberty on z/OS: Mapping an incoming certificate to a z/OS userid for client certificate authentication – and don’t forget the cookies!

I thought I understood how this worked. I found I didn’t, then spent a few days hunting around for the problem.

The basics

You can use a digital certificate from a web browser (or curl, or other tools) to authenticate to z/OS. You need to map the certificate to a userid.

A certificate coming in can have a Distinguished Name like CN=adcdd.O=cpwebuser.C=GB (note the ‘.’ not ‘,’ between elements).

Your userid needs to be defined with SPECIAL to be able to use the RACDCERT command (SPECIAL, not just GROUP-SPECIAL).

You will need a definition like (see here for the command)

RACDCERT MAP ID(ADCDD ) - 
    SDNFILTER('CN=adcdd.O=cpwebuser.C=GB') - 
    WITHLABEL('adcdd')

or a general definition for those certificates with O=cpwebuser.C=GB, ignoring the CN part

RACDCERT MAP ID(ADCDB ) - 
   SDNFILTER('O=cpwebuser.C=GB') - 
   WITHLABEL('cpwerbusergb') 

or using the Issuing Distinguished Name (the Certificate Authority)

IDNFILTER('CN=TESTCA.OU=SSSCA.C=GB')

Using a generic

SDNFILTER('CN=a*.O=cpwebuser.C=GB')

does not work.

If you attempt to use a certificate which is not mapped you get

ICH408I USER(START1 ) GROUP(SYS1 ) NAME(COLIN)
DIGITAL CERTIFICATE IS NOT DEFINED. CERTIFICATE SERIAL NUMBER(0163)  SUBJECT(CN=adcdd.O=cpwebuser.C=GB) ISSUER(CN=SSCA8.OU=CA.O=SSS.C=GB).

It is worth defining these using JCL, because if you try to add a mapping and it already exists, you get a message saying it exists already. If you know the userid, you can list the maps associated with it. If you do not know the userid, there is no practical way of finding out – you have to log on with the certificate and display the userid from the web browser, or extract the list of all users and use LISTMAP on all of them.

Once you have set up the userid, you can connect them to the group to give them access to the EJBROLE profiles.  For example use group names

  • MQPAWCO MQPAMQWebAdminRO Console Read Only.
  • MQPAWCU MQPAMQWebUser  Console User only.  The request operates under the signed on userid authority.
  • MQPAWCA MQPAMQWebAdmin Console Admin.

These names are made up from the queue manager MQPA, the Web Console (rather than REST), and the access.

You may want to set up  userids solely for client authentication.  If the userid has NOPASSWORD, it cannot be used to logon with userid and password, and of course the lack of password means the password will not expire.

Having a set of userids just for certificate access makes it easier to manage the RACDCERT MAPping.    You have a job with

RACDCERT ID(adcd1) LISTMAP
RACDCERT ID(adcd2) LISTMAP
etc

and search the output for the certificate of interest.
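The list of commands for such a job can be generated rather than typed. This is a minimal sketch; the userids in the users variable are placeholders, and the output is just text to paste into the TSO step of your own job.

```shell
# Hypothetical list of certificate-only userids; replace with your own.
users="adcd1 adcd2 adcd3"

# Build one RACDCERT LISTMAP command per userid.
cmds=$(for u in $users; do
  printf 'RACDCERT ID(%s) LISTMAP\n' "$u"
done)

# Print the commands, ready to paste into the job.
printf '%s\n' "$cmds"
```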

It gets more complicated…

Often the user’s certificate is in the form CN=Colin Paice,o=SSS,C=GB so if you want to allow all people in the MQADMIN team access, you will need to specify them individually. It would be easier if the DN had CN=Colin Paice,OU=MQADMIN,o=SSS,C=GB, then you could filter on the OU=MQADMIN. These could map to a userid MQADM1.

It gets more complicated if someone can work with MQ, and CICS or z/OS Connect, and you have to decide on a userid – MQADM1 or CICSADM1?

Setting up a one to one mapping may be the best solution, so CN=Colin Paice,o=SSS,C=GB maps to CPAICE (or GB070594).   This userid is then added to the appropriate RACF groups to give access to the EJBROLEs, to give access to the servers.

How do I tell what is being used?

I could not get Liberty to write an audit record for the logon/matching. I tried altering the userid to have UAUDIT – but that did not work either.

If you have audit defined on the class EJBROLE profile MQWEB.com.ibm.mq.console.MQWebUser, you will get an audit record in SMF. This has many fields including

  • Date
  • Time
  • ACCESS
  • SUCCESS – or INSACC (INSufficient Access)
  • ADCDC – userid being used
  • READ – Requested access
  • READ – permitted access
  • EJBROLE – the class
  • MQWEB.com.ibm.mq.console.MQWebUser – the profile
  • CN=adcdd.O=cpwebuser.C=GB – the Distinguished Name of the certificate
  • CN=SSCA8.OU=CA.O=SSS.C=GB – the Issuers (Certificate Authority) of the certificate

From this you can see the userid being used, ADCDC, and the certificate DN CN=adcdd.O=cpwebuser.C=GB.

And to make it more complicated

I deleted the RACDCERT MAP entry, but the web browser continued to work with the user. I had a cup of tea and a cookie, and the web browser stopped working. Was this problem connected to a cup of tea and a cookie?

Setting up the initial handshake is expensive.  The system has to do a logon with the certificate to get the userid from the RACDCERT mapping.  It then checks the userid has access to the SERVER profile, then it checks to see if it is MQWebAdmin, MQWebAdminRO, or  MQWebUser.

Once it has done this it takes the userid and information, encrypts it, and creates the LTPA cookie. This is sent down to the web browser.

The next time the web browser sends some data, it also sends the cookie. The MQWEB server decrypts the cookie, checks the time stamp to make sure the information is current, and if so, uses it.  The timeline I had was

  • create the RACDCERT mapping from certificate DN to userid
  • use browser to logon to mqweb, using the certificate with the DN
  • it works, mqweb sends down the cookie
  • delete the RACDCERT mapping for the DN
  • restart the browser, logon to mqweb, using the certificate with the DN.  The cookie is passed up – the logon works
  • clear the browser’s cookies – and retry the logon.  It fails as expected.

So ensure the browser cookie is cleared if you change the mapping or EJBROLE access for the user.

Tracing Liberty logon on z/OS – is difficult

I had a few problems logging on to the MQWEB server using certificates, and found there was no documentation to help resolve problems.  The debug information you can get is often not very helpful!

As an extra twist, a userid having no access and getting a “not authorised to the resource” is OK.  For example my userid may have access to MQWebAdmin, but not to MQWebAdminRO – it may be wrong to have access to both, so you will get at least one “not authorised to the resource” violation.

I looked at

  • MQWEB traces – not enough information provided
  • RACF traces – looks wrong to me
  • RACF audit data in SMF – this is all you need

The only way of getting out the data, is to enable RACF audit for the profile, and set an option in the mqwebuser.xml file.

To make it even more difficult to resolve problems, when a request arrives at the web server, the server encrypts the information and sends down an LTPA token. The next request from the browser sends this token, and bypasses some of the initial checks, so you will not see trace entries. After the LTPA token has expired, the next request will do the full logon again.
To prevent this from happening, clear your browser’s cookies, history and cache before retesting.

MQWEB trace provides information – none of it usable.

I used the trace string

traceSpecification="*=info:zos.native.03=fine"

I also included :UserRegistry=all:Credentials=all which gave more information, not all of it useful.

This provides information like

Description: Entry: checkAuthorizationFast 
serviceResults: 00000050868103e7 
suppresMessages: 0 
logOption: 3 
requestor: 
raco_cb: 0000005082d08290 
acee: 0000000000000000 
accessLevel: 2 
applName: MQWEB 
className: EJBROLE 
entityName: MQWEB.com.ibm.mq.console.MQWebAdminRO 
...
Description: RACROUTE REQUEST=FASTAUTH return 
   returnCode: 0 
   safReturnCode: 0 
   racfReturnCode: 0 
   racfReasonCode: 0

But does not tell you which userid the request was being made for!

Sometimes it gives you full control blocks, other times truncated ones like MQWEB.com.ibm.mq, so you do not know if this is for Admin, AdminRO, or User.

MQWEB safAuthorization racRouteLog

I enabled RACF AUDIT for the MQWEB.com.ibm.mq.console.MQWebAdminRO and MQWEB.com.ibm.mq.console.MQWebAdmin.

In mqwebuser.xml you can specify the following to control whether audit messages about EJBROLE are displayed

<safAuthorization racRouteLog="NONE" id="saf"
reportAuthorizationCheckDetails="false" />

See here – which says

Specifies the types of access attempts to log.

  • ASIS Records the event in the manner specified in the profile that protects the resource, or by other methods such as the SETROPTS option.
  • NOFAIL If the authorization check fails, the attempt is not recorded. If the authorization check succeeds, the attempt is recorded as in ASIS.
  • NONE The attempt is not recorded.
  • NOSTAT The attempt is not recorded. No logging occurs and no resource statistics are updated.

With AUDIT enabled, and with racRouteLog="ASIS", I got the following “failures” (every 10 seconds) due to the web server doing auto-refresh. The checks to MQWebUser worked, and were not recorded in the joblog.

  • 15.51.54 ICH408I USER(ADCDC ) GROUP(TEST ) NAME(ADCDC) MQWEB.com.ibm.mq.console.MQWebAdmin CL(EJBROLE )  INSUFFICIENT ACCESS AUTHORITY ACCESS INTENT(READ ) ACCESS ALLOWED(NONE )
  • 15.51.54 ICH408I USER(ADCDC ) GROUP(TEST ) NAME(ADCDC ) MQWEB.com.ibm.mq.console.MQWebAdmin CL(EJBROLE ) INSUFFICIENT ACCESS AUTHORITY ACCESS INTENT(READ ) ACCESS ALLOWED(NONE )
  • 15.51.55 ICH408I USER(ADCDC ) GROUP(TEST ) NAME(ADCDC ) MQWEB.com.ibm.mq.console.MQWebAdminRO CL(EJBROLE ) INSUFFICIENT ACCESS AUTHORITY ACCESS INTENT(READ ) ACCESS ALLOWED(NONE )
  • 15.51.55 ICH408I USER(ADCDC ) GROUP(TEST ) NAME(ADCDC )MQWEB.com.ibm.mq.console.MQWebAdminRO CL(EJBROLE ) INSUFFICIENT ACCESS AUTHORITY ACCESS INTENT(READ ) ACCESS ALLOWED(NONE )
  • 15.51.55 ICH408I USER(ADCDC ) GROUP(TEST ) NAME(ADCDC )MQWEB.com.ibm.mq.console.MQWebAdmin CL(EJBROLE ) INSUFFICIENT ACCESS AUTHORITY ACCESS INTENT(READ ) ACCESS ALLOWED(NONE )
  • 15.51.55 ICH408I USER(ADCDC ) GROUP(TEST ) NAME(ADCDC )MQWEB.com.ibm.mq.console.MQWebAdminRO CL(EJBROLE ) INSUFFICIENT ACCESS AUTHORITY ACCESS INTENT(READ ) ACCESS ALLOWED(NONE )

When I changed it to racRouteLog="NONE" the messages were not produced on the joblog. There were still records produced in the SMF audit data with the profiles having AUDIT ALL(READ) specified.

I think you should usually run with racRouteLog="NONE", and change it to racRouteLog="ASIS" when you have a problem – but be careful not to generate a flood of messages.

To display SAF messages about other violations use

<safCredentials unauthenticatedUser="WSGUEST" profilePrefix="MQWEB"
suppressAuthFailureMessages="false" mapDistributedIdentities="false"/>

RACF Trace

This gives data as to what was accessed, but reports the userid of the web server, not the userid being checked – so not very useful.

I used the command

  • #set trace(RACROUTE(ALL),JOBNAME(CSQ9WEB))
  • traced to GTF with USRP(F44)
  • formatted it with the IPCS command GTF USR(ALL)

This had data like

  • Trace Type: RACFPOST – this is the “AFTER” request
  • Service number: 00000002 – this is RACROUTE 2, verify
  • RACF Return code: 00000008
  • RACF Reason code: 00000004
  • MQWEB.com.ibm.mq.console.MQWebAdmin – this profile
  • EJBROLE – this class
  • MQWEB – this application
  • ACEE (userid block) with userid=START1, Group=SYS1, Jobname=CSQ9WEB. This had a userid of START1 (from the job), but the userid being tested was ADCDC, which was not in the control blocks – so no good for telling you which userid had access or not.

So all we can tell is that for the EJBROLE profile MQWEB.com.ibm.mq.console.MQWebAdmin someone got a ‘no access’ return code.

Turn off the RACF trace using

#SET TRACE(NORACROUTE,NOJOBNAME)

RACF Auditing – this worked and gave me most of the information

I turned on RACF auditing using

  • RALTER EJBROLE MQWEB.com.ibm.mq.console.MQWebAdmin AUDIT(ALL(READ))
  • RALTER EJBROLE MQWEB.com.ibm.mq.console.MQWebAdminRO AUDIT(ALL(READ))
  • RALTER EJBROLE MQWEB.com.ibm.mq.console.MQWebUser AUDIT(ALL(READ))
  • SETROPTS RACLIST(EJBROLE)

This writes a record to SMF for ALL (failed and successful) requests which were READ or above.

I used the RACF provided exits to the SMF dump program (IFASMFDP). This produces a readable file with all of the data.

I wrote an ISPF edit macro in REXX to take the data, and extract the key fields.

Below are the records produced for

  • an SSL connection using a certificate which mapped to userid ADCDB,
  • a logon with a userid and password with userid ADCDC
RESULT  USERID  WANT ALLOWED CLASS   RESOURCE
SUCCESS START1  READ /READ   SERVER  BBG.SECPFX.MQWEB 
SUCCESS ADCDB   READ /UPDATE APPL    MQWEB 
SUCCESS ADCDB   READ /UPDATE APPL    MQWEB 
SUCCESS ADCDC   READ /READ   APPL    MQWEB 
SUCCESS WSGUEST READ /READ   APPL    MQWEB 
INSAUTH ADCDC   READ /NONE   EJBROLE MQWEB.com.ibm.mq.console.MQWebAdmin 
INSAUTH ADCDC   READ /NONE   EJBROLE MQWEB.com.ibm.mq.console.MQWebAdminRO 
SUCCESS ADCDC   READ /READ   EJBROLE MQWEB.com.ibm.mq.console.MQWebUser 
INSAUTH ADCDC   READ /NONE   EJBROLE MQWEB.com.ibm.mq.console.MQWebAdmin 
INSAUTH ADCDC   READ /NONE   EJBROLE MQWEB.com.ibm.mq.console.MQWebAdmin 
INSAUTH ADCDC   READ /NONE   EJBROLE MQWEB.com.ibm.mq.console.MQWebAdmin 
INSAUTH ADCDC   READ /NONE   EJBROLE MQWEB.com.ibm.mq.console.MQWebAdminRO 
INSAUTH ADCDC   READ /NONE   EJBROLE MQWEB.com.ibm.mq.console.MQWebAdmin 
INSAUTH ADCDC   READ /NONE   EJBROLE MQWEB.com.ibm.mq.console.MQWebAdminRO 
INSAUTH ADCDC   READ /NONE   EJBROLE MQWEB.com.ibm.mq.console.MQWebAdminRO 
SUCCESS ADCDC   READ /READ   EJBROLE MQWEB.com.ibm.mq.console.MQWebUser 
INSAUTH ADCDC   READ /NONE   EJBROLE MQWEB.com.ibm.mq.console.MQWebAdminRO 
SUCCESS ADCDC   READ /READ   EJBROLE MQWEB.com.ibm.mq.console.MQWebUser 
SUCCESS ADCDC   READ /READ   EJBROLE MQWEB.com.ibm.mq.console.MQWebUser 
SUCCESS ADCDC   READ /READ   EJBROLE MQWEB.com.ibm.mq.console.MQWebUser 
INSAUTH ADCDC   READ /NONE   EJBROLE MQWEB.com.ibm.mq.console.MQWebAdmin 
INSAUTH ADCDC   READ /NONE   EJBROLE MQWEB.com.ibm.mq.console.MQWebAdminRO 

From the RESULT column, the userid ADCDC had

  • only READ access to MQWEB.com.ibm.mq.console.MQWebUser,
  • no access to MQWEB.com.ibm.mq.console.MQWebAdmin and MQWEB.com.ibm.mq.console.MQWebAdminRO, as we can see from the INSufficientAUTHority in the RESULT column.  The resource wanted READ – but had NONE.    This is OK, as we want the MQWebUser permissions.

From SUCCESS ADCDB READ /UPDATE APPL MQWEB we can see that userid ADCDB (from the certificate) wanted READ access, but had UPDATE access to MQWEB in the class APPL.
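With this many repeated checks it can help to count the records by userid, class, resource and result. The following is a minimal sketch using awk, not the edit macro mentioned above; the sample rows are copied from the extract, and the column positions assume the same whitespace-separated layout.

```shell
# Sample rows copied from the extract above; in practice the input would be
# the file produced by your own formatting step.
records='SUCCESS START1  READ /READ   SERVER  BBG.SECPFX.MQWEB
SUCCESS ADCDB   READ /UPDATE APPL    MQWEB
INSAUTH ADCDC   READ /NONE   EJBROLE MQWEB.com.ibm.mq.console.MQWebAdmin
INSAUTH ADCDC   READ /NONE   EJBROLE MQWEB.com.ibm.mq.console.MQWebAdminRO
SUCCESS ADCDC   READ /READ   EJBROLE MQWEB.com.ibm.mq.console.MQWebUser
INSAUTH ADCDC   READ /NONE   EJBROLE MQWEB.com.ibm.mq.console.MQWebAdmin
SUCCESS ADCDC   READ /READ   EJBROLE MQWEB.com.ibm.mq.console.MQWebUser'

# Fields: $1 result, $2 userid, $5 class, $6 resource.
# Count identical (userid, class, resource, result) combinations.
summary=$(printf '%s\n' "$records" |
  awk '{ n[$2 " " $5 " " $6 " " $1]++ }
       END { for (k in n) print n[k], k }' |
  sort -k2)

printf '%s\n' "$summary"
```

Each output line is a count followed by the userid, class, resource and result, so the repeated INSAUTH records for one profile collapse into a single line.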

The code does not look very optimised; the logic looks like

  • hmm it looks like this userid does not have access to this EJBROLE, let me check again
  • and again
  • and again
  • and again
  • and again
  • and again
  • OK give up, and try another resource
  • hmm it looks like you do have access to this other resource -let me check again..
  • and again…
  • … ok  you still have access – let’s go with this.

I expect this is because a high level Java program called a class to do some work, which checked the access; it called another class which did its own checks etc. Understandable, but not efficient coding.

This all worked, I could see all of the access requests, but sadly I did not get a record saying “this certificate…. was mapped to userid ADCDB”.

You need a SAF Angel for Liberty Web Server on z/OS

I spent a week trying to get MQWEB on z/OS to work using digital certificates using RACF as a repository, and had lots of frustrations, and some successes. I found that to get security working you need an Angel process.

Some parts are well documented, other parts are not, so I’ll try to fill some of the gaps.

How does MQWEB do security?

The Liberty server has a couple of ways of doing authentication and authorisation.

  • The basic repository where you define users, passwords and their role (MQWebAdmin, MQWebAdminRO or MQWebUser) within the xml configuration file.   This provides the most basic levels of protection – for example the MQ administrator has to create and maintain the passwords.  It is easy to set up, but not very secure.
  • Using the System Authorization Facility (SAF) interface.  This is an interface which applications can use to get to the security back end.  There is a choice of back-ends, for example RACF from IBM, and Top Secret from CA.  This can be configured to be very secure, but has more administrator work to set up.

How does SAF work?

There are RACF profiles which control access to restricted facilities. If you have access to the profile you can perform the function. For example when you log on to a server, the server thread has to run as your userid, to allow your userid access to resources. You set up a profile, and explicitly give the server access to the profile permitting the server to “run on behalf of another userid”.

There is a C function on z/OS that allows my userid to check to see if your userid is authorised to a resource.    If my userid has access to the correct profile, my userid can get the information about your userid.

Some “system-like” functions assume the requester is authorised, and bypass some of the checks.  For example the RACROUTE FASTAUTH check request.   These have another level of control in that they need to be in a restricted, Supervisor (kernel) state, to be able to issue the requests.

What is the angel process?

Instead of giving Java programs access to this restricted Supervisor mode, there is an Angel address space which can execute the restricted requests, and the Java program can call the code running in the Angel address space.  (Under the covers there is a Program Call to execute the code, in the same way that MQ and DB2 requests do.)   Of course, there is a RACF profile to control which address spaces can access the Angel, and other profiles to configure what restricted functions the Angel process can execute.

If the Angel process was not running, I could not log on to MQWEB. There were messages saying my userid did not have access to MQWebAdmin, MQWebAdminRO or MQWebUser. In this case the checking is done within Liberty and is very simple and restrictive. It does not work because it looks for group names such as MQWebAdmin, but MQWebAdmin is too long for a z/OS group name, which can have up to 8 characters.

Setting up the Angel process.

  • The Angel process can be shared by all of the web servers on the LPAR.
  • In a highly available environment you need more than one Angel on an LPAR.
  • MQ ships Angel code, but it may be at a different level to other Liberty servers’ Angel code. You should run on the latest level of code, but I expect that if they are reasonably current (within 1 year) they should all work OK.
  • This gives a good overview of the Angel.
  • This also gives a good overview and instructions
  • See here for configuration instructions.
  • I think the only Angel interface used by MQWEB  is SAFCRED  (SAF Credentials) and PRODMGR, but it does no harm having access to services you do not use.
  • You cannot stop an Angel process if it has servers “connected” to it.
  • If you cancel the Angel, the Web Server stops working. It may get CEE3250C The system or user abend S0D6 R=00000027 was issued in the messages.log file, and abend.
  • For availability you may want two queue managers on an LPAR, so you need two Angel processes.
  • There is no configuration to do for an Angel, so the only reason you may want to shut down the Angel during normal running is to apply fixes.
  • You need to ensure the Angel process is started before the Web Server is started, as the Web Server only connects at start up, see below.
  • Each Angel running in an LPAR needs a unique name.  You can have a default, unnamed Angel, or give your Angel a name.

I called my Angel process ANGEL. WAS and z/OS Connect talk about their Angels with names like BBZANGL and BAQZANGL, which are hard to remember and easy to mistype.

You start an angel process (once the JCL has been configured) using a command like

  • S ANGEL
  • S ANGEL,NAME=ANGEL1
  • S ANGEL.ANGEL1,NAME=ANGEL1
  • S ANGEL.ANGEL2,NAME=ANGEL2

The first command starts the default, unnamed, Angel.

Using the S ANGEL.name command is useful if you have more than one Angel, as it means you can use the STOP name command to stop that particular instance.

Configure your web server to end if the angel is not available

You can configure your Liberty server to end if the Angel process is not running.    Without it, the Web Server would start, and requests to use it would fail, because the unauthorised interface would be used. Using this option means you find out Sunday night and not when your business starts at 0900 on Monday morning!

You configure it by editing the server jvm.options file by adding

-Dcom.ibm.ws.zos.core.angelRequired=true

If the Angel is not available, this statement prevents the Web Server from starting

You can specify the angel name using

-Dcom.ibm.ws.zos.core.angelName=MYANGEL

If you are using the default Angel name, comment this out

# -Dcom.ibm.ws.zos.core.angelName=

When your Web Server starts you should get messages in the message.log  like

CWWKB0103I: Authorized service group SAFCRED is available.

Tracing violations.

You can start an Angel using

S ANGEL,SAFLOG=YES

I gave my MQWEB userid no access to SAFCRED, and restarted MQWEB.  With SAFLOG=YES, I got the following on the joblog

ICH408I USER(START1 ) GROUP(SYS1 ) NAME(####################)
BBG.AUTHMOD.BBGZSAFM.SAFCRED CL(SERVER )
INSUFFICIENT ACCESS AUTHORITY
ACCESS INTENT(READ ) ACCESS ALLOWED(NONE )

and in the MQWEB directory/logs/message.log file

CWWKB0104I: Authorized service group SAFCRED is not available.

With SAFLOG=NO, there was no message on the joblog, but the same message in the MQWEB directory/logs/message.log file

CWWKB0104I: Authorized service group SAFCRED is not available.

From this we can see that you should always specify SAFLOG=YES.
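
A small script can make the same check after each restart.  This is only a sketch: it assumes the message texts are exactly as shown above, the function name is my own, and the PRODMGR sample line is made up for illustration.

```python
import re

# Matches the two Liberty messages quoted above:
#   CWWKB0103I: Authorized service group SAFCRED is available.
#   CWWKB0104I: Authorized service group SAFCRED is not available.
_PATTERN = re.compile(
    r"(CWWKB0103I|CWWKB0104I): Authorized service group (\S+) is (not )?available"
)


def service_group_status(lines):
    """Return {group_name: True/False} from an iterable of log lines."""
    status = {}
    for line in lines:
        match = _PATTERN.search(line)
        if match:
            # group(3) is "not " when the service group is unavailable
            status[match.group(2)] = match.group(3) is None
    return status


# Hypothetical sample lines; in practice pass the open message.log file.
sample = [
    "CWWKB0103I: Authorized service group PRODMGR is available.",
    "CWWKB0104I: Authorized service group SAFCRED is not available.",
]
print(service_group_status(sample))
```

Any group reported as False is one the Web Server userid could not use, which is the cue to go and look for the ICH408I message on the Angel joblog.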

Angel commands

You can use

F ANGEL,DISPLAY,SERVERS

This gives a message like

CWWKB0052I ACTIVE SERVER ASID 3f JOBNAME CSQ9WEB

You can use

F ANGEL,TRACE=Y

This writes a trace to //STDOUT in the Angel job.

The output is like

Trace: 2020/07/25 15:38:07.877974 t=8D5E88 key=S2 (04002008)
Description: write_to_operator_location, entry
message_p: CWWKB0057I WEBSPHERE FOR Z/OS ANGEL PROCESS ENDED NORMALLY

Which is not very helpful, as it traces what the Angel process is doing – rather than the Web Servers using it.

How to use RACF callable services from C

Trying to use the RACF callable services was like trying to find treasure with an incomplete map.  I found it hard to create a C program to query a repository for ID information.   This post is mainly about using the RACF callable services.

I was trying to understand the mapping of a digital certificate to a z/OS userid, but with little success.  I found a RACF callable service which appeared to do what I wanted – but it did not give the answers – because,  like many treasure maps, I was looking in the wrong place.

RACF has two repositories for mapping identities to userid.

  • RACDCERT MAP which was the original way of mapping names.  As far as I can tell, the only way of getting the certificate to userid mapping programmatically, is to use the certificate to logon, and then find the userid!   This is used by Liberty Web Server.
  • RACMAP MAP which is part of Enterprise wide identification.   It maps identity strings, as you may get from LDAP,  to a userid. You can use the r_usermap callable service to get this information.

It took me some time to realise that these are different entities, which explains why there was no documentation on getting Liberty to work with RACMAP to handle certificates.  I found out that RACMAP does not map certificates after I got my program working.

The r_usermap service documentation is accurate – but incomplete, so I’ll document some of the things I learned in getting this to work.

The callable service to extract the userid  from identity information is documented here.  In essence you call the assembler routine r_usermap or IRRSIM00.

Building it

When you compile it, you need to provide the stub IRRSIM00 at bind time.  I used JCL

//S1 JCLLIB ORDER=CBC.SCCNPRC 
//DOCLG EXEC PROC=EDCCBG,INFILE='ADCD.C.SOURCE(C)', 
// CPARM='OPTF(DD:COPTS)' 
//COMPILE.COPTS DD * 
LIST,SSCOMM,SOURCE,LANGLVL(EXTENDED) 
TEST 
/* 
//COMPILE.SYSIN DD * 
  Source program goes here
//BIND.CSS DD DISP=SHR,DSN=SYS1.CSSLIB 
//BIND.SYSIN DD * 
INCLUDE CSS(IRRSIM00) 
/*

You need code like

#pragma linkage(IRRSIM00, OS) 
int main(){...
...
char * workarea ; 
workarea = (char *) malloc(1024)   ; 
long ALET1= 0; 
...
long SAF_RC,RACF_RC,RACF_RS; 
...
rc=  IRRSIM00(workarea, // WORKAREA 
             &ALET1  , // ALET 
             &SAF_RC, // SAF RC 
...

Some fields are in UTF-8.

To convert from EBCDIC to UTF-8 (which looks like ASCII), I used

cd = iconv_open("UTF-8", "IBM-1047"); 
...
struct 
{ 
  short length ; // length of string following or 0 if omitted 
  char value[248]; 
} DN; 
char * sDN= "CN=COLIN.C=GB"; 
size_t llinput= strlen(sDN); 
size_t lloutput= sizeof(DN.value); 
char * pOutValue= &DN.value[0]; 
rc = iconv(cd,        // from  iconv_open
           &sDN,      // input string
           &llinput,  // length of input 
           &pOutValue,// output 
           &lloutput);// length of output 
if (rc == -1) // problem
{ 
  perror("iconv"); 
  exit(99); 
} 
// iconv reduces lloutput to the space left unused in DN.value
DN.length = sizeof(DN.value) - lloutput; // calculate true length
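
To see what ends up in the length-prefixed field, here is the same conversion sketched in Python.  Treat it as an illustration with assumptions: Python's standard library has no IBM-1047 codec, so cp037 stands in (the characters in this DN are the same in both code pages), and packing only the used bytes after a two-byte big-endian length is my simplification of the fixed 248 byte C structure.

```python
import struct

# The DN as it would be held in EBCDIC (cp037 standing in for IBM-1047).
ebcdic_dn = "CN=COLIN.C=GB".encode("cp037")

# EBCDIC -> UTF-8, the equivalent of the iconv() call above.
utf8_dn = ebcdic_dn.decode("cp037").encode("utf-8")

# Two-byte big-endian length followed by the value, mirroring the DN struct.
field = struct.pack(">H", len(utf8_dn)) + utf8_dn

print(ebcdic_dn.hex())   # EBCDIC bytes
print(utf8_dn.hex())     # UTF-8 bytes, identical to ASCII for these characters
print(field[:2].hex())   # the length prefix
```

The length stored is the number of bytes actually written, which is what the sizeof(DN.value) minus remaining-space calculation in the C code produces.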

What access do I need?

You need

permit IRR.RUSERMAP class(FACILITY) access(READ) ID(....)
SETROPTS RACLIST(facility ) REFRESH

Output

Once I had got the program to compile and bind, and got the authorisation it worked great.

It only works with the RACMAP …  command, not the RACDCERT command, obvious now I know!  To get the information from the RACDCERT MAP, you need to use initACEE.

What no one tells you about setting up your RACF groups – and how to do it for MQ.

Introduction

The RACF documentation has a lot of excellent reference materials describing the syntax of the commands, but I could not find much useful information on how to set up RACF specifically for products like MQ, CICS, Liberty etc.

It is a bit like saying programming has the following commands: load, store, branch; but failing to tell you that you can do wonderful things like draw Mandelbrot pictures using these instructions.

You need to plan your group structure before you try to implement security, as it is hard to change once it is in place.

The big picture

You can set up a hierarchy of groups so the site RACF person can set up a group called MQ, and give the MQ team manager authority to this group.

The manager can

  • define groups within it
  • connect users to the group
  • give other people authority to manage the group.

We can set up the following group structure

  • MQM
    • MQOPS – for the MQ operators
      • MQOPSR for operators who are allowed to issue only Read (display) commands
      • MQOPSW for operators who can issue all commands, display and update
    • MQADMS – for the MQ administrators
      • MQADMR – for MQ administrators who can only use display commands
      • MQADMW – for MQ administrators who can use all commands
    • MQWEB….

You should place an operator’s userid in only one group MQOPSR or MQOPSW as these are used to control access.  MQM, MQOPS, MQADMS, MQWEB are just used for administration.

You permit groups MQOPSR and MQOPSW to issue a display command, but only permit group MQOPSW to issue the SET command.

Setting up groups to make it easy to administer

A group needs an owner which administers the group.  The owner can be a userid or a group.

A group has been set up called MQM, and my manager has been made the owner of it.

My manager has connected my userid PAICE to the MQM group with group special.

CONNECT PAICE GROUP(MQM) SPECIAL

I can define a new group MQOPS for example

ADDGROUP MQOPS SUPGROUP(MQM) OWNER(MQM) DATA('MQ operators')

The SUPGROUP says it is part of the hierarchy under MQM.  I can create the group under MQM because I am authorised.  If I try to create a group with SUPGROUP(SYS1) this will fail because I am not authorised to SYS1.

The OWNER(MQM) says people in the group MQM with group special can administer this new group.

Because my userid (PAICE) has group special for MQM, I can now connect users to the new group, for example

CONNECT ADCDB GROUP(MQMD ) AUTHORITY(USE ).

I can create another group under MQMD called MQMX, and connect a userid to it.

ADDGROUP MQMX  SUPGROUP(MQMD) OWNER(MQMD) DATA('MQ Bottom group')
CONNECT ADCDE GROUP(MQMX ) AUTHORITY(USE )

My userid PAICE can administer this because of the OWNER() inheritance up to GROUP(MQM)

If I list the groups I get

LISTGRP MQM 
INFORMATION FOR GROUP MQM 
    SUPERIOR GROUP=SYS1 OWNER=IBMUSER 
    SUBGROUP(S)= MQM2 MQMD  
    USER(S)= ACCESS= ACCESS COUNT= UNIVERSAL ACCESS= 
       PAICE    JOIN        000000              NONE 
         CONNECT ATTRIBUTES=SPECIAL 

LISTGRP MQMD 
INFORMATION FOR GROUP MQMD 
    SUPERIOR GROUP=MQM OWNER=MQM 
       SUBGROUP(S)= MQMX 
    USER(S)= ACCESS= ACCESS COUNT= UNIVERSAL ACCESS=  
       ADCDB USE 000000 NONE CONNECT ATTRIBUTES=NONE
 
LISTGRP MQMX 
    INFORMATION FOR GROUP MQMX 
    SUPERIOR GROUP=MQMD OWNER=MQMD 
    NO SUBGROUPS 
    USER(S)= ACCESS= ACCESS COUNT= UNIVERSAL ACCESS= 
       ADCDE     USE        000000              NONE 
         CONNECT ATTRIBUTES=NONE

All SUPGROUP() does is to define the hierarchy, as we can see from the LISTGRP.    We can display the groups  and draw up a picture of the hierarchy.   You can use the LISTGRP command repeatedly,  or use the DSMON program (EXEC PGM=ICHDSM00) with option USEROPT RACGRP to get a picture like

 LEVEL GROUP 
1 SYS1 (IBMUSER ) 
2 | MQM (IBMUSER ) 
3 | | MQMD 
4 | | | MQMX 
3 | | MQM2 (IBMUSER )
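
If you only have LISTGRP output, the same picture can be rebuilt from the SUPERIOR GROUP values.  The sketch below is my own illustration, not what DSMON does internally: the dictionary is typed in by hand from the listings above, and the function name is hypothetical.

```python
from collections import defaultdict

# group -> superior group, taken from the LISTGRP output above.
# MQM2's superior group is taken from the DSMON picture.
superior = {"MQM": "SYS1", "MQMD": "MQM", "MQMX": "MQMD", "MQM2": "MQM"}


def group_tree(superior, root):
    """Return (level, group) pairs, depth first, with root at level 1."""
    children = defaultdict(list)
    for group, parent in superior.items():
        children[parent].append(group)

    rows = []

    def walk(group, level):
        rows.append((level, group))
        for child in children[group]:
            walk(child, level + 1)

    walk(root, 1)
    return rows


for level, group in group_tree(superior, "SYS1"):
    print(level, "| " * (level - 1) + group)
```

The walk follows SUPGROUP only.  OWNER is a separate chain, which is why it needs checking separately when you work out who can administer a group.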

Using OWNER(group) instead of OWNER(userid)

  • If you have OWNER(groupname) it is easy to administer the groups.  When someone joins or leaves the department, you add or remove the userid from groupname.  One change.
  • If you have OWNER(userid), then you have to explicitly connect the userid to each group with group special.  When there is a new person you have to add the userid to each group individually.  When someone leaves the team you have to remove the person's userid from all of the groups. This could be a lot of work.

Delegation.

You could define an operator MQOP1 and give the userid group-special for group MQOPS.   This userid (MQOP1) can be used to add or remove userids in the MQOPSR and MQOPSW groups.

Looking at the MQOPS groups we could have groups and connected userids

  • MQM with MQ security userids PAICE, BOB having group-special
    • MQOPS with the operations manager and deputy MQOP1, MQOP2 having group special
      • MQOPSR with STUDENT1, STUDENT2 who are only allowed to issue display commands
      • MQOPSW with PAICE, TONYH, CHARLIE
    • MQADMS….

and similarly for the MQ administration team.

Userid PAICE can connect userids to all groups.  MQOP1 can only connect userids to MQOPSR and MQOPSW, and cannot connect userids to the MQ administration groups.

You use groups MQOPSR and MQOPSW for accessing resources. Groups MQM and MQOPS have no authority to access a resource, they are just to make the administration easier.

You may also want to consider having a group for application development.  The group, called PAYRDEVT, is under MQM and is owned by the manager of the payroll development team.

When the annual userid validation check is done, the development manager does the checks, and tells the security department it has been done.

Permissions

There is no inheritance of permissions.  If a userid needs functions available to groups MQMD and MQMX, the user needs to be connected to both groups.

You only connect userids to groups, you cannot have groups within groups.  There may be many groups of userids which are allowed to issue an MQ display command, but only one group who can issue the SET command.

 

Suggested MQ groups

You need to consider

  • production and test environments
  • resources shared by queue managers, queue managers with the same configurations in a sysplex which can share definitions
  • queue managers as part of a Queue Sharing Group
  • queue managers that need isolation and so may have common operations groups, but different administration and programming groups.

You might define

  • Group MQPA for the queue manager super group. (MQ, Production system, A)
  • Groups for MQWEB. The Web server roles are described here.
  • Groups for controlling MQ, operations and administrations, read only or update
  • Groups for who can connect via batch, CICS etc
  • Groups for application usage, who can use which queues

Groups for MQWEB

For MQWEB the MQ documentation describes 4 roles: MQWebAdmin, MQWebAdminRO, MQWebUser, MFTWebAdmin; and there is console and REST access.

Each role should have its own group.  The requests from “Admin” and “Read Only” run with the userid of the MQWEB started task.   The requests from “User” run with the signed on user’s authority.

You might set up groups

  • MQPAWCO MQPA – MQWebAdminRO Console Read Only.
  • MQPAWCU MQPA – MQWebUser  Console User only.  The request operates under the signed on userid authority.
  • MQPAWCA MQPA – MQWebAdmin Console Admin.
  • MQPAWRO MQPA – MQWebAdminRO REST Read Only.
  • MQPAWRU MQPA – MQWebUser  REST User only.   The request operates under the signed on userid authority.
  • MQPAWRA MQPA – MQWebAdmin REST Admin Only.
  • MQPAWFA MQPA – MFTWebAdmin MFT REST Admin. 
  • MQPAWFO MQPA – MFTWebAdmin MFT REST Read Only.

I would expect most people to be in

  • MQPAWCU MQPA – MQWebUser  Console User only.  The request operates under the signed on userid authority.
  • MQPAWRU MQPA – MQWebUser  REST User only.   The request operates under the signed on userid authority.

so you can control who does what, and get reports on any violations etc.  If people use the MQWEB ADMIN you do not know who tried to issue a command.

Groups for operations

The operations team may be managing multiple queue managers, so you may need groups

  • PMQOPS for Production
    • PMQOPSR
    • PMQOPSW
  • TMQOPS for Test
    • TMQOPSR
    • TMQOPSW

If some operators are permitted to manage only a subset of the queue managers you will need a group structure that can handle this, so have a special group XMQOPS for this.

  • XMQOPS for  the special queue manager
    • XMQOPSR
    • XMQOPSW

Groups for administration.

This will be similar to operations.

Groups for end users.

This is for people running work using MQ.

Usually there are checks to make sure a userid can connect to the queue manager, using the MQCONN resource.  Some customers have a loose security set up, and rely on CICS to check whether the userid is allowed to use a CICS transaction, rather than whether the userid is allowed to access a queue.

Playing twister with Liberty and falling over

Twister is a game where you have to put your left foot here, your left hand there, your right foot here, and in trying to put your right hand over there you fall over.  This is how I felt when I was trying to understand the SSL definitions in MQWEB.  In the end I printed off the definitions and used coloured pens to mark the relevant data.

Let’s start with the easy bit.

  • When mqweb starts, it reads configuration information from a file server.xml
  • This includes two files: the “IBM” stuff in  /usr/lpp/mqm/V9R1M1/web/mq/etc/mqweb.xml and the user stuff in /u/mqweb/servers/mqweb/mqwebuser.xml .
  • SSL parameters are defined with an <ssl… id=”thisSSLConfig”   keyStoreRef=”defaultKeyStore” .. /> tag.    This points to the keystore to use.
  • The keystore has <keyStore id=”defaultKeyStore”  ….>.   There is a simple link from SSL to the keystore.   You could have multiple keystores,  if so you just change the  keyStoreRef= to point to a different one.
  • You can have more than one <ssl…/> definition.  You might have one, and there is one in the “IBM stuff”, so you need <sslDefault sslRef=”thisSSLConfig”/>  to point to the ssl statement to use.

That should all be clear, and make sense.   A bit like saying you have a right foot, a left foot and two hands.

The zos_saf_registry.xml used when you want to use the SAF interface on z/OS has some SSL definitions.  I was trying to understand them.   This one here(put a finger on it) points to that one, (put a finger on it), which points to this other one (put a finger on it), which has an end-comment.  Whoops that didn’t work.    As I said a bit like playing Twister.

<sslDefault sslRef=”mqDefaultSSLConfig”/> in the user mqwebuser.xml  points to content in the “IBM stuff”.  By the various levels of indirection this points to <keyStore id=”defaultKeyStore” location=”key.jks” type=”JKS” password=”password”/> .  This keystore has a self signed certificate provided by IBM. If you find your browser complains about using a self signed certificate, this may well be the cause.

In the zos_saf_registry.xml are commented statements

  • <keyStore id=”defaultKeyStore” location=”safkeyring://userId/keyring” …/>
  • <ssl id=”thisSSLConfig”  keyStoreRef=”defaultKeyStore” …/>
  • <sslDefault sslRef=”thisSSLConfig”/>

To me these have been defined upside down, sslDefault should come first.

As these are after the <sslDefault sslRef=”mqDefaultSSLConfig”/> statement, if you uncomment them, they will be picked up and the “IBM stuff” will not be processed.

You can uncomment these statements and use them to add your definitions.

My definitions are

<sslDefault sslRef="defaultSSLConfig"/> 
<ssl id="defaultSSLConfig" keyStoreRef="racfKeyStore" 
   sslProtocol="TLSv1.2" 
   clientAuthenticationSupported="true" 
   clientAuthentication="true" 
   serverKeyAlias="LABELMQWEBHSCEKE"/> 

<keyStore filebased="false" id="racfKeyStore" 
   location="safkeyring://START1/MQRING" 
   password="password" 
   readOnly="true" 
   type="JCERACFKS"/> 

<webAppSecurity allowFailOverToBasicAuth="false"/>

Why can’t I logoff from mqconsole?

If you are using mqweb using certificates to identify yourself, if you logoff, or close the tab, then open a new tab, you will get a session using the same certificate as before.

This little problem has been a tough one to investigate, and turns out to be lack of function in Chromium browser.

The scenario is you connect to mqweb using a digital certificate. You want to logoff and logon again with a different certificate, for example you do most of your work with a read only userid, and want to logon with a more powerful id to make a change.  You click logoff, and your screen flashes and logs you on again with the same userid as before.

At first glance this may look like a security hole, but if someone has access to your web browser, then they can click on the mqweb site, and just pick a certificate – so it is no different.

Under the covers,  the TLS handshake can pass up the previous session ID.   If the server  recognises this, then it is a short handshake instead of a full hand shake, so helping performance.

To reset the certificate if you are using Firefox

To clear your SSL session state in Firefox choose History -> Clear Recent History… and then select “Active Logins” and click “OK”. Then the next time you connect to your SSL server Firefox will prompt for which certificate to use, you may need to reset the URL.

You should check that in Firefox preferences, certificates, “Ask you every time” is selected, rather than “Select one automatically”.

Chrome does not support this reset of the certificate.

There has been discussion over the last 9 years along the lines of: seeing as Internet Explorer and Firefox have this, should we do it to meet the end user demand?

If you set up an additional browser instance, you get the same problem. With Chrome you have to close down all instances of the browser and restart chrome to be able to select a different certificate.

It looks like there is code which has a cache of url, and certificate to use.   If you open up another tab using the same IP address you will reuse the same certificate.

If you use localhost instead of 127.0.0.1, it will prompt for a certificate, and then cache it, so you can have one tab open with one certificate, and another tab, with a different URL and another certificate.

What is in a cipher suite name? or how to tell your RSA from your ephemeral

Why do we need stronger encryption?

  • To make keys more resilient to attack you need longer keys.
  • There are newer ways of providing better private keys than just using large prime numbers.   For example using the equation y**2 = a*x**3 + b*x**2 + c*x + d, which you may recognize as a cubic equation in x, but which comes under the name of Elliptic Curves (EC).  (Despite the name, the curve is not an ellipse; the name comes from elliptic integrals.)  Elliptic Curves with small keys are harder to crack than RSA with longer keys.  They also use less resources during encryption and decryption.
  • Originally public/private certificates were used for both authentication and encryption.  This has the disadvantage that if I monitor your traffic for a year, then steal your private key (for example from the corporate backups) then I can decrypt all of your traffic.  You need to use a technique called Forward Secrecy to prevent this.   This gives assurances that session keys will not be compromised even if the private key of the server is compromised.   With Forward Secrecy
    • You use the public/private key for authentication, and generate a secret for the encryption.   A technique called Diffie-Hellman (DH) can be used to agree a common secret without a man-in-the-middle being able to determine the secret key.   See Wikipedia  for a good description.   This is good – but repeated conversations may use the same common secret and, over repeated use, people may be able to guess your key.
    • This problem is fixed by using Ephemeral(E) keys, known as one-time keys, which are valid for just one conversation.  A second conversation will get a different secret key.

You need to support and use ECDHE (Elliptic Curve – Diffie-Hellman – Ephemeral)  suites in order to enable forward secrecy (having the server’s private key does not let you decrypt recorded traffic) with modern web browsers.  Avoid the RSA key exchange unless absolutely necessary.
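
The idea of an ephemeral agreed secret can be shown in a few lines.  This is a toy sketch using classic (finite field) Diffie-Hellman rather than elliptic curves, with parameters I picked for readability; they are far too small for real use and the function names are my own.

```python
import secrets

# Toy public parameters, known to everyone including an eavesdropper.
P = 2**127 - 1   # a prime modulus (a Mersenne prime)
G = 3            # public base


def new_ephemeral_key():
    """Return a fresh (private, public) pair, used for one conversation only."""
    private = secrets.randbelow(P - 3) + 2      # in the range 2 .. P-2
    return private, pow(G, private, P)


def shared_secret(my_private, their_public):
    """Combine my private value with the other side's public value."""
    return pow(their_public, my_private, P)


def handshake():
    client_priv, client_pub = new_ephemeral_key()
    server_priv, server_pub = new_ephemeral_key()
    # Only client_pub and server_pub cross the network.
    client_view = shared_secret(client_priv, server_pub)
    server_view = shared_secret(server_priv, client_pub)
    assert client_view == server_view
    return client_view


first = handshake()
second = handshake()
print(first != second)   # each conversation gets its own secret
```

Because the private values are thrown away after each handshake, stealing the server's certificate key later gives an attacker nothing to recompute these secrets from.  In a real handshake the certificate key is used to sign the exchange, which is what stops a man-in-the-middle.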

What does a cipher spec tell us?

This is a good web site which tells you what the cipher spec means.

A cipher spec describes the techniques to be used for authentication, encryption and hashing the data.  This is negotiated between the two ends when setting up a TLS handshake.  The conversation is like “Client: I support the following cipher specs; Server: I like this one…”,  or “Client: I support the following cipher specs; Server: hmm none match”

If you look at the names of cipher suites available with TLS v1.2 you find names like

  • TLS_RSA_WITH_…  this is for a key with public certificate generated with RSA
  • TLS_ECDH_RSA_WITH… this is for a key with public certificate generated with Elliptic Curve(EC) and uses Diffie-Hellman(DH)
  • TLS_ECDHE_ECDSA_WITH…  this is for a key with public certificate generated with Elliptic Curve(EC) and uses Diffie-Hellman(DH) and Ephemeral key (E)

I found this document which is a good introduction to cipher specs TLS 1.2, TLS 1.3 etc

A cipher spec which I use a lot is TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384

Let me break that down into the components

  • TLS indicates the protocol ( old versions might have SSL)
  • ECDHE : Key Exchange Algorithm.  What is used to generate a secret number. Think of a telephone conversation between the UK and China. “Should we talk in English or Chinese”.  “I prefer English”. “OK my information is …”.   In this example  we have  ECDHE:  Elliptic Curve, Diffie-Hellman, Ephemeral.   This can be
    • ECDHE
    • ECDH
    • RSA
    • DH
  • ECDSA:  Authentication/Digital Signature Algorithm: What sort of certificate can be used.
    • ECDSA: the server public key is an Elliptic Curve key.  ECDSA is Elliptic Curve + Digital Signature Algorithm.   The TLS spec says it should be signed using a CA with EC public certificate – but it works even if it is signed with an RSA certificate
    • RSA: the server public key is an RSA key.  The TLS spec says it should be signed using a CA with RSA public certificate – but it works even if it is signed with an ECDSA certificate
  • “WITH” splits the authentication and encryption from the encryption of the data itself.
  • AES_256_CBC indicates the bulk encryption algorithm: Once the handshake has completed, the encryption of the payload is done using symmetric encryption.  The keys are determined during the handshake.    AES_256 is a symmetric encryption with a 256 bit key using Cipher Block Chaining. (CBC is like using a “running total” of the data encrypted so far, as an input to the encryption).   TLSv1.3 drops support for CBC;  GCM can be used instead. It is faster and can exploit pipeline processors.
  • SHA384 indicates the algorithm for hashing the message (MAC =  Message Authentication Code)
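
The breakdown above is mechanical enough to write down as code.  This is a sketch of my own, not how any TLS library does it, and it only understands TLS 1.2 style names containing _WITH_.

```python
def parse_cipher_suite(name):
    """Split a TLS 1.2 style cipher suite name into the parts described above."""
    if "_WITH_" not in name:
        raise ValueError("not a TLS 1.2 style cipher suite name: " + name)
    handshake, _, data = name.partition("_WITH_")
    protocol, *algorithms = handshake.split("_")
    if len(algorithms) == 1:
        # e.g. TLS_RSA_WITH_...: one algorithm does key exchange and authentication
        key_exchange = authentication = algorithms[0]
    else:
        key_exchange, authentication = algorithms[0], "_".join(algorithms[1:])
    bulk, _, mac = data.rpartition("_")
    return {
        "protocol": protocol,
        "key_exchange": key_exchange,
        "authentication": authentication,
        "bulk_encryption": bulk,
        "mac": mac,
    }


for part, value in parse_cipher_suite("TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384").items():
    print(part, "=", value)
```

A name with only one algorithm before WITH, such as TLS_RSA_WITH_AES_128_CBC_SHA, comes back with RSA as both the key exchange and the authentication.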

Notes:

  • DSS is a different authentication algorithm, for example TLS_DHE_DSS…   It stands for Digital Signature Standard, which covers all algorithms – so a touch confusing.
  • Because RSA tends to be used for authentication and encryption, I think of TLS_RSA_WITH… as TLS_RSA_RSA_WITH.   So the secret number generation algorithm is RSA, and then the certificate with an RSA public key is used.

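For TLS 1.3 the cipher specs are like  TLS_AES_256_GCM_SHA384.  The name no longer includes the key exchange or authentication algorithm, because these are negotiated separately.  Certificate-based handshakes always use an ephemeral (ECDHE or DHE) key exchange, and the RSA key exchange has been removed.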

How to restrict what certificates and algorithms clients can use to connect to java web servers

As part of your regular housekeeping you want to limit connections to your web server from weak keys and algorithms.   Making changes to the TLS configuration could be dangerous, as there is no “warning mode” or statistics to tell you if weak algorithms etc are being used.  You have to make a change and be prepared to have problems.

In this posting I’ll explain how to do it, then explain some of the details behind it.

How to restrict what certificates and algorithms can be used by web servers and java programs doing TLS.

One way which does not work.

The jvm.options file provided by mqweb includes commented out

-Djdk.tls.disabledAlgorithms=… and  -Djdk.tls.disabledAlgorithms=…..

These are the wrong way of specifying the information; you do it via the java.security file, not -D… .

Create an mqweb specific private disabled algorithm file

Java uses a java.security file to define security properties.

On my Ubuntu, this file is in /usr/lib/jvm/…/jre/lib/security/java.security  .

Create a file mqweb.java.security.  It can go anywhere – you pass the name using a java system property.

Copy from the java.security file to your file the lines with jdk.tls.disabledAlgorithms=..  and jdk.certpath.disabledAlgorithms=… . 

On my system, the lines are (but your security people may have changed them – if so,  you might want to talk to them before making any changes)

jdk.tls.disabledAlgorithms=SSLv3, RC4, DES, MD5withRSA, DH keySize < 1024,     EC keySize < 224, 3DES_EDE_CBC, anon, NULL

jdk.certpath.disabledAlgorithms=MD2, MD5,    SHA1 jdkCA & usage TLSServer,    RSA keySize < 1024, DSA keySize < 1024, EC keySize < 224

The jvm.options file provided by IBM has

-Djdk.tls.disabledAlgorithms=SSLv3, TLSv1, TLSv1.1, RC4, MD5withRSA, DH keySize < 768, 3DES_EDE_CBC, DESede, EC keySize < 224, SHA1 jdkCA & usage TLSServer

So you may want to put this in your override file (without the -D), and add “, SHA1 jdkCA & usage TLSServer” to  jdk.certpath.disabledAlgorithms .

Tell mqweb to use this file

Create a java system property in the mqweb jvm.options file

“-Djava.security.properties=/home/colinpaice/eclipse-workspace-C/sslJava/bin/serverdisabled.properties”

Restart your web server.  You have not changed anything – just copied some definitions into an mqweb specific file, so it should work as before.

Limit what can be used

I set up several certificates with combination of RSA and Elliptic Curves, varying keysize, signatures;  and signed with CAs with RSA, and Elliptic Curve, and different signatures.

For example RSA4096,SHA256withECDSA,/EC256,SHA384with ECDSA means

  • RSA4096 certificate is RSA with a key size of 4096
  • SHA256withECDSA signed with this
  • /EC256 the CA has a public key of EC 256
  • SHA384with ECDSA and the CA was signed with this

I then specified different options in the server’s file, and recorded whether the TLS connection worked or not; if not – why not.

certpath: RSA keySize <= 2048

Server EC407,   SHA256withRSA,   /RSA4096,   SHA512withRSA

  • ✅RSA4096,SHA256withECDSA,   /EC256,SW=SHA384with ECDSA
  • RSA2048, SHA256withRSA,      /RSA4096,/SHA512withRSA
  • ✅ EC407,      SHA256withRSA,      /RSA4096, SHA512withRSA
  • ✅EC384,       SHA256withECDSA, /EC256,      SHA384withECDSA

certpath: RSA keySize <= 4096

Server EC407,      SHA256withRSA,      /RSA4096, SHA512withRSA

  • RSA4096,SHA256withECDSA, /EC256,       SHA384with ECDSA
  • RSA2048,SHA256withRSA,     /RSA4096,  SHA512withRSA
  • ❌EC407,      SHA256withRSA,      /RSA4096, SHA512withRSA
  • ✅EC384,       SHA256withECDSA, /EC256,      SHA384withECDSA

certpath:EC keySize <= 256

Server EC407,      SHA256withRSA,      /RSA4096, SHA512withRSA

  • ❌  RSA4096,SHA256withECDSA, /EC256,       SHA384with ECDSA
  • ✅ RSA2048,SHA256withRSA,       /RSA4096,  SHA512withRSA
  • ✅ EC407,      SHA256withRSA,      /RSA4096, SHA512withRSA
  • ❌ EC384,       SHA256withECDSA, /EC256,      SHA384withECDSA

tls:EC keySize <= 256

Server EC407,      SHA256withRSA,      /RSA4096, SHA512withRSA

  • ✅ RSA4096,SHA256withECDSA,   /EC256,       SHA384with ECDSA
  • ✅ RSA2048,SHA256withRSA,        /RSA4096,  SHA512withRSA
  • ✅ EC407,      SHA256withRSA,       /RSA4096, SHA512withRSA
  • ✅ EC384,       SHA256withECDSA, /EC256,      SHA384withECDSA

certpath: SHA256withRSA

Server EC407,      SHA256withRSA,      /RSA4096, SHA512withRSA

  • ✅ RSA4096,SHA256withECDSA, /EC256,       SHA384with ECDSA
  • ❌RSA2048,SHA256withRSA,     /RSA4096,  SHA512withRSA
  • ❌ EC407,      SHA256withRSA,    /RSA4096, SHA512withRSA
  • ✅ EC384,       SHA256withECDSA, /EC256,      SHA384withECDSA

tls:SHA256withRSA

Server EC407,      SHA256withRSA,      /RSA4096, SHA512withRSA

  • ✅ RSA4096,SHA256withECDSA, /EC256,       SHA384with ECDSA
  • ❌RSA2048,SHA256withRSA,     /RSA4096,  SHA512withRSA
  • ❌ EC407,      SHA256withRSA,    /RSA4096, SHA512withRSA
  • ✅ EC384,       SHA256withECDSA, /EC256,      SHA384withECDSA

either: RSA

Server RSA4096,SHA256withECDSA, /EC256,       SHA384with ECDSA

All requests failed due to the server’s RSA.   Only 18 out of 50 cipher suites were available.  Server reported javax.net.ssl.SSLHandshakeException: no cipher suites in common

  • ❌RSA4096,SHA256withECDSA, /EC256,       SHA384with ECDSA
  • ❌ RSA2048,SHA256withRSA,     /RSA4096,  SHA512withRSA
  • ❌ EC407,      SHA256withRSA,      /RSA4096, SHA512withRSA
  • ❌EC384,       SHA256withECDSA, /EC256,      SHA384withECDSA

certpath: RSA keySize == 4096

Server RSA4096,SHA256withECDSA, /EC256,       SHA384with ECDSA

This was a surprise as I did not think this would work!

  • RSA4096,SHA256withECDSA, /EC256,       SHA384with ECDSA
  • RSA2048,SHA256withRSA,     /RSA4096,  SHA512withRSA
  • ❌EC407,      SHA256withRSA,      /RSA4096, SHA512withRSA
  • ✅EC384,       SHA256withECDSA, /EC256,      SHA384withECDSA

Summary of overriding.

You can specify restrictions in the server’s jdk.tls.disabledAlgorithms and jdk.certpath.disabledAlgorithms. The restrictions apply to how the certificate has been signed and to the CA certificate.

You should check that the server’s certificate is not affected.
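
The certpath keySize results above fit a simple rule: the connection is rejected if the client certificate or its CA has a key matching the restriction.  The sketch below is my model of that rule, written to check it against the tables; it is not how the JVM implements the check, it handles only keySize constraints, and the function names are my own.

```python
import operator

# The comparison operators allowed in a keySize constraint.
OPS = {"<=": operator.le, "<": operator.lt, "==": operator.eq,
       ">=": operator.ge, ">": operator.gt, "!=": operator.ne}


def parse_key_constraint(text):
    """Parse e.g. 'RSA keySize <= 2048' into (algorithm, compare, size)."""
    algorithm, keyword, op, size = text.split()
    if keyword != "keySize":
        raise ValueError("only keySize constraints are modelled here")
    return algorithm, OPS[op], int(size)


def chain_allowed(constraint, chain):
    """chain is [(key_type, key_size), ...] for the certificate and its CA."""
    algorithm, compare, size = parse_key_constraint(constraint)
    return not any(
        key_type == algorithm and compare(key_size, size)
        for key_type, key_size in chain
    )


# Client certificates from the tables above, as (certificate, CA) key pairs.
ec384_ec_ca = [("EC", 384), ("EC", 256)]
ec407_rsa_ca = [("EC", 407), ("RSA", 4096)]

print(chain_allowed("RSA keySize <= 4096", ec384_ec_ca))
print(chain_allowed("RSA keySize <= 4096", ec407_rsa_ca))
```

The second chain is rejected purely because of its CA's RSA 4096 key, even though the certificate itself is Elliptic Curve.  The tls: rows in the tables do not follow this rule.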

More details and what happens under the covers

The section below may be too much information, unless you are trying to work out why something is not working.

In theory jdk.tls.disabledAlgorithms and jdk.certpath.disabledAlgorithms are used for different areas of checking – reading certificates from key files, and what is passed during the handshake – but this does not seem to be true.  I found that it was best to put restrictions on both lines.

A certificate is of type RSA, EC, or DSA.

A certificate is signed for example Signature Algorithm: SHA256withECDSA.   This comes from the CA which signed it, message digest SHA256, and the CA is an Elliptic Curve.  See How do I create a certificate with Elliptic Curve (or RSA).

Signature Algorithms: is a combination of Hash Algorithm and Signature Type.   There are 6 hash algorithms: md5, sha1, sha224, sha256, sha384, sha512, and three types:  rsa, dsa, ecdsa.    These can be combined to give 14 combinations of Signature Algorithms used in TLSv1.2

You can use java.security to control what TLS does.  On my Ubuntu machine this file is /usr/lib/jvm/java-8-oracle/jre/lib/security/java.security.

This includes

  • jdk.certpath.disabledAlgorithms: Algorithm restrictions for certification path (CertPath) processing:  In some environments, certain algorithms or key lengths may be undesirable for certification path building and validation. For example, “MD2” is generally no longer considered to be a secure hash algorithm. This section describes the mechanism for disabling algorithms based on algorithm name and/or key length. This includes algorithms used in certificates, as well as revocation information such as CRLs and signed OCSP Responses.
  • jdk.tls.disabledAlgorithms: Algorithm restrictions for Secure Socket Layer/Transport Layer Security  (SSL/TLS) processing.  In some environments, certain algorithms or key lengths may be undesirable when using SSL/TLS. This section describes the mechanism for disabling algorithms during SSL/TLS security parameters negotiation, including protocol version negotiation, cipher suites selection, peer authentication and key exchange mechanisms.

I found you get better diagnostics if you put the restrictions on both statements.
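
To see which restrictions a JVM has actually picked up, you can read the two properties back at run time. The following is a minimal sketch, not part of the original setup: the class name ShowDisabledAlgorithms and the parse helper are made up for illustration, and it assumes the values are simple comma-separated lists as in the shipped java.security file.

```java
import java.security.Security;
import java.util.ArrayList;
import java.util.List;

public class ShowDisabledAlgorithms {

    // Split a disabledAlgorithms value on commas, trimming white space.
    // Returns an empty list if the property is not set.
    public static List<String> parse(String value) {
        List<String> entries = new ArrayList<>();
        if (value == null) {
            return entries;
        }
        for (String part : value.split(",")) {
            String entry = part.trim();
            if (!entry.isEmpty()) {
                entries.add(entry);
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        String[] names = {
            "jdk.tls.disabledAlgorithms",
            "jdk.certpath.disabledAlgorithms"
        };
        for (String name : names) {
            System.out.println(name);
            // Security.getProperty returns the value the JVM is using,
            // including any override that was applied at start up.
            for (String entry : parse(Security.getProperty(name))) {
                System.out.println("    " + entry);
            }
        }
    }
}
```

Running this in the same JVM configuration as the server or client shows each restriction on its own line, which makes it easier to compare the two statements.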

The TLS Handshake (relating to java.security)

Server starts  up

  • I had 50 available cipher suites
  • Using -Djdk.tls.server.cipherSuites=…,… you can specify a comma separated list of the cipher suites you want to make available.  I recommend you do not specify this and use the defaults.
  • Using jdk.tls.disabledAlgorithms you can specify which handshake information is not allowed.  For example
    •  Any of the following would stop cipher suite TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384 from being used
      • jdk.tls.disabledAlgorithms = AES_256_CBC
      • jdk.tls.disabledAlgorithms = SHA384 – this loses 4 cipher suites: TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384, TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384 etc
      • jdk.tls.disabledAlgorithms = TLS_ECDHE_ECDSA
    • Elliptic curve names.  The elliptic_curves extension had curve names: {secp256r1, secp384r1, secp521r1, sect283k1, sect283r1, sect409k1, sect409r1, sect571k1, sect571r1, secp256k1}.  Specifying EC keySize <= 521 would only allow {sect571k1, sect571r1} to be used
  • The server builds a supported list of cipher suites, signature algorithms, and elliptic curve names
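
The effect of a restriction on the cipher suite list can be approximated with a few lines of Java. This is only a sketch: the class SuiteFilterSketch and its remaining method are hypothetical, and plain substring matching is a simplification of the matching the JDK really does on the parts of a cipher suite name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.net.ssl.SSLContext;

public class SuiteFilterSketch {

    // Keep the suites whose name contains none of the disabled tokens.
    // Substring matching is an approximation, not the JDK's own logic.
    public static List<String> remaining(List<String> suites, List<String> disabled) {
        List<String> kept = new ArrayList<>();
        for (String suite : suites) {
            boolean blocked = false;
            for (String token : disabled) {
                if (suite.contains(token)) {
                    blocked = true;
                    break;
                }
            }
            if (!blocked) {
                kept.add(suite);
            }
        }
        return kept;
    }

    public static void main(String[] args) throws Exception {
        // The suites this JVM supports; the count varies by Java level.
        String[] supported = SSLContext.getDefault()
                .getSupportedSSLParameters().getCipherSuites();
        List<String> all = Arrays.asList(supported);
        List<String> kept = remaining(all, Arrays.asList("AES_256_CBC"));
        System.out.println(all.size() + " supported, "
                + kept.size() + " left without AES_256_CBC");
    }
}
```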

Client starts up

  • As with the server, the client builds up a list of supported cipher suites, signature algorithms, and supported Elliptic curve names.
  • The client sends “ClientHello” and the list to the server.

Server processes

  • The server takes this list and iterates over it to find the first acceptable certificate (or, for wlp, if <ssl … serverKeyAlias=”…” /> is specified, the specified alias is used)
    • if the cipher suite name is like TLS_AAA_BBB_WITH…, BBB must match the server’s certificate type (RSA, Elliptic Curve, DSA)
    • the signature algorithm.   This is the algorithm for encrypting the payload, and the algorithm for calculating the hash of the payload
    • if the certificate is EC, check the elliptic curve name is valid.  A server’s certificate created with openssl ecparam -name prime256v1 would be blocked if EC keySize <= 256 was specified in the client.
  • If no certificate was found in the trust store which passed all of the checks, it throws javax.net.ssl.SSLHandshakeException: no cipher suites in common, and closes the connection
  • The server sends “ServerHello” and the server’s public key to the client.
  • The server sends the types of certificate it will accept.   This is typically RSA, DSA, and EC
  • The server sends down the Elliptic Curve names it will accept, if present  – I don’t think it is used on the java client
  • The server uses jdk.certpath.disabledAlgorithms to filter the list of Signature Algorithms, and sends this filtered list to the client.
  • The server extracts the CAs and self signed certificates from the trust store and sends them down to the client.
  • The server sends “ServerHelloDone”, saying over to you to respond.
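
The check that the BBB part of a TLS_AAA_BBB_WITH… cipher suite name matches the server’s certificate type can be sketched as below. The class SuiteAuthType and its methods are hypothetical names for illustration. The mapping assumes ECDSA needs an EC certificate, RSA an RSA certificate, and DSS a DSA certificate.

```java
public class SuiteAuthType {

    // For a name shaped like TLS_AAA_BBB_WITH_... return BBB.
    // Returns null if the name has any other shape.
    public static String authPart(String suite) {
        int with = suite.indexOf("_WITH_");
        if (!suite.startsWith("TLS_") || with < 4) {
            return null;
        }
        String[] parts = suite.substring(4, with).split("_");
        return parts.length == 2 ? parts[1] : null;
    }

    // Map the BBB part to the certificate key type it needs.
    public static String certTypeFor(String suite) {
        String auth = authPart(suite);
        if (auth == null) {
            return null;
        }
        switch (auth) {
            case "ECDSA": return "EC";
            case "RSA":   return "RSA";
            case "DSS":   return "DSA";
            default:      return null;
        }
    }

    public static void main(String[] args) {
        String suite = "TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384";
        System.out.println(suite + " needs a certificate of type "
                + certTypeFor(suite));
    }
}
```

A server with only an RSA certificate cannot use a suite for which certTypeFor returns EC, which is one way to end up with fewer usable suites than the number available.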

Client processing:

  • Checks the server’s certificate is valid, including
    • Checks the public certificate of the server’s CA chain is allowed according to the client’s jdk.certpath.disabledAlgorithms.  So jdk.certpath.disabledAlgorithms =…, SHA256withECDSA would not allow a server’s certificate with Signature algorithm: SHA256withECDSA.
  • The client takes the list of certificate types (RSA, DSA, EC), and CAs and iterates over the keystore and selects the records where
    • the certificate type is in the list
    • the signature algorithm is in the list
    • the certificate is signed by one of the CAs in the list
  • Displays this list for the end user to select from.  It looks like the most recently added certificate is first in the list.
  • The client sends “Finished” and  the selected certificate to the server
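
The way the client narrows down its keystore can be pictured as a filter with three conditions. The sketch below is a simplified model, not what the JDK or a browser actually runs: the Entry class, the aliases, and the CA names are all made up, and a real client works with X.509 certificates rather than strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ClientCertChoice {

    // Hypothetical summary of one keystore record.
    public static class Entry {
        public final String alias;
        public final String keyType;
        public final String sigAlg;
        public final String issuer;

        public Entry(String alias, String keyType, String sigAlg, String issuer) {
            this.alias = alias;
            this.keyType = keyType;
            this.sigAlg = sigAlg;
            this.issuer = issuer;
        }
    }

    // Return the aliases which pass all three checks from the server's request.
    public static List<String> candidates(List<Entry> keystore, List<String> types,
                                          List<String> sigAlgs, List<String> cas) {
        List<String> out = new ArrayList<>();
        for (Entry e : keystore) {
            if (types.contains(e.keyType)
                    && sigAlgs.contains(e.sigAlg)
                    && cas.contains(e.issuer)) {
                out.add(e.alias);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Entry> keystore = Arrays.asList(
            new Entry("rsaUser", "RSA", "SHA256withRSA", "CN=ExampleRsaCA"),
            new Entry("ecUser", "EC", "SHA256withECDSA", "CN=ExampleEcCA"));
        // The server accepts all three key types, but its filtered list
        // of signature algorithms no longer contains SHA256withECDSA.
        List<String> offered = candidates(keystore,
            Arrays.asList("RSA", "DSA", "EC"),
            Arrays.asList("SHA256withRSA", "SHA512withRSA"),
            Arrays.asList("CN=ExampleRsaCA", "CN=ExampleEcCA"));
        System.out.println(offered);
    }
}
```

In this example only rsaUser is offered, because the signature algorithm of ecUser is not in the list the server sent.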

Server processing

  • The server checks that the certificate and any embedded CA certificates from the client match.
    • Checks the signature algorithm
    • Checks the constraints, for example RSA keySize < 2048
    • Checks the certificate and CA are valid
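
A key size constraint such as RSA keySize < 2048 names the keys which are disabled, not the keys which are allowed, and this is easy to get the wrong way round. The sketch below evaluates one constraint of the form "ALG keySize OP N" against a key. The class KeySizeConstraint is hypothetical, and it handles only this one form, not the other constraint types java.security allows.

```java
public class KeySizeConstraint {

    // Returns true if a key of this algorithm and size is disabled by
    // a constraint of the form "ALG keySize OP N", e.g. "RSA keySize < 2048".
    public static boolean disables(String constraint, String keyAlg, int keySize) {
        String[] t = constraint.trim().split("\\s+");
        if (t.length != 4 || !t[1].equals("keySize")) {
            throw new IllegalArgumentException("unsupported constraint: " + constraint);
        }
        if (!t[0].equalsIgnoreCase(keyAlg)) {
            return false;   // the constraint is for a different key type
        }
        int limit = Integer.parseInt(t[3]);
        switch (t[2]) {
            case "<":  return keySize < limit;
            case "<=": return keySize <= limit;
            case "==": return keySize == limit;
            case "!=": return keySize != limit;
            case ">=": return keySize >= limit;
            case ">":  return keySize > limit;
            default:
                throw new IllegalArgumentException("unknown operator: " + t[2]);
        }
    }

    public static void main(String[] args) {
        // A 571 bit curve is the only size not disabled by EC keySize <= 521.
        System.out.println(disables("EC keySize <= 521", "EC", 256));
        System.out.println(disables("EC keySize <= 521", "EC", 571));
    }
}
```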

End of handshake.

 

This is a useful link for describing the java.security parameters.

This specification  describes the handshake, with the “ClientHello” etc.