A practical path to installing Liberty and z/OS Connect servers – 6 Enabling TLS

Introduction

I’ll cover the instructions to install z/OS Connect, but the instructions are similar for other products. The steps are to create the minimum server configuration and gradually add more function to it.

The steps below guide you through

  1. Overview
  2. Planning, to help you decide what you need to create and what options you have
  3. Initial customisation: creating a server, creating defaults, and creating function-specific configuration files, for example a file for SAF
  4. Starting the server
  5. Enabling logon security and adding SAF definitions
  6. Adding keystores for TLS, and client authentication
  7. Adding an API and service application
  8. Protecting the API and service applications
  9. Collecting monitoring data, including SMF
  10. Using the MQ sample
  11. Using WLM to classify a service

With each step there are instructions on how to check the work has been successful.

Configuring TLS

  1. You can configure the server to create a keystore file on its first use. This creates a self-signed certificate, which is good enough to encrypt the traffic. Certificates sent from the client are ignored, as the trust store does not have the Certificate Authority certificate to validate them.
  2. You can use your site's keystore and trust store. The server can use them to process certificates sent from the client for authentication.

Decide how you want to authenticate

Most of the functions require an https connection. This will require a keystore.

You can decide if

  1. The server uses the client's certificate for authentication. You can configure whether,
    1. if that does not work, userid and password are used instead, or
    2. if that does not work, the request fails; there is no fall back to userid and password.
  2. The server does not use the client's certificate.
    1. You can configure that userid and password will be used for authentication.
    2. Or there is no authentication.
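
These choices map onto a small number of configuration attributes. As an illustrative sketch, not a complete configuration (the elements are the ones discussed in the rest of this post):

```xml
<server>
    <!-- Ask for a client certificate, but allow the handshake
         to succeed without one -->
    <ssl id="defaultSSLSettings"
         keyStoreRef="defaultKeyStore"
         clientAuthenticationSupported="true"/>

    <!-- Fall back to userid and password if certificate
         authentication is not used, or fails -->
    <webAppSecurity allowFailOverToBasicAuth="true"/>
</server>
```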

Have the server create a keystore.

You can get Liberty to create a keystore for you. This creates a self-signed certificate, which is used to encrypt the traffic between client and server. This is a good start while you validate the setup, but is not a good long-term solution.

Create keystore.xml with

<server>
<keyStore id="defaultKeyStore" password="${keystore_password}" /> 

<ssl clientAuthentication="false" 
    clientAuthenticationSupported="false" 
    keyStoreRef="defaultKeyStore" 
    id="defaultSSLSettings" 
    sslProtocol="TLSv1.2" 
/> 
</server>

Add to the bottom of the server.xml file

 <include location="${server.config.dir}/keystore.xml"/>

If you have keyStore id="defaultKeyStore" (it must be defaultKeyStore) and do not have a keystore file defined, then the server will create the keystore in the default location (${server.output.dir}/resources/security/key.p12) with the password taken from the server.env file.  See here.

Restart the server.

I got the messages

CWWKO0219I: TCP Channel defaultHttpEndpoint-ssl has been started 
and is now listening for requests on host 10.1.3.10  
(IPv4: 10.1.3.10) port 9443.

This shows that TLS was active, and listening on port 9443.

If the keystore was created, you will get messages like

[AUDIT   ] CWPKI0803A: SSL certificate created in 87.578 seconds. 
SSL key file: /var/zosconnect/servers/d3/resources/security/key.p12 
[INFO    ] Successfully loaded default keystore: 
/var/zosconnect/servers/d3/resources/security/key.p12 of type: PKCS12

The certificate has a problem (a bug). It has been generated with CN:localhost, O:ibm, OU:d3, where d3 is the server name. The Subject Alternative Name (SAN) is DNS:localhost. It should have a SAN of the server’s IP address (10.1.3.10 in my case).

Clients check the SAN and compare it with the server’s IP address.
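
You can see what a certificate with a correct SAN looks like by building a throwaway one off-platform with openssl (the subject and IP address below just mirror this example; -addext needs OpenSSL 1.1.1 or later):

```shell
# Create a throwaway self-signed certificate whose SAN carries the server's IP
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -keyout key.pem -out cert.pem \
  -subj "/CN=10.1.3.10/O=ibm/OU=d3" \
  -addext "subjectAltName=IP:10.1.3.10"

# Display the SAN - this is what clients compare against the host they connected to
openssl x509 -in cert.pem -noout -ext subjectAltName
# shows: IP Address:10.1.3.10
```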

  1. Chrome complains “Your connection is not private NET::ERR_CERT_AUTHORITY_INVALID”, and gives the option to accept it.
  2. Firefox gives “Warning: Potential Security Risk Ahead”, and the option to accept it.
  3. z/OS Explorer gives a “Server certificate alert” pop-up, saying “Host:10.1.3.10 does not match certificate:localhost”, and gives two buttons, Decline or Accept.
  4. With curl I got SSL_ERROR_SYSCALL.

You can accept it, and use it until you have your own keystores set up. You can also reset this decision.

Using a RACF keyring as the keystore.

You can use a file-based keystore or a RACF keyring.  Below are the definitions for my RACF keyrings. The started task userid is START1. The keystore (containing the private key for the server) is keyring START1/KEY. The server should use the key with label ZZZZ.

The trust store, containing the Certificate Authority certificates and any self-signed certificates from clients, is START1/TRUST.
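
As a hedged sketch, keyrings like these could be built with RACDCERT commands along the following lines (the certificate labels ZZZZ and SSCA8 are taken from this example; substitute your own, and check the RACDCERT documentation for your setup):

```
RACDCERT ID(START1) ADDRING(KEY)
RACDCERT ID(START1) CONNECT(ID(START1) LABEL('ZZZZ') RING(KEY) +
   DEFAULT USAGE(PERSONAL))

RACDCERT ID(START1) ADDRING(TRUST)
RACDCERT ID(START1) CONNECT(CERTAUTH LABEL('SSCA8') RING(TRUST) +
   USAGE(CERTAUTH))

SETROPTS RACLIST(DIGTRING) REFRESH
```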

The <ssl ... /> element points to the different keystores, so it makes sense to keep all these definitions in one file.  You may already have a file of these definitions, from another Liberty server, which you can use.

<server>

<sslDefault sslRef="defaultSSLSettings"/> 
<ssl clientAuthentication="true" 
    clientAuthenticationSupported="true" 
    id="defaultSSLSettings" keyStoreRef="racfKeyStore"  
    serverKeyAlias="ZZZZ" 
    sslProtocol="TLSv1.2" 
    trustStoreRef="racfTrustStore"/> 
                                                                                                                  
  <keyStore fileBased="false" id="racfKeyStore" 
     location="safkeyring://START1/KEY" 
     password="password" 
     readOnly="true" 
     type="JCERACFKS"/> 

  <keyStore fileBased="false" id="racfTrustStore" 
     location="safkeyring://START1/TRUST" 
     password="password" 
     readOnly="true" 
     type="JCERACFKS"/>
</server>

This sets clientAuthentication="true" and clientAuthenticationSupported="true".

Specify if you want to use a client certificate for authentication

If you specify clientAuthenticationSupported="true", the server requests that a client sends a certificate. However, if the client does not have a certificate, or the certificate is not trusted by the server, the handshake might still succeed.

The default keystore will not be able to validate any certificates sent from the client. When connecting from Chrome with certificates set up, I got an FFDC and messages

  • [INFO ] FFDC1015I: An FFDC Incident has been created: “java.security.cert.CertPathBuilderException: PKIXCertPathBuilderImpl could not build a valid CertPath.; internal cause is: java.security.cert.CertPathValidatorException: The certificate issued by CN=SSCA8, OU=CA, O=SSS, C=GB is not trusted; internal cause is: java.security.cert.CertPathValidatorException:
  • [ERROR ] CWWKO0801E: Unable to initialize SSL connection. Unauthorized access was denied or security settings have expired.

If you specify clientAuthentication="false" (the default) the server does not request that a client send a certificate during the handshake.

If the client certificate connection is not used, or it fails,

  1. if you specify <webAppSecurity allowFailOverToBasicAuth="true" /> the user will be prompted for userid and password
  2. if you specify <webAppSecurity allowFailOverToBasicAuth="false" />, or do not specify it, the connection will fail.

If a userid and password can be used, the first time a browser uses the server it will be prompted for userid and password. As part of the handshake, an LTPA2 cookie is sent from the server, with the user’s identity encrypted within it. If you close down the browser and restart it (not just restart it from within the browser) you will be prompted again for userid and password. You can also be prompted for userid and password once the LTPA cookie has expired.

If you are using z/OS Explorer and get a code 401, unauthorised, you may be using a certificate credential (format userid@CertificateAuthority(CommonName)) rather than a userid and password credential with a format of just the userid, e.g. COLIN. Use “Set Credentials” to change credentials.

You can see what userid is being used for the requests, from the …/logs/http_access.log file.

To make it even more complex you can have different keystores for different connections or ports.  See here. But I would not try that just yet.

Map client certificates to a SAF userid

If you are using certificate authentication you will need to map the certificate to a userid using the RACDCERT MAP command.
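
As a hedged sketch (the userid, label, and filter values are illustrative; the SDNFILTER must match the subject distinguished name of your client certificates):

```
RACDCERT ID(ADCDC) MAP WITHLABEL('clientmap') +
   SDNFILTER('OU=TEST.O=SSS.C=GB')
SETROPTS RACLIST(DIGTNMAP) REFRESH
```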

Testing it

If the server starts successfully you can use a web browser with URL

  https://10.1.3.10:9443/zosConnect/api-docs

and it should display JSON data.

If you get “Context Root Not Found” or code 404 you should wait and retry, as the https processing code is active, but the code to process the requests is not yet active.

Review the contents of …/servers/…/logs/http_access.log to see the request being issued and the http completion code.

If you have problems connecting clients over TLS add -Djavax.net.debug=ssl:handshake to the jvm.options file and restart the server.

If you connect to the z/OS Explorer, and logon to the z/OS Connect EE Server, you should have a folder for APIs and Services – which may have no elements.

A practical path to installing Liberty and z/OS Connect servers – 7 adding apis and services


Adding APIs and Services

For services you have to include a feature. For example with MQ you need

 <feature>zosconnect:mqService-1.0</feature>

These are listed here. You need the CICS service to be able to use CICS applications, etc.

You also need to copy the API application (*.aar) files into the apis directory, and the service (*.sar) files into the services directory.

There are several ways of doing this.

You can do this with the USS cp command. The commands below copy the MQ sample; you can use others.

Find what applications and services are available

find /usr/lpp/IBM/zosconnect/v3r0beta/runtime/ -name "*.aar"
find /usr/lpp/IBM/zosconnect/v3r0beta/runtime/ -name "*.sar"

Copy the files to the server’s directories:

cp /usr/lpp/IBM/zosconnect/v3r0beta/runtime/templates/servers\
/sampleMqStockManager/resources/zosconnect/apis/* \
/var/zosc*/ser*/default/re*/z*/ap*

cp /usr/lpp/IBM/zosconnect/v3r0beta/runtime/templates/servers\
/sampleMqStockManager/resources/zosconnect/ser*/* \
/var/zosc*/ser*/default/re*/z*/se*

Alternatively, use a REST request, or the z/OS Explorer.

Once you have done this either restart the server or use the operator command

f ...zcon,refresh

and the applications and services should be available.

If you have configured the server to print information messages (see earlier posts), you should see the following in messages.log or on STDOUT:

BAQR7000I: z/OS Connect EE API archive file stockmanager installed successfully.
BAQR7043I: z/OS Connect EE service archive stockQuery installed successfully.

Using a web browser, or curl with the URL https://10.1.3.10:9443/zosConnect/services gave me

{"zosConnectServices":
  [
    {"ServiceName":"stockQuery",
     "ServiceDescription":"A stock query service based on IBM MQ.",
     "ServiceProvider":"IBM MQ for z/OS",
     "ServiceURL": "https://10.1.3.10:9443/zosConnect/services/stockQuery"
      }
  ]
}

URL https://10.1.3.10:9443/zosConnect/apis gave me

{"apis":    
  [
    {"name":"stockmanager","version":"1.0.0","description":"",
     "adminUrl":"https://10.1.3.10:9443/zosConnect/apis/stockmanager"
     }
   ]
}

You can invoke the service with https://10.1.3.10:9443/zosConnect/services/stockQuery and get details of the stockQuery service.

Once it all works…

I played with the curl interface to deploy a service.

  • I copied the stockQuery.sar file down to Linux in binary.
  • I could use jar -tvf stockQuery.sar to display the contents.
  • I used curl --insecure -v --header Content-Type:application/zip -i --cacert cacert.pem --cert adcdd.pem:password --key adcdd.key.pem -X POST --data-binary @/home/zPDT/stock2.sar https://10.1.3.10:9443/zosConnect/services
    • Note: if I used --data-binary “@/home/zPDT/stock2.sar” in quotes I got
      • HTTP/1.1 415 Unsupported Media Type
      • “errorMessage” : “BAQR0418W: An unsupported media type of application/x-www-form-urlencoded was specified.”}
    • because the string, not the file, was sent up.  I used the --verbose option to see the number of bytes sent.
    • I also got these messages if the file was not found; it sends the string instead!

A practical path to installing Liberty and z/OS Connect servers – 8 protecting APIs and services


Protect the service and APIs

z/OS Connect provides interceptors to allow the product function to be extended. These are like exits in other program products.

z/OS Connect provides (at least) two interceptors:

  1. For authorisation checks, to see if a userid is allowed to perform an operation.
  2. For creating SMF records.

You can also write your own interceptors, for example for data validation, or for collecting statistics.

You can configure APIs and services to have a list of interceptors. One service can have authorisation and SMF records, another service can have just authorisation.

You create a list like

<!-- Interceptor list configuration -->
<!-- this refers to the configuration elements following -->
<zosconnect_zosConnectInterceptors 
   id="interceptorList1" 
   interceptorRef="auditInterceptor,zosConnectAuthorizationInterceptor"
/>

<!-- Audit interceptor configuration -->
<zosconnect_auditInterceptor 
   id="auditInterceptor" 
   sequence="1"    
   apiProviderSmfVersion="2"
/>
<!-- Authorisation checking --> 
<zosconnect_authorizationInterceptor 
    id="zosConnectAuthorizationInterceptor"
/> 

To protect the server, and control the global roles, you need to use the following, where you provide lists of group names such as SYS1.

 <zosconnect_zosConnectManager 
     globalInterceptorsRef="interceptorList1" 
     globalAdminGroup="SYS1,SYSADMIN" 
     globalInvokeGroup="SYS1" 
     globalOperationsGroup="SYS1" 
     globalReaderGroup="SYS1" 
       /> 
<!-- "interceptorList1" above points to …  -->
 <zosconnect_zosConnectInterceptors 
     id="interceptorList1" 
     interceptorRef="IR1,..."/>

<!--  zosConnectAuthorizationInterceptor is defined    -->
 <zosconnect_authorizationInterceptor 
     id="IR1"/>

This shows the global security definitions. The globalInterceptorsRef="interceptorList1" points to the <zosconnect_zosConnectInterceptors ...> element, which in turn points to the <zosconnect_authorizationInterceptor ...> element. There is a program, or interceptor, zosConnectAuthorizationInterceptor, which does the actual checking of userid and roles.

With this set of definitions, when I tried to query the service using an unauthorised userid, I got

{"errorMessage":"BAQR0435W: The zosConnectAuthorization interceptor 
  encountered an error while processing a request. ",
"errorDetails":"BAQR0409W: User ADCDC is not authorized to 
  perform the request."}

I changed the definitions to globalReaderGroup="TEST", refreshed the configuration, and the request worked.

You can make API security more specific

 <zosconnect_zosConnectAPIs> 
   <zosConnectAPI name="stockmanager" 
     adminGroup="SYS1" 
     invokeGroup="TEST" 
     operationsGroup="TEST" 
    readerGroup="SYS1" 
    /> 
 </zosconnect_zosConnectAPIs>

and make the service security more specific.

<zosconnect_services> 
   <service name="stockQuery" 
     serviceDescription="stockQueryServiceDescriptionColin" 
     id="stockQueryService" 
     adminGroup="SYS1,TEST2" 
     invokeGroup="TEST2" 
     operationsGroup="SYS1" 
    readerGroup="SYS1,TEST2" 
    /> 
</zosconnect_services> 

If you use the Swagger UI to try it, and get JSON data like

response body no content
response code 0
response header { “error”: no response from server}

This is what the Swagger UI displays when a request fails due to a security issue, such as an untrusted self-signed certificate, an invalid certificate, or a bad userid and password.

A practical path to installing Liberty and z/OS Connect servers – 9 collecting monitoring data


Monitoring data

You can collect SMF data and/or Http audit data to get reports on the performance and usage of your system.

There are some queries you can issue via a REST request, but you do not get much data back.

http_access.log

If you have configured <httpEndpoint ... accessLoggingRef=...> you can collect audit information for traffic through that endpoint.  If you have more than one httpEndpoint, for example with a different port, you can collect different information, or log it to a different file.

The information you can log (see here for a full description) includes

  • the client IP address
  • the userid
  • date time
  • the service requested
  • the response code
  • bytes sent and received
  • the response time (in seconds, milliseconds, or microseconds)

You can include delimiters (for example quotes around a string, or !.. !)  in the output to simplify post processing.

If you have high throughput, this solution may not scale, and SMF may be a better solution.
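
As an illustrative sketch, an endpoint with access logging configured might look like the following (the file path and logFormat directives are examples, not a recommendation; check the Liberty documentation for the full list of directives):

```xml
<httpEndpoint id="defaultHttpEndpoint" host="*" httpsPort="9443"
    accessLoggingRef="accessLog"/>

<httpAccessLogging id="accessLog"
    filePath="${server.output.dir}/logs/http_access.log"
    logFormat='%h %u %t "%r" %s %b'/>
```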

Collecting SMF records

You can collect SMF 120 records from the Liberty base, and SMF 123 records from z/OS Connect.

To collect SMF 120 records you need to add

<featureManager> 
    <feature>zosRequestLogging-1.0</feature> 
</featureManager>

to your configuration.

SMF 123 records are produced by another interceptor (exit). You need to define it, and add it to the global, API, or service list of interceptors.

Configure the auditInterceptor

<zosconnect_auditInterceptor id="auditInterceptor" 
   sequence="1" 
   apiProviderSmfVersion="2"/>

and add it to the list of interceptors

<zosconnect_zosConnectInterceptors 
    id="interceptorList1" 
    interceptorRef="zosConnectAuthorizationInterceptor,auditInterceptor"
/>

For both record types, the server started task userid needs READ access to the BPX.SMF profile in the FACILITY class.

PERMIT BPX.SMF CLASS(FACILITY) ACCESS(READ) ID(USERID)
setropts raclist(facility) refresh

If the server does not have this permission it will get an FFDC with

Stack Dump = java.io.IOException: Failed to write SMF record, __smf_record errno/errno2 return code 139

Processing SMF 120 records

You can download the SMF Browser for WebSphere Application Server for z/OS from here. This is a Java “formatter” which provides just a dump of the records, and so is not very usable.

I wrote a formatter to summarise the key information (and ignore the irrelevant stuff).  I’ll put this up on GitHub when I’ve got it documented.

Some of the interesting data is

  • Request start and stop times, for example 2020/09/26 16:35:42.977709, from which you can calculate the request duration
  • CPU for the request
  • The userid
  • The URI, /zosConnect/services/stockQuery
  • TCP/IP origin and port, 10.1.1.1 (33546), into the server port (9443)
  • Sysplex, LPAR, server name, server job number, level

I took the data and accumulated it, so I could see which requests used all of the CPU, and report it by hour and userid.
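
The accumulation itself is simple once the records are formatted into text. A sketch, assuming a hypothetical CSV layout of timestamp, userid, and CPU microseconds (this is not the real formatter output):

```shell
# Hypothetical CSV output from an SMF 120 formatter: timestamp,userid,cpu-microseconds
cat > smf120.csv <<'EOF'
2020/09/26 16:35:42.977709,ADCDC,1200
2020/09/26 16:40:10.100000,ADCDC,800
2020/09/26 17:01:00.000000,COLIN,500
EOF

# Accumulate CPU by hour of day and userid
awk -F, '{split($1, t, " ");              # t[1]=date, t[2]=time
          hour = substr(t[2], 1, 2);      # hour of day
          cpu[hour "," $2] += $3}         # sum CPU per hour,userid
         END {for (k in cpu) print k, cpu[k]}' smf120.csv | sort
# prints: 16,ADCDC 2000
#         17,COLIN 500
```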

Processing SMF 123 records

z/OS Connect provides a sample C program, and JCL to compile it.   See here.

The SMF 123 records are written when the z/OS Connect server shuts down, or when the SMF buffer is full, so there is a risk that data from today is not produced until tomorrow, because there was no activity overnight.

I typically got about 20 services/APIs per SMF record.

Combining the records

I could not see how to correlate the SMF 123 and the SMF 120 records.   This would be useful to get the CPU used by each API or service.

REST request

This page describes how to get REST statistics.  For example

https://10.1.3.10:9443/zosConnect/operations/getStatistics

This returned

{"zosConnectStatistics":
  [
   {"stockQuery":
     {
       "ServiceProvider":"IBM MQ for z\/OS",
       "InvokeRequestCount":21,
       "TimeOfRegistrationWithZosConnect":
       "2020-10-01 14:52:26:049 BST",
       "ServiceStatistics":{}
    }
   }
  ]
}

With nothing in the ServiceStatistics{}.

You can ask for a specific service https://10.1.3.10:9443/zosConnect/operations/getStatistics?service=stockQuery.  You get the same data back as above.

I could not find how to get information on APIs.

You can get real-time statistics data; see here.

I had

<zosconnect_zosConnectManager
    globalInterceptorsRef="interceptorList1"
    globalAdminGroup="TEST"
    globalInvokeGroup="SYS1"
    globalOperationsGroup="TEST"
    globalReaderGroup="TEST"
/>
<zosconnect_authorizationInterceptor id="zosConnectAuthorizationInterceptor"/>
<zosconnect_auditInterceptor id="zoscauditInterceptor" sequence="1" apiProviderSmfVersion="2"/>
<auditInterceptor id="auditInterceptor" sequence="1"/>
<zosconnect_zosConnectInterceptors
    id="interceptorList1"
    interceptorRef="zosConnectAuthorizationInterceptor,auditInterceptor,zoscauditInterceptor"/>

A practical path to installing Liberty and z/OS Connect servers – 2 Planning


Planning

Summary checklist

  1. Allocate HTTPS and HTTP ports
  2. Decide how many Started Task procedures you need – and what to call them.
  3. Decide where to install the product
  4. Where to put the server’s home directory – and how much space to allocate
  5. What Angel task will be used – do you need to create a task or use an existing one
  6. Security
    1. Can you share the profile prefix or do you need to allocate a new one
    2. Do you need to set up new EJBROLE profiles
    3. Decide what groups can access the EJBROLE profile
    4. Decide what groups can access the global roles
    5. Decide what groups can have API and Service specific roles
  7. What SMF data do you want to collect
  8. Do you want to use WLM to classify the priority that URLs get?

TCPIP Port

Most of the work with Liberty is done with an HTTPS port. However most sites allocate an HTTP and an HTTPS port.  The default ports, http:9081 and https:9443, may already be in use by another Liberty instance.

You can see if a port is in use by using the command

tso netstat allconn tcp tcpip ( port 9081

If the port is in use, it will report the job name.

Customising the JCL

There will be updates to the SYS1.PROCLIB concatenation, and some security definitions to be done. If you have the authority, you can make these changes yourself. If not, you will need to do some planning, and request the changes.

Where does the executable code go?

Products are usually installed under the /usr/lpp file path.

If you intend to have only one version of the product installed at a time, you can create a directory /usr/lpp/IBM/zosconnect/v3r0 and mount the product file system over this directory.

If you plan to use more than one version in parallel, you can create /usr/lpp/IBM/zosconnect/v3r0beta and mount the beta file system over it.

I found it convenient to define an alias /usr/zosc to /usr/lpp/IBM/zosconnect/v3r0beta/bin. By changing the alias I could easily switch between versions, and had less typing!

How many JCL procedures do I need to create?

There are two ways of defining multiple servers.

  1. You can have one JCL procedure and pass the server name as a parameter.

S BAQSTART,Parms='server1'
S BAQSTART,Parms='server2'

Note: if you use the z/OS command STOP BAQSTART then both servers will stop.

If you use the same JCL procedure for different servers you can use

S BAQSTART,Parms='server1',jobname=ZERVER1
S BAQSTART,Parms='server2',jobname=ZERVER2

and use the stop command P ZERVER1 to stop just the first one.

You can use WLM to classify ZERVER1 and ZERVER2 and give them different service classes.

  2. You can use a different JCL procedure for each server.

S BAQSTRT1,parms='server'
S BAQSTRT2,parms='server'

You can also issue S BAQSTRT1,parms='server',jobname=ZERVER1

I can see no major advantage either way.  Having one started task JCL per server means more JCL to support but you can upgrade the servers one at a time.

You could also set up the procedure so you use

S BAQSTRT1,parms='server',WLP='/u/zosc'

Server file system.

Each server has a “home” directory. This contains

  1. server configuration files – the server only reads these files.
  2. a log directory where the server writes log files, trace files, and FFDC failure events.

You may want each server to have its own file system, so if it produces a lot of output and fills up the file system, it does not impact other servers using the same file system.

You might start with one file system shared by many servers, and move to dedicated file systems before going into production.

The default file system in the z/OS Connect documentation is /var/zosconnect; this cannot be shared across LPARs. You might want to create and use /u/zosc as a shared file system, and use /u/zosc/server1 etc. The Liberty shared directory would be /u/zosc/shared.

Before you decide where to put your server’s files, you need to think about what your environment could be in a year’s time.

If you want to have more than one server using a shared configuration, you can include files into the server.xml file. Shared files could be keystore definitions, or security definitions, and these need to be on a shared file system.

Some file systems are specific to an LPAR and not shared (/var, /etc, /tmp, /dev); other file systems can be shared across the sysplex.

Include common configuration into the server.xml file

When you include configuration files (in server.xml)  the syntax is like

<include location="/u/zosc/servers/stockManager/mq.xml"/>
<include location="${shared.config.dir}/security.xml"/> 
<include location="${server.config.dir}/saf.xml"/> 
<include location="${COLIN}/servers/d2/jms.xml"/>
<variable name="colin2" value="/ZZZ/zosconnect/"/>
<include location="${colin2}/servers/d3/jms.xml"/>  

Where you can

  • give the explicit file path name,
  • use the Liberty property ${server.config.dir}, which points to the server’s directory,
  • use the Liberty property ${shared.config.dir}, which points to a shared directory within the server’s environment,
  • use an environment variable COLIN defined as
    • //STDENV DD *
    • COLIN=COLINJCL
  • create and use your own property, colin2,
  • or combinations of these.

If you get the location wrong, it is easy to change, and to move the configuration files to a new directory.

As you move changes from test through to production, you may want to use the same server.xml and included files.  If so, you could set an environment variable in the JCL whose value depends on the LPAR.

How much disk space is needed?

The configuration files do not need much disk space. If you use the trace capability, the trace files can be large, and there can be many of them, but you can control the number and size of the logs and traces. FFDCs are also stored in the file system; these can also be large, and you may get a lot of them. zFS can automatically expand the file system, and your automation can respond to the zFS console message notifying you that the file system is filling up.

If the JVM abends, you can get SDUMPs taken. On my machine they were taken with the HLQ of the started task userid (START1).

Angel task needed

You need an ANGEL task to support authorised services. You can have only one unnamed Angel per LPAR. You need to decide if your server can use this, or if your server needs its own, named Angel.

You should use the Angel at the latest service level. If servers share an Angel, and the Angel is running back level, you will get a message informing you.

You configure the Liberty instance to point to a named Angel.

Planning for security.

Liberty requires a RACF APPL profile prefix to be set up. The default profile prefix is BBQZDFLT. This name is used as a prefix to the RACF profile which allows users to access Liberty, for example in the EJBROLE class

BBQZDFLT.zos.connect.access.roles.zosConnectAccess

To provide isolation, and security you may want to use a different profile prefix for different groups of servers. For example you may want to isolate MQWEB from z/OS Connect, and from WebSphere Application Servers.

In summary, there are three levels of security:

  1. A userid needs access to the EJBROLE profile (above) to get access to the z/OS Connect instance.
  2. There is Global access, with four predefined roles. You specify a list of groups and Liberty checks to see if the userid is a member of the groups. This is not a SAF check. This checking is done in an interceptor (exit) which you specify.
  3. You can specify security at the API or service level. This checking is done in an interceptor (exit) which you specify.

You will need to set up an EJBROLE profile and permit groups of users to connect to the server.
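As a sketch, the RACF definitions could look like the following (the group name ZCONUSER is hypothetical):

```
RDEFINE EJBROLE BBQZDFLT.zos.connect.access.roles.zosConnectAccess UACC(NONE)
PERMIT BBQZDFLT.zos.connect.access.roles.zosConnectAccess CLASS(EJBROLE) ID(ZCONUSER) ACCESS(READ)
SETROPTS RACLIST(EJBROLE) REFRESH
```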

Once a user has access to the server, there is another layer of security, with these categories:

  • globalAdminGroup – Identifies the users that are able to use administrative functions on all APIs, services, service endpoints and API requesters.
  • globalOperationsGroup – Identifies the users that are able to perform operations such as starting, stopping or obtaining the status of all APIs, services, service endpoints and API requesters.
  • globalInvokeGroup – Identifies the users that are able to invoke all APIs, services, service endpoints and API requesters.
  • globalReaderGroup – Identifies the users that are able to get lists of, or information about, all APIs, services, service endpoints and API requesters, including Swagger documentation.
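In server.xml these global groups are attributes of the z/OS Connect manager element. A hedged sketch (the group names are hypothetical; check the element name against your level of the z/OS Connect documentation):

```xml
<zosconnect_zosConnectManager
    globalAdminGroup="ZCONADM"
    globalOperationsGroup="ZCONOPER"
    globalInvokeGroup="ZCONINV"
    globalReaderGroup="ZCONREAD"/>
```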

You can refine the security for the APIs, services, and service endpoints, using attributes such as

<zosconnect_services…

  • adminGroup
  • operationsGroup
  • invokeGroup
  • readerGroup

To be able to operate a service or API, you need to be in both the globalOperationsGroup list and that service's operationsGroup list.

If you have different applications within a server, you need to be careful how you set up the security profiles. If someone is authorised through the global* groups to operate service A, and you add service B, then by default that person will be allowed to operate service B. You need to define the zosconnect_services element for service B, and specify its operationsGroup, to restrict access to service B.
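For example, a sketch of a per-service definition (the id and group names are hypothetical; the four group attributes are those listed above):

```xml
<zosconnect_services id="serviceB"
    adminGroup="SVCBADM"
    operationsGroup="SVCBOPER"
    invokeGroup="SVCBINV"
    readerGroup="SVCBREAD"/>
```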

Because of this, you need to consider whether you need a separate profile prefix per group of servers, to give application isolation from a security perspective.

During this planning stage you need to plan the profile prefix you will be using, the groups of users for the different roles, and whether you want to use both global and API/service level authorisation checks.

If you change the groups in the configuration, you can activate the change using the

f ….zcon,refresh

operator command.

Unauthenticated user

When Liberty uses SAF to authenticate, it requires an unauthenticated user, which is usually WSGUEST. This userid can be used by all Liberty instances.
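The unauthenticated user is specified, along with the profile prefix, on the safCredentials element in server.xml, for example (a sketch – check the attribute names against your Liberty documentation):

```xml
<safCredentials profilePrefix="BBQZDFLT" unauthenticatedUser="WSGUEST"/>
```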

Liberty does most of its work over an https connection. If you specify particular options, the server can set up a default keystore. This is fine while you are setting up, but not for the long term, as it does not validate certificates sent from clients.

You will need to set up a keystore to provide the server with a private certificate, and a trust store which contains the Certificate Authority certificates and any client self signed certificates. The keystores and trust stores can be shared by all servers.
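A sketch of a keystore and trust store configuration using SAF keyrings (the keyring names are hypothetical; as I understand it, for JCERACFKS keystores the password is the literal string "password"):

```xml
<keyStore id="serverKeyStore" location="safkeyring:///Keyring.LIBERTY"
          type="JCERACFKS" password="password" fileBased="false"
          readOnly="true"/>
<keyStore id="serverTrustStore" location="safkeyring:///Trust.LIBERTY"
          type="JCERACFKS" password="password" fileBased="false"
          readOnly="true"/>
<ssl id="defaultSSLConfig" keyStoreRef="serverKeyStore"
     trustStoreRef="serverTrustStore"/>
```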

You can have different keystores depending on the IP address or port. See https://www.ibm.com/support/knowledgecenter/SSEQTP_liberty/com.ibm.websphere.wlp.doc/ae/rwlp_ssl_outbound_filter.html. I suggest you do not do this until you have basic TLS working.

SMF

Liberty can produce SMF 120 records. There are no good freely available tools to report on usage.

z/OS Connect can produce SMF record type 123. You will need to collect it. Some samples are provided to print the data. There are no good tools to report on usage.

Classifying requests using WLM

You can classify requests to give priority to particular services. See here. You do not need to decide on the classification until the server is operational and the services are available. Essentially you configure a service as a transaction class, then use WLM to classify the transaction class within the server.

<httpClassification transactionClass="TCIC" method="GET" 
resource="/catalogManager/items"/>
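As I understand the Liberty z/OS documentation, the httpClassification element sits inside a wlmClassification element in server.xml (a sketch – check against your level of the documentation):

```xml
<wlmClassification>
  <httpClassification transactionClass="TCIC" method="GET"
                      resource="/catalogManager/items"/>
</wlmClassification>
```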

 

A practical path to installing Liberty and z/OS Connect servers – 1 Overview

The instructions I have seen for installing products based on Liberty seem to be written as if there would only be one server: one server on the LPAR, and one server in the whole sysplex. In reality you are likely to have the “same” server running on multiple LPARs sharing configuration to provide availability, and more than one server running on an LPAR, for example MQWEB, WAS, z/OSMF, and z/OS Connect. The series of blog posts below is to help you implement multiple servers across a sysplex.

Some of the areas not adequately addressed by the IBM product documentation include

  1. Sharing of definitions
  2. Sharing of keystore and trust stores
  3. Providing isolation, to prevent someone who has access to MQWEB from accessing z/OS Connect.
  4. How many Angel tasks do I need – can one be shared?
  5. Some areas such as TLS can be hard to get working.

I’ll cover the instructions to install z/OS Connect, but the instructions are similar for other products. The steps are to create the minimum server configuration and gradually add more function to it.

The steps below guide you through

  1. Overview
  2. planning to help you decide what you need to create, and what options you have to choose
  3. initial customisation and creating a server,  creating defaults and creating function specific configuration files,  for example a file for SAF
  4. starting the server
  5. enable logon security and add SAF definitions
  6. add keystores for TLS, and client authentication
  7. adding an API and service application
  8. protecting the API and service applications
  9. collecting monitoring data including SMF
  10. use the MQ sample
  11. using WLM to classify a service

With each step there are instructions on how to check the work has been successful.

I wrote the blog post “How many servers do I need? Every one know this – or no one knows this” when I was first thinking about planning my servers.

Question: What time is it in year 2k42? Answer: time to be retired

Do you remember the Y2K problem where the date rolled into 2000?
I had to fly to the US on Jan 1st 2000, so I could be on site in case there were problems with a large bank running on the mainframe. I have two memories:

  • the vending machines had a message like Out of Cheese Error. Redo From Start. and would not vend.
  • someone had been taken to hospital with gunshot wounds, because people celebrated the new millennium by firing their guns up into the air, and what goes up, must come down, and if you are in a crowded street…
There is another Y2K-type problem coming: it is when the System/390 clock wraps. The clock is an 8 byte field. When I was writing statistics and accounting code for MQ on z/OS, you timed an event by issuing the STCK instruction before something, issuing STCK again afterwards, and calculating the difference.
To solve this problem there is the STCK extended (STCKE) instruction, which gives 16 bytes – essentially there is one byte in front of the existing STCK value, and some spare space at the end. So problem solved? Not quite.
There are many control blocks with a field for the 8 byte STCK value. If this is changed to a 16 byte STCKE field then the offsets of all the following fields will change. This is OK for a small program, but not for the operating system, where fields are fixed “for architecture reasons” to allow people to rely on the location of these fields.
Many products depend on a STCK to create a unique identifier, and given two STCK values you can tell which was created first – even across IPLs. Changing this to use a STCKE will cause a migration and coexistence problem.
Some SMF records have a STCK to say when an event occurred; the report processing may need logic to say if the value is small, then allow for the 2042 wrap.
I had a routine which formatted a STCK into YY/MM/DD hh:mm:ss.tttttt. This will no longer work, as the significant part of a STCKE is 9 bytes long.
Do you need to worry about this? Not really – IBM will fix the operating system and products, and vendors will fix their products. It comes down to your programs, and most people do not use STCK values. If you do use STCK, I suggest you locate all references to STCK and put the operations in macros. Then when you have to change the code you change the macros, recompile the programs, and with a bit of magic and a good wind you’ll have no problems – just make sure you feed the flying pigs first.
The alternative is to retire and let someone else worry about it.

How do I format a STCK from a C program?

I’ve been writing a program which processes SMF data, which has STCK values for the dates of events, and STCK values for durations.
In assembler there is a STCKCONV macro which takes a STCK (or STCKE) and converts it to a printable date and time, for example 2020/09/21 09:19:18.719641.

I wrote some code (at the bottom) to call the assembler routine to do the work. It did not work for 64 bit programs.

I had some inspiration in the middle of the night for a much simpler way of doing it.

Quick digression. There are 8 byte STCK values, and 16 byte STCKE values which have an extended timestamp to handle the 2042 problem, when a STCK will overflow. A STCKE has an extra “overflow” byte at the front, followed by the STCK value, so the significant time is in the first 9 bytes.

 

Simple way just using C – should work in 31 and 64 bit.

There are C routines for processing times. For example gmtime(time) takes a Unix time and returns a structure with the year, month, etc.

The unix time is the time in seconds since 00:00:00 January 1 1970.

So to use the C routines, take a STCK(E), convert it to seconds, and subtract the number of seconds from the STCK epoch (January 1 1900) to midnight January 1 1970.

The logic takes an 8 byte value, shifts it right by 12 bits to get the microseconds into the bottom bits, calculates the number of seconds, and returns the result in a timespec.

#include <time.h>                      /* struct timespec             */

typedef unsigned long long * ull;

/* Convert an 8 byte STCK value to a Unix timespec                    */
void STCKTM(char * pData, struct timespec * pTimespec) {
    unsigned long long stck = *(ull) pData;
    stck = stck / 4096;                /* shift right 12 bits so the  */
                                       /* bottom bit is 1 microsecond */
    long microseconds = stck % 1000000;    /* save the microseconds   */
    stck = stck / 1000000;             /* microseconds to seconds     */
    stck = stck - 2208988800ULL;       /* seconds from 1900 to 1970   */
    pTimespec->tv_sec  = stck;
    pTimespec->tv_nsec = microseconds * 1000;
}

You call this with

struct tm * tm2; 
struct timespec ts; 
STCKETM((char *) headtimeZCentry, &ts); 
tm2 = gmtime(&ts.tv_sec); 
printf("GMTIME yy:%d mm:%d dd:%d h:%d m:%d s:%d\n", 
    1900 + tm2->tm_year, 
    1 + tm2->tm_mon, 
    tm2->tm_mday, 
    tm2->tm_hour, 
    tm2->tm_min, 
    tm2->tm_sec); 

This produces

GMTIME yy:2020 mm:9 dd:21 h:9 m:19 s:18 

For STCKE to tm the logic is nearly identical. The value (the first 8 of the 9 significant bytes) only needs to be shifted right by 4 bits to align the microseconds to the bottom bit.

/* Convert the first 8 bytes of a 16 byte STCKE value to a timespec   */
void STCKETM(char * pData, struct timespec * pTimespec) {
    unsigned long long stck = *(ull) pData;
    stck = stck / 16;                  /* 4096 for STCK; 16 for STCKE */
                                       /* as it is already shifted by */
                                       /* definition                  */
    long microseconds = stck % 1000000;
    stck = stck / 1000000;             /* microseconds to seconds     */
    stck = stck - 2208988800ULL;       /* seconds from 1900 to 1970   */
    pTimespec->tv_sec  = stck;
    pTimespec->tv_nsec = microseconds * 1000;
}

 

The hard way, using the assembler STCKCONV macro.

I could find no C function to do the same conversion. I used to have some C code (about 300 lines) which did the tedious calculation of converting from microseconds to days, allowing for leap years etc. Instead of recreating this, I’ve written a bit of glue code which allows you to invoke the STCKCONV macro from C.

It works with non-XPLINK AMODE 31 C programs. I failed the challenge of getting it to work with XPLINK, and with 64 bit C programs (which have the extra challenge that parameters are passed in as 64 bit pointers).

In your C program you have

#pragma linkage(STCKEDT,OS)

rc = STCKEDT( stckvalue ,length, output);


For a STCK value specify STCKEDT(stck,8,output).

For a STCKE value specify STCKEDT(stcke,16,output);

The output is a 27 character string with a trailing null.

The return code is either from the STCKCONV macro, or 20 if the length is invalid.

The code is below

STCKEDT CSECT 
STCKEDT AMODE 31 
STCKEDT RMODE ANY 
******** 
* R1-> A(STCK) 
*   -> length of STCK 8 or 16 
*   -> Return buffer 
STCKEDT2 EDCPRLG DSALEN=DLEN The name appears in CEE traceback
         LA   15,20          preset the return code - invalid parms 
         USING DSA,13 
         L    2,0(,1)         address of input 
         L    5,4(,1)         a(length of STCK) 
         L    5,0(5)          the length 
         L    6,8(,1)         return area 
         CFI  5,8             Is the passed length of STCK 8? 
         BNE  TRYSTCKE 
         STCKCONV  STCKVAL=(2),                                        x 
               CONVVAL=BUFFER,                                         x 
               TIMETYPE=DEC,  hhmmsst....                              x 
               DATETYPE=YYYYMMDD 
         BNZ  GOBACK 
         B    COMMON 
TRYSTCKE DS   0H 
         CFI  5,16            is length 16? 
         BNE  GOBACK          r15 has been set already to error 
         STCKCONV  STCKEVAL=(2),                                       x 
               CONVVAL=BUFFER,                                         x 
               TIMETYPE=DEC,  hhmmsst....                              x 
               DATETYPE=YYYYMMDD 
         BNZ  GOBACK 
         B    COMMON 
COMMON   DS   0H 
*  the macro produced time, date, so rearrange it to date time 
         MVC  DT(4),BUFFER+8   Move the date 
         MVC  DT+4(8),BUFFER+0   Move the time 
* put the ED mask in the output field 
         MVC  DATETIME,DTMASK 
* and convert it from packed numbers to readable string 
         ED   DATETIME,DT 
* returned date time string is 26 + 1 for trailing null 
         MVC  0(27,6),DATETIME+1   +1 because of leading pad char 
         SR   15,15              reset the return code
GOBACK   DS    0H 
         EDCEPIL 
&DATEMASK  SETC '4021202020612120612121'  _dddd/dd/ddd
&TIMEMASK  SETC '4021207a21207a21204b21202020202040' _dd:dd:dd.dddddd_
DTMASK   DC   X'&DATEMASK.&TIMEMASK.00'  Add trailing null for C 
* Work area 
DSA      EDCDSAD 
BUFFER   DS    4F     Time.time ..date .. work d 
DT       DS    3F     Date, time,time 
DATETIME DS   CL28    Leading blank, date time, null 
DLEN     EQU  *-DSA 
         END 

I assembled it with

//S1 EXEC PGM=ASMA90,PARM='DECK,NOOBJECT,LIST(133),XREF(SHORT),GOFF', 
//             REGION=4M 
//SYSLIB   DD DSN=SYS1.MACLIB,DISP=SHR 
//         DD DISP=SHR,DSN=CEE.SCEEMAC 
//SYSUT1   DD UNIT=SYSDA,SPACE=(CYL,(1,1)) 
//SYSPUNCH DD DISP=SHR,DSN=COLIN.OBJLIB(STCKEDT) 
//SYSPRINT DD SYSOUT=* 
//SYSIN    DD * 
...

/*

and included it in my C program JCL as

//BIND.OBJ DD DISP=SHR,DSN=COLIN.OBJLIB
//BIND.SYSIN DD *
  INCLUDE OBJ(STCKEDT)
  NAME COLIN(R)
//*

Why isn’t my MQ RACF command working?

I was trying to define an MQ queue using %CSQ9 DEFINE QL(AA) and was getting

ICH408I USER(IBMUSER ) GROUP(SYS1 ) NAME( ) 
CSQ9.QUEUE.AA CL(MQADMIN ) 
PROFILE NOT FOUND - REQUIRED FOR AUTHORITY CHECKING 
ACCESS INTENT(ALTER ) ACCESS ALLOWED(NONE ) 

But the profile existed!
The command TSO RLIST MQADMIN CSQ9.QUEUE.AA showed me the profile which would be used:

CLASS NAME
----- ----
MQADMIN CSQ9.QUEUE.* (G)

It did not look like the class was being cached

SETR RACLIST CLASSES =  APPL CBIND CDT CONSOLE CSFKEYS CSFSERV DASDVOL     
                        DIGTCERT DIGTCRIT DIGTNMAP DIGTRING DSNR EJBROLE   
                        FACILITY IDIDMAP LOGSTRM OPERCMDS PTKTDATA PTKTVAL 
                        RDATALIB SDSF SERVAUTH SERVER STARTED SURROGAT     
                        TSOAUTH TSOPROC UNIXPRIV WBEM XCSFKEY XFACILIT     
                        ZMFAPLA ZMFCLOUD 

But I missed the

GLOBAL=YES RACLIST ONLY =  MQADMIN MQNLIST MQPROC MQQUEUE MXTOPIC      

I used the

TSO SETROPTS RACLIST(MQADMIN) REFRESH

and the define command worked. Another face-palming moment.

Lesson learned – the MQADMIN class was RACLISTed, so RACF was using an in-storage copy of the profiles which did not include my new profile. If in doubt, use the SETROPTS RACLIST(MQADMIN) REFRESH command.

 

Looking for an MQ reason code in Liberty? Get your safari helmet and anti-malarial tablets, and follow me to find the treasure.

I was using an MQ application in Liberty, and rather than do things the easy way, I did what I normally do, and did it the hard way. On my z/OS system I did not have the queue manager defined, because I wanted to see what happened. I was not expecting the expedition.

You configure MQ in Liberty using configuration like

<jmsConnectionFactory jndiName="jms/cf1" connectionManagerRef="ConMgr1"> 
  <properties.wmqJms transportType="BINDINGS" queueManager="MQPA"/> 
</jmsConnectionFactory>

 

I was expecting a message like the following in the job output.

Application COLINAPP MQCONN call to MQPA failed with compcode 
'2' ('MQCC_FAILED') reason '2058' ('MQRC_Q_MGR_NAME_ERROR').

Oh no, it was not that easy.  It was quite a trek into the jungle to find the information.

In the Liberty server’s logs directory there is a message.log file.  In this file I had

9/14/20 19:16:32:242 GMT 00000060 com.ibm.ws.logging.internal.impl.IncidentImpl I FFDC1015I: An FFDC Incident has been created: "com.ibm.mq.connector.DetailedResourceException: MQJCA1011: Failed to allocate a JMS connection., error code: MQJCA1011 An internal error caused an attempt to allocate a connection to fail. See the linked exception for  details of the failure. com.ibm.ejs.j2c.poolmanager.FreePool.createManagedConnectionWithMCWrapper 199" at 
ffdc_20.09.14_19.16.28.0.log

This was one long line, and I had to scroll sideways (just like you did) to see the content (or use the ISPF line prefix command “tf” to flow the text to the display width).  A key hint was the message MQJCA1011 An internal error caused an attempt to allocate a connection to fail  so I knew I was on the right trail.  I now knew the name of the file – ffdc_20.09.14_19.16.28.0.log.

Knowing the name of the file did not help very much, because when you use ISPF 3.17 (z/OS UNIX Directory List) it showed a list of 40 files with the name ffdc_20.09.14_1 (ffdc_yy.mm.dd_h), as it only displays the first part of the name. Thanks to Steve Porter who said ..

To increase the column size in 3.17:
Options
1. Directory List Options…
Width of filename column . . . . . . . . 15 (Default value – increase as necessary)

 

The file has a name ffdc_20.09.14_19.16.28.0.log and a displayed time stamp of 2020/09/14 18:16:32, which is close enough – allowing for the time zone difference and the time taken to write the file. I was fortunate not to be running a workload producing many of these files.

I edited the file – and I could see the full file name at the top of the page, so I knew I was in the right file.

The file has long lines, so I had to scroll or use the “tf” line command to reformat it.

Near the top it had

Stack Dump = com.ibm.mq.connector.DetailedResourceException: 
MQJCA1011: Failed to allocate a JMS connection., error code:  
MQJCA1011 An internal error caused an attempt to allocate a connection to fail. 
See the linked exception for details of the  failure.

Further down it had

Caused by: com.ibm.msg.client.jms.DetailedJMSException: 
JMSWMQ0018: Failed to connect to queue manager 'MQPA' with connection 
mode 'Bindings' and host name 'localhost(1414)'.

and further further down (line 50) I found the treasure

Caused by: com.ibm.mq.MQException: JMSCMQ0001: IBM MQ call failed with 
compcode '2' ('MQCC_FAILED') reason '2058'  ('MQRC_Q_MGR_NAME_ERROR').

What a trek to find the information I needed!

Next time I’ll just list the logs/ffdc directory, edit (not browse) each file, and search for “compcode”. You cannot use “grep compcode” from USS because the file is in UTF-8, so grep does not find the (EBCDIC) string. You can just use oedit file_name in USS.

It would be nice if the MQ code could be enhanced to have an option “makeErrorsHardToFind” which you could set to “no”, and still keep the default “yes”.