A practical path to installing Liberty and z/OS Connect servers – 9 collecting monitoring data

Introduction

I’ll cover the instructions to install z/OS Connect, but the instructions are similar for other products. The steps are to create the minimum server configuration and gradually add more function to it.

The steps below guide you through

  1. Overview
  2. planning to help you decide what you need to create, and what options you have to choose
  3. initial customisation and creating a server,  creating defaults and creating function specific configuration files,  for example a file for SAF
  4. starting the server
  5. enable logon security and add SAF definitions
  6. add keystores for TLS, and client authentication
  7. adding an API and service application
  8. protecting the API and service applications
  9. collecting monitoring data including SMF
  10. use the MQ sample
  11. using WLM to classify a service

With each step there are instructions on how to check the work has been successful.

Monitoring data

You can collect SMF data and/or HTTP audit data to get reports on the performance and usage of your system.

There are some statistics you can obtain via a REST query, but you do not get much data back.

http_access.log

If you have configured <httpEndpoint   ..  accessLoggingRef=…>  you can collect audit information for traffic through that endpoint.  If you have more than one httpEndpoint, for example with a different port, you can collect different information, or log it to a different file.

The information you can log (see here for a full description) includes

  • the client IP address
  • the userid
  • date time
  • the service requested
  • the response code
  • bytes sent and received
  • the response time (in seconds, milliseconds, or microseconds)

You can include delimiters (for example quotes around a string, or !.. !) in the output to simplify post processing.
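For example, a minimal sketch along the lines of the configuration used later in this series (the quotes and !..! in the logFormat are just delimiters, so a post-processing script can split the fields reliably):

<httpEndpoint id="defaultHttpEndpoint"
    host="*" httpPort="9080" httpsPort="9443">
  <accessLogging enabled="true"
     logFormat='"%h" !%u! %t "%r" %s %b %D'/>
</httpEndpoint>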

If you have high throughput, this solution may not scale, and SMF may be a better solution.

Collecting SMF records

You can collect SMF 120 records from the Liberty base, and SMF 123 records from z/OS Connect.

To collect SMF 120 records you need to add

<featureManager> 
    <feature>zosRequestLogging-1.0</feature> 
</featureManager>

to your configuration.

SMF 123 records are produced by another interceptor (exit). You need to define it, and add it to the global, API, or service list of interceptors.

Configure the auditInterceptor

<zosconnect_auditInterceptor id="auditInterceptor" 
   sequence="1" 
   apiProviderSmfVersion="2"/>

and add it to the list of interceptors

<zosconnect_zosConnectInterceptors 
    id="interceptorList1" 
    interceptorRef="zosConnectAuthorizationInterceptor,auditInterceptor"
/>

For both record types, the server started task userid needs READ access to the BPX.SMF profile in the FACILITY class.

PERMIT BPX.SMF CLASS(FACILITY) ACCESS(READ) ID(USERID)
setropts raclist(facility) refresh
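If the BPX.SMF profile has not been defined on your system, it needs creating first; a minimal sketch, where USERID stands for your server's started task userid:

RDEFINE FACILITY BPX.SMF UACC(NONE)
PERMIT BPX.SMF CLASS(FACILITY) ACCESS(READ) ID(USERID)
SETROPTS RACLIST(FACILITY) REFRESH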

If the server does not have this permission it will get an FFDC with

Stack Dump = java.io.IOException: Failed to write SMF record, __smf_record errno/errno2 return code 139

Processing SMF 120. You can download the SMF Browser for WebSphere Application Server for z/OS from here.

There is a Java “formatter” here, which provides just a dump of the records, and so is not very usable.

I wrote a formatter to summarise the key information (and ignore the irrelevant stuff). I'll put this up on GitHub when I've got it documented.

Some of the interesting data is

  • Request start and stop time, for example 2020/09/26 16:35:42.977709, from which you can calculate the request duration
  • CPU for the request
  • The userid
  • The URI /zosConnect/services/stockQuery
  • TCP/IP origin address and port, for example 10.1.1.1 (33546), into the server port (9443)
  • Sysplex, LPAR, server name, server job number, level

I took the data and accumulated it, so I could see which requests used all of the CPU, and report it by hour, and userid.

Processing SMF 123.

z/OS Connect provides a sample C program, and JCL to compile it.   See here.

The SMF 123 records are written when the z/OS Connect server shuts down, or when the SMF buffer is full, so there is a risk that data from today is not produced until tomorrow, because there was no activity overnight.

I typically got about 20 services/APIs per SMF record.

Combining the records

I could not see how to correlate the SMF 123 and the SMF 120 records.   This would be useful to get the CPU used by each API or service.

Rest request

This page describes how to get REST statistics.  For example

https://10.1.3.10:9443/zosConnect/operations/getStatistics

This returned

{"zosConnectStatistics":
  [
   {"stockQuery":
     {
       "ServiceProvider":"IBM MQ for z\/OS",
       "InvokeRequestCount":21,
       "TimeOfRegistrationWithZosConnect":
       "2020-10-01 14:52:26:049 BST",
       "ServiceStatistics":{}
    }
   }
  ]
}

With nothing in the ServiceStatistics{}.

You can ask for a specific service https://10.1.3.10:9443/zosConnect/operations/getStatistics?service=stockQuery.  You get the same data back as above.
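You can issue these queries from a web browser, or from the command line; a hedged curl sketch (the certificate file names are hypothetical, and depend on how you have configured TLS and client authentication):

curl --cacert ca.pem --cert client.pem --key client.key \
  "https://10.1.3.10:9443/zosConnect/operations/getStatistics?service=stockQuery"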

I could not find how to get information on APIs.

You can get real-time statistics data; see here.

I had

<zosconnect_zosConnectManager
    globalInterceptorsRef="interceptorList1"
    globalAdminGroup="TEST"
    globalInvokeGroup="SYS1"
    globalOperationsGroup="TEST"
    globalReaderGroup="TEST"
/>
<zosconnect_authorizationInterceptor id="zosConnectAuthorizationInterceptor"/>
<zosconnect_auditInterceptor id="zoscauditInterceptor" sequence="1" apiProviderSmfVersion="2"/>
<auditInterceptor id="auditInterceptor" sequence="1"/>
<zosconnect_zosConnectInterceptors
    id="interceptorList1"
    interceptorRef="zosConnectAuthorizationInterceptor,auditInterceptor,zoscauditInterceptor"/>

A practical path to installing Liberty and z/OS Connect servers – 2 Planning

Introduction

I’ll cover the instructions to install z/OS Connect, but the instructions are similar for other products. The steps are to create the minimum server configuration and gradually add more function to it.

The steps below guide you through

  1. Overview
  2. planning to help you decide what you need to create, and what options you have to choose
  3. initial customisation and creating a server,  creating defaults and creating function specific configuration files,  for example a file for SAF
  4. starting the server
  5. enable logon security and add SAF definitions
  6. add keystores for TLS, and client authentication
  7. adding an API and service application
  8. protecting the API and service applications
  9. collecting monitoring data including SMF
  10. use the MQ sample
  11. using WLM to classify a service

With each step there are instructions on how to check the work has been successful.

Planning

Summary checklist

  1. Allocate HTTPS and HTTP ports
  2. Decide how many Started Task procedures you need – and what to call them.
  3. Decide where to install the product
  4. Where to put the server’s home directory – and how much space to allocate
  5. What Angel task will be used – do you need to create a task or use an existing one
  6. Security
    1. Can you share the profile prefix or do you need to allocate a new one
    2. Do you need to set up new EJBROLE profiles
    3. Decide what groups can access the EJBROLE profiles
    4. Decide what groups can access the global roles
    5. Decide what groups can have API and Service specific roles
  7. What SMF data do you want to collect
  8. Do you want to use WLM to classify the priority that URLs get?

TCPIP Port

Most of the work with Liberty is done with an HTTPS port. However, most sites allocate both an HTTP and an HTTPS port. The default ports, http 9080 and https 9443, may already be in use by another Liberty instance.

You can see if a port is in use by using the command

tso netstat allconn tcp tcpip ( port 9080

If the port is in use, it will report the job name.

Customising the JCL

There will be updates to the SYS1.PROCLIB concatenation, and some security definitions to be done. If you have the authority, you can make these changes yourself. If not, you will need to do some planning, and request the changes.

Where does the executable code go?

Products are usually installed in the /usr/lpp file path.

If you intend to have only one version of the product installed at a time, you can create a directory /usr/lpp/IBM/zosconnect/v3r0 and mount the product file system over this directory.

If you plan to use more than one version in parallel, you can create /usr/lpp/IBM/zosconnect/v3r0beta and mount the beta file system over it.

I found it convenient to define an alias /usr/zosc to /usr/lpp/IBM/zosconnect/v3r0beta/bin. By changing the alias I could easily switch between versions, and had less typing!
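For example, a minimal sketch (your directory names will differ, and you need authority to update /usr):

ln -s /usr/lpp/IBM/zosconnect/v3r0beta/bin /usr/zosc

To switch versions, repoint the symbolic link:

rm /usr/zosc
ln -s /usr/lpp/IBM/zosconnect/v3r0/bin /usr/zosc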

How many JCL procedures do I need to create?

There are two ways of defining multiple servers.

  1. You have one JCL procedure and pass the server name as a parameter.
S BAQSTART,Parms='server1'
S BAQSTART,Parms='server2'

Note: If you use the z/OS command STOP BAQSTART then both servers will stop.

If you use the same JCL procedure for different servers you can use

S BAQSTART,Parms='server1',jobname=ZERVER1
S BAQSTART,Parms='server2',jobname=ZERVER2

and use the stop command P ZERVER1 to stop just the first one.

You can use WLM to classify ZERVER1 and ZERVER2 and give them different service classes.

  2. You can use a different JCL procedure for each server.
S BAQSTRT1,parms='server'
S BAQSTRT2,parms='server'

You can also issue S BAQSTRT1,parms='server',jobname=ZERVER1

I can see no major advantage either way.  Having one started task JCL per server means more JCL to support but you can upgrade the servers one at a time.

You could also set up the procedure so you use

S BAQSTRT1,parms='server',WLP='/u/zosc'
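A minimal sketch of such a procedure (this is not the shipped procedure; the program, paths, and symbol names are assumptions):

//BAQSTRT1 PROC PARMS='defaultServer',WLP='/var/zosconnect'
//* run the Liberty server command in the foreground,
//* passing the configuration directory via STDENV
//SERVER   EXEC PGM=BPXBATSL,REGION=0M,
//  PARM='PGM /usr/lpp/IBM/zosconnect/v3r0/bin/server run &PARMS'
//STDENV   DD *,SYMBOLS=JCLONLY
WLP_USER_DIR=&WLP.
/*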

Server file system.

Each server has a “home” directory. This contains

  1. server configuration files – the server only reads these files.
  2. a log directory where the server writes log files, trace files, and FFDC failure events.

You may want each server to have its own file system, so if it produces a lot of output and fills up the file system, it does not impact other servers using the same file system.

You might start with one file system shared by many servers, and move to dedicated file systems before going into production.

The default file system in the z/OS Connect documentation is /var/zosconnect; this cannot be shared across LPARs. You might want to create and use /u/zosc as a shared file system, and use /u/zosc/server1 etc. The Liberty shared directory would be /u/zosc/shared.

Before you decide where to put your server's files you need to think about what your environment could be in a year's time.

If you want to have more than one server using a shared configuration, you can include files into the server.xml file. Shared files could be keystore definitions, or security definitions, and these need to be on a shared file system.

Some file systems are specific to an LPAR and not shared (/var, /etc, /tmp, /dev); other file systems can be shared across the SYSPLEX.

Include common configuration into the server.xml file

When you include configuration files (in server.xml)  the syntax is like

<include location="/u/zosc/servers/stockManager/mq.xml"/>
<include location="${shared.config.dir}/security.xml"/> 
<include location="${server.config.dir}/saf.xml"/> 
<include location="${COLIN}/servers/d2/jms.xml"/>
<variable name="colin2" value="/ZZZ/zosconnect/"/>
<include location="${colin2}/servers/d3/jms.xml"/>  

Where you can

  • give the explicit file path name,
  • use the Liberty property ${server.config.dir}, which resolves to the server's configuration directory,
  • use the Liberty property ${shared.config.dir}, which points to a shared directory within the server's environment.
  • Use an environment variable COLIN defined as
    • //STDENV DD *
    • COLIN=COLINJCL
  • Create and use your own property – colin2 in the example above
  • or combinations of these.

If you get the location wrong, it is easy to change, and to move the configuration files to a new directory.

As you move changes from test through to production you may want to use the same server.xml and included files.  If so, you could set an environment variable in the JCL whose value depends on the LPAR.
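A hedged sketch using the &SYSNAME system symbol (the variable name CONFIG_LPAR is hypothetical):

//STDENV DD *,SYMBOLS=JCLONLY
CONFIG_LPAR=/u/zosc/&SYSNAME.
/*

and in server.xml use <include location="${CONFIG_LPAR}/security.xml"/>, so each LPAR picks up its own included files.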

How much disk space is needed?

The configuration files do not need much disk space. If you use the trace capability the trace files can be large, and there can be many of them, but you can control the number and size of the logs and traces. FFDCs are also stored in the file system; these can also be large, and you may get a lot of them. ZFS can automatically expand the file system – and your automation can respond to the ZFS message on the console notifying you that your file system is filling up.

If the JVM abends, you can get SDUMPS taken. On my machine they were taken with the HLQ of the started task (START1).

Angel task needed

You need an ANGEL task to support authorised services. You can have only one unnamed Angel per LPAR. You need to decide if your server can use this, or if your server needs its own, named Angel.

You should use the Angel at the latest service level. If servers share an Angel, and the Angel is running back level, you will get a message informing you.

You configure the Liberty instance to point to a named Angel.
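For example, a hedged sketch: you start the Angel with a name, and point the server at it in its bootstrap.properties (the name PRODANGL is hypothetical):

S BBGZANGL,NAME=PRODANGL

com.ibm.ws.zos.core.angelName=PRODANGL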

Planning for security.

Liberty requires a RACF APPL profile prefix to be set up. The default profile prefix is BBGZDFLT. This name is used as a prefix to the RACF profiles which allow users to access Liberty. For example, in the EJBROLE class

BBGZDFLT.zos.connect.access.roles.zosConnectAccess

To provide isolation and security, you may want to use a different profile prefix for different groups of servers. For example you may want to isolate MQWEB from z/OS Connect, and from WebSphere Application Servers.

In summary, there are three levels of security

  1. A userid needs access to the EJBROLE profile (above) to get access to the z/OS Connect instance.
  2. There is Global access, with four predefined roles. You specify a list of groups and Liberty checks to see if the userid is a member of the groups. This is not a SAF check. This checking is done in an interceptor (exit) which you specify.
  3. You can specify security at the API or service level. This checking is done in an interceptor (exit) which you specify.

You will need to set up an EJBROLE profile and permit groups to connect to the server.
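A minimal sketch (the group name ZCONGRP is hypothetical; the profile name follows the default prefix above):

RDEFINE EJBROLE BBGZDFLT.zos.connect.access.roles.zosConnectAccess UACC(NONE)
PERMIT BBGZDFLT.zos.connect.access.roles.zosConnectAccess +
   CLASS(EJBROLE) ACCESS(READ) ID(ZCONGRP)
SETROPTS RACLIST(EJBROLE) REFRESH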

Once a user has access to the server, there is another layer of security with categories:

  • globalAdminGroup – Identifies the users that are able to use administrative functions on all APIs, services, service endpoints and API requesters.
  • globalOperationsGroup – Identifies the users that are able to perform operations such as starting, stopping or obtaining the status of all APIs, services, service endpoints and API requesters.
  • globalInvokeGroup – Identifies the users that are able to invoke all APIs, services, service endpoints and API requesters.
  • globalReaderGroup – Identifies the users that are able to get lists of, or information about, all APIs, services, service endpoints and API requesters, including Swagger documentation.

You can refine the security for the APIs, Services, and Service endpoints, using tags like

<zosconnect_services…

  • adminGroup
  • operationsGroup
  • invokeGroup
  • readerGroup

To be able to operate a service or API, you need to be in both globalOperationsGroup, and in the operationsGroup lists of groups.

If you have different applications within a server, you need to be careful how you set up the security profiles. If someone is authorised through the global* roles to operate service A, and you add service B, then by default that person will be allowed to operate service B. You need to define the zosconnect_services for service B, and specify the operationsGroup, to restrict access to service B.

Because of this, you need to consider whether you need a separate profile prefix per server to give application isolation from a security perspective.

During this planning stage you need to plan the default prefix you will be using, the groups of users for the different roles, and if you want to use both global and API/services level authorisation checks.

If you change the configuration and change the groups in the configuration, you can activate the change using the

f ….zcon,refresh

operator command.

Unauthenticated user.

When Liberty uses SAF to authenticate, it requires an Unauthenticated User which is usually “WSGUEST”. This userid can be used for all Liberty instances.

Liberty does most of its work using a https connection. If you specify some particular options, the server can set up a default keystore. This is fine while you are setting up – but not for the long term, as it does not validate certificates sent from clients.

You will need to set up a keystore to provide the server with a private certificate. You will need a trust store which contains the Certificate Authority and any client self signed certificates.   The keystores and truststores can be shared by all servers.

You can have different keystores depending on the IP address or port. See https://www.ibm.com/support/knowledgecenter/SSEQTP_liberty/com.ibm.websphere.wlp.doc/ae/rwlp_ssl_outbound_filter.html. I suggest you do not do this until you have basic TLS working.

SMF

Liberty can produce SMF 120 records. There are no good tools freely available to provide reports on usage.

z/OS Connect can produce SMF record type 123 data. You will need to collect it. Some samples are provided to print out the data. There are no good tools to provide reports on usage.

Classifying requests using WLM.

You can classify requests to give priorities to particular services.  See here. You do not need to decide on the classification until the server is operational, and the services are available.  Essentially you configure a service as a transaction class, then use WLM to classify the transaction class within the server.

<httpClassification transactionClass="TCIC" method="GET" 
resource="/catalogManager/items"/>


A practical path to installing Liberty and z/OS Connect servers – 1 Overview

The instructions I have seen for installing products based on Liberty seem to be written as if there would only be one server: one server on the LPAR, and one server in the whole SYSPLEX. In reality you are likely to have the “same” server running on multiple LPARs, sharing configuration to provide availability, and to have more than one server running on an LPAR, for example MQWEB, WAS, z/OSMF, and z/OS Connect. The series of blog posts below is to help you implement multiple servers across a sysplex.

Some of the areas not adequately addressed by the IBM product documentation include

  1. Sharing of definitions
  2. Sharing of keystore and trust stores
  3. Providing isolation, to prevent someone who has access to MQWEB from accessing z/OS Connect.
  4. How many Angel tasks do I need – can one be shared?
  5. Some areas such as TLS can be hard to get working.

I’ll cover the instructions to install z/OS Connect, but the instructions are similar for other products. The steps are to create the minimum server configuration and gradually add more function to it.

The steps below guide you through

  1. Overview
  2. planning to help you decide what you need to create, and what options you have to choose
  3. initial customisation and creating a server,  creating defaults and creating function specific configuration files,  for example a file for SAF
  4. starting the server
  5. enable logon security and add SAF definitions
  6. add keystores for TLS, and client authentication
  7. adding an API and service application
  8. protecting the API and service applications
  9. collecting monitoring data including SMF
  10. use the MQ sample
  11. using WLM to classify a service

With each step there are instructions on how to check the work has been successful.

I wrote the blog post How many servers do I need? Everyone knows this – or no one knows this when I was first thinking about planning my servers.

Question: What time is it in year 2k42? Answer: time to be retired

Do you remember the Y2K problem where the date rolled into 2000?
I had to fly to the US on Jan 1st 2000, so I could be on site in case there were problems with a large bank running on the mainframe.   I have two memories

  • the vending machines had a message like Out of Cheese Error. Redo From Start. and would not vend.
  • someone had been taken to hospital with gunshot wounds, because people celebrated the new millennium by firing their guns up into the air, and what goes up, must come down, and if you are in a crowded street…
There is another year-2K-type problem coming: it is when the System/390 clock wraps.  It is an 8 byte field.  When I was writing statistics and accounting code for MQ on z/OS, you timed an event by issuing the STCK instruction before something, issuing STCK again afterwards, and calculating the difference.
To solve this problem there is the STCK extended (STCKE) instruction, which gives 16 bytes – essentially there is one byte in front of the existing STCK value, and some spare space at the end.  So problem solved?   Not quite.
There are many control blocks with a field for the 8 byte STCK value.  If this is changed to a 16 byte STCKE field then the offsets of all the following fields will change.   This is OK for a small program, but not for the operating system, where fields are fixed “for architecture reasons” to allow people to rely on the location of these fields.
Many products depend on a STCK to create a unique identifier, and given two STCK values you can tell which was created first – even across IPLs.  Changing this to use a STCKE will cause a migration and coexistence problem.
Some SMF records have a STCK to say when an event occurred; the report processing may need logic to say that if the value is small, the clock has wrapped and the event occurred after 2042.
I had a routine which formatted a STCK into YY/MM/DD hh:mm:ss.tttttt.   This will no longer work, as the significant part of a STCKE is 9 bytes long.
Do you need to worry about this?  Not really – IBM will fix the operating system and products, and vendors will fix their products.  It comes down to your programs, and most people do not use the STCK values.  If you do use STCK, I suggest you locate all references to STCK and put the operations in macros.  Then when you have to change the code you change the macros, recompile the programs, and with a bit of magic and a good wind you'll have no problems – just make sure you feed the flying pigs first.
The alternative is to retire and let someone else worry about it.

How do I format a STCK from a C program?

I've been writing a program which processes SMF data, which has STCK values for dates of events, and STCK values for durations.
In assembler there is a STCKCONV function which takes a STCK (or STCKE) and converts it to a printable date and time, for example 2020/09/21 09:19:18.719641.

I wrote some code (at the bottom) to call the assembler routine to do the work.  It did not work for 64 bit programs.

I had some inspiration in the middle of the night for a much simpler way of doing it.

Quick digression.   There are 8 byte STCK values, and 16 byte STCKE values, which have an extended time stamp to handle the 2042 problem when a STCK will overflow.  A STCKE is essentially the STCK value with an extra “overflow” byte at the front, and padding at the end.


Simple way just using C – should work in 31 and 64 bit.

There are C routines for processing times.   For example gmtime(time) takes a Unix time and returns a pointer to a structure with the year, month, etc.

Unix time is the time in seconds since 00:00:00 on January 1 1970.

So to use the C routines, take a STCK(E), convert it to seconds, and subtract the number of seconds between the STCK epoch (January 1 1900) and midnight January 1 1970.
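That constant is easy to check: between January 1 1900 and January 1 1970 there are 70 years, 17 of them leap years, so

(70 * 365 + 17) * 86400 = 25567 * 86400 = 2208988800 seconds

which is the value subtracted in the code below.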

The logic takes an 8 byte value, shifts it right by 12 bits to get the microseconds into the bottom bit, calculates the number of seconds, and returns it.

#include <time.h>

typedef unsigned long long * ull; 
void STCKTM(char * pData, struct  timespec  * pTimespec) { 
      unsigned long long stck  = *(ull) pData; 
      stck = stck/4096; // divide by 4096 (shift right 12 bits) to get 
                        // the microseconds in the bottom bit  
      long  microseconds = stck%1000000; // save microseconds
      stck = stck/1000000;  // seconds from microseconds 
      stck = stck - 2208988800ULL; // seconds from 1900 to Jan 1 1970 
      pTimespec -> tv_sec  = stck; 
      pTimespec -> tv_nsec  = microseconds * 1000; 
    } 

You call this with (here using the STCKE variant, STCKETM, which is defined below)

struct tm * tm2; 
struct  timespec ts; 
STCKETM((char *) headtimeZCentry, &ts ); 
tm2 = gmtime( &ts.tv_sec ); 
printf("GMTIME yy:%d mm:%d dd:%d h:%d m:%d s:%d\n", 
  1900+tm2->tm_year, 
  1+tm2->tm_mon, 
  tm2->tm_mday, 
  tm2->tm_hour, 
  tm2->tm_min, 
  tm2->tm_sec); 

This produces

GMTIME yy:2020 mm:9 dd:21 h:9 m:19 s:18 

For STCKE to TM the logic is nearly identical. The 9 byte value only needs to be shifted 4 bits to align the microseconds to the bottom bit.

void STCKETM(char * pData, struct timespec * pTimespec){ 
      unsigned long long stck  = *(ull) pData; 
      stck = stck/16;   // 4096 for STCK, 16 for STCKE, as the STCKE 
                        // value is already shifted by definition 
      long  microseconds = stck%1000000; 
      stck = stck/1000000;  // seconds from microseconds 
      stck = stck - 2208988800ULL; // seconds from 1900 to Jan 1 1970 
      pTimespec -> tv_sec  = stck; 
      pTimespec -> tv_nsec  = microseconds * 1000; 
    } 


The hard way, using the assembler STCKCONV macro.

I could find no function in C to do the same conversion.  I used to have some C code (of about 300 lines) which did the tedious calculation of converting from microseconds to days, and then allowing for leap years etc.   Instead of recreating this, I've written a bit of glue code which allows you to invoke the STCKCONV macro from C.

It works with non-XPLINK AMODE 31 C programs.   I failed the challenge of getting it to work with XPLINK, and with 64 bit C programs (which have the extra challenge that parameters are passed as 64 bit pointers).

In your C program you have

#pragma linkage(STCKEDT,OS)

rc = STCKEDT(stckvalue, length, output);

Quick digression.   There are 8 byte STCK values and 16 Byte STCKE which have an extended time stamp to handle the 2042 problem when a STCK will overflow.

For a STCK value specify STCKEDT(stck,8,output).

For a STCKE value specify STCKEDT(stcke,16,output);

The output is a 27 character string with a trailing null.

The return code is either the one from the STCKCONV routine, or 20 if the length is invalid.

The code is below

STCKEDT CSECT 
STCKEDT AMODE 31 
STCKEDT RMODE ANY 
******** 
* R1-> A(STCK) 
*   -> length of STCK 8 or 16 
*   -> Return buffer 
STCKEDT2 EDCPRLG DSALEN=DLEN The name appears in CEE traceback
         LA   15,20          preset the return code - invalid parms 
         USING DSA,13 
         L    2,0(,1)         address of input 
         L    5,4(,1)         a(length of STCK) 
         L    5,0(5)          the length 
         L    6,8(,1)         return area 
         CFI  5,8             Is the passed length of STCK 8? 
         BNE  TRYSTCKE 
         STCKCONV  STCKVAL=(2),                                        x 
               CONVVAL=BUFFER,                                         x 
               TIMETYPE=DEC,  hhmmsst....                              x 
               DATETYPE=YYYYMMDD 
         BNZ  GOBACK 
         B    COMMON 
TRYSTCKE DS   0H 
         CFI  5,16            is length 16? 
         BNE  GOBACK          r15 has been set already to error 
         STCKCONV  STCKEVAL=(2),                                       x 
               CONVVAL=BUFFER,                                         x 
               TIMETYPE=DEC,  hhmmsst....                              x 
               DATETYPE=YYYYMMDD 
         BNZ  GOBACK 
         B    COMMON 
COMMON   DS   0H 
*  the macro produced time, date, so rearrange it to date time 
         MVC  DT(4),BUFFER+8   Move the date 
         MVC  DT+4(8),BUFFER+0   Move the time 
* put the ED mask in the output field 
         MVC  DATETIME,DTMASK 
* and convert it from packed numbers to readable string 
         ED   DATETIME,DT 
* returned date time string is 26 + 1 for trailing null 
         MVC  0(27,6),DATETIME+1   +1 because of leading pad char 
         SR   15,15              reset the return code
GOBACK   DS    0H 
         EDCEPIL 
&DATEMASK  SETC '4021202020612120612121'  _dddd/dd/ddd
&TIMEMASK  SETC '4021207a21207a21204b21202020202040' _dd:dd:dd.dddddd_
DTMASK   DC   X'&DATEMASK.&TIMEMASK.00'  Add trailing null for C 
* Work area 
DSA      EDCDSAD 
BUFFER   DS    4F     Time.time ..date .. work d 
DT       DS    3F     Date, time,time 
DATETIME DS   CL28    Leading blank, date time, null 
DLEN     EQU  *-DSA 
         END 

I compiled it with

//S1 EXEC PGM=ASMA90,PARM='DECK,NOOBJECT,LIST(133),XREF(SHORT),GOFF', 
//             REGION=4M 
//SYSLIB   DD DSN=SYS1.MACLIB,DISP=SHR 
//         DD DISP=SHR,DSN=CEE.SCEEMAC 
//SYSUT1   DD UNIT=SYSDA,SPACE=(CYL,(1,1)) 
//SYSPUNCH DD DISP=SHR,DSN=COLIN.OBJLIB(STCKEDT) 
//SYSPRINT DD SYSOUT=* 
//SYSIN    DD * 
...

/*

and included it in my C program JCL as

//BIND.OBJ DD DISP=SHR,DSN=COLIN.OBJLIB
//BIND.SYSIN DD *
  INCLUDE OBJ(STCKEDT)
  NAME COLIN(R)
//*

Getting z/OS Explorer to work with z/OS Connect EE

I've been trying to set up z/OS Connect, so I could look at the MQ support within it.

Setting up z/OS Connect in the first place was a challenge, which I'll blog about some other time.  I was looking for an Installation Verification Program (IVP) and tried to use z/OS Explorer.  This was another challenge.  Like many problems, there are answers, but it is hard to find the information.

Installing z/OS Explorer

This was easy.  I started here and installed z/OS Explorer for Aqua – Eclipse tools, then selected IBM z/OS Connect EE.  I selected Aqua 3.2, and chose to install using Eclipse p2. I have tried to avoid Installation Manager as it always seemed very complex and frustrating.

I tried to extend an existing Eclipse, but this failed due to incompatibilities.  I started from fresh, and this worked fine.

Adjust the z/OS Connect server configuration.

I enabled access logging.

 <httpEndpoint id="defaultHttpEndpoint" 
    host="*" 
    accessLoggingRef="hal1" 
    httpPort="19080" 
    httpsPort="19443" > 
   <accessLogging enabled="true" 
     logFormat='h:%h i:%i u:%u t:%t r:%r s:%s b:%b D: %D m:%m' 
    /> 
<sslOptions sslRef="defaultSSLSettings"/> 
</httpEndpoint> 

This creates a file called http_access.log within the log directory. It has output like

10.1.1.1 ADCDC 08/Sep/2020:17:50:40 +0000 "GET /zosConnect/services/stockQuery HTTP/1.1" 200

You can see where the request came from (10.1.1.1), the user (ADCDC), the date and time, the request (“GET /zosConnect/services/stockQuery HTTP/1.1”), and the response code (200).

Getting started with z/OS Explorer

You need to define host connections.

If you totally disable security on your server you can use http.

  1. On z/OS Explorer, display the Connections tab. (Window -> Show View -> Host connections)
  2. Right click on z/OS Connect Enterprise Edition, and select New z/OS Connect Enterprise Edition, Connection
    1. Name: this is displayed in the tooling
    2. Host name: I used 10.1.3.10 which is my VIPA address of the server
    3. Port number:   This comes from the  httpEndpoint for the server.  The default is http:9080 and https:9443 – but as every Liberty product uses these values, your server may have different values.  I used 19080.
    4. I initially left Secure connection(TLS/SSL) unticked
    5. Click Save and Connect
  3. A panel was displayed asking for credentials. Either create new credentials (userid and password) or select an existing credential.
  4. Double click on the connection you just created.
    1. An error of “302, Found” is an HTTP response meaning redirection.  In the z/OS Connect case, this means you are trying to use an http connection when https (a TLS connection) was expected.  I got this because I had not disabled security in my server.

The normal way of accessing z/OS connect is to use TLS to protect the session.  As well as TLS to protect the session you can also use client certificate authentication.  This is what I used.

You will need to set up certificates, keystores and keyrings on z/OS and get the Certificate Authority certificates sent to the “other” system.  I used my definitions from using MQWEB.

  1. On z/OS explorer, set up the keystores
    1. Window -> Preferences -> Explorer-> certificate manager
    2. The truststore contains the CA certificates to validate the certificate sent down from the z/OS server.  Enter the file name (or use Browse) and the pass phrase for the key store.  My truststore was JKS.
    3. The keystore contains the client certificate used to identify this client to the server.
    4. Smart card details.  Ignore this (despite it saying you must configure a PKCS11 driver).   This section is used if you select a smart card to identify yourself, and it would be better if the wording said “If you are using smart card authentication you must configure a PKCS11 driver”.
    5. Leave the “Do not validate server certificate trust” unticked.  This will check the passwords etc of the key stores.
    6. At the bottom I used “Secure socket protocol-> TLS v1.2” though this is optional.
    7. Select Apply and Close
  2. Display the Connections tab. (Window -> Show View -> Host connections)
  3. Right click on z/OS Connect Enterprise Edition, and select New z/OS Connect Enterprise Edition, Connection
    1. Name: this is displayed in the tooling
    2. Host name: I used 10.1.3.10 which is my VIPA address of the server
    3. Port number:   This comes from the  httpEndpoint for the server.  The default is http:9080 and https:9443 – but as every Liberty product uses these values, your server may have different values.  I used 19443
    4. I ticked Secure connection(TLS/SSL).  If you do not select this, you will not be able to use a certificate to logon.
    5. Click Save and Connect
  4. A panel was displayed asking for credentials.   When I used an existing credential I failed to connect to the server.
    1. Select Create new credentials
    2. Click on Username and Password pull down – and select Certificate from Keystore.
    3. Enter credentials name – this is just used within the tooling
    4. Userid – this seems to be ignored.  I used certificate mapping on the z/OS to map the certificate to a userid.
    5. Choose a certificate – select one from the pull down.  On my Linux box the choice of certificates came out in yellow writing on a yellow background!
    6. Click OK
    7. The connection should appear on the Connections page, under z/OS Connect Enterprise Edition.  It should go yellow while it is connecting, and green, with a padlock once it has connected

Use z/OS Connect

Use Window-> Show View -> zOS Connect EE Servers

You should see your connection displayed  with the IP address and port. Underneath this are any APIs or Services you have defined.

If you have any APIs or Services, you should be able to right click and select Show Properties View.  You can click on the links, or copy the links and use them, for example in a web browser directly, or via curl.

If you try to use the APIs or Services, you may not be authorised.  You will need to configure

  1. <zosconnect_zosConnectManager …>
  2. <zosconnect_zosConnectAPIs>   <zosConnectAPI name="stockmanager"  ….
  3. <zosconnect_service>  <service name="stockquery"

Good luck.


Planning your Liberty – this is not an escape plan

With the web being the new front end to z/OS, most z/OS products are using Liberty as their web server to deliver web content.   Each product seems to be documented as if it is the only Liberty instance on the z/OS image; for example they all default to using http port 9080.

This blog post helps identify what planning you need to do before you can configure a Liberty instance, be it z/OSMF, MQWEB, CICS, or z/OS Connect EE.

Roles

Often the person installing and configuring the server is part of the “installation team”, and may not be familiar with the detailed use of the product, or the product-based tooling, for example the Eclipse-based z/OS tooling.   This person's role is to configure the server so it meets the enterprise requirements; the configuration within the server is down to the team who requested it.

Update proclib

You need to decide if each instance will need its own procedure in proclib, or one procedure can start multiple servers.

You will need an Angel process – there can only be one Angel started task across all of your Liberty instances on an LPAR.  Ensure it is at the highest service level.

If you want isolation you may want to set up an test instance with a different started task userid to the production instance.

Where do the product ZFS libraries go

Usually the product code goes in /usr/lpp/IBM/product_name. Typically you make a directory /usr/lpp/IBM/product_name, then mount the product file system over this directory.  Sometimes this file system needs to be mounted read/write during customisation.   When you upgrade you can just mount the new file system instead of the old one.

Where do the configuration files go?

The Liberty configuration files can go in /var/… or /u/… . If you intend the server to be started on another LPAR, then the configuration files need to be available on that LPAR; having a variable with the LPAR name as part of the directory will not work.   The configuration directory is defined with the WLP_USER_DIR environment variable. You may want shell scripts which define this variable.  For example the shell script prodmqweb could have export WLP_USER_DIR=/u/mqweb/production/MQPA.   You then use sh prodmqweb to define the variable, or pass commands to it, such as sh prodmqweb dspmqweb, so you can be sure you are using the right configuration file.
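A minimal sketch of such a wrapper script (the behaviour of executing any arguments passed in is an assumption about how you would use it):

#!/bin/sh
# prodmqweb - point at the production MQPA configuration
export WLP_USER_DIR=/u/mqweb/production/MQPA
# run any command passed to the script, for example: sh prodmqweb dspmqweb
"$@"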

The UNIX Directory List Utility (ISPF 3.17) has space for 56 characters in the default directory name, but only 40 when working with files, such as new file or rename.   You may want to have a short prefix, or use an alias, for example /u/zoscA/servers/stockManager/server.xml instead of /MVSA/var/zosconnect/servers/stockManager/server.xml.

What configuration files will be shared?

If you already have Liberty running in your environment you may be able to reuse some files, and include them in your server.xml. For example, if your keystores are defined in a file you could use <include location="…./keystore.xml"/> .

If you are providing duplicate servers for availability, you can put your common definitions in one file and share it, and put server specific definitions in a different file.

It is easier to manage the configuration files if you provide small function specific files, for example saf.xml, trace.xml, applications.xml, and keystores.xml.

What TCPIP ports will be used

Each product will need its own ports, typically one port for http, and another port for https.   You can define multiple ports with different characteristics; using httpEndpoint definitions you can specify that one port logs this information, and another port logs different information to a different place.

If you want to have two instances running on an LPAR using the same port, the port needs to be defined with SHAREPORT.
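A hedged sketch of the TCP/IP profile statement (the job name mask is hypothetical):

PORT
   9443 TCP BAQ* SHAREPORT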

You may want to have the same port defined on each LPAR, so that no matter which LPAR is used, clients use the same port, for example 9443.

You may want to use VIPA so you have one external IP address into your SYSPLEX, and use the z/OS SYSPLEX Distributor to route or distribute connection requests to available servers. If you want to do this you will need a configured VIPA TCPIP address.

You may need to specify which TCPIP instance to use if you have more than one TCPIP instance on an LPAR.

What keystores will be used

You need two keystores

  1. To identify the server – the keystore
  2. To validate certificates passed into the instance – the trust store.  Typically this has Certificate Authority certificates, and any self signed certificates.

These can be file based, or SAF based using z/OS keyrings.  For example <keyStore filebased="false" id="racfKeyStore"
location="safkeyring://START1/KEY" password="password" readOnly="true" type="JCERACFKS"/> 

You might have enterprise keystores available to every one, or provide isolation so you have keystores for bank1 servers, and different keystores for bank2 servers.

Defining the APPL and SERVER resource

The Liberty default  APPL definition is BBGZDFLT. This allows people to access the server (the front door).  If you already have a Liberty installed then you may be able to use the existing definition.

If you want isolation, for example test and production, or between two major applications you will need to select and define different APPL resources.

You will need

<safCredentials profilePrefix="ZZZZDFLT"/>

and

RDEFINE APPL ZZZZDFLT UACC(NONE) 
PERMIT ZZZZDFLT CLASS(APPL) ACCESS(READ) ID(START1) 
PERMIT ZZZZDFLT CLASS(APPL) ACCESS(READ) ID(GRALL) 
SETROPTS RACLIST(APPL) REFRESH 

RDEFINE SERVER BBG.SECPFX.ZZZZDFLT UACC(NONE) 
PERMIT BBG.SECPFX.ZZZZDFLT CLASS(SERVER) ACCESS(READ) ID(START1) 
SETROPTS RACLIST(SERVER,APPL)REFRESH

   /* for z/OS Connect
RDEFINE EJBROLE ZZZZDFLT.zos.connect.access.roles.zosConnectAccess   +
   UACC(NONE) 
PERMIT ZZZZDFLT.zos.connect.access.roles.zosConnectAccess + 
   CLASS(EJBROLE) ACCESS(READ) ID(IBMUSER) 
SETROPTS RACLIST(EJBROLE) REFRESH 

   /* for MQ Web
RDEFINE EJBROLE MQWEB.com.ibm.mq.console.MQWebAdmin UACC(NONE)

These tend to be mixed case, so take care when defining them.

Starting the instance

If you use

S BAQSTRT,PARMS='server1'
S BAQSTRT,PARMS='server2'

If you use STOP BAQSTRT then both servers will stop.

If you use

S BAQSTRT.BAQ1,PARMS='server1' 
S BAQSTRT.BAQ2,PARMS='server2'

You can use STOP BAQ1 to stop just server1.

You can also use

S BAQSTRT,PARMS='server1',jobname=BAQ1
S BAQSTRT,PARMS='server2',jobname=BAQ2

Then you can use P BAQ1, and also have WLM give BAQ1 and BAQ2 different service classes, and so give them different priorities.

Set up monitoring

There may be SMF data available, which you can collect if you enable the SMF collection classes.

You may have data from JMX which you collect and report.

Accessing the server

You will need to set up a profile and give permission to groups of people, just to be able to use the server.

You may need to protect individual applications, for example ability to start or stop applications, or to invoke the application.  This can be done once the basic setup has been done, and the system handed over.

For example in server.xml for z/OS connect

<zosconnect_service> 
  <service name="stockquery" 
     serviceDescription="stockQueryServiceDescriptionColin" 
     id="stockQueryService" 
     adminGroup="a3Admin,a4Admin" 
     invokeGroup="a3Invoke" 
     operationsGroup="a3Ops" 
     readerGroup="a3Reader" 
  /> 
</zosconnect_service>

It is good policy to grant access only to groups, and not to individual ids, as it simplifies userid administration.  You would define a3Admin, a4Admin, and a3Invoke as groups.
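A minimal sketch of the RACF commands (the group and user names follow the example above, and are hypothetical):

ADDGROUP A3ADMIN
ADDGROUP A3INVOKE
CONNECT COLIN GROUP(A3ADMIN)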


How many servers do I need? Everyone knows this – or no one knows this

I was planning on installing a product on z/OS and was going through the documentation.  It is hard to see things that are not there, but I was surprised to see nothing about initial planning and how to set up the product.  I looked at other products – and they were also missing this information. It feels like this information is so obvious that everyone knows it – or the person responsible for the installation instructions only had experience of installing one image and documenting it.  (Tick the box, job done.)

There are two reasons for configuring more than one instance of a product:

  1. You want multiple copies of the same configuration
  2. You want a different configuration.

This is obvious, but it took me half an hour to realise it.  I'll cover the implications of these decisions below.

You want multiple copies of the same configuration

There are many reasons for this:

  1. You have to support more than one LPAR.
  2. You want to have more than one instance on an LPAR for availability, scalability and performance.  For example, a z/OS queue manager can support up to 10,000 channels, and log at about 100MB a second; if you want to do more than this you need a second queue manager.
  3. You want availability by having instances running on multiple LPARs, so if one LPAR is shut down, work can continue to flow to the other LPARs.

You want a different configuration

The main reason for this is isolation:

  1. You have different environments for example production and test.
  2. You have different customers for example you support, bank1 and bank2.  You want isolation so if bank1 fills up the disk space, bank2 is not affected.
  3. You want bank1 and bank2 systems to use different certificate authorities, so bank1 end users cannot connect to bank2 systems.
  4. You have different performance criteria – you configure bank1 systems to have higher priority in WLM than bank2 systems.
  5. If you have bank1 and bank2 sharing a system, and you want to restart it, you have to get agreement from both sets of end users.  Getting agreement for a date from one bank is easier than getting agreement from multiple banks.

Implications of “you want multiple copies of the same configuration”

Ideally you want to replicate a system with very little work – a so called cookie-cutter approach.

Configuration files

Create self-contained configuration files.  For example, with the Liberty web server on z/OS, each server has its own server.xml file.  Create a file called keystore.xml and include that in the server.xml file.  The server.xml configuration file may look like

<server>
<include location="${server.config.dir}/keystore.xml"/> 
<include location="${server.config.dir}/mq.xml"/> 
<include location="${server.config.dir}/saf.xml"/>
instance specific data
</server>

With JCL you can use

//OUTPUT1 INCLUDE MEMBER=....

to include common configuration.

TCPIP ports

You are likely to use the same port number across different LPARs.  You can define a port as a SHAREPORT, and have multiple applications listening on the same port number on the same LPAR.

Started task userid

Have the instances start with the same userid, so they have the same  access to resources.

Liberty profilePrefix

Have the instances use the same profile prefix, default BBGZDFLT.  This is defined using RDEFINE APPL BBGZDFLT…

Define who can access the instance using SAF EJBROLE for example giving different groups of people access to the  BBGZDFLT.zos.connect.access.roles.zosConnectAccess profile.

Isolated instances

If you want to isolate instances, then use the above list and make sure you use different values.

Future proof your definitions.

An advantage of “define something once in an include file, and reuse it” is that if you change the contents, for example /usr/lpp/mqm/V9R1M1/java/lib, you change it once, and restart the servers.

If the servers each have a unique configuration file, you have to make the same change in each file.   This can be good – for example you want to change this server this week, and that server next week.

You could also have an alias /usr/mq pointing to /usr/lpp/mqm/V9R1M1.   When you want to change the version of MQ, change the alias and restart the servers.  To undo the change, change the alias back to the old version and restart.  This is much easier than changing the individual files (how many change records would you have to raise?).

Liberty Specials.

With Liberty you can have one JCL procedure, and start multiple servers.  This environment is a mixture of multiple copies of the same configuration, and isolated instances.

Each server has its own server.xml file, but uses the same JCL, and so has a common userid etc.

You can start a liberty server

s baqstrt,parms='stockManager',jobname=smanager

And have WLM classify this under SMANAGER.

The default Liberty profile is stored under /var/zosconnect.  If you want to have a copy of this running on two LPARs you will need to have the profiles stored in different places.  You could have

/var/zosconnect/LPAR1/…

but if you need to move the server to a different LPAR this might point to the wrong directory.

You could change the procedure so you pass in the location of the profile.

s baqstrt,parms='stockManager',jobname=smanager,profile='/var/zosconnect1/instance1'


The 1960s were a great time for many – but you do not have to continue using JCL from just the 1960s.

It is great that JCL written in the 1960s still works today.  However JCL has moved on, and there are better JCL techniques available today.  Unfortunately the SMP/E installation instructions seem to be stuck in the 1960s.  Many products have customization using the same manual, laborious techniques.

In the 1960s you edited jobs and made the same changes multiple times

I was doing an SMP/E installation and it took 5 times longer than it should.

For example part of some JCL in one PDS member

//DELCSI EXEC PGM=IDCAMS,REGION=64M,COND=(0,LT)
//SYSPRINT DD SYSOUT=*
//SYSIN DD *
DELETE @hlqgzone@.CSI
SET MAXCC = 0
/*

and comments telling you to change @hlqgzone@  (High-level dataset qualifier(s) for SMP/E global zone datasets) to your value.

In another PDS member there is

//APPLY EXEC PGM=GIMSMP,REGION=0M,COND=(0,LT)
//SMPCSI DD DISP=SHR,
// DSN=@hlqgzone@.CSI

and comments telling you to change @hlqgzone@  (High-level dataset qualifier(s) for SMP/E global zone datasets) to your value.

Another PDS member has

//ACCEPT EXEC PGM=GIMSMP,REGION=0M,COND=(0,LT)
//SMPCSI DD DISP=SHR,
// DSN=@hlqgzone@.CSI

and comments telling you to change  @hlqgzone@  (High-level dataset qualifier(s) for SMP/E global zone datasets) to your value.

There is a boring trend here.  Having to make the same change in about 10 files is tedious;  but it gets worse.

In the first and second jobs I changed @hlqgzone@ to COLIN.SMP, and in the third job I changed it to COLIN.SMPE. Of course this did not work, so I had to spend time fixing it.  Having to manually change many files is error prone.

 A better way of doing it

Create a member called MYDEFS in the PDS for example

//E1 EXPORT SYMLIST=*
//S1 SET OPT=COLINOPT
//S2 SET HLQ=COLIN.PRODUCT
//S3 SET SMPVOL=USER00

In each job, use cut and paste to put at the top

//BBQALLOC JOB NOTIFY=&SYSUID
//MYLIB JCLLIB ORDER=IBMUSER.F2
//OUTPUT1 INCLUDE MEMBER=MYDEFS

No magic here; you could use these variables in 1960s JCL, for example use VOL=SER=&SMPVOL.

The magic is in the ability to use symbols within the program input.

//HBBQ999D EXEC PGM=IDCAMS,REGION=4M,COND=(0,LT)
//SYSPRINT DD SYSOUT=*
//SYSIN DD *,SYMBOLS=JCLONLY
DELETE &HLQ..AZYZCOB
DELETE &HLQ..SZYXCOB

Because of the SYMBOLS=JCLONLY, the variable &HLQ is replaced with its value.

And what’s more – it gets better!

You can have the SYMBOLS processing logged to a logging-DDname.

//SYSIN DD *,SYMBOLS=(JCLONLY,SYMBOLS)
//SYMBOLS DD SYSOUT=*

and the DDNAME SYMBOLS has the data

SYSIN : RECORD 1 BEFORE SUBSTITUTION
SYSIN : DELETE &HLQ..AZYXCOB
SYSIN : RECORD 1 AFTER SUBSTITUTION
SYSIN : DELETE COLIN.PRODUCT.AZYXCOB

This logging-DDname option came out in z/OS 2.2 July 2015, so this may be new to some people.

And while I am grumbling…

The SMP/E install has members like

BQQACCPT
BQQALLOC
BQQAPPLY
BQQDDDEF
BQQSMPSU
BQQRECVE

You have to follow the documentation very carefully, as the jobs have to be run in a particular order.

If they were to be renamed

B1QSMPSU
B2QALLOC
B3QDDDEF
B4QRECVE
B5QAPPLY
B6QACCPT

then it would be easy to know the order of running the jobs, there would be less documentation, and it would be faster.


How do I find out about my VIPA configuration?

This follows on from setting up VIPA for the Liberty web server to provide high availability.  I had a few problems setting it up, and this blog post covers some of the commands I used to get it working.

I cover

  1. Is the VIPA active?
  2. Where is it running?
  3. Are applications processing requests?

Some IP basics.

  1. Every connection has an IP address at each end.  An address looks like 10.3.4.15 or 4 * 8 bit numbers.
  2. My machine has several connections, ethernet, wireless, and a tunnelling connection to z/OS. Each connection has a different IP address.
  3. Packets get routed through the network depending on the destination IP address.  The router has logic like: packets going to 10.4.5.* go down this connection, packets for 17.2.2.* go down that connection, and any other packets – try sending them down the connection to 11.13.6.6.
  4. The router uses a netmask to calculate which connection to use.
    1. A netmask is a string of 1s followed by 0s.  For example 255.255.255.0 – or 3 * 8 = 24 ones.
    2. A router takes a packet's IP address and a netmask, logically ANDs them together, and uses the result to decide where to route the packet (a worked example follows this list).
    3. A connection handling 10.4.1.0 to 10.4.1.255 would have a netmask of 255.255.255.0 (also written /24); a default connection may handle all packets for 10.* with a netmask of 255.0.0.0, or /8.
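As a worked example (the addresses are hypothetical): ANDing the destination address 10.4.1.37 with the netmask 255.255.255.0 keeps the top 24 bits,

10.4.1.37 AND 255.255.255.0 = 10.4.1.0

so the packet matches, and is routed down, the connection handling 10.4.1.0/24.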

My scenario

  1. I have my desktop machine running Ubuntu Linux.
  2. I have z/OS (called SOW1) running on my desktop using the zPDT.
  3. I have 3 TCPIP images (stacks) running on the z/OS image
    1. TCPIP running as the front end
    2. TCPIP2 running as a backend – this could be on another LPAR
    3. TCPIP3 running as a backend
  4. I have a VIPA defined with address 10.1.3.10

What configuration does Ubuntu have?

There are many commands to display network configuration information on Linux.

What address does Ubuntu have?

ip address gives a lot of information – but I did not use it

What packet routing does my desktop have?

The command ip route gives

  1. 10.1.0.0/24 dev eno1 proto kernel scope link src 10.1.0.3 metric 100
  2. 10.1.1.0/24 dev tap0 proto kernel scope link src 10.1.1.1
  3. 10.1.2.0/24 dev tap1 proto kernel scope link src 10.1.2.1
  4. 10.1.3.0/24 dev tap0 scope link
  5. 10.20.2.4 dev tap0 scope link
  6. 192.168.1.0/24 dev wlxd037450ab7ac proto kernel scope link src 192.168.1.67 metric 600

Line (2) shows

  • Traffic for any address between 10.1.1.0 and 10.1.1.255 (remember the netmask /24 means 24 bits, or 255.255.255.0) goes to device (connection) tap0
  • The IP address for the desktop end of the connection is 10.1.1.1

Line (4) shows

  • that any traffic for 10.1.3.0 to 10.1.3.255 goes to device tap0

The command used to set this up was sudo ip route add 10.1.3.0/24 dev tap0

Line (5) shows

  • that traffic to 10.20.2.4 goes to device tap0.

The command used to set this up was sudo ip route add  10.20.2.4 dev tap0

What is the routing for an IP address?

You can use the traceroute command to display which route a packet would take. For example

  • traceroute 10.1.3.10
    • traceroute to 10.1.3.10 (10.1.3.10), 30 hops max, 60 byte packets
    • 1 10.1.3.10 (10.1.3.10) 4.963 ms 4.980 ms 5.887 ms

For a connection that is not defined

traceroute 10.20.2.5 
traceroute to 10.20.2.5 (10.20.2.5), 30 hops max, 60 byte packets
1 bthub.home (192.nnn.1.mmm) 3.170 ms 4.742 ms 6.379 ms
2 * * *

So we can see it went to my BT Hub wireless router.

You can also use the ping command.  On Linux there is the -R option to display the route.

ping -R 10.1.3.10 
PING 10.1.3.10 (10.1.3.10) 56(124) bytes of data.
64 bytes from 10.1.1.2: icmp_seq=1 ttl=64 time=2.54 ms
NOP
    RR: 10.1.1.1
        10.1.1.2
        10.1.1.1

The request went to 10.1.1.1; 10.1.1.2 caught it, and sent the reply back via 10.1.1.1.

I was looking for my VIPA address, 10.1.3.10, and we can see it got to 10.1.1.2.

For the ping to work, there must be a server processing the ping request.  If there are no applications processing the VIPA, the VIPA is not active, so a ping will fail.

A successful ping to a VIPA address means a packet can get to the LPAR, be processed, and the reply sent back.  If the ping does not respond it could be that

  1. The VIPA is not active
  2. The VIPA is active and a packet was sent to the LPAR hosting the VIPA, but it could not send a response back due to a set up error.

How to change the TCPIP configuration on z/OS

You can change the configuration of a TCPIP image using the operator command

V TCPIP,TCPIPn,OBEY,filename

Where

  1. V TCPIP tells z/OS to route this command to TCPIP
  2. TCPIPn is the name of the TCPIP address space to direct the command to, for example V TCPIP,TCPIP2.  If there is only one TCPIP running you can use V TCPIP,,
  3. OBEY this is the TCP command
  4. filename is the parameter passed to the OBEY command.   The filename containing the commands/configuration to be executed.
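
As a sketch of how this fits together (the data set name USER.TCPPARMS(VIPA) and its contents are just for illustration), you could put dynamic VIPA statements in a member

VIPADYNAMIC 
  VIPADEFINE 255.255.255.0 10.1.3.10 
ENDVIPADYNAMIC

and activate them with

V TCPIP,TCPIP2,OBEY,USER.TCPPARMS(VIPA)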

How to display information on z/OS

There are three ways of displaying TCPIP information, for example the IP address(es) of the TCP image

  1. The operator command D TCPIP,TCPIP2,NETSTAT,HOME… similar in syntax to the V TCPIP command above
  2. The TSO command NETSTAT HOME TCP TCPIP2
  3. The USS command netstat -h -p tcpip2.  The commands are similar to, but different from, the Linux commands!

The output is usually similar between the commands.

What is the IP address of my TCPIP image?

From the TSO NETSTAT HOME command

EZZ2350I MVS TCP/IP NETSTAT CS V2R4 TCPIP Name: TCPIP2 17:15:53
EZZ2700I Home address list:
EZZ2701I Address   Link         Flg
EZZ2702I -------   ----         ---
EZZ2703I 10.1.1.3  ETH1         P
EZZ2703I 10.1.2.3  ETHB
EZZ2703I 172.1.1.2 EZASAMEMVS
EZZ2703I 10.1.3.10 VIPL0A01030A I
EZZ2703I 127.0.0.1 LOOPBACK

10.1.1.3 ties up with the information on the desktop, which had IP address 10.1.1.1 for device tap0, and 10.1.2.3 ties up with 10.1.2.1 for device tap1.

For the links

  1. I configured links ETH1 and ETHB.
  2. VIPL0A01030A is the IP address converted to hex: 10.1.3.10 becomes VIPL 0A 01 03 0A.
  3. EZASAMEMVS is the prefix EZA plus “SAME MVS”. This link is generated by TCPIP from the DYNAMICXCF configuration.

What routing is there?

The TSO command NETSTAT ROUTE TCP TCPIP2 or the USS command netstat -r -p tcpip2 gives

MVS TCP/IP NETSTAT CS V2R4 TCPIP Name: TCPIP2 16:15:43 
Destination  Gateway  Flags Refcnt     Interface 
----------- -------   ----- ------     --------- 
Default      10.1.1.1 UGS   0000000000 ETH1 
10.0.0.0/8   0.0.0.0  US    0000000000 ETH1 
10.1.1.3/32  0.0.0.0  UH    0000000000 ETH1 
10.1.2.0/24  0.0.0.0  US    0000000000 ETHB 
10.1.2.3/32  0.0.0.0  UH    0000000000 ETHB 
127.0.0.1/32 0.0.0.0  UH    0000000000 LOOPBACK 
172.1.1.1/32 0.0.0.0  UHS   0000000000 EZASAMEMVS 
172.1.1.2/32 0.0.0.0  UH    0000000000 EZASAMEMVS 
172.1.1.3/32 0.0.0.0  UHS   0000000000 EZASAMEMVS

This shows that traffic for 10.1.2.0 to 10.1.2.255 (with a netmask of /24, or 255.255.255.0) goes via link (interface) ETHB.
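
As with the other reports, there is an equivalent operator command:

D TCPIP,TCPIP2,NETSTAT,ROUTE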

What is happening to my VIPA on  z/OS?

On the OSA connection (think Ethernet connection) from the desktop to my z/OS environment there could be several LPARs using the OSA, each with multiple TCPIP images.

The operator command D TCPIP,TCPIP2,SYSPLEX,VIPADYN issued on any LPAR on any active TCPIP image gives a Sysplex view of the VIPA configuration

11.54.05 STC09473  EZZ8260I SYSPLEX CS V2R4 387                         C  
VIPA DYNAMIC DISPLAY FROM TCPIP    AT S0W1                                 
IPADDR: 10.1.3.10  LINKNAME: VIPL0A01030A                                  
  ORIGIN: VIPADEFINE                                                       
  TCPNAME  MVSNAME  STATUS RANK ADDRESS MASK    NETWORK PREFIX  DIST       
  -------- -------- ------ ---- --------------- --------------- ----       
  TCPIP    S0W1     ACTIVE      255.255.255.0   10.1.3.0        DIST       
  TCPIP2   S0W1     BACKUP 001                                  DEST       
  TCPIP3   S0W1     ACTIVE                                      DEST       
IPADDR: 10.1.4.10                                                          
  TCPNAME  MVSNAME  STATUS RANK ADDRESS MASK    NETWORK PREFIX  DIST       
  -------- -------- ------ ---- --------------- --------------- ----       
  TCPIP3   S0W1     ACTIVE      255.255.255.0   10.1.4.0                   
  TCPIP2   S0W1     MOVING      255.255.255.0   0.0.0.0                    

IPADDR:10.1.3.10

The VIPA 10.1.3.10 was created using a VIPADEFINE.

We see that TCPIP on S0W1 “owns” the VIPA 10.1.3.10 and is responsible for distributing requests. This image is DISTributing requests to other TCPIP images.

DEST means it is a target for connections (a DESTination) and has a server processing requests. BOTH means it is both a DESTination and DISTributing connections, and has a server processing them.

IPADDR:10.1.4.10

We can see that TCPIP3 is processing requests. TCPIP2 is not processing requests; it does not have a network prefix.

How are DVIPA connection requests distributed?

You need to ask the TCP that owns the VIPA. In my case, from the previous section, this is TCPIP.

The TSO command NETSTAT VDPT TCP TCPIP  or the USS command netstat -O -p tcpip gives

MVS TCP/IP NETSTAT CS V2R4 TCPIP Name: TCPIP 16:49:42 
Dynamic VIPA Destination Port Table for TCP/IP stacks: 
Dest IPaddr DPort DestXCF Addr Rdy TotalConn  WLM TSR Flg 
----------- ----- ------------ --- ---------  --- --- --- 
10.1.3.10   08443 172.1.1.2    001 0000000005  01 100 
  DistMethod: Roundrobin 
  TCSR: 100 CER: 100 SEF: 100 
  ActConn:   0000000000 
10.1.3.10   08443 172.1.1.3    000 0000000000  01 100 
  DistMethod: Roundrobin 
  TCSR: 100 CER: 100 SEF: 100 
  ActConn:  0000000000
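
The equivalent operator command is

D TCPIP,TCPIP,NETSTAT,VDPT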

The heading shows the TCPIP image name, and that we are looking at the Dynamic VIPA Destination Port Table for TCP/IP stacks.

When the report was generated, the application on 172.1.1.2 (TCPIP2) was active, and the application on TCPIP3 had been stopped.

From

Dest IPaddr DPort DestXCF Addr Rdy TotalConn  WLM TSR Flg 
----------- ----- ------------ --- ---------  --- --- --- 
10.1.3.10   08443 172.1.1.2    001 0000000005  01 100 
ActConn: 0000000000

We can see

  • Dest IPaddr: 10.1.3.10 is our VIPA address
  • DPort: 08443 is the destination port
  • DestXCF Addr: 172.1.1.2 is where the request is going – we know this is TCPIP2. It would be good if it could say S0W1.TCPIP2
  • Rdy: 001 there is one active application listening
  • TotalConn: 0000000005 there have been 5 requests to this application
  • ActConn: 0000000000 there are no active connections to this application

As TotalConn is greater than 0, there have been connections to the application, which is a good sign that the set-up is working.

Because the front end TCPIP is distributing the requests using Roundrobin, each TCPIP should get a connection in turn.

I then started the application on TCPIP3, and started a second application instance on TCPIP2. When I ran a workload, 10 requests went to TCPIP3 and 10 requests went to TCPIP2. On TCPIP2 the requests were evenly distributed between the two servers. It looked like round robin, but I do not know if this was by design or by chance.
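
For reference, a sketch of the sort of definition on the distributing TCPIP that produces this behaviour (the port and DESTIP ALL are from my set-up; check the syntax against your level of z/OS):

VIPADYNAMIC 
  VIPADEFINE 255.255.255.0 10.1.3.10 
  VIPADISTRIBUTE DEFINE DISTMETHOD ROUNDROBIN 10.1.3.10 PORT 8443 DESTIP ALL 
ENDVIPADYNAMIC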

How do I know if I have a backup configuration defined?

I set up TCPIP2 with a VIPABACKUP configuration. The operator command d tcpip,tcpip,sysplex,vipadyn gave me

VIPA DYNAMIC DISPLAY FROM TCPIP AT S0W1 
IPADDR: 10.1.3.10 LINKNAME: VIPL0A01030A 
ORIGIN: VIPADEFINE 
TCPNAME  MVSNAME  STATUS RANK ADDRESS MASK    NETWORK PREFIX  DIST 
-------- -------- ------ ---- --------------- --------------- ---- 
TCPIP    S0W1     ACTIVE      255.255.255.0   10.1.3.0        DIST 
TCPIP2   S0W1     BACKUP 001                                  DEST 
TCPIP3   S0W1     ACTIVE                                      DEST

We can see that TCPIP2 is defined as being the backup.
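
A sketch of the matching definition on TCPIP2 (rank 1, which ties up with the RANK 001 in the display; this would go in TCPIP2's profile or an OBEY file):

VIPADYNAMIC 
  VIPABACKUP 1 10.1.3.10 
ENDVIPADYNAMIC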

What connections does this TCPIP have?

You can use the TSO command NETSTAT ALLCONN TCP TCPIP2 or the USS command netstat -a -p tcpip2 to show which sessions are active.

MVS TCP/IP NETSTAT CS V2R4       TCPIP Name: TCPIP2          07:19:15  
User Id  Conn     Local Socket           Foreign Socket         State  
-------  ----     ------------           --------------         -----  
MYSERVER 0000003F 10.1.3.10..8443        0.0.0.0..0             Listen 
MYSERVER 0000003C 10.1.3.10..8443        0.0.0.0..0             Listen 
MYSERVER 0000004B 10.1.3.10..8443        0.0.0.0..0             Listen 

This shows there are 3 instances of MYSERVER running using IP address 10.1.3.10 and port 8443.

There will usually be a lot of output. You can filter it by IP address or port:

  • tso netstat allconn tcp tcpip2 (ipaddr 10.1.3.10
  • tso netstat allconn tcp tcpip2 (port 8443
  • uss  netstat -a -p tcpip2 -I 10.1.3.10 
  • uss netstat -a -p tcpip2 -P 8443
  • operator D TCPIP,tcpip2,netstat,allconn,ipaddr=10.1.3.10 
  • operator D TCPIP,tcpip2,netstat,allconn,port=8443


What VIPA stuff does this TCPIP have?

The USS command netstat -v -p tcpip3 or the TSO command NETSTAT VIPADYN TCP TCPIP3 gives

MVS TCP/IP NETSTAT CS V2R4       TCPIP Name: TCPIP3          10:21:04 
Dynamic VIPA: 
  IP Address      AddressMask     Status    Origination     DistStat 
  ----------      -----------     ------    -----------     -------- 
  10.1.3.10       255.255.255.0   Active                    Dest 
    ActTime:      08/30/2020 10:40:10 
  10.1.4.10       255.255.255.0   Active    VIPARange Bind 
    ActTime:      08/30/2020 11:03:05        JobName:        MYSERVER

The 10.1.3.10 VIPA was created using VIPADEFINE. VIPA 10.1.4.10 was created by means of VIPARANGE (sketched below).

There may be multiple jobs processing the port. MYSERVER is just one of them.
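
For completeness, a sketch of the sort of VIPARANGE definition behind this (the subnet is from my set-up); the VIPA 10.1.4.10 comes into being when MYSERVER binds to an address within the range:

VIPADYNAMIC 
  VIPARANGE DEFINE 255.255.255.0 10.1.4.0 
ENDVIPADYNAMIC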