How do I find out about my VIPA configuration?

This follows on from setting up VIPA for the Liberty web server to provide High Availability.  I had a few problems setting it up, and this blog post is about some of the commands I used to get it working.

I cover

  1. Is the VIPA active?
  2. Where is it running?
  3. Are applications processing requests?

Some IP basics.

  1. Every connection has an IP address at each end.  An address looks like 10.3.4.15 or 4 * 8 bit numbers.
  2. My machine has several connections, ethernet, wireless, and a tunnelling connection to z/OS. Each connection has a different IP address.
  3. Packets get routed through the network depending on the destination IP address.  The router has logic like: packets going to 10.4.5.* go down this connection, packets for 17.2.2.* go down that connection, any other packets – try sending them down the connection 11.13.6.6.
  4. The router uses a netmask to calculate which connection to use.
    1. A net mask is a string of 1’s followed by 0s.  For example 255.255.255.0 – or 3 * 8 =24 ones.
    2. A router takes a packet IP address and a netmask and logically ands them together, and uses the result to decide where to route the packet.
    3. A connection handling 10.4.1.0 to 10.4.1.255 would have a netmask of 255.255.255.0 (also written /24, for 24 bits).  A default connection may handle all packets for 10.* with a netmask of 255.0.0.0 or /8.
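The "logical and" of an address and a netmask can be sketched in a few lines of Python.  This is only an illustration of the arithmetic, using the standard ipaddress module; the function names network_of and prefix_length are my own, and the address 10.4.1.77 is just an example.

```python
import ipaddress

def network_of(address: str, netmask: str) -> str:
    """AND the address with the netmask, bit by bit, and return the result."""
    addr_bits = int(ipaddress.IPv4Address(address))
    mask_bits = int(ipaddress.IPv4Address(netmask))
    return str(ipaddress.IPv4Address(addr_bits & mask_bits))

def prefix_length(netmask: str) -> int:
    """Count the 1 bits in the mask, so 255.255.255.0 gives 24."""
    return bin(int(ipaddress.IPv4Address(netmask))).count("1")

print(network_of("10.4.1.77", "255.255.255.0"))  # 10.4.1.0
print(network_of("10.4.1.77", "255.0.0.0"))      # 10.0.0.0
print(prefix_length("255.255.255.0"))            # 24
```

Any address from 10.4.1.0 to 10.4.1.255 gives the same result, 10.4.1.0, with the /24 mask, which is why one routing entry can cover the whole range.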

My scenario

  1. I have my desktop machine running Ubuntu Linux
  2. I have z/OS (called S0W1) running on my desktop using the zPDT.
  3. I have 3 TCPIP images (stacks) running on the z/OS image
    1. TCPIP running as the front end
    2. TCPIP2 running as a backend – this could be on another LPAR
    3. TCPIP3 running as a backend
  4. I have a VIPA defined with address 10.1.3.10

What configuration does Ubuntu have?

There are many commands to display network configuration information on Linux.

What address does Ubuntu have?

ip address gives a lot of information – but I did not use it

What packet routing does my desktop have?

The command ip route gives

  1. 10.1.0.0/24 dev eno1 proto kernel scope link src 10.1.0.3 metric 100
  2. 10.1.1.0/24 dev tap0 proto kernel scope link src 10.1.1.1
  3. 10.1.2.0/24 dev tap1 proto kernel scope link src 10.1.2.1
  4. 10.1.3.0/24 dev tap0 scope link
  5. 10.20.2.4 dev tap0 scope link
  6. 192.168.1.0/24 dev wlxd037450ab7ac proto kernel scope link src 192.168.1.67 metric 600

Line 2 shows

  • Traffic for any address between 10.1.1.0 and 10.1.1.255 (remember the netmask /24 means 24 bits or 255.255.255.0) goes to device (connection) tap0
  • The IP address for the desktop end of the connection is 10.1.1.1

Line 4 shows

  • that any traffic 10.1.3.0 to 10.1.3.255 goes to device tap0

The command used to set this up was sudo ip route add 10.1.3.0/24 dev tap0

Line 5 shows

  • that traffic to 10.20.2.4 goes to device tap0.

The command used to set this up was sudo ip route add  10.20.2.4 dev tap0
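To show how these entries are used, here is a rough Python sketch of a route lookup.  It assumes the usual "most specific matching route wins" rule, and it only knows about the six routes listed above; the name pick_device is made up for this example.  My real machine also has a default route which is not in that list.

```python
import ipaddress

# Routes copied from the ip route output above: (destination, device).
# A single address such as 10.20.2.4 is treated as a /32.
ROUTES = [
    ("10.1.0.0/24", "eno1"),
    ("10.1.1.0/24", "tap0"),
    ("10.1.2.0/24", "tap1"),
    ("10.1.3.0/24", "tap0"),
    ("10.20.2.4/32", "tap0"),
    ("192.168.1.0/24", "wlxd037450ab7ac"),
]

def pick_device(destination: str, routes=ROUTES):
    """Return the device of the most specific matching route, or None."""
    addr = ipaddress.ip_address(destination)
    best = None
    for prefix, device in routes:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, device)
    return best[1] if best else None

print(pick_device("10.1.3.10"))  # tap0
print(pick_device("10.20.2.4"))  # tap0
print(pick_device("10.20.2.5"))  # None
```

The None for 10.20.2.5 means none of the listed routes match, so on the real machine the packet would go wherever the default route sends it.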

What is the routing for an IP address ?

You can use traceroute command to display which route a packet would take. For example

  • traceroute 10.1.3.10
    • traceroute to 10.1.3.10 (10.1.3.10), 30 hops max, 60 byte packets
    • 1 10.1.3.10 (10.1.3.10) 4.963 ms 4.980 ms 5.887 ms

For a connection that is not defined

traceroute 10.20.2.5 
traceroute to 10.20.2.5 (10.20.2.5), 30 hops max, 60 byte packets
1 bthub.home (192.nnn.1.mmm) 3.170 ms 4.742 ms 6.379 ms
2 * * *

So we can see it went to my bt hub  wireless router.

You can also use the ping command.  On Linux the -R option records and displays the route.

ping -R 10.1.3.10 
PING 10.1.3.10 (10.1.3.10) 56(124) bytes of data.
64 bytes from 10.1.1.2: icmp_seq=1 ttl=64 time=2.54 ms
NOP
    RR: 10.1.1.1
        10.1.1.2
        10.1.1.1

The request went to 10.1.1.1.  10.1.1.2 caught it, and sent the reply back, via 10.1.1.1

I was looking for my VIPA address, 10.1.3.10, and we can see it got to 10.1.1.2.

For the ping to work, there must be a server processing the ping request.  If there are no applications processing the VIPA, the VIPA is not active, so a ping will fail.

A successful ping to a VIPA address means a packet can get to the LPAR, be processed and the reply sent back.  If the ping does not respond it could be

  1. The VIPA is not active
  2. The VIPA is active and a packet was sent to the LPAR hosting the VIPA, but it could not send a response back due to a set up error.

How to change TCPIP configuration on z/OS

You can change the configuration of a TCPIP image using the operator command

V TCPIP,TCPIPn,OBEY,filename

Where

  1. V TCPIP tells z/OS to route this command to TCPIP
  2. TCPIPn is the name of the TCPIP address space to direct the command to, for example V TCPIP,TCPIP2.  If there is only one TCPIP running you can use V TCPIP,,
  3. OBEY is the TCPIP command
  4. filename is the parameter passed to the OBEY command: the name of the file containing the commands/configuration to be executed.

How to display information on z/OS

There are three ways of displaying TCPIP information, for example the IP address(es) of the TCP image

  1. The operator command D TCPIP,TCPIP2,NETSTAT,HOME… similar in syntax to the V TCPIP command above
  2. The TSO command NETSTAT HOME TCP TCPIP2
  3. The USS command netstat -h -p tcpip2.   The commands are similar to, but different from, Linux commands!

The output is usually similar between the commands.

What is the IP address of my TCPIP image?

From the TSO NETSTAT HOME command

EZZ2350I MVS TCP/IP NETSTAT CS V2R4 TCPIP Name: TCPIP2 17:15:53
EZZ2700I Home address list:
EZZ2701I Address   Link         Flg
EZZ2702I -------   ----         ---
EZZ2703I 10.1.1.3  ETH1         P
EZZ2703I 10.1.2.3  ETHB
EZZ2703I 172.1.1.2 EZASAMEMVS
EZZ2703I 10.1.3.10 VIPL0A01030A I
EZZ2703I 127.0.0.1 LOOPBACK

10.1.1.3 ties up with the information on the desktop, which had IP address 10.1.1.1 for device tap0, and 10.1.2.3 ties up with 10.1.2.1 for device tap1.

For the links

  1. I configured link ETH1 and ETHB.
  2. The VIPL0A01030A takes the IP address and converts it to hex so 10.1.3.10 becomes VIPL 0A 01 03 0A
  3. EZASAMEMVS is prefix EZA and “SAME MVS”.   This is generated by TCPIP from the DYNAMICXCF configuration.
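The address-to-hex conversion for the VIPA link name is easy to reproduce.  This small Python sketch follows the pattern seen in the output above (VIPL followed by each of the four numbers as two hex digits); the function name vipa_link_name is my own.

```python
def vipa_link_name(address: str) -> str:
    """Build the VIPLxxxxxxxx style name from a dotted IPv4 address."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or any(not 0 <= o <= 255 for o in octets):
        raise ValueError(f"not an IPv4 address: {address}")
    # Each number becomes two upper case hex digits: 10 -> 0A, 1 -> 01.
    return "VIPL" + "".join(f"{o:02X}" for o in octets)

print(vipa_link_name("10.1.3.10"))  # VIPL0A01030A
```

Going the other way is handy when you see a VIPL name in a display and want to know which VIPA it belongs to.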

What routing is there?

The TSO command NETSTAT ROUTE TCP TCPIP2 or the USS command netstat -r -p tcpip2 gives

MVS TCP/IP NETSTAT CS V2R4 TCPIP Name: TCPIP2 16:15:43 
Destination  Gateway  Flags Refcnt     Interface 
----------- -------   ----- ------     --------- 
Default      10.1.1.1 UGS   0000000000 ETH1 
10.0.0.0/8   0.0.0.0  US    0000000000 ETH1 
10.1.1.3/32  0.0.0.0  UH    0000000000 ETH1 
10.1.2.0/24  0.0.0.0  US    0000000000 ETHB 
10.1.2.3/32  0.0.0.0  UH    0000000000 ETHB 
127.0.0.1/32 0.0.0.0  UH    0000000000 LOOPBACK 
172.1.1.1/32 0.0.0.0  UHS   0000000000 EZASAMEMVS 
172.1.1.2/32 0.0.0.0  UH    0000000000 EZASAMEMVS 
172.1.1.3/32 0.0.0.0  UHS   0000000000 EZASAMEMVS

This shows that traffic for 10.1.2.0 to 10.1.2.255 (with a netmask of /24 or 255.255.255.0) goes by link (interface) ETHB.

What is happening to my VIPA on  z/OS?

On the OSA connection (think ethernet connection)  from the desktop to my z/OS environment there could be several LPARs using the OSA, each with multiple TCP images.

The operator command D TCPIP,TCPIP2,SYSPLEX,VIPADYN issued on any LPAR on any active TCPIP image gives a Sysplex view of the VIPA configuration

11.54.05 STC09473  EZZ8260I SYSPLEX CS V2R4 387                         C  
VIPA DYNAMIC DISPLAY FROM TCPIP    AT S0W1                                 
IPADDR: 10.1.3.10  LINKNAME: VIPL0A01030A                                  
  ORIGIN: VIPADEFINE                                                       
  TCPNAME  MVSNAME  STATUS RANK ADDRESS MASK    NETWORK PREFIX  DIST       
  -------- -------- ------ ---- --------------- --------------- ----       
  TCPIP    S0W1     ACTIVE      255.255.255.0   10.1.3.0        DIST       
  TCPIP2   S0W1     BACKUP 001                                  DEST       
  TCPIP3   S0W1     ACTIVE                                      DEST       
IPADDR: 10.1.4.10                                                          
  TCPNAME  MVSNAME  STATUS RANK ADDRESS MASK    NETWORK PREFIX  DIST       
  -------- -------- ------ ---- --------------- --------------- ----       
  TCPIP3   S0W1     ACTIVE      255.255.255.0   10.1.4.0                   
  TCPIP2   S0W1     MOVING      255.255.255.0   0.0.0.0                    

IPADDR:10.1.3.10

The VIPA 10.1.3.10 was created using a VIPADEFINE.

We see that TCPIP on S0W1 “owns” the VIPA  10.1.3.10 and is responsible for distributing requests.  This image is DISTributing requests to other TCPIP Images.

The DEST means it is a target for connections ( a DESTination)  and has a server processing requests. BOTH means it is a DESTination and  DISTributing connections, and has a server processing them.

IPADDR:10.1.4.10

We can see that TCPIP3 is processing requests.   TCPIP2 is not processing requests, it does not have a network prefix.
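If you capture this display often, a small script can pull out who is distributing and who is a target.  The following Python sketch is based only on the layout of the display above; the set of STATUS values is just the ones that appear in my output, and the function name parse_vipadyn is made up.

```python
STATUSES = {"ACTIVE", "BACKUP", "MOVING"}  # only the values seen in this display
DIST_VALUES = {"DIST", "DEST", "BOTH"}

def parse_vipadyn(text: str) -> dict:
    """Map each IPADDR to a list of (tcpname, status, dist) tuples."""
    result = {}
    current = None
    for line in text.splitlines():
        words = line.split()
        if not words:
            continue
        if words[0] == "IPADDR:" and len(words) > 1:
            current = words[1]
            result[current] = []
        elif current and len(words) >= 3 and words[2] in STATUSES:
            # The DIST column is last on the line, and may be blank.
            dist = words[-1] if words[-1] in DIST_VALUES else None
            result[current].append((words[0], words[2], dist))
    return result

SAMPLE = """\
IPADDR: 10.1.3.10  LINKNAME: VIPL0A01030A
  ORIGIN: VIPADEFINE
  TCPNAME  MVSNAME  STATUS RANK ADDRESS MASK    NETWORK PREFIX  DIST
  -------- -------- ------ ---- --------------- --------------- ----
  TCPIP    S0W1     ACTIVE      255.255.255.0   10.1.3.0        DIST
  TCPIP2   S0W1     BACKUP 001                                  DEST
  TCPIP3   S0W1     ACTIVE                                      DEST
IPADDR: 10.1.4.10
  TCPNAME  MVSNAME  STATUS RANK ADDRESS MASK    NETWORK PREFIX  DIST
  -------- -------- ------ ---- --------------- --------------- ----
  TCPIP3   S0W1     ACTIVE      255.255.255.0   10.1.4.0
  TCPIP2   S0W1     MOVING      255.255.255.0   0.0.0.0
"""

for address, rows in parse_vipadyn(SAMPLE).items():
    distributors = [name for name, _, dist in rows if dist in ("DIST", "BOTH")]
    targets = [name for name, _, dist in rows if dist in ("DEST", "BOTH")]
    print(address, "distributed by", distributors, "targets", targets)
```

For 10.1.3.10 this reports TCPIP as the distributor with TCPIP2 and TCPIP3 as targets; for 10.1.4.10 both lists are empty because nothing is being distributed.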

How are DVIPA connection requests distributed?

You need to ask the TCP that owns the VIPA. In my case, from the previous section, this is TCPIP.

The TSO command NETSTAT VDPT TCP TCPIP  or the USS command netstat -O -p tcpip gives

MVS TCP/IP NETSTAT CS V2R4 TCPIP Name: TCPIP 16:49:42 
Dynamic VIPA Destination Port Table for TCP/IP stacks: 
Dest IPaddr DPort DestXCF Addr Rdy TotalConn  WLM TSR Flg 
----------- ----- ------------ --- ---------  --- --- --- 
10.1.3.10   08443 172.1.1.2    001 0000000005  01 100 
  DistMethod: Roundrobin 
  TCSR: 100 CER: 100 SEF: 100 
  ActConn:   0000000000 
10.1.3.10   08443 172.1.1.3    000 0000000000  01 100 
  DistMethod: Roundrobin 
  TCSR: 100 CER: 100 SEF: 100 
  ActConn:  0000000000

We have a heading showing the TCPIP image name, and we are looking at Dynamic VIPA Destination Port Table for TCP/IP stacks.

When the report was generated the application on 172.1.1.2 (TCPIP2) was active, and the application on TCPIP3 had been stopped.

From

Dest IPaddr DPort DestXCF Addr Rdy TotalConn  WLM TSR Flg 
----------- ----- ------------ --- ---------  --- --- --- 
10.1.3.10   08443 172.1.1.2    001 0000000005  01 100 
ActConn: 0000000000

We can see

  • Dest IPaddr: 10.1.3.10 is our VIPA address
  • DPort :08443 is the destination port
  • DestXCF Addr: 172.1.1.2 is where the request is going – we know this is TCPIP2.  It would be good if it could say S0W1.TCPIP2
  • Rdy: 001 there is one active application listening
  • TotalConn: 0000000005 there have been 5 requests to this application
  • ActConn: 0000000000 there are no active connections to this application

As TotalConn is greater than 0, this means there have been connections to the application, so is a good sign to show the set-up is working.
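The same checks can be scripted.  This is a minimal Python sketch that reads the report layout shown above and picks out the Rdy, TotalConn and ActConn values for each destination; it assumes the columns stay in this order, and parse_vdpt is a name I invented for the example.

```python
def parse_vdpt(text: str) -> list:
    """Return one dict per destination line in a NETSTAT VDPT report."""
    entries = []
    for line in text.splitlines():
        words = line.split()
        # Destination lines start with a dotted address followed by the port.
        if len(words) >= 5 and words[0].count(".") == 3 and words[1].isdigit():
            entries.append({
                "vipa": words[0],
                "port": int(words[1]),
                "xcf": words[2],
                "ready": int(words[3]),
                "total_conn": int(words[4]),
                "active_conn": None,
            })
        elif words and words[0] == "ActConn:" and entries:
            entries[-1]["active_conn"] = int(words[1])
    return entries

SAMPLE = """\
Dest IPaddr DPort DestXCF Addr Rdy TotalConn  WLM TSR Flg
----------- ----- ------------ --- ---------  --- --- ---
10.1.3.10   08443 172.1.1.2    001 0000000005  01 100
  DistMethod: Roundrobin
  TCSR: 100 CER: 100 SEF: 100
  ActConn:   0000000000
10.1.3.10   08443 172.1.1.3    000 0000000000  01 100
  DistMethod: Roundrobin
  TCSR: 100 CER: 100 SEF: 100
  ActConn:  0000000000
"""

for entry in parse_vdpt(SAMPLE):
    state = "has a listener" if entry["ready"] > 0 else "has no listener"
    print(entry["xcf"], state, "total connections", entry["total_conn"])
```

A destination with Rdy of 0 is a quick way to spot a stack where the application has not been started.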

Because the front end TCPIP is distributing the requests using Roundrobin – each TCPIP should get a connection in turn.

I then started the application on TCPIP3, and started another application on TCPIP2.  When I ran a workload I had 10 requests go to TCPIP3 and 10 requests go to TCPIP2.  On TCPIP2 the requests were evenly distributed between the two servers.  It looked like round robin, but I do not know if this was design or chance.

How do I know if I have a backup configuration defined?

I set up TCPIP with a VIPABACKUP configuration.   The operator command d tcpip,tcpip,sysplex,vipadyn  gave me

VIPA DYNAMIC DISPLAY FROM TCPIP AT S0W1 
IPADDR: 10.1.3.10 LINKNAME: VIPL0A01030A 
ORIGIN: VIPADEFINE 
TCPNAME  MVSNAME  STATUS RANK ADDRESS MASK    NETWORK PREFIX  DIST 
-------- -------- ------ ---- --------------- --------------- ---- 
TCPIP    S0W1     ACTIVE      255.255.255.0   10.1.3.0        DIST 
TCPIP2   S0W1     BACKUP 001                                  DEST 
TCPIP3   S0W1     ACTIVE                                      DEST

We can see that TCPIP2 is defined as being the backup.

What connections does this TCPIP have?

You can use the TSO command NETSTAT ALLCONN TCP TCPIP2 or the USS command  netstat -a -p tcpip2  to show what sessions are active.

MVS TCP/IP NETSTAT CS V2R4       TCPIP Name: TCPIP2          07:19:15  
User Id  Conn     Local Socket           Foreign Socket         State  
-------  ----     ------------           --------------         -----  
MYSERVER 0000003F 10.1.3.10..8443        0.0.0.0..0             Listen 
MYSERVER 0000003C 10.1.3.10..8443        0.0.0.0..0             Listen 
MYSERVER 0000004B 10.1.3.10..8443        0.0.0.0..0             Listen 

This shows there are 3 instances of MYSERVER running using IP address 10.1.3.10 and port 8443.
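Counting the listeners by eye is fine for three lines, but not for a long report.  Here is a small Python sketch that counts the Listen entries per job, address and port; it assumes the five column layout shown above, and count_listeners is just a name for this example.

```python
from collections import Counter

def count_listeners(text: str) -> Counter:
    """Count Listen entries by (user id, local address, local port)."""
    counts = Counter()
    for line in text.splitlines():
        words = line.split()
        if len(words) == 5 and words[4] == "Listen":
            # The local socket is written as address..port
            address, _, port = words[2].rpartition("..")
            counts[(words[0], address, int(port))] += 1
    return counts

SAMPLE = """\
User Id  Conn     Local Socket           Foreign Socket         State
-------  ----     ------------           --------------         -----
MYSERVER 0000003F 10.1.3.10..8443        0.0.0.0..0             Listen
MYSERVER 0000003C 10.1.3.10..8443        0.0.0.0..0             Listen
MYSERVER 0000004B 10.1.3.10..8443        0.0.0.0..0             Listen
"""

for key, number in count_listeners(SAMPLE).items():
    print(key, number)
```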

There will usually be a lot of output.  You can filter the request by

  • tso netstat allconn tcp tcpip2 (ipaddr 10.1.3.10
  • tso netstat allconn tcp tcpip2 (port 8443
  • uss  netstat -a -p tcpip2 -I 10.1.3.10 
  • uss netstat -a -p tcpip2 -P 8443
  • operator D TCPIP,tcpip2,netstat,allconn,ipaddr=10.1.3.10 
  • operator D TCPIP,tcpip2,netstat,allconn,port=8443

 

What VIPA stuff does this TCPIP have?

USS netstat -v  -p tcpip3 or TSO NETSTAT VIPADYN TCP TCPIP3

MVS TCP/IP NETSTAT CS V2R4       TCPIP Name: TCPIP3          10:21:04 
Dynamic VIPA: 
  IP Address      AddressMask     Status    Origination     DistStat 
  ----------      -----------     ------    -----------     -------- 
  10.1.3.10       255.255.255.0   Active                    Dest 
    ActTime:      08/30/2020 10:40:10 
  10.1.4.10       255.255.255.0   Active    VIPARange Bind 
    ActTime:      08/30/2020 11:03:05        JobName:        MYSERVER

The 10.1.3.10 VIPA is created using VIPADEFINE.  VIPA 10.1.4.10 was created by means of VIPARANGE.

There may be multiple jobs processing the port. MYSERVER is just one of them.

 

HA Liberty web server – implementing VIPA with distributing connections.

Overview of VIPA solutions

You can implement VIPA, where you give your application its own IP address, across multiple TCPIP images.   This solves the problem of certificates not matching the host IP address.

You can have

  • One TCPIP image processing the connection requests. You have multiple TCPIP images – but only one TCPIP image at a time processes the connections.   If the TCPIP image stops, another can take over.
  • Multiple TCPIP stacks can process connection requests. A front end TCPIP image takes the connection requests and distributes them to TCPIP instances where the application is running.   You can balance the load across multiple TCPIP images using connection distribution techniques such as Round Robin, or Hot Standby.  This is based on Sysplex Distributor technology.

This blog post discusses the second case.

To provide background information, I created

Sysplex Distributor background

Sysplex Distributor is like having a router inside your TCPIP on z/OS; it can route traffic transparently to other TCP images in the environment.  The Sysplex Distributor can distribute connection requests for a VIPA to TCPIP images where the application is running.    This set up is called Distributed VIPA (DVIPA).

It took me about 2 weeks to get a Sysplex DVIPA working – about a week understanding the documentation on VIPA, and the other week trying to understand why it didn’t work – and the simple configuration error I had.

I’ll break it down into simple stages which should help you understand the documentation.

The scenario

The scenario I used was going from my laptop, to z/OS running under zPDT on my laptop.  In effect there was an OSA connection between Ubuntu and my z/OS LPAR.  If you do not know what an OSA is, think of it as an Ethernet connection which can plug into multiple z/OS LPARs.

  1. My Ubuntu had an address of 10.1.1.1 over the tunnel connection
  2. I had one LPAR  with three TCPIP images
    1. TCPIP, host address 10.1.1.1 for the primary “front end”
    2. TCPIP2, host address 10.1.1.2 for the backup “front end” and where a server instance was running
    3. TCPIP3, host address 10.1.1.3 with a server instance running.
  3. I used a VIPA address of 10.1.3.10

The steps are

  1. Connect from Ubuntu to the z/OS
  2. Configure the LPAR(s)
  3. Define the VIPA configuration
  4. Define the “routing” to where the server was running.
  5. Getting the server to use the VIPA
  6. Commands to see what is going on (or not as the case may be)

Connect from Ubuntu to the z/OS

I had an existing tunnel connection from Ubuntu to z/OS.

I used the ip route command

sudo ip r add 10.1.3.10 dev tap0

to define the route to 10.1.3.10 via the tunnelling device tap0 .  This looks like an OSA connection into z/OS.

Configure the LPAR(s)

The Sysplex Distributor uses XCF communications between LPARs and TCPIP images on the LPARs.

You configure each TCPIP with a statement

IPCONFIG DYNAMICXCF 172.1.1.x

My “frontend” TCP/IP had  IPCONFIG DYNAMICXCF 172.1.1.1, the other two TCP/IP images had 172.1.1.2 and 172.1.1.3.

The 172.1.1.x address can be any address not used by your enterprise.  It is internal to the Sysplex configuration.

Define the VIPA configuration

You define the configuration once, in the front end TCPIP.  It is visible from the other TCP/IP images because the information is shared via the DYNAMICXCF.

You define the VIPA in the front end TCP/IP image with a VIPADEFINE netmask address.  I used

VIPADYNAMIC
  VIPADEFINE 255.255.255.0 10.1.3.10
...
ENDVIPADYNAMIC

You can define VIPABACKUP in another TCPIP image, so if the main front end TCP/IP is not available then a backup can take the traffic and distribute it to the other TCP/IP stacks.

When the main front end TCP/IP image is restarted, you can have it take back the routing.

Define the “routing” to where the server was running.

You can define a variety of ways of routing the work

  • A Hot Standby – where the input to the front end TCP/IP image is routed to a single “backend” application’s TCP/IP image.  If this fails, the work is routed to a running backup application.
  • A Round Robin – where requests are routed in turn to each TCPIP with an active application.
  • Routing depending on WLM or other load characteristics.

This routing is done by the VIPADISTRIBUTE statement in the front end TCP/IP image.

The definitions for the front end TCPIP

IPCONFIG SYSPLEXROUTING 
    DYNAMICXCF 172.1.1.1 255.255.255.0 3 

VIPADYNAMIC 
   VIPADEFINE 255.255.255.0 10.1.3.10 

   VIPADISTRIBUTE DEFINE DISTM ROUNDROBIN 10.1.3.10 PORT 8443 
      DESTIP 
         172.1.1.2 
         172.1.1.3 
ENDVIPADYNAMIC 

This routes connection requests to the TCPIP images with the DYNAMICXCF of 172.1.1.2 (TCPIP2) and 172.1.1.3 (TCPIP3)

The definitions for TCPIP2

IPCONFIG SYSPLEXROUTING 
DYNAMICXCF 172.1.1.2 255.255.255.0 3 

The definitions for TCPIP3

IPCONFIG SYSPLEXROUTING 
DYNAMICXCF 172.1.1.3 255.255.255.0 3

Use of VIPABACKUP

If the backup TCPIP front end image is used, it can have its own VIPADISTRIBUTE statement, or just use the same statement shared from the main front end TCP/IP image.  It is better to have the VIPADISTRIBUTE statements, for the case when the backup TCPIP is started before the front end TCPIP: the backup then needs its own VIPADISTRIBUTE statements. (These statements can be put into a PDS, and included using the INCLUDE dataset(member) statement in both primary and backup environments.)

To define TCPIP2 as a backup I used

VIPADYNAMIC 
    VIPABACKUP MOVEABLE IMMEDIATE 255.255.255.0 10.1.3.10 
    VIPADISTRIBUTE DEFINE DISTM ROUNDROBIN 10.1.3.10 PORT 8443 
        DESTIP 
        172.1.1.2 
        172.1.1.3 
ENDVIPADYNAMIC 

Getting the server to use the VIPA

The TCP/IP images hosting the applications just have the IPCONFIG DYNAMICXCF aa.bb.cc.dd statement.  They do not have any VIPADYNAMIC … ENDVIPADYNAMIC statements unless they are the main or backup front end TCP/IP images.

The application can connect using the VIPA address, for example create the SSLSOCKET Listener passing the VIPA address.   You can also configure TCP/IP so when a port is used, it binds to a particular IP address for example

PORT 8443 BIND 10.1.3.10

So an application using port 8443 to listen, will get IP address 10.1.3.10 – which in my case is a VIPA address.

You can use

PORT
9443 TCP * SHAREPORT BIND 10.1.3.7

to allow the port to be shared by many applications on a TCPIP Instance.

How are the connections distributed?

The VIPADISTRIBUTE  has many routing options. I used Hot Standby and Round Robin.

With RoundRobin, I had

  • the front end TCPIP
  • TCPIP2 with two Liberty servers
  • TCPIP3 with  one Liberty server

I ran some workload and found that the server on TCPIP3 had half the requests, and each of the two servers on TCPIP2 had a quarter of the overall requests.  This shows the routing is done at the TCPIP level – not the number of servers.
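The numbers I saw can be reproduced with a toy model.  This Python sketch assumes round robin between the TCPIP stacks first, and then simple alternation between the servers sharing the port on a stack (I do not know whether that second part is by design); the server names liberty2a, liberty2b and liberty3 are invented.

```python
from collections import Counter
from itertools import cycle

def simulate(stacks: dict, requests: int) -> Counter:
    """stacks maps a stack name to the list of servers listening on it."""
    stack_turn = cycle(stacks)  # round robin between the stacks
    server_turn = {name: cycle(servers) for name, servers in stacks.items()}
    counts = Counter()
    for _ in range(requests):
        stack = next(stack_turn)
        # Within a stack, assume the servers simply take turns.
        counts[next(server_turn[stack])] += 1
    return counts

counts = simulate({"TCPIP2": ["liberty2a", "liberty2b"], "TCPIP3": ["liberty3"]}, 20)
print(counts["liberty3"], counts["liberty2a"], counts["liberty2b"])  # 10 5 5
```

With 20 requests the single server on TCPIP3 gets 10, and the two servers on TCPIP2 get 5 each, which is the half and quarter split described above.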

HA Liberty web server – implementing VIPA using the simpler VIPARANGE technology

Overview of VIPA solutions

You can implement VIPA, where you give your application its own IP address, across multiple TCPIP images.   This solves the problem of certificates not matching the host IP address.

  • One TCPIP image processes the connection requests. You have multiple TCPIP images – but only one TCPIP image at a time processes the connections.   If the TCPIP image stops, another can take over.
  • Multiple TCPIP stacks can process connection requests. This uses Sysplex Distributor;  a front end TCPIP image takes the connection requests and distributes them to TCPIP instances where the application is running.   You can use load balancing such as Round Robin, or Hot Standby.

This blog post discusses the first case.

To provide background information, I created

Using VIPARANGE configuration

The technique uses the VIPARANGE configuration statement.

The concept is that many LPARs can be attached to an OSA adapter.  One, and just one, TCPIP stack (I don’t know which of the available images) takes the connection requests and passes them on to the application on that TCPIP image.

You allocate a range of TCPIP addresses for your applications, with the same network prefix, for example 9.4.6.x.   Allocate a host id to a Liberty, for example 9.4.6.7.   The Liberty instance keeps this address wherever it runs.  You configure your routers so that 9.4.6.* are routed to the OSA adapter.

For each TCPIP image where you want to run Liberty, add to the  TCPIP startup configuration (or to an OBEY file)

VIPADYNAMIC 
   VIPARANGE DEFINE 255.255.255.0 9.4.6.7 
ENDVIPADYNAMIC

The 255.255.255.0 is the  subnet mask.  If your organisation uses a different subnet mask, it affects the IP addresses used.

These instructions say that they are defining a range of IP addresses on this LPAR, for  9.4.6.1 to 9.4.6.254

If an application connects to TCPIP, and the bind specifies a value in this range (9.4.6.1 to 9.4.6.254) then it is considered a VIPA address.  If the application used 9.4.6.7 this would count as a VIPA address.
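The "is this bind address in the range" test can be sketched in Python with the standard ipaddress module.  This only mirrors the description above (host addresses 9.4.6.1 to 9.4.6.254 for a 255.255.255.0 mask); it is not how TCPIP implements the check, and in_viparange is a made-up name.

```python
import ipaddress

def in_viparange(bind_address: str, range_address: str, netmask: str) -> bool:
    """True if bind_address is a host address in the VIPARANGE subnet."""
    # strict=False lets us pass 9.4.6.7 with the mask and get 9.4.6.0/24.
    network = ipaddress.ip_network(f"{range_address}/{netmask}", strict=False)
    address = ipaddress.ip_address(bind_address)
    if address in (network.network_address, network.broadcast_address):
        return False  # .0 and .255 are outside the 9.4.6.1 to 9.4.6.254 range
    return address in network

print(in_viparange("9.4.6.7", "9.4.6.7", "255.255.255.0"))  # True
print(in_viparange("9.4.5.7", "9.4.6.7", "255.255.255.0"))  # False
```

If your organisation uses a different subnet mask, pass that mask instead and the range of matching addresses changes with it.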

When your application (Liberty) connects to TCP and uses an address in the VIPARANGE,  the TCPIP instance will create a dynamic IP address.   When I started my server application,   I got a z/OS console message

EZD1205I DYNAMIC VIPA 9.4.6.7 WAS CREATED USING BIND BY jobname ON TCPIP2.

When I shut down the server I got

EZD1298I DYNAMIC VIPA  9.4.6.7 DELETED FROM TCPIP2
EZD1207I DYNAMIC VIPA 9.4.6.7 WAS DELETED USING CLOSE API BY jobname ON TCPIP2

If the VIPA address is active on more than one TCPIP image, just one image will get all of the requests.  If you stop this image, another TCPIP image can take over.

If you have a different server using the same IP address, but a different port number, because they use the same IP address, the same LPAR will process the requests.

With VIPARANGE you do not get connections distributed to more than one TCPIP image.

In your browser use the address 9.4.6.7:9443.  The network router routes this to the OSA, a TCPIP captures it and passes it to the application (Liberty).   As part of the handshake Liberty sends down its certificate, which has a SAN of 9.4.6.7 which matches the IP address, so this works.

On another day, when a different z/OS image is capturing the VIPA address connections,  the TCPIP address is 9.4.6.7 as before, so this matches the SAN in the certificate.

Testing it

In a test I used “ping -R 9.4.6.7 ” to the VIPA address.
This reported it was sent to TCPIP stack with 10.1.1.2. When I shut this TCPIP image down, ping reported the request was sent to 10.1.1.3.  It did this with no manual intervention.

 

HA Liberty web server – introduction to using VIPA to provide high availability connectivity.

I struggled to set up Liberty to provide a Highly Available solution – if I shut down one TCPIP instance, I want to access Liberty through another TCPIP instance.  In principle it is easy; but there is just a little problem when you are using certificates on the z/OS image.  I’ll use names rather than IP addresses in the discussion below.

The problem

Take the simple scenario where you have your Liberty instance running on z/OS image with IP address MVS1 port 9443.

Your web browser uses MVS1:9443/ibmmq/console to initiate the sign-on.  As part of the handshake Liberty sends down its certificate with the Subject Alternative Name(SAN) of MVS1.    The browser checks that this SAN is the same as where the Liberty instance is running and, as it matches, the logon proceeds.

You want to shut down that LPAR, and run the work on another LPAR,  MVS2.    The Liberty instance starts up, binds to TCPIP and waits for requests.   The web browser connects to MVS2, and Liberty sends down the certificate with the SAN of MVS1.  As MVS1 does not match MVS2, the browser complains saying that someone could be stealing your information.
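The check the browser makes can be pictured with a very simplified Python sketch.  Real browsers do much more (wildcards, certificate chains, expiry dates and so on); this only compares the host you asked for with the names in the certificate, and san_matches is a hypothetical helper.

```python
def san_matches(requested_host: str, certificate_sans: list) -> bool:
    """True if the host the browser asked for is listed in the certificate."""
    wanted = requested_host.lower()
    return any(wanted == san.lower() for san in certificate_sans)

print(san_matches("MVS1", ["MVS1"]))  # True  - Liberty running on MVS1
print(san_matches("MVS2", ["MVS1"]))  # False - same certificate, moved to MVS2
```

The second call is the problem case: the certificate has not changed, but the address used to reach the server has.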

You could have a certificate for each LPAR, but this is additional administration overhead.

You also have the problem of your browser getting to the different IP address.  This could be a different URL, update to the DNS server, a change of router configuration, or change your work station to direct the request to a different place.

The solution

This has been solved using Virtual IP Addressing or VIPA.   In simple terms, give the Web Server its own IP address, which can move around between different LPARs in a Sysplex.

This area has a lot of new, complex jargon.  You have Static VIPAs, Dynamic VIPAs, and Distributed VIPAs .  The TCP documentation is not bad, but it focuses on TCP/IP, not how it will be used.   The documentation has example configurations, but one configuration covers many scenarios.  I was looking for  a simple, getting started example.

There are a couple of ways of setting up the configuration.

  • The simplest scenario, use VIPARANGE.
    • Liberty has its own IP address which can be activated on different TCP images
    • Once set up, when the web server binds to TCPIP, the IP address is created on the TCPIP image.
    • When the VIPA is active, the TCPIP image will listen for the request.  I had two TCPIP images listening on the same OSA connection.   The connections went to one TCPIP.   When I stopped that TCPIP, the connections automatically went to the other TCPIP.  When I stopped both TCPIP the client got “No route to host (Host unreachable)”.
    • You can have a web server with the same IP address running on different TCPIP images at the same time (with different configuration files).
  • Use Sysplex Distributor.   This has a front end IP which takes connection requests and distributes them to TCPIP images.   It can do this using “Hot Standby”, “round robin” and other techniques.  If the front end is shut down, you can configure other TCPIP images to take over.

How it works – TCPIP (Background knowledge needed to understand how to configure a VIPA)

When I was a child, I had a series of books called “How it works…”, for example “How it works – the motor car”, “How it works – Nuclear Physics”.  As I’ve been working on making Liberty highly available, and using Virtual IP Addresses (VIPAs), I’ve realised that I had holes in my knowledge about TCPIP.    There are many books about how TCP/IP works, but they do not provide the information in an easily digested format – and often go too deep too quickly.   So this blog post is my view on what you need to know to understand VIPAs etc.

I’ll only consider TCPIP V4.

Topics covered

  • IP connections
  • Subnet mask
  • How applications use TCPIP
  • How applications can bind to a specific IP address and port
  • On z/OS
    • How to issue TCPIP commands on z/OS
    • What is the IP address of my TCPIP image?
    • What routing is there on my TCPIP image
  • On Linux
    • What IP address does my Ubuntu machine have
    • What routing is there on my Ubuntu machine
    • What is the routing for a particular IP address ?

Some IP basics.

  1. Every connection has an IP address at each end.  An address looks like 10.3.4.15 or 4 * 8 bit numbers.
  2. You can use a name instead of a number, so you could have MVS1.SSS.COM.  To convert this to an IP address you call a Domain Name Server (DNS).  You pass it MVS1.SSS.COM and get back 10.3.4.15.
  3. My machine has several connections (logical bits of wire connected to the back of the machine), Ethernet, wireless, and a tunnelling connection to z/OS. Each connection has a different IP address.
  4. Packets get routed through the network depending on the destination IP address.  The router has logic like: packets going to 10.4.5.* go down this connection, packets for 17.2.2.* go down that connection, any other packets – try sending them down the connection 11.13.6.6.
  5. The router uses a netmask to calculate which connection to use.
    1. A net mask is a string of 1’s followed by 0s.  For example 255.255.255.0 – or 3 * 8 =24 ones.
    2. A router takes a packet IP address and a netmask and logically ands them together, and uses the result to decide where to route the packet.
    3. A connection handling 10.4.1.0 to 10.4.1.255 would have a netmask of 255.255.255.0 (also written /24, for 24 bits).  A default connection may handle all packets for 10.* with a netmask of 255.0.0.0 or /8.
  6. Multiple z/OS LPARs can be attached to an OSA Adapter (think of it as Ethernet with more function), they can all be listening for an IP address – only one LPAR will get the data.  If that LPAR goes down, another LPAR will get the data.

How applications use TCPIP

You have a network connection (for example wireless) which connects your machine to another machine.  On each machine applications use a port.

When your application talks to another application it establishes a session with the IP address:port.

Applications that connect this way include web servers, web browsers, 3270 emulators and FTP.

If your application is a server it may bind to a specific port, if not your application can say give me any free port.  A port can be set up, so it is shared, so two servers can listen to connection requests on it.  Only one will get the connection request.

A server application can say I am interested in traffic on port 9443 – coming in over a specific IP address, or coming in over any address.
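To make the difference concrete, here is a minimal Python sketch of a server listening on one specific address, and another listening on any address.  It uses the loopback address 127.0.0.1 and port 0 (any free port) so it can run anywhere; on z/OS the specific address would be something like the VIPA.  The helper name listen_on is my own.

```python
import socket

def listen_on(host: str, port: int) -> socket.socket:
    """Create a TCP listener bound to the given address and port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(1)
    return listener

# Port 0 asks the stack for any free port.
# 127.0.0.1 means one specific address; 0.0.0.0 means any address.
with listen_on("127.0.0.1", 0) as specific, listen_on("0.0.0.0", 0) as any_address:
    print("specific address:", specific.getsockname())
    print("any address:", any_address.getsockname())
```

The port numbers printed will differ from run to run, because the stack picks whichever ports are free.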

How does an application specify a bind value?

A Java application can issue a request for a specific port and IP address.

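ServerSocket listener = factory.createServerSocket(port, 1, InetAddress.getByName(host));

where port is 9443, 1 is the backlog, and host is “10.1.3.7”.  The third parameter has to be an InetAddress, so the host string is converted with InetAddress.getByName.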

You can also configure this in the TCPIP parameters

PORT
9443 TCP * SHAREPORT BIND 10.1.3.7

You can also control which applications can use which ports by using the SAF resname and the RACF profile

EZB.PORTACCESS.sysname.tcpname.resname.

Changing TCPIP configuration on z/OS

The startup configuration for a TCPIP instance is in the JCL PROFILE  ddname,  or a file like TCPIP.PROFILE.

You can change the configuration of a TCPIP image using the operator command

V TCPIP,TCPIPn,OBEY,filename

Where

  1. V TCPIP tells z/OS to route this TCPIP
  2. TCPIPn is the name of the TCPIP address space to direct the command to, for example V TCPIP,TCPIP2.  If there is only one TCPIP running you can use V TCPIP,,
  3. OBEY is the TCPIP command
  4. filename is the parameter passed to the OBEY command.   It is the file containing the commands/configuration to be executed.

How to display information on z/OS

There are three ways of displaying TCPIP information, for example the IP address(es) of the TCP image

  1. The operator command D TCPIP,TCPIP2,NETSTAT,HOME
  2. The TSO command NETSTAT HOME TCP TCPIP2
  3. The USS command netstat -h -p tcpip

The commands are similar to, but different from, the Linux commands!  The output is usually similar between the three.

What is the IP address of my z/OS TCPIP image?

From the TSO NETSTAT HOME command

EZZ2350I MVS TCP/IP NETSTAT CS V2R4 TCPIP Name: TCPIP2 17:15:53
EZZ2700I Home address list:
EZZ2701I Address   Link         Flg
EZZ2702I -------   ----         ---
EZZ2703I 10.1.1.3  ETH1         P
EZZ2703I 10.1.2.3  ETHB
EZZ2703I 172.1.1.2 EZASAMEMVS
EZZ2703I 10.1.3.10 VIPL0A01030A I
EZZ2703I 127.0.0.1 LOOPBACK

For the links

  1. I configured link ETH1 and ETHB.
  2. The name VIPL0A01030A is the prefix VIPL followed by the VIPA IP address converted to hex, so 10.1.3.10 becomes VIPL 0A 01 03 0A
  3. EZASAMEMVS is prefix EZA and “SAME MVS”.   This is generated by TCPIP from the DYNAMIXCF configuration.
  4. You always get a LOOPBACK address at 127.0.0.1
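The hex conversion for the VIPA link name can be checked with a small Python sketch.  The function name vipa_link_name is mine, and the rule it implements (VIPL followed by the four address bytes in hex) is simply what I observed in the NETSTAT HOME output above, not something taken from the TCPIP documentation.

```python
import ipaddress

def vipa_link_name(address: str) -> str:
    """Build a name of the form VIPL + the IPv4 address as 8 upper-case hex digits."""
    packed = ipaddress.IPv4Address(address).packed  # the address as 4 bytes
    return "VIPL" + packed.hex().upper()

print(vipa_link_name("10.1.3.10"))  # VIPL0A01030A
```

Going the other way is useful when you see an unfamiliar VIPL name in a display: split the last 8 characters into pairs and convert each pair from hex.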

What routing is there on z/OS TCPIP?

The TSO command NETSTAT ROUTE TCP TCPIP2 or the USS command netstat -r -p tcpip gives

MVS TCP/IP NETSTAT CS V2R4 TCPIP Name: TCPIP2 16:15:43 
Destination  Gateway  Flags Refcnt     Interface 
----------- -------   ----- ------     --------- 
Default      10.1.1.1 UGS   0000000000 ETH1 
10.0.0.0/8   0.0.0.0  US    0000000000 ETH1 
10.1.1.3/32  0.0.0.0  UH    0000000000 ETH1 
10.1.2.0/24  0.0.0.0  US    0000000000 ETHB 
10.1.2.3/32  0.0.0.0  UH    0000000000 ETHB 
127.0.0.1/32 0.0.0.0  UH    0000000000 LOOPBACK 
172.1.1.1/32 0.0.0.0  UHS   0000000000 EZASAMEMVS 
172.1.1.2/32 0.0.0.0  UH    0000000000 EZASAMEMVS 
172.1.1.3/32 0.0.0.0  UHS   0000000000 EZASAMEMVS

This shows that traffic for 10.1.2.0 to 10.1.2.255 (with a netmask of /24 or 255.255.255.0) goes by link (interface) ETHB.
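To see how a destination address is matched against a table like this, here is a sketch in Python.  It assumes the usual rule that the most specific matching route (the longest prefix) wins, and it ignores the Flags and Gateway columns, so treat it as a simplified model rather than a description of what the z/OS stack does internally.  The name pick_route is mine.

```python
import ipaddress

# Destination and interface columns copied from the NETSTAT ROUTE output above.
# "Default" is written here as 0.0.0.0/0.
ROUTES = [
    ("0.0.0.0/0", "ETH1"),
    ("10.0.0.0/8", "ETH1"),
    ("10.1.1.3/32", "ETH1"),
    ("10.1.2.0/24", "ETHB"),
    ("10.1.2.3/32", "ETHB"),
    ("127.0.0.1/32", "LOOPBACK"),
    ("172.1.1.1/32", "EZASAMEMVS"),
    ("172.1.1.2/32", "EZASAMEMVS"),
    ("172.1.1.3/32", "EZASAMEMVS"),
]

def pick_route(destination: str) -> tuple:
    """Return (route, interface) for the most specific route containing destination."""
    dest = ipaddress.IPv4Address(destination)
    matches = [
        (ipaddress.IPv4Network(net), interface)
        for net, interface in ROUTES
        if dest in ipaddress.IPv4Network(net)
    ]
    # The route with the longest prefix is the most specific match
    net, interface = max(matches, key=lambda match: match[0].prefixlen)
    return str(net), interface

print(pick_route("10.1.2.77"))    # matches 10.1.2.0/24, so ETHB
print(pick_route("10.1.3.10"))    # only 10.0.0.0/8 and Default match, so ETH1
print(pick_route("192.168.1.1"))  # nothing specific matches, so the Default route
```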

What configuration does Ubuntu have?

There are many commands to display network configuration information on Linux, for example ip and the older, superseded command, ifconfig.

What address does Ubuntu have?

ip address gives a lot of information – but I did not use it

What packet routing does my desktop have?

The command ip route gives

  1. 10.1.0.0/24 dev eno1 proto kernel scope link src 10.1.0.3 metric 100
  2. 10.1.1.0/24 dev tap0 proto kernel scope link src 10.1.1.1
  3. 10.1.2.0/24 dev tap1 proto kernel scope link src 10.1.2.1
  4. 10.1.3.0/24 dev tap0 scope link
  5. 10.20.2.4 dev tap0 scope link
  6. 192.168.1.0/24 dev wlxd037450ab7ac proto kernel scope link src 192.168.1.67 metric 600

Line 2 shows

  • Traffic for any address between 10.1.1.0 and 10.1.1.255 (remember the netmask /24 means 24 bits or 255.255.255.0) goes to device (connection) tap0
  • The IP address for the desktop end of the connection is 10.1.1.1

Line 4 shows

  • that any traffic 10.1.3.0 to 10.1.3.255 goes to device tap0

The command used to set this up was sudo ip route add 10.1.3.0/24 dev tap0

Line 5 shows

  • that traffic to 10.20.2.4 goes to device tap0.

The command used to set this up was sudo ip route add  10.20.2.4 dev tap0

What is the routing for a particular IP address?

You can use the traceroute command to display which route a packet would take. For example

  • traceroute 10.1.3.10
    • traceroute to 10.1.3.10 (10.1.3.10), 30 hops max, 60 byte packets
      1 colins machine(10.1.1.1) 30 ms !H 30 ms !H 30 ms !H

This shows the route to 10.1.3.10 went to the connection with IP address 10.1.1.1

For a connection that is not defined

traceroute 10.20.2.5 
traceroute to 10.20.2.5 (10.20.2.5), 30 hops max, 60 byte packets
1 bthub.home (192.nnn.1.mmm) 3.170 ms 4.742 ms 6.379 ms
2 * * *

So we can see it went to my bt hub  wireless router.

You can also use the ping command.  On Linux the -R option records and displays the route.

ping -R 10.1.3.10 
PING 10.1.3.10 (10.1.3.10) 56(124) bytes of data.
64 bytes from 10.1.1.2: icmp_seq=1 ttl=64 time=2.54 ms
NOP
    RR: 10.1.1.1
        10.1.1.2
        10.1.1.1

The request went to 10.1.1.1.  10.1.1.2 caught it, and sent the reply back, via 10.1.1.1

I was looking for my VIPA address, 10.1.3.10, and we can see it got to 10.1.1.2.

For the ping to work, there must be a server processing the ping request.  If there are no applications processing the VIPA, the VIPA is not active, so a ping will fail.

A successful ping to a VIPA address means a packet can get to the LPAR, be processed, and the reply sent back.  If the ping does not respond it could be

  1. The VIPA is not active
  2. The VIPA is active and a packet was sent to the LPAR hosting the VIPA, but it could not send a response back due to a set up error.

 

Setting up a highly available web server is a “yes – but” problem

I’ve been setting up a Liberty web server, as used in MQWEB, z/OS Connect, z/OSMF and so on, and was looking into how to make this available, so I could move the web server to a different LPAR or TCP/IP instance.

Moving it should be easy – it is  – but …  but there are things you need to think about. It is a bit like going around a maze trying to find the solution.

How do I get to the fail over system?

You start the web server on a different LPAR in the sysplex. How can you support this to allow your browser to get to the backend, without changing the URL?

You have two choices.

  1. You change your DNS look up, or router, so your request goes via a different connection (think different bit of wire) to the failover LPAR. These changes can be automated to some extent.
  2. Multiple z/OS images can listen for an IP address.

These work but…

The certificate sent down from the web server contains the address of the LPAR as part of the SAN (Subject Alternative Name).  When the browser processes it, it compares the LPAR address in the certificate with the address it used to connect.  If they do not match the browser produces an error message.

How do I get over the certificate SAN and the IP address difference?

You have a couple of choices

  1. Use a unique certificate on each LPAR.    Yes this works, but there is more administrative work to set up.  You could set up two web servers and only use one at a time.   This works, but it is unnecessary work.
  2. Use a Virtual IP address.   In TCP networking the end of every connection is a “device” or system with its unique IP address.   You can give the web server its own IP address which is “virtual” as it is not a device or system.  With this, when you start your web server on a different LPAR, it has the same IP address.  To use this you have to configure z/OS to support it.  You can set this up
    1. To support multiple web servers, and distribute the work to them
    2. Have a hot standby
    3. To route traffic to where the web server has started.

Yes, these work, but it is not easy to set up.  I’ll be blogging how to do this.

Defining a second TCPIP stack on z/OS on zPDT

I wanted a second TCPIP stack on my z/OS because I wanted to test it with MQWEB.   There is good documentation, but it is hidden away and not all in one place.
This took me about half a day to set up, including several IPLs, but I am on my own z/OS zPDT image so this was not a problem.  It takes a while to understand the definitions – it is another one of “this points to that which points to something else…”.   You need to be able to copy a definition rather than use the books to create it from nothing.

I’ll describe setting up TCPIP2.

Overall I was surprised at how easy this bit was to set up.

The work breaks into

  • setting up the connectivity from Linux to z/OS
  • setting up the second TCP stack
    • Configure sys1.parmlib member and IPL
    • Define the new TCPIP procedure
    • Configure the new TCPIP configuration
    • Allowing people to use the TCPIP stack

Both of these need an IPL of z/OS, so you could do all of the customising and IPL afterwards at the end.

I’ll cover sharing an existing OSA adapter and setting up a new OSA adapter.

Sharing an existing OSA adapter.

Copy ADCD.Z24A.VTAMLST(OSATRL2) to USER.Z24A.VTAMLST(OSATRL2) and make the changes described below

OSATRL1 VBUILD TYPE=TRL 00010000
OSATRL1E TRLE LNCTL=MPC,READ=(0400),WRITE=(0401), X00020007
               DATAPATH=(0402,0404,0406),         X00021013
               PORTNAME=PORTA,                    X00022004
               MPCLEVEL=QDIO                       00023005
*SATRL2E TRLE LNCTL=MPC,READ=(0404),WRITE=(0405),DATAPATH=(0406), X00024011
* PORTNAME=PORTB, X00025011
* MPCLEVEL=QDIO 00026011

I changed

  • DATAPATH=(0402) to DATAPATH=(0402,0404,0406)  – note every other address.    With 0402,0403 etc in the list, the second TCP failed to work, with messages like
    • EZZ4310I ERROR: CODE=80100040 REPORTED ON DEVICE PORTA. DIAGNOSTIC CODE: 03
    • EZZ4309I ATTEMPTING TO RECOVER DEVICE PORTA
    • IST1222I DATA DEVICE 0403 IS INOPERATIVE, NAME IS PORTA
    • IST1578I DEVICE INOP DETECTED FOR PORTA BY ISTTSCMA CODE = 104
  • Commented out/deleted the second TRLE definition

The zPDT devmap needs to have OSA definitions for these

name awsosa 0009 --path=A0 --pathtype=OSD --tunnel_intf=y # QDIO mode
device 400 osa osa --unitadd=0
device 401 osa osa --unitadd=1
device 402 osa osa --unitadd=2
device 403 osa osa --unitadd=3
device 404 osa osa --unitadd=4
device 405 osa osa --unitadd=5
device 406 osa osa --unitadd=6

I created a file USER.Z24A.TCPPARMS(T2OSA)

DEVICE PORTA  MPCIPA 
LINK ETH1  IPAQENET PORTA 
START PORTA 
HOME 10.1.1.3 ETH1 

and put

include USER.Z24A.TCPPARMS(T2OSA)

into my tcpip2 startup.

Putting the definitions in a PDS member means I can use

V TCPIP,TCPIP2,OBEY,USER.Z24A.TCPPARMS(T2OSA)

to activate them.

I re-IPLed the system to pick up the VTAM changes.

Once I had started TCPIP and TCPIP2 the command d net,id=OSATRL1E gave

D NET,ID=OSATRL1E
IST097I DISPLAY ACCEPTED
IST075I NAME = OSATRL1E, TYPE = TRLE 466
IST486I STATUS= ACTIV, DESIRED STATE= ACTIV
IST087I TYPE = LEASED , CONTROL = MPC , HPDT = YES
IST1954I TRL MAJOR NODE = OSATRL2
IST1715I MPCLEVEL = QDIO MPCUSAGE = SHARE
IST1716I PORTNAME = PORTA LINKNUM = 0 OSA CODE LEVEL = 7617
IST2337I CHPID TYPE = OSD CHPID = A0 PNETID = **NA**
IST1577I HEADER SIZE = 4096 DATA SIZE = 0 STORAGE = ***NA***
IST1221I WRITE DEV = 0401 STATUS = ACTIVE STATE = ONLINE
IST1577I HEADER SIZE = 4092 DATA SIZE = 0 STORAGE = ***NA***
IST1221I READ DEV = 0400 STATUS = ACTIVE STATE = ONLINE
IST924I -------------------------------------------------------------
IST1221I DATA DEV = 0403 STATUS = ACTIVE STATE = N/A
IST1724I I/O TRACE = OFF TRACE LENGTH = *NA*
IST1717I ULPID = TCPIP ULP INTERFACE = PORTA
...
IST1221I DATA DEV = 0404 STATUS = ACTIVE STATE = N/A
IST1724I I/O TRACE = OFF TRACE LENGTH = *NA*
IST1717I ULPID = TCPIP2 ULP INTERFACE = PORTA
IST2310I ACCELERATED ROUTING DISABLED
IST924I -------------------------------------------------------------
IST1221I DATA DEV = 0405 STATUS = RESET STATE = N/A
IST1724I I/O TRACE = OFF TRACE LENGTH = *NA*
IST924I -------------------------------------------------------------
IST1221I DATA DEV = 0406 STATUS = RESET STATE = N/A
IST1724I I/O TRACE = OFF TRACE LENGTH = *NA*
IST924I -------------------------------------------------------------
IST1500I STATE TRACE = OFF

Setting up the connectivity from Linux to z/OS using a second OSA adapter

You need to set up an interface from Linux to z/OS via an Open Systems Adapter (OSA).

TCP/IP Interfaces are used to tunnel from Linux to z/OS.  These have names like tap0, tap1;  they tie up with z/OS paths and devices.  The Linux device drivers implement the QDIO protocol, a simpler and faster protocol than traditional z/OS channels.

Identify the path and devices to be used.

The zPDT find_io command gave me

 FIND_IO for "colin@colin-ThinkCentre-M920s" 

      I/face  Cur          MAC                IPv4          IPv6
Path  Name    State        Address            Address       Address
----  ------  -----------  -----------------  ------------  -------------
F0    eno1    UP, RUNNING  00:d8:...          10.1.0.3      fe80:...%eno1
F1    wlxd..  UP, RUNNING  d0:37:...          192.168.1.67  2a00:...6cab
.
A0    tap0    UP, RUNNING  9e:30:...          10.1.1.1      fe80:... %tap0
A1    tap1    UP, RUNNING  7e:66:...          10.1.2.1      fe80:... %tap1
A2    tap2    DOWN         02:a2:a2:a2:a2:a2  *             *

We can see from this the IP addresses being used;  channel paths A0, A1 are in use by tunneling; channel path A2 is available.

In the zPDT devmap I set up

[manager] # tap0 define network adapter (OSA) for communication with Linux
name awsosa 0009 --path=A0 --pathtype=OSD --tunnel_intf=y # QDIO mode
device 400 osa osa --unitadd=0
device 401 osa osa --unitadd=1
device 402 osa osa --unitadd=2

[manager] # tap1 define network adapter (OSA) for communication with Linux
name awsosa 0010 --path=A1 --pathtype=OSD --tunnel_intf=y --tunnel_ip=10.1.2.1 --tunnel_mask=255.255.255.0 # QDIO mode
device 408 osa osa --unitadd=0
device 409 osa osa --unitadd=1
device 40a osa osa --unitadd=2

Where the paths tie up with the output from the find_io.

Each connection needs 3 consecutive devices, for example 408,409,40a.

On z/OS use the command D U,CTC to find which devices are available.  I think (I am not sure) that the first device has to end in 0 or 8.

I have

UNIT TYPE STATUS 
0400 OSA A-BSY 
0401 OSA A 
0402 OSA A-BSY 
0403 OSA OFFLINE 
0404 OSA OFFLINE 
0405 OSA OFFLINE 
0406 OSA OFFLINE 
0407 OSA OFFLINE 
0408 OSA A-BSY 
0409 OSA A 
040A OSA A-BSY

Once you have selected the OSA addresses to use, and configured the devmap file, you will need to restart zPDT with the updated devmap – but you also need to customise z/OS and IPL, so do not IPL just yet.

Z/OS work for setting up the second TCP stack

Some basic terminology and concepts.

  • There is a network domain AF_INET which programmers use via sockets to communicate with the network.   (There is another network domain AF_UNIX for Unix programming).
  • You have to configure the domain, for example how many concurrent sessions it can support.
  • Originally you could have only one TCP stack in the environment.   This used an interface called INET, which does not support more than one TCP/IP stack.
  • A new interface was developed, Common INET (CINET). Conceptually this sits in front of TCP/IP and routes packets to the TCPIP subsystems.
  • To be able to use multiple stacks, CINET needs to be used instead of INET.
  • These are customised in SYS1.PARMLIB(BPXPRMxx).

Customise sys1.parmlib(BPXPRMxx) member

For example

FILESYSTYPE TYPE(INET) ENTRYPOINT(EZBPFINI) 

SUBFILESYSTYPE NAME(TCPIP) 
     TYPE(INET) 
     ENTRYPOINT(EZBPFINI) 

NETWORK DOMAINNAME(AF_INET) 
     DOMAINNUMBER(2) 
     MAXSOCKETS(64000) 
     TYPE(INET) 
     INADDRANYPORT(5555) 
     INADDRANYCOUNT(1000)

Change TYPE(INET) to TYPE(CINET) in 3 places, and change ENTRYPOINT(EZBPFINI) to ENTRYPOINT(BPXTCINT)

Add the new TCPIP address space

SUBFILESYSTYPE NAME(TCPIP2) 
     TYPE(CINET) 
     ENTRYPOINT(EZBPFINI)

This change needs an IPL to activate (or possibly a SETOMVS RESET=(xx)).   I do not know what else the change from INET to CINET affects, so check with IBM before implementing it.

Define the TCPIP2 procedure

  • I defined a new profile in the STARTED class to map TCPIP2 to a userid.   I used the same userid as for TCPIP.
  • I copied the TCPIP procedure from TCPIP to TCPIP2.
  • The TCPIP procedure refers to TCP configuration,
    • //PROFILE  … DSN=SYS1.TCPPARMS(PROF) and
    • //SYSTCPD …. DSN=SYS1.TCPPARMS(TCPDATA).
  • Create your own copies of these, for example copy them to USER.TCPPARMS, and rename the members to PROF2, and TCPDATA2

Create a VTAM definition for the tunnelling connection for a second OSA adapter

 

If you are using a second OSA adapter, you need to create a VTAM member to map from the OSA device to a TCP/IP name using MPC.  This is Multi Protocol Channel, using the QDIO protocol, which is a simpler and faster protocol than traditional z/OS channels.

Create a member in VTAMLST for a Transport Resource List major node, for example OSACOLIN.

----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8
OSA5 VBUILD TYPE=TRL 
OSATRL5E TRLE LNCTL=MPC,READ=(0408),WRITE=(0409),DATAPATH=(040A),      X
               PORTNAME=PORTZ,                                         X
               MPCLEVEL=QDIO

Note the format, continuation ‘x’ in column 72, and continuation text in column 16.
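Getting the columns right by hand is fiddly, so here is a sketch in Python that lays out a statement using the two rules noted above (continuation character in column 72, continuation text starting in column 16).  The function name vtam_statement is mine, and it knows nothing else about VTAM syntax; it only does the padding.

```python
def vtam_statement(name: str, operation: str, operands: list) -> list:
    """Lay out a definition statement, one operand group per line, with continuations."""
    lines = []
    for index, operand in enumerate(operands):
        if index == 0:
            text = f"{name} {operation} {operand}"
        else:
            text = " " * 15 + operand          # continuation text starts in column 16
        if index < len(operands) - 1:
            text += ","
            if len(text) > 71:
                raise ValueError(f"line too long to continue: {text}")
            text = text.ljust(71) + "X"        # continuation character in column 72
        lines.append(text)
    return lines

for line in vtam_statement(
    "OSATRL5E",
    "TRLE",
    ["LNCTL=MPC,READ=(0408),WRITE=(0409),DATAPATH=(040A)",
     "PORTNAME=PORTZ",
     "MPCLEVEL=QDIO"],
):
    print(line)
```

Every line except the last comes out exactly 72 characters long, which is an easy thing to check if VTAM rejects the member.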

You can use V NET,ACT,ID=OSACOLIN  to activate it. If you use D NET,ID=OSACP,E it should find it, and report it is active.

You can use D NET,TRL  to display the status of the  links.

Configure TCPPARMS(PROF2)

I used a copy of the TCPIP(PROF) as my starting configuration.

I commented out all of the lines between AUTOLOG and ENDAUTOLOG.

I went down to the DEVICE and BEGINROUTE section and used

DEVICE PORTZ MPCIPA
LINK ETHZ IPAQENET PORTZ
; end of link and device definitions
;
HOME 10.1.2.2 ETHZ
;
BEGINRoutes
;            Destination SubnetMask FirstHop LinkName Size
ROUTE        DEFAULT               10.1.2.1  ETHZ     MTU 1492
ENDRoutes
; start it when TCP/IP starts
START PORTZ

Where

  • DEVICE PORTZ MPCIPA  –  MPCIPA says this is an OSA QDIO device, and uses the PORTZ definition.  PORTZ was defined above in the VTAMLST(OSACP).
  • LINK ETHZ IPAQENET PORTZ –  this creates a LINK ETHZ associated with DEVICE PORTZ in the line above. It uses the interface type IPAQENET, which is for IP V4 and device OSA QDIO.   (There is IPAQENET6 for IP V6 for OSA QDIO).
  • HOME 10.1.2.2 ETHZ –  this is the address for traffic coming in over ETHZ (via PORTZ, and back to the tap1 which was defined with --tunnel_ip=10.1.2.1).   A ping 10.1.2.2 should come in over this interface.  For the first OSA adapter this had 10.1.1.2.
  • BEGINRoutes
    ;      Destination SubnetMask FirstHop LinkName Size
    ROUTE  DEFAULT                10.1.2.1 ETHZ     MTU 1492
    ENDRoutes

    • Any traffic without a more specific route is sent to the first hop 10.1.2.1 via link ETHZ, using a packet size of up to 1492 bytes.
  • START PORTZ – get it working

Edit the TCPDATA

For sharing an OSA or using a new OSA, I edited the TCPDATA2 file and added

TCPIPJOBNAME TCPIP2
S0W1: HOSTNAME S0W1COL
DOMAINORIGIN COLIN.HOST.COM
DATASETPREFIX TCPIP
NSPORTADDR 53
RESOLVEVIA UDP
LOOKUP LOCAL
ALWAYSWTO YES

I don’t know which of these are important.  I changed the bold lines to match my name.

RACF profile changes

You have to set up a security  profile before an application can connect to TCPIP and listen on a socket.  MQWEB got EDC5112I Resource temporarily unavailable. (errno2=0x74610296)

rdefine SERVAUTH EZB.INITSTACK.*.TCPIP2  from(EZB.INITSTACK.*.TCPIP)

Using from(…) as above copies the permissions from the base profile.   You can allow more users using

permit EZB.INITSTACK.*.TCPIP2 class(SERVAUTH) id(START1) access(READ)

The “*” is for any system in the sysplex, so you could have EZB.INITSTACK.MVSA.TCPIP2 and allow access to TCPIP2 on system MVSA, but not from another MVS system.
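As a way of picturing how the “*” qualifier behaves, here is a small Python sketch.  It is deliberately simplified: it only handles a “*” standing for exactly one qualifier, as in the profiles above, and it is not a model of the full RACF generic profile rules.  The function name profile_covers and the system name MVSB are mine.

```python
def profile_covers(profile: str, resource: str) -> bool:
    """True if every qualifier of the profile is '*' or equals the resource's qualifier."""
    profile_parts = profile.split(".")
    resource_parts = resource.split(".")
    if len(profile_parts) != len(resource_parts):
        return False
    return all(p == "*" or p == r for p, r in zip(profile_parts, resource_parts))

# The generic profile covers TCPIP2 on any system
print(profile_covers("EZB.INITSTACK.*.TCPIP2", "EZB.INITSTACK.MVSA.TCPIP2"))
# A profile naming MVSA does not cover the same stack name on another system
print(profile_covers("EZB.INITSTACK.MVSA.TCPIP2", "EZB.INITSTACK.MVSB.TCPIP2"))
```

The first call prints True and the second prints False, which is the difference between allowing access on any system in the sysplex and allowing it on MVSA only.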

You can protect TCPIP2 for example protect the NETSTAT command

RDEFINE SERVAUTH (EZB.NETSTAT.*.TCPIP2.*) UACC(NONE)
PERMIT (EZB.NETSTAT.*.TCPIP2.*) ACCESS(READ) CLASS(SERVAUTH) ID(TCPADMIN)
SETROPTS GENERIC(SERVAUTH) REFRESH 

Check it out

You can use the Linux netstat -i command to display the interfaces defined to Linux.  On my Linux  I got

colin@colin-ThinkCentre-M920s:/home/zPDT$ netstat -i 
Kernel Interface table
Iface     MTU   RX-OK ... Flg
eno1     1500   84758 ... BMRU
lo      65536  188855 ... LRU
tap0     1500       6 ... BMRU
tap1     1500      25 ... BMRU
wlxd0374 1500   10545 ... BMRU

z/OS commands

D TCPIP – displays the TCP address spaces in the LPAR

D tcpip,tcpip2,netstat,home gave
EZZ2500I NETSTAT CS V2R4 TCPIP2 540
HOME ADDRESS LIST:
ADDRESS LINK FLG
10.1.2.2 ETHZ P
127.0.0.1 LOOPBACK

Using TCPIP2 from Liberty web server

I added

_BPXK_SETIBMOPT_TRANSPORT=TCPIP2

to the server.env file, and restarted Liberty

I connected from my web browser to MQWEB using 10.1.2.2:9443, and got the messages

Your connection is not private
Attackers may be trying to steal your information from 10.1.2.2
NET::ERR_CERT_COMMON_NAME_INVALID

The NET::ERR_CERT_COMMON_NAME_INVALID message is because the certificate had a Subject Alternative Name with a different IP address, 10.1.1.2, whereas the traffic came from 10.1.2.2.

This was what I expected.
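The check the browser makes can be pictured with a short Python sketch.  This is a simplification I have written for illustration (no wildcards, no certificate parsing, and the name san_matches is mine); real browsers do considerably more.

```python
import ipaddress

def san_matches(requested_host: str, san_ips: list, san_dns: list) -> bool:
    """True if the host the client connected to appears in the certificate's SAN entries."""
    try:
        wanted = ipaddress.ip_address(requested_host)
    except ValueError:
        # Not an IP address, so compare it with the DNS names, ignoring case
        return requested_host.lower() in {name.lower() for name in san_dns}
    return any(wanted == ipaddress.ip_address(ip) for ip in san_ips)

# The certificate was created for 10.1.1.2, but the browser connected to 10.1.2.2
print(san_matches("10.1.2.2", ["10.1.1.2"], []))   # False, so the browser complains
# Connecting to the address in the certificate is accepted
print(san_matches("10.1.1.2", ["10.1.1.2"], []))   # True
```

This is why moving the server to a different IP address needs either a certificate containing the new address, or an address (such as a VIPA) that moves with the server.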

Customising for MQWEB Liberty on z/OS, things the documentation does not tell you about

This post covers the customising you need to consider for enterprise use of the Liberty MQWEB server.  It covers

  • Setup the USS path and defining an alias for the mq executable’s  directory
  • Do you have common configuration across mqwebuser.xml files?
  • Decide if you want to use setmqweb.
  • Setting up the server’s certificate and keyring
  • Setting up the trust store
  • Setting up the Angel process(es)
  • Reserving the TCP/IP Port number
  • Customising the jvm.option
    • To prevent the web server coming up if the Angel process is missing
    • Setting the time zone
  • Customising the mqwebuser.xml
    • SAF definitions
    • Setting the log sizes so the logs can be viewed
  • Letting requests in from outside of the LPAR
  • dspmqweb/setmqweb – which instance to use?
  • Selecting which IP stack to use
  • Customising ISPF option 3.17 – Unix Directory List

Setup the USS path and defining an alias for the mq executable’s directory

To be able to use the dspmqweb and setmqweb commands you need to point to the command location.

You can add to the user’s .profile file, or the /etc/profile the statement

export PATH=/usr/lpp/mqm/V9R1M1/web/bin:$PATH

If you have multiple releases of MQ in your environment you could set up shell commands like v913dspmqweb.sh

/usr/lpp/mqm/V9R1M3/web/bin/dspmqweb "$@"

But this causes extra work when you need to migrate to the new release.  It might be better to set up an alias

ln -s  /usr/lpp/mqm/V9R1M3/web/bin /v913
ln -s  /usr/lpp/mqm/V9R1M3/web/bin /mqcur

so you just need to type /v913/dspmqweb or /mqcur/setmqweb

As part of the migration to a newer release you just change the alias.

Do you have common configuration across mqwebuser.xml files?

If you have multiple mqweb instances, either because you have multiple LPARs in a sysplex, or you have to support different releases of MQ concurrently, you may want to put common configuration in an include file. For example, create a file common.xml to hold the configuration and put

<include location="common.xml" optional="false"/>

in the mqwebuser.xml file.

Decide if you want to use setmqweb.

You can update your *.xml configuration files, or use setmqweb to update mqwebuser.xml for you.

Some organisations do not allow manual changes to configuration. You have to change a configuration file, have it reviewed, and use automation to deploy it.

For test systems it may be ok to use the setmqweb command and change things dynamically.

If you make a change using setmqweb, it updates the mqwebuser.xml file, by adding/changing a <variable name="…" value=".."/>  statement.

If you are using SAF authentication and certificate authentication

You will need a keyring with the certificate to identify the server (the key store).  You will need a keyring to identify the certificates you trust (the trust store).  You could use the same keyring for both – but this is not good practice.

The server’s certificate and key store keyring

You need to decide if the MQWEB server uses the same certificate as CICS, WAS and z/OS Connect etc. on the same LPAR.  You could have a common certificate to simplify administration. The certificate needs a Subject Alternative Name, to identify the machine the certificate came from. This can be the DNS name or the dotted address (9.20.4.6) depending on your set up.  It might be easier to define both. Note the RACF command

RACDCERT .. ALTNAME(IP(10.1.1.2) IP(10.1.1.3) DOMAIN('WWW.ME.COM') DOMAIN('WWW.LAST.COM'))…

accepts multiple entries, but only uses the last one of each type. The above command produced a certificate with

Subject's AltNames: 
IP: 10.1.1.3 
Domain: WWW.LAST.COM

This means you may be able to use the certificate only on the LPAR that has been defined (if you move the server to a different LPAR, it will have a different IP address, and your clients will complain – see below).   You may be able to do something clever with VIPA (Virtual IP addressing), where your Sysplex has one IP address and this maps to different IP addresses on each LPAR.

If you have the wrong IP or Domain then the browser gets a message like “Your connection is not private. Attackers may try to steal your information from 10.1.1.2.  NET::ERR_CERT_COMMON_NAME_INVALID”

The trust store keyring.

The trust store keyring has the certificates to authenticate what has been sent from the client.  For example, a copy of any self signed certificate, or the Certificate Authorities of the Web Browser’s certificate.

This keyring could be sysplex wide, and shared by CICS, WAS, z/OS Connect etc – assuming they have the same people connecting to them.

The certificates may have been configured with owner CERTAUTH rather than a userid.

My definitions

<sslDefault sslRef="defaultSSLConfig"/> 
<ssl id="defaultSSLConfig" 
   sslProtocol="TLSv1.2" 
   keyStoreRef="racfKeyStore" 
   trustStoreRef="racfTrustStore" 
   clientAuthenticationSupported="true" 
   clientAuthentication="true" 
   serverKeyAlias="MYMQWEB"/> 

<keyStore filebased="false" id="racfKeyStore" 
   location="safkeyring://START1/KEY" 
   password="password" 
   readOnly="true" 
   type="JCERACFKS"/> 

 <keyStore filebased="false" id="racfTrustStore" 
   location="safkeyring://START1/TRUST" 
   password="password" 
   readOnly="true" 
   type="JCERACFKS"/> 

<webAppSecurity allowFailOverToBasicAuth="false"/>

  • The sslDefault points to the ssl with the same ID
  • The ssl points to
    • the key store with the servers certificate with the id racfKeyStore
    • the trust store to validate connecting clients, with the id racfTrustStore

Create an angel

You need an Angel process to handle the SAF (RACF) security requests – the MQ documentation tells you this.

Typically the Angel started task is started at IPL, and shut down at system shut down.
All instances of Liberty Web Server running on an LPAR can all use the same Angel, for example the z/OSMF angel IZUANG1.

You cannot shut down the Angel process if it is in use, but if you cancel it, the servers using it will stop working (hang) and may abend.

You may want to consider more than one Angel process, and not share it.

When the Angel process has started, it uses no CPU, as the Web Servers execute code within the  Angel address space, on the Web Server’s threads – just like MQ, DB2 etc.

Customise  jvm.options

Stop if there is no Angel  process

If the Angel process is not running at Liberty startup,  then the Web Server may continue to come up.  People will not be authorised to access it, but the Web Server will be running.   This is pretty useless.

You can specify an option so the liberty server (MQWEB) does not start if the Angel task is not running.

I use

-Dcom.ibm.ws.zos.core.angelRequired=true
#-Dcom.ibm.ws.zos.core.angelName=MYANGEL

-Dcom.ibm.ws.zos.core.angelRequired=true

If the angel process is not available then the MQWEB stops when it detects the angel is not available.

#-Dcom.ibm.ws.zos.core.angelName=MYANGEL

If you are using a named Angel, uncomment this and specify the Angel name.

If you are using the unnamed Angel, leave this commented.

Set the time zone

The time zone is picked up from TZ in /etc/profile, but you can override it by specifying

-Duser.timezone=Europe/London

This sets the time-zone of the messages in the message.log and trace.log files.

Reserve the TCP/IP port number

It is a good idea to talk to the networking team and get them to update the TCP/IP configuration for example

PORT 
    20 TCP OMVS NOAUTOLOG ; FTP Server 
    21 TCP OMVS ; FTP Server 
    22 TCP SSHD* ; port for sshd daemon 
    23 TCP TN3270 ; Telnet 3270 Server 
    ...
    1414 TCP CSQ9CHIN ; CSQ9 MQ TCP Listener  
    ...
    9443 TCP MQWEB ; Colin Paice MQWEB

 

Customise mqwebuser.xml

Message log and trace file settings

If the trace or message files are too big, you cannot view them. You have to use edit to look at them, but if the file is too large, browse is substituted and browse does not do code page conversion, so you are looking at raw ASCII characters in an EBCDIC browser.

<variable name="maxTraceFileSize" value="20"/>
<variable name="maxTraceFiles" value="20"/>
<variable name="maxMsgTraceFileSize" value="20"/>
<variable name="maxMsgTraceFiles" value="20"/>

The file size values are in MB.

You should consider keeping your messages.log files for a week or so, so make the number of files large enough.

SAF – Access to RACF

If you are using SAF (RACF or other z/OS security manager) to manage access and authorisation you will have a default entry like

<!-- 
Example SAF Registry 
--> 
<safAuthorization racRouteLog="NONE" id="saf" /> 

<safRegistry id="saf" /> 
<safCredentials unauthenticatedUser="WSGUEST" profilePrefix="MQWEB" 
suppressAuthFailureMessages="false" /> 

I use <safAuthorization racRouteLog="ASIS"… to get RACF violation messages on the joblog during set up.  See here.

<safCredentials suppressAuthFailureMessages="false"…  prints out violation messages.  See here.

Let requests in from outside z/OS

For this to work you have to edit the mqwebuser.xml file and uncomment

<variable name="httpHost" value="*"/> 
<!-- 
-->

By default it only allows requests from the same z/OS system, so browsers do not get access.

dspmqweb/setmqweb – which instance to use?

This page  says you must use

export WLP_USER_DIR=WLP_user_directory

This is fine when you have one mqweb instance on one LPAR.  You might want a shell program to set this every time.  For example,  the program disMQPAweb.sh

export WLP_USER_DIR=/u/mqmweb/MQPA
/usr/lpp/mqm/V9R1M1/web/bin/dspmqweb "$@"

Then you can use /usr/lpp/mqm/V9R1M1/web/bin/dspmqweb as before.

If you have multiple releases of MQ in your environment you might want to point to the command in the script, so dspMQPA.sh might have

export WLP_USER_DIR=/u/mqmweb/MQPA
/usr/lpp/mqm/V9R1M1/web/bin/dspmqweb "$@"

Though it might be better to have a shell script mq911 with an optional queue manager parameter

Selecting which IP stacks to use.

There is an article from IBM, which gives two ways of configuring it.  Changing the httpEndpoint, or specifying an environment variable

Customise ISPF z/OS UNIX Directory List

In the MQWEB directory are message logs and trace logs.  When a file fills up, the old file is renamed to include the date and time, for example messages_20.07.29_16.49.29.0.log, and a new messages.log or trace.log is created.

If you are using ISPF 3.17 (z/OS UNIX Directory List) to use the files, it only displays the first 15 characters of the file name, so you get lots of files with a name like “messages_20.07.” where 20 is the year, and 07 is the month.
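A short Python sketch shows both the truncation and how the date and time are encoded in the name.  The timestamp layout (yy.mm.dd_hh.mm.ss) is my reading of the example file name above, and the function name rolled_log_time is mine.

```python
from datetime import datetime

def rolled_log_time(filename: str) -> datetime:
    """Extract the timestamp from a name like messages_20.07.29_16.49.29.0.log."""
    stem = filename.split("_", 1)[1]            # "20.07.29_16.49.29.0.log"
    stamp = ".".join(stem.split(".")[:5])       # "20.07.29_16.49.29"
    return datetime.strptime(stamp, "%y.%m.%d_%H.%M.%S")

name = "messages_20.07.29_16.49.29.0.log"
print(name[:15])               # what a 15 character column shows: messages_20.07.
print(rolled_log_time(name))   # 2020-07-29 16:49:29
```

Because the day and time fall after the fifteenth character, every file rolled over in the same month looks identical in the directory list.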

The default layout for the z/OS UNIX Directory List displays some unhelpful fields.   You can rearrange the fields (but not make the filename field wider).
If you go to the OPTIONS on the top line, and select “2. Directory List Column Arrangement… ” you can change what fields are displayed, and the order.  I set the widths of all fields to 0, except for

  • Type 04
  • Modified 19 (if you specify a smaller value you only get the YYYY-MM…  not the time)
  • Size 10

The documentation says

  • Modified The date and time the file was last changed.
  • Changed The date and time the status of the file was last changed.

I do not know the difference between these two.

Controlling what is displayed

In the directory list you can use sort commands

  • sort file A
  • sort mod D

Looking at a log or trace file

If you use sort mod D the newer files will be at the top, so you can use the “Modified” column to see when each file was last written, and so get the order of the files.

You can use the line command / to display the options.

You can use E to edit, or V to view (edit in browse mode).

Browse displays a mess because it does not do conversion.

 


	

Liberty on z/OS: Mapping an incoming certificate to a z/OS userid for client certificate authentication – and don’t forget the cookies!

I thought I understood how this worked.  I found I didn’t, and then spent a few days hunting around for the problem.

The basics

You can use a digital certificate from a web browser (or curl, or other tools) to authenticate to z/OS.  You need to map the certificate to a userid.

A certificate coming in can have a Distinguished Name like CN=adcdd.O=cpwebuser.C=GB  (note the ‘.’ not ‘,’ between elements).
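Browsers and tools such as openssl usually display a DN with commas between the elements, so it has to be rewritten with periods before it can be used in a filter. The function below is a rough sketch with a made-up name. It only changes the separators, it does not handle values which themselves contain commas, and the result should be checked against the DN that RACF reports (for example in the ICH408I message).

```shell
# Rewrite a comma separated DN into the period separated form.
# Only a comma (and any following blanks) that is directly followed by
# an attribute name and "=" is treated as a separator.
dn_to_filter() {
    printf '%s\n' "$1" | sed 's/, *\([A-Za-z][A-Za-z]*=\)/.\1/g'
}
```

For example, dn_to_filter "CN=adcdd, O=cpwebuser, C=GB" prints CN=adcdd.O=cpwebuser.C=GB.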

Your userid needs to have the SPECIAL attribute to be able to use the RACDCERT command (SPECIAL, not just GROUP-SPECIAL).

You will need a definition like the following (see the RACDCERT MAP command documentation)

RACDCERT MAP ID(ADCDD ) - 
    SDNFILTER('CN=adcdd.O=cpwebuser.C=GB') - 
    WITHLABEL('adcdd')

or a general definition for those certificates with O=cpwebuser.C=GB, ignoring the CN part

RACDCERT MAP ID(ADCDB ) - 
   SDNFILTER('O=cpwebuser.C=GB') - 
   WITHLABEL('cpwerbusergb') 

or using the Issuing Distinguished Name (the Certificate Authority)

IDNFILTER('CN=TESTCA.OU=SSSCA.C=GB')

Using a generic

SDNFILTER('CN=a*.O=cpwebuser.C=GB')

does not work.

If you attempt to use a certificate which is not mapped you get

ICH408I USER(START1 ) GROUP(SYS1 ) NAME(COLIN)
DIGITAL CERTIFICATE IS NOT DEFINED. CERTIFICATE SERIAL NUMBER(0163)  SUBJECT(CN=adcdd.O=cpwebuser.C=GB) ISSUER(CN=SSCA8.OU=CA.O=SSS.C=GB).

It is worth defining these using JCL, because if you try to add a mapping which already exists, you get a message saying so.  If you know the userid, you can list the maps associated with it.   If you do not know the userid, there is no practical way of finding it: you have to log on with the certificate and display the userid from the web browser, or extract the list of all users and use LISTMAP on each of them.

Once you have set up the userid, you can connect it to the group which gives access to the EJBROLE profiles.  For example use group names

  • MQPAWCO MQPAMQWebAdminRO Console Read Only.
  • MQPAWCU MQPAMQWebUser  Console User only.  The request operates under the signed on userid authority.
  • MQPAWCA MQPAMQWebAdmin Console Admin.

These names are made up of the queue manager (MQPA), WC for Web Console (rather than REST), and a letter for the access.

You may want to set up  userids solely for client authentication.  If the userid has NOPASSWORD, it cannot be used to logon with userid and password, and of course the lack of password means the password will not expire.

Having a set of userids just for certificate access makes it easier to manage the RACDCERT MAPping.    You have a job with

RACDCERT ID(adcd1) LISTMAP
RACDCERT ID(adcd2) LISTMAP
etc

and search the output for the certificate of interest.
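With more than a few userids, the LISTMAP statements can be generated rather than typed. A small sketch, with a made-up function name, which prints one statement per userid; the output still has to be put into your own batch TSO job.

```shell
# Print one "RACDCERT ID(userid) LISTMAP" statement for each userid given.
gen_listmap() {
    for userid in "$@"; do
        echo "RACDCERT ID($userid) LISTMAP"
    done
}
```

For example, gen_listmap adcd1 adcd2 prints the two statements shown above.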

It gets more complicated…

Often the user’s certificate is in the form CN=Colin Paice,o=SSS,C=GB so if you want to allow all people in the MQADMIN team access, you will need to specify them individually.  It would be easier if the DN had CN=Colin Paice,OU=MQADMIN,o=SSS,C=GB, then you can filter on the OU=MQADMIN.   These could map to a userid MQADM1.

It gets more complicated if someone can work with MQ, and CICS or z/OS Connect, and you have to decide on a userid: MQADM1 or CICSADM1?

Setting up a one to one mapping may be the best solution, so CN=Colin Paice,o=SSS,C=GB maps to CPAICE (or GB070594).   This userid is then added to the appropriate RACF groups to give access to the EJBROLEs, to give access to the servers.

How do I tell what is being used?

I could not get Liberty to record an audit record for the logon/matching.   I tried altering the userid to have UAUDIT, but that did not work either.

If you have audit defined on the class EJBROLE profile MQWEB.com.ibm.mq.console.MQWebUser, you will get an audit record in SMF.   This has many fields including

  • Date
  • Time
  • ACCESS
  • SUCCESS – or INSACC (INSufficient Access)
  • ADCDC – userid being used
  • READ – Requested access
  • READ – permitted access
  • EJBROLE – the class
  • MQWEB.com.ibm.mq.console.MQWebUser – the profile
  • CN=adcdd.O=cpwebuser.C=GB – the Distinguished Name of the certificate
  • CN=SSCA8.OU=CA.O=SSS.C=GB – the Issuers (Certificate Authority) of the certificate

From this you can see the userid being used (ADCDC), and the certificate DN (CN=adcdd.O=cpwebuser.C=GB).

And to make it more complicated

I deleted the RACDCERT MAP entry, but the web browser continued to work with the user.  I had a cup of tea and a cookie, and the web browser stopped working.   Was this problem connected to a cup of tea and a cookie?

Setting up the initial handshake is expensive.  The system has to do a logon with the certificate to get the userid from the RACDCERT mapping.  It then checks the userid has access to the SERVER profile, then it checks to see if it is MQWebAdmin, MQWebAdminRO, or  MQWebUser.

Once it has done this, it takes the userid and information, encrypts it, and creates the LTPA cookie.   This is sent down to the web browser.

The next time the web browser sends some data, it also sends the cookie. The MQWEB server decrypts the cookie, checks the time stamp to make sure the information is current, and if so, uses it.  The timeline I had was

  • create the RACDCERT mapping from certificate DN to userid
  • use browser to logon to mqweb, using the certificate with the DN
  • it works, mqweb sends down the cookie
  • delete the RACDCERT mapping for the DN
  • restart the browser, logon to mqweb, using the certificate with the DN.  The cookie is passed up – the logon works
  • clear the browser’s cookies – and retry the logon.  It fails as expected.

So ensure the browser cookie is cleared if you change the mapping or EJBROLE access for the user.
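The behaviour in the timeline can be shown with a toy model. This is not how Liberty is implemented: a plain "userid:expiry" string stands in for the encrypted LTPA cookie, a text file of DN and userid pairs (separated by a tab) stands in for the RACDCERT maps, and all of the names are made up.

```shell
# lookup_map MAPFILE DN
# Print the userid mapped to the DN; fail if there is no mapping.
lookup_map() {
    awk -F '\t' -v dn="$2" '$1 == dn { print $2; found = 1 } END { exit !found }' "$1"
}

# logon MAPFILE DN COOKIE NOW
# Print the userid the request runs under.  A cookie which has not
# expired is trusted, and the mapping is not looked at again.
logon() {
    cookie=$3
    now=$4
    if [ -n "$cookie" ] && [ "${cookie#*:}" -gt "$now" ]; then
        echo "${cookie%%:*}"          # userid taken from the cookie
    else
        lookup_map "$1" "$2"          # full logon using the mapping
    fi
}
```

With the mapping deleted, logon maps.txt "$dn" "ADCDD:200" 150 still prints ADCDD because the cookie is current, while the same call with an empty cookie, or at time 250 when the cookie has expired, fails.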