Python on z/OS – creating a C extension

I enjoy using Python on Linux, because it is very powerful. I thought it would be interesting to port the MQ Python interface pymqi to z/OS. This exposed many of the challenges of running Python on z/OS.

I’ll cover some of the lessons I learned in doing this work. Thanks to Steve Pitman who helped me package the extension.

Creating files that would compile was a challenge.

See here.

Compiling files.

I copied the pymqi C code to z/OS Unix Services, and tried to compile it. This was a mistake, as it took me a long time to get the compile options right. I found that using the setup.py script was the right way to go.

My directory tree

/u/pymqi
..setup.py
..include
....sample.h
..code
....pymqi
......__init__.py        
......CMQCFC.py          
......CMQXC.py           
......CMQZC.py           
......const.py           
......CMQC.py            
......pymqe.c            

Setup.py

This script needs export _C89_CCMODE=1, otherwise you get FSUM3008 message

Specify a file with the correct suffix (.c, .i, .s, .o, .x, .p, .I, or .a), or a corresponding data set name, instead of -L

import setuptools 
from distutils.core import setup, Extension 
import os 
import sysconfig 
# 
# This script needs    export _C89_CCMODE=1 
# Otherwise you get FSUM3008  messages 
# 
import os 
os.environ['_C89_CCMODE'] = '1' 
bindings_mode = 1 
version = '1.12.0' 
setup(name = 'pymqi', 
    version = version, 
    description = 'Python...', 
    platforms='OS Independent', 
    package_dir = {'': 'code'}, 
    packages = ['pymqi'],  
    py_modules = ['pymqi.CMQC', 'pymqi.CMQCFC', 'pymqi.CMQXC', 'pymqi.CMQZC'], 
    ext_modules = [Extension('pymqi.pymqe',['code/pymqi/pymqe.c'], define_macros=[('PYQMI_BINDINGS_MODE_BUILD', 
bindings_mode)], 
    include_dirs=["//'COLIN.MQ924.SCSQC370'"], 
    extra_link_args=["//'COLIN.MQ924.SCSQDEFS.OBJ(CSQBMQ2X)'"], 
      )] 
) 
# I had extra_link_args=["-Wl,INFO,LIST,MAP",.... when setting 
# this up
# I used 
# extra_compile_args=["-Wc,LIST(c.lst),XREF"], 
# to get out a listing and cross reference.

Which says

  • The package name is packages = [‘pymqi’],
  • The Python files are py_modules = [‘pymqi.CMQC’….
  • There is an extension .. ext_modules=.. with the source program code/pymqi/pymqe.c
  • It needs “//’COLIN.MQ924.SCSQC370′” to compile and “//’COLIN.MQ924.SCSQDEFS.OBJ(CSQBMQ2X)'” at bind time. This file contains the MQ Binder input
  • When I wanted the binder output – “-Wl,INFO,LIST,MAP”. This goes to the terminal. I used a ‘>’ command to pipe the output of the python3 setup build it to a file.
  • and a C listing “-Wc,LIST(c.lst),XREF”. The listing goes to c.lst

You need

  • import setuptools so that the setup bdist_wheel packaging works. You also need the wheel package installed.

Setup

There is a buglet in the compile set up. You need to specify

export _C89_CCMODE=1

Without it you get

FSUM3008 Specify a file with the correct suffix (.c, .i, .s, .o, .x, .p, .I, or .a), or a corresponding data set name, instead of -obuild/lib.os390-27.00-1090-3.8/pymqi/pymqe.so.

You also need the binder input in a data set with the correct suffix. For example .OBJ

“//’COLIN.MQ924.SCSQDEFS.OBJ(CSQBMQ2X)'”

If you do not have the correct suffix you get

FSUM3218 xlc: File //’COLIN.MQ924.SCSQDEFS(CSQBMQ2X)’ contains an incorrect file suffix.

Doing the compile and test install

I used a shell script to do the compiles and install

touch code/pymqi/*.c
rm a b c d
export _C89_CCMODE=1
#python3 setup.py clean
python3 setup.py build 1>a 2>b
python3 setup.py install 1>c 2>d

I captured the output from the setup.py jobs using 1>a etc because I could not see how to direct the binder output to a file. It comes out on the terminal – and there was a lot of it!.

Packaging the package

Python build which worked

I had to install wheel package. See How to install software in an isolated environment, or just use python3 -m pip install wheel if your z/OS image is connected to the network.

The command I used was

python3 -m pip install –user –no-cache-dir /u/tmp/py/wheel-0.37.1-py2.py3-none-any.whl-f /u/tmp/py/wheel-0.37.1-py2.py3-none

I had to add import setuptools to my setup.py file (at the top). (This converted the install package from a dist-utils to a setuptools packaging)

python3 setup.py bdist_wheel

This created a file “/u/pymqi/dist/pymqi-1.12.0-cp310-cp310-os390_27_00_1090.whl

This file is specific to python 3.10

For a wheel package, you’ll need to build it for all major versions and cannot just use one. Note that there were some issues in 3.8/3.9 with wheels that have been resolved in 3.10, so it’s recommended you to use 3.10.

Steven Pitman

This means you need to have multiple levels of Python installed, and build for each one!

This also has the operating system level (os390_27_00 – this may be constant across machines) and the hardware 1090. For this to work on other hardware, one solution would be to manually rename the file to pretend it is for a different machine, but this both not supported nor recommended, and has no guarantee to work. So it is hard to know the best thing to do. I do not have every 390 machine from IBM to do a build on !

Failing build. This built but did not install.

python3 setup.py bdist –format=tar

It built the package and create a file

./dist/pymqi-1.12.0.os390-27.00-1090.tar

This tar file is not completely readable by the z/OS tar command.

When I used tar -tf ….tar it gave

FSUMF371 Value 1641318408.0 is not valid for keyword mtime. Keyword not set.

It uses a Python tar command, not the operating system tar command.

You can display the contents using a Python program like

import tarfile
tar = tarfile.open("dist/pymqi-1.12.0.os390-27.00-1090.tar.gz")
# tar.extractall() 
for x in tar:
    print(x)

This gave output like

<TarInfo ‘.’ at 0x5008ad3880>
<TarInfo ‘./usr’ at 0x5008ad3dc0>
<TarInfo ‘./usr/lpp’ at 0x5008ad3a00>

Installing the package

From an authorised user in OMVS,

python3 -m pip install –no-cache-dir /u/pymqi/dist/pymqi-1.12.0-cp310-cp310-os390_27_00_1090.whl /u/pymqi/dist/pymqi-1.12.0-cp310-cp310-os390_27_00_1090.whl
Processing /u/pymqi/dist/pymqi-1.12.0-cp310-cp310-os390_27_00_1090.whl
Installing collected packages: pymqi
Successfully installed pymqi-1.12.0

If you do not use –no-cache-dir, you may get

-[33]WARNING: The directory ‘/u/.cache/pip’ or its parent directory is not owned or is not writable by the current user. The cache has been disabled. Check the permissions and owner of that directory. If executing pip with sudo, you should use sudo’s -H flag.-[0]

The compile options

The following text is the compile and bind options used for my code. Some of the options are pymqi specific.

/bin/xlc -DNDEBUG -O3 -qarch=10 -qlanglvl=extc99 -q64
-Wc,DLL
-D_XOPEN_SOURCE_EXTENDED
-D_UNIX03_THREADS
-D_POSIX_THREADS
-D_OPEN_SYS_FILE_EXT
-qexportall -qascii -qstrict -qnocsect
-Wa,asa,goff -Wa,xplink
-qgonumber -qenum=int
-DPYQMI_BINDINGS_MODE_BUILD=1 -I//’COLIN.MQ924.SCSQC370′
-I/u/tmp/python/usr/lpp/IBM/cyp/v3r10/pyz/include/python3.10
-c code/pymqi/cpmqe.c
-o build/temp.os390-27.00-1090-3.10/code/pymqi/cpmqe.o

/bin/xlc build/temp.os390-27.00-1090-3.10/code/pymqi/cpmqe.o -L.
-o build/lib.os390-27.00-1090-3.10/pymqi/cpmqe.cpython-310.so -Wl,INFO,LIST,MAP,DLL //’COLIN.MQ924.SCSQDEFS.OBJ(CSQBMQ2X)’
-Wl,dll
/u/tmp/python/usr/lpp/IBM/cyp/v3r10/pyz/lib/python3.10/config-3.10/libpython3.10.x
-q64

Python on z/OS coding a C extension.

I was porting the pymqi code, which provides a Python interface to IBM MQ, to z/OS.

I’ve documented getting the code to build. I also had challenges trying to use it.

The code runs as ASCII!

When my C extension was built, it gets built with the ASCII option. This means any character constants are in ASCII not, EBCDIC.

My python program had

import sys
sys.path.append(‘./’)
import pymqi
queue_manager = ‘AB12’
qmgr = pymqi.connect(queue_manager)
qmgr.disconnect()

When my C code got to see the AB12 value, it was in ASCII. When the code tried to connect to the queue manager, it return with name error. This was because the value was in ASCII, and the C code expected EBCDIC.

You can take see if the program has been compiled with the ASCII flag using

#ifdef __CHARSET_LIB

… It is has been compiled with option ASCII
#else

If you use printf to display the queue manager name it will print AB12. If you display it in hex you will be 41423132 where A is x41, B is x42, 1 is x31 is 2 is x32.

In your C program you can convert this using the c function a2e with length option, for example

char EQMName[4];
memcpy(&EQMName[0],QMname,4);
__a2e_l(&EQMName[0],sizeof(EQMName));

// then use &EQMName

Converting from ASCII to EBCDIC in Python.

Within Python you have strings stored as Unicode, and byte data. If you have a byte array with x41423132 (the ASCII equivalent to AB12). You can get this in the EBCDIC format using

a=b’41423132′ # this is the byte array
m = a.decode(“ascii”) # this creates a character string
e = m.encode(‘cp500’) # this create the new byte array of the EBCDIC version .. or xC1C2F1F2

In Python you convert from a dictionary of fields into a “control block” using the pack function. You can use the “e” value above in this.
MQ control blocks have a 4 character eye catcher at the front eg “OD “. If you use the pack function and pass “OD ” you will pass the ASCII version. You will need to do the decode(‘ascii’) encode(‘CP500’) to create the EBCDIC version.

Similarly passing an object such as MQ queue name, will need to be converted to the EBCDIC version.

Converting from EBCDIC to ASCII

If you want to return character data from your C program to Python, you will need to the opposite.

For example m is a byte array retuned back from the load module.

#v is has the value b’C1C2F2F3′ (AB23)
m = v.decode(“cp500”)
a = m.encode(‘ascii’)

# a now has b’41423132′ which is the ascii equivilant

A short C quiz, and some gotcha’s

I’ve been looking at porting pymqi, the Python MQ interface to z/OS.

The biggest challenges where nothing to do with Pymqi.

So if you are bored after Christmas and want something stimulating… here are a few questions for you… The answers are below. I tried getting them displayed upside down, like all quality magazines; but that was too difficult.

Question 1. C question

I’ve reduced the problem I experienced,down to

int main() 
{ 
if ( 1==0 ) return 8; 
int rc; 
*=ERROR===========> CCN3275 Unexpected text 'int' encountered.
}                                                          

Hint: it works in a batch compile, using EDCCB

Question 2 binding in Unix Services

/bin/xlc a.o -L. -o b.so -Wl,INFO //’COLIN.MQ924.SCSQDEFS(CSQBRR2X)’ -Wl,dll c.x

Gave

FSUM3218 xlc: File //’COLIN.MQ924.SCSQDEFS(CSQBRR2X)’ contains an incorrect file suffix.

What do I need to do to fix it?

Question 3. Strange bind messages

Before I found the solution to problem number 2, I put the bind statements into a Unix Services file.

Using this gave me

IEW2326E 1221 THE FOLLOWING INVALID RECORD HAS BEEN SEEN:
=”lm-source” *

Copyright
IEW2326E 1221 THE FOLLOWING INVALID RECORD HAS BEEN SEEN:
IBM Corp. 2009, 2016 All Rights Reserved.

This bind statement was

cc -o mqsamp -W l,DYNAM=DLL,LP64 c.o mq.o

it worked without the mq.o

The mq.o file had

* <copyright                                                          * 
* notice="lm-source"                                                  * 
* (C) Copyright IBM Corp. 2009, 2016 All Rights Reserved.             * 
* </copyright>                                                        * 

Answer

  1. Using the cc compiler, it defaults to #pragma Langlvl(stdc89) which supports the c89 level of C. This says all variable declarations must come before any logic. This is relaxed in the c99 level, so specifying #pragma Langlvl(stdc99) cures it. You can also specify LANGLVL(EXTENDED) in the cc statement
  2. To include datasets in some of the binder options you need host file: filename with .OBJ suffix (object host file for the binder/IPA Link). When I used /bin/xlc a.o … -Wl,INFO //’COLIN.MQ924.SCSQDEFS.OBJ(CSQBRR2X)’ … it worked.
  3. The binder is not good at files in Unix Services, it likes records which are fixed block 80. The mq.o file had trailing blanks removed, and this confused it. I had to use a PDSE to get it to work.

Rexx to C to Rexx sample code

I’ve put up on github some sample code to demonstrate how you can write a function in C, and invoke it from Rexx. I’ve provided some glue code as Rexx uses R0 and R1 to pass parameters, and C programs only use R1.

I’ve create some small functions to use in your C program which hide the Rexx logic. For example

rc = CRexxDrop(pEnv,”ZDROP”);
rc = CRexxGet(pEnv,”InSymbol”,&buffer[0],&lBuffer);
rc = CRexxPut(pEnv,”CPPUTVar,”Colinsv”,0);
Iterate through all symbols

If you have any comments or suggestions, please let me know.

Where’s my invisible code.

In trying to get system exits written in C to work. I found my code was not being executed, even when the first instructions were a deliberate program check. I tried using the tried and trusted program AMASPZAP (known as Super Zap) for displaying the internals of a load module and zapping it – but my code was not there! Where was it hiding? When I took a dump of the address space my code was in the dump. Why was it invisible and not being executed?

HSM archives on tape

Like taking 20 minutes to recall a long unused dataset from HSM (mounting a physical tape to retrieve the data set), I had this vague memory of doing a presentation on the binder and the structure of load modules. After a cup of tea and a chocolate biscuit to help the recall, I remembered about classes etc in a load module.

Classes etc

When I joined IBM over 40 years ago you wrote your program, and used the link editor to create the load module, a single blob of instructions and data.

Things have moved on. Think about a C program, in read only memory. When you issue a load to use it, you get a pointer to the read only (the re-entrant instructions and data), and your own copy of the “global variables” or Writeable Static Area (WSA). When using the C compiler, at bind time it includes a bit of code with 24 bit addressing mode. This means you have code which runs in 31/64 bit mode, and some code resident in 24 bit mode! It is no longer a single blob of instructions and data.

Within the load module there are different classes of data for example

  • C_CODE – C code
  • C_WSA – for a C program compiled with RENT option. This is the global data which each instance gets its own private copy of
  • B_TEXT code from the assembler
  • Using the HL Assembler, you can define your own classes using CATTR.

A class has attributes, such as

  • Length.
  • Should it be loaded or not. You could store documentation in the load module, which can be used by programs, but not needed at execution time.
  • RMODE.
  • It is reentrant or not.
  • Should this code be merged or replaced with similar code. For example the C Globals section would be merged. A block of instructions would be replace.

Segments

The binder can take things with similar attributes and store them together within a segment. You can have mixed classes eg B_TEXT and C_CODE, with the same RMODE attributes etc and have them in one segment. The C_WSA needs to be in a different segment because it has different attributes.

So where was my invisible code?

I needed to change my SPZAP job to tell it to dump out the C_CODE section. By default it dumps the B_TEXT sections. You can specify C_* or B_*. See the AMASPZAP documentation.

//STEP EXEC PGM=AMASPZAP
//SYSPRINT DD SYSOUT=A
//SYSLIB DD DISP=SHR,DSN=COLIN.C.LOAD
//SYSIN DD *
DUMPT COLIN CPPROGH C_CODE
/*

This dumps out (decoding the data into instructions) load module COLIN, CSECT CPPROGH and the C_CODE class.

Why wasn’t my code executing? The code to set up the C environment was not invoking my program because I had compiled it with the wrong options!

Writing system exits in C (and compiling them).

I wanted to call a C program from Rexx to do some special processing. The C programming guide gave me some hints, but I found it was a struggle to do it. It reminded me of when I was young and my father gave me a “beginners electronics kit” where had transistors, resisters, etc. You could build a “computer” that counted to 3, and make a radio. Unfortunately the instructions that came in it were in German, and for a different model kit to what I had. As a result it was very difficult to get working, but once you knew it was easy.

In the C programming guide there were instruction like “The CSECT must be the module entry point.” without saying which CSECT to use. They gave some sample programs, but not the JCL to compile them. After many failures, (looking at dumps and traces) I found you had to compile the C programs with “NORENT” which went against many years of experience.

I was using the System Programming C facility, which can be used, for example as z/OS exits. Note: This is different to Metal C, which allows you to include assembler code in your C program.

Some background

  • These programs do not have a main() but are invoked with a z/OS type parameter list.
  • They can use C facilities, such as printf, but not LE functions.
  • You cannot use the UNIX file system functions.
  • They need to be called with the C environment set up. You cannot just branch to the entry point.
  • You can have several functions in the same source file. You branch to the one of interest.

Simple case

My C program was

#pragma environment(CPPROGH)
#pragma csect (CODE, “OEMPUT”)
int CPPROGH(int * p, evalblock * pEval, char * env) {
….
return 0;
}

The pragma environment said set up the C environment before calling executing this function. It takes the standard z/OS parameter list.

I needed some glue code to take the parameters from Rexx and store them in a parameter list for the function.

This glue codes saves parameters from R0,and 16(r1) and 20 (r1), then executes the function.

ENVA RMODE ANY
ENVA AMODE 31 
ENVA  CSECT
  ...   
  L    R3,16(R1)  a(Parmlist) 
  ST   R3,Parmlist+0 
  L    R3,20(R1)  a(evalblk) 
  L    R3,0(R3) 
  ST   R3,Parmlist+4 
  ST   R0,PARMLIST+08  A(env block) 
  OI   PARMLIST+08,X'80' 
  la   r1,parmlist 
  L     R15,=V(CPPROGH) 
  BASR  R14,R15 

I wanted this to be called from REXX, which passes parameters in R0 and R1, so I had to write some glue code to store the parameters in storage before passing them to the program.

I compiled the glue code with

//GLUE EXEC PGM=ASMA90,PARM=’DECK,NOOBJECT,LIST,XREF(SHORT),NORENT’,
// REGION=0M
//SYSLIB DD DISP=SHR,DSN=CEE.SCEEMAC
// DD DSN=SYS1.MACLIB,DISP=SHR
//SYSUT1 DD UNIT=SYSDA,SPACE=(CYL,(1,1))
//SYSPUNCH DD DISP=SHR,DSN=COLIN.C.REXX.OBJ(GLUE2)
//SYSPRINT DD SYSOUT=*
//SYSIN DD DISP=SHR,DSN=COLIN.C.REXX(GLUE2)
//*

and compiled the C code with

//S1 JCLLIB ORDER=CBC.SCCNPRC
// SET LOADLIB=COLIN.C.REXX.LOAD
// SET LIBPRFX=CEE
//COMPILE EXEC PROC=EDCCB,
// LIBPRFX=&LIBPRFX,
// CPARM=’OPTFILE(DD:SYSOPTF),NORENT‘,
// BPARM=’SIZE=(900K,124K),RENT,LIST,XREF,RMODE=ANY,AMODE=31′
//COMPILE.SYSOPTF DD DISP=SHR,DSN=COLIN.C.REXX(CPARMS)
//COMPILE.SYSIN DD DISP=SHR,DSN=COLIN.C.REXX(CPPROGHE)
//BIND.SYSLMOD DD DISP=SHR,DSN=&LOADLIB.
//BIND.SYSLIB DD DISP=SHR,DSN=CEE.SCEESPC
// DD DISP=SHR,DSN=CEE.SCEELKED
//BIND.OBJLIB DD DISP=SHR,DSN=COLIN.C.REXX.OBJ
//BIND.SYSIN DD *
INCLUDE OBJLIB(GLUE2)
ENTRY ENVA
NAME COLIN(R)
/*

The EDCCB procedure to compile and bind, stores the object deck in a temporary file then passes this file and BIND.SYSIN into the binder.

C persistent environment.

The previous example created a C environment, ran my program, and deleted he C environment. If you want to do many calls to C functions you can set up a Persistent C environment. In this environment you do

  • From assembler, set up the environment
  • From assembler, use the environment, and call functions with your program as many times as you need
  • From assembler close down the environment,

This is well documented in the C programming guide, (but not how to compile it).

The essence of my program was

Set up the environment

L R15,=V(EDCXHOTL)
BASR R14,R15

Call my function

   LA R4,HANDLE 
   LA R5,USEFN  This has the  
   STM  R4,R5,PARMLIST 
* now the user paramaters
  ...
   OI   PARMLIST+16,X'80' 
   LA   R1,PARMLIST 
   L    R15,=V(EDCXHOTU) 
   BASR R14,R15 
...
USEFN    DC V(CPPROGH) <<  This function name

Clean up

    LA R1,PARMLIST 
    OI 0(R1),X'80' 
    L R15,=V(EDCXHOTT) 
    BASR R14,R15 

My C program was

#pragma linkage(CPPROGH,OS)
int CPPROGH(int * p, evalblock * pEval, char * env) {
printf(“in CPPROG\n”);
return 0}

In this case the pragma is LINKAGE(CPPROGH,OS). The previous, self contained code, had ENVIRONMENT(CPPROGH). You need to use the right one.

Which procedure do I use to compile?

The C books describe the various procedures, for example EDCCB for compile and BIND, and EDCCL for compile and LINKEDIT. They do the same thing. The LINKEDIT uses program HEWL to link edit. The BIND uses IEWL to invoke the binder. These are both aliases to the binder IEWBLINK.

What’s the difference between BALR and BASR?

When coding, my fingers automatically used BALR (Branch and Link Register). This worked fine, but I should have used BASR (Branch and Save Register). As the Principles of Operation (POP) says

It is recommended, however, that BRANCH AND SAVE (BAS and BASR) be used instead and that BRANCH AND LINK be avoided since it places nonzero information in bit positions 32-39 of the general register in the 24-bit addressing mode, which may lead to problems and may decrease performance.

In 31 bit mode with BALR 14,15, the return address is stored in register 14. ‘1’ followed by the 31 it address.

In 24 bit mode, the return address has other information at the top, including the condition code. Most of the time this information will be ignored.

So using BALR is not wrong, it is that BASR is better.