How to stop when blocked.

I hit an interesting challenge when working with SMF-real-time.

An application can connect to the SMF real time service, and it will be sent data as it is produced. Depending on a flag the application can use:

  • Non blocking. The application gets a record if there is one available. If not it gets are return code saying “No records available”.
  • Blocking. The application waits (potentially for ever) for a record.

Which is the best one to use?

This is a general problem, it not just SMF real time that has these challenges.

What are the options- generally?

Non blocking

You need to loop to see if there are records, and if not wait. The challenge is how long should you wait for. If you wait for 10 seconds, then you may get a 10 second delay between the record being created, and your application getting it. If you specify a shorter time you reduce this window. If you reduce the delay time, then the application does more loops per minute and so there is an increase in CPU cost.

Blocking

If you use blocking – there is no extra CPU looping round. The problem is how to do you stop processing your application cleanly when it is in a blocked wait, to allow cleaning up at the end of the processing.

A third way

MQSeries process application messages asynchronously. You can say wait for a message – but time out after a specified time interval if no messages have been received.

This method is very successful. But some application abused it. They want their application to wake up every n second, and check their application’s shutdown flag. If their flag is set, then shutdown.

The correct answer in this case is to have MQ Post an Event Control Block(ECB), the application’s code posts another ECB; gthe mainline code waits for either of the EBCs to be posted, and take the appropriate action. However the lazy way of sleeping, waking, checking and sleeping is quick and easy to code.

What are the options for SMF real time?

With the SMF real time code, while one thread is running – and in a blocked wait, another thread can issue a disconnect() request. This wakes up the waiting thread with a “you’ve been woken up because of a disconnect” return code.

The solution is to use a threading model.

The basic SMF get code

while True:
x, rc = f.get(wait=True) # Blocking get
if x == "" : # Happens after disc()
print("Get returned Null string, exiting loop")
break
if rc != "OK":
print("get returned",rc)
print("Disconnect",f.disc())
i += 1
# it worked do something with the record
...
print(i, len(x)) # Process the data
...

Make the SMF get a blocking get with timeout

With every get request, create another thread to wake up and issue the disconnect, to cause the blocked get to wake up.

This may be expensive doing a timer thread create and cancel with every record.

def blocking_get_loop(f, timeout=10, event=None,max_records=None,):
i = 0
while True:
t = Timer(timeout, timerpop(f)) # execute the timeout function after 10 seconds
x, rc = f.get(wait=True) # Blocking get
t.cancel()

....


# This wakes after the specified interval then executes.
def timerpop(f):
f.disc()

Wait for a terminal interrupt

Vignesh S sent me some code which I’ve taken and modified.

The code to the SMF get is in a separate thread. This means the main thread can execute the disconnect and wake up the blocked request.

# Usage example: run until Enter or interrupt, or max_records
if __name__ == "__main__":
blocking_smf(max_records=6) # execute the code below.

def blocking_smf(stream_name="IFASMF.INMEM", debug=0, max_records=None):
f = pySMFRT(stream_name, debug=debug)
f.conn(stream_name,debug=2) # Explicit connect

# Start the blocking loop in a separate thread
get_thread = threading.Thread(target=mainlogic,args=(f),kwargs={"max_records": 4}))
get_thread.start()


try:
# Main thread: wait for user input to stop
input("Press Enter to stop...\n")
print("Stopping...")
except KeyboardInterrupt:
print("Interrupted, stopping...")
finally:
f.disc() # This unblocks the get() call
get_thread.join() # Wait for thread to exit

The key code is

get_thread = threading.Thread(target=mainlogic,args=(f),kwargs={"max_records": 4}))
get_thread.start()

This attaches a thread, execute the function mainlogic, passing the positional parameters f, and keyword arguments max_records.

The code to do the gets and process the requests is the same as before, with the addition of the count of records processed.

def blocking_get_loop(f, max_records=None):
i = 0
while True:
x, rc = f.get(wait=True) # Blocking get
if x == "" : # Happens after disc()
print("Get returned Null string, exiting loop")
break
if rc != "OK":
print("get returned",rc)
print("Disconnect",f.disc())
i += 1
# it worked do something with the record
...
print(i, len(x)) # Process the data
...
#
if max_records and i >= max_records:
print("Reached max records, stopping")
print("Disconnect",f.disc()) # clean up if ended because of number of records
break

If the requested number of records has been processed, or there has been an error, or unexpected data, then disconnect is called, and the function returns.

Handle a terminal interrupt

The code

try:
# Main thread: wait for user input to stop
input("Press Enter to stop...\n")
print("Stopping...")
except KeyboardInterrupt:
print("Interrupted, stopping...")
finally:
f.disc() # This unblocks the get() call
get_thread.join() # Wait for thread to exit

waits for input from the terminal.

This solution is not perfect because if the requested number of records are processed quickly, you still have to enter something at the keyboard.

Use an event with time out

One problem has been notifying the main task when the SMF get task has finished. You can use an event for this.

In the main logic have

def blocking_smf(stream_name="IFASMF.INMEM", debug=0, max_records=None):
f = pySMFRT(stream_name, debug=debug)
f.conn(stream_name,debug=2) # Explicit connect
myevent = threading.Event()

# Start the blocking loop in a separate thread
get_thread = threading.Thread(target=blocking_get_loop, args=(f,),
kwargs={"max_records": max_records,
"event":myevent})
get_thread.start()
# wait for the SMF get task to end - or the event time out
if myevent.wait(timeout=30) is False:
print("We timed out")
f.disc() # wakeup the blocking get

get_thread.join # Wait for it to finish

In the SMF code

def blocking_get_loop(f, t,max_records=None, event=None):
i = 0
while True:
#t = Timer(timeout, timerpop(f))
x, rc = f.get(wait=True) # Blocking get
#t.cancel()
if x == "" : # Happens after disc()
print("Get returned Null string, exiting loop")
break
if rc != "OK":
print("get returned",rc)
print("Disconnect",f.disc())
i += 1
print(i, len(x)) # Process the data
if max_records and i >= max_records:
print("Reached max records, stopping")
print("Disconnect",f.disc())
break
if event is not None:
event.set() # wake up the main task

s

Why oh why is my application waiting?

I’ve been working on a presentation on performance, and came up with an analogy which made one aspect really obvious…. but I’ll come to that.

This blog post is a short discussion about software performance, and what affects it.

Obvious statement #1

The statement used to be An application is either using, CPU, or waiting. I prefer to add or using CPU and waiting which is not obvious unless you know what it means.

Obvious statement #2

All applications wait at the same speed. If you are waiting for a request from a remote server, it does not matter how fast your client machine is.

Where can an application wait?

I’ll go from longest to shortest wait times

Waiting for the end user

If you have displayed a menu for an end user to complete, you might wait minutes (or hours) for the end user to complete the information and send it.

Waiting for a remote request

This can be a request to a remote server to do something. This could be to buy something, or simple web look up, or a Name Server lookup. These should all be under a second.

Waiting for disk I/O

If your application is doing database work, such as DB2 there can be many disk I/Os. Any updates are logged to disk for recovery purposes. If your disk response time is typically 1 ms, then you may have to wait several milliseconds. When your application issues a commit, and wants to log data – there will likely to be an I/O in progress – so you have to wait for that I/O to complete before any more data can be written. Typically a database can write 16 4KB pages at a time. If the database logging is very active you may have to wait until any queued data in log buffers is written, before your application’s data can be written. An I/O consists of a set up followed by data transmission. The set up time is usually pretty constant – but more data takes more time to transfer. Writing 16 * 4 KB pages will usually take longer than writing one 4KB page.

An application writing to a file may buffer up several records before writing one record to the external medium. You application wrote 10 records, but there was only one I/O.

These I/Os should be measured in milliseconds (or microseconds).

Database and record locks

If your application want to update some information in a database record it could do

  • Get record for update (this prevents other threads from updating it)
  • Display a menu for the end user to complete
  • When the form has been completed, update the record and commit.

This is an example of “Waiting for the end user”. Another application wanting to update the same record may get an “unavailable” response, or wait until the first application has finished.

You can work around this using logic like

  • Each record has a last updated timestamp.
  • Read the record note the last updated timestamp, display the menu
  • When the form has been completed..
    • Read the record for update from the database, and check the “last updated time”.
    • If the time stamp matches the saved value, update the information and commit the changes.
    • If the time stamp does not match, then the record has been updated – release it, and go to the top and try again.

Coupling Facility access

This is measured in 10s of microseconds. The busier the CF is, the longer requests take.

Latches

Latches are used for serialisation of small sections of code. For example updating storage chains.

If you have two queues of work elements, one queued work, on in-progress work. In a single threaded application you can move a work element between queues. With multiple threads you need some form of locking.

In its simplest form it is

pthread_mutex_lock(mymutex)
work = waitqueue.pop()
active.push(work)
pthread_mutex_unlock(mymutext)

You should design your code so few threads have to wait.

Waiting for CPU

This can be due to

  • The LPAR is constrained for CPU; other work gets priority, and your application is not dispatched.
  • The CEC (physical box) is constrained for CPU and your LPAR is not being dispatched.

If your LPAR has been configured to use only one CPU, and there is space capacity in the CEC your LPAR will not be able to use it.

Waiting for paging etc

In these days of lots of real storage in the CEC, waiting for paging etc is not much of an issue. If the virtual page you want is not available to you the operating system has to allocate the page, and map it to real storage.

Waiting for data – using CPU and waiting.

Some 101 education on computer Z architecture

  • The processors for the z architecture are in books. Think of a book as being a physical card which you can plug/unplug from a rack.
  • You can have multiple books.
  • Each book has one or more chips
  • Each chip has one or more CPUs.
  • There is cache (RAM) for each CPU
  • There is cache for each chip
  • There is cache for each book
  • At a hardware level, when you are updating a real page, it is locked to your CPU.
  • If another CPU wants to use the same real page, it has to send a message to the holding CPU requesting exclusive use
  • The physical distance between two CPUs on the same chip is measured in millimeters
  • The distance between two CPUs in the same book is measured in centimeters
  • The distance between two CPUs in different books could be a metre.
  • The time to send information depends on the distance it has to travel. Sharing data between data two CPUs on the same chip will be faster than sharing data between CPUs in different books.

Some instructions like compare and swap are used for serialising access to one field.

  • Load register 4 with value from data field. This could be slow if the real page has to be got from another CPU. It could be fast it the storage is in the CPU, chip or book cache.
  • Load register 5 with new value
  • Compare and swap does
    • Get the exclusive lock on the data field
    • If the value of the data field matches the value in register 4 (the compare)
    • then replace it with the value in register 5 (the swap)
    • else say mismatch
    • Unlock

These instruction (especially the first load) can take a long time, especially if the data field is “owned” by another CPU, and the hardware has to go and get the storage from another CPU in a different book, a metre away.

A common technique for Compare and Swap is to have a common trace table. Each thread gets the next free element, and sets the next free. With many CPU’s actively using the Compare and Swap, these instructions could be a major bottleneck.

A better design is to give each application thread their own trace buffer to avoid the need for a serialisation instruction, and so there is no contention.

Storage contention

We finally get to the bit with the analogy to explain storage contention

You have an array of counters with one slot for each potential thread. You have 16 threads, your array is size 16.

Each thread keeps updates its counter regularly.

Imaging you are sitting in a class room listening to me lecture about performance and storage contention.

I have a sheet of paper with 16 boxes drawn on it.
I pick a person in the front row, and ask them to make a tick on the page in their box every 5 seconds.

Tick, tick, tick … easy

Now I introduce a second person and it gets harder. The first person make a tick – I then walk the piece of paper across the classroom to the second person, who makes a tick. I walk back to the first, who makes another tick etc

This will be very slow.

It gets worse. My colleague is giving the same lecture upstairs. I now do my two people, then go up a floor, so someone in the other classroom can make a mark. I then go back down to my class room and my people (who have been waiting for me) can then make their ticks.

How to solve the contention?

The obvious answer is to give each person their own page, and there is no contention. In hardware terms it might be a 4KB page – or it may be a 256 cache line.

I love this analogy; it has many levels of truth.

Getting from C enumerations to Python dicts.

I wanted to create Python enumerates from C code. For example with system ssl, there is a datatype x509_attribute_type.

This has a definition

typedef enum {                                                                    
x509_attr_unknown = 0,
x509_attr_name = 1, /* 2.5.4.41 */
x509_attr_surname = 2, /* 2.5.4.4 */
x509_attr_givenName = 3, /* 2.5.4.42 */
x509_attr_initials = 4, /* 2.5.4.43 */
...
} x509_attribute_type;

I wanted to created

x509_attribute_type = {
"x509_attr_unknown" : 0,
"x509_attr_name" : 1,
"x509_attr_surname" : 2,
"x509_attr_givenName" : 3,
"x509_attr_initials" : 4,
...
}

I’ve done this using ISPF macros, but thought it would be easier(!) to automate it.

There is a standard way for compilers to products information for debuggers to understand the structure of programs. DWARF is a debugging information file format used by many compilers and debuggers to support source level debugging. The data is stored internally using Executable and Linkable Format(ELF).

Getting the DWARF file

To get the structures into the DWARF file, it looks like you have to use the structure, that is, if you #include a file, by default the definitions are not stored in the DWARF file.

When I used

#include <gskcms.h>
...
x509_attribute_type aaa;
x509_name_type nt;
x509_string_type st;
x509_ecurve_type et;
int l = sizeof(aaa) sizeof(nt) + sizeof(st) + sizeof(et);

I got the structures x509_attribute_type etc in the DWARF file.

Compiling the file

I used USS xlc command with

xlc ...-Wc,debug(FORMAT(DWARF),level(9))... abc.c

or

/bin/xlclang -v -qlist=d.lst -qsource qdebug=format=dwarf -g -c abc.c -o abc.o

This created a file abc.dbg

The .dbg file includes an eye catcher of ELF (in ASCII)

I downloaded the file in binary to Linux.

There are various packages which are meant to be able to process the file. The only one I got to work successfully was dwarfdump. The Linux version has many options to specify what data you want to select and how you want to report it. dwarfdump reported some errors, but I got most of the information out.

readelf displays some of the information in the file, but I could not get it to display the information about the variables.

What does the output from dwarfdump look like?

The format has changed slightly since I first used this a year or so ago. The data are not always aligned on the same columns, and values like <146> and 2426 (used as a locator id) are now hexadecimal offsets.

The older format

<1>< 2426>      DW_TAG_subprogram
DW_AT_type <146>
DW_AT_name printCert
DW_AT_external yes
...
<2>< 2456> DW_TAG_formal_parameter
DW_AT_name id
DW_AT_type <10387>
...

The newer format

< 1><0x0000132d>    DW_TAG_typedef
DW_AT_type <0x00001349>
DW_AT_name x509_attribute_type
DW_AT_decl_file 0x00000002
DW_AT_decl_line 0x000000a7
DW_AT_decl_column 0x00000003
< 1><0x00001349> DW_TAG_enumeration_type
DW_AT_name __3
DW_AT_byte_size 0x00000002
DW_AT_decl_file 0x00000002
DW_AT_decl_line 0x00000091
DW_AT_decl_column 0x0000000e
DW_AT_sibling <0x00001553>
< 2><0x00001356> DW_TAG_enumerator
DW_AT_name x509_attr_unknown
DW_AT_const_value 0
< 2><0x0000136a> DW_TAG_enumerator
DW_AT_name x509_attr_name
DW_AT_const_value 1

Some of the fields are obvious… others are more cyptic, with varying levels of indirection.

  • <1><0x0000132d> is a high level object <1> with id <0x0000132d>
    • DW_TAG_typedef is a typedef
    • DW_AT_type <0x00001349> see <0x00001349> for the definition (below)
    • DW_AT_name x509_attribute_type is the name of the typedef
    • DW_AT_decl_file 0x00000002 there is a file definition #2… but I could not find it
    • DW_AT_decl_line 0x000000a7 the position within the file
    • DW_AT_decl_column 0x00000003
  • < 1><0x00001349> DW_TAG_enumeration_type. This is referred to by the previous element
    • DW_AT_name __3 this an internally generated name
  • < 2><0x00001356> DW_TAG_enumerator This is part of the <1> <0x00001349> above. It is an enumerator.
    • DW_AT_name x509_attr_unknown this is the label of the value
    • DW_AT_const_value 0 with value 0
  • the next is label x509_attr_name with value 1

Other interesting data

I have a function

int colin(char * cinput, gsk_buffer * binput )
{
...
}
and
typedef struct _gsk_data_buffer {
gsk_size length;
void * data;
} gsk_data_buffer, gsk_buffer;

Breaking this down into its parts, there is an entry in the DWARF output for “int”, “colin”, “*”, “char”, “cinput”, “*”, gsk_buffer (which has levels within it), “binput”

< 1><0x0000009a>    DW_TAG_base_type
DW_AT_name int
DW_AT_encoding DW_ATE_signed
DW_AT_byte_size 0x00000004

< 1><0x00000142> DW_TAG_subprogram
DW_AT_type <0x0000009a>
DW_AT_name colin
DW_AT_external yes(1)
...
< 2><0x0000015c> DW_TAG_formal_parameter
DW_AT_name cinput
DW_AT_type <0x00001fa9>

< 2><0x00000173> DW_TAG_formal_parameter
DW_AT_name binput
DW_AT_type <0x00001fb5>
:

< 2><0x0000018a> DW_TAG_variable
DW_AT_name __func__
DW_AT_type <0x00001fc0>

for cinput

< 1><0x00001fa9>    DW_TAG_pointer_type
DW_AT_type <0x0000006e>
DW_AT_address_class 0x0000000a
< 1><0x0000006e> DW_TAG_base_type
DW_AT_name unsigned char
DW_AT_encoding DW_ATE_unsigned_char
DW_AT_byte_size 0x00000001

For binput

  • binput (1fbf) -> DW_TAG_pointer_type (1e2e) -> gsk_buffer (1e41) -> _gsk_data_buffer of length 8.
  • _gsk_data_buffer has two component
    • length _> gsk_size…
    • data (1faf) -> pointer_type (13c)-> unspecified type void

Processing the data in the file

Parse the data

For an entry like DW_AT_type <0x0000006e> which refers to a key of <0x0000006e>. The definition for this could be before after the current entry being processed.

I found it easiest to process the whole file (in Python), and build up a Python dictionary of each high level defintion.

I could then process the dict one element at a time, and know that all the elements it refers to are in the dict.

There are definitions like

typedef struct _x509_tbs_certificate { 
x509_version version;
gsk_buffer serialNumber;
x509_algorithm_identifier signature;
x509_name issuer;
x509_validity validity;
x509_name subject;
x509_public_key_info subjectPublicKeyInfo;
gsk_bitstring issuerUniqueId;
gsk_bitstring subjectUniqueId;
x509_extensions extensions;
gsk_octet rsvd[16];
} x509_tbs_certificate;

Some elements like x509_algorithm_identifier have a complex structure, which refer to other structures. I think the maximum depth for one of the structures was 6 levels deep.
If you are processing a structure you need to decide how many levels deep you process. For the enumeration I was just interested in the level < 1> and < 2> definitions and ignored any below that depth.

For each < 1> element, there may be zero or more < 2> elements. I added each < 2> element to a Python list within the < 1> element.

You may decide to ignore entries such as which file, row or column a definition is in.

My Python code to parse the file is

fn = "./dwarf.txt"
with open(fn) as fp:
# skip the stuff at the front
for line in fp:
if line[0:5] == "LOCAL":
break
all = {} # return the data here
for line in fp:
# if the line starts with ".debug" we're done
if line[0:6] == ".debug":
break
lhs = line[0:4]
# do not process nested requests
if line[0:1] == "<":
if lhs in ["< 1>","< 2>"]:
keep = True
else:
keep = False
else:
if line[0:1] != " ":
continue
if keep is False: # only within <1> and <2>
continue
# we now have records of interest < 1> and < 2> and records underneath them
if line[0:4] == "< 1>":
key = line[4:16].strip() # "< 123>"
kwds1 = {}
all[key] = kwds1
kwds1["l2"] = [] # empty list to which we add
state = 1 # <1> element
kwds1["type"] = line[20:-1]
elif line[0:4] == "< 2>":
kwds2 = {}
kwds1["l2"].append(kwds2)
kwds2["type"] =line[21:-1]
state = 2 # <2> element
else:
tag = line[0:47].strip()
value = line[47:-1].strip()
if state == 1:
kwds1[tag] = value
else:
kwds2[tag] = value

Process the data

I want to look for the enumerate definitions and only process those

print("=================================")
for e in all:
d = all[e]
if d["type"] == "DW_TAG_typedef": # only these
print(d["DW_AT_name"],"{")
at_type = d["DW_AT_type" ] # get the key of the value
l = all[at_type]["l2"] # and the <2> elements within it
for ll in l:
if "DW_AT_const_value" in ll:
print(' "'+ll["DW_AT_name"]+'":',ll["DW_AT_const_value"])
print("}")

This produced code like

x509_attribute_type = {
"x509_attr_unknown" : 0,
"x509_attr_name" : 1,
"x509_attr_surname" : 2,
"x509_attr_givenName" : 3,
"x509_attr_initials" : 4,
...
}

You can do more advanced things if you want to, for example create structures for Python Struct (struct — Interpret bytes as packed binary data) to build control blocks. With this you can pass in a dict of names, and the struct definitions, and it converts the bytes into the specified definitions ( int, char, byte etc) with the correct bigend/little-end processing etc.

Making WordPress code blocks display the right size

I started using WordPress’s block code with syntax highlighting. Unfortunately the text was being displayed too large.

What I wanted

x509_attribute_type = {
"x509_attr_unknown" : 0,
"x509_attr_name" : 1,
"x509_attr_surname" : 2,
"x509_attr_givenName" : 3,
"x509_attr_initials" : 4,
...
}

What I got

x509_attribute_type = {
"x509_attr_unknown" : 0,
"x509_attr_name" : 1,
"x509_attr_surname" : 2,
"x509_attr_givenName" : 3,
"x509_attr_initials" : 4,
...
}

There were “helpful” suggestions such as changing the CSS, but these did not work for me.
What did work (but involved several clicks) is

  • create a pre-formatted block
  • paste the data
  • in the FONT SIZE box on the right hand side click S
  • go back to your block, and make it into a code block
  • from the pull down list of languages, select the language you want
  • It should not display the code in the right sized font

What the above actions have done is to put <pre class=”wp-block-code has-small-font-size”></pre> around the code snippet, which is another solution.

Using irrseq00 to extract profile information from RACF

IRRSEQ00 also known as R_ADMIN can be used by an application to issue RACF commands, or extract information from RACF. It is used by the Python interface to RACF pysear.

Using this was not difficult – but has it challenges (including a designed storage leak!).

I also had a side visit into That’s strange – the compile worked.

Challenges

The documentation explains how to search through the profiles.

The notes say

When using extract-next to return all general resource profiles for a given class, all the discrete profiles are returned, followed by the generic profiles. An output flag indicates if the returned profile is generic. A flag can be specified in the parameter list to request only the generic profiles in a given class. If only the discrete profiles are desired, check the output flag indicating whether the returned profile is generic. If it is, ignore the entry and terminate your extract-next processing.

  • To search for all of the profiles, specify a single blank as the name, and use the extract_next value.
  • There are discrete and generic profiles. If you specify flag bit 0x20000000 For extract-next requests: return the next alphabetic generic profile. This will not retrieve discrete profiles. If you do not specify this bit, you get all profiles.

This is where it gets hard.

  • The output is returned in a buffer allocated by IRRSEQ00. This is the same format as the control block used to specify parameters. After a successful request, it will contain the profile, and may return all of the segments (such as a userid’s TSO segment depending on the option specified).
  • Extract the information you are interested in.
  • Use this data as input to the irrseq00 call. I set used pParms = buffer;
  • After the next IRRSEQ00 request FREE THE STORAGE pointed to by pParms.
  • Use the data returned to you in the new buffer.
  • Loop

The problems with this are

The documentation says

The output storage is obtained in the subpool specified by the caller in the Out_message_subpool parameter. It is the responsibility of the caller to free this storage.

I do not know how to issued a FREEMAIN/STORAGE request from a C program! Because you cannot free the z/OS storage from C, you effectively get a storage leak!

I expect the developers did not think of this problem. Other RACF calls get the back in the same control block, and you get a return code if the control block is too small.

I solved this by having some assembler code in my C program see Putting assembler code inside a C program.

My program

Declare constants

 #pragma linkage(IRRSEQ00 ,OS) 
// Include standard libraries */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <inttypes.h>

int main( int argc, char *argv??(??))
{
// this structure taken from pysear is the parameter block
typedef struct {
char eyecatcher[4]; // 'PXTR'
uint32_t result_buffer_length; // result buffer length
uint8_t subpool; // subpool of result buffer
uint8_t version; // parameter list version
uint8_t reserved_1[2]; // reserved
char class_name[8]; // class name - upper case, blank pad
uint32_t profile_name_length; // length of profile name
char reserved_2[2]; // reserved
char volume[6]; // volume (for data set extract)
char reserved_3[4]; // reserved
uint32_t flags; // see flag constants below
uint32_t segment_count; // number of segments
char reserved_4[16]; // reserved
// start of extracted data
char data[1];
} generic_extract_parms_results_t;
// Note: This structure is used for both input & output.

Set up the irrseq00 parameters

I want to find all profiles for class ACCTNUM. You specify a starting profile of one blank, and use the get next request.

char work_area[1024]; 
int rc;
long SAF_RC,RACF_RC,RACF_RS;
long ALET = 0;

char Function_code = 0x20; // Extract next general resource profile
// RACF is ignored for problem state
char RACF_userid[9];
char * ACEE_ptr = 0;
RACF_userid[0]=0; // set length to 0

char Out_message_subpool = 1;
char * Out_message_string; // returned by program

generic_extract_parms_results_t parms;
memset(&parms,0,sizeof(parms));
memcpy(&parms.eyecatcher,"PXTR",4);
parms.version = 0;
memcpy(&parms.class_name,"ACCTNUM ",8);
parms.profile_name_length = 1;
parms.data[0] =' ';

char *pParms = (char *) & parms;
Function_code = 0x20; // get next resource
int i;
generic_extract_parms_results_t * pGEP;

Loop round getting the data

I knew there were only 3 discrete profiles and one generic (with a “*” in it) .

For extract-next requests, SAF_RC = 4, RACFRC = 4 and RACFRS = 4, means. there are no more profiles that the caller is authorised to extract.

 for (i=0;i < 6;i++)
{
parms.flags = 0x04000000; // get next + base only
rc=IRRSEQ00(
&work_area,
&ALET , &SAF_RC,
&ALET , &RACF_RC,
&ALET , &RACF_RS,
&Function_code,
pParms,
&RACF_userid,
&ACEE_ptr,
&Out_message_subpool,
&Out_message_string
);
pParms = Out_message_string;

pGEP = (generic_extract_parms_results_t *) pParms;
if (RACF_RC == 0 )
{
printf("return code SAF %d RACF %d RS %d %2.2x %8.8x %*.*s \n",
SAF_RC,RACF_RC,RACF_RS, Function_code, pGEP-> flags,
pGEP->profile_name_length,
pGEP-> profile_name_length, pGEP->data);
}
else
{
printf("return code SAF %d RACF %d RS %d \n",
SAF_RC,RACF_RC,RACF_RS );
break;
}
}
return 8;
}

The results were

return code SAF 0 RACF 0 RS 0 20 00000000  ACCT# 
return code SAF 0 RACF 0 RS 0 20 00000000 IZUACCT
return code SAF 0 RACF 0 RS 0 20 10000000 TESTGEN*
return code SAF 4 RACF 4 RS 4

For TESTGEN* the flag is 0x10 which is On output: indicates that the profile returned by RACF is generic. When using extract-next to cycle through profiles, the caller should not alter this bit. For the others this bit is off, meaning the profiles are discrete.

Putting assembler code inside a C program

I was using a RACF service, and the documentation casually says “the application must free the returned storage”.

C does not provide facilities to use the STORAGE RELEASE z/OS service, so I had to write some assembler code to do this. It wasn’t difficult, I just fell over many little problems.

The IBM extension to the C language ASM is defined in the documentation.

I think of asm() as a special macro.

  • You pass in strings of assembler code. asm() will format it, and split the line if it would be longer than 72 characters.
  • If you want multiple lines you should end each line with \n (standard printf in C).
  • Lines starting with “*” are treated as comments, and produce no code
  • You can start your line with just one blank, before the instruction and it will be formatted as properly aligned assembler code, wrapping onto new lines if needed.
  • You do not use the C variable names in the assembler source.
    • You specified a mapping like [length] “m”(lDeletestorage), it takes the C variable lDeletestorage, determines its storage address such as it is offset 40 from register 5 (40(5)) and assigns this value to “length”
    • In the assembler code you specify ” L 2,%[length] \n” and it substitutes the value for %[length]. This instruction becomes ” L 2,40(5) “
    • At first glance it you would think it should be able to substitute the value directly, but you need to specify meta information about the field (is it used read only or read/write, is it a storage location, or a literal string). This is why there is this indirection
  • You need to specify which registers are modified so the C code can either save/restore their values, or just use different registers.

My code

if ( pDeletestorage != 0 )
{
int freerc = 0;
#define ASM_PREFIX " SYSSTATE ARCHLVL=2 \n"
asm(
ASM_PREFIX
" L 2,%[length] \n"
" L 4,%[a] \n"
" STORAGE RELEASE,LENGTH=(2),ADDR=(4),COND=YES,RTCD=%[rc] "
: [rc] "+m"(freerc) //* output
: [length] "m"(lDeletestorage,
[a] "m"(pDeletestorage)
:"r0", "r1" , "r2" , "r4" , "r15" );
printf("Storage release rc %i\n",rc);
}
  • if ( pDeletestorage != 0 ) If there is a block of storage to release
  • {
  • int freerc = 0; define and preset the return code
  • #define ASM_PREFIX ” SYSSTATE ARCHLVL=2 \n” see below. The \n makes a new line
  • asm( generate the code
  • ASM_PREFIX insert the SYSTATE code
  • ” L 2,%[length] \n” Load register 2 with the value of the variable length defined below
  • ” L 4,%[a] \n” Load register 4 with the value of the variable a(ddress) defined below
  • ” STORAGE RELEASE,LENGTH=(2),ADDR=(4),COND=YES,RTCD=%[rc] “ issue the storage macro
  • : After the first : is a list of output variables
    • [rc] “+m”(freerc) //* output [rc] matches the name in the RTCD=%[rc]
      • “+” means Indicates that the operand is both read and written by the instruction
      • “m” is use a memory object – that is, a variable. Other options are literal values or registers.
      • (freerc) is the name of the C variable
  • : After the second : is a list of input variables
    • [length] “m”(lDeletestorage),
    • [a] “m”(pDeletestorage)
  • : After the third : is a list of register that are changed (clobbered)
    • “r0”, “r1” , “r2” , “r4” , “r15”
  • );
  • printf(“Storage release rc %i”,rc);
  • }

You have to pass the data to the macro using registers. You can either select registers yourself, or let the compiler pick them for you. See Using registers with my program as a more correct way of coding the program.

Note: If you pick the registers yourself, they may be unavailable if you compile it as a 64 bit program or as XPLINK, so using Using registers with my program is better.g

With the SYSSTATE ARCHLVL=2

This statement sets the minimum architecture to  z/Architecture (which includes Jump instructions).

The generated code was

* SYSSTATE ARCHLVL=2                                              
* THE VALUE OF SYSSTATE IS NOW SET TO ASCENV=P AMODE64=NO ARCHLVX01-SYSS
* L=2 OSREL=00000000 RMODE64=NO
L 2,1368(13)
L 4,1364(13)

STORAGE RELEASE,LENGTH=(2),ADDR=(4),COND=YES,RTCD=1372(13)
LR 0,2 .STORAGE LENGTH
LR 1,4 .ADDRESS OF STORAGE
LHI 15,X'0001' .Add in parameters
L 14,16(0,0) .CVT ADDRESS
L 14,772(14,0) .ADDR SYST LINKAGE TABLE
L 14,204(14,0) .OBTAIN LX/EX FOR RELEASE
PC 0(14) .PC TO STORAGE RTN
ST 15,1372(13) .SAVE RETURN CODE

Where

  L     2,1368(13)       
L 4,1364(13)

is loading the registers with the variable data from the C code,

and

  ST     15,1372(13)                  .SAVE RETURN CODE      

saves the return code into the C variable.

Without the SYSSTATE ARCHLVL=2

The code failed to compile because of addressability issues.

         L     2,1368(13)                                              
L 4,1364(13)
STORAGE RELEASE,LENGTH=(2),ADDR=(4),COND=YES,RTCD=1372(13)
CNOP 0,4
B IHB0001B .BRANCH AROUND DATA
*** ASMA307E No active USING for operand IHB0001B
IHB0001F DC BL1'00000000'
DC AL1(0*16) .KEY
DC AL1(0) .SUBPOOL
DC BL1'00000001' .FLAGS
IHB0001B DS 0F
LR 0,2 .STORAGE LENGTH
LR 1,4 .ADDRESS OF STORAGE
L 15,IHB0001F .CONTROL INFORMATION
*** ASMA307E No active USING for operand IHB0001F
L 14,16(0,0) .CVT ADDRESS
L 14,772(14,0) .ADDR SYST LINKAGE TABLE
L 14,204(14,0) .OBTAIN LX/EX FOR RELEASE
PC 0(14) .PC TO STORAGE RTN
ST 15,1372(13) .SAVE RETURN CODE

In days of old you had base registers to locate data and instructions. Data reference was relative to (USING) a base register. Branching within a program used one or more base register. A re-entrant program would get dynamic storage, and this would be addressed by its own base register.

Modern instructions include the Jump instruction. This says jump this many half words to the instruction. These instructions do not need a base register.

With SYSSTATE ARCHLVL=2, the generated code for the C part of the program used the jump instructions, and did not need a base register. Assembler macros that generate code the old way, need a base register.

Using old instruction, to load a value into a register, it was loaded from storage – which needed a base register to locate the storage. For example

IHB0001F DC     BL1'00000000'                                          
DC AL1(0*16) .KEY
DC AL1(0) .SUBPOOL
DC BL1'00000001' .FLAGS
...
L 15,IHB0001F .CONTROL INFORMATION

Modern instructions have the “constant” value as part of the instruction

 LHI    15,X'0001'             

This effectively clears register 15 and loads the value 0x0001 into the bottom. (To be accurate it moves 0x0001 to the bottom half, then propagates the sign bit to the upper half word. The sign bit is 0.)

Using registers

The z/OS uses registers 14,15,0,1 for linkage, and these are likely to be used when using macros. Register 3 was used by the C code to locate storage, so I used registers 2 and 4.

Using variables

Reading the documentation, it seems I should be able to say

 " STORAGE RELEASE,LENGTH=%[length],ADDR=(4),COND=YES,RTCD=%[rc] "
: ...
: [length] "m"(lDeletestorage),
[a] "m"(pDeletestorage)
:...

passing in a variable. This does not work, because C does not pass in “lDeletestorage”, but the offset and register.

For example the storage for variable lDeleteme is 1368 off register 13.

It can be used in a Load instruction

  L     2,1368(13)  

But not every where. In the macro it get substituted.

STORAGE RELEASE,LENGTH=1368(13),ADDR=(4),COND=YES,RTCD=1372(13)
L 0,=A(1368(13)) .STORAGE LENGTH
*** ASMA035S Invalid delimiter - 1368(13)

Passing back the return code using RTCD=%[rc]. The macro checks to see if the value of RTCD is a register, if not then use the value in a store instruction.

Using registers

You can specify that you want to use a register, and the ASM() will pick registers for you.

 int res = 0x12345678;        
int newRes = 55;
__asm(" SR %[rx],%[rx] clear \n"
" SR %[rx],%[ry] them \n"
" LR %[rx],%[ry] COLINS\n"

: [rx]"=r"(res) // output
: [ry]"r"(newRes) // input
);

This generates code

     L        r4,newRes(,r13,1380) 
* SR 2,2 clear
* SR 2,4 them
* LR 2,4 COLINS
SR r2,r2
SR r2,r4
LR r2,r4

LR r0,r2
ST r0,res(,r13,1376)

It has selected to use registers 2 and 4.

Where

  • L r4,newRes(,r13,1380) loads the input value into a register of its choice
  • *… these are comments of the coded instructions
  • SR.. LR.. are the generated instructions
  • LR r0,r2 , ST r0,res(,r13,1376) saves the variable defined as output back into C storage.
Using registers with my program

The compiler decides which registers to user ( r2 and r4). When I compiled this code as 64 bit (using option lp64) it used registers 6 and 7.

asm( ASM_PREFIX 
" STORAGE RELEASE,LENGTH=(%[length]),ADDR=(%[a]),COND=YES,RTCD=%[rc] "
: [rc] "+m"(freerc) //* output
: [length] "r"(lDeleteme),
[a] "r"(pDeleteme)
:"r0", "r1" , "r15" );

This generates

 *          asm( 
L r2,lDeleteme(,r13,1368)
L r4,pDeleteme(,r13,1364)
SYSSTATE ARCHLVL=2

STORAGE RELEASE,LENGTH=(2),ADDR=(4),COND=YES,RTCD=1372(13)
LR 0,2 .STORAGE LENGTH
LR 1,4 .ADDRESS OF STORAGE
LHI 15,X'0001' .Add in parameters @PCA
L 14,16(0,0) .CVT ADDRESS
L 14,772(14,0) .ADDR SYST LINKAGE TABLE
L 14,204(14,0) .OBTAIN LX/EX FOR RELEASE
PC 0(14) .PC TO STORAGE RTN
ST 15,1372(13) .SAVE RETURN CODE

Using the different data types.

This is not explained very well in the documentation.

Using literal integer constants

The code

asm( "LABEL LA     4,%[i] \n" 
:
: [i] "i"(99)
: "r4" );

Generates

 * LABEL    LA    4,99                    Colins             
LA r4,99

The label I specified on the instruction is not added to the created instruction.

Using literal string constants

Instead of using this capability, you could use C constants, and the “m” definition.

The code

asm( "LABEL LA     4,%[i] Colins \n" 
:
: [i] "i"("ABCD")
: "r4" );

Gives a compile error

10 LABEL    LA    4,ABCD                  Colins    
*** ASMA044E Undefined symbol - ABCD

The code

asm("LABEL LA     4,%[i] Colins \n" 
:
: [i] "i"("=C'ABCD'")
: "r4" );

Generates

* LABEL    LA    4,=C'ABCD'              Colins          
LA r4,0(,r3)
...
Start of ASM Literals
000360 C1C2C3C4 =C'ABCD'

Using a memory operand that is offsetable (o)

The code

asm("LABEL LA     4,%[i] Colins \n" 
"LABEL2 LA 4,%[j] Colins2\n"
:
: [i] "o"(res),
[j] "m"(res)
: "r4" );

produced

* LABEL    LA    4,1376(13)              Colins      
* LABEL2 LA 4,1376(13) Colins2
LA r4,1376(r13,)
LA r4,1376(r13,)

So it looks like the “o” operand is the same as specifying “m”.

64 bit programming

When you compile in 64 bit code (option lp64). The asm() works just as well It is better to use Using registers with my program so you do not have to guess which registers you can use.

My code

asm( ASM_PREFIX 
" STORAGE RELEASE,LENGTH=(%[l]),ADDR=(%[a]),COND=YES,RTCD=%[rc] "
: [rc] "+m"(freerc) //* output
: [l] "r"(lDeleteme),
[a] "r"(pDeleteme)
:"r0", "r1" , "r15"
);

Generated

    LLGF     r6,lDeleteme(,r4,3504) 
LG r7,pDeleteme(,r4,3496)
SYSSTATE ARCHLVL=2
...

Where the LLGF loads the 32 bit length variable into register 6, and LG loads the 64 bit address variable into register 7.

Compiling the code

I used the shell script

name=irrseq
export _C89_CCMODE=1
p1="-Wc,arch(8),target(zOSV2R3),list,source,ilp32,gonum,asm,float(ieee)"
p5=" -I. "


p7="-Wc,ASM,ASMLIB(//'SYS1.MACLIB') -Wa,LIST,RENT"

p8="-Wc,LIST(c.lst),SOURCE,NOWARN64,XREF,SHOWINC "

xlc $p1 $p5 $p7 $p8 -c $name.c -o $name.o

l1="-Wl,LIST,MAP,XREF "
/bin/xlc $name.o -o irrseq -V $l1 1>a

The important line, p7, contains

  • -Wc,ASM,ASMLIB(//’SYS1.MACLIB’) C compile options
    • ASM Enables inlined assembly code inside C/C++ programs.
    • ASMLIB Specifies assembler macro libraries to be used when assembling the assembler source code.
  • -Wa,LIST,RENT” Assembler options
    • LIST Instructs the assembler to produce a listing
    • RENT Specifies that the assembler checks for possible coding violations of program reenterability

Can I automate new SSH connections to z/OS?

I’m running on Linux, and using remote z/OS systems. Being from a performance background I hate having to waste seconds, manually starting SSH sessions to my backend systems.

I found I can automate this!

From my gnome-terminal I can issue the command

gnome-terminal --tab --working-directory=~ --title=COLINS --profile=blue  -- ssh colin@10.1.1.2 

This

  • created a new terminal session as a tab in the existing terminal window
  • did a cd ~
  • called the tab COLINS
  • selected the profile called blue (go to the hamburger of your current terminal and select profile to see what profiles you have)
  • executes the command after the ‐‐, so executes ssl colin@10.1.1.1

You can issue the command

gnome-terminal --help-all

to get a list of all options.

That’s strange – the compile worked.

I was setting up a script to compile some C code in Unix Services, and it worked – when I expected the bind to fail because I had not specified where to find a stub file.

How to compile the source

I used a shell script to compile and bind the source. I was surprised to see that it worked, because it needed some Linkedit stubs from CSSLIB. I thought I needed

export _C89_LSYSLIB=”CEE.SCEELKEX:CEE.SCEELKED:CBC.SCCNOBJ:SYS1.CSSLIB”

but it worked without it.

The script

name=irrseq 

export _C89_CCMODE=1
# export _C89_LSYSLIB="CEE.SCEELKEX:CEE.SCEELKED:CBC.SCCNOBJ:SYS1.CSSLIB"
p1="-Wc,arch(8),target(zOSV2R3),list,source,ilp32,gonum,asm,float(ieee)"
p5=" -I. "
p8="-Wc,LIST(c.lst),SOURCE,NOWARN64,XREF,SHOWINC "

xlc $p1 $p5 $p8 -c $name.c -o $name.o
# now bind it
l1="-Wl,LIST,MAP,XREF "
/bin/xlc $name.o -o irrseq -V $l1 1>binder.out

The binder output had

XL_CONFIG=/bin/../usr/lpp/cbclib/xlc/etc/xlc.cfg:xlc 
-v -Wl,LIST,MAP,XREF irrseq.o -o./irrseq
STEPLIB=NONE
_C89_ACCEPTABLE_RC=4
_C89_PVERSION=0x42040000
_C89_PSYSIX=
_C89_PSYSLIB=CEE.SCEEOBJ:CEE.SCEECPP
_C89_LSYSLIB=CEE.SCEELKEX:CEE.SCEELKED:CBC.SCCNOBJ:SYS1.CSSLIB

Where did these come from? – I was interested in SYS1.CSSLIB. It came from xlc config file below.

xlc config file

By default the compile command uses a configuration file /usr/lpp/cbclib/xlc/etc/xlc.cfg .

The key parts of this file are

* FUNCTION: z/OS V2.4 XL C/C++ Compiler Configuration file
*
* Licensed Materials - Property of IBM
* 5650-ZOS Copyright IBM Corp. 2004, 2018.
* US Government Users Restricted Rights - Use, duplication or
* disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
*

* C compiler, extended mode
xlc: use = DEFLT
...* common definitions
DEFLT: cppcomp = /usr/lpp/cbclib/xlc/exe/ccndrvr
ccomp = /usr/lpp/cbclib/xlc/exe/ccndrvr
ipacomp = /usr/lpp/cbclib/xlc/exe/ccndrvr
ipa = /bin/c89
as = /bin/c89
ld_c = /bin/c89
ld_cpp = /bin/cxx
xlC = /usr/lpp/cbclib/xlc/bin/xlc
xlCcopt = -D_XOPEN_SOURCE
sysobj = cee.sceeobj:cee.sceecpp
syslib = cee.sceelkex:cee.sceelked:cbc.sccnobj:sys1.csslib
syslib_x = cee.sceebnd2:cbc.sccnobj:sys1.csslib
exportlist_c = NONE
exportlist_cpp = cee.sceelib(c128n):cbc.sclbsid(iostream,complex)
exportlist_c_x = cee.sceelib(celhs003,celhs001)
exportlist_cpp_x = ...
exportlist_c_64 = cee.sceelib(celqs003)
exportlist_cpp_64 = ...
steplib = NONE

Where the _x entries are for xplink.

It is easy to find the answer when you know the solution.

Note:

Without the export _C89_CCMODE=1

I got

IEW2763S DE07 FILE ASSOCIATED WITH DDNAME /0000002 CANNOT BE OPENED
BECAUSE THE FILE DOES NOT EXIST OR CANNOT BE CREATED.
IEW2302E 1031 THE DATA SET SPECIFIED BY DDNAME /0000002 COULD NOT BE
FOUND, AND THUS HAS NOT BEEN INCLUDED.
FSUM3065 The LINKEDIT step ended with return code 8.

Compiling in 64 bit

It was simple to change the script to compile it in 64 bit mode, but overall this didn’t work.

p1="-Wc,arch(8),target(zOSV2R3),list,source,lp64,gonum,asm,float(ieee)" 
...
l1="-Wl,LIST,MAP,XREF -q64 "

When I compiled in 64 bit mode, and tried to bind in 31/32 bit mode (omitting the -q64 option) I got messages like

 IEW2469E 9907 THE ATTRIBUTES OF A REFERENCE TO isprint FROM SECTION irrseq#C DO
NOT MATCH THE ATTRIBUTES OF THE TARGET SYMBOL. REASON 2
...
IEW2469E 9907 THE ATTRIBUTES OF A REFERENCE TO IRRSEQ00 FROM SECTION irrseq#C
DO NOT MATCH THE ATTRIBUTES OF THE TARGET SYMBOL. REASON 3

IEW2456E 9207 SYMBOL CELQSG03 UNRESOLVED. MEMBER COULD NOT BE INCLUDED FROM
THE DESIGNATED CALL LIBRARY.
...
IEW2470E 9511 ORDERED SECTION CEESTART NOT FOUND IN MODULE.
IEW2648E 5111 ENTRY CEESTART IS NOT A CSECT OR AN EXTERNAL NAME IN THE MODULE.

IEW2469E THE ATTRIBUTES OF A REFERENCE TO … FROM SECTION … DO
NOT MATCH THE ATTRIBUTES OF THE TARGET SYMBOL. REASON x

  • Reason 2 The xplink attributes of the reference and target do not match.
  • Reason 3 Either the reference or the target is in amode 64 and the amodes do not match. The IRRSEQ00 stub is only available in 31 bit mode, my program was 64 bit amode.

IEW2456E SYMBOL CELQINPL UNRESOLVED. MEMBER COULD NOT BE INCLUDED
FROM THE DESIGNATED CALL LIBRARY.

  • The compile in 64 bit mode generates an “include…” of the 64 bit stuff needed by C. Because the binder was in 31 bit, it used the 31 bit libraries – which did not have the specified include file. When you compile in 64 bit mode you need to bind with the 64 bit libraries. The compile command sorts all this out depending on the options.
  • The libraries used when binding in 64 bit mode are syslib_x = cee.sceebnd2:cbc.sccnobj:sys1.csslib. See the xlc config file above.

IEW2470E 9511 ORDERED SECTION CEESTART NOT FOUND IN MODULE.
IEW2648E 5111 ENTRY CEESTART IS NOT A CSECT OR AN EXTERNAL NAME IN THE MODULE.

  • Compiling in 64 bit mode, generates an entry point of CELQSTRT instead of CEESTART, so the binder instructions for 31 bit programs specifying the entry point of CEESTART will fail.

Overall

Because IRRSEQ00 only comes in a 31 bit flavour, and not a 64 bit flavour, I could not call it directly from a 64 bit program, and I had to use a 32 bit compile and bind.

Accessing SMF Real Time data.

The traditional way of processing SMF data (product statistics, and audit information), is to post-process SMF datasets. This might be done hourly, or daily, (or more frequently). This means there is a delay between the records being created, and being available for processing.

With SMF Real Time, you can connect an application program to SMF, and get records from the SMF buffers, as the records are created.

Configuring SMF

SMF needs to be in logstream mode. See Many is so last year – logstreams is the way to go.

You need to configure SMF to generate the records. See the SMFPRMxx parameter in parmlib.

I created an entry dynamically using

setsmf inmem(IFASMF.INMEM,RESSIZMAX(128M),TYPE(30,42))   

Note: 128M is the smallest buffer size.

The IBM documentation Defining in-memory resources covers various topics.

Displaying information

The command

D SMF

gave me

IFA714I 11.46.50 SMF STATUS 101                
LOGSTREAM NAME BUFFERS STATUS
A-IFASMF.DEFAULT 0 CONNECTED
A-IFASMF.COLIN 0 CONNECTED
A-IFASMF.INMEM 4826066 IN-MEMORY

The command

 D SMF,M

Gave showed my Real time, in Memory resource in use

d smf,m                                                   
IFA714I 11.48.15 SMF STATUS 109
IN MEMORY CONNECTIONS
Resource: IFASMF.INMEM
Con#: 0001 Connect Time: 2026.019 10:07:20
ASID: 004B
Con#: 0002 Connect Time: 2026.019 11:48:10
ASID: 0049

The Application Programming Interface.

The API is pretty easy to use. I based my C application on the IBM example.

I called my program from Python, so that was an extra challenge.

Query

You can issue the query API request. This returns the name of the INMEM definitions available to you, and the SMF record types in the definition.

Capture the data

You need to issue

  • connect, passing the name of the INMEM definition. It returns a 16 byte token. Once the connect has completed successfully, SMF will capture the data in a buffer.
  • get, passing the token. You can specify a flag saying blocking – so the thread waits until data is available. You do not get records from before the connect.
  • If there is too much data for your application to process – or your application is slow to process the data, SMF will wrap the data, and so lose records. The application will get return code IFAINMMissedData (Meaning: Records were skipped due to buffer re-use—that is, wrapping of the data in the in-memory resource. In this case, the output buffer might not contain a valid record.) You should reissue the get.
  • disconnect, passing the token. The disconnect can be done on a different thread. If so, it notifies any thread in a blocking get request, which gets a return code IFAINMGetForcedOut.

Problems

The problems I originally had were that my SMF was not running in log stream mode.
Once I set this up, I could get data back.

I set up INMEM record for SMF 30 records, and although I submitted some batch jobs, I did not get any SMF 30 records in my program.
If I logged off TSO, I got a record. If I issued tso omvs from ISPF I got records.

I added

SUBSYS(JES2) 

to my SMFPRMLS member, and I got SMF 30 records for batch jobs.

I later changed this to be

SUBSYS(JES2,EXITS(IEFU29,IEFU83,IEFU84,IEFUJP,IEFUSO))

to be consistent wit the SUBSYS(STC… parameter)
I got SMF 30 records when logging on using SSH, from using TSO OMVS, or spawning a thread in OMVS to run in the background, for example ls &

It is curious that I do not have SUBSYS(TSO) defined – but I get entries for TSO usage.

It is OK, but…

The code works and generates records. One problem I have is how to stop my program running.

You could use a non blocking call, loop around getting records until you get no records available, then return, do an external wait, and then reloop. This puts the control in your application, but does use CPU as it loops periodically (every second perhaps) looking for records.

You could use a blocking call where the request waits until a record is available, or another thread issues the disconnect call. This means an extra programming challenge creating a thread for the blocking request to run off, and another thread to handle the disconnect request.

The first case, non blocking case, feels easier to code – but at the cost of higher CPU.

Python calling C functions

  1. You can have Python programs which are pure Python.
  2. You can call C programs that act like Python programs, using Python constructs within the C program
  3. You can call a C program from Python, and it processes parameters like a normal C program.

This blog post is about the third, calling a C program from Python, passing simple data types such as char, integers and strings.

I have based a lot of this on the well written pyzfile package by @daveyc.

The glue that makes it work is the ctypes package a “foreign function library” package.

Before you start

The blog post is called “Python calling C functions”. I tried using a z/OS stub code directly. This is not written in C, and I got.

CEE3595S DLL ... does not contain a CELQSTRT CSECT.

Which shows you must supply a C program.

The C program that I wrote, calls z/OS services. These must be defined like (or default to)

#pragma linkage(...,OS64_NOSTACK)     

Getting started

My C program has several functions including

int query() {  
return 0;
}

The compile instructions said exportall – so all functions are visible from outside of the load module.

You access this from Python using code like

lib_file = pathlib.Path(__file__).parent / "pySMFRealTime.so"
self.lib = ctypes.CDLL(str(lib_file))
...
result = self.lib.query()

Where

  • lib_file = pathlib.Path(__file__).parent / “pySMFRealTime.so” says get the file path of the .so module in the same directory as the current Python file.
  • self.lib = ctypes.CDLL(str(lib_file)) load the file and extract information.
  • result = self.lib.query() execute the query function, passing no parameters, and store any return code in the variable result

Passing simple parameters

A more realistic program, passing parameters in, and getting data back in the parameters is

int conn(const char* resource_name,  // input:  what we want to connect to
char * pOut, // output: where we return the handle
int * rc, // output: return code
int * rs, // output: reason code
int * debug) // input: pass in debug information
{
int lName = strlen(resource_name);
if (*debug >= 1)
{
printf("===resource_namen");
printHex(stdout,pFn,20);
}
...
return 0;
}

The Python code has

lib_file = pathlib.Path(__file__).parent / "pySMFRealTime.so"
self.lib = ctypes.CDLL(str(lib_file))
self.lib.conn.argtypes = [c_char_p, # the name of stream
c_char_p, # the returned buffer
ctypes.POINTER(ctypes.c_int), # rc
ctypes.POINTER(ctypes.c_int), # rs
ctypes.POINTER(ctypes.c_int), # debug
]
self.lib.conn.restype = c_int

The code to do the connection is

def conn(self,name: str,):
token = ctypes.create_string_buffer(16) # 16 byte handle
rc = ctypes.c_int(0)
rs = ctypes.c_int(0)
debug = ctypes.c_int(self.debug)
self.token = None
retcode = self.lib.conn(name.encode("cp500"),
token,
rc,
rs,
debug)
if retcode != 0:
print("returned rc",rc, "reason",rs)
print(">>>>>>>>>>>>>>>>> connect error ")
return None
print("returned rc",rc, "reason",rs)
self.token = token
return rc

The code does

  • def conn(self,name: str,): define the conn function and pass in the variable name which is a string
  • token = ctypes.create_string_buffer(16) # 16 byte handle create a 16 byte buffer and wrap it in ctypes stuff.
  • rc = ctypes.c_int(0), rs = ctypes.c_int(0), debug = ctypes.c_int(self.debug) create 3 integer variables.
  • self.token = None preset this
  • retcode = self.lib.conn( invoke the conn function
    • name.encode(“cp500”), convert the name from ASCII (all Python printable strings are in ASCII) to code page 500.
    • token, the 16 byte token defined above
    • rc, rs, debug) the three integer variables
  • if retcode != 0: print out error messages
  • print(“returned rc”,rc, “reason”,rs) print the return and reason code
  • self.token = token save the token for the next operation
  • return rc return to caller, with the return code.

Once I got my head round the ctypes… it was easy.

The C program

There are some things you need to be aware of.

  • Python is compiled with the -qascii compiler option, so all strings etc are in ASCII. The code name.encode(“cp500”), converts it from ASCII to EBCDIC. The called C program sees the data as a valid EBCDIC string (null terminated).
  • If a character string is returned, with printable text. Either your program coverts it to ASCII, or your Python calling code needs to convert it.
  • Your C program can be compiled with -qascii – or as EBCDIC(no -qascii)
    • Because Python is compiled in ASCII, the printf routines are configured to print ASCII. If your program is compiled as ASCII, printf(“ABCD”) will print as ABCD. If your program is compiled as EBCDIC printf(“ABCD”) will print garbage – because the hex values for EBCDIC ABCD are not printable as ASCII characters.
    • If your program is compiled as ASCII you can define EBCDIC constants.
      • #pragma convert(“IBM-1047”)
      • char * pEyeCatcher = “EYEC”; // EBCDIC eye catcher for control block
      • #pragma convert(pop)