One minute MVS: Binder and loader

This post is part of the “One minute MVS” series, which gives the essentials of a topic.

Your program

The use of functions or subroutines is very common in programming. For a simple call

x = mysub()

which calls an external function, mysub, the generated code looks like

MYPROG   CSECT
         L     15,mysub       load the address of the function
         LA    1,PARMLIB      register 1 points to the parameters
         BASR  14,15          or BALR in older programs
...
mysub    DC    V(MYSUB)

where

  • MYPROG is an entry point to the program
  • the mysub variable defines some storage for the external reference (ER) to MYSUB.

The output of the assembler or compiler is a file or dataset member, known as an “object deck” or “object file”. It cannot be executed, because it does not yet contain the external functions or subroutines it refers to.

The binder (or linkage editor)

The binder program takes object decks, includes any subroutines and external code and creates a load module or program object.

In the early days, load modules were stored in PDS datasets. The directory entry of a member held information about the size of the load module and its entry point. As the binder became more sophisticated, the directory did not have enough space for all of the data that was created. As a result PDSEs (Extended PDSs) were created, which have an extendable directory entry. For Unix Services, load modules are stored as files in the Unix file system.

The term Program Object is used to cover load modules and files in the Unix file system. I still think of them both as Load Modules.

The binder takes the parts needed to create the program object, for example functions you created and stored in a PDS or in Unix files, and includes library code, for example for the printf() function. These are merged into one file.

Pictorially the merged files look like

  • Offset 0 some C code.
  • Offset 200 MYPROG Object
    • Offset 10 within MYPROG, MYPROG entry point (so offset 210 from the start of the merged files)
    • Offset 200 within MYPROG, mysub:V(MYSUB)
    • Offset 310 within MYPROG end of MYPROG
  • Offset 512 FUNCTION1 object
  • Offset 800 MYSUB1 Object
    • Offset 28 within MYSUB1, MYSUB entry point
    • Offset 320 within MYSUB1, end of MYSUB

The binder can now resolve references. It knows that MYSUB entry point is at offset 28 within MYSUB1 object, and MYSUB1 Object is 800 from the start of the combined files. It can now replace the mysub:V(MYSUB) in MYPROG with the offset value 828.
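The arithmetic above can be sketched in C. This is only an illustration of the idea, not how the binder is implemented: the struct placed_object, the table merged and the helper resolve_entry are hypothetical names, and the table holds just the two objects from the picture whose entry point offsets are given.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One object placed in the merged module (values from the picture above). */
struct placed_object {
    const char *name;         /* object name, for example "MYSUB1"            */
    unsigned long base;       /* offset of the object from the start          */
    const char *entry_name;   /* entry point defined in the object            */
    unsigned long entry_off;  /* offset of the entry point within the object  */
};

static const struct placed_object merged[] = {
    { "MYPROG", 200, "MYPROG", 10 },
    { "MYSUB1", 800, "MYSUB",  28 },
};

/* Hypothetical helper: look up an entry point and compute its offset from
   the start of the merged module. Returns 1 if found, 0 if unresolved. */
static int resolve_entry(const struct placed_object *objs, size_t count,
                         const char *symbol, unsigned long *out)
{
    assert(objs != NULL && symbol != NULL && out != NULL);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(objs[i].entry_name, symbol) == 0) {
            *out = objs[i].base + objs[i].entry_off;
            return 1;
        }
    }
    return 0;   /* an unresolved external reference */
}
```

Calling resolve_entry(merged, 2, "MYSUB", &offset) sets offset to 800 + 28 = 828, the value stored for V(MYSUB).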

The entire file is stored as one load module (program object), with a name that you give it, for example COLIN.

The loader

When load module COLIN is loaded, the loader reads the load module from disk into memory, for example at address 200,000. As part of the loading, it looks at the external references and calculates each address in memory from the offset value. So 200,000 + offset 828 is address 200,828. This value is stored in the mysub variable.

When the function is about to be called, L 15,mysub loads register 15 with the address of the code in memory, and the program can branch to it.
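The loader's adjustment can be sketched in the same way. This is a simplified model, assuming the address constants that need adjusting have been collected into an array; the function name relocate is hypothetical and the real loader works from relocation information stored in the module.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical relocation step: each address constant holds an offset from
   the start of the module, and becomes a real address once the load address
   is added to it. */
static void relocate(unsigned long *adcons, size_t count,
                     unsigned long load_address)
{
    assert(adcons != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        adcons[i] += load_address;
    }
}
```

With one address constant holding 828 and a load address of 200000, the constant becomes 200828 after relocate() is called.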

It gets more complex than this

Consider two source programs

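/* First source program */
int value = 0;
int total = 0;
void printTotal(void);
int main(void)
{
  value = 1;
  total = total + value;
  printTotal();
  return 0;
}

/* Second source program */
#include <stdio.h>
extern int total;   /* the same variable as total in the first program */
int done;
void printTotal(void)
{
  printf("Total = %d\n",total);
  done = 1;
}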

There are some global static variables. The variable “total” is used in each one – it is the same variable.

These programs are defined as being re-entrant, and could be loaded into read only storage.

The variables “value”, “total” and “done” cannot go into read only storage, as they change during the execution of the program.

There are three global variables: “value”, “total” and “done”; total is common to both programs.

These variables go into a storage area called Writeable Static Area (WSA).

If there are multiple threads running the program, each gets its own copy of the WSA, but they can all share the same instructions.
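The split between shared instructions and separate writeable data can be pictured with a small C sketch. It is a conceptual model only, not what the compiler generates: the struct wsa and the function run_program are hypothetical, and the fields mirror the three variables from the example programs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the Writeable Static Area of the example. */
struct wsa {
    int value;
    int total;
    int done;
};

/* The code only reads and writes through the WSA pointer it is given,
   so the same instructions can serve any number of WSA copies. */
static void run_program(struct wsa *w)
{
    assert(w != NULL);
    w->value = 1;
    w->total = w->total + w->value;
    w->done = 1;
}
```

If run_program() is called twice with one copy of the WSA and once with another, the first copy ends with total 2 and the second with total 1, while the code itself is never modified.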

A program can also have 31 bit resident code, and 64 bit resident code. The binder takes all of these parts and creates “classes” of data:

  • The WSA class. This contains the merged list of static variables.
  • The 64-bit re-entrant code class. The binder takes the 64-bit resident code from all of the programs and included subroutines, and creates a “64-bit re-entrant” blob.
  • The 31-bit re-entrant code class. The binder takes the 31-bit resident code from all of the programs and included subroutines, and creates a “31-bit re-entrant” blob.
  • The 64-bit data class, from all objects.
  • The 31-bit data class, from all objects.

When the loader loads the modules:

  • It creates a new copy of the WSA for each thread
  • It loads the 64 bit re-entrant code (or reuses any existing copy of the code) into 64 bit storage
  • It loads the 31 bit re-entrant code (or reuses any existing copy of the code) into 31 bit storage.

How can I see what is in the load module?

If you look at the output from the binder you get output which includes content like

CLASS  B_TEXT            LENGTH =      4F4  ATTRIBUTES = CAT,   LOAD, RMODE=ANY 
CLASS  C_DATA64          LENGTH =        0  ATTRIBUTES = CAT,   LOAD, RMODE=ANY 
CLASS  C_CODE64          LENGTH =     1A38  ATTRIBUTES = CAT,   LOAD, RMODE= 64 
CLASS  C_@@QPPA2         LENGTH =        8  ATTRIBUTES = MRG,   LOAD, RMODE= 64 
CLASS  C_CDA             LENGTH =     3B50  ATTRIBUTES = MRG,   LOAD, RMODE= 64 
CLASS  B_LIT             LENGTH =      140  ATTRIBUTES = CAT,   LOAD, RMODE=ANY 
CLASS  B_IMPEXP          LENGTH =      A6B  ATTRIBUTES = CAT,   LOAD, RMODE=ANY 
CLASS  C_WSA64           LENGTH =      6B8  ATTRIBUTES = MRG, DEFER , RMODE= 64 
CLASS  C_COPTIONS        LENGTH =      304  ATTRIBUTES = CAT, NOLOAD 
CLASS  B_PRV             LENGTH =        0  ATTRIBUTES = MRG, NOLOAD 

Where

  • B_TEXT is from HLASM (assembler program). Any sections are conCATenated together (Attributes = CAT).
  • C_WSA64 is the 64 bit WSA. Any data in these sections has been MeRGed (see the “total” variable above) (Attributes = MRG).
  • C_COPTIONS contains the list of C options used at compile time. The loader ignores this class (NOLOAD), but it is available for advanced programs such as debuggers to extract this information from the load module.

To introduce even more complexity, you can have class segments. These are an advanced topic, for when you want groups of classes to be loaded independently. Most people use the default of 1 segment.

Layout of the load module

Class layout

You can see the layout of the classes in the segment.

  • Class B_TEXT starts at offset 0 and is length 4F4.
  • Class C_CODE64 is offset 4F8 (4F4 rounded up to the nearest doubleword) and of length 1A38.
CLASS  B_TEXT            LENGTH =      4F4  ATTRIBUTES = CAT,   LOAD, RMODE=ANY 
                         OFFSET =        0 IN SEGMENT 001     ALIGN = DBLWORD 
  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -
CLASS  C_CODE64          LENGTH =     1A38  ATTRIBUTES = CAT,   LOAD, RMODE= 64 
                         OFFSET =      4F8 IN SEGMENT 001     ALIGN = DBLWORD
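The rounding from 4F4 to 4F8 is ordinary alignment arithmetic. A minimal sketch in C, where align_up is a hypothetical helper and a doubleword is 8 bytes:

```c
#include <assert.h>

/* Round offset up to the next multiple of alignment.
   The alignment must be a power of two, for example 8 for a doubleword. */
static unsigned long align_up(unsigned long offset, unsigned long alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (offset + alignment - 1) & ~(alignment - 1);
}
```

align_up(0x4F4, 8) gives 0x4F8, which matches the OFFSET shown for C_CODE64. An offset that is already on a doubleword boundary is returned unchanged.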

Within each class

CLASS  C_CODE64          LENGTH =     1A38  ATTRIBUTES = CAT,   LOAD, RMODE= 64 
                         OFFSET =      4F8 IN SEGMENT 001     ALIGN = DBLWORD 
--------------- 
                                                                                                  
 SECTION    CLASS                                      ------- SOURCE -------- 
  OFFSET   OFFSET  NAME                TYPE    LENGTH  DDNAME   SEQ  MEMBER 
                                                                                                  
                0  $PRIV000010        CSECT      1A38  /0000001  01 
       0        0     $PRIV000011        LABEL 
      B8       B8     PyInit_zconsole    LABEL 
     8B8      8B8     or_bit             LABEL 
     C30      C30     cthread            LABEL 
    1158     1158     cleanup            LABEL 
    11B8     11B8     printHex           LABEL 
  • The CSECT name $PRIV000010 was generated because I did not have a #pragma CSECT(CODE,”….”) statement in my source. If you use #pragma CSECT(STATIC,”….”) you get the name in the CLASS C_WSA64 section (see the following section).
  • The C function “or_bit” is at offset 8B8 in the class C_CODE64.

The static area

For the example below, the C module had #pragma CSECT(STATIC,”SZCONSOLE”)

CLASS  C_WSA64           LENGTH =      6B8  ATTRIBUTES = MRG, DEFER , RMODE= 64 
                         OFFSET =        0 IN SEGMENT 002     ALIGN = QDWORD 
--------------- 
                                                                                     
            CLASS 
           OFFSET  NAME                TYPE    LENGTH   SECTION 
                0  $PRIV000012      PART            10 
               10  SZCONSOLE        PART           5A0  ZCONSOLE 
              5B0  ascii_tab        PART           100  ascii_tab 
              6B0  gstate           PART             4  gstate 

There are two global static variables common to all routines: ascii_tab and gstate. Each has its own entry defined in the class.

All of the static variables internal to routines are in the SZCONSOLE section. They do not have an explicit name because they are internal.
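The CLASS OFFSET column in the listing can be reproduced by adding up the part lengths. The sketch below assumes the parts are simply placed one after another, which fits this listing but is not claimed as a general binder rule; layout_parts is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Place parts one after another, recording the offset of each.
   Returns the offset just past the last part. */
static unsigned long layout_parts(const unsigned long *lengths,
                                  unsigned long *offsets, size_t count)
{
    unsigned long next = 0;
    assert(lengths != NULL && offsets != NULL);
    for (size_t i = 0; i < count; i++) {
        offsets[i] = next;      /* this part starts where the last one ended */
        next += lengths[i];
    }
    return next;
}
```

With the lengths 10, 5A0, 100 and 4 (hex) from the listing, the offsets come out as 0, 10, 5B0 and 6B0, and the last part ends at 6B4. The listing reports a class length of 6B8, so there appear to be a few bytes of padding at the end.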
