This post is part of the “One minute MVS” series, which gives the essentials of a topic.
Your program
The use of functions or subroutines is very common in programming. For a simple call
    x = mysub()
which calls an external function mysub, the generated code looks like
    MYPROG   CSECT
             L     15,mysub      the function
             LA    1,PARMLIB
             BASR  14,15         or BALR in older programs
             ...
    mysub    DC    V(MYSUB)
where
- MYPROG is an entry point to the program
- the mysub variable defines some storage for the external reference (ER) to MYSUB.
The output of the assembler or compiler is a file or dataset member, known as an “object deck” or “object file”. It cannot be executed as it stands, because it does not yet contain the external functions or subroutines it refers to.
The binder (or linkage editor)
The binder program takes object decks, includes any subroutines and external code and creates a load module or program object.
In the early days, load modules were stored in PDS datasets. The directory entry of a member held information such as the size of the load module and its entry point. As the binder got more sophisticated, the directory did not have enough space for all of the data that was created. As a result PDSEs (extended PDSs) were created, which have an extendable directory entry. For Unix System Services, load modules are stored in the Unix file system.
Strictly, a program object is the newer format, stored in a PDSE or in the Unix file system, and a load module is the older format stored in a PDS. I still think of them both as load modules.
The binder takes the parts needed to create the program object, for example functions you created and stored in a PDS or in the Unix file system, and includes library code, for example the code for the printf() function. These are merged into one file.
Pictorially the merged files look like:
- Offset 0 some C code.
- Offset 200 MYPROG Object
  - Offset 10 within MYPROG, MYPROG entry point (so offset 210 from the start of the merged files)
  - Offset 200 within MYPROG, mysub:V(MYSUB)
  - …
  - Offset 310 within MYPROG end of MYPROG
- Offset 512 FUNCTION1 object
- …
- Offset 800 MYSUB1 Object
  - Offset 28 within MYSUB1, MYSUB entry point
  - Offset 320 within MYSUB1, end of MYSUB
The binder can now resolve references. It knows that the MYSUB entry point is at offset 28 within the MYSUB1 object, and that the MYSUB1 object is at offset 800 from the start of the combined files. It can now replace the mysub:V(MYSUB) in MYPROG with the offset value 828.
The entire file is stored as one object, a load module (program object), with a name that you give it, for example COLIN.
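The reference resolution described above can be sketched in C. This is only a toy model, not how the binder is implemented: the struct placed_object and the function resolve_reference are hypothetical names made up for the illustration, and the offsets are the ones from the picture above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one object placed in the merged file. */
struct placed_object {
    const char *entry_name;   /* name of the entry point it provides      */
    unsigned long base;       /* offset of the object in the merged file  */
    unsigned long entry_off;  /* offset of the entry point in the object  */
};

/* Return the offset of entry point `name` from the start of the merged
   file, or -1 if no object provides it (an unresolved reference). */
long resolve_reference(const struct placed_object *objs, size_t count,
                       const char *name)
{
    for (size_t i = 0; i < count; i++) {
        if (strcmp(objs[i].entry_name, name) == 0) {
            return (long)(objs[i].base + objs[i].entry_off);
        }
    }
    return -1;
}
```

With a table holding MYPROG (object at 200, entry at 10) and MYSUB (object at 800, entry at 28), resolve_reference returns 828 for MYSUB and 210 for MYPROG.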
The loader
When load module COLIN is loaded, the loader reads it from disk into memory, for example at address 200,000. As part of the loading, it looks at the external references and calculates the address in memory from the offset value. So 200,000 + offset 828 is address 200828. This value is stored in the mysub variable.
When the function is about to be called, L 15,mysub loads register 15 with the address of the code in memory, and the program can branch to it to execute the code.
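As a rough illustration of that relocation step, the C sketch below adds the load address to each address constant that still holds an offset. The function relocate and its parameters are hypothetical; the real loader works from relocation information kept in the module, which is not modelled here.

```c
#include <stddef.h>
#include <stdint.h>

/* adcons:       the address constants in the loaded module; before
                 relocation each one holds an offset from the module start.
   to_fix:       indexes of the address constants that need relocating.
   load_address: where the module was loaded in memory. */
void relocate(uintptr_t *adcons, const size_t *to_fix, size_t count,
              uintptr_t load_address)
{
    for (size_t i = 0; i < count; i++) {
        adcons[to_fix[i]] += load_address;  /* offset becomes an address */
    }
}
```

With the numbers used above, an address constant holding 828 becomes 200828 once the module is loaded at 200,000.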
It gets more complex than this
Consider two source programs
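    /* first source file */
    int value = 0;
    int total = 0;
    void printTotal(void);

    int main(void)
    {
        value = 1;
        total = total + value;
        printTotal();
        return 0;
    }

    /* second source file */
    #include <stdio.h>

    int total;
    int done;

    void printTotal(void)
    {
        printf("Total = %d\n", total);
        done = 1;
    }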
There are some global static variables. The variable “total” is used in each one; it is the same variable.
These programs are defined as re-entrant, so their instructions could be loaded into read-only storage.
The variables “value”, “total” and “done” cannot go into read-only storage, as they change during the execution of the program.
There are three global variables: “value”, “total” and “done”; total is common to both programs.
These variables go into a storage area called the Writeable Static Area (WSA).
If there are multiple threads running the program, each gets its own copy of the WSA, but they can all share the instructions.
A program can also have 31-bit resident code and 64-bit resident code. The binder takes all of these parts and creates “classes” of data:
- The WSA class. This contains the merged list of static variables.
- The 64-bit re-entrant code class. The binder takes the 64-bit resident code from all of the programs and included subroutines, and creates a “64-bit re-entrant” blob.
- The 31-bit re-entrant code class. The binder takes the 31-bit resident code from all of the programs and included subroutines, and creates a “31-bit re-entrant” blob.
- The 64-bit data class, from all objects.
- The 31-bit data class, from all objects.
- …
When the loader loads the module:
- It creates a new copy of the WSA for each thread.
- It loads the 64-bit re-entrant code (or reuses any existing copy of the code) into 64-bit storage.
- It loads the 31-bit re-entrant code (or reuses any existing copy of the code) into 31-bit storage.
How can I see what is in the load module?
The output from the binder includes content like
    CLASS B_TEXT      LENGTH = 4F4   ATTRIBUTES = CAT, LOAD, RMODE=ANY
    CLASS C_DATA64    LENGTH = 0     ATTRIBUTES = CAT, LOAD, RMODE=ANY
    CLASS C_CODE64    LENGTH = 1A38  ATTRIBUTES = CAT, LOAD, RMODE= 64
    CLASS C_@@QPPA2   LENGTH = 8     ATTRIBUTES = MRG, LOAD, RMODE= 64
    CLASS C_CDA       LENGTH = 3B50  ATTRIBUTES = MRG, LOAD, RMODE= 64
    CLASS B_LIT       LENGTH = 140   ATTRIBUTES = CAT, LOAD, RMODE=ANY
    CLASS B_IMPEXP    LENGTH = A6B   ATTRIBUTES = CAT, LOAD, RMODE=ANY
    CLASS C_WSA64     LENGTH = 6B8   ATTRIBUTES = MRG, DEFER , RMODE= 64
    CLASS C_COPTIONS  LENGTH = 304   ATTRIBUTES = CAT, NOLOAD
    CLASS B_PRV       LENGTH = 0     ATTRIBUTES = MRG, NOLOAD
Where
- B_TEXT is from HLASM (an assembler program). Any sections are conCATenated together (ATTRIBUTES = CAT).
- C_WSA64 is the 64-bit WSA. Any data in these sections has been MeRGed (see the “total” variable above) (ATTRIBUTES = MRG).
- C_COPTIONS contains the list of C options used at compile time. The loader ignores this class (NOLOAD), but it is available for advanced programs such as debuggers to extract this information from the load module.
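The difference between CAT and MRG can be sketched in C. This is a simplified model rather than the binder's algorithm: the names part, merged_class, merge_part, merged_length and concatenated_length are hypothetical, alignment padding is ignored, and keeping the longest length when two parts share a name is an assumption of the sketch.

```c
#include <stddef.h>
#include <string.h>

#define MAX_PARTS 16

struct part { const char *name; unsigned long length; };

struct merged_class { struct part parts[MAX_PARTS]; size_t count; };

/* MRG: a part whose name is already present shares the existing storage
   (the longer length is kept); otherwise it is appended.
   Returns 0 on success, -1 if the table is full. */
int merge_part(struct merged_class *cls, const char *name,
               unsigned long length)
{
    for (size_t i = 0; i < cls->count; i++) {
        if (strcmp(cls->parts[i].name, name) == 0) {
            if (length > cls->parts[i].length) {
                cls->parts[i].length = length;
            }
            return 0;
        }
    }
    if (cls->count == MAX_PARTS) {
        return -1;
    }
    cls->parts[cls->count].name = name;
    cls->parts[cls->count].length = length;
    cls->count++;
    return 0;
}

/* Total length of a merged class: each named part is counted once. */
unsigned long merged_length(const struct merged_class *cls)
{
    unsigned long total = 0;
    for (size_t i = 0; i < cls->count; i++) {
        total += cls->parts[i].length;
    }
    return total;
}

/* CAT: every contribution is kept, so the lengths simply add up. */
unsigned long concatenated_length(const unsigned long *lengths, size_t n)
{
    unsigned long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += lengths[i];
    }
    return total;
}
```

Feeding in “value” and “total” from the first program and “total” and “done” from the second (4 bytes each) leaves three parts and a merged length of 12, because “total” is only stored once.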
To introduce even more complexity, you can have class segments. These are an advanced topic, for when you want groups of classes to be loaded independently. Most people use the default of one segment.
Layout of the load module
Class layout
You can see the layout of the classes in the segment.
- Class B_TEXT starts at offset 0 and is length 4F4.
- Class C_CODE64 is at offset 4F8 (4F4 rounded up to the next doubleword boundary) and is of length 1A38.
    CLASS B_TEXT     LENGTH = 4F4   ATTRIBUTES = CAT, LOAD, RMODE=ANY
                     OFFSET = 0 IN SEGMENT 001   ALIGN = DBLWORD
    - - - - - - - - - - - - - - - - - - - - - - - - - - -
    CLASS C_CODE64   LENGTH = 1A38  ATTRIBUTES = CAT, LOAD, RMODE= 64
                     OFFSET = 4F8 IN SEGMENT 001   ALIGN = DBLWORD
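The step from 4F4 to 4F8 is ordinary alignment arithmetic. A minimal C sketch, assuming the alignment is a power of two (a doubleword is 8 bytes); round_up is a hypothetical helper name:

```c
/* Round offset up to the next multiple of alignment.
   alignment must be a power of two, for example 8 for a doubleword. */
unsigned long round_up(unsigned long offset, unsigned long alignment)
{
    return (offset + alignment - 1) & ~(alignment - 1);
}
```

Here round_up(0x4F4, 8) gives 0x4F8, and a value already on a boundary, such as 0x4F8, is left unchanged.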
Within each class
    CLASS C_CODE64   LENGTH = 1A38  ATTRIBUTES = CAT, LOAD, RMODE= 64
                     OFFSET = 4F8 IN SEGMENT 001   ALIGN = DBLWORD
    ---------------
    SECTION  CLASS                                    ------- SOURCE --------
     OFFSET OFFSET  NAME              TYPE    LENGTH  DDNAME    SEQ  MEMBER
                 0  $PRIV000010       CSECT     1A38  /0000001  01
          0      0  $PRIV000011       LABEL
         B8     B8  PyInit_zconsole   LABEL
        8B8    8B8  or_bit            LABEL
        C30    C30  cthread           LABEL
       1158   1158  cleanup           LABEL
       11B8   11B8  printHex          LABEL
- The section name $PRIV000010 (type CSECT) is generated because I did not have a #pragma CSECT(CODE,”….”) statement in my source. If you use #pragma CSECT(STATIC,”….”) you get the name in the CLASS C_WSA64 section (see the following section).
- The C function “or_bit” is at offset 8B8 in the class C_CODE64 (offset B8 is PyInit_zconsole).
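To find where a label sits within the segment, add its class offset to the offset of the class. The C sketch below does this for the labels in the C_CODE64 listing above; the table code64_labels and the function segment_offset are hypothetical names, and the values are copied from that listing.

```c
#include <stddef.h>
#include <string.h>

struct label { const char *name; unsigned long class_offset; };

/* Labels and class offsets copied from the C_CODE64 listing. */
static const struct label code64_labels[] = {
    {"PyInit_zconsole", 0xB8},
    {"or_bit",          0x8B8},
    {"cthread",         0xC30},
    {"cleanup",         0x1158},
    {"printHex",        0x11B8},
};

/* Return the offset of `name` within the segment, given the offset of
   the class in the segment, or -1 if the label is not in the table. */
long segment_offset(unsigned long class_start, const char *name)
{
    size_t n = sizeof code64_labels / sizeof code64_labels[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(code64_labels[i].name, name) == 0) {
            return (long)(class_start + code64_labels[i].class_offset);
        }
    }
    return -1;
}
```

Since C_CODE64 starts at offset 4F8 in segment 001, segment_offset(0x4F8, "cthread") gives 0x1128, which is 4F8 + C30.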
The static area
For the example below, the C module had #pragma CSECT(STATIC,”SZCONSOLE”)
    CLASS C_WSA64    LENGTH = 6B8   ATTRIBUTES = MRG, DEFER , RMODE= 64
                     OFFSET = 0 IN SEGMENT 002   ALIGN = QDWORD
    ---------------
     CLASS
    OFFSET  NAME          TYPE  LENGTH  SECTION
         0  $PRIV000012   PART      10
        10  SZCONSOLE     PART     5A0  ZCONSOLE
       5B0  ascii_tab     PART     100  ascii_tab
       6B0  gstate        PART       4  gstate
There are two global static variables common to all routines: ascii_tab and gstate. They each have an entry defined in the class.
All of the static variables internal to routines are in the SZCONSOLE part. They do not have an explicit name because they are internal.