Transferring a dataset from z/OS to Windows or Linux and using it can be a challenge.
A record in a data set on z/OS has a 4-byte Record Descriptor Word (RDW) on the front of the record. The first two bytes give the length of the record, and the other two bytes are typically 0.
FTP has two modes for transferring data: ASCII and BIN (binary).
ASCII
With ASCII mode, FTP reads each record, then
- Removes the RDW
- Converts it from EBCDIC to ASCII
- Adds a “New Line” character to the end of data
- Sends the data
- At the receiving end, writes the data to a file stream.
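The steps above can be mimicked in a few lines of Python. This is only a sketch of the idea, not what FTP actually runs: the names `with_rdw` and `ascii_mode_transfer` are made up for the illustration, the RDW layout follows the simplified description in this post, and code page cp037 is an assumption (your system may use a different EBCDIC code page).

```python
def with_rdw(data: bytes) -> bytes:
    # Hypothetical helper: 2-byte big-endian length, two zero bytes, then the data.
    return len(data).to_bytes(2, "big") + b"\x00\x00" + data


def ascii_mode_transfer(records, codepage="cp037"):
    # codepage is an assumption; pick the one your z/OS data really uses.
    lines = []
    for record in records:
        data = record[4:]             # remove the 4-byte RDW
        text = data.decode(codepage)  # convert EBCDIC to text
        lines.append(text + "\n")     # add a New Line to the end of the data
    return "".join(lines)             # the stream written at the receiving end


zos_records = [
    with_rdw("HELLO".encode("cp037")),
    with_rdw("WORLD".encode("cp037")),
]
received = ascii_mode_transfer(zos_records)
print(received, end="")
```

Each record ends up on its own line of the text file, and the RDW is gone.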
On Unix and Windows a text file is one long stream of data. When the file is read, a New Line character ends the logical record, and the data that follows is displayed on a new line.
Binary mode
Binary mode is used when the dataset has binary (hexadecimal) content, and not just printable characters. The hex value of the New Line character could occur as part of the binary data, so this character cannot be used to delimit records.
FTP has an option for RDW:
quote site RDW
The default is RDW FALSE.
If RDW is FALSE then FTP removes the RDW from the data before sending it. At the remote end, the file is one continuous stream of bytes, and you have no way of identifying where one logical record ends and the next logical record starts.
If RDW is TRUE, then the 4 byte RDW is sent as part of the data. The application reading the file can read the information and calculate where the logical record starts and ends.
For example, on z/OS the dataset contains the following (in hex). The first 8 hex digits of each line are the RDW, which is not displayed when you edit or browse the dataset. The remaining digits are the data, which is displayed.
00040000C1C2C3C4
00020000D1D2
00050000E1E2E3E4E5
If the data was transmitted with RDW FALSE the data in the file would be
C1C2C3C4D1D2E1E2E3E4E5
If the data was transmitted with RDW TRUE the data in the file would be
00040000C1C2C3C400020000D1D200050000E1E2E3E4E5
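To see where the two byte streams come from, here is a small Python sketch that builds both from the three example records. The helper name `rdw` is made up, and it follows the simplified layout used in this example, where the length covers only the data bytes.

```python
# The three example records, data bytes only.
records = [bytes.fromhex(h) for h in ("C1C2C3C4", "D1D2", "E1E2E3E4E5")]


def rdw(data: bytes) -> bytes:
    # Hypothetical helper: 2-byte big-endian length followed by two zero bytes.
    return len(data).to_bytes(2, "big") + b"\x00\x00"


# RDW FALSE: only the data is sent, so the record boundaries are lost.
rdw_false = b"".join(records)

# RDW TRUE: each record is sent with its RDW on the front.
rdw_true = b"".join(rdw(r) + r for r in records)

print(rdw_false.hex().upper())
print(rdw_true.hex().upper())
```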
Conceptually you can process this file stream using C code:
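FILE *fp = fopen("myfile", "rb"); // "myfile" is a placeholder name
short RDW;          // 2 byte integer
short dummy;        // 2 byte integer
char mydata[32768]; // big enough for any record
fread(&RDW, 2, 1, fp);     // get the length
fread(&dummy, 2, 1, fp);   // ignore the 0s
fread(mydata, 1, RDW, fp); // read the record
...
fread(&RDW, 2, 1, fp);     // get the length
fread(&dummy, 2, 1, fp);   // ignore the 0s
fread(mydata, 1, RDW, fp); // read the record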
In practice this will not work because z/OS stores numbers as Big Endian, and x86 and ARM machines are normally Little Endian. (With Big Endian the leftmost byte is the most significant. With Little Endian the rightmost byte is the most significant, so the bytes are transposed.)
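On z/OS the bytes 00 04 are decimal 4. On x86 and ARM decimal 4 is stored as the bytes 04 00, so reading the bytes 00 04 as a half word gives 0x0400, which is 1024.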
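You can see the difference for yourself in Python, which lets you name the byte order explicitly. This is just an illustration; the variable names are made up.

```python
raw = bytes.fromhex("0004")  # the two length bytes as they arrive from z/OS

big = int.from_bytes(raw, "big")        # interpret as z/OS does
little = int.from_bytes(raw, "little")  # interpret as a Little Endian machine would

print(big, little)
```

The Big Endian interpretation gives the record length. The Little Endian one gives a number that is much too large.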
In practice you need code on X86 and ARM, like the following, to get the value of a half word from a z/OS data set.
unsigned char RDW[2];   // 2 bytes, unsigned so values above 0x7F are not negative
fread(RDW, 1, 2, fp);   // get the length (fp is the open file)
int length = 256 * RDW[0] + RDW[1];
and similarly for longer integers.
Python
If you are using the Python struct module, you can pass a format string of data types and get the processed values back.
- The string “>HH” says two half words, and the > says the numbers are Big Endian.
- The string “<HH” says two half words and the < says they are Little Endian
- The string “HH” says two half words, read in the native byte order of the machine running the code.
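As a sketch of how this fits together, the following walks a file that was transferred in binary with RDW TRUE and splits it into records using “>HH”. The function name `split_records` and the flag `length_includes_rdw` are made up for this illustration. The flag is there because the example in this post uses a length that counts only the data, whereas an RDW on z/OS normally counts its own four bytes as well; look at a hex dump of your own file to see which applies.

```python
import struct


def split_records(stream: bytes, length_includes_rdw: bool = False):
    # Hypothetical helper: returns the data portion of each record.
    records = []
    offset = 0
    while offset < len(stream):
        if offset + 4 > len(stream):
            raise ValueError("truncated RDW at offset %d" % offset)
        # Two Big Endian half words: the length, then the two bytes that are usually 0.
        length, _unused = struct.unpack_from(">HH", stream, offset)
        if length_includes_rdw:
            length -= 4
        start = offset + 4
        end = start + length
        if length < 0 or end > len(stream):
            raise ValueError("bad record length at offset %d" % offset)
        records.append(stream[start:end])
        offset = end
    return records


stream = bytes.fromhex("00040000C1C2C3C400020000D1D200050000E1E2E3E4E5")
for record in split_records(stream):
    print(record.hex().upper())
```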
Conversion
You’ll need to do your own conversion from EBCDIC to ASCII to make things printable!
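In Python one way to do this is with the built-in codecs. The sketch below assumes the data is in code page 037 (codec name cp037). That is an assumption, so check which EBCDIC code page your data really uses before relying on it.

```python
ebcdic = bytes.fromhex("C8C5D3D3D6")  # five EBCDIC bytes

text = ebcdic.decode("cp037")         # EBCDIC bytes to a Python string
print(text)

back = text.encode("cp037")           # and back again, for sending data to z/OS
```

Only do this for the character fields in a record. Binary fields such as lengths and counters need to be left as they are and processed as numbers.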