There is synchronous write, and there is synchronous write: you have to know whether you are talking about a synchronous write as seen by an application, or one as issued by the z/OS operating system. Synchronous disk IO as used by the IO subsystem is new (a couple of years old); it is known as zHyperLink.
If you know about sync and async requests to a coupling facility, you already know the concepts.
The application synchronous IO
For the last 40 years an application could issue an IO request which looked synchronous: immediately after a read request finished you could use the data. What happens under the covers is as follows:
- Issue the IO request to read from disk – for example the C fread function.
- The operating system determines which disk to use, and where on the disk to read from.
- The OS issues the IO request.
- The OS dispatcher suspends the requesting task.
- The OS dispatcher dispatches another task.
- When the IO request has completed, the hardware signals an IO-complete interrupt. This interrupts the currently executing program so the OS can set a flag saying the original task is now dispatchable.
- The dispatcher resumes the original task which can now process the data.
40 years ago a read could take over 20 milliseconds.
A short history of disk IO
Over 40 years disks have changed from big spinning disks – 2 metres high – to PC-sized disks with many times the capacity.
- With early disks the operating system had to send commands to move the heads over the disk, and to wait until the right part of the disk passed under the head.
- The addition of a read cache to the disks. For hot data, the wanted record may be in the cache, avoiding a read from the physical disks.
- Adding a write cache – protected by a battery – meant a write became two steps: 1) send the data to the write cache, 2) move the data from the write cache to the spinning disks.
- The use of PC type disks – and solid state disks with no moving parts.
These all relied on the same model: start an IO request, then wait for the IO-complete interrupt.
The coupling facility
The coupling facility (CF) is a machine with global shared memory, available to the systems in a Sysplex.
When this was being developed, the developers found that it was sometimes quicker to issue an IO instruction and wait for it to complete than to use the model above of starting an IO and waiting for the interrupt. The "issue the IO instruction and wait" approach, the synchronous request, might take 50 microseconds. The "start the IO, wait, and process the interrupt" approach, the asynchronous request, might take 1000 microseconds.
How long does the synchronous instruction take? – How long is a piece of string?
Most of the time spent in the synchronous instruction is time on the cable between the processor and the CF – a speed-of-light problem. If the distance is long (a long piece of cable), the instruction takes too long, and it is more efficient to use the async model to communicate with the CF. Use a shorter cable (which may mean moving the CF closer to the CPU) and the instruction is quicker.
How about synchronous disk IO?
The same model can be used with disk IO. The exploiting software (for example Db2) had to change the way it does IO to exploit this.
When used for a disk read, the data is expected to be in the disk controller cache. If it is not there, the request times out, and an async request is made instead.
This can also be used for a disk write, to put the data into the disk controller cache, but it may not be as useful. If you are mirroring your logs, with local disks and remote disks, the IO as seen by Db2 will not complete until both the local and remote IOs have completed. Just like the CF, it means the DASD controller (3990) needs to be close to the CPU.
I found Lightning Fast I/O via zHyperLink and Db2 for z/OS Exploitation a good article which mentions synchronous IO.
I noticed that in older releases of z/OS, IO response times were reported in units of 128 microseconds. For example, when an IO finishes, the response contains the IO delays in the different IO stages. In recent releases, IO response times are in microseconds: you may now get response times down to the tens of microseconds, and reporting them in units of 128 microseconds is not accurate enough.