I had a weekend away, and my experience of toilets at an airport gave me insight into servers used in computing. In the blog post below – where I say toilet – think server.
When I first joined IBM 40 years ago, we had “opinion surveys” where you could raise issues, and management would ignore them. At one feedback session, someone said we needed bigger-capacity toilets. There were comments like “We know you are full of ****, do you need bigger bowls?”. He meant that we needed more cubicles, because on a Friday afternoon, when some people came back from the pub, they would sit on the toilet and go to sleep. This was my first insight into the multiple meanings of the term capacity.
Later, when I was just starting in performance, they replaced our mainframe, which had one 60 MIPS CPU, with the newest machine, which had 6 CPUs at 10 MIPS each. The accountants saw this as the same sized computer. To us, a single CPU-bound transaction took 6 times longer, but you could now do 6 transactions in parallel, so overall the throughput was comparable. Like toilets, most of the time was spent doing I/O.
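The arithmetic behind that swap can be sketched in a few lines of Python. The figure of 60 million instructions per transaction is an assumption chosen only to make the numbers round, and the function names are illustrative. The sketch also assumes the transaction is purely CPU-bound and ignores I/O.

```python
def cpu_seconds(million_instructions, mips):
    """Time for one CPU-bound transaction on a single CPU of the given speed."""
    return million_instructions / mips


def throughput_per_second(cpus, mips, million_instructions):
    """Transactions completed per second when every CPU is kept busy."""
    return cpus / cpu_seconds(million_instructions, mips)


# Assumed workload: 60 million instructions per transaction (illustrative only).
WORK = 60

old_time = cpu_seconds(WORK, 60)   # one 60 MIPS CPU
new_time = cpu_seconds(WORK, 10)   # one of the six 10 MIPS CPUs

old_rate = throughput_per_second(1, 60, WORK)
new_rate = throughput_per_second(6, 10, WORK)

print(f"per-transaction time: {old_time} s before, {new_time} s after")
print(f"throughput: {old_rate} per second before, {new_rate} per second after")
```

The per-transaction time goes up sixfold while the throughput stays the same, which is why the accountants and the performance people could both be right.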
For my weekend away, I spent time at an airport in Scotland. The departures side has a central section, and a wing on either side. Our departure gate was in one wing. This had one toilet for men, and another for women. The men’s toilet had about 10 cubicles, so if you needed one you normally did not have to wait very long. If one cubicle was out of service, this did not have a major impact on throughput. Unfortunately these toilets were closed for refurbishment. The closest toilets were back in the central area, but they had only two cubicles, and you often had to wait. This showed there was insufficient capacity. If you could not wait, you had to walk further to find the next toilets. These had more capacity, but sometimes you still had to wait. For the ladies’ toilet, the queue was out of the door and along the corridor, so there was a real lack of capacity, which shows lack of planning (or a male architect).
By the time I had been to the toilet and walked back to my gate, they were closing the flight!
What insights did I gain?
- If you have several big servers, and one is shut down, you need enough capacity on the other servers to cope.
- If you shut down one big server, the time spent per transaction will increase, as there is an increased waiting time for the servers.
- Depending on the location of the servers, you may have extra delay getting to the servers.
- Routing work to small servers may not help, as the servers may get overloaded. You can end up with one small server overloaded while two other servers are not busy, but you cannot route work to them.
- A larger server can handle peaks in workload better than multiple smaller ones.
- Some architects are not good at designing systems for availability and capacity. You need to know the duration of your typical transaction and plan for that. Some transactions may take longer than others. In Copenhagen airport, the toilets are unisex, which helps solve the availability and capacity problems!
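
To put rough numbers on the points about waiting time and about large versus small servers, here is a minimal sketch using the textbook Erlang C formula for a queue with several identical servers. This is a swapped-in model, not something measured at the airport: it assumes random (Poisson) arrivals and exponentially distributed service times, and the arrival and service rates below are invented for illustration.

```python
from math import factorial


def wait_probability(arrival_rate, service_rate, servers):
    """Erlang C: probability that an arrival finds every server busy."""
    offered = arrival_rate / service_rate      # average number of busy servers
    utilisation = offered / servers
    if utilisation >= 1:
        raise ValueError("arrivals exceed capacity, the queue grows without bound")
    all_busy = offered**servers / factorial(servers) / (1 - utilisation)
    some_free = sum(offered**k / factorial(k) for k in range(servers))
    return all_busy / (some_free + all_busy)


def mean_wait(arrival_rate, service_rate, servers):
    """Average time spent queueing before service starts."""
    p = wait_probability(arrival_rate, service_rate, servers)
    return p / (servers * service_rate - arrival_rate)


# Assumed rates: each cubicle serves 1 person per time unit (illustrative only).
# Both setups run at 75% utilisation, but the pool of 10 queues far less.
big = mean_wait(arrival_rate=7.5, service_rate=1.0, servers=10)
small = mean_wait(arrival_rate=1.5, service_rate=1.0, servers=2)
print(f"10 cubicles: mean wait {big:.3f}, 2 cubicles: mean wait {small:.3f}")

# Taking one cubicle out of service hurts the small block much more.
print(f"9 cubicles: {mean_wait(7.5, 1.0, 9):.3f}")
```

Under these assumptions, the same utilisation gives a much shorter wait on the large pool, and losing one server from a pool of two leaves a single server that cannot keep up at all, so `mean_wait(1.5, 1.0, 1)` raises an error rather than returning a number.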