After I stumbled on a change to my Python program which gave 10 times the throughput against a web server, I realised that I knew only a little about using REST. It is the difference between the knowledge to get a Proof Of Concept working, and the knowledge to run properly in production; it is the difference between one request a minute and 100 requests a second.
This blog post compares REST and traditional client server and suggests ways of using REST in production. The same arguments also apply to long running classical client server applications.
A REST request is a stateless, self-contained request: you send it to the back-end server and get one response back. It is also known as a one shot request. Traditional client server applications can send several requests to the back-end as part of a unit of work.
In the table below I compare an extreme REST transaction and an extreme traditional Client Server application.
Attribute | REST | Client Server |
Connection | Create a new connection for every request. | Connect once, stay connected all day, reuse the session, disconnect at end of day. |
Workload Balancing | Each request can select any available server, so on average requests will be spread across all of the servers. If a new server is added, it will get used. | The application connects to a server and stays connected. If the session ends and restarts, it may select a different server. If a new server is added, it may not be used. |
Authentication | Each request needs authentication. If the userid is invalidated, the request will fail. Note that servers cache userid information, so it may take minutes before an invalidated userid is actually rejected. | Authentication is done as part of the connection. If the userid is invalidated during the day, the application will carry on working until it restarts. |
Identification | Both userid+password, and client certificate can be used to give the userid. | Both userid+password, and client certificate can be used to give the userid. If you want to change which identity is used, you should disconnect and reconnect. |
Cost | It is very expensive to create a new connection. It is even more expensive when using TLS, because the handshake has to generate the secret key. As a result, one shot REST requests are very expensive. | The expensive create connection is done once, at start of day. Subsequent requests do not have this overhead, so are much cheaper. |
Renew TLS session key | Because there is only one transfer per connection, you do not need to renew the encryption key. | Using the same session key for a whole day is weak, as it makes the key easier to break. Renewing the session key after an amount of data has been processed, or after a time period, is good practice. |
Request | Some requests are suitable for packaging in one request, for example where just one server is involved. | This can support more complex requests, for example DB2 on system A, and MQ on system B. |
Number of connections | The connection is active only when it is used. | The connection is active even though it has not been used for a long time. This can waste resources, and prevent other connections from being made to the server. |
Statistics | You get an SMF record for every request. Creating an SMF record costs CPU. | You get one SMF record for a collection of work, which reduces the overall cost. The worst case is one SMF record for the whole day. |
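One way to see the Connection and Cost rows for yourself is to count how many TCP connections a server receives. The sketch below is only an illustration: it starts a throwaway local server from the Python standard library and uses http.client rather than the requests package, so that it runs without extra packages. The names CountingHandler and connections_used are made up for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    """Answer GET with a tiny body and count the TCP connections seen."""

    protocol_version = "HTTP/1.1"  # allow keep-alive, so a connection can be reused
    connections = 0                # incremented once per accepted connection
    lock = threading.Lock()

    def setup(self):
        # setup() runs once per connection, not once per request.
        with CountingHandler.lock:
            CountingHandler.connections += 1
        super().setup()

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demonstration quiet


def connections_used(n_requests, reuse):
    """Send n_requests GETs and return how many connections the server saw."""
    CountingHandler.connections = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    try:
        conn = None
        for _ in range(n_requests):
            if conn is None:
                conn = http.client.HTTPConnection(host, port, timeout=5)
            conn.request("GET", "/")
            conn.getresponse().read()  # drain the body before the next request
            if not reuse:
                conn.close()           # "one shot": throw the connection away
                conn = None
        if conn is not None:
            conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return CountingHandler.connections
```

With this sketch, connections_used(5, reuse=False) reports 5 connections and connections_used(5, reuse=True) reports 1. Over TLS each of those extra connections would also pay for a handshake.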
What are good practices for using REST (and Client Server) in production?
Do not have a new connection for every request. Create a session which can be reused for perhaps 50 requests or ten minutes, depending on workload. This has the following advantages:
- You reduce the costs of creating the new connection for every request, by reusing the session.
- You get workload balancing. With the connection ending and being recreated periodically, the connections will be spread across all available servers. You should randomise the time a connection is active for, so you do not get a lot of time-out activity occurring at the same time.
- You get the re-authentication regularly.
- The TLS key is renewed periodically.
- You avoid the long running connections doing nothing.
- For REST you get fewer SMF records than with one record per request; for Client Server you get more SMF records than one per day, and so more granular data.
How can I do this?
With Java you can open a connection, and have the client control how long it is open for.
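With Python and the requests package, you can use a Session, which keeps the underlying connection open and reuses it for successive requests:
import requests

# geturl, my_header, v, jar and cpcert are defined elsewhere in the program
s = requests.Session()
res = s.get(geturl, headers=my_header, verify=v, cookies=jar, cert=cpcert)
# …
res = s.get(geturl, headers=my_header, verify=v, cookies=jar, cert=cpcert)
# etc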
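A Session on its own stays in use for as long as the program keeps it. A minimal sketch of the "perhaps 50 requests or ten minutes" policy, with a randomised lifetime, might look like the following. The class name RecyclingSession, its parameters and the 20% jitter are illustrative choices, not part of the requests package. Whether a fresh session really reaches a different server, or forces re-authentication, depends on how your servers and load balancing are set up.

```python
import random
import time


def _default_factory():
    # Imported lazily so the class can also wrap a stand-in session object.
    import requests
    return requests.Session()


class RecyclingSession:
    """Reuse one session for a bounded number of requests or seconds, then replace it."""

    def __init__(self, max_requests=50, max_age=600.0, jitter=0.2,
                 session_factory=None, clock=time.monotonic):
        self.max_requests = max_requests
        self.max_age = max_age    # seconds; 600 is the "ten minutes" suggested above
        self.jitter = jitter      # lifetime varies by +/- this fraction
        self._factory = session_factory or _default_factory
        self._clock = clock
        self._session = None
        self._used = 0
        self._deadline = 0.0
        self.sessions_created = 0

    def _expired(self):
        return (self._used >= self.max_requests
                or self._clock() >= self._deadline)

    def _renew(self):
        self.close()  # drop the old session and its connections
        self._session = self._factory()
        self._used = 0
        # Randomise the lifetime so many clients do not all reconnect together.
        spread = random.uniform(1 - self.jitter, 1 + self.jitter)
        self._deadline = self._clock() + self.max_age * spread
        self.sessions_created += 1

    def get(self, url, **kwargs):
        if self._session is None or self._expired():
            self._renew()
        self._used += 1
        return self._session.get(url, **kwargs)

    def close(self):
        if self._session is not None:
            self._session.close()
            self._session = None
```

You would create rs = RecyclingSession() once, call rs.get(...) with the same arguments you would pass to a plain Session, and call rs.close() when the work is finished.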
With curl you can reuse the connection, for example by giving several URLs in one command.
Do I need to worry if my throughput is low?
No. If you are likely to send only one request to a server, and so cannot benefit from having multiple requests per connection, you might just as well stay with “one shot” requests and not use any of the tuning suggestions.