I am of the opinion that the Hypertext Transfer Protocol is a marvel. Its earliest version could not be simpler: it only describes the most basic GET request, and yet GET remains the workhorse of the web thirty-three years later. The rest of what would become the 1.1 version followed shortly after, and it has served as the foundation for the overwhelming majority of web applications built since then. The fact that the 404 responses and ‘If-Modified-Since’ headers we still use in 2024 were largely prescribed before there were any applications to use them is simply astonishing.
That being said, thirty-three years is a long time when it comes to technology. The requirements for modern applications are vastly different from what they were in the early days of the web, so I thought it would be interesting to take a closer look at how, and perhaps whether, HTTP (specifically version 1.1) retains its utility in such a dramatically changed environment.
Formal but Flexible
The irony of the vaunted connectivity provided by the Internet is that the core design principle of HTTP is one of separation. As a stateless request/response protocol, communication can only happen via the decidedly bureaucratic submission of a formal request, followed by waiting to receive an official response. And because it is stateless, any follow-up request that a response might prompt has to repeat the entire procedure over again from the top.
The motivations for this may have been purely practical concessions to technical limitations of the time, but the result has been a more flexible and integrated web ecosystem than might otherwise have been the case. By forcing all communication to abide by its restrictions, HTTP makes it easier for disparate applications to interact. It doesn’t much matter to a client app what language or framework the server is using; as long as they both speak HTTP they are going to be able to successfully interact. For a prime example of how differently things might have turned out in the absence of these ground rules, consider for a moment how well Androids and iPhones are able to handle basic messaging between their respective platforms.
The format of the requests and responses themselves is tightly prescribed, made up of a header section with fields providing context and a body containing any data being transmitted. Here again the restrictions belie a surprising flexibility. The number of standardized header fields available is somewhat limited, but they often leave their exact definitions open to interpretation, and an application may add as many custom headers as it wants. The headers must be plain text, but the body can be formatted in any way you want and can be as large or as small as required. In fact the only real limitations on HTTP data bodies are the ones imposed by clients themselves.
While it has served with distinction thus far, it’s fair to wonder whether HTTP is still able to meet the requirements of current applications. Users now expect to be able to interact with one another, and to be proactively notified of important updates. Both of these things present special challenges for HTTP. There is simply no way for a server to reach out to a client on its own; it has to be asked first. And there is no way for clients to connect with each other directly; they can only communicate via a server acting as intermediary.
Newer protocols like WebSockets, HTTP/2 and HTTP/3 have come along that have specific provisions for allowing servers to push data out to clients directly, or for framing messages in binary instead of plain text. These can be significantly more complex to work with, and I think that for the majority of use cases HTTP 1.1 is still perfectly capable of delivering fluid and interactive user experiences. We just might have to be a little more intentional about taking advantage of its inherent flexibility.
For a concrete example, let’s imagine we are building an app that allows multiple users to simultaneously edit the same document. A standard approach would resemble the introductory app for most web frameworks, with a document model that has fields for at least a title and a body, as well as the usual routes for an index listing, a display page and an edit form. The challenges for this design present themselves right away. One is that due to HTTP’s request/response architecture, when two users separately but simultaneously submit an update to the same document, whichever request arrives at the server last can potentially overwrite all the changes made by the first user. The other obvious problem is that if we want to display updates to our document that are happening while we are in the process of editing it, we will need some way of continuously retrieving and merging in those updates. If our documents are large, the repeated requests for them could incur heavy data transfer overhead, and if the merges require manual user intervention it could quickly become a bad user experience.
One common way to handle the competing updates problem in HTTP apps is with conditional requests, where each request includes special validation headers that the server can use to refuse to accept an update if the document has changed in the interim. This strategy gets us part of the way there, and it helps us out in two ways: we can use conditional requests to avoid clobbering other users’ edits, and we can also use them to only retrieve the latest document version if it differs from the one we already have. But conditional requests alone are not going to be sufficient for our needs. With multiple users making changes to only two fields, we still need a way to merge those competing changes. In the case of the document body especially there are likely to be a great many such conflicts even when users are working in entirely different sections.
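To make the conditional-request idea concrete, here is a minimal sketch of the check a server might run before accepting an update. The function names, the choice of a truncated SHA-256 digest as the validator, and the tuple return shape are all assumptions made for illustration; a real ‘If-Match’ header can also carry a comma-separated list of validators, which this sketch ignores.

```python
import hashlib
from typing import Tuple


def etag_for(content: str) -> str:
    """Derive a validator from the current content (hypothetical scheme)."""
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    return '"' + digest[:16] + '"'


def conditional_update(current: str, if_match: str, new_content: str) -> Tuple[int, str]:
    """Accept the update only if the client's validator matches the stored content.

    Returns (status code, content that should now be stored).
    """
    if if_match != "*" and if_match != etag_for(current):
        # The document changed since the client last fetched it.
        return 412, current  # 412 Precondition Failed
    return 200, new_content
```

A client that fetched the document, kept its validator, and then lost a race with another editor would get a 412 back instead of silently overwriting the other user’s work.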
For my approach to this problem I took inspiration from the subtle difference between the PUT and PATCH request methods. Often used interchangeably by developers to indicate any kind of update request, a PUT is actually intended to update a resource in its entirety, while PATCH is meant for making specific edits to just part of a resource. There is no reason that we have to update the entire document all at once; we could instead break up our updates by the individual fields, sending PATCH requests for each of them to the server separately. This would help somewhat, but we could take this one step further and atomize our updates within the field. Instead of sending the entire document body to the server at one time, we could send character-scale changes as they occur. If the server is coded such that it can apply small diffs instead of wholesale replacements, then this would dramatically reduce the collision surface for multiple users. In the normal course of events the only time we should encounter a conflict is if two users attempt to update the same character or word at the same time, which is far less likely to occur.
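As a rough illustration of what a character-scale change could look like on the client side, the sketch below compares the previous and current text of a field and returns only the span that differs. The function name and the zero-based, end-exclusive offsets are assumptions for this example, not something the approach requires.

```python
from typing import Tuple


def minimal_edit(old: str, new: str) -> Tuple[int, int, str]:
    """Return (start, end, replacement) such that
    old[:start] + replacement + old[end:] == new.
    """
    limit = min(len(old), len(new))

    # Length of the shared prefix.
    prefix = 0
    while prefix < limit and old[prefix] == new[prefix]:
        prefix += 1

    # Length of the shared suffix, not overlapping the prefix.
    suffix = 0
    while (suffix < limit - prefix
           and old[len(old) - 1 - suffix] == new[len(new) - 1 - suffix]):
        suffix += 1

    return prefix, len(old) - suffix, new[prefix:len(new) - suffix]
```

For a one-character typo fix this yields a one-character replacement, which is the kind of payload a PATCH request could carry instead of the whole field.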
Range: chars=12-14
If-Match: (6)[score nnd seven]

document[body]=a
The above represents a request to change a single character in the document body, correcting the misspelled ‘nnd’ in the ‘Four score and seven years ago’ of the Gettysburg Address to ‘and’. There is only one officially registered format for the value of ‘Range’ headers but, again, there’s nothing to prevent us from implementing our own. So here chars=12-14 indicates the section of the text that is to be replaced with the data contained in the body (document[body]=a). The ‘If-Match’ header is for the server to use to verify that the change we want to make is valid, and it also uses a bespoke value format. In this case we are providing some of the surrounding text from the document where the change occurs for search context (score nnd seven), and the number 6 indicates how much of the context snippet precedes the start of our change.
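A server-side handler for this kind of request might look something like the sketch below. It parses the two bespoke header values, checks that the context snippet really sits where the client says it does, and only then splices in the replacement. The function names are hypothetical, and the indexing convention behind chars=12-14 isn’t spelled out above, so this sketch assumes zero-based, end-exclusive offsets, under which the stray ‘n’ occupies 11-12.

```python
import re
from typing import Optional, Tuple


def parse_char_range(value: str) -> Tuple[int, int]:
    """Parse a bespoke 'chars=START-END' range value (assumed 0-based, end-exclusive)."""
    m = re.fullmatch(r"chars=(\d+)-(\d+)", value)
    if m is None:
        raise ValueError("unrecognised range: " + value)
    start, end = int(m.group(1)), int(m.group(2))
    if end < start:
        raise ValueError("range end precedes start")
    return start, end


def parse_context(value: str) -> Tuple[int, str]:
    """Parse a bespoke '(LEAD)[SNIPPET]' validator into (lead, snippet)."""
    m = re.fullmatch(r"\((\d+)\)\[(.*)\]", value, flags=re.DOTALL)
    if m is None:
        raise ValueError("unrecognised context: " + value)
    return int(m.group(1)), m.group(2)


def apply_patch(text: str, range_value: str, if_match: str,
                replacement: str) -> Optional[str]:
    """Return the patched text, or None if the context check fails."""
    start, end = parse_char_range(range_value)
    lead, snippet = parse_context(if_match)
    ctx_start = start - lead
    if ctx_start < 0 or text[ctx_start:ctx_start + len(snippet)] != snippet:
        return None  # caller would answer 412 Precondition Failed
    return text[:start] + replacement + text[end:]
```

This version simply refuses the patch when the snippet is not at the expected offset. A fuller implementation could search nearby for the snippet and shift the range to match, which is presumably the point of sending context instead of a plain hash.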
For the requests that retrieve updates, I’m using the ‘If-None-Match’ header to send a hash of the full contents of the field as it exists in the browser, and the server should only return the value of the title field in its response if it is different from the one I already have. This prevents unnecessary data transfer and lightens the processing load a bit on the client side by avoiding pointless display updates.
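On the server, that retrieval check could be as small as the following sketch. The hashing scheme, the function names and the (status, headers, body) return shape are assumptions for illustration only; the important part is that a matching hash produces an empty 304 response.

```python
import hashlib
from typing import Dict, Optional, Tuple


def field_hash(value: str) -> str:
    """Hash of a field's full contents, as the browser would also compute it."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def conditional_get(current_value: str,
                    if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], str]:
    """Return the field only when the client's copy is out of date."""
    tag = field_hash(current_value)
    headers = {"ETag": tag}
    if if_none_match == tag:
        # Client already has this exact value: send no body.
        return 304, headers, ""  # 304 Not Modified
    return 200, headers, current_value
```

A polling client would keep the hash of whatever it last rendered, send it on each request, and touch the display only when a 200 comes back.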
And finally, to try out the multi-user aspect I used a small script that submitted PATCH requests like the above to simulate someone typing out a block of sample text into the document while I had it loaded into my browser. Putting all this together resulted in the following:
While clearly in need of further refinement before it could be part of a finished application, it still provides an example of how HTTP can accommodate even the complex requirements of dynamic modern user experiences with some adaptive re-use of its features.
It’s hard to get definitive data on exactly how much traffic on the web currently uses which version of which protocol. But there is no question that HTTP 1.1 is not going away anytime soon. While the landscape for application development has changed significantly since its introduction, it remains the backbone of the Internet and I for one am glad of it. There is much more to be said about the ways in which HTTP continues to be used (and misused; I have a particular bone to pick with the way we developers tend to ignore all but a few of the response codes), but we may very well have another thirty-three years to get to that.