Upstream / downstream terminology used backwards? (E.g. nginx)
I've always thought of upstream and downstream along the lines of an actual stream, where the flow of information is like water. So upstream is where water/data comes from (e.g. an HTTP request) and downstream is where it goes (e.g. the underlying system that services the request).
I've been looking at API gateways recently and noticed that some of them used the inverse of this definition. I shrugged it off as an oddity at the time. I then discovered that nginx, which some API gateways are based on, also uses the terminology in the opposite way to what I expected. nginx calls the servers that it sends requests to "upstream servers", and presumably the incoming requests would therefore be "downstream clients".
Conceptually it seems like nginx would be pushing the requests "uphill" if going to an "upstream server", which is totally counter-intuitive... Gravity is reversed in the land of reverse proxies and API gateways, apparently!
I've seen other discussions in which upstream / downstream represent dependencies between systems. But for middleware or infrastructure components that sit between systems, the idea of dependencies is a little looser, and I still find it more helpful to think in terms of the flow of information, because THAT'S usually the source of your dependencies anyway.
Have I got my understanding of the stream analogy fundamentally wrong or are these software components getting the concepts backwards?
Solution 1:
In the HTTP world, the term "upstream server" was introduced in the HTTP/1.0 specification, RFC 1945:
502 Bad Gateway
The server, while acting as a gateway or proxy, received an invalid response from the upstream server it accessed in attempting to fulfill the request.
A formal definition was added later, in RFC 2616:
upstream/downstream
Upstream and downstream describe the flow of a message: all messages flow from upstream to downstream.
According to this definition:
- if you are looking at a request, then the client is upstream, and the server is downstream;
- in contrast, if you are looking at a response, then the client is downstream, and the server is upstream.
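To make the direction-dependence concrete, here is a minimal Python sketch of the definition above. The names `Message` and `roles` are hypothetical, made up for illustration; they do not come from any RFC or from nginx. The only rule encoded is the one quoted from RFC 2616: the sender of a message is upstream and the receiver is downstream.

```python
from enum import Enum


class Message(Enum):
    """The two kinds of HTTP message (illustrative names, not from any spec)."""
    REQUEST = "request"
    RESPONSE = "response"


def roles(message: Message) -> dict[str, str]:
    """Return which party is upstream and which is downstream for a message.

    Messages flow from upstream to downstream, so the sender is upstream
    and the receiver is downstream.
    """
    if message is Message.REQUEST:
        sender, receiver = "client", "server"
    else:
        sender, receiver = "server", "client"
    return {"upstream": sender, "downstream": receiver}


if __name__ == "__main__":
    for kind in Message:
        print(kind.value, roles(kind))
```

For a proxy such as nginx the same rule applies with the proxy in the client's position: when it relays a response, the backend that produced that response is the upstream party.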
At the same time, most of the data flow in HTTP is in responses, not requests. So if you consider the flow of responses, the term "upstream server" sounds pretty reasonable and logical. RFC 2616 uses the term again in its description of the 502 response code (which is identical to the HTTP/1.0 one), as well as in some other places.
The same logic can also be seen in the everyday terms "downloading" and "uploading". Most of the data flows from servers to clients, which is why "downloading" means loading something from a server to a client, and "uploading" means loading something from a client to a server.