Are HTTP_PROXY, HTTPS_PROXY and NO_PROXY environment variables standard?

It seems that a lot of programs are designed to read these environment variables to decide which proxy to go through in order to connect to a resource on the internet. Those programs may also have their own, individual proxy settings, but if those are not set, they will fall back to these environment variables:

  • HTTP_PROXY
  • HTTPS_PROXY
  • NO_PROXY

I just want to know:

  • Are these environment variables standard?
  • Is there a written specification (maybe from the OS vendors?) that recommends the use of these environment variables?

I agree with BillThor's statement that this is more a convention than a standard.
I don't know the origin of these variables, but in the case of HTTP on *nix many conventions seem to originate from the behavior of the libcurl HTTP library and the curl command-line program.

At https://curl.haxx.se/docs/manual.html there is a description of the proxy-related environment variables that libcurl/curl understands:

ENVIRONMENT VARIABLES

Curl reads and understands the following environment variables:
http_proxy, HTTPS_PROXY, FTP_PROXY

They should be set for protocol-specific proxies. General proxy should be set with
ALL_PROXY

A comma-separated list of host names that shouldn't go through any proxy is set in (only an asterisk, '*' matches all hosts)
NO_PROXY

If the host name matches one of these strings, or the host is within the domain of one of these strings, transactions with that node will not be proxied.

Note that http_proxy is the only one of these variables spelled in lowercase. Some libraries and programs look for the lowercase names of these variables, whereas others look for the uppercase names. To be safe, define both the lowercase and the uppercase version of each variable.
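A minimal Python sketch of a lookup that tolerates both spellings. The helper name proxy_from_env, the lowercase-first order and the fallback to ALL_PROXY are assumptions made for illustration, not behavior documented by curl or any particular library:

```python
import os


def proxy_from_env(scheme, environ=None):
    """Return the proxy URL configured for `scheme`, or None.

    Hypothetical helper: tries the lowercase name first, then the
    uppercase one, and finally the general all_proxy / ALL_PROXY pair.
    The lookup order is an assumption of this sketch.
    """
    env = os.environ if environ is None else environ
    for prefix in (scheme.lower(), "all"):
        for name in (f"{prefix}_proxy", f"{prefix.upper()}_PROXY"):
            value = env.get(name)
            if value:  # ignore unset and empty variables
                return value
    return None
```

Calling proxy_from_env("https") reads the real environment; passing a dict as the second argument makes the function easy to try out without touching os.environ.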

Another issue is that the cited description of how host names are matched against NO_PROXY is imprecise and leaves the following questions unanswered:

  • Should values be fully qualified domain names (FQDNs), and thus end with a dot, like foo.example.com., or not?
  • Should foo.example.com match only this one domain, or should it also match any subdomain like bar.foo.example.com? If the latter, should it also match deeper nested subdomains like bar.baz.foo.example.com?
  • Is .foo.example.com (with a leading dot) allowed, and if so, what should it match?
  • Is an asterisk (*) allowed as part of a value (*.example.com, *example.com), and if so, how is it treated?
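To make the ambiguity concrete, here is a Python sketch of just one possible interpretation. It is not what curl, Python's urllib or any other implementation is guaranteed to do. The function name bypass_proxy is hypothetical, and the sketch assumes that leading and trailing dots are ignored, that an entry matches the host itself and every subdomain at any depth, that a lone '*' matches everything, and that an asterisk inside a value gets no special treatment. Ports and IP ranges are not handled:

```python
def bypass_proxy(host, no_proxy):
    """Return True if `host` should be reached directly (no proxy).

    One possible reading of NO_PROXY; every rule here is an assumption.
    """
    host = host.lower().rstrip(".")  # treat "foo.example.com." as "foo.example.com"
    entries = [e.strip().lower() for e in no_proxy.split(",") if e.strip()]

    # A list consisting only of an asterisk matches all hosts.
    if entries == ["*"]:
        return True

    for entry in entries:
        domain = entry.strip(".")  # ".example.com" and "example.com." -> "example.com"
        if not domain:
            continue
        # Exact match, or host lies within the domain at any depth.
        if host == domain or host.endswith("." + domain):
            return True
    return False
```

Under these assumptions bypass_proxy("bar.baz.foo.example.com", ".foo.example.com") is true, while bypass_proxy("notexample.com", "example.com") and bypass_proxy("foo.example.com", "*.example.com") are both false. Another implementation could reasonably answer each of these differently, which is exactly the problem.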

The lack of a formal specification leads to confusion and bugs. The libproxy library deserves a mention here, as it aims to provide correct and consistent support for proxy configuration. From the project's home page:

libproxy exists to answer the question: Given a network resource, how do I reach it? It handles all the details, enabling you to get back to programming.

Further reading:

  • Issue I raised against curl – Please document syntax and semantics of NO_PROXY environment variable.
  • Python's issue – urllib: no_proxy variable values with leading dot not properly handled and related SO question – Python 2.7.13 does not respect NO_PROXY and makes urllib2.urlopen() error with “Tunnel connection failed: 403 Forbidden”

This is more a convention than a standard. It is likely supported by one or more protocol handler libraries which actually make the connections. Java uses similar properties in its protocol libraries.

Understanding and using common conventions makes development much simpler. It also helps uphold the principle of least surprise and makes programs more likely to just work.