What are the advantages of putting secret values of a website as environment variables?

Solution 1:

The author lists their reasoning, although it's a bit disjointed. Their primary argument is that it's easy to accidentally check in a config file, that config files have varying formats, and that they may be scattered around the system (all three of which are at best mediocre arguments for security-related config like auth tokens and credentials).

Given my own experience, you've essentially got the following three options, with associated advantages and disadvantages:

Store the data in config files.

When taking this approach, you should ideally isolate them from the repository itself, and make sure they're outside of the area that the app stores its content in.

Advantages:

  • Very easy to isolate and control access to, especially if you're using things like SELinux or AppArmor to improve overall system security.
  • Generally easy to change for non-technical users (this is an advantage for published software, but not necessarily for software specific to your organization).
  • Easy to manage across large groups of servers. There are all kinds of tools for configuration deployment out there.
  • Reasonably easy to verify what the exact configuration being used is.
  • For a well written app, you can usually change the configuration without interrupting service by updating the config file and then sending a particular signal to the app (usually SIGHUP).

Disadvantages:

  • Proper planning is needed to keep the data secure.
  • You might have to learn differing formats (though these days there's only a handful to worry about, and they generally have similar syntax).
  • Exact storage locations may be hard-coded in the app, making deployment potentially problematic.
  • Parsing of the config files can be problematic.

Store the data in environment variables.

Usually this is done by sourcing a list of environment variables and values from the startup script, but in some cases they might just be set on the command line before the program name.

Advantages:

  • Compared to parsing a config file, pulling a value out of an environment variable is trivial in pretty much any programming language.
  • You don't have to worry as much about accidentally publishing the configuration.
  • You gain some degree of security by obscurity because this practice is uncommon, and most people who hack your app aren't going to think to look at environment variables right away.
  • Access can be controlled by the application itself (when it spawns child processes, it can easily scrub the environment to remove sensitive info).

Disadvantages:

  • On most UNIX systems, it's reasonably easy to get access to a process's environment variables. Some systems provide ways to mitigate this (the hidepid mount option for /proc on Linux, for example), but they aren't enabled by default, and they don't protect against attacks from the user who owns the process.
  • It is non-trivial to see the exact settings something is using if you handle the above mentioned security issue correctly.
  • You have to trust the app to scrub the environment when it spawns child processes, otherwise it will leak information.
  • You can't easily change the configuration without a complete restart of the app.
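
To make the "scrub the environment" point concrete, here is a small Python sketch of an app that strips sensitive-looking variables before spawning a child. The marker list and the function names are hypothetical choices for the example; matching on variable names is only a heuristic, and passing an explicit allowlist of variables would be the stricter option.

```python
import os
import subprocess
import sys

# Hypothetical naming convention: any variable whose name contains one of
# these markers is treated as secret.
SENSITIVE_MARKERS = ("SECRET", "TOKEN", "PASSWORD", "KEY")


def scrubbed_env(environ=None):
    """Return a copy of the environment without variables that look sensitive."""
    source = os.environ if environ is None else environ
    return {
        name: value
        for name, value in source.items()
        if not any(marker in name.upper() for marker in SENSITIVE_MARKERS)
    }


def run_child(argv):
    """Run a child process that sees only the scrubbed environment."""
    return subprocess.run(
        argv, env=scrubbed_env(), capture_output=True, text=True, check=True
    )


if __name__ == "__main__":
    os.environ["MYAPP_API_TOKEN"] = "abc123"  # made-up secret for the demo
    probe = "import os; print(os.environ.get('MYAPP_API_TOKEN'))"
    print(run_child([sys.executable, "-c", probe]).stdout.strip())
```

If `env=` is left out, `subprocess.run` hands the child the parent's full environment, which is exactly the leak described in the disadvantages above.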

Use command-line arguments to pass in the data.

Seriously, avoid this at all costs, it's not secure and it's a pain in the arse to maintain.

Advantages:

  • Even simpler to parse than environment variables in most languages.
  • Child processes don't automatically inherit the data.
  • Provides an easy way to quickly test out particular configurations when developing the application.

Disadvantages:

  • Just like environment variables, it's easy to read another process's command-line on most systems.
  • Extremely tedious to update the configuration.
  • Puts a hard limit on how long the configuration can be (sometimes as low as 1024 characters).
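
To show how exposed command-line arguments are, the following Python sketch starts a child with a made-up `--password=hunter2` argument and reads it back through Linux's /proc filesystem. It assumes Linux with procfs mounted; the helper names are mine, and whether another user could read the same file depends on settings such as hidepid.

```python
import subprocess
import sys
from pathlib import Path


def parse_cmdline(raw):
    """Split the NUL-separated bytes of /proc/<pid>/cmdline into arguments."""
    return [part.decode(errors="replace") for part in raw.split(b"\0") if part]


def read_cmdline(pid):
    """Return a process's argument list as exposed by Linux procfs."""
    return parse_cmdline(Path(f"/proc/{pid}/cmdline").read_bytes())


def demo():
    # Hypothetical child that takes a secret as an argument and then idles.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)", "--password=hunter2"]
    )
    try:
        return read_cmdline(child.pid)
    finally:
        child.kill()
        child.wait()


if __name__ == "__main__":
    print(demo())
```

Tools such as `ps` show the same information, so a secret passed this way is visible for as long as the process runs.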

Solution 2:

Environment variables will be inherited by every child process of the web server. That's every session that connects to the server, and every program spawned by them. The secrets will be automatically revealed to all of those processes.

If you keep secrets in text files, they have to be readable by the server process, and so potentially by every child process too. But at least the programs have to go and find them; they're not automatically provided. You might also be able to make some child processes run under different accounts, and make the secrets readable only by those accounts. For example, suEXEC does this in Apache.

Solution 3:

Even if there are some security-related trade-offs to be made when it comes to environment variables or files, I don't think security was the main driving force for this recommendation. Remember the authors of 12factor.net are also (or were also?) developers of the Heroku PaaS. Getting everyone to use environment variables probably simplified their development quite a bit. There's so much variety in config file formats and locations that it would have been difficult for them to support them all. Environment variables are easy in comparison.

It doesn't take much imagination to guess at some of the conversations that were had.

Developer A: "Ah this secret config file UI is too cluttered! Do we really need to have a drop down that switches between json, xml, and csv?"

Developer B: "Oh, life would be so grand if only everyone used environment variables for the app config."

Developer A: "Actually there are some plausible security-related reasons to do that. Environment variables probably won't get accidentally checked into source control."

Developer B: "Don't you set the environment variables with a script that launches the daemon, or a config file?"

Developer A: "Not in Heroku! We'll make them type them into the UI."

Developer B: "Oh look, my domain name alert for 12factor.net just went off." [1]


[1]: source: made up.

Solution 4:

TL;DR

There are a number of reasons for using environment variables instead of configuration files, but two of the most commonly overlooked are the utility value of out-of-band configuration and enhanced separation between servers, applications, or organizational roles. Rather than present an exhaustive list of all possible reasons, I address just these two topics in my answer, and touch lightly on their security implications.

Out-of-Band Configuration: Separating Secrets from Source Code

If you store all your secrets in a configuration file, you have to distribute those secrets to each server. That either means checking the secrets into revision control alongside your code, or having an entirely separate repository or distribution mechanism for the secrets.

Encrypting your secrets doesn't really help solve for this. All that does is push the issue to one remove, because now you have to worry about key management and distribution, too!

In short, environment variables are an approach to moving per-server or per-application data out of source code when you want to separate development from operations. This is especially important if you have published source code!

Enhance Separation: Servers, Applications, and Roles

While you could certainly have a configuration file to hold your secrets, if you store the secrets in source code you have a specificity problem. Do you have a separate branch or repository for each set of secrets? How do you ensure the right set of secrets gets to the right servers? Or do you reduce security by having "secrets" that are the same everywhere (or readable everywhere, if you have them all in one file), and therefore constitute a bigger risk if any one system's security controls fail?

If you want to have unique secrets on each server, or for each application, environment variables do away with the problem of having to manage a multitude of files. If you add a new server, application, or role, you don't have to create new files or update old ones: you just update the environment of the system in question.
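
One way to picture this: the same code runs everywhere, and only the environment handed to it differs per server or role. The sketch below is a minimal Python illustration; the variable names MYAPP_DATABASE_URL and MYAPP_API_TOKEN and the `Settings` class are invented for the example.

```python
import os
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Settings:
    database_url: str
    # repr=False keeps the token out of accidental log output of the object.
    api_token: str = field(repr=False)


def load_settings(environ=os.environ):
    """Build settings from an environment mapping, failing fast on missing keys."""
    # Hypothetical variable names; each server or role supplies its own values.
    required = {
        "database_url": "MYAPP_DATABASE_URL",
        "api_token": "MYAPP_API_TOKEN",
    }
    missing = [var for var in required.values() if not environ.get(var)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return Settings(**{name: environ[var] for name, var in required.items()})
```

Because `load_settings` accepts any mapping, tests can pass a plain dict while production passes the real process environment, and nothing server-specific has to live in the repository.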

Parting Thoughts on Security

While a thorough exploration of kernel/memory/file security is out of scope for this answer, it's worth pointing out that properly-implemented, per-system environment variables are no less secure than "encrypted" secrets. In either case, the target system still has to hold the decrypted secret in memory at some point in order to use it.

It's also worth pointing out that when values are stored in volatile memory on a given node, there's no on-disk file that can be copied and attacked offline. This is generally considered an advantage of in-memory secrets, but it's certainly not conclusive.

The issue of environment variables vs. other secrets-management techniques is really more about security and usability trade-offs than it is about absolutes. Your mileage may vary.

Solution 5:

Personally, I wouldn't recommend setting environment variables in .bashrc, as these become visible to all processes started by the shell. Set them at the daemon/supervisor level instead (init/rc script, systemd config) so that their scope is limited to where they are needed.

Where separate teams manage operations, environment variables provide an easy interface for operations to set the environment for the application without having to know about the configuration files/formats and/or to resort to mangling of their content. This is especially true in multi-language/multi-framework settings where the operations teams can choose the deployment system (OS, supervisor processes) based on operational needs (deployment ease, scalability, security, etc.).

Another consideration is CI/CD pipelines - as code goes through different environments (e.g. dev, test/qa, staging, production) the environmental particulars (deployment zones, database connection particulars, credentials, IP addresses, domain names, etc.) are best set by dedicated configuration management tools/frameworks and consumed by the application processes from the environment (in a DRY, write once, run anywhere fashion). Traditionally, where developers manage these operational concerns, they tend to check in configuration files or templates alongside code - and then end up adding workarounds and other complexity when operational requirements change (e.g. new environments/deployments/sites come along, scalability/security weigh in, multiple feature branches - and suddenly there are hand-rolled deployment scripts to manage/mangle the many configuration profiles). This complexity is a distraction and an overhead best managed outside of code by dedicated tools.

  • Env-vars simplify configuration/complexity at scale.
  • Env-vars place operational configuration squarely with the team responsible for the non-code related aspects of the application in a uniform (if not standard) non-binding way.
  • Env-vars support swapping out the master/supervisor processes (e.g. god, monit, supervisord, sysvinit, systemd, etc.) that back the application - and even the deployment system (OSes, container images, etc.) - as operational requirements change. While every language framework nowadays has some process runtime of sorts, these tend to be operationally inferior, suited more to dev environments, and/or increase complexity in multi-language/multi-framework production environments.

For production, I favour setting the application env-vars in an EnvironmentFile such as /etc/default/myapplication.conf that is deployed by configuration management and readable only by root, so that systemd (or anything else for that matter) can spawn the application under a dedicated deprivileged system user in a private group. Backed by dedicated user groups for ops and sudo, these files are unreadable by world by default. This is 12factor compliant, supports all the goodness of Dev+Ops, and has all the benefits of decent security, while still allowing developers/testers to drop in their own EnvironmentFiles in the dev/qa/test environments.
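
For the dev/qa/test case, where no supervisor loads the file for you, a small loader can read the same KEY=VALUE file at startup. The Python sketch below handles only a simplified subset of what systemd accepts in an EnvironmentFile (blank lines, `#` and `;` comments, one pair of surrounding quotes); the function names are mine, and in production the supervisor would normally do this instead.

```python
import os


def parse_environment_file(text):
    """Parse simple KEY=VALUE lines; a simplified subset of systemd's syntax."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue  # blank line or comment
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not an assignment; ignored in this sketch
        value = value.strip()
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]  # drop one pair of matching quotes
        values[key.strip()] = value
    return values


def load_into_environ(path, environ=os.environ):
    """Apply a dev/test EnvironmentFile without overriding variables already set."""
    with open(path, encoding="utf-8") as handle:
        for key, value in parse_environment_file(handle.read()).items():
            environ.setdefault(key, value)
```

Using `setdefault` means anything the supervisor or shell has already set wins over the drop-in file, which keeps the file a convenience for developers rather than a second source of truth.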