Should I commit .tfstate files to Git?

I am a little puzzled about whether to commit .tfstate files to Git or not. The Terraform documentation states:

Terraform also put some state into the terraform.tfstate file by default. This state file is extremely important; it maps various resource metadata to actual resource IDs so that Terraform knows what it is managing. This file must be saved and distributed to anyone who might run Terraform. We recommend simply putting it into version control, since it generally isn't too large.

Now, on the other hand, the accepted and upvoted answer on Best practices when using Terraform states:

Terraform config can be used to provision many boxes on different infrastructure, each of which could have a different state. As it can also be run by multiple people this state should be in a centralised location (like S3) but not git.

(Emphasis by the original author, not by me)

Who is right, and why?


Solution 1:

There are a few reasons not to store your .tfstate files in Git:

  1. You are likely to forget to commit and push your changes after running terraform apply, so your teammates will have out-of-date .tfstate files. Also, without any locking on these state files, if two team members run Terraform at the same time on the same .tfstate files, you may overwrite each other's changes. You can solve both problems by a) storing .tfstate files in an S3 bucket using Terraform remote state, which will push/pull the .tfstate files automatically every time you run terraform apply, and b) using a tool like terragrunt to provide locking for your .tfstate files.
  2. The .tfstate files may contain secrets. For example, if you use the aws_db_instance resource, you have to specify a database password, and Terraform will store that, in plaintext, in the .tfstate file. This is a bad practice on Terraform's part to begin with, and storing unencrypted secrets in version control only makes it worse. At least if you store .tfstate files in S3, you can enable encryption at rest (SSL provides encryption in transit) and configure IAM policies to limit who has access. It's very far from ideal, and we'll have to see if the open issue discussing this problem ever gets fixed.
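
As a rough sketch of the setup described in the two points above, an S3 remote state configuration might look something like this. The bucket name, key, region, and lock table name are placeholders I made up for illustration. The dynamodb_table setting assumes a Terraform version whose S3 backend supports locking through a DynamoDB table; with other versions you may need a different mechanism, such as the terragrunt approach mentioned above.

```hcl
# Sketch only: all names below are hypothetical placeholders.
terraform {
  backend "s3" {
    bucket = "example-terraform-state"  # assumed bucket, must already exist
    key    = "prod/terraform.tfstate"   # path of the state object inside the bucket
    region = "us-east-1"                # assumed region

    # Ask S3 to encrypt the state object at rest.
    encrypt = true

    # Assumed DynamoDB table used for state locking, so two concurrent
    # runs cannot overwrite each other's changes.
    dynamodb_table = "example-terraform-locks"
  }
}
```

After adding a block like this, running terraform init sets up the backend, and later runs read and write the state in the bucket rather than in a local file.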

For more info, check out How to manage Terraform state and Terraform: Up & Running.

Solution 2:

TL;DR:

Important! Storing in source control could expose potentially sensitive data and risks running Terraform against an old version of state. Don't do it.

Terraform no longer recommends storing state in source control. Your 'good' options are remote or local.

Remote state offers significant benefits over both local state and state stored in source control. These are detailed below.


Original answer:

Yevgeniy's answer is a good one. The issue is somewhat less controversial now as Terraform have updated their docs to state:

Terraform also puts some state into the terraform.tfstate file by default. This state file is extremely important; it maps various resource metadata to actual resource IDs so that Terraform knows what it is managing. This file must be saved and distributed to anyone who might run Terraform. It is generally recommended to setup remote state when working with Terraform. This will mean that any potential secrets stored in the state file, will not be checked into version control

So there is no longer a disagreement between established best practice and official recommendations.


Update 2019-05-17

In the most recent version of the docs this has been changed to say:

... This state is stored by default in a local file named "terraform.tfstate", but it can also be stored remotely, which works better in a team environment. ...

I don't expect the advice will ever revert to source control being the preferred method of storing state.

Despite the docs quote above, remote state is still beneficial for a solo developer.

Remote state allows the solo developer to:

  • Work on/run their Terraform code from several devices
  • Easily back up the state file and protect against losing it, depending on the backend chosen
  • Segregate sections of their architecture via outputs
  • Automatically encrypt the state file at rest, depending on the backend chosen
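
To illustrate the "segregate via outputs" point, here is a minimal sketch of one configuration reading an output from another configuration's remote state. It assumes Terraform 0.12 or later syntax, an S3 backend, and a separate "network" configuration that exports an output named subnet_id; the bucket, key, AMI ID, and output name are all hypothetical.

```hcl
# Sketch only: reads outputs from another configuration's remote state.
data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    bucket = "example-terraform-state"    # assumed bucket name
    key    = "network/terraform.tfstate"  # assumed key of the network state
    region = "us-east-1"                  # assumed region
  }
}

resource "aws_instance" "app" {
  ami           = "ami-00000000"  # placeholder AMI ID, replace with a real one
  instance_type = "t3.micro"

  # subnet_id is assumed to be declared as an output in the network configuration.
  subnet_id = data.terraform_remote_state.network.outputs.subnet_id
}
```

This lets the network and application layers live in separate state files while still sharing values between them.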

Solution 3:

This probably comes down to preference, but I would say Git (or any other source control) is not a particularly good place to store state files. They are an output of the code you are writing, much like a compiled binary, minified JS, or CSS compiled from LESS.

On top of that, state files can change quite rapidly as a result of Terraform being run rather than the code actually being changed, which makes tracking them in version control rather awkward.

However, you do need some way of sharing these state files with any remote team members, or even with your other devices if you develop on several laptops/machines. You will also want some way to store them and back them up, because you're going to have some real pain if you lose a state file: Terraform uses the state files to work out what it is managing so that it doesn't step on the toes of other tooling.

I'd say S3 is probably the best place you can put them right now. It's pretty much free, durability and availability are excellent, and there's very good native support for it in Terraform using the remote state resource. Probably most importantly, you only have to create an S3 bucket to get started. Having to build a Consul or etcd cluster first without Terraform (otherwise you have a chicken-and-egg problem of where to store the state for creating those) is a bit of a pain, even if you intend to use either of those products.

Obviously, if you're using OpenStack then Swift should make a good alternative (although I've not used it). I've also not used HashiCorp's Atlas, but if you're happy to pay for that service it might be equally useful.

Solution 4:

I see an advantage in sharing terraform.tfstate by means other than Git.

For example: S3, Dropbox, etc. (with versioning turned on)

Then it will be possible to roll back to a previous infrastructure state.

For example, say you roll the repository back from commit B to commit A. If terraform.tfstate is unchanged, Terraform will work out how to remove everything you added in commit B, and the rollback will be easy.

If terraform.tfstate was also rolled back to commit A, Terraform will think the state is already in sync with the required configuration and will not apply the rollback to your infrastructure.
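
As a sketch of the "versioning turned on" part, assuming you go with S3 and a version of the AWS provider that has the separate aws_s3_bucket_versioning resource, the bucket holding the state could be defined along these lines. The bucket name is a placeholder.

```hcl
# Sketch only: an S3 bucket for state files with object versioning enabled.
resource "aws_s3_bucket" "state" {
  bucket = "example-terraform-state"  # hypothetical bucket name
}

resource "aws_s3_bucket_versioning" "state" {
  bucket = aws_s3_bucket.state.id

  # Keep previous versions of every object, including older copies of
  # terraform.tfstate, so an earlier state can be recovered if needed.
  versioning_configuration {
    status = "Enabled"
  }
}
```

Because the state history then lives in the bucket rather than in Git, rolling the repository back does not roll the state back with it.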