What are the perks of using Postgres as a Terraform backend?

Requires Postgres 9.5+ and Terraform 0.12+

Some background and concepts

A Terraform backend is an abstraction that stores the state of the apply operation in a file or, as we’ll see further on, a table, keeping a record of all the created resources in one curated and consistent place. The backend documentation mentions this explicitly:

Working in a team: Backends can store their state remotely 
and protect that state with locks to prevent corruption.

That’s true, although we must consider the following, quoted from the Amazon S3 Data Consistency Model documentation:

Amazon S3 provides read-after-write consistency for PUTS of new objects in 
your S3 bucket in all regions with one caveat. The caveat is that if you make a HEAD or GET 
request to the key name (to find if the object exists) before creating the object, 
Amazon S3 provides eventual consistency for read-after-write.

Amazon S3 offers eventual consistency for overwrite PUTS and DELETES in all regions.

*Updates to a single key are atomic.* For example, if you PUT to an existing key, a subsequent 
read might return the old data or the updated data, but it will never return corrupted or partial data.

Atomic key updates mean that a read never returns partial or corrupted data, but there are other considerations: a read may still return the old state. By default, the S3 backend offers no state locking or consistency checking unless you configure it with a DynamoDB table:

Stores the state as a given key in a given bucket on Amazon S3. 
This backend also supports state locking and consistency checking via Dynamo DB, 
which can be enabled by setting the dynamodb_table field to an existing DynamoDB table name.

That ends up being two services for a single job: serving as a consistent backend. And even if you already have infrastructure dedicated to serializing your team’s work, you probably want to consider the following advantages of using pg as your Terraform backend:

  • A full and rich locking system, which makes it a consistent and very stable source of truth.
  • Streaming Replication for backend mirroring.
  • A single database instance can hold as many TF backends as you want (each one in its own database).

Keeping your backend databases mirrored and isolated is a good practice: it avoids collisions at provisioning time and adds a safety margin against multiple projects colliding or affecting each other’s resources.
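As a minimal sketch of the "one database per project" layout, the helpers below derive a database name, a CREATE DATABASE statement and a connection string for each project. The tf_backend_ prefix, the function names and the example host are assumptions made for illustration; project names are assumed to be plain identifiers (letters, digits, underscores).

```shell
#!/bin/sh
# Hypothetical helpers: one backend database per Terraform project,
# all hosted on the same Postgres instance.

# Print the database name holding the state of project $1.
backend_db_name() {
  printf 'tf_backend_%s\n' "$1"
}

# Print the SQL statement that creates the database for project $1.
backend_create_sql() {
  printf 'CREATE DATABASE %s;\n' "$(backend_db_name "$1")"
}

# Print the connection string for project $1.
# Arguments: project, user, password, host.
backend_conn_str() {
  printf 'postgres://%s:%s@%s/%s\n' "$2" "$3" "$4" "$(backend_db_name "$1")"
}
```

The statement can then be piped to psql, for example backend_create_sql networking | psql -h <host> -U <user> postgres, and the output of backend_conn_str used as the conn_str of that project's backend.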

Prerequisites

Remember, backends are pre-existing (pre-Terraform) resources, so you can’t automate their creation from within the same backend (the chicken-and-egg problem).

The easiest option is to spawn a DBaaS instance (such as Cloud SQL or RDS) with a minimal size and backups enabled. The service returns an endpoint, from which you can form the FQDN used to reach the instance.

For example, AWS provides the RDS service, which can be set up with a single command like the one below (values in angle brackets are placeholders):

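aws rds create-db-instance --db-instance-identifier <id> --engine postgres \
  --db-name <db> --db-instance-class <class> --allocated-storage <GiB> \
  --master-username <user> --master-user-password <pass>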
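Once the instance exists, its endpoint can be read back with the AWS CLI. The following is a sketch, assuming the instance identifier you chose at creation time is passed as the first argument; the function name is hypothetical.

```shell
#!/bin/sh
# Wait until the RDS instance named $1 is available, then print its
# endpoint address (the host part of the backend connection string).
rds_endpoint() {
  aws rds wait db-instance-available --db-instance-identifier "$1" &&
    aws rds describe-db-instances \
      --db-instance-identifier "$1" \
      --query 'DBInstances[0].Endpoint.Address' \
      --output text
}
```

The printed address is what goes in place of <URI> in the connection string.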

Backend

The backend definition can live in its own backend.tf file:

terraform {
  backend "pg" {}
  required_version = ">= 0.12.0"
}

The backend’s connection string can be defined in a tfvars file. In the example below, an environment variable holds the conn_str assignment of the connection string, which is then passed as a parameter to the terraform init phase:

export PROJECT=<project>
export BACKEND_CONNSTR='conn_str="postgres://user:pass@<URI>/<backend_project_db>"'

In the Makefile or script, you can plug it like in this example:

	echo "$${BACKEND_CONNSTR}" > terraform/environments/$(ENV)/$(ENV).tfvars && \
	cd terraform/environments/$(ENV) && \
	terraform init \
		-backend-config=$(ENV).tfvars ...
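Outside of make, the same steps can be sketched as a plain shell script. This assumes the terraform/environments/<env> layout from the recipe above and that BACKEND_CONNSTR is already exported; the function names are hypothetical, and the chmod is an added precaution because the file contains a password.

```shell
#!/bin/sh
# Write the conn_str assignment held in BACKEND_CONNSTR into
# <dir>/<env>.tfvars and print the path of the generated file.
# Arguments: environment name, target directory.
write_backend_config() {
  env_name=$1
  env_dir=$2
  mkdir -p "$env_dir"
  printf '%s\n' "$BACKEND_CONNSTR" > "$env_dir/$env_name.tfvars"
  chmod 600 "$env_dir/$env_name.tfvars"   # the file holds credentials
  printf '%s\n' "$env_dir/$env_name.tfvars"
}

# Generate the backend configuration for environment $1 and run
# terraform init against it from the environment's directory.
init_backend() {
  env_name=$1
  env_dir="terraform/environments/$env_name"
  write_backend_config "$env_name" "$env_dir" >/dev/null &&
    (cd "$env_dir" && terraform init -backend-config="$env_name.tfvars")
}
```

Calling init_backend staging would then initialize the staging environment against its Postgres backend.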