Postgres roles best practice implementation

Folks,

I could use your help making my Postgres user access control design better and more aligned with best practices. I'm helping roll out a small production Postgres server but I am not a DB admin, so I know just enough to be dangerous.

There is one server with one install of Postgres v9.2. This install hosts multiple databases, each fully serving a different "customer". In other words, customer1 will not, should not use database2, and so forth. During normal operations, the databases are eached accessed by a matching instance of CakePHP, all co-located on the same server as Postgres. While there may be possible optimizations on this deployment, I'm mostly interested in Psql roles.

Based on what I read, it seems three types of roles would make sense:

  • Superuser postgres with non default password
  • An administrator role that does not have superuser privileges for routine maintenance, DB creation, backup, restore. Should be able to do anything with all the customer databases.
  • User roles with just the ability to CRUD in their respective database. More rights on their own DB could be tolerated if it cleans up the implementation.

Implementing that design is where I'm much less confident. Ownership of DB versus table as well as who should inherit from whom is a bit muddy. Below are my databases and my users. Is that enough info to evaluate the implementation?

     Role name |                   Attributes                   |     Member of     
    -----------+------------------------------------------------+-------------------
     admin     | Create role, Create DB                         | {user1, user2}
     postgres  | Superuser, Create role, Create DB              | {}
     user1     |                                                | {}
     user2     |                                                | {}

    postgres=# \l
                                 List of databases
       Name    |  Owner   | Encoding | Collate | Ctype |   Access privileges   
    -----------+----------+----------+---------+-------+-----------------------
     admin     | postgres | UTF8     | en_US   | en_US | =Tc/postgres         +
               |          |          |         |       | postgres=CTc/postgres+
               |          |          |         |       | admin=CTc/postgres
     postgres  | postgres | UTF8     | en_US   | en_US | 
     template0 | postgres | UTF8     | en_US   | en_US | =c/postgres          +
               |          |          |         |       | postgres=CTc/postgres
     template1 | postgres | UTF8     | en_US   | en_US | =c/postgres          +
               |          |          |         |       | postgres=CTc/postgres
     user1     | admin    | UTF8     | en_US   | en_US | =Tc/admin            +
               |          |          |         |       | admin=CTc/admin      +
               |          |          |         |       | user1=CTc/admin
     user2     | admin    | UTF8     | en_US   | en_US | =Tc/admin            +
               |          |          |         |       | admin=CTc/admin      +
               |          |          |         |       | user2=CTc/admin

To prevent external connections and passwords in the clear, pg_hba.conf is as follows:

local   all             all                                     md5
host    all             all             127.0.0.1/32            md5
host    all             all             ::1/128                 md5

Solution 1:

I know this is an old question but I will try to answer it even now, as I got to do some research related to this.

What you are trying to do is called multi-tenancy at database level. This can be achieved in two ways:

  1. In a single database cluster, somewhat how the OP described, however, my personal choice would be this:

    • postgres user uses peer authentication and is not allowed password connections. MD5 authentication, in my opinion, is a bad practice. If you get into any kind of trouble with database consistency or this kind of things, you will still be able to login if you let postgres use peer auth.
    • Each customer should get their own schema and not database. There are multiple reasons for this:
      • Owning the whole database would grant a lot of privileges.
      • Owning only specific tables would bring problems for developers and would always require to ask admins to add permissions and stuff.
      • As such, in a normal setup, each of them would get access to create things inside their schema, including tables, views, triggers, etc.
      • All of them use the same connection string except the username. In postgres, by default, if you have a schema with the name of your user, it is automatically in your search_path.
    • I would opt for not having an admin user that is able to access each schema, as a security measure. You should do backups either by dumping each schema with their own user or using the PITR technique of PostgreSQL. You would still need to use the postgres user to create new schemas, I would go for a sudo rule and a script for that.
    • Many security good practices recommend dropping the default schema so - there we go.
    • This solution is extremely suitable if the DB for each customer is small and you got tons of customers.
    • If your application handles multi-tenancy, it can use a single connection pool for all customers. Of course, this eliminates many of the security enhancements above but could have performance benefits, specially when you have a large number of customers (if you have like 500-1000 separate datasources and you use connection pooling, it will be quite overwhelming).
  2. Each customer gets their own database cluster. This is my preferred solution especially because I usually work with applications that have big databases per each customer.

    • This one brings very good data separation. You can use separate storage volumes for each customer, allocate CPU and Memory limitations (using docker?).
    • Really good flexibility on what each customer needs in their instance. They could be either similar or have distinct features.
    • Very easy to scale on both directions (up and out).
    • I also use separate virtual IPs where each cluster listens for connections, making the scale-out to not needing datasource reconfiguration.
    • PITR backups are per customer so it will be easier to restore a single customer comparing to the per-schema multi-tenancy.
    • On complex setups, each customer might need multiple databases, schemas, users and roles, etc. so this is a way better solution in those cases.

You can also use a combination of the above and use pgBouncer as a router.