Docker PostgreSQL change database encoding to UTF-8
I want to run a postgres container via docker-compose with COLLATE and CTYPE set to 'C' and the database encoding set to 'UTF-8', but this seems to be impossible.
This is the relevant part of the docker-compose.yml:
database:
  image: postgres:latest
  volumes:
    - db:/var/lib/postgresql/data
  environment:
    POSTGRES_PASSWORD: test
    LC_COLLATE: C
    LC_CTYPE: C
    LANG: C.UTF-8
And this is the log output:
The database cluster will be initialized with locales.
The default text search configuration will be set to "english".
COLLATE: C
CTYPE: C
MESSAGES: C.UTF-8
MONETARY: C.UTF-8
NUMERIC: C.UTF-8
TIME: C.UTF-8
The default database encoding has accordingly been set to "SQL_ASCII".
I need the database encoding to be UTF-8 and COLLATE and CTYPE to be 'C', not 'C.UTF-8'; otherwise a dependent application cannot connect.
I could not find anything about this in the documentation or anywhere else.
Solution 1:
You need to combine two pieces of the puzzle here:
https://www.postgresql.org/docs/9.5/app-initdb.html
The initdb documentation shows how to pass encoding and locale information when the database cluster is created.
The documentation for the official postgres Docker image states that you can pass options to initdb:
https://hub.docker.com/_/postgres
Ergo, the answer would be something like:
database:
  image: postgres:latest
  volumes:
    - db:/var/lib/postgresql/data
  environment:
    POSTGRES_PASSWORD: test
    POSTGRES_INITDB_ARGS: '--encoding=UTF-8 --lc-collate=C --lc-ctype=C'
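Or similar arguments. I left out the LANG setting because it is not a documented initdb option on the man page (the first link I included). Keep in mind that the image runs initdb only when the data directory is empty, so if the db volume already contains a cluster, these arguments have no effect until that volume is removed and recreated.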
I did not test this with docker compose; I ran it on the command line using the -e option. The concept is exactly the same, however: "environment" in docker compose corresponds to -e on the command line. To wit:
https://docs.docker.com/engine/reference/commandline/run/
--env , -e Set environment variables
Test #1 with only the password env set:
docker run -e POSTGRES_PASSWORD=test postgres:latest
Here's the output of the default run:
postgres@cbf23636dabc:~$ psql
psql (13.4 (Debian 13.4-1.pgdg100+1))
Type "help" for help.
postgres=# \l
                                  List of databases
   Name    |  Owner   | Encoding |  Collate   |   Ctype    |   Access privileges
-----------+----------+----------+------------+------------+-----------------------
 postgres  | postgres | UTF8     | en_US.utf8 | en_US.utf8 |
 template0 | postgres | UTF8     | en_US.utf8 | en_US.utf8 | =c/postgres          +
           |          |          |            |            | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.utf8 | en_US.utf8 | =c/postgres          +
           |          |          |            |            | postgres=CTc/postgres
Test #2, with the same environment variables as in the suggested docker compose file, set on the CLI instead:
docker run -e POSTGRES_PASSWORD=test -e POSTGRES_INITDB_ARGS='--encoding=UTF-8 --lc-collate=C --lc-ctype=C' postgres:latest
And then the output:
postgres@b6b80c876f3e:~$ psql
psql (13.4 (Debian 13.4-1.pgdg100+1))
Type "help" for help.
postgres=# \l
                             List of databases
   Name    |  Owner   | Encoding | Collate | Ctype |   Access privileges
-----------+----------+----------+---------+-------+-----------------------
 postgres  | postgres | UTF8     | C       | C     |
 template0 | postgres | UTF8     | C       | C     | =c/postgres          +
           |          |          |         |       | postgres=CTc/postgres
 template1 | postgres | UTF8     | C       | C     | =c/postgres          +
           |          |          |         |       | postgres=CTc/postgres
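To check the result from a script instead of reading the \l output by eye, something like the following sketch could work. The function name check_locale is made up for this example; it assumes its argument is the single line that psql prints in unaligned, tuples-only mode (-At) for the query shown in the comment.

```shell
#!/bin/sh
# Hypothetical helper: check_locale is not part of PostgreSQL or Docker.
# It expects one line in the form ENCODING|COLLATE|CTYPE, which is what
#   psql -At -c "SELECT pg_encoding_to_char(encoding), datcollate, datctype
#                FROM pg_database WHERE datname = 'postgres'"
# prints with psql's default field separator.
check_locale() {
    expected='UTF8|C|C'
    if [ "$1" = "$expected" ]; then
        echo "ok"
        return 0
    fi
    # Report the mismatch on stderr and signal failure to the caller.
    echo "unexpected: $1 (wanted $expected)" >&2
    return 1
}
```

The argument would typically come from running that psql command inside the container, for example through docker exec with -U postgres, and passing the captured line to check_locale.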
Note also the section of the official PostgreSQL Docker image page that describes initialization scripts; that is another approach you may want to look into.
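As a rough illustration of the initialization-script route: the image runs *.sh and *.sql files found in /docker-entrypoint-initdb.d when it initializes an empty data directory. The sketch below only builds the SQL for one additional database with an explicit encoding and locale; create_db_sql and the database name appdb are hypothetical names chosen for the example, not something the image provides.

```shell
#!/bin/sh
# Hypothetical helper: prints a CREATE DATABASE statement for the name in $1.
# TEMPLATE template0 is used because PostgreSQL only allows an encoding or
# locale different from the template's when copying from template0.
create_db_sql() {
    cat <<EOF
CREATE DATABASE "$1"
    TEMPLATE template0
    ENCODING 'UTF8'
    LC_COLLATE 'C'
    LC_CTYPE 'C';
EOF
}
```

Inside a script placed in that directory, the output could then be piped to psql, for instance: create_db_sql appdb | psql -v ON_ERROR_STOP=1 --username "$POSTGRES_USER". This affects only the one database it creates; the cluster-wide defaults still come from initdb.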