Postgresql - configuring as Upstart job problem

I need PostgreSQL configured to start as an Upstart job, because I use Upstart's events to start another app which depends on pgsql already running. This is the tutorial/script I've used:

http://bradleyayers.blogspot.com/2011/10/upstart-job-for-postgresql-91-on-ubuntu.html

When I restart the server (shutdown -r now), postgresql isn't running (not visible as a process via the 'top' command). I then tried running only the following command manually:

root@server:~# exec su -c "/usr/lib/postgresql/9.1/bin/postgres -D /var/lib/postgresql/9.1/main -c config_file=/etc/postgresql/9.1/main/postgresql.conf" postgres

And my ssh session simply disconnects without returning anything. If I reconnect and check the running processes again, pgsql is still not running. So I tried running the command without 'exec', and here is the response:

root@server:~# su -c "/usr/lib/postgresql/9.1/bin/postgres -D /var/lib/postgresql/9.1/main -c config_file=/etc/postgresql/9.1/main/postgresql.conf" postgres

2012-12-03 19:31:36 MSK FATAL:  could not create lock file "/var/run/postgresql/.s.PGSQL.5432.lock": No such file or directory

I assume the problem is related to PostgreSQL itself, not to the Upstart system. I suppose the path it mentions should exist so the lock file can be created, but for some reason it doesn't. Did someone else stumble upon this, or does anyone have a potential solution?
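My guess is that the /var/run/postgresql directory itself is missing: on Ubuntu /var/run sits on tmpfs, so it is empty after every reboot and is normally recreated by the init script. An untested sketch of what the Upstart job would probably need before launching postgres (paths taken from the error above):

pre-start script
  mkdir -p /var/run/postgresql
  chown postgres:postgres /var/run/postgresql
end script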


Solution 1:

I had the same desire to configure PostgreSQL this way. In my case I wanted multiple clusters, each with its own independent scheduler (pgagent). When I shut down an individual cluster, pgagent stops automatically, but when I start a cluster I want pgagent to start automatically for that cluster as well. If I forget to start the scheduler when starting a cluster, I'm in trouble.

I had Googled around but never found a good solution for running PostgreSQL under Upstart. Most of the solutions explicitly started the postmaster instead of going through the pg_ctlcluster wrapper. Given the way Upstart works, this seems dangerous and could result in data loss in rare situations.

Thus, I forged ahead and tried to create my own Upstart scripts that would do the job. I found that it was very difficult to capture the correct PID of both the cluster and its pgagent instance. Eventually however, I realized that with PostgreSQL, you don't actually care about PIDs. You care about versions and clusters. Once I realized that, it all came together, and I created the following three scripts:

The first I call pg_versions.conf.

description "PostgreSQL Version Controller"
author "Brian Myers"

start on runlevel [2345]
stop on runlevel [016]

env DEFAULT_VERSIONS="9.3"

pre-start script
  # Fall back to the default version list unless VERSIONS was passed on the command line
  if [ -z "$VERSIONS" ]; then
    VERSIONS="$DEFAULT_VERSIONS"
  fi
  for version in $VERSIONS
  do
    # Start every cluster of this version whose start.conf is set to "auto"
    for cluster in $(pg_lsclusters -h | grep "$version" | cut -d" " -f 2)
    do
      if [ "$(tail -1 /etc/postgresql/$version/$cluster/start.conf)" = "auto" ]; then
        start pg_cluster version=$version cluster=$cluster
      fi
    done
  done
end script

post-stop script
  if [ -z "$VERSIONS" ]; then
    VERSIONS="$DEFAULT_VERSIONS"
  fi
  for version in $VERSIONS
  do
    for cluster in $(pg_lsclusters -h | grep "$version" | cut -d" " -f 2)
    do
      stop pg_cluster version=$version cluster=$cluster
    done
  done
end script

Next is pg_cluster.conf.

description "PostgreSQL Cluster Controller"
author "Brian Myers"

instance $version-$cluster

pre-start script
  # Only start the cluster (and its pgagent) if it is currently down
  if [ "$(pg_lsclusters -h | grep "$version" | grep "$cluster" | cut -d" " -f 4)" = "down" ]; then
    pg_ctlcluster $version $cluster start || :
    start pg_agent version=$version cluster=$cluster || :
  fi
end script

post-stop script
  # Stop pgagent first (if it left a pid file), then shut the cluster down cleanly
  if [ -e "/var/run/postgresql/pgagent-$version-$cluster.pid" ]; then
    stop pg_agent version=$version cluster=$cluster
  fi
  if [ "$(pg_lsclusters -h | grep "$version" | grep "$cluster" | cut -d" " -f 4)" = "online" ]; then
    pg_ctlcluster $version $cluster stop
  fi
end script

And finally pg_agent.conf.

description "PgAgent Controller"
author "Brian Myers"

instance ${version}-${cluster}

setuid postgres

pre-start script
  # Find the cluster's port; if pgagent's schema isn't installed in it, cancel the start
  PORT=$(pg_lsclusters -h | grep "$version" | grep "$cluster" | cut -d" " -f 3)
  if [ -z "$(psql -c "select schema_name FROM information_schema.schemata WHERE schema_name = 'pgagent';" -d postgres -p $PORT | grep pgagent)" ]; then
    stop ; exit 0
  fi
  # Launch pgagent and record its PID ourselves, since Upstart doesn't track it here
  PGAGENTDIR=$(which pgagent)
  PGAGENTOPTIONS="host=/var/run/postgresql dbname=postgres user=postgres port=$PORT"
  start-stop-daemon --start --oknodo --name "pga$version$cluster" --exec $PGAGENTDIR -- $PGAGENTOPTIONS
  pgrep -f "$PGAGENTDIR.+$PORT" > /var/run/postgresql/pgagent-$version-$cluster.pid
end script

post-stop script
  # Stop pgagent via the recorded PID and clean up the pid file
  start-stop-daemon --stop --oknodo --pidfile /var/run/postgresql/pgagent-$version-$cluster.pid
  if [ -w /var/run/postgresql/pgagent-$version-$cluster.pid ]; then
    rm -f /var/run/postgresql/pgagent-$version-$cluster.pid
  fi
end script
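These are ordinary Upstart job files, so under the filenames used above they just need to land in /etc/init/ and be picked up by Upstart; something along these lines (adjust paths to wherever you saved them):

sudo cp pg_versions.conf pg_cluster.conf pg_agent.conf /etc/init/
sudo initctl reload-configuration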

If you want to manage more than just the 9.3 version, add the extra versions to the env DEFAULT_VERSIONS="9.3" line, separated by spaces.
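For example, assuming a 9.1 installation alongside 9.3, that line would become:

env DEFAULT_VERSIONS="9.1 9.3"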

With these, I can:

Start all clusters not already running: sudo initctl start pg_versions

Start all clusters for a particular version not already running: sudo initctl start pg_versions VERSIONS=9.3 (the job reads the upper-case VERSIONS variable)

Start a particular cluster, auto-starting pgagent for that cluster, but only if the cluster is pgagent enabled: sudo initctl start pg_cluster version=9.3 cluster=main

Start a cluster's pgagent if the cluster is pgagent enabled: sudo initctl start pg_agent version=9.3 cluster=main

Change start to stop to get the inverse behavior. Everything starts up on boot and shuts down cleanly at shutdown via pg_ctlcluster, so there is no data loss. I did have to disable the distribution's init.d scripts, which I did via bum (Boot-Up Manager).
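If you would rather do that from the command line than with bum, something like this should work (assuming the stock SysV script is named postgresql, as it is in the Ubuntu packages):

sudo update-rc.d postgresql disable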

I'm sure these could be cleaned up or done in even better ways. The pg_agent job, for example: I could never figure out why using script or exec stanzas wouldn't capture the correct PID. Eventually I gave up and managed the pid file myself, but it's still a mystery; it was probably down to my rather weak shell scripting skills.

Note also that if you shut down clusters manually with pg_ctlcluster, these Upstart jobs will still show as running even though the associated versions/clusters are not. That's not a big deal, because you can just restart them with either pg_ctlcluster or initctl, but for that reason I suggest using initctl to control your clusters if you deploy these jobs.
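For instance, after a manual pg_ctlcluster 9.3 main stop, a status check like this (example assumes the 9.3/main cluster) will still report the instance as running:

sudo initctl status pg_cluster version=9.3 cluster=main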

In any case, these work quite well for me.