SQLite on Google Cloud Persistent Disk

Update (30 Aug 2019)

Dqlite by Canonical provides a distributed, high-availability SQLite which may suit this use case. It's open-source under Apache 2.0, and written in C, so it could be a drop-in replacement for SQLite. See also discussion on HN for more context.


Earlier answer

It sounds like from your comments that MySQL and Google Cloud SQL are not an option due to your architecture requiring the use of a single SQLite file.

Also, per SQLite docs, NFS is not an option due to the locking issues.

Here are some other options to consider.

Alternative distributed filesystem

In addition to NFS, there are a number of other distributed filesystems you may want to evaluate, such as Ceph, GlusterFS, OrangeFS, ZFS, etc. In addition to your own research, consider reaching out to SQLite users or developers for guidance and past experiences.

Use NFS, but enforce single writer-at-a-time

The NFS issue appears to be about locking, which is only needed for writes: as long as you can guarantee that only one process has the database locked for writes at-a-time, several other processes can open it for reads so this should be OK (please confirm/verify that this is the case).

Thus, as long as there's an external method to ensuring a single-writer, you may be able to use NFS. Consider using a distributed lock service such as Apache ZooKeeper, HashiCorp Consul or CoreOS etcd for the lock service, and you can store your SQLite on NFS.

This, of course, relies on each process with direct access to SQLite database to properly close it when it no longer needs to write to it, so correctness is hard to enforce, as it relies on all software to be correct and cooperating.

Lightweight RPC server

You mentioned that your architecture (which cannot be changed at this time) relies on SQLite, but if it's possible to have them call an RPC service instead of opening the file directly, you can have that server be the only point of opening the SQLite database and avoid the locking issue from multiple concurrent users. However, this means that you would have to change all the client code to call the RPC service instead of opening the SQLite database directly, which is a non-trivial amount of work.

Conclusion

None of these options are trivial and will require work. The reason is that:

In contrast to many other database management systems, SQLite is not a client–server database engine. Rather, it is embedded into the end program.

As such, it is not the right solution for multiple accessors, and hence a bunch of workarounds are required.

Longer term, if you're in a situation where you would have to make a significant changes to continue to be able to maintain this system, you may want to consider migrating to MySQL or Google Cloud SQL instead of investing into workarounds to continue to use SQLite.


Re: using SQLite for multiple access: the FAQ you linked to says:

(5) Can multiple applications or multiple instances of the same application access a single database file at the same time?

Multiple processes can have the same database open at the same time. Multiple processes can be doing a SELECT at the same time. But only one process can be making changes to the database at any moment in time, however.

which means that you should not use SQLite as a general-purpose database, it's intended to be used as an embedded database for a single writer, though it can also support several readers.

Re: Persistent Disks: see my answer to a related question on Stack Overflow; the short story is that a persistent disk on Google Compute Engine can be mounted either:

  • read-write to a single instance
  • read-only to several instances

Thus, you cannot share persistent disk in read-write mode to several VMs as you are suggesting. If you want a shared database, as yagmoth555 pointed out in the comments, you should use a SQL database, such as MySQL.

Conveniently, Google Cloud SQL provides a managed MySQL instance which you can use from multiple VMs, as it provides proper locking as well as backups, failover, etc.