Which network file systems do not implement locking correctly?

I believe the correct answer is, in short: "All of them".

This should be qualified to the context of multiple processes on multiple computers accessing the same database over the network. There should be no problem when all the processes accessing the database run on the same computer, provided the database engine (SQLite) uses exclusion tools such as semaphores or mutexes (and uses them correctly...).
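
As a concrete illustration of the single-computer case, here is a minimal sketch, assuming a local database file named app.db (a hypothetical name) and Python's built-in sqlite3 module. SQLite's own locking serializes concurrent local writers; the busy timeout merely makes each process wait for the lock instead of failing immediately:

```
# Minimal sketch: several local processes can run this safely, because
# SQLite's own locking serializes them on a single computer.
import sqlite3

def record_event(db_path="app.db"):
    # timeout=5.0: wait up to 5 s for another local process to release its lock
    conn = sqlite3.connect(db_path, timeout=5.0)
    try:
        with conn:  # one transaction: commits on success, rolls back on error
            conn.execute("CREATE TABLE IF NOT EXISTS events (note TEXT)")
            conn.execute("INSERT INTO events (note) VALUES (?)", ("updated",))
    finally:
        conn.close()

if __name__ == "__main__":
    record_event()
```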

The reason is that lock information is usually kept in the form "process X holds a lock on Y". This works very well when all processes execute on the same computer, but much less well when they run on different computers.

When a process on one computer accesses a file that lives on another computer, the operating system hosting the file stands in for the remote process, in effect acting as an agent for a process it knows nothing about. It may even have difficulty distinguishing between two different processes executing on the same remote computer and mistake one for the other.

Locking files correctly, whether the entire file or a section of it, would really require a single operating system running across all of the involved computers, with a centralized repository for file locks. Unfortunately, neither Linux nor Windows does that in the general case.

You will find a better treatment of the subject for SQLite in the "How To Corrupt Your Database Files" section of the older article File Locking And Concurrency In SQLite Version 3. It details the system calls SQLite uses to ensure concurrency on Windows and Linux, both for locking file sections and for flushing updated data from the computer to the database. All the listed functions work well only in the context of one computer. Appropriately, that article contains the sentence "Your best defense is to not use SQLite for files on a network filesystem".
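
On Unix, the locking primitive described in that article is the POSIX advisory byte-range lock obtained through fcntl(). As a hedged sketch of that primitive only (the file name is hypothetical, and the whole file is locked here rather than the specific byte ranges SQLite uses):

```
# Sketch of the POSIX advisory locking that SQLite builds on under Unix.
# The lock is advisory: it protects the data only if every other process
# also asks for it before touching the file.
import fcntl
import os

fd = os.open("example.db", os.O_RDWR | os.O_CREAT, 0o644)
try:
    fcntl.lockf(fd, fcntl.LOCK_EX)   # exclusive lock on the whole file
    os.write(fd, b"updated data")    # the write the lock is protecting
    os.fsync(fd)                     # flush the update out of the OS cache
finally:
    fcntl.lockf(fd, fcntl.LOCK_UN)   # release the lock
    os.close(fd)
```

On a local filesystem this behaves as expected; on a network filesystem the same call is handed to whatever locking support the filesystem offers, which is exactly where the guarantees discussed above break down.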

Note that the problem of flushing data to the database file is especially worrisome. The operating system usually delays writes, so the updating process may have issued its flush and released its lock while the new data has not yet reached the database file on the remote machine; another process that reads at that moment sees stale or partial data, and the database can easily end up corrupted.
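
To make that ordering concern concrete, here is a hedged sketch (hypothetical helper names; the lock argument stands for whatever cross-process lock is in use) of the difference between flushing before and after releasing the lock. On a single computer the second ordering is the dangerous one; on a network filesystem even the first ordering may not be enough, because the flush may only reach the local machine's cache rather than the remote database file:

```
import os

# Safe ordering on one machine: the data is forced to disk before any
# other process is allowed to acquire the lock and read it.
def update_safely(fd, lock, data):
    lock.acquire()
    try:
        os.write(fd, data)
        os.fsync(fd)        # flushed while we still hold the lock
    finally:
        lock.release()

# Dangerous ordering: the lock is released while the new data may still
# sit in the OS write cache, so a reader can take the lock and see stale
# or partially written contents.
def update_dangerously(fd, lock, data):
    lock.acquire()
    try:
        os.write(fd, data)
    finally:
        lock.release()      # too early: the flush has not happened yet
    os.fsync(fd)
```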

More information can be found in the SQLite article Atomic Commit In SQLite, in the section Things That Can Go Wrong. It adds the case where different processes and computers use different locking mechanisms, so that one does not block the other.

Database management systems that have solved this problem, such as Oracle or SQL Server, do so by designating one specific computer as the only one that can update the database, which greatly simplifies locking. Other distributed systems, such as Apache Hadoop, supply their own locking mechanisms to solve these problems.

Another article you would find interesting is On the Brokenness of File Locking.