Should I expect programs that run from a tmpfs folder to run faster? (with and without I/O, inside and outside a Docker container)
Solution 1:
Running on tmpfs will be faster only if you have a lot of disk I/O that isn't fulfilled from page cache.
If the I/O is reads and the files are already in cache, tmpfs will make no difference.
If the I/O is asynchronous writes with no flushing, the workload is bursty rather than continuous, the cache is large enough to absorb a burst, and the kernel can flush the dirty pages in the background between bursts, tmpfs will make no difference.
If there is no disk I/O to be done, tmpfs will make no difference.
If you have a lot of disk I/O and your process is blocked in iowait while storage catches up, running on tmpfs will make a huge difference. The slower your disks, the more difference it will make, right up to the point where you become CPU bottlenecked.
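One way to check which of these cases applies to you is to time the same writes against both locations. The following is a minimal sketch, not a rigorous benchmark; the function name `time_writes` and its parameters are made up for illustration. With `sync=True` every file is forced to storage (the case where tmpfs should win); with `sync=False` the writes normally land in the page cache (the case where it should make little difference).

```python
import os
import time


def time_writes(directory, n_files=20, size=1 << 20, sync=True):
    """Write n_files files of `size` bytes into `directory`.

    Returns the elapsed wall-clock time in seconds. Cleanup is not timed.
    """
    payload = os.urandom(size)
    paths = [os.path.join(directory, f"bench_{i}.bin") for i in range(n_files)]

    start = time.perf_counter()
    for path in paths:
        with open(path, "wb") as f:
            f.write(payload)
            if sync:
                # Force the data out of the page cache and onto the backing
                # store. On tmpfs there is no backing store to wait for.
                f.flush()
                os.fsync(f.fileno())
    elapsed = time.perf_counter() - start

    for path in paths:
        os.remove(path)
    return elapsed
```

To use it, call `time_writes("/dev/shm")` and `time_writes("/some/disk/dir")` with the same arguments, once with `sync=True` and once with `sync=False`. Both paths are assumptions: `/dev/shm` is usually tmpfs on Linux, and you should confirm with `df -T` that the second directory really is disk-backed, since `/tmp` is itself tmpfs on some systems.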
Solution 2:
(4) The executable is run from within a Docker container, and the executable and all files involved are in a tmpfs folder created from within the container, under an "internal" drive (that does not persist after the container is halted).
(5) The executable is run from within a Docker container, and the executable and all files involved are in "regular" disk folders.
Tmpfs aside (which has already been addressed), you will not notice any difference between running executables on the host and from within a Docker container. The same goes for volume storage. The container's own filesystem is an overlay file system, and there is probably a very small performance hit for the overlay (volumes bypass it and are stored directly on the host filesystem), but the host sees the executable as an ordinary process, just as if it had been started on the host.
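If you are unsure which filesystem a given path actually lives on (overlay, tmpfs, or a regular disk filesystem), the mount table will tell you, both on the host and inside the container. Here is a small sketch that looks a path up in `/proc/mounts`-style text. The helper name `fs_type_for` is hypothetical, and it ignores the octal escapes (such as `\040` for a space) that the kernel uses in mount points.

```python
import posixpath


def fs_type_for(path, mounts_text):
    """Return the filesystem type of the mount that contains `path`.

    `path` must be absolute. `mounts_text` is the content of /proc/mounts,
    where each line reads: device mount_point fs_type options dump pass.
    Returns None if no mount matches.
    """
    path = posixpath.normpath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Prefer the longest mount point; on a tie, the later line wins
            # because later mounts shadow earlier ones.
            if len(mount_point) >= len(best_mount):
                best_mount, best_type = mount_point, fs_type
    return best_type
```

On Linux you would call it as `fs_type_for("/path/to/executable", open("/proc/mounts").read())`. Inside a container with default settings, paths in the image typically report `overlay`, while a tmpfs mount reports `tmpfs`.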
Solution 3:
Realistically, there should be near-zero or zero difference. There may be valid reasons for doing such a thing, but I daresay for 99.9% of all cases it's an anti-optimization.
Normal I/O will go to the buffer cache, and dirty pages will be written out asynchronously by a kernel thread. Bandwidth = RAM speed, delay = RAM delay (plus, maybe a few µs for a page fault here and there).
When the system runs out of physical RAM, you will block (obviously). When there are no pages left that you could write to, you have to wait until some are freed. There is no other way.
Running out of RAM is, however, by definition, not likely (or possible) to happen in your particular case. Otherwise, if there is an actual possibility of running out of RAM, the idea of creating a tmpfs would be quite stupid in the first place.
I/O to tmpfs will go to... the buffer cache. Yes, now it runs under a different name, but in reality it is exactly the same, due to the unified VM system. Bandwidth and delay are thus exactly the same (give or take maybe 1% difference due to filesystem subtleties which may be slightly different). No writeback is happening, yes. But who cares, seeing how it would be happening asynchronously anyway. On the contrary, now you do not have the luxury of someone freeing dirty pages in the background, so the likelihood of hitting the ceiling, if that's possible, is actually higher.
When the system runs out of physical RAM, you're still only writing to pages mapped in memory, so in theory you should be able to do that faster. However, in practice, there is still only so much physical RAM, and you've just run out of it! Sure, there may still be room in your tmpfs, but for some reason you also need to write to a memory page elsewhere, and there are none left.
So the kernel must do something to keep the system running. It has to somehow rearrange its resources to mitigate the problem (possibly swap out the Docker process and the whole container?!). The kernel will try hard, but it cannot perform magic. It has to do something, and its choices are limited. Even more so as you've deliberately taken away a huge number of pages that it could otherwise use freely as needed for any purpose (including the request that must be served right now).
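You can watch this happen on Linux: pages used by tmpfs show up in `/proc/meminfo` under `Shmem`, and they are also counted in the `Cached` figure alongside ordinary file cache. The sketch below parses that file and splits the numbers apart. The helper names are made up for illustration, and treating `Cached - Shmem` as "file cache" is an approximation, not an exact accounting.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: int}. Most values are in kB."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def cache_breakdown(info):
    """Separate tmpfs/shared-memory pages from the rest of the cache."""
    cached = info.get("Cached", 0)
    shmem = info.get("Shmem", 0)
    return {
        # Pages held by tmpfs and other shared memory: never written back
        # to disk, so they can only be reclaimed by swapping.
        "shmem_kb": shmem,
        # Approximate disk-backed cache: can be dropped or flushed.
        "file_cache_kb": max(cached - shmem, 0),
        # Pages still waiting for background writeback.
        "dirty_kb": info.get("Dirty", 0),
    }
```

Run `cache_breakdown(parse_meminfo(open("/proc/meminfo").read()))` before and after copying a large file into a tmpfs mount; `shmem_kb` should grow by roughly the size of the file, which is exactly the memory the kernel can no longer reclaim by simply dropping clean cache pages.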