Linux I/O buffering reference

Solution 1:

Wow. Either you are after very particular pieces and parts to solve a particular problem, or you don't realize you just signed away the next week of your life to research. Getting through all of the kernel internals for your topics is going to be painful, so here's what you need to do: ask one simple question and start tracing. Given a simple, focused question, there are many developers on the Linux Kernel Mailing List who are experts at explaining why the internals behave the way they do in your situation. It may take a couple of rounds, but they can help.

Another method you can use, given a question with a single purpose, is to trace the activity in question into the kernel and learn about the individual pieces it touches rather than trying to understand it all at once. Fortunately for you, the kernel has a built-in tracer called ftrace, and there is also SystemTap (stap); either will start you on your adventure. Many kernel developers want more people to ask important questions about their kernel, and these tools will help them do it. Linux Weekly News has run several articles on ftrace: Tracing: no shortage of options (Jul 2008), A Look at ftrace (Mar 2009), Debugging the kernel using ftrace - Part 1 (Dec 2009) and Part 2 (Dec 2009), Secrets of the Ftrace function tracer (Jan 2010), and finally the good old documentation that ships with the kernel (2008).
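To give a feel for what "start tracing" looks like in practice, here is a minimal Python sketch that drives ftrace through its control files. It assumes tracefs is mounted at /sys/kernel/tracing (older kernels expose it at /sys/kernel/debug/tracing), that your kernel was built with the function_graph tracer, and that you run it as root. The helper names (write_control, trace_call_graph, buffered_write), the /tmp path, and the choice of vfs_write as the function to follow are mine, picked only for illustration.

```python
import os
from pathlib import Path

# Assumed mount point; older kernels use /sys/kernel/debug/tracing instead.
TRACEFS = Path("/sys/kernel/tracing")


def write_control(root, name, value):
    """Write a value to one of the ftrace control files under root."""
    (Path(root) / name).write_text(value)


def trace_call_graph(action, graph_function, root=TRACEFS, max_lines=40):
    """Run action() while function_graph traces graph_function and its callees.

    Returns the first max_lines lines of the resulting trace.
    """
    root = Path(root)
    write_control(root, "tracing_on", "0")            # pause tracing during setup
    write_control(root, "current_tracer", "function_graph")
    write_control(root, "set_graph_function", graph_function)
    write_control(root, "trace", "")                  # truncating the file clears the buffer
    write_control(root, "tracing_on", "1")
    try:
        action()
    finally:
        write_control(root, "tracing_on", "0")
    lines = (root / "trace").read_text().splitlines()
    write_control(root, "current_tracer", "nop")      # put the default tracer back
    return lines[:max_lines]


def buffered_write():
    """Hypothetical workload: one small write pushed to disk with fsync."""
    with open("/tmp/ftrace-demo.txt", "w") as f:
        f.write("hello\n")
        f.flush()
        os.fsync(f.fileno())


# Example (as root):
#   for line in trace_call_graph(buffered_write, "vfs_write"):
#       print(line)
```

Keep in mind that the trace is system-wide, so writes from other processes will show up too unless you also narrow it down, for example by writing a PID to set_ftrace_pid. The point is not this particular script but the habit: pick one operation, watch which kernel functions it touches, and read up on just those.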

By using a tracing utility, you learn how the kernel does buffering and a host of other things unique to your kernel, hardware (controller, chipset, CPU, disk technology), filesystem, and I/O scheduler. Every distribution is different in this regard. If you have complex storage devices (cluster FS, SAN with an enterprise array, SSD), then be prepared to get your hands dirty learning about their quirks too. A word of warning about cluster filesystems: they frequently involve a userspace component that can cause lots of unexpected delays. Most of us attribute those delays to the kernel, but the real cause is often more complex than that.
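Since the I/O scheduler is one of the things that varies from system to system, it helps to know which one you are actually running before you trace anything. The following is a small sketch, assuming the usual sysfs layout where each block device has a queue/scheduler file listing the available schedulers with the active one in square brackets. The function names are hypothetical.

```python
from pathlib import Path


def active_scheduler(text):
    """Return the bracketed entry from a queue/scheduler line, or None."""
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None


def list_schedulers(sys_block="/sys/block"):
    """Map each block device name to its active I/O scheduler."""
    result = {}
    for dev in sorted(Path(sys_block).iterdir()):
        sched_file = dev / "queue" / "scheduler"
        if sched_file.is_file():
            result[dev.name] = active_scheduler(sched_file.read_text())
    return result


# Example:
#   for name, sched in list_schedulers().items():
#       print(name, sched)
```

A device whose scheduler file has no bracketed entry is reported as None here, which I would read as "no selectable scheduler" rather than as an error.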

By far the best text I've been able to find was written by Neil Brown in 2009, entitled "Linux kernel design patterns". Neil hits a lot of the topics you brought up and many more.

One thing I know for sure is that this landscape is continually changing, especially in the scheduling arena. Just try to understand what is going on in your particular corner and count your blessings that you don't have to code to one of those components.