Why is my RAID1 read access slower than write access?
Solution 1:
poige is exactly right about the write cache, but here are more details.
dd with zeros and using the write cache is not the right way to benchmark (unless you want to test the write cache, of course, which is probably only useful for a file system, to see how much it syncs metadata, creates new files, etc.). And dd is likely always the wrong kind of benchmark, but it works for a very basic test.
I suggest you use dd with at least one of the following options (a short example follows the list):
conv=fdatasync -> this makes dd flush the data to disk before it finishes and calculates the speed
oflag=direct -> this makes dd skip the OS cache, but not the disk cache
conv=sync -> closer to skipping the disk cache too, but not exactly; it roughly flushes every block
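A minimal sketch of how these flags attach to a write test, assuming /mnt/raid/test.img is a path on the array being tested (substitute your own mount point); /dev/zero is used here only to show where the flags go, and the next paragraph explains why zeros make a poor data source for the real measurement:
dd if=/dev/zero of=/mnt/raid/test.img bs=1M count=512 conv=fdatasync
dd if=/dev/zero of=/mnt/raid/test.img bs=1M count=512 oflag=direct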
And don't use zeros either. Some smart hardware/software/firmware might take shortcuts when the data is as predictable as zeros. This is especially true if there is compression, which I am guessing you aren't using. Instead, use a random file held in memory (such as in /dev/shm). urandom is slow, so you need to write its output somewhere temporarily so you can read it back again. Create a 50 MB random file:
dd if=/dev/urandom of=/dev/shm/randfile bs=1M count=50
Read the file several times and write it out (here I use cat to read it 6 times):
dd if=<(cat /dev/shm/randfile{,,,,,}) of= ... conv=fdatasync
rm /dev/shm/randfile
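Since the question is about read speed, it is worth benchmarking the read path the same way; a sketch assuming bigfile is a large file already sitting on the array (the name is only a placeholder), with iflag=direct bypassing the page cache so you measure the disks rather than memory:
dd if=bigfile of=/dev/null bs=1M iflag=direct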
Also keep in mind that RAID1 reads are fastest with parallel operations, so that the disks can be used independently. It's probably not smart enough to coordinate the disks so that different parts of a single operation are read from different disks.
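One way to see that parallelism, sketched under the assumption that file1 and file2 are two large files on the array: start two direct reads at once and watch iostat from another terminal; with RAID1 each member disk should end up serving mostly one of the streams:
dd if=file1 of=/dev/null bs=1M iflag=direct &
dd if=file2 of=/dev/null bs=1M iflag=direct &
iostat -x 1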
Solution 2:
The key to answering your question is read-ahead. Once upon a time, I happened to have that issue too.
In other words, for optimal sequential read performance all disks should be kept permanently busy with input.
When you use dd without directio (see man dd), the write operation is not performed immediately but goes through the OS cache, so it has more chances to involve all the disks sequentially and achieve the maximum possible performance.
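If read-ahead is indeed the limit, it can be inspected and raised with blockdev; a sketch assuming /dev/md0 is the RAID1 device and /dev/sda and /dev/sdb are its members (the value is in 512-byte sectors, and 65536 is only an illustrative figure to experiment with):
blockdev --getra /dev/md0 /dev/sda /dev/sdb
blockdev --setra 65536 /dev/md0
Re-run the sequential read test afterwards to see whether the larger read-ahead changes the result.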