When is the data in the journal written to the disk?
(1) mentions that "With a journal the file is first written to the journal, punch-in, and then the journal writes the file to disk when ready. Once it has successfully written to the disk, it is removed from the journal, punch-out, and the operation is complete."
So, when I create a file, it's written to the journal first and to the disk later. If I create a 1 MB file, then 2 MB of data is actually written to the disk: 1 MB to the journal and another 1 MB to the disk later. This might decrease the lifespan of the disk. My question is: when is the data in the journal transferred to the disk? If that isn't done immediately, then subsequent reads of the data from the disk would not be possible. Also, from the user's point of view, is the write complete when the data is written to the journal or when it is written to the disk?
Also, it is mentioned that because of journaling, some file systems need less defragmentation. How is disk defragmentation related to the journal?
(1) http://www.howtogeek.com/howto/33552/htg-explains-which-linux-file-system-should-you-choose/
when is data in the journal transferred to the disk?
Depends on two main things: the file system in use and the physical storage device. XFS uses write barriers. EXT3 uses write barriers, if enabled. EXT4 has barriers on by default. Traditional HDDs use caches. Solid-State Drives may or may not have a cache. Ultimately, it is a combination of the operating system, file system and underlying hardware architecture and specifications that determine when data is persisted on the storage device.
is the write complete to the user when the data is written to the journal or to the disk?
This also depends on the application in use and your operating system. Linux has the fsync system call that applications and file systems use to flush cached data to the physical devices. Not all applications use fsync to explicitly flush cached data to storage. You can always issue a sync command to manually flush file system buffers.
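As a rough illustration of what "using fsync" looks like from an application, here is a minimal Python sketch for Linux. The helper name durable_write is made up for this example; os.open, os.write and os.fsync are the standard library wrappers around the corresponding system calls. It assumes a POSIX system where a directory can be opened and fsynced, and it only asks the kernel to flush; whether the drive's own cache is flushed still depends on the file system and hardware, as described above.

```python
import os


def durable_write(path: str, data: bytes) -> None:
    """Write data to path and ask the kernel to flush it to the device.

    Hypothetical helper for illustration, not part of any library.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        # os.write may write fewer bytes than requested, so loop until done.
        while view:
            written = os.write(fd, view)
            view = view[written:]
        # Without this call the data may sit in the page cache for a while.
        os.fsync(fd)
    finally:
        os.close(fd)

    # Also fsync the containing directory so the new directory entry
    # itself is flushed (assumes a POSIX system such as Linux).
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Calling durable_write("example.bin", b"hello") returns only after both fsync calls have returned. An application that just calls write and close gets no such guarantee, which is why the answer to "when is the write complete" depends on the application.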
How is disk defragmentation related to journal?
Disk fragmentation affects performance, especially when dealing with large files whose blocks are not contiguous. There are different techniques for mitigating fragmentation. For example, XFS and other file systems use an allocate-on-flush technique to minimize fragmentation.
Some better links for information about journaling are:
Journaling file system
Anatomy of Linux journaling file systems
The latter explains the three journaling strategies: writeback, ordered, and data, where ordered is normally the default:
Ordered mode is metadata journaling only but writes the data before journaling the metadata. In this way, data and file system are guaranteed consistent after a recovery.
So, unless you have set your journaling strategy to data mode (also called Journal mode), where both metadata and data are journaled, your disk will not suffer much from the fact that it is journaled.
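To put rough numbers on the "1 MB becomes 2 MB" worry from the question, here is a deliberately simplified Python model. The function name bytes_written and the 4 KiB metadata figure are assumptions made for this example; real file systems also write commit records and batch several updates together, so treat the results as an order-of-magnitude estimate only.

```python
def bytes_written(file_size: int, metadata_size: int, mode: str) -> int:
    """Estimate bytes hitting the disk for one file creation.

    Simplified model (an assumption, not a measurement): metadata is
    always written twice (journal, then final location); file data is
    written twice only in "data" mode.
    """
    if mode not in ("writeback", "ordered", "data"):
        raise ValueError(f"unknown journaling mode: {mode}")

    # Data to its final location, metadata to journal and final location.
    total = file_size + 2 * metadata_size
    if mode == "data":
        # Data mode also journals the file contents themselves.
        total += file_size
    return total


MIB = 1024 * 1024
for mode in ("writeback", "ordered", "data"):
    print(mode, bytes_written(1 * MIB, 4096, mode))
```

Under these assumptions a 1 MiB file costs about 1 MiB plus a few KiB in writeback or ordered mode, and about 2 MiB only in data mode, which matches the point above that the default ordered mode does not double the writes.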
The journal itself is allocated on a fixed area of the disk, and therefore doesn't add to the fragmentation. Some filesystem variants will also let it grow and shrink, so some fragmentation may occur.
On a journaling file system, fsck will normally replay the journal automatically and, if the file system is otherwise clean, will skip the full file system check.