Why do files end up being filled entirely with null bytes after replication using DFSR?
We're using Microsoft's Distributed File System for replication. In our scenario we have a single writer that creates/overwrites/deletes files and several distributed readers. The writer is running Windows Server 2008 R2 Enterprise x64 SP 1, the readers are running Windows Server 2003 R2 Standard Edition x86 SP 2. Some of the readers run DFSR with version 5.2.3790.4656 and some others with the patched version 5.2.3790.4799.
Files are written using System.IO.File.WriteAllText
and writes can occur in quick succession on one and the same file.
Most of the time replication works fine but sometimes files end up being filled entirely with null bytes after replication. We checked the DFSR logs on all involved machines and found the following kind of usn journal entry reports on the readers:
20150302 11:05:00.498 2512 USNC 2202 UsnConsumer::UpdateIdRecord ID record updated from USN_RECORD:
+ USN_RECORD:
+ RecordLength: 80
+ MajorVersion: 2
+ MinorVersion: 0
+ FileRefNumber: 0x800000000f7c8
+ ParentFileRefNumber: 0x31000000152806
+ USN: 0x872e876720
+ TimeStamp: 20150302 11:05:00.498 CET
+ Reason: Basic Info Change Close Rename New Name
+ SourceInfo: 0x4
+ SecurityId: 0xebe
+ FileAttributes: 0x2220
+ FileNameLength: 18
+ FileNameOffset: 60
+ FileName: xyz.txt
What got us wondering is that there are usn journal entry reports at all (the readers are just supposed to read but not to modify anything) and the fact that the sparse attribute is set.
To find out whether there's some process performing unexpected writes or doing something suspicious we monitored filesystem activity with Process Monitor. The next occurance of null-byte files on a reader got us the following:
10:59:55,2311121 Dfsr.exe 1584 760 IRP_MJ_CREATE path\to\xyz.txt-{GUID}-vVERSION SUCCESS Desired Access: Generic Read/Write/Execute, Write DAC, Write Owner, Access System Security, Disposition: Create, Options: Sequential Access, Synchronous IO Non-Alert, Complete If Oplocked, Open For Backup, Open No Recall, Attributes: N, ShareMode: Read, Write, Delete, AllocationSize: 0, OpenResult: Created
10:59:55,2312485 Dfsr.exe 1584 760 IRP_MJ_CLEANUP path\to\xyz.txt-{GUID}-vVERSION SUCCESS
10:59:55,2313007 Dfsr.exe 1584 760 IRP_MJ_CLOSE path\to\xyz.txt-{GUID}-vVERSION SUCCESS
10:59:55,2314394 Dfsr.exe 1584 760 IRP_MJ_CREATE path\to\xyz.txt-{GUID}-vVERSION SUCCESS Desired Access: Read Attributes, Write Attributes, Synchronize, Disposition: Open, Options: Sequential Access, Synchronous IO Non-Alert, Open For Backup, Open Reparse Point, Open No Recall, Attributes: N, ShareMode: Read, Write, Delete, AllocationSize: n/a, OpenResult: Opened
10:59:55,2314626 Dfsr.exe 1584 760 IRP_MJ_FILE_SYSTEM_CONTROL path\to\xyz.txt-{GUID}-vVERSION SUCCESS Control: FSCTL_MARK_HANDLE
10:59:55,2314780 Dfsr.exe 1584 760 IRP_MJ_QUERY_INFORMATION path\to\xyz.txt-{GUID}-vVERSION SUCCESS Type: QueryNameInformationFile, Name: path\to\xyz.txt-{GUID}-vVERSION
10:59:55,2314996 Dfsr.exe 1584 760 FASTIO_QUERY_INFORMATION path\to\xyz.txt-{GUID}-vVERSION SUCCESS Type: QueryBasicInformationFile, CreationTime: 10.03.2015 10:59:55, LastAccessTime: 10.03.2015 10:59:55, LastWriteTime: 10.03.2015 10:59:55, ChangeTime: 10.03.2015 10:59:55, FileAttributes: ANCI
10:59:55,2315081 Dfsr.exe 1584 760 IRP_MJ_QUERY_INFORMATION path\to\xyz.txt-{GUID}-vVERSION SUCCESS Type: QueryAttributeTagFile, Attributes: ANCI, ReparseTag: 0x0
10:59:55,2315194 Dfsr.exe 1584 760 IRP_MJ_QUERY_INFORMATION path\to\xyz.txt-{GUID}-vVERSION SUCCESS Type: QueryCompressionInformationFile
10:59:55,2315391 Dfsr.exe 1584 760 IRP_MJ_QUERY_VOLUME_INFORMATION path\to\xyz.txt-{GUID}-vVERSION BUFFER OVERFLOW Type: QueryInformationVolume, VolumeCreationTime: 14.07.2014 14:59:54, VolumeSerialNumber: 88F0-15DC, SupportsObjects: True, VolumeLabel: uvw
10:59:55,2315481 Dfsr.exe 1584 760 IRP_MJ_QUERY_INFORMATION path\to\xyz.txt-{GUID}-vVERSION BUFFER OVERFLOW Type: QueryAllInformationFile, CreationTime: 10.03.2015 10:59:55, LastAccessTime: 10.03.2015 10:59:55, LastWriteTime: 10.03.2015 10:59:55, ChangeTime: 10.03.2015 10:59:55, FileAttributes: ANCI, AllocationSize: 0, EndOfFile: 0, NumberOfLinks: 1, DeletePending: False, Directory: False, IndexNumber: 0xe00000001589e, EaSize: 0, Access: Read Attributes, Write Attributes, Synchronize, Position: 0, Mode: Sequential Access, Synchronous IO Non-Alert, AlignmentRequirement: Long
10:59:55,2316459 Dfsr.exe 1584 760 IRP_MJ_CREATE path\to\xyz.txt-{GUID}-vVERSION SUCCESS Desired Access: Generic Read/Write/Execute, Write DAC, Write Owner, Access System Security, Disposition: Open, Options: Sequential Access, Synchronous IO Non-Alert, Complete If Oplocked, Open For Backup, Open No Recall, Attributes: N, ShareMode: Read, Write, Delete, AllocationSize: n/a, OpenResult: Opened
10:59:55,2316691 Dfsr.exe 1584 760 IRP_MJ_FILE_SYSTEM_CONTROL path\to\xyz.txt-{GUID}-vVERSION SUCCESS Control: FSCTL_MARK_HANDLE
10:59:55,2316796 Dfsr.exe 1584 760 IRP_MJ_CLEANUP path\to\xyz.txt-{GUID}-vVERSION SUCCESS
10:59:55,2316876 Dfsr.exe 1584 760 IRP_MJ_CLOSE path\to\xyz.txt-{GUID}-vVERSION SUCCESS
10:59:55,2317891 Dfsr.exe 1584 760 IRP_MJ_SET_SECURITY path\to\xyz.txt-{GUID}-vVERSION SUCCESS Information: Owner, Group, DACL
10:59:55,2318748 Dfsr.exe 1584 760 IRP_MJ_FILE_SYSTEM_CONTROL path\to\xyz.txt-{GUID}-vVERSION SUCCESS Control: FSCTL_SET_SPARSE
10:59:55,2319307 Dfsr.exe 1584 760 IRP_MJ_WRITE path\to\xyz.txt-{GUID}-vVERSION SUCCESS Offset: 0, Length: 0
10:59:55,2319442 Dfsr.exe 1584 760 IRP_MJ_SET_INFORMATION path\to\xyz.txt-{GUID}-vVERSION SUCCESS Type: SetEndOfFileInformationFile, EndOfFile: 240
10:59:55,2320066 Dfsr.exe 1584 760 IRP_MJ_SET_INFORMATION path\to\xyz.txt-{GUID}-vVERSION SUCCESS Type: SetAllocationInformationFile, AllocationSize: 240
10:59:55,2320382 Dfsr.exe 1584 760 IRP_MJ_WRITE path\to\xyz.txt-{GUID}-vVERSION SUCCESS Offset: 240, Length: 0
10:59:55,2320505 Dfsr.exe 1584 760 IRP_MJ_SET_INFORMATION path\to\xyz.txt-{GUID}-vVERSION SUCCESS Type: SetBasicInformationFile, CreationTime: 16.12.2013 10:57:23, LastAccessTime: 19.02.2015 11:00:25, LastWriteTime: 10.03.2015 10:59:55, ChangeTime: 10.03.2015 10:59:55, FileAttributes: ANCI
10:59:55,2320688 Dfsr.exe 1584 760 IRP_MJ_FILE_SYSTEM_CONTROL path\to\xyz.txt-{GUID}-vVERSION SUCCESS Control: FSCTL_WRITE_USN_CLOSE_RECORD
10:59:55,2321256 Dfsr.exe 1584 760 IRP_MJ_CLEANUP path\to\xyz.txt-{GUID}-vVERSION SUCCESS
10:59:55,2321506 Dfsr.exe 1584 760 IRP_MJ_CLOSE path\to\xyz.txt-{GUID}-vVERSION SUCCESS
Going by this observation it looks like DFSR is the one producing them. Why is it doing that? Also, why does it mark the file as sparse? The writer does not mess with the files beyond writing them using the technique outlined above. Note that the offset where the EOF marker is moved to matches the size of the file as created by the writer.
So far we're unable to reproduce the issue reliably. Does anyone have an idea what's causing this and, more importantly, how to fix it?
Solution 1:
Is your DFS-R staging area on the same volume as the DFS-R folder? For performance reasons, it should be. If not, then DFS-R is having to copy the file from staging volume to the destination volume rather than doing a straight move.
Here's where the speculation kicks in. It may be that during this copy operation, DFS-R is creating a sparse file, and then filling the blocks, and "unsparsing" it when complete. If something interrupts this process (like antivirus, Undelete, or some other file filter driver program scanning the DfsrPrivate folder), then you may end up with a temporary sparse file that doesn't get filled with its contents.
You can test for this behavior by using Process Monitor on the files that are replicating properly and seeing if they are marked/unmarked as sparse at any point in the process.
I'm not a fan of mixing 2008 and 2003 when it comes to DFS-R. I was soooo glad to get the last 2003 machine off our DFS tree.