I have a DFSR replication group with two Windows Server 2008 R2 servers as partners in master<->master replication (each side can update the other, so both should always be in sync). The group replicates two folders, both of which IIS uses to serve files. A web app dynamically generates output files, and the servers are load balanced, so the output needs to appear in both folders in close to real time.

Replication appears not to have worked since the initial sync, with no indication of why. The event logs show no activity since the day the group was set up, about a week ago.

Output of dfsrdiag replicationstate /member:server2 /v:

C:\Users\fnc>dfsrdiag replicationstate /member:server2 /v
[INFO] Computer Name: server2
[INFO] Computer DNS: server2.domain
[INFO] Domain Name: domain
[INFO] Domain DNS: domain.domain
[INFO] Site Name: Default-First-Site-Name
[INFO] Connected to WMI services on computer: server.domain
[INFO] Issuing query: SELECT * FROM DfsrConnectionInfo
[INFO] Issuing query: SELECT * FROM DfsrIdUpdateInfo
[ERROR] Failed to execute WMI query

[INFO] Execution Time: 0 seconds
Operation Failed

Server 1 returns the same error.

Output of dfsrdiag backlog /RGName:NameOfFolder /RFName:"Outputs" /SMem:server1 /RMem:server2:

Member <server2> Backlog File Count: 32558
Backlog File Names (first 100 files)

Operation Succeeded

I don't understand what's going on. I have another pair of servers with DFSR replication serving our corporate file shares just fine, so this is puzzling.


Solution 1:

The cause turned out to be that I had cloned the VMs, which left both servers with the same volume ID. After rebuilding DFSR on both partners and changing the volume ID on one of the servers, replication is working again.
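For anyone checking whether cloned servers share a volume ID, here is a minimal sketch of one way to compare them. It assumes that "volume ID" here means the volume serial number printed by the built-in `vol` command, and that you have saved the output of `vol C:` from each server. The function names and sample strings are made up for illustration; this is not part of any DFSR tooling.

```python
import re

# Matches the serial line printed by the Windows `vol` command,
# e.g. " Volume Serial Number is 1A2B-3C4D".
SERIAL_RE = re.compile(r"Volume Serial Number is ([0-9A-Fa-f]{4}-[0-9A-Fa-f]{4})")


def volume_serial(vol_output):
    """Return the serial found in `vol` output (upper-cased), or None if absent."""
    match = SERIAL_RE.search(vol_output)
    return match.group(1).upper() if match else None


def find_duplicate_serials(outputs_by_server):
    """Return {serial: [servers]} for every serial reported by more than one server."""
    servers_by_serial = {}
    for server, output in outputs_by_server.items():
        serial = volume_serial(output)
        if serial is not None:
            servers_by_serial.setdefault(serial, []).append(server)
    return {
        serial: sorted(names)
        for serial, names in servers_by_serial.items()
        if len(names) > 1
    }


# Hypothetical captured output from `vol C:` on each server (not real data).
sample = {
    "server1": " Volume in drive C has no label.\n Volume Serial Number is 1A2B-3C4D\n",
    "server2": " Volume in drive C has no label.\n Volume Serial Number is 1A2B-3C4D\n",
}
duplicates = find_duplicate_serials(sample)
```

If `duplicates` is non-empty, the listed servers report the same serial, which is what you would expect on VMs cloned from the same disk image.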

I also had an issue with a stale namespace, which required deleting a registry key to fix.