Why do VM snapshots affect performance?

I read in one of the VMware KB articles that snapshots will directly affect VM performance.

But my team keeps asking me how snapshots can affect performance.

I would like to give them a solid reason behind the statement that snapshots are performance killers.

Can anyone explain a little of the theory behind how snapshots actually affect performance? Is it just because the disk I/O rate of the hard disk would be slower?


When you create a snapshot, the original disk image is "frozen" in a consistent state, and all write accesses from then on go to a new differential image. Even worse, as explained here and here, the differential image has the form of a change log that records every change made to a file since the snapshot was taken. This means that a read access has to consult not just one file but also all the difference data (the original data plus every change made to it). The number of files to consult grows further when you cascade snapshots.


When you create a snapshot of a VM, a delta disk is created and the operating system writes to this file instead of the original VMDK. This file is called VM_Name-Delta.VMDK, but if the system needs to refer to data written before the snapshot, it refers to VM_Name.VMDK, which increases the I/O of the operation. If you take multiple snapshots, you are working against the delta file of the most recent snapshot rather than the original VMDK, which increases I/O further.

Example.

OS ---> Snapshot (File A Created) ---> (Snapshot File B Created)

If I need to refer to File A, the system may have to look through 3 VMDKs to find it.
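The lookup described above can be sketched with a toy model. This is only an illustration of the idea, not VMware's actual on-disk format: the `SnapshotChain` class, its methods, and the block contents are all hypothetical names made up for the example.

```python
class SnapshotChain:
    """Toy model: a base disk plus one delta layer per snapshot (hypothetical)."""

    def __init__(self, base_blocks):
        # layers[0] stands in for the original VMDK; the last layer receives writes.
        self.layers = [dict(base_blocks)]

    def take_snapshot(self):
        # Freeze the current top layer and start a new, empty delta.
        self.layers.append({})

    def write(self, block, data):
        # All writes land in the newest delta only.
        self.layers[-1][block] = data

    def read(self, block):
        # Search from the newest delta back towards the base image.
        # Returns (data, number_of_layers_checked).
        for checked, layer in enumerate(reversed(self.layers), start=1):
            if block in layer:
                return layer[block], checked
        raise KeyError(block)


disk = SnapshotChain({0: "boot", 1: "data-v1"})
disk.take_snapshot()        # first delta
disk.write(1, "data-v2")
disk.take_snapshot()        # second delta

print(disk.read(1))  # found in the first delta, after checking 2 layers
print(disk.read(0))  # never rewritten, so all 3 layers are checked
```

In this model, a block that has not changed since before the first snapshot costs one lookup per layer, which is the "3 VMDKs" case from the example.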

Also, if you include the memory state of the VM when taking the snapshot, this creates another file; this again is a delta file, and it refers to the original memory files if needed.

A file is also created that lists all the files created during the snapshot process.


As far as I can tell, VMware uses copy-on-write logic to implement its snapshots. Therefore, once you create one, every write operation on your VM (i.e. almost everything at runtime) causes a little bit of the VM to be copied, until the whole thing is essentially cloned.

Another performance issue with this is that reads would have to cascade to the original copy if the working copy doesn't yet have data (because nothing changed to cause a copy).
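A rough way to picture both effects is the sketch below. It follows the simplified description in this answer rather than any documented VMware behaviour; the `simulate` function and the block numbers are hypothetical.

```python
def simulate(total_blocks, writes):
    """Toy model of a working copy that grows as blocks are written.

    Returns (blocks held by the working copy,
             blocks whose reads still cascade to the original).
    """
    working_copy = set()
    for block in writes:
        # The first write to a block adds it to the working copy;
        # later writes to the same block do not make it any larger.
        working_copy.add(block)
    cascading = total_blocks - len(working_copy)
    return len(working_copy), cascading


# A 4-block disk: three distinct blocks written, one still untouched.
print(simulate(4, [0, 1, 1, 2, 0]))

# Every block written at least once: the working copy is effectively a clone.
print(simulate(4, [0, 1, 2, 3, 3]))
```

Under these assumptions the working copy can never exceed the size of the original disk, and the number of reads that cascade shrinks as more blocks are written.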

If you want to have the snapshots as a backup but can't tolerate a small performance decrease, consider cloning the VM instead.