Delete Your Old VMware Snapshots

For the love of Pete, please delete your old snapshots regularly!

Old snapshots have caused incidents and even outages more than once in my career, yet it is really easy to look for them and get them removed before anything happens.

Why

To put it plainly, they can cause issues of the 03:00 pager alert kind, and they eat up storage space like crazy. Typical symptoms:

  • Degraded performance of the VM having the snapshot
  • Anything from degraded performance to full outages for other VMs on the same datastore, caused by rapidly growing snapshots

VMware recommends a series of steps to reduce risk when using snapshots:

  • Do not use snapshots as backups.

    The snapshot file is only a change log of the original virtual disk, it creates a place holder disk, virtual_machine-00000x-delta.vmdk, to store data changes since the time the snapshot was created. If the base disks are deleted, the snapshot files are not sufficient to restore a virtual machine.

  • Maximum of 32 snapshots are supported in a chain. However, for a better performance use only 2 to 3 snapshots.

  • Do not use a single snapshot for more than 72 hours.

    The snapshot file continues to grow in size when it is retained for a longer period. This can cause the snapshot storage location to run out of space and impact the system performance.

  • When using a third-party backup software, ensure that snapshots are deleted after a successful backup.

    Note: Snapshots taken by third party software (through API) may not appear in the Snapshot Manager. Routinely check for snapshots through the command-line.

Find Old Snapshots

Given the recommendation to have no snapshots older than 72 hours (3 days), we can write a simple one-liner to check for those:

PS> Get-VM | Get-Snapshot | Where {$_.Created -lt (Get-Date).AddDays(-3)} | Select-Object VM,Name,Created
VM      Name          Created
--      ----          -------
win2016 Testing Stuff Monday, 28 December 2020 03:30:17
win2019 Patchday      Tuesday, 29 December 2020 06:30:38
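
Since snapshot growth is what usually hurts, it can help to see age and size next to each hit. The following is only a sketch: Select-OldSnapshot is a name I made up for illustration, and it assumes the objects coming down the pipeline carry the Created and SizeGB properties that Get-Snapshot returns.

```powershell
# Hypothetical helper (not part of PowerCLI): keeps snapshots older than
# -MaxAgeDays and adds age and size columns for the report.
function Select-OldSnapshot {
    param(
        [Parameter(ValueFromPipeline)]
        $Snapshot,

        [int]$MaxAgeDays = 3
    )
    begin {
        # Compute the cutoff once instead of once per snapshot.
        $cutoff = (Get-Date).AddDays(-$MaxAgeDays)
    }
    process {
        if ($Snapshot.Created -lt $cutoff) {
            [pscustomobject]@{
                VM      = $Snapshot.VM
                Name    = $Snapshot.Name
                Created = $Snapshot.Created
                # Whole days since the snapshot was taken.
                AgeDays = [math]::Floor(((Get-Date) - $Snapshot.Created).TotalDays)
                SizeGB  = [math]::Round($Snapshot.SizeGB, 1)
            }
        }
    }
}
```

With a vCenter connection in place, something like `Get-VM | Get-Snapshot | Select-OldSnapshot | Sort-Object SizeGB -Descending` would list the biggest offenders first.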

Delete Old Snapshots

Warning

Please handle the next command with care, as it will irreversibly delete snapshots inside any connected VMware environment!

We simply take the command from above and pipe it to Remove-Snapshot, set to not ask for any confirmation.

Get-VM | Get-Snapshot | Where {$_.Created -lt (Get-Date).AddDays(-3)} | Remove-Snapshot -Confirm:$false
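
If you would rather see what the command is about to do before it does it, Remove-Snapshot also accepts -WhatIf. Below is a minimal sketch of wrapping the one-liner so the preview is a switch. Invoke-SnapshotPurge and its -Preview parameter are names I invented for this example, and the function assumes you are already connected to vCenter.

```powershell
# Hypothetical wrapper around the one-liner above.
# -Preview is passed through as -WhatIf, so nothing is deleted
# and Remove-Snapshot only prints what it would have done.
function Invoke-SnapshotPurge {
    param(
        [int]$MaxAgeDays = 3,
        [switch]$Preview
    )
    $cutoff = (Get-Date).AddDays(-$MaxAgeDays)

    Get-VM | Get-Snapshot |
        Where-Object { $_.Created -lt $cutoff } |
        Remove-Snapshot -Confirm:$false -WhatIf:$Preview
}
```

Running `Invoke-SnapshotPurge -Preview` first and the plain `Invoke-SnapshotPurge` only after checking the output keeps the irreversible step deliberate.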

Automation

I want to spend a few words on how to introduce such an automation into your organization, since the technical implementation is not really the challenging part.

The full functionality of the script is around one line, as seen above, so the rest is plumbing. You can throw together a config file holding a user and your vCenter servers using Export-Clixml/Import-Clixml, then schedule the script on one of your Windows servers with Task Scheduler, or with cron and PowerShell Core on Linux.
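
As a rough sketch of that config file: the two function names, the property names and the file layout below are my own assumptions, not a fixed format. One thing to keep in mind is that Export-Clixml only encrypts the credential on Windows, and there only the same user on the same machine can read it back, so create the file as the account that will run the scheduled task. On Linux the password is not encrypted, so protect the file with permissions.

```powershell
# Hypothetical helpers: store vCenter names plus a credential in one XML file.
function Save-PurgeConfig {
    param(
        [Parameter(Mandatory)][string]$Path,
        [Parameter(Mandatory)][string[]]$Server,
        [Parameter(Mandatory)][pscredential]$Credential
    )
    [pscustomobject]@{
        Server     = $Server
        Credential = $Credential
    } | Export-Clixml -Path $Path
}

# Reads the file back; the result has .Server and .Credential again.
function Read-PurgeConfig {
    param([Parameter(Mandatory)][string]$Path)
    Import-Clixml -Path $Path
}
```

You would run Save-PurgeConfig once interactively, for example with `-Credential (Get-Credential)`. The scheduled script can then call Read-PurgeConfig and hand each server name and the credential to Connect-VIServer before running the one-liner.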

If your organization has never had such an automation, simply enabling it from the get-go could lead to a lot of problems.

I usually try to avoid this by creating the script and running it in “read-only” mode, meaning I report on what snapshots would have been deleted but don’t actually delete them. Most times someone will come to you within the grace period with a really good reason to exclude one or more VMs from this snapshot purge. Be prepared to create this kind of blacklist if there are actually good reasons for long-lived snapshots.
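
The blacklist itself can be a small filter in the pipeline. This is a sketch under assumptions: Select-PurgeCandidate is a made-up name, and it relies on each snapshot exposing its VM as an object with a Name property, as the objects from Get-Snapshot do.

```powershell
# Hypothetical filter: drops snapshots whose VM is on the exclusion list.
# The name comparison is case-insensitive (default behaviour of -notin).
function Select-PurgeCandidate {
    param(
        [Parameter(ValueFromPipeline)]
        $Snapshot,

        [string[]]$ExcludeVM = @()
    )
    process {
        if ($Snapshot.VM.Name -notin $ExcludeVM) {
            $Snapshot
        }
    }
}
```

In the read-only phase, the daily report could then be `Get-VM | Get-Snapshot | Where {$_.Created -lt (Get-Date).AddDays(-3)} | Select-PurgeCandidate -ExcludeVM (Get-Content .\snapshot-exclusions.txt) | Select-Object VM,Name,Created`, where snapshot-exclusions.txt is a hypothetical text file with one VM name per line.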

After a few weeks of sending that daily report to everyone concerned, I enable the script to actually delete snapshots.