Are Azure Managed Disks enough to ensure high durability for a database?
I want to set up a database in a high-durability setup on Azure. I've previously relied on DB-as-a-service offerings, but can't do that in this case, so I'd like your feedback on the plan below. Is this enough to ensure reliable storage of data?
- An Azure Web App takes in metric data from the web, does some minor processing and sampling, and sends the data in batches to VM2.
- VM2 runs the ClickHouse database and stores its data on an Azure Managed Disk.
- A periodic job takes snapshots of the disk and stores them in cold storage.
My understanding is that managed disks are enough to ensure reliable storage of data, and that they should remove any concern about data loss due to hardware failure. Is this correct?
Another concern is data loss due to human error, e.g. accidentally running "DROP TABLE xx" on the wrong data. I think storing periodic backups addresses this concern (i.e. it allows recovery to the last backup). Do you agree?
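To make "recovery to the last backup" concrete, here is a minimal sketch in Python. The function name `pick_restore_point` and the timestamps are hypothetical and not tied to any Azure or ClickHouse API. It only illustrates that anything written between the chosen backup and the mistake is lost.

```python
from datetime import datetime, timedelta
from typing import Optional, Sequence, Tuple


def pick_restore_point(
    backup_times: Sequence[datetime], incident_time: datetime
) -> Optional[Tuple[datetime, timedelta]]:
    """Return the newest backup taken before the incident, plus the data-loss window.

    Returns None when no backup predates the incident.
    """
    # Only backups taken strictly before the mistake are safe to restore from.
    candidates = [t for t in backup_times if t < incident_time]
    if not candidates:
        return None
    restore_point = max(candidates)
    # Everything written in this window is gone after the restore.
    return restore_point, incident_time - restore_point
```

With a hypothetical backup every 6 hours and a bad DROP TABLE at 14:30, the function picks the 12:00 backup and reports a loss window of 2.5 hours, so the backup interval bounds how much data a human error can cost.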
The recovery plan is that if VM2 fails, some monitoring process catches this and spins up a new VM2 instance attached to the same managed disk. The Web App similarly restarts if it fails.
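The monitoring step could look roughly like the sketch below. It is an illustration only: `is_healthy` and `recreate_vm` are hypothetical callables that would wrap whatever health probe and Azure provisioning calls are actually used, and the thresholds are arbitrary. `recreate_vm` is assumed to release the disk from the failed VM before attaching it to the replacement.

```python
import time
from typing import Callable


def watch_and_recover(
    is_healthy: Callable[[], bool],
    recreate_vm: Callable[[], None],
    failures_before_recovery: int = 3,
    poll_seconds: float = 10.0,
    max_polls: int = 0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll a health check and recreate the VM after N consecutive failures.

    Returns the number of recoveries triggered. max_polls=0 means run forever.
    """
    consecutive_failures = 0
    recoveries = 0
    polls = 0
    while max_polls == 0 or polls < max_polls:
        polls += 1
        if is_healthy():
            consecutive_failures = 0
        else:
            consecutive_failures += 1
            # Require several failures in a row so one dropped probe
            # does not trigger a needless rebuild.
            if consecutive_failures >= failures_before_recovery:
                recreate_vm()
                recoveries += 1
                consecutive_failures = 0
        sleep(poll_seconds)
    return recoveries
```

Requiring consecutive failures is a design choice to avoid reacting to a single transient timeout; the cost is a slightly longer window before recovery starts.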
I understand that this setup isn't highly available: if a VM fails, there will be some window of time before it is able to store new data. This is acceptable to me. But I want to ensure that data that gets stored will not be lost, i.e. that it is durably stored with very high probability. Is this enough to ensure that? Do you see any problems?
Solution 1:
I'm asking if an Azure Managed Disk works as a substitute for replicating the database, for the purpose of ensuring high durability.
I don't think so. A managed disk offers high availability by eliminating the disk as a single point of failure. But managed disks don't know anything about the resources that are writing to the disk.
Database administrators don't rely on filesystem backups to protect databases from human error; filesystem backups alone can't account for concurrent database use, such as writes that are in flight when the backup is taken. DBAs do rely on backups, just not filesystem backups.
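As a rough illustration of a database-level backup, recent ClickHouse versions have a native `BACKUP` statement. The Python sketch below only builds such a statement; the helper name `build_backup_statement`, the database, table and disk names are all hypothetical, and it assumes a backup disk has been configured and allowed on the server.

```python
import re
from datetime import datetime

# Conservative pattern for names that are safe to interpolate into SQL.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_backup_statement(
    database: str, table: str, backup_disk: str, taken_at: datetime
) -> str:
    """Build a ClickHouse BACKUP TABLE statement targeting a configured disk."""
    for name in (database, table, backup_disk):
        if not _IDENT.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    # Timestamped archive name so successive backups do not overwrite each other.
    archive = f"{database}.{table}.{taken_at:%Y%m%dT%H%M%SZ}.zip"
    return f"BACKUP TABLE {database}.{table} TO Disk('{backup_disk}', '{archive}')"
```

The resulting string would be sent to the server by whatever client is in use. Because the database itself produces the backup, it can coordinate with concurrent queries in a way a disk snapshot cannot.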
By the same token, managed disks expose a highly available filesystem. As a DBA, I don't think I'd rely on managed disks to replace replication for the same reason I wouldn't use a filesystem backup to replace a proper database backup.