How do megasites like YouTube perform backups?
According to https://www.quora.com/Where-does-YouTube-store-so-many-videos, in 2014 they stored 76 PB every year, a number that has almost certainly increased a lot since then. Is it even possible to back this up, or do they solve the problems backups are usually used for in some other way (such as so much redundancy that it simply can't fail)?
With an enormous number of tapes and disks, and talented site reliability engineers who continuously improve the recovery processes.
Read the chapter "Data Integrity: What You Read Is What You Wrote" in Google's SRE book. Its wisdom on recovery applies in many environments: replication and redundancy are not recoverability, and you should deliver a recovery system rather than a backup system.
The chapter also discusses the challenges of doing so at scale. It has no YouTube recovery case study, so it is unclear whether YouTube has full backups on external media. But there is a Google Music case study, which chronicles how restoring half a million tracks required retrieving 5,000 tapes, plus a second stage to fix gaps and failures.
To me, the most impressive part was not the vast number of tapes but the custom development involved: identifying and finding half a million files, recovering failed tapes from parity, and telling users' client software to upload missing tracks.
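To give a feel for what "recovering from parity" means, here is a minimal sketch assuming a simple single-parity XOR scheme. It is not Google's actual tape format or tooling, which the case study does not spell out here; the function names and sample data are hypothetical, and one parity block can only cover one lost block per group.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(data_blocks):
    """Compute one parity block for a group of data blocks (hypothetical helper)."""
    return xor_blocks(data_blocks)


def recover_missing(surviving_blocks, parity):
    """Rebuild the single missing block from the survivors plus the parity block."""
    return xor_blocks(list(surviving_blocks) + [parity])


# Three equal-length "tapes" of illustrative data and their parity block.
tapes = [b"track-A!", b"track-B!", b"track-C!"]
parity = make_parity(tapes)

# Pretend the second tape is unreadable and rebuild it from the rest.
recovered = recover_missing([tapes[0], tapes[2]], parity)
```

The point of the sketch is that the lost block is never read: it is recomputed from what survived, which is why a failed tape does not have to mean lost data.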
(such as so much redundancy that it simply can't fail)
This is dangerously wrong. Anything can fail or be accidentally deleted. And enormous parallel systems are not simple. A conclusion from that SRE chapter:
Recognizing that not just anything can go wrong, but that everything will go wrong is a significant step toward preparation for any real emergency.