
How to Test If Your Backups Actually Work

A completed backup job and a working restore are two different events. Most businesses have only ever seen the first. Here is how to find out which one you actually have.

Marcus Grigg · 8 min read

The short version

  • A 'backup successful' status only proves the job finished writing data. Whether that data is complete and recoverable is a separate question the dashboard never answers.
  • The only honest test is a restore: recover into an isolated location and confirm the data opens and is complete. If you cannot name the date you last did that, your backups are unproven.

How to test if your backups are working

The fastest way to find out whether your backups work is to stop trusting the word 'successful' and run a restore. Pick something that would hurt to lose, recover it into a safe, isolated location, and open it. If it comes back complete and usable, your backup works. If you have never done this, you do not have a working backup. You have an untested one, and that is a different thing.

The reason this matters is that two separate events have quietly merged in most people's heads. The backup job running is one. The data being recoverable is another. A backup is a write operation; a restore is a read-and-rebuild. Your dashboard reports on the write. The restore is the part you will actually need one day, and it is the part nobody has checked.

The rest of this guide covers what a success message really tells you, the handful of ways a healthy-looking backup turns out to be worthless, a step-by-step test you can run this week on any system, how often to test, and what it looks like when the testing is automated instead of done by hand.

What 'backup successful' actually tells you

When the console says 'Success', it is reporting that the job finished and wrote some bytes to a destination, and that is all it is reporting. It is not promising that those bytes will reassemble into a file that opens or a database that mounts. The job completing and the data being recoverable are two separate claims, and only the second one ever saves you.

Look at what the backup software can and cannot see. It can confirm it copied data from A to B without the transfer erroring out. What it usually does not do is read every block back and rebuild it to confirm the result is whole, because that takes about as long as the backup itself, and almost nobody sizes their hardware for it. So a green tick means the data was sent. It says nothing about whether the data arrived in one piece and will come back when you call for it. The gap between backup success and backup verification is the whole subject of this guide.
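To make the difference between "sent" and "arrived in one piece" concrete, here is a minimal sketch of a read-back check. It assumes you recorded a checksum manifest when the backup was taken and can point the script at a folder you restored into an isolated location. The function names (`sha256_of`, `build_manifest`, `compare_manifests`) are illustrative and do not come from any particular backup product.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file by reading every byte back from disk."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> Dict[str, str]:
    """Map each file path (relative to root) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_manifests(
    expected: Dict[str, str], restored: Dict[str, str]
) -> Dict[str, List[str]]:
    """List files that are missing, or whose content differs, after a restore."""
    missing = sorted(set(expected) - set(restored))
    changed = sorted(
        name
        for name in expected.keys() & restored.keys()
        if expected[name] != restored[name]
    )
    return {"missing": missing, "changed": changed}
```

An empty `missing` list and an empty `changed` list mean every file came back byte-for-byte. That is still weaker than opening the files or mounting the database, but it is already far more than a green tick tells you.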

Why a backup can succeed and still fail to restore

Backups rarely fail loudly. They fail quietly, in ways the dashboard cannot see, and they keep reporting good health right up until the morning you need them. Four catch people out more than the rest, and at least one is probably true of some setup you are responsible for.

  • A job that silently stopped, or never picked up the new server you stood up last quarter, and kept reporting its last-known status. The dashboard stayed green because it was still succeeding at backing up the thing it knew about. Nobody noticed the thing it did not.
  • Microsoft 365 data nobody was ever backing up. Under the shared-responsibility model Microsoft keeps the platform running, but recovering your content is your problem, and recycle bins and retention only give you short-term operational recovery. Plenty of teams assume Microsoft has this covered. Mostly it does not, and the detail of whether Microsoft 365 backs up your data is worth reading up on.
  • Ransomware that reached the backups too. Many modern strains hunt for backup repositories and delete or encrypt them before they trigger the visible attack, because an untouched backup defeats the ransom. The day you go to recover is the day you learn the backup was encrypted along with everything else.
  • A restore that technically runs but comes back missing or corrupted in exactly the place that mattered. The files can be present and the count can look about right, and then the one finance folder will not open or the database will not mount. That kind of corruption only shows up on read.
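The first failure on that list, a system that never made it into scope, is the easiest one to check mechanically. The sketch below assumes you can export two plain lists of names: everything you run (your inventory) and everything the backup product says it protects. Both lists and the function name `scope_gaps` are hypothetical; how you obtain them depends entirely on your tooling.

```python
from typing import Dict, List, Set


def scope_gaps(inventory: Set[str], backup_scope: Set[str]) -> Dict[str, List[str]]:
    """Compare what exists against what is actually being backed up."""
    return {
        # Systems you run that no backup job covers: the silent gap.
        "unprotected": sorted(inventory - backup_scope),
        # Jobs still pointing at systems that no longer exist.
        "orphaned": sorted(backup_scope - inventory),
    }


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    inventory = {"fileserver-01", "sql-01", "app-02"}
    backup_scope = {"fileserver-01", "sql-01", "old-nas"}
    print(scope_gaps(inventory, backup_scope))
```

Anything under `unprotected` is a system the dashboard will never turn red for, because no job exists to fail.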

Schrödinger's backup: unknown until you open the box

There is an old line in the trade that captures this better than any audit checklist: the condition of any backup is unknown until you attempt a restore. People call it Schrödinger's backup. Until you open the box and pull the data out, it sits in both states at once, and any confidence you have in it is really just a guess. A green tick does not settle the question. Only a restore does. So here is the question worth asking: when did you last restore a backup and watch it come back clean? If you cannot answer with a date, the box is still closed.

Until someone restores it, a backup is just a hope with a file size.

How to test a backup restore, step by step

You do not need a consultant or a project to find out where you stand. You need an afternoon and a willingness to be a little uncomfortable with the answer. This works whoever your provider is, and it is deliberately tedious, which is rather the point: the tedium is why it does not get done, and why automating it changes things. Run it across both halves of your world, because most businesses have both: Microsoft 365 (Exchange, OneDrive, SharePoint, Teams) and infrastructure (servers, VMs, NAS).

  1. Pick a real recovery scenario rather than one convenient text file you already know is fine. Restore a whole server or VM, or a complete mailbox. Test the thing you would actually be panicking about.
  2. Restore to an isolated location. Never restore over production. The test should not be able to damage the thing it is testing.
  3. Open the recovered data and confirm it is complete and usable. The files should open and the database should mount. A file showing up in a folder list does not mean it opens, so open it.
  4. Time it. How long did recovery actually take, against how long you can afford to be down (your RTO)? How much data sat between the last backup and the failure (your RPO)? Measure both rather than guessing.
  5. Check scope. Is everything you added recently, the new server, the new starter's mailbox, the migrated share, actually in the backup? This is where the silent gaps live.
  6. Write down the date. Then ask your provider one flat question: what is the date of the last successful test restore, and can I see the report? A specific date with a report beats a vague answer.

A long pause, a "we would have to check," or a report nobody has ever shown you tells you as much as a clear answer would. For the infrastructure side specifically, backup and restore testing for servers, VMs and NAS is its own discipline, worth confirming separately from your Microsoft 365 cover.
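The timing step is simple arithmetic, and writing it down removes the temptation to round in your own favour. The sketch below assumes you noted four timestamps during the test: when the restore started and finished, when the last backup was taken, and the moment you are pretending the failure happened. The class and field names are illustrative, not taken from any tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict


@dataclass
class RestoreTest:
    system: str
    restore_started: datetime
    restore_finished: datetime
    last_backup_at: datetime
    simulated_failure_at: datetime

    @property
    def recovery_time(self) -> timedelta:
        """How long the restore actually took (compare against your RTO)."""
        return self.restore_finished - self.restore_started

    @property
    def data_loss_window(self) -> timedelta:
        """Work done after the last backup that would be lost (compare against your RPO)."""
        return self.simulated_failure_at - self.last_backup_at


def meets_targets(test: RestoreTest, rto: timedelta, rpo: timedelta) -> Dict[str, bool]:
    """Check the measured figures against the targets the business agreed to."""
    return {
        "rto_met": test.recovery_time <= rto,
        "rpo_met": test.data_loss_window <= rpo,
    }
```

For example, a restore that ran from 09:00 to 12:30 meets a four-hour RTO, but if the last backup was at 23:00 the night before and the failure is assumed at 08:00, the nine-hour loss window misses a four-hour RPO.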

How often should you test backup restores?

As often as your data changes, which for most businesses means continuously rather than once a year. Plenty of teams test annually for the audit and feel covered, but your data changed this morning and will change again this afternoon. A test from eight months ago proves something about an environment you no longer run, and it says nothing about the servers, mailboxes and shares added since. The longer the gap between tests, the more of your current environment has never been verified at all.

There is a mental model worth carrying here, because the standard moved and a lot of people did not get the memo. The classic 3-2-1 rule (three copies, two media types, one off-site) grew up. The current version is 3-2-1-1-0. The extra 1 is one immutable copy that ransomware cannot alter or encrypt. The 0 is zero recovery errors after verification, which in plain English means you tested the restore and it came back clean. The 0 is the part almost everyone skips, and it is the part that decides whether the other numbers were ever worth anything.
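If it helps to see the rule as a checklist, here is one way to sketch it. The `BackupCopy` fields and the way copies are counted are assumptions made for illustration (this version counts every copy you list, including the production one if you include it). The important design choice is the last line: a restore that has never been tested is passed in as `None`, and `None` does not count as zero errors.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class BackupCopy:
    media: str       # e.g. "disk", "tape", "object-storage" (labels are up to you)
    offsite: bool    # held away from the primary site
    immutable: bool  # cannot be altered or encrypted once written


def check_3_2_1_1_0(
    copies: List[BackupCopy], recovery_errors: Optional[int]
) -> Dict[str, bool]:
    """Evaluate each digit of 3-2-1-1-0 separately so gaps are visible."""
    return {
        "3_copies": len(copies) >= 3,
        "2_media_types": len({c.media for c in copies}) >= 2,
        "1_offsite": any(c.offsite for c in copies),
        "1_immutable": any(c.immutable for c in copies),
        # None means no restore was ever verified, which fails the check.
        "0_recovery_errors": recovery_errors == 0,
    }
```

A setup can pass the first four checks for years and still fail the fifth, which is exactly the situation this guide is about.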

What automated daily testing looks like

Run that audit once by hand and the conclusion is obvious: it is necessary, and doing it manually is unsustainable. Done properly, automated daily testing means every backup gets a test restore, every day, inside an isolated, throwaway recovery environment, with a dated report at the end. That is the bar, and it sounds ambitious only because so few clear it.

The output is a dated line for each system: rebuilt and opened clean on a named date. That is how the proof gets generated rather than just claimed.

The point of daily testing is simple. It keeps the gap between the last proven-good restore and right now under 24 hours, so a silent failure shows up in tomorrow morning's report instead of in the middle of a Friday-night recovery with everyone watching the progress bar.
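As a rough sketch of what sits behind such a report, the check below takes the date each system last restored clean and flags anything older than 24 hours, or never proven at all. The input dictionary and both function names are hypothetical. A real product would populate the dates from its own test-restore results.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional


def unproven_systems(
    last_proven: Dict[str, Optional[datetime]],
    now: datetime,
    max_gap: timedelta = timedelta(hours=24),
) -> List[str]:
    """Return systems whose last clean restore is missing or older than max_gap."""
    return sorted(
        name
        for name, proven_at in last_proven.items()
        if proven_at is None or now - proven_at > max_gap
    )


def report_lines(last_proven: Dict[str, Optional[datetime]]) -> List[str]:
    """Produce one dated line per system, the shape of report worth asking for."""
    lines = []
    for name in sorted(last_proven):
        proven_at = last_proven[name]
        status = f"restored clean on {proven_at:%Y-%m-%d}" if proven_at else "never proven"
        lines.append(f"{name}: {status}")
    return lines
```

A system added last week with no entry shows up as "never proven" on the first run, which is the scope gap surfacing in a report instead of during a recovery.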

  • Daily: how often the restore itself should be tested, beyond the backup job
  • 24h: the longest data goes unproven when every backup is test-restored daily
  • 0: recovery errors after verification, the '0' in 3-2-1-1-0

The one number worth asking for

There is a single number worth more than all the others, and you can ask for it on Monday morning: the date of your last proven restore. That means the last time data actually came back, complete and usable, from a real recovery, rather than the last time a job finished. If the honest answer is "never" or "not sure," you have already learned the most important thing. The fix is rarely more backups or more storage, since you almost certainly have enough of both. What is missing is proof, and proof comes from one place: someone runs a restore.

It is worth being clear on one distinction, because it is where the testing actually pays off: a backup and disaster recovery are not the same thing. A backup is a copy of your data. Disaster recovery is the tested, timed plan for getting a whole business back on its feet, which is why your last proven restore date, your RTO and your RPO matter more than the raw size of your backup. Testing is the seam where a backup turns into recovery you can count on.

Common questions

How do I test if my backups are working?

Run a restore rather than just checking the job log. Recover a real file and a whole machine or mailbox into an isolated location, then confirm the data opens and is complete. If it comes back usable, the backup works. If you have never restored from it, all you know is that the job keeps finishing, and whether the data is recoverable is still untested. The reliable signal is a date: when did you last restore and watch it come back clean?

What is backup restore testing?

Backup restore testing is actually recovering data from your backups and confirming it comes back complete, uncorrupted and usable, rather than just checking that the backup job finished. A backup completing tells you a copy was written. A restore test tells you that copy will actually save you. They are different events, and only the restore test proves recovery.

Why does my backup say 'successful' but the restore fails?

Because 'success' only reports on the backup job. It says nothing about whether the result can be recovered. Common causes are silent data corruption the job never checks for, a job that quietly stopped covering part of your environment, a new server or mailbox that was never added to scope, media or encryption errors that only surface on read, and retention rules that aged out the version you needed. None of these trip the success indicator, which is exactly why they go unnoticed until a restore.

How do I test a backup restore without risking production?

Always restore to an isolated location rather than over the live system, so the test cannot damage the thing it is testing. Pick a meaningful target (a full server, VM or mailbox), restore it into a separate, sandboxed environment, open the recovered data to confirm it is complete and usable, and time how long recovery took. Then record the date so you have a verifiable answer next time someone asks when the backup was last proven.

How often should you test backup restores?

As often as your data changes, which for most businesses means continuously rather than once a year. Annual or quarterly testing only proves the backup worked on that one day and tells you nothing about the systems added since. Automated daily test restores keep the gap between the last proven restore and now to a single day, so a silent failure surfaces in a report within 24 hours instead of in the middle of a real disaster.

Does Microsoft 365 back up my data automatically?

Not in the way most people assume. Under the shared-responsibility model, Microsoft keeps the platform running, but you are responsible for your own data. Native features like the recycle bin and retention policies give you short-term operational recovery. They are not a true point-in-time backup you control. To reliably recover Exchange, OneDrive, SharePoint and Teams data after deletion, corruption or ransomware, you need a separate backup that is itself restore-tested.

Can ransomware destroy my backups too?

Yes. Modern ransomware often hunts for and encrypts or deletes backups before triggering, because attackers know an untouched backup defeats their ransom demand. That is why immutable storage matters (copies that cannot be altered or deleted once written), alongside isolated recovery environments and regular restore testing, so you can prove you have a clean copy to recover from rather than discovering during the attack that your backup was hit as well.

What is the 3-2-1-1-0 backup rule?

It is the modern extension of the classic 3-2-1 rule (three copies of your data, on two types of media, with one copy off-site). The added 1 is one immutable or offline copy that cannot be altered or encrypted by ransomware, and the 0 is zero recovery errors after verification, meaning the backup has been tested and confirmed restorable. The final 0 makes restore testing part of the definition of a real backup rather than an optional add-on, and it is the part almost everyone skips.

Is a backup the same as disaster recovery?

No. A backup is a copy of your data. Disaster recovery is the tested, timed plan for getting your business operational again after an incident, including how fast you can recover (your RTO) and how much data you can afford to lose (your RPO). A backup that has never been restore-tested might hand you some data. It will not give you a predictable recovery. Testing is what turns a backup into something you can actually base a recovery plan on.
