There’s nothing worse than losing your precious images due to some kind of failure in your computer. Luckily, even the simplest backup system can save you from such an incident by letting you restore lost data.
However, with today’s high-resolution cameras, it is easy to accumulate multiple terabytes of photos, videos and other files in a relatively short amount of time. The problem is that most standard backup solutions have a really hard time backing up large volumes of data effectively, in a reasonable amount of time, and without impacting your work during a backup.
In this post, I will show you a simple setup and a tool that makes backing up terabytes of images, videos and other data fast and automatic.
Warning: This post is a long one, and there is a reason for that: I give you concrete recommendations for adequate backup hardware and the complete, detailed workflow for backing up and restoring your terabytes of photos (and any other files you may have) – step by step.
Most other backup guides leave you scratching your head wondering how exactly to proceed. This guide is different. Just follow it step-by-step and your backup pains are history.
How an effective backup system should work
Basically, a data backup consists of multiple copies of your data that are kept up to date regularly and are kept in different geographic locations. A standard backup system would consist of the following elements:
- The main storage device: This may be a hard drive built into your computer or laptop where all the data is stored that you use in your daily work. It may also be one or more external hard drives or a mix of different devices.
- The on-site backup: Typically, this is some external storage device, e.g. a USB hard drive, onto which you copy all your data in regular intervals.
- The off-site backup: People often advocate the use of online services like CrashPlan or Backblaze for backing up files online to avoid total data loss in case of some catastrophic event like fire or theft. If that is not an option, you can also use another external hard drive that you keep in a location that is physically separated from your main storage device.
Figure 1: A minimal backup system
A backup strategy may be that you copy your files (including your photos) to your on-site backup storage once a day and to the off-site backup storage once a week.
This can work well as long as you have relatively small amounts of data and as long as your data does not grow at high rates. Unfortunately, even if you’re a hobbyist, you may fail to fulfill these requirements.
What’s the challenge?
Here’s the thing: The likelihood that you do your backups regularly decreases with a) the time each backup takes, b) the cost it incurs, and c) the amount of effort it takes.
So, you need to make it quick, inexpensive and automatic to actually make it work. If it’s slow, expensive and requires tons of manual work, you may end up having a clever backup system that you never use. Go figure what will happen…
Let’s look at the factors of time and cost more closely to give you a feel for the nature of the problems you’re facing when you try to back up terabytes of data:
As a running example, I will assume that you have 3 TB of data to back up. With a 24 megapixel camera that produces raw files of 30 MB each, that’s roughly 100,000 photos.
You might think that today’s technology is advanced enough to enable fast transfer at an acceptable price point to rule out both time and cost as a factor. But it’s not… by a long shot!
Here are some ballpark figures for the time and cost required to back up our 3 TB of data today using different technologies:
The good news is that the numbers for options 3 and 4 look reasonable. Options 1 and 2 are too expensive. Option 5 is only relevant for older computers that lack a much faster USB 3.0 port. Finally, option 6 is just infeasible as it’s way too slow.
The bad news is that the transfer speeds used above are only attainable if you transfer all the data in one go, as a single file. As soon as you transfer 100,000 smaller files, your effective transfer speed will drop drastically.
With options 3 and 4, you’re facing a duration of 12 to 36 hours for a full backup depending on the nature of your data.
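To get a feel for why many small files slow things down, here is a minimal sketch of a back-of-the-envelope model: streaming time plus a fixed cost per file. The throughput (100 MB/s) and the per-file overhead (0.05 s) are my own illustrative assumptions, not measurements of any particular drive, so the result will not match the durations quoted above.

```python
def estimate_backup_hours(total_bytes, n_files, throughput_bytes_per_s, per_file_overhead_s):
    """Rough model: time to stream the bytes plus a fixed cost for every file."""
    streaming_s = total_bytes / throughput_bytes_per_s
    overhead_s = n_files * per_file_overhead_s
    return (streaming_s + overhead_s) / 3600


TB = 10**12
MB = 10**6

total_bytes = 3 * TB
n_files = total_bytes // (30 * MB)  # raw files of 30 MB each

# Assumed values for illustration only: 100 MB/s sustained, 0.05 s extra per file.
single_file_hours = estimate_backup_hours(total_bytes, 1, 100 * MB, 0.0)
many_files_hours = estimate_backup_hours(total_bytes, n_files, 100 * MB, 0.05)

print(n_files, round(single_file_hours, 1), round(many_files_hours, 1))
# prints: 100000 8.3 9.7
```

Plug in the real numbers for your own drive and file count to see how quickly the per-file cost adds up.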
What have we learned so far?
- Lesson 1: Online backup is useless and expensive for large volumes of data.
- Lesson 2: Your backup is going to take very long unless…
So, what are your options again?
After juggling all of these numbers around, let’s boil it down to some recommendations with respect to your backup system:
- Your best bet is to get at least one fast external USB 3.0 hard drive to create a backup. For 3 TB of data, I would recommend a ‘Western Digital 4 TB My Book Duo’ or a similar device. Alternatively, you could get a single 4 TB external drive from any manufacturer for about half the money at a much lower speed.
- You should make sure that you are using USB 3.0 to connect your device to the computer. Alternative technologies are Thunderbolt, FireWire and eSATA. But those are more exotic and may be more difficult to handle.
- You need a backup tool that uses the bandwidth of your backup channel (e.g. USB 3.0) more intelligently than just pushing all the files across for every backup (full backup).
Which backup software should you use?
The tool I recommend below is a Windows program. Mac users may still find the following points interesting because they are also about the principles of clever terabyte backup, which hold on both types of systems.
If you are a Mac user and you know of or use software that operates using similar principles, please let me know below in the comments. I’d be really interested to learn about it.
If you type ‘backup software’ in Google, you’ll be amazed… and confused. There are hundreds of tools out there that seem to do the job. But they do it in vastly different ways. Unfortunately, picking the wrong one will cause problems since many of the concepts found in mainstream backup software are difficult to apply to terabyte-sized data.
For example, you don’t want the software to compress your data. This would take extremely long, and it would complicate or even prevent direct access to your files. In the case of RAW photos, JPEGs and videos, compression would not save a lot of space anyway.
For huge amounts of data, it is best to choose a simple backup tool that offers the following features:
- 1-to-1 representation of your data: The software should not attempt to encapsulate your data in some kind of archive or change its format in any way. It should create a plain, direct copy of it. This ensures that you can restore the data directly with any file management tool, that you’re not making yourself dependent on the backup software, and that the data is available quickly after a failure. You can even work with the files on your backup directly and immediately if absolutely needed.
- Fast incremental backup: An ‘incremental backup’ is a process by which only the changes to your files are actually backed up once you have completed a full backup (copied all your data onto the backup device initially). This is huge! With an incremental backup, you won’t have to wait 12 – 36 hours each time you trigger a backup. However, the software typically still has to check each and every file to see which ones have changed. Even if nothing has to be backed up, this process can still take a few hours. A ‘fast incremental backup’ (name invented by me) goes one step further and eliminates even those checks. It runs in the background as you’re working and keeps track of file changes. When you tell it to update the backup, it already knows what the changes are, and it only needs to transfer these changes without running further checks.
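The plain incremental idea (a 1-to-1 copy that only transfers what changed) can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names are made up, and "changed" is decided by comparing file size and modification time, which is one common heuristic and not necessarily what any given backup tool does. Note that it still walks every file, which is exactly the check that a ‘fast incremental backup’ avoids.

```python
import shutil
from pathlib import Path


def needs_copy(src: Path, dst: Path) -> bool:
    """Heuristic: copy if the backup file is missing, differs in size, or is older."""
    if not dst.exists():
        return True
    s, d = src.stat(), dst.stat()
    return s.st_size != d.st_size or s.st_mtime > d.st_mtime


def incremental_backup(source_root: Path, backup_root: Path) -> list[Path]:
    """Mirror source_root into backup_root, copying only new or changed files."""
    copied = []
    for src in source_root.rglob("*"):
        if not src.is_file():
            continue
        rel = src.relative_to(source_root)
        dst = backup_root / rel
        if needs_copy(src, dst):
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)  # copy2 also copies the modification time
            copied.append(rel)
    return copied
```

Because the result is a plain folder tree, you can open or copy the backed-up files with any file manager. On file systems with coarse timestamps, a small tolerance in the time comparison may be needed.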
My favorite backup tool – MirrorFolder
The only tool I found that fulfills these requirements and much more is called MirrorFolder. Strictly speaking, it is a file synchronization tool, but the boundaries are a bit blurred here.
It runs quietly in the background and monitors all file operations. You can set up a backup schedule very flexibly. You can run a real-time backup, but you can also let MirrorFolder run a backup in regular intervals or upon connecting your backup device to the computer. All these different backup strategies can be mixed flexibly.
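To illustrate the general principle of tracking changes as they happen and replaying only those at backup time, here is a minimal sketch. It is not MirrorFolder’s implementation: the `ChangeJournal` class and its methods are hypothetical names of mine, and the part that actually watches the file system and calls `record` is left out entirely.

```python
import shutil
from pathlib import Path


class ChangeJournal:
    """Remembers which relative paths changed since the last backup."""

    def __init__(self):
        self._pending = {}  # relative path -> "changed" or "deleted"

    def record(self, rel_path: str, kind: str) -> None:
        """Called by a (hypothetical) file monitor whenever a file is touched."""
        if kind not in ("changed", "deleted"):
            raise ValueError(kind)
        self._pending[rel_path] = kind  # a later event replaces an earlier one

    def apply(self, source_root: Path, backup_root: Path) -> int:
        """Transfer only the journaled changes, then clear the journal."""
        applied = 0
        for rel_path, kind in self._pending.items():
            src, dst = source_root / rel_path, backup_root / rel_path
            if kind == "changed" and src.is_file():
                dst.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(src, dst)
                applied += 1
            elif kind == "deleted" and dst.is_file():
                dst.unlink()
                applied += 1
        self._pending.clear()
        return applied
```

The point of the design is that `apply` never scans the whole source folder. Its cost depends on the number of recorded changes, not on the total number of files.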
Besides the straightforward file synchronization, the software can also create ZIP archives containing the original versions of any changed or deleted file. This allows you to restore files that were accidentally changed or deleted.
The software has many options that let you take full control of the backup process.
How to set up a backup with MirrorFolder
Download the MirrorFolder software from http://www.techsoftpl.com/backup and install it. After the installer has finished, you will have to restart your computer to complete the installation. You can use the software for free without restrictions for 30 days. A license will cost you $39.
You can create a backup as follows:
Figure 3: Options for a backup that runs automatically when the backup device is connected.
Note that, in the ‘Archive’ tab, I checked the options ‘Backup files before overwriting during synchronization’ and ‘Backup files before deleting during synchronization’. This tells MirrorFolder to archive any files that are going to be updated or deleted on the backup device during a backup. Using these options will enable you to roll back any changes that occurred on your data. If you want to limit the space required by these backup archives, you can set the option ‘Delete backup files older than’ to a specific number of days. At the very bottom of the ‘Archive’ tab, click the ‘Browse’ button to create or select a subfolder on the backup device where the archives should be stored. IMPORTANT: Choose an archive folder that is not contained in the destination folder for the backup.
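If you are curious what ‘archive before overwriting’ and ‘delete archives older than N days’ boil down to, here is a minimal sketch of the idea. It is my own illustration, not how MirrorFolder works internally: the function names, the one-ZIP-per-day naming scheme and the choice to store files uncompressed are all assumptions.

```python
import shutil
import time
import zipfile
from pathlib import Path


def archive_then_replace(src: Path, dst: Path, archive_dir: Path) -> None:
    """Copy src over dst, first saving the old dst into a dated ZIP archive."""
    if dst.exists():
        archive_dir.mkdir(parents=True, exist_ok=True)
        zip_path = archive_dir / (time.strftime("%Y-%m-%d") + ".zip")
        # ZIP_STORED: no compression, since photos and videos barely shrink.
        with zipfile.ZipFile(zip_path, "a", zipfile.ZIP_STORED) as zf:
            zf.write(dst, arcname=dst.name)
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, dst)


def prune_archives(archive_dir: Path, max_age_days: int) -> list[Path]:
    """Delete ZIP archives whose modification time is older than max_age_days."""
    cutoff = time.time() - max_age_days * 86400
    removed = []
    for zip_path in archive_dir.glob("*.zip"):
        if zip_path.stat().st_mtime < cutoff:
            zip_path.unlink()
            removed.append(zip_path)
    return removed
```

As in the note above, `archive_dir` should point to a folder outside the backup destination, so that the archives themselves are not treated as mirrored files.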
Figure 4: Adding files to be excluded from your backup
The software starts the backup in the background and the column ‘State’ in the ‘Mirrors for’ section (Figure 2-6) will list the mirror as ‘Synchronizing…’. When the synchronization has finished, it will switch to ‘Synchronized’. Depending on how much data you have, and the speed of your devices and connection, it may take many hours before all the data has been backed up initially. But you can continue working on your computer without any restrictions. MirrorFolder is designed to deal with that.
Note that I chose to have my data automatically backed up when I connect the backup device and every 2 hours as long as the device is connected. This is very convenient, as the work I need to invest for my backups is reduced to plugging and unplugging my backup device(s).
As soon as the backup has finished, the software will notify you. At this point it is generally a good idea to unplug the backup device from the computer. Make sure you remove it safely using Windows’ ‘Safely Remove Hardware and Eject Media’ feature in the system tray (Figure 5). Then, physically unplug the backup device. If you leave the device connected, the data may be exposed to accidental or malicious deletion or alteration.
Figure 5: Removing the backup device safely.
How to restore your data
There are different cases in which you will want to restore all of your data or only part of it. Let’s take a closer look at each of these cases.
Main storage device failure – Full restore
If your main storage device has a hardware failure, you should disable the backup in MirrorFolder to prevent the backed-up data from being altered in any way. The data to be backed up is gone for the moment, so attempting to back it up anyway is not useful.
Thanks to the flexibility of the MirrorFolder software, there are many ways in which you can restore your data fully or partially. Here, I will give you the procedure for doing a full restore.
Before you can restore your data, you need to get a replacement for your main storage device (e.g. a new internal hard drive for your computer) and install the hardware.
To allow for a seamless migration, change the drive letter of the new main storage device to the letter that was formerly assigned to the old main storage device.
Here’s how to do this:
Now you can start to restore your data:
Now, you should be able to continue working.
Correcting accidental deletions on the spot
If you accidentally delete a set of files and you notice it right away, chances are that these files are still present on one of your backups. To restore the files, do the following:
Recovering lost files from an up-to-date backup
If files were removed accidentally and you did not notice it until much later, these changes will already have been propagated to the backup. But don’t worry. Your files are still available in the backup archives.
To be able to search effectively for lost files in the backup archives that MirrorFolder created for you, you should first tell Windows to include the contents of ZIP archives when you search for files in the file explorer app. Do the following:
Figure 11: Opening the Windows ‘File Explorer Options’
Now, proceed as follows to find the lost files:
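For readers who prefer scripting over the Explorer search, the same lookup can be sketched in Python. This is an optional alternative and rests on one assumption: that the archives are ordinary ZIP files somewhere below a single archive folder. The function name `find_in_archives` is mine.

```python
import zipfile
from pathlib import Path


def find_in_archives(archive_dir: Path, name_fragment: str) -> list[tuple[Path, str]]:
    """Return (archive, member) pairs whose member name contains name_fragment."""
    hits = []
    needle = name_fragment.lower()
    for zip_path in sorted(archive_dir.rglob("*.zip")):
        with zipfile.ZipFile(zip_path) as zf:
            for member in zf.namelist():
                if needle in member.lower():  # case-insensitive match
                    hits.append((zip_path, member))
    return hits
```

Each hit tells you which archive to open and which entry to extract, for example with `zipfile.ZipFile(archive).extract(member, target_folder)`.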
In this post, I have shown you a very simple way to back up terabytes of data without investing a lot of money. We’ve talked about the different types of backups you may want to establish, how to create a bare-bones system that simply works, and how to use a tool called MirrorFolder to automate large parts of the backup process. No matter how much data you have, MirrorFolder is only going to back up those files that changed without even going through and checking all of them. This is extremely efficient and quick. It reduces the work that goes into backing up your data to plugging and unplugging a backup drive once a day, once a week, or once a month.
So, what are you waiting for? Do it right now! Let me know how it’s going and if you have any questions.
…and if you’re a Mac user, I’d love to hear from you too. Which backup solution are you using? Time Machine? Some other app? Do you know or use software tools that work in a similar way to MirrorFolder? I’d be happy to hear about it.