How to Back Up Terabytes of Photos Quickly and Safely

There’s nothing worse than losing your precious images due to some kind of failure in your computer. Luckily, even the simplest backup system can save you from such an incident by letting you restore lost data.

However, with today’s high-resolution cameras, it is easy to accumulate multiple terabytes of photos, videos and other files in a relatively short amount of time. The problem is that most standard backup solutions have a really hard time backing up large volumes of data effectively, in a reasonable amount of time, and without impacting your work during a backup.

In this post, I will show you a simple setup and a tool that makes backing up terabytes of images, videos and other data fast and automatic.

Warning: This post is a long one, and there is a reason for that: I give you concrete recommendations for adequate backup hardware and the complete, detailed workflow for backing up and restoring your terabytes of photos (and any other files you may have) – step by step.

Most other backup guides leave you scratching your head wondering how exactly to proceed. This guide is different. Just follow it step-by-step and your backup pains are history.

How an effective backup system should work

Basically, a data backup consists of multiple copies of your data that are kept up to date regularly and are kept in different geographic locations. A standard backup system would consist of the following elements:

  1. The main storage device: This may be a hard drive built into your computer or laptop where all the data is stored that you use in your daily work. It may also be one or more external hard drives or a mix of different devices.
  2. The on-site backup: Typically, this is some external storage device, e.g. a USB hard drive, onto which you copy all your data at regular intervals.
  3. The off-site backup: People often advocate the usage of online services like CrashPlan or BackBlaze for backing up files online to avoid total data loss in case of some catastrophic event like fire or theft. If that is not an option, you can also use another external hard drive that you keep in a location that is physically separated from your main storage device.

Figure 1: A minimal backup system

A backup strategy may be that you copy your files (including your photos) to your on-site backup storage once a day and to the off-site backup storage once a week.

This can work well as long as you have relatively small amounts of data and as long as your data does not grow at high rates. Unfortunately, even if you’re a hobbyist, you may fail to fulfill these requirements.

What’s the challenge?

Here’s the thing: The likelihood that you do your backups regularly decreases with a) the time each backup takes, b) the cost it incurs, and c) the amount of effort it takes.

So, you need to make it quick, inexpensive and automatic to actually make it work. If it’s slow, expensive and requires tons of manual work, you may end up having a clever backup system that you never use. Go figure what will happen…

Let’s look at the factors of time and cost more closely to give you a feel for the nature of the problems you’re facing when you try to back up terabytes of data:

  • Time

    Creating a backup means that you need to transfer all your data from your main storage device to a backup storage device. Whichever technology you’re using to do this, the speed at which your data is transferred is limited. The lower the transfer speed, the longer you have to wait for a backup to complete.

  • Cost

    You need to buy storage devices and/or you need to pay fees for online storage space. Especially the latter is a big factor as even the cheapest online backup services may charge you several hundred dollars per year for a multi-terabyte backup.

As a running example, I will assume that you have 3 TB of data to back up. With a 24 megapixel camera that produces raw files of 30 MB each, that’s roughly 100,000 photos.

You might think that today’s technology is advanced enough to enable fast transfer at an acceptable price point to rule out both time and cost as a factor. But it’s not… by a long shot!

Here are some ballpark figures for the time and cost required to back up our 3 TB of data today using different technologies:

  • 1.75 hours with internal SSD storage at a speed of 500 MB/s (option 1)

    • Cost: ~ $700
    • Note: The drives would need to be mounted internally in the computer to reach that speed which defeats the purpose of a backup.
  • 2.5 hours with external USB 3.0 SSD storage at a speed of 350 MB/s (option 2)

    • Cost: ~$900 (drives + external enclosure)
    • Note: The speed would be limited by the USB 3.0 interface.
  • 3.5 hours with an external USB 3.0 hard drive device at a speed of 250 MB/s (option 3)

    • Cost: ~$220
    • Note: The device would need to have two drives in RAID 0 configuration inside to reach that speed.
  • 6 hours with a simple external USB 3.0 hard drive at a speed of 150 MB/s (option 4)

    • Cost: ~$110
    • Limited by the speed of the enclosed hard drive itself
  • 2 days with an external USB 2.0 hard drive device at a speed of 20 MB/s (option 5)

    • Cost: ~$110
    • Limited by the speed of USB 2.0
  • 52 days with online backup at an upload speed of 5 Mbit/s (0.7 MB/s) (option 6)

    • Cost: $240 per year (Amazon Glacier service)
    • Note: Restoring all of your data will require at least a month under ideal conditions.
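If you want to check these ballpark figures yourself, here is a minimal Python sketch. It assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and an ideal, uninterrupted transfer with no overhead of any kind. The function names are made up for this example.

```python
TOTAL_BYTES = 3 * 10**12  # 3 TB in decimal units (assumption)


def transfer_seconds(total_bytes: int, mb_per_s: float) -> float:
    """Ideal time for one uninterrupted copy, ignoring all overhead."""
    return total_bytes / (mb_per_s * 10**6)


def mbit_to_mbyte(mbit_per_s: float) -> float:
    """Convert a line speed in Mbit/s to MB/s (8 bits per byte)."""
    return mbit_per_s / 8


# Speeds in MB/s, taken from the list of options above.
OPTIONS = {
    "1: internal SSD": 500,
    "2: external USB 3.0 SSD": 350,
    "3: external USB 3.0 hard drives in RAID 0": 250,
    "4: simple external USB 3.0 hard drive": 150,
    "5: external USB 2.0 hard drive": 20,
    "6: online backup at 5 Mbit/s upload": mbit_to_mbyte(5),
}

for name, speed in OPTIONS.items():
    hours = transfer_seconds(TOTAL_BYTES, speed) / 3600
    print(f"Option {name}: {hours:.1f} hours ({hours / 24:.1f} days)")
```

The results land close to the figures in the list; the remaining differences come from rounding. Note that 5 Mbit/s works out to 0.625 MB/s before any protocol overhead.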

The good news is that the numbers for options 3 and 4 look reasonable. Options 1 and 2 are too expensive. Option 5 is only relevant for older computers that lack a much faster USB 3.0 port. Finally, option 6 is just infeasible as it’s way too slow.

The bad news is that the transfer speeds used above are only attainable if you transfer all the data in one go, as a single file. As soon as you transfer 100,000 smaller files, your effective transfer speed will drop drastically.

With options 3 and 4, you’re facing a duration of 12 to 36 hours for a full backup depending on the nature of your data.
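A rough way to see where the extra hours come from is to treat every file as costing a fixed amount of extra time on top of the raw transfer. The Python sketch below is a deliberately simple model, not a measurement. Pairing 12 hours with option 3 and 36 hours with option 4 is an assumption made purely for illustration, and the function names are hypothetical.

```python
TOTAL_BYTES = 3 * 10**12  # 3 TB of data (decimal units, assumption)
N_FILES = 100_000         # roughly 100,000 raw files of 30 MB each


def backup_hours(mb_per_s, per_file_overhead_s,
                 total_bytes=TOTAL_BYTES, n_files=N_FILES):
    """Raw transfer time plus a fixed overhead per file, in hours."""
    streaming_s = total_bytes / (mb_per_s * 10**6)
    overhead_s = n_files * per_file_overhead_s
    return (streaming_s + overhead_s) / 3600


def implied_overhead_s(observed_hours, mb_per_s,
                       total_bytes=TOTAL_BYTES, n_files=N_FILES):
    """Per-file overhead that would explain an observed backup duration."""
    streaming_s = total_bytes / (mb_per_s * 10**6)
    return (observed_hours * 3600 - streaming_s) / n_files


# Assumed pairing: 12 h on the RAID 0 device, 36 h on the simple drive.
for label, speed, hours in [("option 3", 250, 12), ("option 4", 150, 36)]:
    overhead = implied_overhead_s(hours, speed)
    print(f"{label}: {overhead:.2f} s of overhead per file")
```

Under this model, the range of 12 to 36 hours corresponds to somewhere between a few tenths of a second and about one second of overhead per file. That does not sound like much, but it adds up over 100,000 files.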

What have we learned so far?

  • Lesson 1: Online backup is useless and expensive for large volumes of data.
  • Lesson 2: Your backup is going to take very long unless…

So, what are your options again?

After juggling all of these numbers around, let’s boil it down to some recommendations with respect to your backup system:

  1. Your best bet is to get at least one fast external USB 3.0 hard drive to create a backup. For 3 TB of data, I would recommend a ‘Western Digital 4 TB My Book Duo’ or a similar device. Alternatively, you could get a single 4 TB external drive from any manufacturer for about half the money at a much lower speed.
  2. You should make sure that you are using USB 3.0 to connect your device to the computer. Alternative technologies are Thunderbolt, FireWire and eSATA. But those are more exotic and may be more difficult to handle.
  3. You need a backup tool that uses the bandwidth of your backup channel (e.g. USB 3.0) more intelligently than just pushing all the files across for every backup (full backup).

Which backup software should you use?

If you type ‘backup software’ in Google, you’ll be amazed… and confused. There are hundreds of tools out there that seem to do the job. But they do it in vastly different ways. Unfortunately, picking the wrong one will cause problems since many of the concepts found in mainstream backup software are difficult to apply to terabyte-sized data.

For example, you don’t want the software to compress your data. This would take extremely long, and it would complicate or even prevent direct access to your files. In the case of RAW photos, JPEGs and videos, compression would not save a lot of space anyway.

For huge amounts of data, it is best to choose a simple backup software that offers the following features:

  • 1-to-1 representation of your data: The software should not attempt to encapsulate your data in some kind of archive or change its format in any way. It should create a plain, direct copy of it. This ensures that you can restore the data directly with any file management tool, that you’re not making yourself dependent on the backup software, and that the data is available quickly after a failure. You can even work with the files on your backup directly and immediately if absolutely needed.
  • Fast incremental backup: An ‘incremental backup’ is a process by which only the changes to your files are actually backed up once you have completed a full backup (copied all your data onto the backup device initially). This is huge! With an incremental backup, you won’t have to wait 12 – 36 hours each time you trigger a backup. However, the software typically still has to check each and every file to see which ones have changed. Even if nothing has to be backed up, this process can still take a few hours. A ‘fast incremental backup’ (name invented by me) goes one step further and eliminates even those checks. It runs in the background as you’re working and keeps track of file changes. When you tell it to update the backup, it already knows what the changes are, and it only needs to transfer these changes without running further checks.
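To make the idea of an incremental backup more tangible, here is a minimal Python sketch of the plain (scanning) variant. It walks through every file and copies only those that are missing on the mirror or that differ in size or modification time. This is not how MirrorFolder works internally; it is only an illustration, the function names are invented, and deletions are not handled at all.

```python
from __future__ import annotations

import shutil
from pathlib import Path

# Tolerance in seconds, because some file systems store timestamps coarsely.
MTIME_TOLERANCE_S = 2


def needs_copy(src: Path, dst: Path) -> bool:
    """Return True if dst is missing or looks older/different than src."""
    if not dst.exists():
        return True
    s, d = src.stat(), dst.stat()
    if s.st_size != d.st_size:
        return True
    return s.st_mtime > d.st_mtime + MTIME_TOLERANCE_S


def incremental_backup(source: Path, mirror: Path) -> list[Path]:
    """Copy new or changed files from source to mirror as a 1-to-1 copy."""
    copied = []
    for src in source.rglob("*"):
        if not src.is_file():
            continue
        dst = mirror / src.relative_to(source)
        if needs_copy(src, dst):
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)  # copy2 also carries over the timestamp
            copied.append(dst)
    return copied
```

Note that this sketch still has to look at every single file on each run. That is exactly the check that a ‘fast incremental backup’ avoids by tracking changes in the background while you work.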

My favorite backup tool – MirrorFolder

The only tool I found that fulfills these requirements and much more is called MirrorFolder. Strictly speaking, it is a file synchronization tool, but the boundaries are a bit blurred here.

It runs quietly in the background and monitors all file operations. You can set up a backup schedule very flexibly. You can run a real-time backup, but you can also let MirrorFolder run a backup at regular intervals or upon connecting your backup device to the computer. All these different backup strategies can be mixed freely.

Besides the straightforward file synchronization, the software can also create ZIP archives containing the original versions of any changed or deleted file. This allows you to restore files that were accidentally changed or deleted.

The software has many options that let you take full control of the backup process.

How to set up a backup with MirrorFolder

Download the MirrorFolder software from http://www.techsoftpl.com/backup and install it. After the installer has finished, you will have to restart your computer to complete the installation. You can use the software for free without restrictions for 30 days. A license will cost you $39.

You can create a backup as follows:

Step 1: Plug your designated backup device into the computer and wait for it to be recognized.
Step 2: Make sure that this device does not contain any data that you want to preserve. Ideally, it should be empty. Any data on that device that is not identical to the data on the main storage device may be deleted when you start a backup.
Step 3: Open MirrorFolder and go to the ‘MirrorList’ tab (Figure 2-1).

Figure 2: The MirrorFolder main window

Step 4: In the ‘Source’ section, click on ‘New’ (Figure 2-2) and choose your main storage device (Figure 2-5). This will typically be the entire drive, but you can also choose any subfolder anywhere on the drive if you don’t want to back up all the data.
Step 5: In the ‘Mirrors for’ section, choose ‘Selected source’ and click ‘Add’ (Figure 2-3). Choose the mirror drive or a subfolder on that drive (Figure 2-6) in the dialog box that pops up. I will also call this drive/folder the ‘destination’.
Step 6: Select the destination in the ‘Mirrors for’ section (Figure 2-6) and open the options dialog box by clicking ‘Options’ (Figure 2-4). Set up the options in the different tabs as shown in the screenshots below in Figure 3.

Figure 3: Options for a backup that runs automatically when the backup device is connected.

Note that, in the ‘Archive’ tab, I checked the options ‘Backup files before overwriting during synchronization’ and ‘Backup files before deleting during synchronization’. This tells MirrorFolder to archive any files that are going to be updated or deleted on the backup device during a backup. Using these options will enable you to roll back any changes that occurred on your data. If you want to limit the space required by these backup archives, you can set the option ‘Delete backup files older than’ to a specific number of days. At the very bottom of the ‘Archive’ tab, click the ‘Browse’ button to create or select a subfolder on the backup device where the archives should be stored. IMPORTANT: Choose an archive folder that is not contained in the destination folder for the backup.

Step 7: Add ‘ps*.tmp’ to the ‘Global Excluded Temporary File List’ on the ‘Options’ tab (Figure 4-1). This ensures that Photoshop’s temporary files are not backed up and that the backup does not interfere with the operation of Photoshop.

Figure 4: Adding files to be excluded from your backup

Step 8: Save the mirror list by clicking on the ‘Save Settings’ button in the toolbar (Figure 4-2).

The software starts the backup in the background and the column ‘State’ in the ‘Mirrors for’ section (Figure 2-6) will list the mirror as ‘Synchronizing…’. When the synchronization has finished, it will switch to ‘Synchronized’. Depending on how much data you have, and the speed of your devices and connection, it may take many hours before all the data has been backed up initially. But you can continue working on your computer without any restrictions. MirrorFolder is designed to deal with that.

Note that I chose to have my data automatically backed up when I connect the backup device and every 2 hours as long as the device is connected. This is very convenient, as the work I need to invest for my backups is reduced to plugging and unplugging my backup device(s).

As soon as the backup has finished, the software will notify you. At this point it is generally a good idea to unplug the backup device from the computer. Make sure you remove it safely using Windows’ ‘Safely Remove Hardware and Eject Media’ feature in the system tray (Figure 5). Then, physically unplug the backup device. If you leave the device connected, the data may be exposed to accidental or malicious deletion or alteration.

Figure 5: Removing the backup device safely.

How to restore your data

There are different cases in which you will want to restore all of your data or only part of it. Let’s take a closer look at each of these cases.

Main storage device failure – Full restore

If your main storage device has a hardware failure, you should disable the backup in MirrorFolder to prevent the backed-up data from being altered in any way. The data to be backed up is gone for the moment, so attempting to back it up anyway is not useful.

Step 1: Go to the ‘MirrorList’ tab, choose the main storage device in the ‘Source’ list and right-click on the backup device in the ‘Mirrors for’ section.

Figure 6: Disabling a specific backup temporarily

Step 2: Choose ‘Disable’ from the pop-up menu.

Thanks to the flexibility of the MirrorFolder software, there are many ways in which you can restore your data fully or partially. Here, I will give you the procedure for doing a full restore.

Before you can restore your data, you need to get a replacement for your main storage device (e.g. a new internal hard drive for your computer) and install the hardware.

To allow for a seamless migration, change the drive letter of the new main storage device to the letter that was formerly assigned to the old main storage device.

Here’s how to do this:

Step 1: Open the ‘Disk Management’ app by clicking on the Start button and typing ‘create and format’. Click on the ‘Best match’ that Windows presents to you.

Figure 7: Opening the ‘Disk Management’ app.

Step 2: Right-click on the new main storage device and choose ‘Change Drive Letter and Paths’.

Figure 8: Changing the drive letter using the ‘Disk Management’ app

Step 3: Click the ‘Change…’ button and then choose the correct letter from the ‘Assign the following drive letter’ drop-down box. Click ‘OK’ to confirm. Windows will give you a warning. Click ‘Yes’ to change the drive letter.

Figure 9: Selecting the correct drive letter.

Now you can start to restore your data:

Step 1: Start MirrorFolder and go to the ‘Restore’ tab (Figure 10-1). In principle, you can use any tool that lets you copy files since the backup is a 1-to-1 representation of your files. But using the ‘Restore’ tab of MirrorFolder is more convenient and gives you more control.

Figure 10: Running a full restore of your files.

Step 2: Under ‘Source’, select the folder in which the backup is stored (Figure 10-2).
Step 3: Delete the contents of the field ‘Zip Files’ (Figure 10-3). ‘Zip Files’ will be set automatically once you have selected a source. If you restore your data with that default value, MirrorFolder is going to restore all the files you deleted, dating back for as long as archives are stored. So, keep the ‘Zip Files’ field empty to make sure you restore the state of your main storage device at the time when you last backed it up. This is typically what you want for a full restore.
Step 4: In the ‘Target’ field (Figure 10-4), enter the name of the folder to which you want to restore the data. In order to continue working without distractions after the restore process, make sure the ‘Target’ you choose is the same folder in which your data was stored on the old main storage device. For example, if the source of your backup in the ‘MirrorList’ tab is ‘r:’ (see Figure 2-5), the target to restore to should also be ‘r:’, assuming you chose the same drive letter for the new main storage device and the device is empty.
Step 5: ‘File Types’ should normally be ‘*.*’, which means that all files will be restored (Figure 10-5).
Step 6: Under ‘Date’ choose the current date and time (Figure 10-6). This means that all files will be restored (more precisely, all files created or modified on or before the chosen date).
Step 7: Choose ‘No Replace’ from the drop-down menu on the right side (Figure 10-7) so that files that may already be on the ‘Target’ are not replaced.
Step 8: Click the ‘Restore’ button (Figure 10-8) to start the restore process. Each file will be listed with its ‘Status’ and additional information in the lower half of the window (Figure 10-9).
Step 9: Wait for the restore process to finish, unplug the backup device from the computer and enable the backup in MirrorFolder again. Keep in mind that, for terabytes of data, this is going to take many hours.

Now, you should be able to continue working.
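Since the backup is a plain copy of your files, a full restore boils down to copying everything back without overwriting what is already there. The following Python sketch shows roughly what that amounts to. It is my reading of what a ‘No Replace’ restore does, not MirrorFolder’s actual implementation, and `restore_no_replace` is a made-up name.

```python
from __future__ import annotations

import shutil
from pathlib import Path


def restore_no_replace(backup: Path, target: Path) -> tuple[int, int]:
    """Copy files from backup to target, skipping files that already exist.

    Returns the number of restored and skipped files.
    """
    restored = skipped = 0
    for src in backup.rglob("*"):
        if not src.is_file():
            continue
        dst = target / src.relative_to(backup)
        if dst.exists():
            skipped += 1  # never replace a file that is already on the target
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # keeps the original modification time
        restored += 1
    return restored, skipped
```

A call could look like `restore_no_replace(Path("x:/backup"), Path("r:/"))`, where both paths are placeholders for your own backup folder and target drive. This only works cleanly if the archive folder with the ZIP files lives outside the backup folder, as recommended above. Otherwise the archives would be copied back as well.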

Correcting accidental deletions on the spot

If you accidentally delete a set of files and you notice it right away, chances are that these files are still present on one of your backups. To restore the files, do the following:

Step 1: Disable the backup temporarily (see Figure 6) to keep MirrorFolder from running an automatic backup which would delete the respective files once the backup device is connected to your computer.
Step 2: Connect the backup device to the computer.
Step 3: Navigate to the location of the deleted files on the backup device and simply copy them to their original location on the main storage device. You can use the Windows file explorer to do this.
Step 4: Disconnect the backup device from the computer and enable the backup in MirrorFolder again.

Recovering lost files from an up-to-date backup

If files were removed accidentally and you did not notice it until much later, these changes will already have been propagated to the backup. But don’t worry. Your files are still available in the backup archives.

To be able to search effectively for lost files in the backup archives that MirrorFolder created for you, you should first tell Windows to include the contents of ZIP archives when you search for files in the file explorer app. Do the following:

Step 1: Open the ‘File Explorer Options’ by clicking on the Start button, typing ‘file explorer options’ and clicking on the ‘Best match’.

Figure 11: Opening the Windows ‘File Explorer Options’

Step 2: In the ‘Search’ tab of the ‘File Explorer Options’ check the option ‘Include compressed files (ZIP, CAB…)’.

Figure 12: Expanding the Windows Search to ZIP archives.

Now, proceed as follows to find the lost files:

Step 1: Connect the backup device and navigate to the folder that stores the archive of changes and deletions (see ‘Archive’ options in Figure 3).

Figure 13: Archive folder containing archives for different backups. ‘media_raid’ is the main storage device in this example.

Step 2: Type the name of the lost file (or parts of the name) in the search field at the top right of the File Explorer window. The search may take a few moments.

Figure 14: Typing a part of the file name (here: ‘Backup System’) in the ‘Search’ box.

Step 3: When results start appearing, look for the files you want to restore and identify the containing archive. The archive file is listed below the respective file name. If it is truncated, hover your mouse over the file entry and wait for a few moments. A tool tip with the full archive name will appear. Note that you may see several entries for the same file. In most cases, you want to restore the most recent one (stored in the backup archive with the most recent time stamp in its name).

Figure 15: Search results – Identifying the containing archive by hovering the mouse over the truncated archive name.

Step 4: Navigate back to the folder containing the archives, extract the respective archive and copy the desired file(s) back to their original location.

Figure 16: Extracting the archive to copy a lost file to its original folder. For example, right-click and choose ‘Extract All…’, or open in your preferred ZIP archive software.

Step 5: Delete any extracted files in the archive folder to save space and avoid confusion.
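If you prefer a script over the Windows search, the lookup in the steps above can also be sketched in a few lines of Python. The sketch only assumes that the archive folder contains ordinary ZIP files. It does not rely on any particular naming scheme for the archives, and `find_in_archives` is a name invented for this example.

```python
from __future__ import annotations

import zipfile
from pathlib import Path, PurePosixPath


def find_in_archives(archive_dir: Path, name_part: str) -> list[tuple[Path, str]]:
    """List (archive, member) pairs whose file name contains name_part."""
    needle = name_part.lower()
    hits = []
    for archive in sorted(archive_dir.rglob("*.zip")):
        try:
            with zipfile.ZipFile(archive) as zf:
                for info in zf.infolist():
                    if info.is_dir():
                        continue
                    # Member names inside ZIP files use forward slashes.
                    file_name = PurePosixPath(info.filename).name
                    if needle in file_name.lower():
                        hits.append((archive, info.filename))
        except zipfile.BadZipFile:
            continue  # skip files that are not readable ZIP archives
    return hits
```

Calling `find_in_archives(Path("x:/archive"), "Backup System")` would list every archive that contains a matching file, with `x:/archive` standing in for your own archive folder. From there, you can extract the file from the most recent archive as described in Step 4.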

Conclusions

In this post, I have shown you a very simple way to back up terabytes of data without investing a lot of money. We’ve talked about the different types of backups you may want to establish, how to create a bare-bones system that simply works, and how to use a tool called MirrorFolder to automate large parts of the backup process. No matter how much data you have, MirrorFolder is only going to back up those files that changed without even going through and checking all of them. This is extremely efficient and quick. It reduces the work that goes into backing up your data to plugging and unplugging a backup drive once a day, once a week, or once a month.

So, what are you waiting for? Do it right now! Let me know how it’s going and whether you have any questions.

…and if you’re a Mac user, I’d love to hear from you too. Which backup solution are you using? Time Machine? Some other app? Do you know or use software tools that do similar things as MirrorFolder does? I’d be happy to hear about it.

21 Comments on "How to Back Up Terabytes of Photos Quickly and Safely"


Guest
Terry
1 year 6 months ago

Hi Klaus, thanks for showing us your backup methodology.
What about us poor Mac people, as I think MirrorFolder is Windows software (?)

Guest
Mike
1 year 6 months ago

Hi. Thanks for the GREAT article. As a serious amateur photographer, and a Mac user, here’s what I do. My working files are actually on an external WD 2TB USB drive. I have another identical drive that is a mirror. The backup is done weekly, very early in the morning. Automatic, using Carbon Copy Cloner. I also use Crash Plan for automatic off site incremental backups. Getting very close to moving over to 2 WD 5TB external drives. Hope I have it covered…

Guest
Peter
1 year 6 months ago

For Mac I like SuperDuper which does all you describe.

Guest
1 year 6 months ago

Hi Klaus,

I know that backup strategies are a controversial topic and that even the requirements are often not agreeable among different users.

It’s sad to see that you (like many others) recommend a simple mirroring solution. In my opinion this is NOT a reliable backup strategy.

Your requirement to have a 1-to-1 representation of current data is not one that I support. It is even dangerous to have the backup files accessible like any other file. I have seen too many cases where the user selects the wrong disk for his operations and thus destroys the backup.

Every decent backup software has a disaster recovery option or even bare metal restore capability, so there is NO problem at all with the requirement to have the backup software at hand for a restore.

Mirroring also doesn’t meet an absolute requirement for safe backups: a retention policy and the ability to go back in time. In my career I have used the history more than once to rescue important data. The most likely use case is an unintended and/or unnoticed change to files or a folder structure which happens quite easily when the drag and drop feature within file managers is used.

Every (good) modern backup software also does lifetime incremental backups. This is industry standard now. In professional environments this has been the case since the last century and for good reasons. Lifetime incremental backup means that only the very first backup to an empty backup set is a full backup. All following backups are incremental.

I’m a Mac user and my backup strategy is quite simple.
For local backups I use TimeMachine (which btw. is a lifetime incremental backup) with two backup sets (different targets) which switch on a regular basis (normally three days).
Additionally I use remote backup to a friend – also lifetime incremental and with history.

So I have one remote backup which has a history of 30 days and two local backups with a history of about 6 months. In a disaster recovery case the remote backup is accessible easily so that the data hasn’t to be transferred over the wire. Many online backup services also provide this service.

Regards, Frank Tegtmeyer

Guest
1 year 6 months ago

Hi Klaus,

I missed the use of the archive zip files in your article. This indeed provides historical data. So your solution is sufficient besides my comment regarding the 1:1 copy.

My data is around 1TB at the moment with a growth rate of 50GB per month. The growth rate varies, but it’s around this value at the moment.

Thanks for taking the time to answer my comment.

Regards, Frank

Guest
Jonathan
1 year 6 months ago

I take issue with your early proclamation that online backup is “useless and expensive”. You mention two companies in your article (backblaze and crash plan) that are cheaper than your concrete example. It is true that the initial backup can take quite some time, but that hardly makes it useless. A good backup strategy is a long term solution and shouldn’t be limited to what is immediately available. I currently have over 4TB (~250,000 photos) on backblaze for $60/year which is well worth it to me to have a completely automatic backup safe from my house being destroyed.

My personal backup system involves the following:
1) copy cards to internal drive
2) automatic sync from internal drive to external RAID device.
3) automatically upload to cloud backup provider.
4) clear cards at next shoot giving plenty of time for the copy to both onsite and offsite backups.

After the initial copy from my CF cards to my computer, there are always between 2 and 4 copies of the raw photos. A shoot of 2000 photos takes maybe a half hour to copy to my onsite backup and less than a day to make it to offsite backup.

Guest
1 year 6 months ago

You have a point with the growth rates.
But especially for crashplan (maybe other software too) you have the opportunity to use your own destinations which can be a neighbour connected through Ethernet or a friend 1000 miles away – even in a foreign country. There you have no imposed bandwidth restrictions.

For the initial backup most online services – and of course your friend/neighbour – provide seeding. The same is true for getting the data back fast in case of an emergency. One has to check if the delivery is restricted to some countries – last time I checked crashplan backups were delivered to the U.S. only.

It’s a matter of balancing your security needs and the time and cost factors.

Regards, Frank

Guest
1 year 6 months ago

First:
An online backup is always the last line of defence ONLY. It’s meant to be used if all local and fast options don’t work, for example if your household gets robbed or your house burns down or the police collects all your computer equipment for some reason. I’m always shocked when people tell me they rely completely on online backup. That’s a call for disaster.

Second: At least in bigger (German) cities and their surroundings you get fast enough connections nowadays. For other countries the situation may be completely different (better or worse). I have 50Mbit/s upload speed, so this is no limitation for me at my current growth rate. The remotes I use have the same.
If one of the parameters (speed/growth rate) changes I have to rethink my backup strategy again of course. Maybe some day my local network is saturated by backup data. Also then I have to find new strategies.

Regards, Frank

Guest
1 year 6 months ago

Klaus, you are right except for your last sentence. This is not a viable alternative – it’s simply a thing you have to do even if you have an additional online backup.

And for the case of a disaster it is of course better to be able to get your data back than not. Even if it takes months.

But as I said – you are right that local and fast solutions come first. Online backup is an add-on that MAY help in the worst case scenario.

Regards, Frank