An implementation of an automated data backup system under the Mac OS X operating system

This is the first guest post on this blog. It is a great article by Bernd Limbach. Thank you very much, Bernd. If you are interested in writing and sharing, please contact me.

A third-party article by Bernd Limbach, translated by the author from his German original.

Data backup seems to be a much bigger topic than most people think. In the time I have been reading Olaf’s German blog, data backup has come up at least twice, mostly in the comments section. Here are the references to the two German articles:

http://www.olafbathke.de/fotograf-kiel-blog/2010/08/04/umfrage-wie-gros-ist-dein-datenlager/
and
http://www.olafbathke.de/fotograf-kiel-blog/2010/04/14/cloud-computing-und-bildbearbeitung-was-halten-fotografen-davon/

From these two articles, the following important points can be extracted:

1) RAID is not a good idea for a backup device.
2) External hard drives, DVDs or similar media can be used as backup media.
3) Redundancy is important.

These three points should be sufficient to implement a data backup scheme.

A short excursion on the topic of RAID:
My Mac Pro is set up with a RAID 0+1 (RAID01) system using the built-in Mac OS X software RAID functionality, which survived the upgrade from 10.5 to 10.6 without a hiccup. Four hard drives are used for this RAID system. My original RAW files are stored there. The following points were important to me when choosing this set-up:

– The overall storage capacity of the hard drives should be sufficient to serve me for about 4 years.
– There should be a certain protection against a hardware failure of one hard drive.

Of course, this does not protect against the loss of valuable data caused by user mistakes.

Just as a side note: at work I once helped to troubleshoot a failed RAID1. One hard drive was defective and the system did not start anymore. Even with support from the manufacturer of that computer we were not able to rebuild the RAID1, leaving only the option to reinstall the operating system without a RAID set-up.

And another point to think about: identical hard drive models from the same manufacturer can share a weakness and fail at around the same time. In theory, buying the drives from different retailers should reduce the likelihood of such a situation. Some people even buy discs of the same size from different manufacturers to reduce the risk of a hardware failure. I do not have any personal experience with this strategy; so far everything is fine in my computer. Everybody needs to make this decision for themselves.

Conclusion for RAID:

RAID should be treated with some caution. As a backup it does not provide the data security it may seem to promise.

Enough philosophy; let’s see how a backup strategy can be implemented. For this I researched the information and ideas available on the Internet. From that collection I decided what I wanted to use and sat down to implement it on my system. While I did this for my Mac Pro, I would expect that a similar approach can be followed under Windows. People who have implemented a backup scheme under Windows are invited to share their thoughts and actual set-up. I envision the Windows side as a nice addition to this article! Please leave a comment or contact Olaf.

Here are my thoughts:

1) The data backup should be done automatically.
2) I did not want to use the built-in Time Machine.
3) Still, on-board programs should be sufficient to set up the backup system.
4) The file system should be mirrored 1:1, so in the worst case I will be able to copy the folder structure manually.

To 1) Automatic data backup

By nature I am a relatively lazy person, so an automated backup was mandatory. I did not want to have to intervene here.

To 2) No use of Time Machine

I did not study the technology and features of Time Machine, because of my point no. 4. Still, Time Machine may be pretty good at providing a decent backup strategy, and it keeps hourly, daily and weekly backups. It sounds good, but I will not go into more detail here. Maybe someone else uses it; stories are also very welcome!

To 3) Use of on-board tools of OS X

While Time Machine is an on-board tool, I still will not address it here.

A few months back I found the following article on the Internet: “Tutorial: Backups with Launchd”.

With this I could start to implement my thoughts.

What is launchd?

Launchd is a system to start, stop and manage daemons, programs and shell scripts. In Mac OS X it replaced the good old init and various other old mechanisms. For our fellow Windows users, just a note: launchd (or init) is the first process the kernel starts when the computer boots. All other processes, including the graphical user interface you navigate with the mouse, are started after it. Let’s just ignore shells or the DOS command prompt here.

With launchd a lot can be done; for example, specific tasks can be started on a schedule or in response to an event. Somebody will think: hey, that’s exactly what cron is for. Yeah, right, I would expect that this can also be implemented with cron! I must admit, I do not know of any on-board Windows program that can perform such a task.

Launchd can also check for the existence or availability of specific volumes (e.g. a hard disc). That’s exactly what I used and what is described in the tutorial. Reading the comments on the tutorial at macresearch.org will give much more insight into what to do, so make sure you read them too!
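As a rough sketch of what such a launchd job description could look like, the following Python snippet generates a property list with the standard plistlib module. The label com.example.photobackup and the script path are hypothetical placeholders, and StartOnMount is only one documented launchd key for reacting to mounted volumes; the tutorial may well use a different mechanism.

```python
import plistlib


def build_backup_job(label, script_path):
    """Return a launchd job description as a plain dictionary."""
    return {
        # Unique name of the job (hypothetical placeholder here).
        "Label": label,
        # Command line to run: the shell, followed by the backup script.
        "ProgramArguments": ["/bin/sh", script_path],
        # Ask launchd to start the job whenever a filesystem is mounted.
        "StartOnMount": True,
    }


# Both values are assumptions made for illustration only.
job = build_backup_job("com.example.photobackup", "/Users/Shared/photobackup.sh")

# plistlib.dumps returns the XML property list as bytes.
plist_xml = plistlib.dumps(job).decode("utf-8")
print(plist_xml)
```

The resulting XML would be saved as a .plist file in ~/Library/LaunchAgents and loaded with launchctl. Since a job started this way runs on every mount, the script itself still has to check whether the mounted volume is one of the backup discs.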

Three external hard drives are used for the data backup:

– PhotoBackup01-10: Backup Disc for the days 1 to 10 of the month.
– PhotoBackup11-20: Backup Disc for the days 11 to 20 of the month.
– PhotoBackup21-31: Backup Disc for the days 21 to 31 of the month.
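The rotation above can be sketched in a few lines of Python. This is only an illustration of the day-to-disc mapping, not the script actually in use; the function names are made up, and /Volumes is assumed as the place where Mac OS X mounts external drives.

```python
import datetime
import os


def backup_volume_for_day(day):
    """Map a day of the month (1 to 31) to the name of its backup disc."""
    if not 1 <= day <= 31:
        raise ValueError("day of month must be between 1 and 31")
    if day <= 10:
        return "PhotoBackup01-10"
    if day <= 20:
        return "PhotoBackup11-20"
    return "PhotoBackup21-31"


def mounted_target(today=None, volumes_root="/Volumes"):
    """Return the path of today's backup disc, or None if it is not mounted."""
    if today is None:
        today = datetime.date.today()
    path = os.path.join(volumes_root, backup_volume_for_day(today.day))
    # os.path.ismount is False for missing paths and for ordinary folders.
    return path if os.path.ismount(path) else None
```

A backup script could call mounted_target() first and simply do nothing when the disc for the current period is not attached.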

For some of you this may seem to be overkill, but Chase Jarvis has two good reads about backup strategy, which strongly influenced my implementation:

How to back up your photography – the basics.
Important storage and backup solutions for your photography.

Also have a look at his video.

A minimal approach to backup can be found in the comment by Bayou Bill on the Discerning Photographer’s blog dated 6/30/2010.

Well, making a backup at all is a good and important idea. As a bare minimum, two separate externally connected hard drives can serve this purpose satisfactorily. This would match points 2) and 3) from the top of this article.

Via a shell script (a batch file in DOS jargon) the files are copied to the backup volumes.

To copy the files, the common command rsync is used, which is embedded in the mentioned shell script at macresearch.org. In principle it can be exchanged for a program of similar functionality. Originally I wanted to use cpdup, which is developed by the DragonFly BSD project. The lack of an official port to Mac OS X and my inability to compile a port myself made me use rsync. And yes, it would also have contradicted my point 3), using only on-board tools for the implementation. Since rsync can also copy files and folders across a network, the backup volumes do not need to be attached to the computer directly. I have not tested that capability.
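To illustrate the 1:1 mirroring, here is a minimal Python sketch that assembles an rsync command line. It is not the script from macresearch.org. The options -a (archive mode), --delete and --dry-run are standard rsync options, but whether the original script uses them is an assumption, and the function names and example paths are hypothetical.

```python
import subprocess


def rsync_mirror_command(source_dir, target_dir, dry_run=False):
    """Build the argument list for mirroring source_dir into target_dir."""
    # -a keeps the folder hierarchy, timestamps and permissions.
    # --delete removes files from the target that are gone from the source.
    command = ["rsync", "-a", "--delete"]
    if dry_run:
        # Only report what would be done, without changing anything.
        command.append("--dry-run")
    # The trailing slash on the source means "the contents of this folder".
    command.append(source_dir.rstrip("/") + "/")
    command.append(target_dir.rstrip("/") + "/")
    return command


def run_mirror(source_dir, target_dir, dry_run=True):
    """Run rsync and return its exit code (0 means success)."""
    command = rsync_mirror_command(source_dir, target_dir, dry_run)
    return subprocess.run(command, check=False).returncode
```

For example, run_mirror("/Volumes/Photos/RAW", "/Volumes/PhotoBackup01-10/RAW") would first show what rsync intends to do. Note that --delete also carries an accidental deletion over to the backup on the next run, which is one more reason to rotate several discs.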

Status messages are written to the system log file. This is not directly visible to a normal user, so access to an administrative account (aka root access) is mandatory for debugging.
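How such status messages might reach the system log can be sketched with Python’s standard syslog module. The message wording, the photobackup prefix and the function names are assumptions for illustration; the original shell script may log in a different way.

```python
import syslog


def status_message(volume, returncode):
    """Return (priority, text) describing the outcome of one backup run."""
    if returncode == 0:
        return syslog.LOG_INFO, "photobackup: backup to %s finished" % volume
    text = "photobackup: backup to %s failed (exit code %d)" % (volume, returncode)
    return syslog.LOG_ERR, text


def log_status(volume, returncode):
    """Write the outcome of a backup run to the system log."""
    priority, text = status_message(volume, returncode)
    syslog.syslog(priority, text)
```

Searching the log for the fixed prefix then makes it easier to find the backup entries among all the other system messages.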

Alongside all my other interests and work, it took me a couple of weeks to finally figure everything out and get it to run as desired.

Conclusion:

My implementation of a backup strategy is not the simplest. A little knowledge of shell commands and scripts as well as launchd is necessary. But all four of my points could be satisfied: the backup is started automatically, only on-board tools other than Time Machine are used, and the file system, i.e. how the files are organized in folders and their hierarchy, is mirrored. At present the backup system is functional, but not in an optimal state. This may change in the future.

To manage my photos I use Apple’s Aperture 3 (and yes, I did ignore its Vault feature). A backup is not started automatically after quitting the application, which would be desirable. As an alternative, a backup could be run before shutting down the computer. My best guess is that an AppleScript to start a backup after quitting Aperture, as well as launchd before the computer shuts down, would be able to perform the task. This has not been tried yet.

If I were not too lazy to swap the backup volumes according to their purpose, and always took the backup volume not in use to a safe place, then the complete backup strategy would be operational as intended.

With this I would like to finish this article, which is quite technical. I hope it will inspire some of you to go ahead and finally do a backup. Well, print your photographs; this is also some sort of backup, right? But I guess that will be another story.

If you have further hints, tips, or similar or different experiences with the topic of backup, please leave a comment. Olaf and the author will certainly welcome them very much!

It only remains for me to thank you for your patience in reading this lengthy article. Also a big thank you to Olaf for the encouragement to publish this article on his blog.

About the author:

Bernd is a hobby photographer; his photographs can be seen on his homepage and on flickr. Besides shooting, he likes to climb and is also a mountaineer, as can be seen from his photographs. To manage stills and short videos he uses Apple’s Aperture 3, and most of the post-processing is done with Aperture as well. Besides Aperture, Pixelmator, Photoshop Elements 8 and Hugin (for panoramic photographs) populate his Applications folder.
