Scheduled backups in Linux (Ubuntu 9.04)
by Adam on Sep.23, 2009, under Linux, Ubuntu 9.04
This tutorial uses Ubuntu 9.04 for its examples, but these tactics will work with most if not all Linux distros. You will, however, need root access to the machine you are running all of this on.
Backing up data is nothing new, yet it is often overlooked. Sad really, since it’s relatively easy to do in Linux. (Windows too, but we’re not discussing Windows here :) )
You will not need to install any software to get this working in Ubuntu 9.04, and you probably won’t need to for any other mainstream Linux distro. We’re going to be using the tar command and vixie-cron, both of which come bundled with Jaunty (Ubuntu 9.04).
Alright, let’s get started. The tar command in Linux is used to store files in, and extract files from, a tarfile. Essentially it’s just a container for all of the other files. By default a tarfile is not compressed in any way, but you can turn on compression by passing an option to the tar command, the ‘z’ option to be precise. This runs the archive through gzip, so you end up with a gzip-compressed tarfile (usually given a .tar.gz extension). Smaller file size, more backups that fit on your backup media :)
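If you want to see the difference the ‘z’ option makes, here is a rough sketch you can run as a normal user. It is only a demo under my own assumptions: the scratch directory and the file name repeat.txt are made up for illustration, and real-world savings depend entirely on what kind of files you are archiving.

```shell
#!/bin/sh
# Hypothetical scratch directory; nothing here touches real data.
workdir=$(mktemp -d)
mkdir "$workdir/data"

# Build a highly repetitive text file so the effect of compression is obvious.
i=0
while [ "$i" -lt 5000 ]; do
    echo "the same line of text over and over"
    i=$((i + 1))
done > "$workdir/data/repeat.txt"

# Same directory archived twice: once plain, once with gzip compression (z).
tar -cf  "$workdir/plain.tar"   -C "$workdir" data
tar -czf "$workdir/gzip.tar.gz" -C "$workdir" data

# wc -c reports the size in bytes; $(( )) strips any padding wc adds.
plain_size=$(($(wc -c < "$workdir/plain.tar")))
gzip_size=$(($(wc -c < "$workdir/gzip.tar.gz")))
echo "plain: $plain_size bytes, gzipped: $gzip_size bytes"

rm -rf "$workdir"
```

Text like this compresses extremely well; already-compressed files such as JPEGs or MP3s will shrink very little.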
The command to ‘tar’ a directory into a backup is as follows. Here is the tar command from my own backup:
    tar -czpf /mnt/mybook/backups/homedir.tar.gz /home/adam
To break down the above command: the tar portion is self-explanatory, since that is where we invoke the tar command, and -czpf is the options section. ‘c’ tells tar we are creating an archive, as opposed to ‘x’, which would extract from an existing archive. ‘z’ is the option to use gzip compression. ‘p’ asks tar to preserve file permissions (tar records permissions in the archive either way; the option mainly matters when extracting). And ‘f’ tells tar that we wish to store the archive in a file we will specify. /mnt/mybook/backups/ is a directory on a mounted network drive on my home network where I store the backups, homedir.tar.gz is the filename of the archive, and /home/adam is the directory I am backing up.
If you run that command in your terminal right now, after changing the directories of course, it will create an archive of your home directory, permissions included, wherever you tell it to. Something important to remember is that the destination goes first, and THEN the directory or file you wish to back up.
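Before pointing tar at anything important, it can help to try the destination-first pattern on a throwaway directory and then list what ended up in the archive. The sketch below is just an illustration: the scratch directory, src, and notes.txt are made-up names, and the ‘t’ (list) and -C (change directory) options are extras that the backup itself doesn’t need.

```shell
#!/bin/sh
# Hypothetical scratch area; nothing here touches real data.
workdir=$(mktemp -d)
mkdir -p "$workdir/src"
echo "hello" > "$workdir/src/notes.txt"

# Destination archive first, then the directory to back up.
# -C makes tar change into $workdir so the archive stores the relative path "src".
tar -czpf "$workdir/demo.tar.gz" -C "$workdir" src

# 't' lists the archive contents instead of creating or extracting.
listing=$(tar -tzf "$workdir/demo.tar.gz")
echo "$listing"

rm -rf "$workdir"
```

If the listing shows the files you expected, the same pattern with your real paths should behave the same way.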
Another example: let’s say you wanted to back up your log directory to a backup directory you created at the root of the filesystem, /backups/.
    tar -czpf /backups/varlog.tar.gz /var/log
This will create a file called varlog.tar.gz in the directory /backups/, and fill it with the contents of your /var/log directory. This is the reason you will need to run this command as root: on a secure system, your regular user will lack permission to read all of the files in /var/log, and in many other directories on your Ubuntu install. This could cause your tarfile to be missing some files, making your backup fairly useless.
So, now we know HOW to make a backup, but what about scheduling it so your machine automatically backs up your files without your intervention? This is where cron comes in. Cron was very intimidating to me as a new Linux user, and I could write a whole mile-long tutorial on just cron, but for the purposes of this tutorial, we’ll just stick to the basics.
You first want to become root, so type
    su
and then enter your system’s root password when it asks. If you have not yet enabled your root account, here is a quick way to do so: http://www.adamsmash.com/?p=266. The reason you want to become root is that we’re going to edit root’s cron table. Root will have no permission issues backing up any directories or running any scripts, so root’s cron is where we’re going to create the cron job to call our backup.
You will now be in a root terminal. From here, type:
    crontab -e
This will allow you to edit the root cron table. Ubuntu does an OK job of explaining what all the *s mean. Essentially there are 5 time fields, and then a command goes at the end. Cron checks those fields every minute and runs the command whenever the current time matches.
The 5 asterisks correspond to 5 different time entries. First is Minute, Second is Hour, Third is Day of Month, Fourth is Month, and Fifth is Day of Week.
A cron entry of:
    * * * * * FOO
would run the command FOO every minute, of every hour, of every day, of every month, all week long for the rest of eternity :) You do not want to leave them all as stars. In a cron entry, the * is a wildcard meaning “every value” for that field.
A better example:
    1 * * * * FOO
This cron entry would run the command every hour on the 1st minute after that hour. So 1:01, 2:01, 3:01 and so on, and it would do it every hour, every single day.
    1 3 * * * FOO
This cron entry would run the command every day at 3:01 am. (The hour is in 24-hour time, so 15 would be 3 pm.)
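To make the field order concrete, here is a small sketch that pulls an entry apart and labels each piece. It assumes the 3:01 am entry `1 3 * * * FOO`, where FOO is just a placeholder command, and the label names are my own. It is a learning aid only, not how cron itself parses the table.

```shell
#!/bin/sh
# Example entry; FOO is a placeholder for a real command.
entry='1 3 * * * FOO'

# Turn off filename expansion so the * fields are not replaced by file names.
set -f
# Split the entry on whitespace into positional parameters $1..$6.
set -- $entry
set +f

fields="minute=$1 hour=$2 day_of_month=$3 month=$4 day_of_week=$5 command=$6"
echo "$fields"
```

Note that this only works for a one-word command like FOO; a real command with arguments would spill into $7 and beyond.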
So now that we’ve got that squared away, we need to pick a time for backups. Depending on when you sleep, or when your computer will not be in use, you’ll need to figure out what time to tell cron to execute your backup.
For me, I chose midnight to do mine. I’m well asleep by then, and my PC is not doing much else, so it’s a perfect time. The cron entry to run something at midnight every single day is:
    0 0 * * * FOO
The first 0 tells cron to run on the 0th minute, and the second 0 is the zeroth hour, which is 12:00 am. Note that 24 is not a valid hour: the hour field only runs from 0 to 23, so midnight has to be written as 0.
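Putting the two halves of this tutorial together, the line in root’s crontab would look something like what the sketch below prints. The paths are the ones from my own tar example, so treat them as placeholders and swap in your own; the variable names are only there for illustration.

```shell
#!/bin/sh
# Schedule: minute 0, hour 0, every day of month, every month, every day of week.
schedule='0 0 * * *'
# Backup command from the tar example; adjust both paths for your own machine.
backup_cmd='tar -czpf /mnt/mybook/backups/homedir.tar.gz /home/adam'

# This is the single line that would be pasted into the editor opened by crontab -e.
cron_line="$schedule $backup_cmd"
echo "$cron_line"
```

Keep in mind that this overwrites the same homedir.tar.gz every night, so you only ever have the most recent backup.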
Now you have a good foundation for scheduling backup jobs using cron. If anything is unclear, please feel free to comment and I’ll clear it up as best I can. :)