4.4. Automating Routine Tasks

Computers are intended to do things for us. Yet, we spend much of our computing lifetimes repeatedly performing the same tasks, many of which are, in the end, rather mindless. System administration can sometimes become a chore chock-full of tedium and routine: the same keystrokes, the same checks, the same results. Wouldn't it be ideal if we were able to put the computer to its intended use, and have it do these things for us?

Linux provides an abundance of ways to automate tasks. Shell scripts can help, but the real power lies in two particular facilities: cron and, to a lesser extent, at. These are tools you'll rely upon to perform such routine tasks as archiving logs, updating the system database with newly installed applications, writing server statistics, and much, much more.

4.4.1. cron

cron is a Linux system daemon that executes scheduled scripts. The cron daemon, or crond, runs as a service that starts when your system starts. Every minute, it checks its own schedule database for tasks that need to be performed. The heart of cron is the /etc/crontab file, which is shown below from the default, unaltered Fedora Core installation:

/etc/crontab
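For reference, a stock /etc/crontab on Fedora Core releases of this era typically looks like the following. Treat the exact run times and PATH value as assumptions, and compare them against the file on your own system.

```
SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
HOME=/

# run-parts
01 * * * * root run-parts /etc/cron.hourly
02 4 * * * root run-parts /etc/cron.daily
22 4 * * 0 root run-parts /etc/cron.weekly
42 4 1 * * root run-parts /etc/cron.monthly
```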
The crontab file first defines some environment variables: the shell in which the tasks will run, the PATH environment variable, the system account to which mail notifications will be sent, and the home directory to use. After the run-parts comment, the command schedule is listed.

4.4.2.1. The crontab Command Schedule Syntax

Believe it or not, a single line in this file provides all the information required for the system to perform a full set of tasks. In order to understand it, we need to break this line out into fields.

Note: cron always deals in 24-hour time: 7:24 means 7:24 a.m., and 19:24 means 7:24 p.m.
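One way to see the fields is to split a schedule line apart in the shell. The sketch below assumes a typical system crontab entry (the sample line is illustrative, not taken from your machine) and uses the shell's read built-in to name each field in order: minute, hour, day of month, month, day of week, the account that runs the job, and the command itself.

```shell
#!/bin/sh
# A sample /etc/crontab entry (assumed for illustration; check your own file).
entry='02 4 * * * root run-parts /etc/cron.daily'

# read splits on whitespace; the last variable receives the rest of the line,
# so the command keeps its own arguments.
read -r minute hour day_of_month month day_of_week user command <<EOF
$entry
EOF

echo "minute:       $minute"        # 0-59
echo "hour:         $hour"          # 0-23 (24-hour time)
echo "day of month: $day_of_month"  # 1-31, or * for every day
echo "month:        $month"         # 1-12, or * for every month
echo "day of week:  $day_of_week"   # 0-7 (0 and 7 are both Sunday), or *
echo "run as:       $user"          # this field exists only in the system crontab
echo "command:      $command"
```

Read this way, the sample entry runs run-parts /etc/cron.daily as root at 4:02 every day.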
4.4.2.2. Adding to crontab

Let's put this to more practical use with the scenario mentioned back in Chapter 3. You may remember that we introduced the power of the command line with the command find /var/backup/* -ctime +5 -exec rm {} \;, which I run three times a week to remove backups that are more than five days old. This automated process is powered by cron, so let's take a look at how this is configured on my system:

/etc/crontab
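Going by the schedule described in this section, the entry would look something like the line below. The root account in the sixth field is an assumption (every system crontab entry names the user that runs the command), as is the exact spacing.

```
12 5 * * 2,4,6 root find /var/backup/* -ctime +5 -exec rm {} \;
```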
The first two fields tell us that this task will be run at 5:12. For purity's sake, I've scheduled the task to run at a time when I'm not using the system, though it probably won't make much of a difference in terms of performance. The next two fields are filled with asterisks, meaning the task will run regardless of the date. The potentially interesting value here, though, is the fifth "day of week" field, which has the value 2,4,6: this means that the task is scheduled for Tuesday, Thursday, and Saturday. crontab also allows for ranges: 1-5 in the fifth field would schedule a task to run Monday to Friday.

cron can just as easily execute shell scripts. Let's create a script named backup.sh to actually create these backups. Save this file in the /home/username/bin directory (you may need to create this directory if you haven't done so already).

~/bin/backup.sh
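A minimal sketch of what such a script might look like follows. It is hypothetical: the function name make_backup, the choice of /var/log/httpd as the only source, and the guard at the bottom are assumptions made for illustration, based on the behavior described in this section (a directory named after the current date and time is created under /var/backup, and the web server logs are copied into it).

```shell
#!/bin/sh
# backup.sh: hypothetical sketch of a timestamped backup script.

# make_backup SOURCE_DIR BACKUP_ROOT
# Copies everything in SOURCE_DIR into a new directory under BACKUP_ROOT
# named after the current date and time (YYYYMMDDHHMM), and prints its path.
make_backup() {
    source_dir=$1
    backup_root=$2
    stamp=$(date +%Y%m%d%H%M)
    target="$backup_root/$stamp"

    mkdir -p "$target" || return 1
    cp -R "$source_dir"/* "$target"/ || return 1
    echo "$target"
}

# Run the backup only when this file is executed under the name backup.sh,
# so the function can also be sourced and tried out against other paths.
case "$0" in
    *backup.sh)
        make_backup /var/log/httpd /var/backup || exit 1
        ;;
esac
```

Keeping the copy logic in a function makes it easy to try against a scratch directory first, for example make_backup /tmp/sample /tmp/backup-test, before pointing it at real data.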
The above script achieves several tasks:
Before we go ahead and run this script, we need to create the /var/backup directory, grant everyone write access to this directory, and make the backup.sh script executable:

[kermit@swinetrek ~]$ su
Password:
[root@swinetrek kermit]# mkdir /var/backup
[root@swinetrek kermit]# chmod a+w /var/backup
[root@swinetrek kermit]# exit
exit
[kermit@swinetrek ~]$ chmod u+x ~/bin/backup.sh
[kermit@swinetrek ~]$

Let's test the script before we modify the crontab file to ensure that it works as we think it should:

[kermit@swinetrek ~]$ backup.sh
cp: cannot stat `/var/log/httpd/*': Permission denied
[kermit@swinetrek ~]$

We get this error message because we don't have access to the /var/log/httpd directory; we need to run this script as root:

[kermit@swinetrek ~]$ su
Password:
[root@swinetrek kermit]# backup.sh
[root@swinetrek kermit]# exit
exit
[kermit@swinetrek ~]$ ls /var/backup
200612312359
[kermit@swinetrek ~]$

In the file listing, we can see that a directory has been created with the current date and time. Delve deeper into this directory until you're satisfied that the script is working as expected.

The last step in the process is to add the execution of the script to the crontab file on the schedule we've already defined:

/etc/crontab
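On the same Tuesday, Thursday, and Saturday schedule, the entry would look something like this. The full path assumes the script lives in the home directory of the kermit account used in the examples, and root is named as the user because the script needs to read /var/log/httpd; adjust both for your own system.

```
12 5 * * 2,4,6 root /home/kermit/bin/backup.sh
```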
This entry differs from the original only in that it executes a script rather than executing a lone command. Because the script can contain commands and logic, it's a sound approach to solving more complex routine operations.

4.4.2.3. Using the /etc/cron.schedule Directories

We've already discussed what the default crontab entries do. The first line runs the command run-parts /etc/cron.hourly every hour: run-parts will go into the /etc/cron.hourly directory and execute every executable file it finds there. Entries also exist for cron.daily, cron.weekly, and cron.monthly directories. If you prefer, you can simply store your backup.sh script in the /etc/cron.daily directory. It will run with the other daily scripts.

4.4.2. Anacron

Anacron is, to some extent, an extension of cron. Like cron, it's intended to execute commands on a schedule, taking care of routine tasks. However, unlike cron, Anacron makes no assumption that the machine is up and running 24/7. In that sense, Anacron provides some measure of redundancy to the functions of cron.

Like other Linux applications, Anacron gets its direction from a text configuration file. In Fedora, this file is /etc/anacrontab. At first glance, the anacrontab file looks less daunting than crontab, even though we know how simple the crontab file can actually be.

/etc/anacrontab
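A representative /etc/anacrontab from Fedora Core releases of this era is sketched below. The delay values and the PATH setting are assumptions recalled from typical installations, so compare them against the file on your own system.

```
# /etc/anacrontab: configuration file for anacron

SHELL=/bin/sh
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin

1       65      cron.daily      run-parts /etc/cron.daily
7       70      cron.weekly     run-parts /etc/cron.weekly
30      75      cron.monthly    run-parts /etc/cron.monthly
```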
As in crontab, the first few lines of this file define some environment variables: in this case, SHELL and PATH. The remaining lines describe, over a number of fields, the jobs that Anacron must carry out. From left to right, these fields are as follows:
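As a quick illustration, a job line can be split into its four fields in the shell: the period in days, the delay in minutes, a job identifier, and the command to run. The sample line is assumed for the example rather than copied from a particular system.

```shell
#!/bin/sh
# A sample anacrontab job line (assumed for illustration).
job='1 65 cron.daily run-parts /etc/cron.daily'

# read splits on whitespace; the last variable receives the rest of the line.
read -r period delay job_id command <<EOF
$job
EOF

echo "period:  $period day(s)"    # how often the job should run
echo "delay:   $delay minute(s)"  # how long Anacron waits before starting it
echo "job id:  $job_id"           # name under which the last run date is recorded
echo "command: $command"
```

In this sample, the daily scripts are due once a day, and Anacron waits 65 minutes after it starts before running them.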
The operation of Anacron is pretty straightforward. When run, it reads the list of jobs from /etc/anacrontab and checks whether or not each job has been run in the last specified number of days. If not, Anacron runs the job after waiting for the delay period. If the job has been run in the specified time period, it leaves it alone. It's pretty simple.

We can start the Anacron daemon using one of the service tools we looked at earlier in this chapter. By default, it should be set up to start with your machine, but you can change this setting if you like.

Note: By default, both cron and Anacron are configured to run all of the scripts in the /etc/cron.schedule directories. However, only one of cron or Anacron will actually run the scripts. Each of these directories contains a script to keep Anacron up to date. For example, whenever cron runs the scripts in /etc/cron.daily, one of those scripts updates the file that Anacron uses to record when the task was last run. Later, when Anacron goes to run these scripts, it will see that cron has already run them, so it won't run them again.

4.4.3. at

Now, we've got cron to perform regularly scheduled tasks on your system. We've got Anacron to pick up cron's slack if the machine isn't up and running 24/7. That seems like a pretty full complement of task scheduling methods, doesn't it? As true as that may be, we've still left one piece out of the automated task puzzle: at.

at is a classic Linux hack, intended to take up where other applications leave off. In the cases of both cron and Anacron, it's not a trivial task to add a simple, one-off task to the schedule. Let's say, for example, that you need to download a very large file, and you want to do it at a time when there's no one else on the network, so plenty of bandwidth is available. You could add an entry to /etc/crontab, but you'd have to remember to remove it in order to avoid downloading the same file again in the future. at serves this niche purpose perfectly.
Better yet, the syntax for using at couldn't be simpler:

[kermit@swinetrek ~]$ at 3:00
at> wget http://sitepoint.com/verylargefile.zip
at> <EOT>
job 1 at 2005-12-31 03:00
[kermit@swinetrek ~]$

Note: <EOT> stands for end of transmission, and is triggered by hitting Ctrl-D. Use this to indicate that you have finished entering commands to be executed at the given time.

at schedules the task for the next instance of the time you specify. In the above example, at would execute the given command at 3:00 a.m.

Note: Remember that at, like cron and Anacron, uses 24-hour time.

In summary, if you're looking to schedule tasks, Linux has you well covered. You have cron and Anacron for repeating tasks, and at for those one-off, occasional jobs.