Hack 90 Using cron to Automate Tasks

Run scripts on a recurring schedule with the cron utility.

There will come a time when you've created a script so perfect for your day-to-day life that it becomes absolutely imperative to run on a regular basis. Sure, you could run it manually during your morning routine, but if you can automate the retrieval of data with scraping, why not automate the execution too?

Meet cron, a Unix utility whose life revolves around running things every minute, hour, day, week, month, or year. Give it a command or script and a schedule, and let it go. Each user on the system can automate their own tasks with no restrictions: hear the date spoken every minute, have a backup performed every three days at 12:15, or automatically open your email every day at 7:00 A.M. and then again at 6:00 P.M. Whatever your scheduling needs, cron will satisfy them.

If you're running any flavor of Unix (including Mac OS X), you already have cron, and it's been running ever since you turned on your computer. To begin adding automated tasks, edit your personal crontab file, which just keeps track of which tasks you want to run and when. Type crontab -e in your shell to begin editing your schedule. Here are a few example crontab entries:

 0 7 * * Mon-Fri echo "Whoo, I rule!" > /dev/null
 0 18 * * Mon-Fri echo "Ok, one more time for good luck." > /tmp/lucky
 0 12 * * 0,6 echo "Alright, now this is superfluous." >> /tmp/lucky

These three entries perform gloriously pointless tasks: one every weekday morning at 7:00 A.M., another at 6:00 P.M. the same day, and a "worst . . . example . . . ever!" note at noon on weekends. The order of the fields soon becomes easy to remember: minute, hour, day of month, month, day of week, and then the command to run. A * matches every possible value for that field (every minute, every hour, and so on).
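To make the field order concrete, here is a minimal sketch that splits an entry into its five schedule fields and the command. The describe_entry function is a hypothetical helper written for this illustration; it is not part of cron, and it assumes a plain five-field entry with no special strings or environment lines.

```shell
#!/bin/sh
# describe_entry ENTRY
# Hypothetical helper: label the parts of one crontab entry.
describe_entry() {
    printf '%s\n' "$1" | {
        # read splits on whitespace; the last variable receives the
        # rest of the line, so the command keeps its own spaces.
        read -r minute hour dom month dow command
        printf 'minute: %s\n' "$minute"
        printf 'hour: %s\n' "$hour"
        printf 'day of month: %s\n' "$dom"
        printf 'month: %s\n' "$month"
        printf 'day of week: %s\n' "$dow"
        printf 'command: %s\n' "$command"
    }
}

describe_entry '0 7 * * Mon-Fri echo "Whoo, I rule!" > /dev/null'
```

Run against the first example entry, the sketch reports 0 for the minute, 7 for the hour, * for both day of month and month, Mon-Fri for the day of week, and everything after that as the command.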

For example, to mirror your web site every Wednesday morning, you could do this:

 # back up gamegrene.com every Wednesday at 7:06 A.M.
 6 7 * * 3 cd /path/to/backup && wget -m http://gamegrene.com/

If you're a media file collector, you might be running some combination of the following hacks: [Hack #36]. A crontab set to run these automatically on a regular basis might look something like this:

 # download new POP3 mail. this assumes you've set up a
 # POP account solely for attachments (such as a bunch of
 # Yahoo! mailing lists), and have modified leechpop
 # to delete email messages after they're downloaded.
 # we start at 5:30 so that when I get home at 6:00,
 # I'll have a nice collection downloaded and waiting.
 30 17 * * Mon-Fri perl leechpop.pl -u morbus -p secret -s mail

 # download 2 megs worth of dailywav.com, weekly.
 50 23 * * Sun wget -q -A.wav -Q2m -m http://dailywav.com/

 # grab any new matches for "kittens" on webshots. I
 # check these once a week when I awake on Saturday.
 15 7 * * Sat perl webshots.pl --max=40 "kittens"

On the other hand, if you're a marketing analyst, you might be running a combination of [Hack #62]. The following crontab runs at staggered times before you get in to work, ready for your reading pleasure after you've had a chance to grab that first cup of coffee:

 # run a search for our product name to get counts.
 # we first prepend the current date for archiving.
 29 8 * * Mon-Fri date >> ourproduct.txt
 30 8 * * Mon-Fri perl scattersearch.pl "product name" >> ourproduct.txt

 # run an Alexa report every week to see
 # related products and websites for our own.
 15 8 * * Mon perl alexa.pl http://oursite.com/ related.html

 # graph our book's rank over at Amazon.
 5 0 * * * perl grabrank.pl
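The date-then-search pair above depends on two entries firing a minute apart. As an alternative sketch, a small wrapper can write the date and the command's output in one step, so a single crontab entry does both. The script name logrun.sh and the log_run function are assumptions made for this illustration; they are not part of the original hack.

```shell
#!/bin/sh
# logrun.sh (hypothetical name): append the current date, then the
# output of a command, to a log file.
# usage: logrun.sh FILE COMMAND [ARGS...]
log_run() {
    log_file=$1
    shift
    {
        date      # datestamp line for archiving
        "$@"      # the command whose output we want to keep
    } >> "$log_file"
}

# Only run when called with a file and a command.
if [ "$#" -ge 2 ]; then
    log_run "$@"
fi

# A crontab entry using it might look like this (paths are placeholders):
# 30 8 * * Mon-Fri sh /path/to/logrun.sh ourproduct.txt perl scattersearch.pl "product name"
```

The trade-off is one more file to maintain in exchange for a datestamp that always sits directly above the output it belongs to.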

See Also

  • man cron, man crontab, and man 5 crontab. The last entry is probably the most useful for learning about the available syntaxes (including ranges, step values, and more).
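As a rough illustration of the range and step syntax that man 5 crontab documents, the sketch below expands a numeric field into the values it would match. The expand_field function is a hypothetical helper, not a cron feature; it assumes numeric input only and does not handle names such as Mon or Jan.

```shell
#!/bin/sh
# expand_field FIELD LOW HIGH
# Hypothetical helper: print the values a numeric cron field matches.
# Handles *, single numbers, a-b ranges, comma lists, and /step.
# The ( ) body runs in a subshell, so set -f does not leak out.
expand_field() (
    low=$2
    high=$3
    out=
    set -f    # keep the shell from globbing a bare "*"
    for part in $(printf '%s' "$1" | tr ',' ' '); do
        # Separate an optional "/step" suffix; the step defaults to 1.
        case $part in
            */*) step=${part#*/}; range=${part%%/*} ;;
            *)   step=1; range=$part ;;
        esac
        # "*" spans the whole field, "a-b" a range, "n" a single value.
        case $range in
            '*') from=$low; to=$high ;;
            *-*) from=${range%-*}; to=${range#*-} ;;
            *)   from=$range; to=$range ;;
        esac
        i=$from
        while [ "$i" -le "$to" ]; do
            out="$out $i"
            i=$((i + step))
        done
    done
    printf '%s\n' "${out# }"
)

expand_field '*/15' 0 59     # minutes field: 0 15 30 45
expand_field '9-17' 0 23     # hours field: 9 10 11 12 13 14 15 16 17
expand_field '1,15' 1 31     # day of month field: 1 15
expand_field '0-23/6' 0 23   # hours field: 0 6 12 18
```

So an entry whose minute field is */15 would fire at minutes 0, 15, 30, and 45 of every matching hour.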



Spidering Hacks
ISBN: 0596005776
EAN: 2147483647
Year: 2005
Pages: 157