Much everyday scripting involves working with files and directories, including entire subtrees of files. Much of the relevant material has already been covered in Chapter 4, but we will hit a few high points here.

Because I/O is fairly system-dependent, many tricks will vary from one operating system to another. If you are in doubt, either consult a reference or resort to experimentation.

A Few Words on Text Filters

Many tools that we use every day (both vendor-supplied and home-grown) are simply text filters; that is, they accept textual input, process or transform it in some way, and output it again. Classic examples of text filters in the Unix world are sed and tr, among others.

Sometimes a file is small enough to be read into memory. This allows processing that might otherwise be difficult.

    # File.readlines opens the file, reads every line, and closes it again
    lines = File.readlines(filename)
    # Manipulate as needed...
    lines.each { |x| puts x }

At other times, we'll need to process a file one line at a time.

    IO.foreach(filename) do |line|
      # Manipulate as needed...
      puts line
    end

Finally, don't forget that any filenames on the command line are automatically gathered into ARGF, representing a concatenation of all input. (See the section "Working with ARGF.") In this case, we can use calls such as ARGF.readlines just as if ARGF were an IO object. All output would go to standard output as usual.

Copying a Directory Tree (with Symlinks)

Suppose that you want to copy an entire directory structure to a new location. There are various ways of doing this. But what if the tree has internal symbolic links? This becomes a little more difficult.

Listing 8.5 shows a recursive solution with a little user-friendliness added in. It checks the most basic error conditions and prints a usage message.

Listing 8.5 Copy Tree

    require "fileutils"   # ftools (and File.copy) are gone as of Ruby 1.9

    def recurse(src, dst)
      Dir.mkdir(dst)
      Dir.foreach(src) do |e|
        # Don't bother with . and ..
        next if [".", ".."].include? e
        fullname = src + "/" + e
        newname = fullname.sub(Regexp.new(Regexp.escape(src)), dst)
        # Test for symlinks first: directory? and file? follow links
        if FileTest::symlink?(fullname)
          linkname = File.readlink(fullname)
          newlink = linkname.dup
          n = newlink.index($oldname)
          # Skip links whose target does not contain the old tree's path
          next if n == nil
          n2 = n + $oldname.length - 1
          newlink[n..n2] = $newname
          newlink.sub!(/\/\//, "/")
          # Alternative: newlink = linkname.sub(Regexp.new(Regexp.escape(src)), dst)
          File.symlink(newlink, newname)
        elsif FileTest::directory?(fullname)
          recurse(fullname, newname)
        elsif FileTest::file?(fullname)
          FileUtils.cp(fullname, newname)
        else
          puts "??? : #{fullname}"
        end
      end
    end

    # "Main"
    if ARGV.size != 2
      puts "Usage: copytree oldname newname"
      exit
    end

    oldname = ARGV[0]
    newname = ARGV[1]

    if ! FileTest::directory?(oldname)
      puts "Error: First parameter must be an existing directory."
      exit
    end

    if FileTest::exist?(newname)
      puts "Error: #{newname} already exists."
      exit
    end

    oldname = File.expand_path(oldname)
    newname = File.expand_path(newname)

    $oldname = oldname
    $newname = newname

    recurse(oldname, newname)

There are probably Unix variants whose cp -R option preserves symlinks, but none that we were using. Listing 8.5 was actually written to address that need in a real-life situation.

Deleting Files by Age or Other Criteria

Imagine that you want to scan through a directory and delete the oldest files. This directory might be some kind of repository for temporary files, log files, browser cache files, or similar data.

Here we present a little code fragment that will remove all the files older than a certain timestamp (passed in as a Time object):

    def delete_older(dir, time)
      save = Dir.getwd
      Dir.chdir(dir)
      Dir.foreach(".") do |entry|
        # We're not handling directories here
        next if File.stat(entry).directory?
        # Use the modification time
        if File.mtime(entry) < time
          File.unlink(entry)
        end
      end
      Dir.chdir(save)
    end

    delete_older("/tmp", Time.local(2001,3,29,18,38,0))

This is nice, but let's generalize it. Let's make a similar method called delete_if that takes a block which will evaluate to true or false. Let's then delete the file only if it fits the given criteria.

    def delete_if(dir)
      save = Dir.getwd
      Dir.chdir(dir)
      Dir.foreach(".") do |entry|
        # We're not handling directories here
        next if File.stat(entry).directory?
        if yield entry
          File.unlink(entry)
        end
      end
      Dir.chdir(save)
    end

    # Delete all files over 3000 bytes
    delete_if("/tmp") { |f| File.size(f) > 3000 }

    # Delete all LOG and BAK files (the \. keeps names like "catalog" safe)
    delete_if("/tmp") { |f| f =~ /\.(log|bak)$/i }

Determining Free Space on a Disk

Suppose that you want to know how many bytes are free on a certain device. We present here a very crude way of doing this, by running a system utility:

    def freespace(device=".")
      lines = %x(df -k #{device}).split("\n")
      # In typical df -k output the columns are: filesystem, total blocks,
      # used, available. The fourth field (index 3) is the free space
      # in 1K blocks.
      n = lines.last.split[3].to_i * 1024
    end

    puts freespace("/tmp")    # 16772204544

Better ways of doing this might exist. Sometimes the better they are, the more system-dependent they are.

So that Windows users won't feel left out, we offer an equally ugly solution for them.

    def freespace(device=".")
      lines = %x(cmd /c dir #{device}).split("\n")
      n = lines.last.split[2].delete(",").to_i
    end

    puts freespace("C:")    # 5340389376

This code fragment assumes that the free space reported by dir is given in bytes (which isn't true for all variants of Windows).
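
Both fragments above depend on picking one field out of a utility's output, which is the fragile part. As a minimal sketch for the Unix case, the parsing can be separated from the command invocation so that it can be checked against sample text. This assumes a df that accepts the POSIX -P flag, which keeps each filesystem on a single line with the available space in the fourth field. The method names available_bytes and freespace_posix are hypothetical and are not part of the original fragments.

```ruby
require "shellwords"

# Hypothetical helper: pull the "Available" field out of `df -Pk` output.
# Assumes the POSIX layout: filesystem, total blocks, used, available, ...
def available_bytes(df_output)
  last_line = df_output.lines.last
  raise ArgumentError, "empty df output" if last_line.nil?
  fields = last_line.split
  # Integer() raises on anything that is not a number, unlike to_i,
  # which would quietly return 0 for unexpected text.
  Integer(fields[3]) * 1024
end

# Hypothetical wrapper: run df on a path and parse the result.
def freespace_posix(path = ".")
  output = `df -Pk #{Shellwords.escape(path)}`
  raise "df failed for #{path}" unless $?.success?
  available_bytes(output)
end

# Sample text in the assumed layout (made-up numbers, not real output)
sample = <<~DF
  Filesystem 1024-blocks Used Available Capacity Mounted on
  /dev/sda1         8192 6144      2048      75% /
DF

puts available_bytes(sample)    # prints 2097152
```

Calling freespace_posix("/tmp") would then run the real utility. Raising on unexpected output is a design choice in this sketch; it trades the brevity of the original for an early failure when the output format differs.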