14.8. Working with Files, Directories, and Trees

A broad area of everyday scripting is working with files and directories, including entire subtrees of files. Much of the relevant material has already been covered in Chapter 4, "Internationalization in Ruby," but we will hit a few high points here.

Because I/O is a fairly system-dependent thing, many tricks will vary from one operating system to another. If you are in doubt, you should resort to experimentation.

14.8.1. A Few Words on Text Filters

Many tools that we use every day (both vendor-supplied and home-grown) are simply text filters; that is, they accept textual input, process or transform it in some way, and output it again. Classic examples of text filters in the UNIX world are sed and tr, among others.

Sometimes a file is small enough to be read into memory. This allows processing that might otherwise be difficult.

    # Read every line into an array; the file is closed for us
    lines = File.readlines(filename)
    # Manipulate as needed...
    lines.each { |x| puts x }

Sometimes we'll need to process it a line at a time.

    IO.foreach(filename) do |line|
      # Manipulate as needed...
      puts line
    end

Finally, don't forget that any filenames on the command line are automatically gathered into ARGF, representing a concatenation of all input. (See section 14.2.2, "Working with ARGF.") In this case, we can use calls such as ARGF.readlines just as if ARGF were an IO object. All output would go to standard output as usual.

14.8.2. Copying a Directory Tree (with symlinks)

Suppose that you want to copy an entire directory structure to a new location. There are various ways of doing this operation, but if the tree has internal symbolic links this becomes more difficult.

Listing 14.5 shows a recursive solution with a little added user-friendliness. It is smart enough to check the most basic error conditions and also print a usage message.

Listing 14.5. Copying a Directory Tree
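The body of the listing does not appear at this point in the text, so what follows is only a minimal sketch of the recursive approach described above, not the original code. The method name copy_tree is hypothetical, and the sketch assumes the tree contains only regular files, directories, and symbolic links (no devices or named pipes).

```ruby
require 'fileutils'

# Hypothetical helper (not the book's listing): recursively copy src to dest,
# recreating symbolic links as links instead of copying what they point to.
def copy_tree(src, dest)
  # Check the most basic error conditions first
  raise ArgumentError, "#{src} is not a directory" unless File.directory?(src)
  if File.exist?(dest) || File.symlink?(dest)
    raise ArgumentError, "#{dest} already exists"
  end

  FileUtils.mkdir_p(dest)
  Dir.foreach(src) do |entry|
    next if entry == "." || entry == ".."
    from = File.join(src, entry)
    to   = File.join(dest, entry)
    if File.symlink?(from)
      # Test for a link before testing for a directory,
      # because File.directory? follows links
      File.symlink(File.readlink(from), to)
    elsif File.directory?(from)
      copy_tree(from, to)
    else
      # preserve: true keeps modification time and permissions where possible
      FileUtils.cp(from, to, preserve: true)
    end
  end
end
```

A script built on this would check that exactly two arguments were given, print a usage message otherwise, and call copy_tree(ARGV[0], ARGV[1]). Link targets are copied verbatim, so relative links inside the tree stay internal to the copy, whereas absolute links continue to point into the original tree.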
There are probably UNIX variants whose cp -R option preserves symlinks, but none of the ones we're using does. Listing 14.5 was actually written to address that need in a real-life situation.

14.8.3. Deleting Files by Age or Other Criteria

Imagine that you want to scan through a directory and delete the oldest files. This directory might be some kind of repository for temporary files, log files, browser cache files, or similar data.

Here we present a little code fragment that will remove all the files older than a certain time stamp (passed in as a Time object):

    def delete_older(dir, time)
      Dir.chdir(dir) do
        Dir.foreach(".") do |entry|
          # We're not handling directories here
          # (this also skips "." and "..")
          next if File.stat(entry).directory?
          # Use the modification time
          if File.mtime(entry) < time
            File.unlink(entry)
          end
        end
      end
    end

    delete_older("/tmp", Time.local(2001,3,29,18,38,0))

This is nice, but let's generalize it. Let's make a similar method called delete_if that takes a block that will evaluate to true or false. Let's then delete the file only if it fits the given criteria.

    def delete_if(dir)
      Dir.chdir(dir) do
        Dir.foreach(".") do |entry|
          # We're not handling directories here
          next if File.stat(entry).directory?
          if yield entry
            File.unlink(entry)
          end
        end
      end
    end

    # Delete all files over 3000 bytes
    delete_if("/tmp") { |f| File.size(f) > 3000 }

    # Delete all LOG and BAK files
    # (the \. keeps names such as "catalog" from matching)
    delete_if("/tmp") { |f| f =~ /\.(log|bak)$/i }

14.8.4. Determining Free Space on a Disk

Suppose that you want to know how many bytes are free on a certain device. The following code example is a crude way of doing this, by running a system utility:

    def freespace(device=".")
      # -P requests the POSIX output format: one line per filesystem,
      # with the available 1K blocks in the fourth column
      lines = %x(df -kP #{device}).split("\n")
      lines.last.split[3].to_i * 1024
    end

    puts freespace("/tmp")    # 16772204544

A better way of doing this would be to wrap statfs in a Ruby extension. This has been done in the past, but the project seems to be a dead one.
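Short of such an extension, the df approach can at least be made less dependent on column positions. The following is a minimal sketch under stated assumptions: parse_df_free is a hypothetical helper, the df on the system is assumed to accept -k and -P and to label the relevant column "Available" (as the POSIX output format does), and the filesystem name is assumed to contain no spaces.

```ruby
# Hypothetical helper: extract the free byte count from `df -kP` output.
# Looks the column up by its header name rather than by position.
def parse_df_free(output)
  header, *rows = output.lines.map(&:split)
  col = header && header.index("Available")
  raise ArgumentError, "unexpected df output" if col.nil? || rows.empty?
  # df -k reports 1K blocks, so scale to bytes
  Integer(rows.last[col]) * 1024
end

def freespace(path = ".")
  # Passing the arguments as an array avoids going through the shell,
  # so a path containing spaces or metacharacters is handled safely
  parse_df_free(IO.popen(["df", "-kP", path], &:read))
end
```

Keeping the parsing separate from the call to df means that it can be checked against canned output without touching a real filesystem.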
On Windows, there is a somewhat more elegant solution (supplied by Daniel Berger):

    require 'Win32API'

    GetDiskFreeSpaceEx = Win32API.new('kernel32', 'GetDiskFreeSpaceEx',
                                      'PPPP', 'I')

    def freespace(dir=".")
      total_bytes = [0].pack('Q')
      total_free  = [0].pack('Q')
      GetDiskFreeSpaceEx.call(dir, 0, total_bytes, total_free)

      total_bytes = total_bytes.unpack('Q').first
      total_free  = total_free.unpack('Q').first
    end

    puts freespace("C:")    # 5340389376

This code fragment should work on all variants of Windows.