7.7. Deleting Directory Trees
Both of the copy scripts in the last section work as planned, but they aren't very forgiving of existing directory trees. That is, they implicitly assume that the "to" target directory either is empty or doesn't exist at all, and they fail badly if that isn't the case. Presumably, you will first somehow delete the target directory on your machine. For my purposes, that was a reasonable assumption to make.
The copiers could be changed to work with existing "to" directories too (e.g., ignore os.mkdir exceptions), but I prefer to start from scratch when copying trees; you never know what old garbage might be lying around in the "to" directory. So when testing the earlier copies, I was careful to run an rm -rf cpexamples command line to recursively delete the entire cpexamples directory tree before copying another tree to that name.
Unfortunately, the rm command used to clear the target directory is really a Unix utility that I installed on my PC from a commercial package; it probably won't work on your computer. There are other platform-specific ways to delete directory trees (e.g., deleting a folder's icon in a Windows explorer GUI), but why not do it once in Python for every platform? Example 7-27 deletes every file and directory at and below a passed-in directory's name. Because its logic is packaged as a function, it is also an importable utility that can be run from other scripts. Because it is pure Python code, it is a cross-platform solution for tree removal.
Example 7-27. PP3E\System\Filetools\rmall.py
The great thing about coding this sort of tool in Python is that it can be run with the same command-line interface on any machine where Python is installed. If you don't have an rm -rf type command available on your Windows, Unix, or Macintosh computer, simply run the Python rmall script instead:
C:\temp>python %X%\System\Filetools\cpall.py examples cpexamples
Note: dirTo was created
Copying...
Copied 1379 files, 121 directories in 2.68999993801 seconds

C:\temp>python %X%\System\Filetools\rmall.py cpexamples
Removed 1379 files and 122 dirs in 0.549999952316 secs

C:\temp>ls cpexamples
ls: File or directory "cpexamples" is not found
Here, the script traverses and deletes a tree of 1,379 files and 122 directories in about half a second, which is impressive for a noncompiled programming language and roughly equivalent to the commercial rm -rf program I purchased and installed on my PC.
One subtlety here: this script must be careful to delete the contents of a directory before deleting the directory itself. The os.rmdir call mandates that directories must be empty when deleted (and raises an exception if they are not). Because of that, the recursive calls on subdirectories need to happen before the os.rmdir call. Computer scientists would recognize this as a postorder, depth-first tree traversal, since we process parent directories after their children. This also makes any traversals based on os.path.walk out of the question: we need to return to a parent directory to delete it after visiting its descendants.
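The postorder rule described above can be sketched in a few lines. This is a minimal illustration written for current Python 3, not the rmall.py listing itself; the function name rmall_sketch and its return value are assumptions made for this example, and the symbolic-link check is an added precaution.

```python
import os

def rmall_sketch(dir_path):
    """Delete everything at and below dir_path (hypothetical helper).

    Returns a (files_removed, dirs_removed) tuple.
    """
    fcount = dcount = 0
    for name in os.listdir(dir_path):
        path = os.path.join(dir_path, name)
        if os.path.isdir(path) and not os.path.islink(path):
            # Recurse first: children must be gone before the parent.
            f, d = rmall_sketch(path)
            fcount += f
            dcount += d
        else:
            # Simple files (and links) are removed directly.
            os.remove(path)
            fcount += 1
    # Postorder step: the directory is empty now, so os.rmdir can succeed.
    os.rmdir(dir_path)
    dcount += 1
    return fcount, dcount
```

Because os.rmdir is the last statement in the function, each directory is removed only "on the way out" of the recursion, after all of its contents have been deleted.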
To illustrate, let's run interactive os.remove and os.rmdir calls on a cpexamples directory containing files or nested directories:
>>> os.path.isdir('cpexamples')
1
>>> os.remove('cpexamples')
Traceback (innermost last):
  File "<stdin>", line 1, in ?
OSError: [Errno 2] No such file or directory: 'cpexamples'
>>> os.rmdir('cpexamples')
Traceback (innermost last):
  File "<stdin>", line 1, in ?
OSError: [Errno 13] Permission denied: 'cpexamples'
Both calls always fail if the directory is not empty. But now, delete the contents of cpexamples in another window and try again:
>>> os.path.isdir('cpexamples')
1
>>> os.remove('cpexamples')
Traceback (innermost last):
  File "<stdin>", line 1, in ?
OSError: [Errno 2] No such file or directory: 'cpexamples'
>>> os.rmdir('cpexamples')
>>> os.path.exists('cpexamples')
0
The os.remove call still fails (it's meant only for deleting nondirectory items), but os.rmdir now works because the directory is empty. The upshot of this is that a tree deletion traversal must generally remove directories "on the way out."
7.7.1. Recoding Deletions for Generality
As coded, the rmall script processes directory names and fails only if it's given names of simple files, but it's trivial to generalize the script to eliminate that restriction. The recoding in Example 7-28 accepts an arbitrary command-line list of file and directory names, deletes simple files, and recursively deletes directories.
Example 7-28. PP3E\System\Filetools\rmall2.py
This shorter version runs the same way, and just as fast, as the original:
C:\temp>python %X%\System\Filetools\cpall.py examples cpexamples
Note: dirTo was created
Copying...
Copied 1379 files, 121 directories in 2.52999997139 seconds

C:\temp>python %X%\System\Filetools\rmall2.py cpexamples
Removed 1379 files and 122 dirs in 0.550000071526 secs

C:\temp>ls cpexamples
ls: File or directory "cpexamples" is not found
But it can also be used to delete simple files:
C:\temp>python %X%\System\Filetools\rmall2.py spam.txt eggs.txt
Removed 2 files and 0 dirs in 0.0600000619888 secs

C:\temp>python %X%\System\Filetools\rmall2.py spam.txt eggs.txt cpexamples
Removed 1381 files and 122 dirs in 0.630000042915 secs
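The generalized behavior shown in these runs (simple files are removed directly, directories are emptied recursively and then removed) might be sketched as follows. This is an illustration in current Python 3 under stated assumptions, not the rmall2.py listing; the names rm_any and rm_all_args and the counts dictionary are hypothetical.

```python
import os

def rm_any(path, counts):
    """Remove one file or one whole directory tree (hypothetical helper)."""
    if os.path.isdir(path) and not os.path.islink(path):
        # Empty the directory first, then remove it on the way out.
        for name in os.listdir(path):
            rm_any(os.path.join(path, name), counts)
        os.rmdir(path)
        counts['dirs'] += 1
    else:
        # Anything that is not a real directory is treated as a simple file.
        os.remove(path)
        counts['files'] += 1

def rm_all_args(paths):
    """Remove every name in paths; return counts of files and dirs removed."""
    counts = {'files': 0, 'dirs': 0}
    for path in paths:
        rm_any(path, counts)
    return counts
```

A command-line script could pass sys.argv[1:] to rm_all_args and print the returned counts. Like the scripts in this section, the sketch traps no exceptions, so a missing name stops the run with an error.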
As usual, there is more than one way to do it in Python (though you'll have to try hard to find many spurious ways). Notice that these scripts trap no exceptions; in programs designed to blindly delete an entire directory tree, exceptions are all likely to denote truly bad things. We could get fancier and support filename patterns by using the built-in fnmatch module along the way too, but this was beyond the scope of these scripts' goals (for pointers on matching, see Example 7-17 and find.py in Chapter 4).
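As a rough idea of what filename-pattern support could look like, here is a minimal sketch using fnmatch.filter from the standard library, written for current Python 3. The function name rm_matching is hypothetical, and the choice to remove only matching files while leaving directories in place is an assumption made for this example.

```python
import fnmatch
import os

def rm_matching(top, pattern):
    """Remove files under top whose names match pattern (hypothetical helper).

    Returns the list of paths that were removed; directories are left alone.
    """
    removed = []
    for root, dirs, files in os.walk(top):
        # fnmatch.filter keeps only the names matching a shell-style pattern.
        for name in fnmatch.filter(files, pattern):
            path = os.path.join(root, name)
            os.remove(path)
            removed.append(path)
    return removed
```

For instance, rm_matching('cpexamples', '*.pyc') would remove only the files whose names end in .pyc anywhere in the tree.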
Also note that because the newer os.walk call we met in Chapter 4 provides a bottom-up tree search option, it gives another way to delete a tree without recursion (subdirectory triples are returned before their containing directory):[*]
# delete everything in the tree rooted at 'top'
import os
for (root, dirs, files) in os.walk(top, topdown=False):
    for name in files:
        os.remove(os.path.join(root, name))
    for name in dirs:
        os.rmdir(os.path.join(root, name))