The os.path module contains functions that manipulate pathnames (filenames that include directory components) in various ways. To use it, import the os module and access the module as os.path.
1.5.1 Working with Filenames
The os.path module contains a number of functions that deal with long filenames in a platform-independent way. In other words, you won't have to deal with forward slashes, backslashes, colons, and whatnot. Let's look at Example 1-42.
Example 1-42. Using the os.path Module to Handle Filenames
File: os-path-example-1.py

import os

filename = "my/little/pony"

print "using", os.name, "..."
print "split", "=>", os.path.split(filename)
print "splitext", "=>", os.path.splitext(filename)
print "dirname", "=>", os.path.dirname(filename)
print "basename", "=>", os.path.basename(filename)
print "join", "=>", os.path.join(os.path.dirname(filename),
                                 os.path.basename(filename))

using nt ...
split => ('my/little', 'pony')
splitext => ('my/little/pony', '')
dirname => my/little
basename => pony
join => my/little\pony
Note that split only splits off a single item: the last component of the path.
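The difference between split and splitext is easier to see on a filename that has an extension. The following is a minimal sketch in Python 3 syntax (an assumption; the listings in this section use Python 2 print statements). It calls posixpath, the Unix implementation behind os.path, so the results are the same on every platform. The path and the components helper are made up for illustration.

```python
import posixpath  # the Unix flavour of os.path; same results on any platform

# Hypothetical path, used for illustration only.
filename = "my/little/pony.tar.gz"

# split cuts off only the last path component.
head, last = posixpath.split(filename)      # ('my/little', 'pony.tar.gz')

# splitext cuts off only the last extension.
root, ext = posixpath.splitext(filename)    # ('my/little/pony.tar', '.gz')


def components(path):
    """Return every component of path by calling split repeatedly."""
    parts = []
    while path:
        parent, tail = posixpath.split(path)
        if parent == path:
            # Reached the root ("/"); split cannot shorten it further.
            parts.append(parent)
            break
        if tail:
            parts.append(tail)
        path = parent
    parts.reverse()
    return parts


print(head, last)
print(root, ext)
print(components(filename))
```

Calling split in a loop, as the hypothetical components helper does, is one way to get at the remaining items when a single split is not enough.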
The os.path module also contains a number of functions that allow you to quickly figure out what a filename represents, as shown in Example 1-43.
Example 1-43. Using the os.path Module to Check What a Filename Represents
File: os-path-example-2.py

import os

FILES = (
    os.curdir,
    "/",
    "file",
    "/file",
    "samples",
    "samples/sample.jpg",
    "directory/file",
    "../directory/file",
    "/directory/file"
    )

for file in FILES:
    print file, "=>",
    if os.path.exists(file):
        print "EXISTS",
    if os.path.isabs(file):
        print "ISABS",
    if os.path.isdir(file):
        print "ISDIR",
    if os.path.isfile(file):
        print "ISFILE",
    if os.path.islink(file):
        print "ISLINK",
    if os.path.ismount(file):
        print "ISMOUNT",
    print

. => EXISTS ISDIR
/ => EXISTS ISABS ISDIR ISMOUNT
file =>
/file => ISABS
samples => EXISTS ISDIR
samples/sample.jpg => EXISTS ISFILE
directory/file =>
../directory/file =>
/directory/file => ISABS
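The output above depends on which files happen to exist on the machine. The sketch below, written in Python 3 syntax (an assumption), runs the same checks against a temporary directory it creates itself, so the results are reproducible. The describe helper and the file names are hypothetical.

```python
import os
import tempfile


def describe(path):
    """Return the names of the os.path checks that are true for path."""
    checks = [
        ("EXISTS", os.path.exists),
        ("ISABS", os.path.isabs),
        ("ISDIR", os.path.isdir),
        ("ISFILE", os.path.isfile),
        ("ISLINK", os.path.islink),
        ("ISMOUNT", os.path.ismount),
    ]
    return [name for name, check in checks if check(path)]


with tempfile.TemporaryDirectory() as root:
    sample = os.path.join(root, "sample.txt")
    with open(sample, "w") as f:
        f.write("hello")

    dir_flags = describe(root)
    file_flags = describe(sample)
    # isabs only inspects the string, so it is true even for a missing file.
    missing_flags = describe(os.path.join(root, "no-such-file"))

print("directory =>", " ".join(dir_flags))
print("file      =>", " ".join(file_flags))
print("missing   =>", " ".join(missing_flags))
```

Note that isabs looks only at the name, while the other checks ask the filesystem; that is why a nonexistent absolute path still reports ISABS.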
The expanduser function treats a username shortcut in the same way as most modern Unix shells (it doesn't work well on Windows), as shown in Example 1-44.
Example 1-44. Using the os.path Module to Insert the Username into a Filename
File: os-path-expanduser-example-1.py

import os

print os.path.expanduser("~/.pythonrc")

# /home/effbot/.pythonrc
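On Unix, expanduser takes the home directory from the HOME environment variable when it is set. The sketch below (Python 3 syntax, an assumption) sets HOME to a made-up directory and calls posixpath.expanduser directly so that the result does not depend on the account or platform running it.

```python
import os
import posixpath  # the Unix implementation behind os.path

# Hypothetical home directory, set only for this illustration.
os.environ["HOME"] = "/home/example"

# A leading "~" is replaced by the home directory.
expanded = posixpath.expanduser("~/.pythonrc")

# A "~" anywhere else in the name is left alone.
unchanged = posixpath.expanduser("config/~backup")

print(expanded)    # /home/example/.pythonrc
print(unchanged)   # config/~backup
```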
The expandvars function inserts environment variables into a filename, as shown in Example 1-45.
Example 1-45. Using the os.path Module to Insert Variables into a Filename
File: os-path-expandvars-example-1.py

import os

os.environ["USER"] = "user"

print os.path.expandvars("/home/$USER/config")
print os.path.expandvars("$USER/folders")

/home/user/config
user/folders
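Two details of expandvars are worth a quick sketch: the ${NAME} form, which is useful when the variable is followed directly by other word characters, and the handling of variables that are not defined, which are left in the string unchanged. The example uses Python 3 syntax (an assumption), and the variable names are hypothetical.

```python
import os

# Hypothetical variables, used only for this illustration.
os.environ["EXAMPLE_USER"] = "user"
os.environ.pop("EXAMPLE_UNDEFINED", None)  # make sure this one is not set

plain = os.path.expandvars("/home/$EXAMPLE_USER/config")

# Braces mark where the variable name ends.
braced = os.path.expandvars("/home/${EXAMPLE_USER}_backup/config")

# An undefined variable is not an error; the reference is left as it is.
missing = os.path.expandvars("/home/$EXAMPLE_UNDEFINED/config")

print(plain)     # /home/user/config
print(braced)    # /home/user_backup/config
print(missing)   # /home/$EXAMPLE_UNDEFINED/config
```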
1.5.2 Traversing a Filesystem
The walk function helps you find all files in a directory tree (as Example 1-46 demonstrates). It takes a directory name, a callback function, and a data object that is passed on to the callback.
Example 1-46. Using the os.path Module to Traverse a Filesystem
File: os-path-walk-example-1.py

import os

def callback(arg, directory, files):
    for file in files:
        print os.path.join(directory, file), repr(arg)

os.path.walk(".", callback, "secret message")

./aifc-example-1.py 'secret message'
./anydbm-example-1.py 'secret message'
./array-example-1.py 'secret message'
...
./samples 'secret message'
./samples/sample.jpg 'secret message'
./samples/sample.txt 'secret message'
./samples/sample.zip 'secret message'
./samples/articles 'secret message'
./samples/articles/article-1.txt 'secret message'
./samples/articles/article-2.txt 'secret message'
...
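The callback-based os.path.walk shown above exists only in Python 2; it was removed in Python 3. If you are on a newer interpreter, os.walk (available since Python 2.3) covers the same ground with a plain for loop. The following is a sketch in Python 3 syntax, not a drop-in replacement: the list_tree helper is hypothetical, and the directory tree is created on the fly with names borrowed from the sample output.

```python
import os
import tempfile


def list_tree(top):
    """Return the full name of every file and directory below top."""
    found = []
    # os.walk yields one (dirpath, dirnames, filenames) tuple per directory.
    for dirpath, dirnames, filenames in os.walk(top):
        for name in dirnames + filenames:
            found.append(os.path.join(dirpath, name))
    return found


with tempfile.TemporaryDirectory() as top:
    # Build a small tree to traverse.
    os.makedirs(os.path.join(top, "samples", "articles"))
    for parts in (("samples", "sample.txt"),
                  ("samples", "articles", "article-1.txt")):
        with open(os.path.join(top, *parts), "w") as f:
            f.write("x")

    # Report names relative to top, with "/" separators on every platform.
    listing = sorted(
        os.path.relpath(path, top).replace(os.sep, "/")
        for path in list_tree(top)
    )

for name in listing:
    print(name)
```

Unlike the callback interface, there is no extra data argument to thread through; anything the loop body needs is simply in scope.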
The walk function has a somewhat obscure user interface (maybe it's just me, but I can never remember the order of the arguments). The index function in Example 1-47 returns a list of filenames instead, which lets you use a straightforward for-in loop to process the files.
Example 1-47. Using os.listdir to Traverse a Filesystem
File: os-path-walk-example-2.py

import os

def index(directory):
    # like os.listdir, but traverses directory trees
    stack = [directory]
    files = []
    while stack:
        directory = stack.pop()
        for file in os.listdir(directory):
            fullname = os.path.join(directory, file)
            files.append(fullname)
            if os.path.isdir(fullname) and not os.path.islink(fullname):
                stack.append(fullname)
    return files

for file in index("."):
    print file

.\aifc-example-1.py
.\anydbm-example-1.py
.\array-example-1.py
...
If you don't want to list all files (for performance or memory reasons), Example 1-48 uses a different approach. Here, the DirectoryWalker class behaves like a sequence object, returning one file at a time:
Example 1-48. Using DirectoryWalker to Traverse a Filesystem
File: os-path-walk-example-3.py

import os

class DirectoryWalker:
    # a forward iterator that traverses a directory tree

    def __init__(self, directory):
        self.stack = [directory]
        self.files = []
        self.index = 0

    def __getitem__(self, index):
        while 1:
            try:
                file = self.files[self.index]
                self.index = self.index + 1
            except IndexError:
                # pop next directory from stack
                self.directory = self.stack.pop()
                self.files = os.listdir(self.directory)
                self.index = 0
            else:
                # got a filename
                fullname = os.path.join(self.directory, file)
                if os.path.isdir(fullname) and not os.path.islink(fullname):
                    self.stack.append(fullname)
                return fullname

for file in DirectoryWalker("."):
    print file

.\aifc-example-1.py
.\anydbm-example-1.py
.\array-example-1.py
...
Note that the DirectoryWalker class doesn't check the index passed to the __getitem__ method. This means that it won't work properly if you access the sequence members out of order.
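In Python versions that support generators, the same one-file-at-a-time traversal can be written without __getitem__, which sidesteps the out-of-order problem because there is no index to pass. The following is a sketch of that alternative in Python 3 syntax, not the DirectoryWalker class itself; walk_files and the temporary tree are hypothetical.

```python
import os
import tempfile


def walk_files(directory):
    """Yield the full name of each entry below directory, one at a time."""
    stack = [directory]
    while stack:
        current = stack.pop()
        for name in os.listdir(current):
            fullname = os.path.join(current, name)
            # Descend into real directories, but not into symbolic links.
            if os.path.isdir(fullname) and not os.path.islink(fullname):
                stack.append(fullname)
            yield fullname


with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, "samples", "articles"))
    with open(os.path.join(top, "samples", "sample.txt"), "w") as f:
        f.write("x")

    # Names are produced lazily; sorted() is used here only to get a
    # stable order for printing.
    names = sorted(
        os.path.relpath(path, top).replace(os.sep, "/")
        for path in walk_files(top)
    )

for name in names:
    print(name)
```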
Finally, if you're interested in the file sizes or timestamps, Example 1-49 demonstrates a version of the class that returns both the filename and the tuple returned from os.stat. This version saves one or two stat calls for each file (both os.path.isdir and os.path.islink use stat), and runs quite a bit faster on some platforms.
Example 1-49. Using DirectoryStatWalker to Traverse a Filesystem
File: os-path-walk-example-4.py

import os, stat

class DirectoryStatWalker:
    # a forward iterator that traverses a directory tree, and
    # returns the filename and additional file information

    def __init__(self, directory):
        self.stack = [directory]
        self.files = []
        self.index = 0

    def __getitem__(self, index):
        while 1:
            try:
                file = self.files[self.index]
                self.index = self.index + 1
            except IndexError:
                # pop next directory from stack
                self.directory = self.stack.pop()
                self.files = os.listdir(self.directory)
                self.index = 0
            else:
                # got a filename
                fullname = os.path.join(self.directory, file)
                st = os.stat(fullname)
                mode = st[stat.ST_MODE]
                # note: os.stat follows symbolic links, so S_ISLNK is
                # never true here; use os.lstat to detect links
                if stat.S_ISDIR(mode) and not stat.S_ISLNK(mode):
                    self.stack.append(fullname)
                return fullname, st

for file, st in DirectoryStatWalker("."):
    print file, st[stat.ST_SIZE]

.\aifc-example-1.py 336
.\anydbm-example-1.py 244
.\array-example-1.py 526
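On Python 3.6 or later, os.scandir offers another way to cut down on stat calls: each entry it returns can usually report whether it is a directory without a separate system call, and can supply the stat result on request. The following is a sketch under that assumption, written as a generator rather than as a port of DirectoryStatWalker; walk_with_stat and the temporary tree are hypothetical.

```python
import os
import tempfile


def walk_with_stat(directory):
    """Yield (fullname, stat_result) for each entry below directory."""
    stack = [directory]
    while stack:
        current = stack.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                # follow_symlinks=False: links to directories are reported
                # as links and are not descended into.
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
                yield entry.path, entry.stat(follow_symlinks=False)


with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, "samples"))
    with open(os.path.join(top, "samples", "sample.txt"), "wb") as f:
        f.write(b"hello")  # five bytes

    # Collect the size of every regular file, keyed by its relative name.
    sizes = {}
    for path, st in walk_with_stat(top):
        if os.path.isfile(path):
            name = os.path.relpath(path, top).replace(os.sep, "/")
            sizes[name] = st.st_size

for name, size in sorted(sizes.items()):
    print(name, size)
```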