Recipe 2.15. Adapting a File-like Object to a True File Object
Credit: Michael Kent
You need to pass a file-like object (e.g., the results of a call such as urllib.urlopen) to a function or method that insists on receiving a true file object (e.g., a function such as marshal.load).
To cooperate with such type-checking, we need to write all data from the file-like object into a temporary file on disk. Then, we can use the (true) file object for that temporary disk file. Here's a function that implements this idea:
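    import tempfile
    CHUNK_SIZE = 16 * 1024
    def adapt_file(fileObj):
        # A true file object needs no adaptation
        if isinstance(fileObj, file): return fileObj
        # Copy the file-like object's data, chunk by chunk, into a temporary disk file
        tmpFileObj = tempfile.TemporaryFile()
        while True:
            data = fileObj.read(CHUNK_SIZE)
            if not data: break
            tmpFileObj.write(data)
        fileObj.close()
        # Rewind so the caller reads the temporary file from its start
        tmpFileObj.seek(0)
        return tmpFileObj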
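The same copy-to-disk idea can be sketched with shutil.copyfileobj from the standard library doing the chunked loop. This is a minimal sketch, not the recipe's own code: the name copy_to_temp_file is hypothetical, io.BytesIO stands in for any file-like object, and, unlike adapt_file, the sketch neither type-checks its argument nor closes the source.

```python
import io
import shutil
import tempfile

CHUNK_SIZE = 16 * 1024

def copy_to_temp_file(file_like, chunk_size=CHUNK_SIZE):
    """Copy everything readable from file_like into a temporary disk file.

    Hypothetical helper for illustration; the source object is left open.
    """
    tmp = tempfile.TemporaryFile()
    # copyfileobj reads and writes in chunks of chunk_size until EOF
    shutil.copyfileobj(file_like, tmp, chunk_size)
    # Rewind so the caller reads the temporary file from its start
    tmp.seek(0)
    return tmp

# io.BytesIO is an assumed stand-in for a file-like object such as a URL response
source = io.BytesIO(b"spam" * 10000)
real_file = copy_to_temp_file(source)
payload = real_file.read()
real_file.close()
```

The temporary file is removed automatically when it is closed, so the caller only needs to close the returned object when done.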
One way or another, you should think in terms of adaptation, in preference to type testing, even when you need to rely on some lower-level utility that insists on precise types. Instead of raising an exception when you get passed an object that's
Documentation on built-in file objects, and modules tempfile and marshal, in the Library Reference and Python in a Nutshell.
Recipe 2.16. Walking Directory Trees
Credit: Robin Parmar, Alex Martelli
You need to examine a "directory", or an entire directory tree rooted in a certain directory, and iterate on the files (and
The generator os.walk from the Python Standard Library module os is sufficient for this task, but we can dress it up a bit by coding our own function to wrap os.walk:
    import os, fnmatch
    def all_files(root, patterns='*', single_level=False, yield_folders=False):
        # Expand patterns from semicolon-separated string to list
        patterns = patterns.split(';')
        for path, subdirs, files in os.walk(root):
            if yield_folders:
                files.extend(subdirs)
            files.sort()
            for name in files:
                for pattern in patterns:
                    if fnmatch.fnmatch(name, pattern):
                        yield os.path.join(path, name)
                        break
            if single_level:
                break
The standard directory tree traversal generator os.walk is powerful, simple, and flexible. However, as it stands, os.walk lacks a few niceties that applications may need, such as selecting files according to some patterns, flat (linear) looping on all files (and optionally folders) in sorted order, and the ability to examine a single directory (without entering its subdirectories). This recipe shows how easily these kinds of features can be added, by wrapping os.walk into another simple generator and using standard library module fnmatch to check filenames for matches to patterns.
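To make the single-directory behavior concrete, here is a small sketch of what os.walk yields and how breaking out of the loop after the first iteration confines the traversal to the root directory. The helper name walk_names and the file names are hypothetical, chosen only for illustration; the throwaway tree is built in a temporary directory.

```python
import os
import shutil
import tempfile

def walk_names(root, single_level=False):
    """Collect file names under root, directory by directory, each sorted."""
    names = []
    # os.walk yields one (path, subdirs, files) triple per directory, top down
    for path, subdirs, files in os.walk(root):
        names.extend(sorted(files))
        if single_level:
            break  # stop after the root directory itself
    return names

# Build a throwaway tree: two files at the top, one in a subdirectory
root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, 'sub'))
for relative in ('a.py', 'b.txt', os.path.join('sub', 'c.py')):
    open(os.path.join(root, relative), 'w').close()

whole_tree = walk_names(root)
top_only = walk_names(root, single_level=True)

# Remove the temporary tree again
shutil.rmtree(root)
```

Because os.walk visits the root first by default, the first triple always describes the root directory, which is why a single break is enough.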
The file patterns are possibly case-insensitive (that's platform-dependent) but
For example, you can easily get a list of all Python and HTML files in directory /tmp or any subdirectory thereof:
thefiles = list(all_files('/tmp', '*.py;*.htm;*.html'))
Should you just want to process these files' paths one at a time (e.g., print them, one per line), you do not need to build a list: you can simply loop on the result of calling all_files:
    for path in all_files('/tmp', '*.py;*.htm;*.html'):
        print path
If your platform is case-sensitive, and you want case-insensitive matching, then you need to specify the patterns more laboriously, e.g., '*.[Hh][Tt][Mm][Ll]' instead of just '*.html'.
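An alternative to spelling out bracketed patterns is to lowercase both the name and the pattern and compare them with fnmatch.fnmatchcase, which never applies platform-dependent case normalization. The following is only a sketch under that assumption; the helper name matches_ignoring_case is hypothetical and is not part of the recipe.

```python
import fnmatch

def matches_ignoring_case(name, patterns):
    """Return True if name matches any ';'-separated pattern, ignoring case."""
    lowered = name.lower()
    for pattern in patterns.split(';'):
        # fnmatchcase compares exactly, so lowercase both sides first
        if fnmatch.fnmatchcase(lowered, pattern.lower()):
            return True
    return False

upper_html = matches_ignoring_case('INDEX.HTML', '*.py;*.html')
plain_text = matches_ignoring_case('notes.txt', '*.py;*.html')
```

With this approach the result is the same on every platform, at the cost of giving up case-sensitive patterns altogether.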
Documentation for the os.path module and the os.walk generator, as well as the fnmatch module, in the Library Reference and Python in a Nutshell.