Recipe 2.15. Adapting a File-like Object to a True File Object
Credit: Michael Kent
Problem
You need to pass a file-like object (e.g., the results of a call such as urllib.urlopen) to a function or method that insists on receiving a true file object (e.g., a function such as marshal.load).
Solution
To cooperate with such type-checking, we need to write all data from the file-like object into a temporary file on disk. Then, we can use the (true) file object for that temporary disk file. Here's a function that implements this idea:
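import tempfile
CHUNK_SIZE = 16 * 1024
def adapt_file(fileObj):
    # A true file object needs no adaptation
    if isinstance(fileObj, file): return fileObj
    # Otherwise, copy the data chunk by chunk into a temporary disk file
    tmpFileObj = tempfile.TemporaryFile()
    while True:
        data = fileObj.read(CHUNK_SIZE)
        if not data: break
        tmpFileObj.write(data)
    fileObj.close()
    # Rewind so that the caller reads from the start of the data
    tmpFileObj.seek(0)
    return tmpFileObj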
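The function above relies on the built-in file type, which exists only in Python 2. For readers on Python 3, the following is a minimal sketch of the same copy-to-a-temporary-file idea, not part of the original recipe. It assumes that "true file" can be approximated by "has a working fileno()" and that the source yields bytes; the name adapt_file_py3 is hypothetical.

```python
import io
import tempfile

CHUNK_SIZE = 16 * 1024

def adapt_file_py3(file_obj):
    """Return file_obj itself if it is backed by an OS-level file,
    otherwise a temporary disk file holding a copy of its data.

    Assumption: an object with a working fileno() counts as a "true" file.
    Assumption: file_obj.read() returns bytes (TemporaryFile is binary).
    """
    try:
        file_obj.fileno()
        return file_obj
    except (AttributeError, OSError, io.UnsupportedOperation):
        pass  # not backed by a real file descriptor: copy it below
    tmp = tempfile.TemporaryFile()
    while True:
        data = file_obj.read(CHUNK_SIZE)
        if not data:
            break
        tmp.write(data)
    file_obj.close()
    tmp.seek(0)  # rewind so the caller reads from the start
    return tmp
```

An in-memory io.BytesIO object, for instance, has no file descriptor, so it would be copied to disk, while an object returned by open() would be handed back unchanged.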
Discussion
This recipe
One way or another, you should think in terms of adaptation, in preference to type testing, even when you need to rely on some lower-level utility that insists on precise types. Instead of raising an exception when you get passed an object that's
See Also
Documentation on built-in file objects, and modules tempfile and marshal, in the Library Reference and Python in a Nutshell.
Recipe 2.16. Walking Directory Trees
Credit: Robin Parmar, Alex Martelli
Problem
You need to examine a "directory", or an entire directory tree rooted in a certain directory, and iterate on the files (and
Solution
The generator os.walk from the Python Standard Library module os is sufficient for this task, but we can dress it up a bit by coding our own function to wrap os.walk:
import os, fnmatch
def all_files(root, patterns='*', single_level=False, yield_folders=False):
    # Expand patterns from semicolon-separated string to list
    patterns = patterns.split(';')
    for path, subdirs, files in os.walk(root):
        if yield_folders:
            files.extend(subdirs)
        files.sort()
        for name in files:
            for pattern in patterns:
                if fnmatch.fnmatch(name, pattern):
                    yield os.path.join(path, name)
                    break
        if single_level:
            break
Discussion
The standard directory tree traversal generator os.walk is powerful, simple, and flexible. However, as it stands, os.walk lacks a few niceties that applications may need, such as selecting files according to some patterns, flat (linear) looping on all files (and optionally folders) in sorted order, and the ability to examine a single directory (without entering its subdirectories). This recipe shows how easily these kinds of features can be added, by wrapping os.walk into another simple generator and using standard library module fnmatch to check filenames for matches to patterns.
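To make the three added features concrete, here is a small sketch that exercises them on a throwaway directory tree. The recipe's function is repeated only so that the sketch runs on its own; the helper names build_demo_tree and relative, and the file names in the tree, are hypothetical and chosen purely for illustration.

```python
import fnmatch
import os
import shutil
import tempfile

def all_files(root, patterns='*', single_level=False, yield_folders=False):
    # Same logic as the recipe's function, repeated so the sketch is runnable
    patterns = patterns.split(';')
    for path, subdirs, files in os.walk(root):
        if yield_folders:
            files.extend(subdirs)
        files.sort()
        for name in files:
            for pattern in patterns:
                if fnmatch.fnmatch(name, pattern):
                    yield os.path.join(path, name)
                    break
        if single_level:
            break

def build_demo_tree():
    # Hypothetical helper: creates root/a.py, root/b.txt and root/sub/c.py
    root = tempfile.mkdtemp()
    os.mkdir(os.path.join(root, 'sub'))
    for rel in ('a.py', 'b.txt', os.path.join('sub', 'c.py')):
        open(os.path.join(root, rel), 'w').close()
    return root

def relative(paths, root):
    # Hypothetical helper: strip the temporary root for readable results
    return [os.path.relpath(p, root) for p in paths]

root = build_demo_tree()
# Pattern selection over the whole tree
whole_tree = relative(all_files(root, '*.py'), root)
# Same pattern, but without entering subdirectories
top_only = relative(all_files(root, '*.py', single_level=True), root)
# Top level only, with folders mixed into the sorted listing
with_folders = relative(
    all_files(root, '*', single_level=True, yield_folders=True), root)
shutil.rmtree(root)  # clean up the throwaway tree
```

Under these assumptions, whole_tree holds a.py and sub/c.py, top_only holds just a.py, and with_folders lists a.py, b.txt, and the folder sub in sorted order.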
The file patterns are possibly case-insensitive (that's platform-dependent) but
For example, you can easily get a list of all Python and HTML files in directory /tmp or any subdirectory thereof:
thefiles = list(all_files('/tmp', '*.py;*.htm;*.html'))
Should you just want to process these files' paths one at a time (e.g., print them, one per line), you do not need to build a list: you can simply loop on the result of calling all_files:
for path in all_files('/tmp', '*.py;*.htm;*.html'):
    print path
If your platform is case-sensitive, and you want case-insensitive matching, then you need to specify the patterns more laboriously, e.g., '*.[Hh][Tt][Mm][Ll]' instead of just '*.html'.
See Also
Documentation for the os.path module and the os.walk generator, as well as the fnmatch module, in the Library Reference and Python in a Nutshell.
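The laborious patterns mentioned in the Discussion of Recipe 2.16, such as '*.[Hh][Tt][Mm][Ll]', can be generated mechanically rather than typed by hand. The following is a minimal sketch, not part of the original recipe; the helper name case_insensitive is hypothetical, and the sketch assumes the input pattern contains only ASCII letters and no character classes of its own (a pattern such as '[a-c]*' would be mangled).

```python
import fnmatch
import string

def case_insensitive(pattern):
    """Expand each ASCII letter c of pattern into the class [Cc].

    Assumption: pattern has no [...] classes of its own.
    """
    parts = []
    for ch in pattern:
        if ch in string.ascii_letters:
            parts.append('[%s%s]' % (ch.upper(), ch.lower()))
        else:
            parts.append(ch)  # wildcards, dots, digits pass through as-is
    return ''.join(parts)

# fnmatchcase always compares case-sensitively, whatever the platform,
# so it shows that the expanded pattern itself tolerates either case.
expanded = case_insensitive('*.html')
matches_upper = fnmatch.fnmatchcase('INDEX.HTML', expanded)
matches_mixed = fnmatch.fnmatchcase('index.Html', expanded)
matches_other = fnmatch.fnmatchcase('index.htm', expanded)
```

The expanded pattern could then be passed as one of the semicolon-separated patterns to all_files.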