Fixing DOS Line Ends | Larger System Examples II


When I wrote the first edition of this book, I shipped two copies of every example file on the CD-ROM (view CD-ROM content online at http://examples.oreilly.com/python2) -- one with Unix line-end markers, and one with DOS markers. The idea was that this would make it easy to view and edit the files on either platform. Readers would simply copy the examples directory tree designed for their platform onto their hard drive, and ignore the other one.

If you read Chapter 2, you know the issue here: DOS (and by proxy, Windows) marks line ends in text files with the two characters \r\n (carriage-return, line-feed), but Unix uses just a single \n. Most modern text editors don't care -- they happily display text files encoded in either format. Some tools are less forgiving, though. I still occasionally see odd \r characters when viewing DOS files on Unix, or an entire file in a single line when looking at Unix files on DOS (the Notepad accessory does this on Windows, for example).

Because this is only an occasional annoyance, and because it's easy to forget to keep two distinct example trees in sync, I adopted a different policy for this second edition: we're shipping a single copy of the examples (in DOS format), along with a portable converter tool for changing to and from other line-end formats.

The main obstacle, of course, is how to go about providing a portable and easy to use converter -- one that runs "out of the box" on almost every computer, without changes or recompiles. Some Unix platforms have commands like fromdos and dos2unix, but they are not universally available even on Unix. DOS batch files and csh scripts could do the job on Windows and Unix, respectively, but neither solution works on both platforms.

Fortunately, Python does. The scripts presented in Example 5-1, Example 5-3, and Example 5-4 convert end-of-line markers between DOS and Unix formats; they convert a single file, a directory of files, and a directory tree of files, respectively. In this section, we briefly look at each of the three scripts, and contrast some of the system tools they apply. Each reuses the prior's code, and becomes progressively more powerful in the process.

The last of these three scripts, Example 5-4, is the portable converter tool I was looking for; it converts line ends in the entire examples tree, in a single step. Because it is pure Python, it also works on both DOS and Unix unchanged; as long as Python is installed, it is the only line converter you may ever need to remember.

5.2.1 Converting Line Ends in One File

These three scripts were developed in stages on purpose, so I could first focus on getting line-feed conversions right, before worrying about directories and tree walking logic. With that scheme in mind, Example 5-1 addresses just the task of converting lines in a single text file.

Example 5-1. PP2E\PyTools\fixeoln_one.py

###################################################################
# Use: "python fixeoln_one.py [tounix|todos] filename".
# Convert end-of-lines in the single text file whose name is passed
# in on the command line, to the target format (tounix or todos). 
# The _one, _dir, and _all converters reuse the convert function 
# here. convertEndlines changes end-lines only if necessary:
# lines that are already in the target format are left unchanged,
# so it's okay to convert a file > once with any of the 3 fixeoln 
# scripts. Notes: must use binary file open modes for this to 
# work on Windows, else default text mode automatically deletes 
# the \r on reads, and adds an extra \r for each \n on writes;
# Mac format not supported; PyTools\dumpfile.py shows raw bytes;
###################################################################

import os
listonly = 0                      # 1=show file to be changed, don't rewrite

def convertEndlines(format, fname):              # convert one file
    if not os.path.isfile(fname):                # todos:  \n   => \r\n
        print 'Not a text file', fname           # tounix: \r\n => \n
        return                                   # skip directory names

    newlines = []
    changed  = 0
    for line in open(fname, 'rb').readlines( ):  # use binary i/o modes
        if format == 'todos':                    # else \r lost on Win
            if line[-1:] == '\n' and line[-2:-1] != '\r':
                line = line[:-1] + '\r\n'
                changed = 1
        elif format == 'tounix':                 # avoids IndexError
            if line[-2:] == '\r\n':              # slices are scaled
                line = line[:-2] + '\n'
                changed = 1
        newlines.append(line)

    if changed:
        try:                                     # might be read-only
            print 'Changing', fname
            if not listonly: open(fname, 'wb').writelines(newlines)
        except IOError, why:
            print 'Error writing to file %s: skipped (%s)' % (fname, why)

if __name__ == '__main__':
    import sys
    errmsg = 'Required arguments missing: ["todos"|"tounix"] filename'
    assert (len(sys.argv) == 3 and sys.argv[1] in ['todos', 'tounix']), errmsg
    convertEndlines(sys.argv[1], sys.argv[2])
    print 'Converted', sys.argv[2]

This script is fairly straightforward as system utilities go; it relies primarily on the built-in file object's methods. Given a target format flag and filename, it loads the file into a lines list using the readlines method, converts input lines to the target format if needed, and writes the result back to the file with the writelines method if any lines were changed:

C:\temp\examples>python %X%\PyTools\fixeoln_one.py tounix PyDemos.pyw
Changing PyDemos.pyw
Converted PyDemos.pyw

C:\temp\examples>python %X%\PyTools\fixeoln_one.py todos PyDemos.pyw
Changing PyDemos.pyw
Converted PyDemos.pyw

C:\temp\examples>fc PyDemos.pyw %X%\PyDemos.pyw
Comparing files PyDemos.pyw and C:\PP2ndEd\examples\PP2E\PyDemos.pyw
FC: no differences encountered

C:\temp\examples>python %X%\PyTools\fixeoln_one.py todos PyDemos.pyw
Converted PyDemos.pyw

C:\temp\examples>python %X%\PyTools\fixeoln_one.py toother nonesuch.txt
Traceback (innermost last):
  File "C:\PP2ndEd\examples\PP2E\PyTools\fixeoln_one.py", line 45, in ?
    assert (len(sys.argv) == 3 and sys.argv[1] in ['todos', 'tounix']), errmsg
AssertionError: Required arguments missing: ["todos"|"tounix"] filename

Here, the first command converts the file to Unix line-end format (tounix), and the second and fourth convert to the DOS convention -- all regardless of the platform on which this script is run. To make typical usage easier, converted text is written back to the file in place, instead of to a newly created output file. Notice that this script's filename has a "_" in it, not a "-"; because it is meant to be both run as a script and imported as a library, its filename must translate to a legal Python variable name in importers (fixeoln-one.py won't work for both roles).
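For readers on a current interpreter, the same read-convert-write logic can be sketched in Python 3. This is only an approximation under stated assumptions, not the book's shipped code: the name convert_endlines, the listonly parameter, the boolean return value, and the throwaway demo file are all hypothetical. The main difference from the Python 2 listing is that binary-mode lines are bytes objects, so the comparisons use bytes literals.

```python
import os
import tempfile

def convert_endlines(target, fname, listonly=False):
    """Convert line ends in one file in place; return True if any line changed."""
    if not os.path.isfile(fname):              # skip directory names
        return False
    with open(fname, 'rb') as f:               # binary mode: bytes exactly as stored
        lines = f.readlines()                  # binary readlines splits after b'\n' only
    newlines, changed = [], False
    for line in lines:
        if target == 'todos':
            if line[-1:] == b'\n' and line[-2:-1] != b'\r':
                line = line[:-1] + b'\r\n'     # \n => \r\n
                changed = True
        elif target == 'tounix':
            if line[-2:] == b'\r\n':
                line = line[:-2] + b'\n'       # \r\n => \n
                changed = True
        newlines.append(line)
    if changed and not listonly:
        with open(fname, 'wb') as f:           # binary write: no \r inserted
            f.writelines(newlines)
    return changed

# Demo on a throwaway file with mixed line ends (hypothetical data).
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.txt')
with open(demo_path, 'wb') as f:
    f.write(b'a\nb\r\nc')
first = convert_endlines('todos', demo_path)   # rewrites the file
second = convert_endlines('todos', demo_path)  # nothing left to change
with open(demo_path, 'rb') as f:
    result = f.read()
print(first, second, result)
```

Note that the sketch slices (line[-1:]) rather than indexes (line[-1]); in Python 3, indexing a bytes object yields an integer, so only the slice can be compared with b'\n'.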

In all the examples in this chapter that change files in directory trees, the C:\temp\examples and C:\temp\cpexamples directories used in testing are full copies of the real PP2E examples root directory. I don't always show the copy commands used to create these test directories along the way (at least not until we've written our own in Python).

5.2.1.1 Slinging bytes and verifying results

The fc DOS file-compare command in the preceding interaction confirms the conversions, but to better verify the results of this Python script, I wrote another, shown in Example 5-2.

Example 5-2. PP2E\PyTools\dumpfile.py

import sys
bytes = open(sys.argv[1], 'rb').read( )
print '-'*40
print repr(bytes)

print '-'*40
while bytes:
    bytes, chunk = bytes[4:], bytes[:4]          # show 4-bytes per line
    for c in chunk: print oct(ord(c)), '\t',     # show octal of binary value
    print

print '-'*40
for line in open(sys.argv[1], 'rb').readlines( ):
    print repr(line)

To give a clear picture of a file's contents, this script opens a file in binary mode (to suppress automatic line-feed conversions), prints its raw contents (bytes) all at once, displays the octal numeric ASCII codes of its contents four bytes per line, and shows its raw lines. Let's use this to trace conversions. First of all, use a simple text file to make wading through bytes a bit more humane:

C:\temp>type test.txt
a
b
c

C:\temp>python %X%\PyTools\dumpfile.py test.txt
----------------------------------------
'a\015\012b\015\012c\015\012'
----------------------------------------
0141 015 012 0142
015 012 0143 015
012
----------------------------------------
'a\015\012'
'b\015\012'
'c\015\012'

The test.txt file here is in DOS line-end format -- the escape sequence \015\012 displayed by the dumpfile script is simply the DOS \r\n line-end marker in octal character-code escapes format. Now, converting to Unix format changes all the DOS \r\n markers to a single \n (\012) as advertised:

C:\temp>python %X%\PyTools\fixeoln_one.py tounix test.txt
Changing test.txt
Converted test.txt

C:\temp>python %X%\PyTools\dumpfile.py test.txt
----------------------------------------
'a\012b\012c\012'
----------------------------------------
0141 012 0142 012
0143 012
----------------------------------------
'a\012'
'b\012'
'c\012'

And converting back to DOS restores the original file format:

C:\temp>python %X%\PyTools\fixeoln_one.py todos test.txt
Changing test.txt
Converted test.txt

C:\temp>python %X%\PyTools\dumpfile.py test.txt
----------------------------------------
'a\015\012b\015\012c\015\012'
----------------------------------------
0141 015 012 0142
015 012 0143 015
012
----------------------------------------
'a\015\012'
'b\015\012'
'c\015\012'

C:\temp>python %X%\PyTools\fixeoln_one.py todos test.txt # makes no changes
Converted test.txt
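The octal display produced by dumpfile can be approximated in Python 3 with a few lines. This is a sketch under assumptions rather than a port of the book's script: the function name octal_rows, its width parameter, and the three-digit formatting are all choices made here for illustration.

```python
def octal_rows(data, width=4):
    """Return the bytes in data as rows of three-digit octal strings."""
    rows = []
    for start in range(0, len(data), width):
        chunk = data[start:start + width]
        # Iterating over a bytes object in Python 3 yields integers directly,
        # so no ord() call is needed.
        rows.append(['%03o' % byte for byte in chunk])
    return rows

sample = b'a\r\nb\r\n'                 # two DOS-format lines
rows = octal_rows(sample)
for row in rows:
    print('\t'.join(row))
```

Each \r shows up as 015 and each \n as 012, which makes it easy to see at a glance which line-end convention a file uses.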

5.2.1.2 Nonintrusive conversions

Notice that no "Changing" message is emitted for the last command just run, because no changes were actually made to the file (it was already in DOS format). Because this program is smart enough to avoid converting a line that is already in the target format, it is safe to rerun on a file even if you can't recall what format the file already uses. More naive conversion logic might be simpler, but may not be repeatable. For instance, a string.replace call can be used to expand a Unix \n to a DOS \r\n (\015\012), but only once:

>>> import string
>>> lines = 'aaa\nbbb\nccc\n'
>>> lines = string.replace(lines, '\n', '\r\n')         # okay: \r added
>>> lines
'aaa\015\012bbb\015\012ccc\015\012'
>>> lines = string.replace(lines, '\n', '\r\n')         # bad: double \r
>>> lines
'aaa\015\015\012bbb\015\015\012ccc\015\015\012'

Such logic could easily trash a file if applied to it twice.[1] To really understand how the script gets around this problem, though, we need to take a closer look at its use of slices and binary file modes.

[1] In fact, see the files old_todos.py, old_tounix.py, and old_toboth.py in the PyTools directory on the examples CD (see http://examples.oreilly.com/python2) for a complete earlier implementation built around string.replace. It was repeatable for to-Unix changes, but not for to-DOS conversion (only the latter may add characters). The fixeoln scripts here were developed as a replacement, after I got burned by running to-DOS conversions twice.
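To make the repeatability problem concrete, here is a small Python 3 sketch comparing the naive replacement with one that can safely be applied twice. Neither function is the book's code: naive_todos and repeatable_todos are hypothetical names, and the two-step replace is just one simple way to get a repeatable result, different from the per-line slice tests the fixeoln scripts use.

```python
def naive_todos(data):
    # Expands every \n, including those already preceded by \r.
    return data.replace(b'\n', b'\r\n')

def repeatable_todos(data):
    # Collapse existing \r\n pairs to \n first, then expand every \n,
    # so a second application finds nothing new to change.
    return data.replace(b'\r\n', b'\n').replace(b'\n', b'\r\n')

text = b'aaa\nbbb\n'
once = naive_todos(text)                # b'aaa\r\nbbb\r\n'
twice = naive_todos(once)               # doubled \r characters
safe_once = repeatable_todos(text)
safe_twice = repeatable_todos(safe_once)
print(twice)
print(safe_twice)
```

The two-step version still leaves bare \r characters alone, so it is no more Mac-aware than the scripts in this section.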

5.2.1.3 Slicing strings out-of-bounds

This script relies on subtle aspects of string slicing behavior to inspect parts of each line without size checks. For instance:

The expression line[-2:] returns the last two characters at the end of the line (or one or zero characters, if the line isn't at least two characters long).
A slice like line[-2:-1] returns the second to last character (or an empty string, if the line is too small to have a second to last character).
The operation line[:-2] returns all characters except the last two at the end (or an empty string, if there are fewer than three characters).

Because out-of-bounds slices scale slice limits to be in-bounds, the script doesn't need to add explicit tests to guarantee that the line is big enough to have end-line characters at the end. For example:

>>> 'aaaXY'[-2:], 'XY'[-2:], 'Y'[-2:], ''[-2:]
('XY', 'XY', 'Y', '')

>>> 'aaaXY'[-2:-1], 'XY'[-2:-1], 'Y'[-2:-1], ''[-2:-1]
('X', 'X', '', '')

>>> 'aaaXY'[:-2], 'aaaY'[:-1], 'XY'[:-2], 'Y'[:-1]
('aaa', 'aaa', '', '')

If you imagine characters like \r and \n instead of the X and Y here, you'll understand how the script exploits slice scaling to good effect.
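The same slice tests can be tried with real line-end characters. The following Python 3 sketch uses bytes objects and two hypothetical helper names, ends_dos and ends_unix_only, that mirror the conditions tested in convertEndlines; it also shows that plain indexing, unlike slicing, fails on an empty line.

```python
def ends_dos(line):
    # True if the line ends in \r\n; safe even for lines shorter than two bytes.
    return line[-2:] == b'\r\n'

def ends_unix_only(line):
    # True if the line ends in \n that is not preceded by \r.
    return line[-1:] == b'\n' and line[-2:-1] != b'\r'

samples = [b'aaa\r\n', b'aaa\n', b'\r\n', b'\n', b'']
flags = [(ends_dos(s), ends_unix_only(s)) for s in samples]
print(flags)

# Indexing is not scaled the way slicing is: an empty line raises IndexError.
try:
    b''[-1]
    indexing_failed = False
except IndexError:
    indexing_failed = True
print(indexing_failed)
```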

5.2.1.4 Binary file mode revisited

Because this script aims to be portable to Windows, it also takes care to open files in binary mode, even though they contain text data. As we've seen, when files are opened in text mode on Windows, \r is stripped from \r\n markers on input, and \r is added before \n markers on output. This automatic conversion allows scripts to represent the end-of-line marker as \n on all platforms. Here, though, it would also mean that the script would never see the \r it's looking for to detect a DOS-encoded line -- the \r would be dropped before it ever reached the script:

>>> open('temp.txt', 'w').writelines(['aaa\n', 'bbb\n'])
>>> open('temp.txt', 'rb').read( )
'aaa\015\012bbb\015\012'
>>> open('temp.txt', 'r').read( )
'aaa\012bbb\012'

Without binary open mode, this can lead to fairly subtle and incorrect behavior on Windows. For example, if files are opened in text mode, converting in "todos" mode on Windows would actually produce double \r characters: the script might convert the stripped \n to \r\n, which is then expanded on output to \r\r\n!

>>> open('temp.txt', 'w').writelines(['aaa\r\n', 'bbb\r\n'])
>>> open('temp.txt', 'rb').read( )
'aaa\015\015\012bbb\015\015\012'

With binary mode, the script inputs a full \r\n, so no conversion is performed. Binary mode is also required for output on Windows, to suppress the insertion of \r characters; without it, the "tounix" conversion would fail on that platform.[2]

[2] But wait -- it gets worse. Because of the auto-deletion and insertion of \r characters in Windows text mode, we might simply read and write files in text mode to perform the "todos" line conversion when run on Windows; the file interface will automatically add the \r on output if it's missing. However, this fails for other usage modes -- "tounix" conversions on Windows (only binary writes can omit the \r), and "todos" when running on Unix (no \r is inserted). Magic is not always our friend.

If all that is too subtle to bear, just remember to use the "b" in file open mode strings if your scripts might be run on Windows, and you mean to process either true binary data or text data as it is actually stored in the file.
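The interactive sessions in this section reflect the Python 2 of this book, where text-mode translation happens only on Windows. As a point of comparison, and assuming a Python 3 interpreter instead, the sketch here shows that text-mode reads translate \r\n to \n on every platform by default, while binary mode (or the newline='' argument to open) leaves the stored characters alone. The temporary file and variable names are made up for the demonstration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'temp.txt')

with open(path, 'wb') as f:                    # store DOS line ends verbatim
    f.write(b'aaa\r\nbbb\r\n')

with open(path, 'rb') as f:                    # binary: bytes as stored
    raw = f.read()

with open(path, 'r', encoding='ascii') as f:   # default text mode in Python 3:
    cooked = f.read()                          # \r\n is translated to \n on input

with open(path, 'r', encoding='ascii', newline='') as f:
    untranslated = f.read()                    # newline='' disables translation

print(repr(raw), repr(cooked), repr(untranslated))
```

Either way, the conclusion in this section still holds: a line-end converter has to see the \r characters, so it must not let the file object translate them away.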

Macintosh Line Conversions

As coded, the convertEndlines function does not support Macintosh single \r line terminators at all. It neither converts to Macintosh terminators from DOS and Unix format (\r\n and \n to \r), nor converts from Macintosh terminators to DOS or Unix format (\r to \r\n or \n). Files in Mac format pass untouched through both the "todos" and "tounix" conversions in this script (study the code to see why). I don't use a Mac, but some readers may.

Since adding Mac support would make this code more complex, and since I don't like publishing code in books unless it's been well tested, I'll leave such an extension as an exercise for the Mac Python users in the audience. But for implementation hints, see file PP2E\PyTools\fixeoln_one_mac.py on the CD (see http://examples.oreilly.com/python2). When run on Windows, it does to-Mac conversions:

C:\temp>python %X%\PyTools\fixeoln_one_mac.py tomac test.txt
Changing test.txt
Converted test.txt

C:\temp>python %X%\PyTools\dumpfile.py test.txt
----------------------------------------
'a\015b\015c\015'
----------------------------------------
0141 015 0142 015
0143 015
----------------------------------------
'a\015b\015c\015'

but fails to convert files already in Mac format to Unix or DOS, because the file readlines method does not treat a bare \r as a line break on that platform. The last output line is a single file line, as far as Windows is concerned; converting back to DOS just adds a single \n at its end.
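One possible direction for the exercise, assuming Python 3, is to split on all three terminators rather than rely on readlines. The sketch here leans on bytes.splitlines, which breaks at \n, \r, and \r\n alike; the names ENDINGS and convert_bytes are hypothetical, the function works on an in-memory bytes object rather than a file, and it has not been tried on real Macintosh files, so treat it as a starting point only.

```python
ENDINGS = {'todos': b'\r\n', 'tounix': b'\n', 'tomac': b'\r'}

def convert_bytes(target, data):
    """Return data with every line terminator replaced by the target's."""
    eol = ENDINGS[target]
    out = []
    # keepends=True retains each line's terminator so it can be inspected.
    for line in data.splitlines(keepends=True):
        if line.endswith(b'\r\n'):
            out.append(line[:-2] + eol)
        elif line.endswith((b'\r', b'\n')):
            out.append(line[:-1] + eol)
        else:
            out.append(line)               # last line with no terminator
    return b''.join(out)

mixed = b'a\r\nb\nc\r'
as_mac = convert_bytes('tomac', mixed)
as_unix = convert_bytes('tounix', as_mac)
as_dos = convert_bytes('todos', as_unix)
print(as_mac, as_unix, as_dos)
```

One consequence of treating a bare \r as a terminator is that a damaged \r\r\n sequence is read as two line ends, so a file mangled by a double to-DOS conversion would gain blank lines rather than be repaired.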

5.2.2 Converting Line Ends in One Directory

Armed with a fully debugged single file converter, it's an easy step to add support for converting all files in a single directory. Simply call the single file converter on every filename returned by a directory listing tool. The script in Example 5-3 uses the glob module we met in Chapter 2 to grab a list of files to convert.

Example 5-3. PP2E\PyTools\fixeoln_dir.py

#########################################################
# Use: "python fixeoln_dir.py [tounix|todos] patterns?".
# convert end-lines in all the text files in the current
# directory (only: does not recurse to subdirectories). 
# Reuses converter in the single-file _one version.
#########################################################

import sys, glob
from fixeoln_one import convertEndlines
listonly = 0
patts = ['*.py', '*.pyw', '*.txt', '*.cgi', '*.html',    # text file names
         '*.c',  '*.cxx', '*.h',   '*.i',   '*.out',     # in this package
         'README*', 'makefile*', 'output*', '*.note']

if __name__ == '__main__':
    errmsg = 'Required first argument missing: "todos" or "tounix"'
    assert (len(sys.argv) >= 2 and sys.argv[1] in ['todos', 'tounix']), errmsg

    if len(sys.argv) > 2:                 # glob anyhow: '*' not applied on dos
        patts = sys.argv[2:]              # though not really needed on linux
    filelists = map(glob.glob, patts)     # name matches in this dir only

    count = 0
    for list in filelists:
        for fname in list:
            if listonly:
                print count+1, '=>', fname
            else:
                convertEndlines(sys.argv[1], fname)
            count = count + 1

    print 'Visited %d files' % count

This module defines a list, patts, containing filename patterns that match all the kinds of text files that appear in the book examples tree; each pattern is passed to the built-in glob.glob call by map, to be separately expanded into a list of matching files. That's why there are nested for loops near the end -- the outer loop steps through each glob result list, and the inner steps through each name within each list. Try the map call interactively if this doesn't make sense:

>>> import glob
>>> map(glob.glob, ['*.py', '*.html'])
[['helloshell.py'], ['about-pp.html', 'about-pp2e.html', 'about-ppr2e.html']]

This script requires a convert mode flag on the command line, and assumes that it is run in the directory where files to be converted live; cd to the directory to be converted before running this script (or change it to accept a directory name argument too):

C:\temp\examples>python %X%\PyTools\fixeoln_dir.py tounix
Changing Launcher.py
Changing Launch_PyGadgets.py
Changing LaunchBrowser.py
 ...lines deleted...
Changing PyDemos.pyw
Changing PyGadgets_bar.pyw
Changing README-PP2E.txt
Visited 21 files

C:\temp\examples>python %X%\PyTools\fixeoln_dir.py todos
Changing Launcher.py
Changing Launch_PyGadgets.py
Changing LaunchBrowser.py
 ...lines deleted...
Changing PyDemos.pyw
Changing PyGadgets_bar.pyw
Changing README-PP2E.txt
Visited 21 files

C:\temp\examples>python %X%\PyTools\fixeoln_dir.py todos  # makes no changes
Visited 21 files

C:\temp\examples>fc PyDemos.pyw %X%\PyDemos.pyw
Comparing files PyDemos.pyw and C:\PP2ndEd\examples\PP2E\PyDemos.pyw
FC: no differences encountered

Notice that the third command generated no "Changing" messages again. Because the convertEndlines function of the single-file module is reused here to perform the actual updates, this script inherits that function's repeatability: it's okay to rerun this script on the same directory any number of times. Only lines that require conversion will be converted. This script also accepts an optional list of filename patterns on the command line, to override the default patts list of files to be changed:

C:\temp\examples>python %X%\PyTools\fixeoln_dir.py tounix *.pyw *.csh
Changing echoEnvironment.pyw
Changing Launch_PyDemos.pyw
Changing Launch_PyGadgets_bar.pyw
Changing PyDemos.pyw
Changing PyGadgets_bar.pyw
Changing cleanall.csh
Changing makeall.csh
Changing package.csh
Changing setup-pp.csh
Changing setup-pp-embed.csh
Changing xferall.linux.csh
Visited 11 files

C:\temp\examples>python %X%\PyTools\fixeoln_dir.py tounix *.pyw *.csh
Visited 11 files

Also notice that the single-file script's convertEndlines function performs an initial os.path.isfile test to make sure the passed-in filename represents a file, not a directory; when we start globbing with patterns to collect files to convert, it's not impossible that a pattern's expansion might include the name of a directory along with the desired files.

Unix and Linux users: Unix-like shells automatically glob (i.e., expand) filename pattern operators like * in command lines before they ever reach your script. You generally need to quote such patterns to pass them in to scripts verbatim (e.g., "*.py"). The fixeoln_dir script will still work if you don't -- its glob.glob calls will simply find a single matching filename for each already-globbed name, and so have no effect:

>>> glob.glob('PyDemos.pyw')
['PyDemos.pyw']
['PyDemos.pyw']

Patterns are not pre-globbed in the DOS shell, though, so the glob.glob calls here are still a good idea in scripts that aspire to be as portable as this one.
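The pattern expansion step can also be written to return one flat list instead of a list of lists, which removes the need for nested loops in the caller. The sketch here assumes Python 3, where map returns a lazy iterator rather than a list; the function name expand_patterns, its directory parameter, and the demo filenames are all hypothetical.

```python
import glob
import os
import tempfile

def expand_patterns(patterns, directory='.'):
    """Return one flat list of names matching any pattern in directory."""
    names = []
    for patt in patterns:
        # A literal filename with no wildcard simply matches itself,
        # so already-expanded names pass through unchanged.
        names.extend(sorted(glob.glob(os.path.join(directory, patt))))
    return names

# Demo in a throwaway directory.
demo_dir = tempfile.mkdtemp()
for name in ('a.py', 'b.py', 'notes.txt'):
    open(os.path.join(demo_dir, name), 'w').close()

found = [os.path.basename(p)
         for p in expand_patterns(['*.py', 'notes.txt'], demo_dir)]
print(found)
```

A file that matches more than one pattern would appear more than once in the result, just as it would be visited more than once by the nested loops in fixeoln_dir.py; that is harmless here because the conversion is repeatable.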

5.2.3 Converting Line Ends in an Entire Tree

Finally, Example 5-4 applies what we've already learned to an entire directory tree. It simply runs the file-converter function on every filename produced by tree-walking logic. In fact, this script really just orchestrates calls to the original and already debugged convertEndlines function.

Example 5-4. PP2E\PyTools\fixeoln_all.py

#########################################################
# Use: "python fixeoln_all.py [tounix|todos] patterns?".
# find and convert end-of-lines in all text files at and
# below the directory where this script is run (the dir 
# you are in when you type 'python'). If needed, tries to 
# use the Python find.py library module, else reads the 
# output of a unix-style find command; uses a default 
# filename patterns list if patterns argument is absent.
# This script only changes files that need to be changed, 
# so it's safe to run brute-force from a root-level dir.
#########################################################

import os, sys, string
debug    = 0
pyfind   = 0      # force py find
listonly = 0      # 1=show find results only

def findFiles(patts, debug=debug, pyfind=pyfind):
    try:
        if sys.platform[:3] == 'win' or pyfind:
            print 'Using Python find'
            try:
                import find                        # use python-code find.py
            except ImportError:                    # use mine if deprecated!
                from PP2E.PyTools import find      # may get from my dir anyhow
            matches = map(find.find, patts)        # startdir default = '.'
        else:
            print 'Using find executable'
            matches = []
            for patt in patts:
                findcmd = 'find . -name "%s" -print' % patt  # run find command
                lines = os.popen(findcmd).readlines( )       # remove endlines
                matches.append(map(string.strip, lines))     # lambda x: x[:-1]
    except:
        assert 0, 'Sorry - cannot find files'
    if debug: print matches
    return matches

if __name__ == '__main__':
    from fixeoln_dir import patts
    from fixeoln_one import convertEndlines

    errmsg = 'Required first argument missing: "todos" or "tounix"'
    assert (len(sys.argv) >= 2 and sys.argv[1] in ['todos', 'tounix']), errmsg

    if len(sys.argv) > 2:                  # quote in unix shell
        patts = sys.argv[2:]               # else tries to expand
    matches = findFiles(patts)

    count = 0
    for matchlist in matches:              # a list of lists
        for fname in matchlist:            # one per pattern
            if listonly:
                print count+1, '=>', fname
            else:
                convertEndlines(sys.argv[1], fname)
            count = count + 1
    print 'Visited %d files' % count

On Windows, the script uses the portable find.find built-in tool we met in Chapter 2 (either Python's or the hand-rolled equivalent)[3] to generate a list of all matching file and directory names in the tree; on other platforms, it resorts to spawning a less portable and probably slower find shell command just for illustration purposes.

[3] Recall that the home directory of a running script is always added to the front of sys.path to give the script import visibility to other files in the script's directory. Because of that, this script would normally load the PP2E\PyTools\find.py module anyhow (not the one in the Python library), by just saying import find; it need not specify the full package path in the import. The try handler and full path import are useful here only if this script is moved to a different source directory. Since I move files a lot, I tend to code with self-inflicted worst-case scenarios in mind.

Once the file pathname lists are compiled, this script simply converts each found file in turn using the single-file converter module's tools. Here is the collection of scripts at work converting the book examples tree on Windows; notice that this script also processes the current working directory (CWD; cd to the directory to be converted before typing the command line), and that Python treats forward and backward slashes the same in the program filename:

C:\temp\examples>python %X%/PyTools/fixeoln_all.py tounix
Using Python find
Changing .\LaunchBrowser.py
Changing .\Launch_PyGadgets.py
Changing .\Launcher.py
Changing .\Other\cgimail.py
 ...lots of lines deleted...
Changing .\EmbExt\Exports\ClassAndMod\output.prog1
Changing .\EmbExt\Exports\output.prog1
Changing .\EmbExt\Regist\output
Visited 1051 files

C:\temp\examples>python %X%/PyTools/fixeoln_all.py todos
Using Python find
Changing .\LaunchBrowser.py
Changing .\Launch_PyGadgets.py
Changing .\Launcher.py
Changing .\Other\cgimail.py
 ...lots of lines deleted...
Changing .\EmbExt\Exports\ClassAndMod\output.prog1
Changing .\EmbExt\Exports\output.prog1
Changing .\EmbExt\Regist\output
Visited 1051 files

C:\temp\examples>python %X%/PyTools/fixeoln_all.py todos
Using Python find
Not a text file .\Embed\Inventory\Output
Not a text file .\Embed\Inventory\WithDbase\Output
Visited 1051 files

The first two commands convert over 1000 files, and usually take some eight seconds of real-world time to finish on my 650 MHz Windows 98 machine; the third takes only six seconds, because no files have to be updated (and fewer messages have to be scrolled on the screen). Don't take these figures too seriously, though; they can vary by system load, and much of this time is probably spent scrolling the script's output to the screen.
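On a modern Python 3 installation, where neither the old find library module nor a Unix find command can be assumed, the tree search can be sketched portably with os.walk and fnmatch. This is a stand-in written for illustration, not the find.find tool used by the script: the name find_files is hypothetical, and unlike find.find it reports only files, never directory names.

```python
import fnmatch
import os
import tempfile

def find_files(pattern, startdir='.'):
    """Return sorted pathnames of files at and below startdir matching pattern."""
    matches = []
    for dirpath, dirnames, filenames in os.walk(startdir):
        for name in fnmatch.filter(filenames, pattern):
            matches.append(os.path.join(dirpath, name))
    return sorted(matches)

# Demo on a small throwaway tree.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, 'sub', 'deeper'))
for rel in ('top.py',
            os.path.join('sub', 'mid.py'),
            os.path.join('sub', 'deeper', 'low.txt')):
    open(os.path.join(root, rel), 'w').close()

found = [os.path.relpath(p, root) for p in find_files('*.py', root)]
print(found)
```

Calling such a function once per pattern yields the same list-of-lists shape that the main loop in Example 5-4 expects.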

5.2.3.1 The view from the top

This script and its ancestors are shipped on the book's CD, as that portable converter tool I was looking for. To convert all examples files in the tree to Unix line-terminator format, simply copy the entire PP2E examples tree to some "examples" directory on your hard drive, and type these two commands in a shell:

cd examples/PP2E
python PyTools/fixeoln_all.py tounix

Of course, this assumes Python is already installed (see the CD's README file for details; see http://examples.oreilly.com/python2), but will work on almost every platform in use today.[4] To convert back to DOS, just replace "tounix" with "todos" and rerun. I ship this tool with a training CD for Python classes I teach too; to convert those files, we simply type:

[4] Except Macs, perhaps -- see Macintosh Line Conversions earlier in this chapter. To convert to Mac format, try replacing the script's import of fixeoln_one to load fixeoln_one_mac.

cd HtmlExamples
python ..\..\Tools\fixeoln_all.py tounix

Once you get accustomed to the command lines, you can use this in all sorts of contexts. Finally, to make the conversion easier for beginners to run, the top-level examples directory includes tounix.py and todos.py scripts that can be simply double-clicked in a file explorer GUI; Example 5-5 shows the "tounix" converter.

Example 5-5. PP2E\tounix.py

#!/usr/local/bin/python
######################################################################
# Run me to convert all text files to UNIX/Linux line-feed format.
# You only need to do this if you see odd '\r' characters at the end
# of lines in text files in this distribution, when they are viewed 
# with your text editor (e.g., vi). This script converts all files 
# at and below the examples root, and only converts files that have 
# not already been converted (it's okay to run this multiple times).
#
# Since this is a Python script which runs another Python script, 
# you must install Python first to run this program; then from your
# system command-line (e.g., a xterm window), cd to the directory 
# where this script lives, and then type "python tounix.py". You 
# may also be able to simply click on this file's icon in your file
# system explorer, if it knows what '.py' file are.
###################################################################### 

import os
prompt = """
This program converts all text files in the book
examples distribution to UNIX line-feed format.
Are you sure you want to do this (y=yes)? """

answer = raw_input(prompt) 
if answer not in ['y', 'Y', 'yes']:
    print 'Cancelled'
else:
    os.system('python PyTools/fixeoln_all.py tounix')

This script addresses the end user's perception of usability, but other factors impact programmer usability -- just as important to systems that will be read or changed by others. For example, the file, directory, and tree converters are coded in separate script files, but there is no law against combining them into a single program that relies on a command-line arguments pattern to know which of the three modes to run. The first argument could be a mode flag, tested by such a program:

if mode == '-one':
    ...
elif mode == '-dir':
    ...
elif mode == '-all':
    ...

That seems more confusing than separate files per mode, though; it's usually much easier to botch a complex command line than to type a specific program file's name. It will also make for a confusing mix of global names, and one very big piece of code at the bottom of the file. As always, simpler is usually better.
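For comparison only, the combined-program alternative could be sketched in Python 3 with a dispatch table in place of the if/elif chain. Everything in this sketch is hypothetical: the three handler functions are placeholders that just describe what they would do, and the names MODES and dispatch are invented for the illustration.

```python
def convert_one(args):
    return 'one: %s' % ' '.join(args)      # placeholder for single-file logic

def convert_dir(args):
    return 'dir: %s' % ' '.join(args)      # placeholder for directory logic

def convert_all(args):
    return 'all: %s' % ' '.join(args)      # placeholder for tree logic

MODES = {'-one': convert_one, '-dir': convert_dir, '-all': convert_all}

def dispatch(argv):
    """Route a command line such as ['-dir', 'tounix'] to its handler."""
    if not argv or argv[0] not in MODES:
        raise ValueError('first argument must be one of %s' % sorted(MODES))
    return MODES[argv[0]](argv[1:])

result = dispatch(['-dir', 'tounix', '*.py'])
print(result)
```

Even in this compact form, the user still has to remember an extra flag, which is the usability cost weighed in this section.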
