Recipe 2.3. Searching and Replacing Text in a File
Credit: Jeff Bauer, Adam Krieg

Problem
You need to change one string into another throughout a file.

Solution
String substitution is most simply performed by the replace method of string objects. The work here is to support reading from a specified file (or standard input) and writing to a specified file (or standard output):

    #!/usr/bin/env python
    import os, sys
    nargs = len(sys.argv)
    if not 3 <= nargs <= 5:
        # wrong number of arguments: show how to call the script
        print("usage: %s search_text replace_text [infile [outfile]]" %
              os.path.basename(sys.argv[0]))
    else:
        stext = sys.argv[1]
        rtext = sys.argv[2]
        # default to the standard streams unless file names are given
        input_file = sys.stdin
        output_file = sys.stdout
        if nargs > 3:
            input_file = open(sys.argv[3])
        if nargs > 4:
            output_file = open(sys.argv[4], 'w')
        # copy each line to the output with the substitution applied
        for s in input_file:
            output_file.write(s.replace(stext, rtext))
        output_file.close()
        input_file.close()

Discussion
This recipe is really simple, but that's what's beautiful about it: why do complicated stuff when simple stuff suffices? As indicated by the leading "shebang" line, the recipe is a simple main script, meaning a script meant to be run directly at a shell command prompt, as opposed to a module meant to be imported from elsewhere. The script looks at its arguments to determine the search text, the replacement text, the input file (defaulting to standard input), and the output file (defaulting to standard output). Then, it loops over each line of the input file, writing to the output file a copy of the line with the substitution performed on it. That's all! For tidiness, the script closes both files at the end.

As long as an input file fits comfortably in memory in two copies (one before and one after the replacement, since strings are immutable), we could, with an increase in speed, operate on the entire input file's contents at once instead of looping. With today's low-end PCs typically containing at least 256 MB of memory, handling files of up to about 100 MB should not be a problem, and few text files are bigger than that.
It suffices to replace the for loop with a single statement:

    output_file.write(input_file.read().replace(stext, rtext))

As you can see, that's even simpler than the loop used in the recipe.

See Also
Documentation for the open built-in function, file objects, and strings' replace method in the Library Reference and Python in a Nutshell.
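
The two approaches from the Discussion, the line-by-line loop and the single whole-file statement, can be compared side by side on in-memory text. The following is only a minimal sketch, assuming a current Python 3 interpreter; the function names replace_by_line and replace_whole and the helper run are hypothetical and not part of the recipe, and io.StringIO stands in for real files:

```python
import io

def replace_by_line(input_file, output_file, stext, rtext):
    """Loop over the input one line at a time, as the recipe's for loop does."""
    for line in input_file:
        output_file.write(line.replace(stext, rtext))

def replace_whole(input_file, output_file, stext, rtext):
    """Read the entire input at once, as the one-statement variant does."""
    output_file.write(input_file.read().replace(stext, rtext))

def run(func, text, stext, rtext):
    """Hypothetical helper: apply func to in-memory text and return the result."""
    source = io.StringIO(text)   # plays the role of input_file
    target = io.StringIO()       # plays the role of output_file
    func(source, target, stext, rtext)
    return target.getvalue()

sample = "spam and eggs\nmore spam\n"
by_line = run(replace_by_line, sample, "spam", "ham")
whole = run(replace_whole, sample, "spam", "ham")
```

For search text that contains no newline, both variants produce the same output. They differ only when the search text itself spans a line break: the line-by-line loop never sees two lines together, so it cannot match such text, while the whole-file variant can.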