Reading from Text Files


With Python, it's easy to read strings from plain text files—files that are made up of only ASCII characters. (Although there are different types of text files, when I use the term "text file," I mean a plain text file.) Text files are a good choice for permanently storing simple information, for a number of reasons. First, text files are cross-platform. A text file on a Windows machine is the same text file on a Mac and is the same text file under Unix. Second, text files are easy to use. Most operating systems come with basic tools to view and edit them.

Introducing the Read It Program

The Read It program demonstrates several ways you can read strings from a text file. The program demonstrates how to read anything from a single character to the entire file. It also shows several different ways to read one line at a time (probably the most common way you'll access text files). The program reads a simple text file I created on my system using a text editor. Here are the contents of the file:

    Line 1
    This is line 2
    That makes this line 3

I saved the file with the name read_it.txt and put it in the same directory as the Read It program file for easy access. Figure 7.2 illustrates the program.

Figure 7.2: The file is read using a few different techniques.

Here's the code for the program:

    # Read It
    # Demonstrates reading from a text file
    # Michael Dawson - 4/28/03

    print "Opening and closing the file."
    text_file = open("read_it.txt", "r")
    text_file.close()

    print "\nReading characters from the file."
    text_file = open("read_it.txt", "r")
    print text_file.read(1)
    print text_file.read(5)
    text_file.close()

    print "\nReading the entire file at once."
    text_file = open("read_it.txt", "r")
    whole_thing = text_file.read()
    print whole_thing
    text_file.close()

    print "\nReading characters from a line."
    text_file = open("read_it.txt", "r")
    print text_file.readline(1)
    print text_file.readline(5)
    text_file.close()

    print "\nReading one line at a time."
    text_file = open("read_it.txt", "r")
    print text_file.readline()
    print text_file.readline()
    print text_file.readline()
    text_file.close()

    print "\nReading the entire file into a list."
    text_file = open("read_it.txt", "r")
    lines = text_file.readlines()
    print lines
    print len(lines)
    for line in lines:
        print line
    text_file.close()

    print "\nLooping through the file, line by line."
    text_file = open("read_it.txt", "r")
    for line in text_file:
        print line
    text_file.close()

    raw_input("\n\nPress the enter key to exit.")

I'll show you exactly how the code works through an interactive session.

Opening and Closing a Text File

Before you can read from a text file, you need to open it. That's the first thing I do in the Read It program:

    >>> text_file = open("read_it.txt", "r")

I use the open() function to open a text file and assign the results to text_file. In the function call, I provide two string arguments: a file name and an access mode.

The file argument, "read_it.txt", is pretty straightforward. Since I don't include any path information, Python looks in the current directory for the file. I can access a file in any directory by providing the proper path information. For example, on my Windows machine I could provide an absolute path with the string r"C:\Documents and Settings\Owner\Desktop\read_it.txt" to access the file read_it.txt located on my desktop. This will access the file regardless of the directory from which Read It is run. Or, I could provide a relative path with the string r"data\read_it.txt" to access the file read_it.txt located in the subdirectory data of the directory from which Read It is run. (The r prefix makes each string a raw string, so Python doesn't treat a sequence like \r as an escape character.) In either case, I'm not limited to accessing files only from the directory where Read It is run.
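Backslashes in Windows paths deserve a little care, because Python treats some backslash sequences inside ordinary string literals as escape characters. The sketch below is only an illustration: the variable names are made up for this example, and print is called with parentheses so the code also runs under Python 3.

```python
import os

# In an ordinary string literal, "\r" is the carriage-return escape,
# so this path silently loses its backslash and the letter r.
risky = "data\read_it.txt"
print(repr(risky))

# A raw string keeps every backslash exactly as typed.
raw = r"data\read_it.txt"

# Doubling the backslash produces the same string.
doubled = "data\\read_it.txt"
print(raw == doubled)

# os.path.join() picks the right separator for the current platform.
portable = os.path.join("data", "read_it.txt")
print(portable)
```

Using os.path.join() is one reasonable way to keep a program's paths working across operating systems, though a raw string is fine for a path that will only ever be used on Windows.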

Next, I provide "r" for the access mode, which tells Python that I want to open the file for reading. You can open a file for reading, writing, or both. Table 7.1 describes valid access modes.

Table 7.1: SELECTED FILE ACCESS MODES

Mode    Description
"r"     Read from a file. If the file doesn't exist, Python will complain with an error.
"w"     Write to a file. If the file exists, its contents are overwritten. If the file doesn't exist, it's created.
"a"     Append to a file. If the file exists, new data is appended to it. If the file doesn't exist, it's created.
"r+"    Read from and write to a file. If the file doesn't exist, Python will complain with an error.
"w+"    Write to and read from a file. If the file exists, its contents are overwritten. If the file doesn't exist, it's created.
"a+"    Append to and read from a file. If the file exists, new data is appended to it. If the file doesn't exist, it's created.
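To see a few of these modes in action, here is a small sketch that is not part of the Read It program. It works in a temporary directory, and the file names modes_demo.txt and no_such_file.txt are hypothetical, chosen only for this example. print is called with parentheses so the code also runs under Python 3.

```python
import os
import tempfile

# Work in a throwaway directory so no real files are touched.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "modes_demo.txt")  # hypothetical file name

# "w" creates the file (or empties an existing one) before writing.
demo_file = open(path, "w")
demo_file.write("first\n")
demo_file.close()

# "a" keeps the existing contents and adds to the end.
demo_file = open(path, "a")
demo_file.write("second\n")
demo_file.close()

# "r" reads the file back.
demo_file = open(path, "r")
after_append = demo_file.read()
demo_file.close()

# Opening with "w" again discards what was there.
demo_file = open(path, "w")
demo_file.write("third\n")
demo_file.close()

demo_file = open(path, "r")
after_overwrite = demo_file.read()
demo_file.close()

# "r" on a missing file raises an error instead of creating the file.
try:
    open(os.path.join(folder, "no_such_file.txt"), "r")
    missing_raised = False
except IOError:
    missing_raised = True

print(repr(after_append))
print(repr(after_overwrite))
print(missing_raised)
```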

After opening the file, I access it through the variable text_file. There are many useful file methods that I can invoke, but the simplest is close(), which closes the file. That's what I do next in the program:

    >>> text_file.close()

Whenever you're done with a file, it's good programming practice to close it.
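One way to make sure a file always gets closed, even if something goes wrong while reading it, is to wrap the work in try/finally or in a with statement. Neither appears in the Read It program, and the with statement needs a newer Python than the one the program was written for, so treat this as a sketch rather than the program's own approach. The sample file is recreated in a temporary directory, and print is called with parentheses so the code also runs under Python 3.

```python
import os
import tempfile

# Recreate the sample file in a throwaway directory (an assumption for this sketch).
path = os.path.join(tempfile.mkdtemp(), "read_it.txt")
out_file = open(path, "w")
out_file.write("Line 1\nThis is line 2\nThat makes this line 3\n")
out_file.close()

# try/finally guarantees that close() runs even if the read fails.
text_file = open(path, "r")
try:
    first = text_file.readline()
finally:
    text_file.close()

# The with statement closes the file automatically when the block ends.
with open(path, "r") as other_file:
    whole_thing = other_file.read()

# Both file objects report that they are closed.
print(text_file.closed)
print(other_file.closed)
```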

Reading Characters from a Text File

For a file to be of any use, you need to do something with its contents between opening and closing it. So next, I open the file and read its contents with the read() file method. read() allows you to read a specified number of characters from a file, which the method returns as a string. After opening the file again, I read and print exactly one character from it:

    >>> text_file = open("read_it.txt", "r")
    >>> print text_file.read(1)
    L

All I have to do is specify the number of characters between the parentheses. Next, I read and print the next five characters:

    >>> print text_file.read(5)
    ine 1

Notice that I read the five characters following the "L". Python remembers where I last left off. It's like the computer puts a bookmark in the file and each subsequent read() begins where the last ended. When you read to the end of a file, subsequent reads return the empty string.
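The bookmark behavior can be checked from start to finish with a short script. This is a sketch only: it recreates the sample file in a temporary directory, and the variable names are chosen for this example. print is called with parentheses so the code also runs under Python 3.

```python
import os
import tempfile

# Recreate the sample file in a throwaway directory (an assumption for this sketch).
path = os.path.join(tempfile.mkdtemp(), "read_it.txt")
out_file = open(path, "w")
out_file.write("Line 1\nThis is line 2\nThat makes this line 3\n")
out_file.close()

text_file = open(path, "r")
first = text_file.read(1)      # "L"
next_five = text_file.read(5)  # "ine 1", picking up where the last read ended
rest = text_file.read()        # everything after the bookmark
at_end = text_file.read(5)     # "" because the bookmark is at the end of the file
text_file.close()

print(repr(first))
print(repr(next_five))
print(repr(rest))
print(repr(at_end))
```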

To start back at the beginning of a file, you can close and open it. That's just what I did next:

    >>> text_file.close()
    >>> text_file = open("read_it.txt", "r")
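Closing and reopening isn't the only way to get back to the start. File objects also have a seek() method, which the Read It program doesn't use; calling seek(0) moves the bookmark back to the beginning. The sketch below assumes the same sample file, recreated in a temporary directory, and calls print with parentheses so the code also runs under Python 3.

```python
import os
import tempfile

# Recreate the sample file in a throwaway directory (an assumption for this sketch).
path = os.path.join(tempfile.mkdtemp(), "read_it.txt")
out_file = open(path, "w")
out_file.write("Line 1\nThis is line 2\nThat makes this line 3\n")
out_file.close()

text_file = open(path, "r")
first_pass = text_file.read(6)   # reads "Line 1"
text_file.seek(0)                # move the bookmark back to the start
second_pass = text_file.read(6)  # reads "Line 1" again
text_file.close()

print(first_pass)
print(second_pass)
```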

If you don't specify the number of characters to be read, Python returns the entire file as a string. Next, I read the entire file, assign the returned string to a variable, and print the variable:

    >>> whole_thing = text_file.read()
    >>> print whole_thing
    Line 1
    This is line 2
    That makes this line 3

If a file is small enough, reading the entire thing at once may make sense. Since I've read the entire file, any subsequent reads will just return the empty string. So, I close the file again:

    >>> text_file.close()

Reading Characters from a Line

Often, you'll want to work with one line of a text file at a time. The readline() method lets you read characters from the current line. You just pass the number of characters you want read from the current line and the method returns them as a string. If you don't pass a number, the method returns the entire line. Once you read all of the characters of a line, the next line becomes the current line. After opening the file again, I read the first character of the current line:

    >>> text_file = open("read_it.txt", "r")
    >>> print text_file.readline(1)
    L

Then I read the next five characters of the current line:

    >>> print text_file.readline(5)
    ine 1
    >>> text_file.close()

At this point, readline() may seem no different from read(), but readline() reads characters from the current line only, while read() reads characters from the entire file. Because of this, readline() is usually invoked to read one line of text at a time. In the next few lines of code, I read the file, one line at a time:

    >>> text_file = open("read_it.txt", "r")
    >>> print text_file.readline()
    Line 1

    >>> print text_file.readline()
    This is line 2

    >>> print text_file.readline()
    That makes this line 3

    >>> text_file.close()

Notice that a blank line appears after each line. That's because each line in the text file ends with a newline character ("\n"), and the print statement then adds a newline of its own.
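The difference between readline() and read(), and the trailing newline that each line carries, are easier to see side by side. The following is a minimal sketch, not part of the Read It program: it recreates the sample file in a temporary directory, asks both methods for up to 100 characters, and strips the newline with the string method rstrip(). print is called with parentheses so the code also runs under Python 3.

```python
import os
import tempfile

# Recreate the sample file in a throwaway directory (an assumption for this sketch).
path = os.path.join(tempfile.mkdtemp(), "read_it.txt")
sample = "Line 1\nThis is line 2\nThat makes this line 3\n"
out_file = open(path, "w")
out_file.write(sample)
out_file.close()

# readline() never goes past the end of the current line,
# even when asked for more characters than the line holds.
text_file = open(path, "r")
line_part = text_file.readline(100)
text_file.close()

# read() keeps going across lines until it has enough characters
# or reaches the end of the file.
text_file = open(path, "r")
file_part = text_file.read(100)
text_file.close()

# Removing the trailing newline avoids the extra blank line when printing.
stripped = line_part.rstrip("\n")

print(repr(line_part))
print(repr(file_part))
print(stripped)
```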

Reading All Lines into a List

Another way to work with individual lines of a text file is the readlines() method, which reads a text file into a list, where each line of the file becomes a string element in the list. Next, I invoke the readlines() method:

    >>> text_file = open("read_it.txt", "r")
    >>> lines = text_file.readlines()

lines now refers to a list with an element for each line in the text file:

    >>> print lines
    ['Line 1\n', 'This is line 2\n', 'That makes this line 3\n']

lines is like any list. You can find the length of it and even loop through it:

    >>> print len(lines)
    3
    >>> for line in lines:
            print line

    Line 1

    This is line 2

    That makes this line 3

    >>> text_file.close()
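Because every element of the list still ends with "\n", a common follow-up step is to strip the newlines before working with the lines. Here's one possible sketch using a list comprehension. The names clean and longest are made up for this example, the sample file is recreated in a temporary directory, and print is called with parentheses so the code also runs under Python 3.

```python
import os
import tempfile

# Recreate the sample file in a throwaway directory (an assumption for this sketch).
path = os.path.join(tempfile.mkdtemp(), "read_it.txt")
out_file = open(path, "w")
out_file.write("Line 1\nThis is line 2\nThat makes this line 3\n")
out_file.close()

text_file = open(path, "r")
lines = text_file.readlines()
text_file.close()

# Build a new list with the trailing newline removed from each line.
clean = [line.rstrip("\n") for line in lines]

# With the newlines gone, ordinary list and string tools work as expected.
longest = max(clean, key=len)

print(clean)
print(longest)
```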

Looping through a Text File

Starting in Python 2.2, you can loop directly through the lines of a text file:

    >>> text_file = open("read_it.txt", "r")
    >>> for line in text_file:
            print line

    Line 1

    This is line 2

    That makes this line 3

    >>> text_file.close()

This technique is the most elegant solution if you want to move through all of the lines of a text file.
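As an illustration of what looping directly over a file makes possible, the sketch below numbers each line as it goes. It's an example under assumptions rather than part of the Read It program: the sample file is recreated in a temporary directory, the list name numbered is made up for this example, and the with statement and enumerate() start value need a newer Python than 2.2. print is called with parentheses so the code also runs under Python 3.

```python
import os
import tempfile

# Recreate the sample file in a throwaway directory (an assumption for this sketch).
path = os.path.join(tempfile.mkdtemp(), "read_it.txt")
out_file = open(path, "w")
out_file.write("Line 1\nThis is line 2\nThat makes this line 3\n")
out_file.close()

numbered = []
with open(path, "r") as text_file:
    # enumerate() pairs each line with a counter that starts at 1.
    for number, line in enumerate(text_file, 1):
        numbered.append("%d: %s" % (number, line.rstrip("\n")))

for entry in numbered:
    print(entry)
```

Only one line is held in memory at a time during the loop, which is part of why this approach suits larger files better than readlines().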




Python Programming for the Absolute Beginner
Python Programming for the Absolute Beginner, 3rd Edition
ISBN: 1435455002
Year: 2003
Pages: 194
