Storing Complex Data in Files


Text files are convenient because you can read and manipulate them with any text editor, but they're limited to storing a series of characters. Sometimes you may want to store more complex information, such as a list or a dictionary. You could try to convert the contents of these data structures to characters and save them to a file, but Python offers a much better way: you can store more complex data in a file with a single line of code. You can even store a simple database of values in a single file that acts like a dictionary.

Introducing the Pickle It Program

Pickling means to preserve, and that's just what it means in Python. You can pickle a complex piece of data, like a list or dictionary, and save it in its entirety to a file. Best of all, your hands won't smell like vinegar when you're done.

start sidebar
IN THE REAL WORLD

Other languages can convert complex data for storage in files too, but may not call the process pickling. Instead, these languages may call the process serialization or marshaling.

end sidebar

The Pickle It program pickles, stores, and retrieves three lists of strings. First, the program stores and retrieves the lists sequentially using a file, much like you've seen with characters in a text file. But then the program stores and retrieves the same three lists so that any list can be randomly accessed. The results of the program are shown in Figure 7.4.

Figure 7.4: Each list is written to and read from a file in its entirety.

Pickling Data and Writing It to a File

The first thing I do in the program is import two new modules:

    # Pickle It
    # Demonstrates pickling and shelving data
    # Michael Dawson 5/1/03

    import cPickle, shelve

The cPickle module allows you to pickle and store more complex data in a file. The shelve module allows you to store and randomly access pickled objects in a file.

HINT

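Python also has a pickle module, which works like the cPickle module. pickle is written in Python, while cPickle is written in C. Since cPickle can be much faster, it's better to use cPickle over pickle in almost every case. (This advice applies to Python 2. In Python 3, the cPickle module no longer exists, and pickle uses its C implementation automatically when one is available.)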

Pickling is pretty simple. Instead of writing characters to a text file, you can write a pickled object to a file. Pickled objects are stored in files much like characters; you can store and retrieve them sequentially.

In the next section of code, I pickle and store the three lists variety, shape, and brand in the file pickles1.dat using the cPickle.dump() function. The function requires two arguments: the data to pickle and the file in which to store it.

    print "Pickling lists."
    variety = ["sweet", "hot", "dill"]
    shape = ["whole", "spear", "chip"]
    brand = ["Claussen", "Heinz", "Vlassic"]
    pickle_file = open("pickles1.dat", "w")
    cPickle.dump(variety, pickle_file)
    cPickle.dump(shape, pickle_file)
    cPickle.dump(brand, pickle_file)
    pickle_file.close()

So, this code pickles the list referred to by variety and writes the whole thing as one object to the file pickles1.dat. Next, the program pickles the list referred to by shape and writes the whole thing as one object to the file. Then, the program pickles the list referred to by brand and writes the whole thing as one object to the file. Finally, the program closes the file.
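For readers on Python 3, a minimal sketch of the same step might look like the following. It assumes Python 3, where pickle replaces cPickle and the file must be opened in binary mode ("wb") because pickled data is written as bytes. The helper name save_lists and the temporary directory are illustrative choices, not part of the Pickle It program.

```python
import os
import pickle
import tempfile


def save_lists(path, *lists):
    """Pickle each list in turn and write it to the file at path."""
    # Python 3 assumption: pickled data is bytes, so open in binary mode.
    with open(path, "wb") as pickle_file:
        for data in lists:
            pickle.dump(data, pickle_file)


variety = ["sweet", "hot", "dill"]
shape = ["whole", "spear", "chip"]
brand = ["Claussen", "Heinz", "Vlassic"]

# A temporary directory keeps the sketch from leaving files behind.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "pickles1.dat")
    save_lists(path, variety, shape, brand)
    size = os.path.getsize(path)

print("Wrote", size, "bytes of pickled data.")
```

The with statement closes the file automatically, so no explicit close() call is needed.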

You can pickle a variety of objects, including:

  • Numbers

  • Strings

  • Tuples

  • Lists

  • Dictionaries
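One quick way to check that each of these types survives pickling is a round trip through pickle.dumps() and pickle.loads(), which pickle to and from a bytes object in memory rather than a file. This is only a sketch, assuming Python 3 and the pickle module; the sample values are made up for illustration.

```python
import pickle

samples = [
    42,                     # a number
    3.14,                   # another number
    "dill",                 # a string
    ("whole", "spear"),     # a tuple
    ["sweet", "hot"],       # a list
    {"brand": "Heinz"},     # a dictionary
]

for original in samples:
    pickled = pickle.dumps(original)    # pickle to a bytes object in memory
    restored = pickle.loads(pickled)    # unpickle from that bytes object
    print(type(restored).__name__, restored == original)
```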

Reading Data from a File and Unpickling It

Next, I retrieve and unpickle the three lists with the cPickle.load() function. The function takes one argument: the file from which to load the next pickled object.

    print "\nUnpickling lists."
    pickle_file = open("pickles1.dat", "r")
    variety = cPickle.load(pickle_file)
    shape = cPickle.load(pickle_file)
    brand = cPickle.load(pickle_file)

The program reads the first pickled object in the file, unpickles it to produce the list ["sweet", "hot", "dill"], and assigns the list to variety. Next, the program reads the next pickled object from the file, unpickles it to produce the list ["whole", "spear", "chip"], and assigns the list to shape. Finally, the program reads the last pickled object from the file, unpickles it to produce the list ["Claussen", "Heinz", "Vlassic"], and assigns the list to brand.

Finally, I print the unpickled lists to prove that the process worked:

    print variety, "\n", shape, "\n", brand
    pickle_file.close()
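The program calls load() exactly three times because it knows three objects were stored. When the number of pickled objects isn't known in advance, one possible approach is to keep loading until the file runs out. The sketch below assumes Python 3, where pickle.load() raises EOFError once no objects remain; the helper name load_all is hypothetical.

```python
import os
import pickle
import tempfile


def load_all(path):
    """Return a list of every pickled object stored in the file at path."""
    objects = []
    with open(path, "rb") as pickle_file:
        while True:
            try:
                objects.append(pickle.load(pickle_file))
            except EOFError:
                # pickle.load() raises EOFError when no objects remain.
                break
    return objects


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "pickles1.dat")
    # Store two lists sequentially so there is something to read back.
    with open(path, "wb") as pickle_file:
        pickle.dump(["sweet", "hot", "dill"], pickle_file)
        pickle.dump(["whole", "spear", "chip"], pickle_file)
    loaded = load_all(path)

for item in loaded:
    print(item)
```

The objects come back in the same order in which they were written, which is what makes this kind of access sequential.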

Using a Shelf to Store Pickled Data

Next, I take the idea of pickling one step further by shelving the lists together in a single file. Using the shelve module, I create a shelf that acts like a dictionary, which allows the lists to be accessed randomly.

First, I create a shelf, pickles:

    print "\nShelving lists."
    pickles = shelve.open("pickles2.dat")

The shelve.open() function works a lot like the file open() function. However, the shelve.open() function works with a file that stores pickled objects and not characters. In this case, I assign the resulting shelf to pickles, which now acts like a dictionary whose contents are permanently stored in the file pickles2.dat.

The shelve.open() function requires one argument: a file name. It also takes an optional access mode. If you don't supply an access mode (like I didn't), it defaults to "c". Table 7.3 details access modes for the function.

Table 7.3: shelve ACCESS MODES

Mode    Description
"c"     Open a file for reading or writing. If the file doesn't exist, it's created.
"n"     Create a new file for reading or writing. If the file exists, its contents are overwritten.
"r"     Read from a file. If the file doesn't exist, Python will complain with an error.
"w"     Read from or write to an existing file. If the file doesn't exist, Python will complain with an error.
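To see how the access modes differ in practice, here is a small sketch. It assumes Python 3, where a failed shelve.open() raises one of the exceptions grouped in dbm.error; the temporary directory and the variable names opened_missing and keys_after_n are illustrative.

```python
import dbm
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "pickles2.dat")

    # "r" refuses to create a missing file, so this attempt fails.
    try:
        shelve.open(path, "r")
        opened_missing = True
    except dbm.error:
        opened_missing = False

    # "c" (the default) creates the file when it doesn't exist.
    pickles = shelve.open(path, "c")
    pickles["variety"] = ["sweet", "hot", "dill"]
    pickles.close()

    # "n" always starts over with an empty shelf.
    pickles = shelve.open(path, "n")
    keys_after_n = list(pickles.keys())
    pickles.close()

print(opened_missing, keys_after_n)
```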

Next, I add three lists to the shelf:

    pickles["variety"] = ["sweet", "hot", "dill"]
    pickles["shape"] = ["whole", "spear", "chip"]
    pickles["brand"] = ["Claussen", "Heinz", "Vlassic"]

pickles works like a dictionary. So, the key "variety" is paired with the value ["sweet", "hot", "dill"]. The key "shape" is paired with the value ["whole", "spear", "chip"]. And the key "brand" is paired with the value ["Claussen", "Heinz", "Vlassic"]. One important thing to note is that a shelf key can only be a string.

Lastly, I invoke the shelf's sync() method:

 pickles.sync()   # make sure data is written 

Python writes changes to a shelf file to a buffer and then periodically writes the buffer to the file. To make sure the file reflects all the changes to a shelf, you can invoke a shelf's sync() method. A shelf file is also updated when you close it with its close() method.
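A rough way to confirm that sync() and close() really put the data in the file is to reopen the shelf afterward and look inside. This sketch assumes Python 3; the temporary directory and the variable name stored are illustrative.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "pickles2.dat")

    pickles = shelve.open(path)
    pickles["variety"] = ["sweet", "hot", "dill"]
    pickles.sync()     # flush pending changes without closing the shelf
    pickles["shape"] = ["whole", "spear", "chip"]
    pickles.close()    # closing also writes any remaining changes

    # Reopen read-only to confirm both entries reached the file.
    pickles = shelve.open(path, "r")
    stored = dict(pickles)
    pickles.close()

print(sorted(stored))
```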

HINT

While you could simulate a shelf by pickling a dictionary, the shelve module is more memory efficient. So, if you need random access to pickled objects, create a shelf.

Using a Shelf to Retrieve Pickled Data

Since a shelf acts like a dictionary, you can retrieve pickled objects from it by supplying a key. Next, I loop through all of the pickled objects in pickles, treating it like a dictionary:

    print "\nRetrieving the lists from a shelved file:"
    for key in pickles.keys():
        print key, "-", pickles[key]

I loop through a list of keys, which includes "variety", "shape" and "brand", printing the key and its value. Finally, I close the file:

    pickles.close()
    raw_input("\n\nPress the enter key to exit.")
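The difference between random access and looping over every key can be sketched as follows. This assumes Python 3, where a shelf can be used in a with statement so it is closed automatically; the temporary directory and the variable name lines are illustrative.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "pickles2.dat")

    # Build a small shelf to read from.
    with shelve.open(path) as pickles:
        pickles["variety"] = ["sweet", "hot", "dill"]
        pickles["shape"] = ["whole", "spear", "chip"]
        pickles["brand"] = ["Claussen", "Heinz", "Vlassic"]

    with shelve.open(path, "r") as pickles:
        # Random access: fetch one list by its key.
        brand = pickles["brand"]
        # Visit every key, as with a dictionary. Sorting gives a
        # predictable order, since a shelf doesn't promise one.
        lines = [key + " - " + str(pickles[key]) for key in sorted(pickles)]

for line in lines:
    print(line)
```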

start sidebar
IN THE REAL WORLD

Pickling and unpickling are good ways to store and retrieve structured information, but more complex information can require even more power and flexibility. Databases and XML are two popular methods for storing and retrieving more complex data, and Python has modules that can interface with either. To learn more, visit the Python language Web site at http://www.python.org.

end sidebar




Python Programming for the Absolute Beginner
Python Programming for the Absolute Beginner, 3rd Edition
ISBN: 1435455002
EAN: 2147483647
Year: 2003
Pages: 194
