Section 6.20. Copying Python Objects and Shallow and Deep Copies



Earlier in Section 3.5, we described how object assignments are simply object references. This means that when you create an object, then assign that object to another variable, Python does not copy the object. Instead, it copies only a reference to the object.
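The point that assignment copies only a reference can be checked directly. The following is a minimal sketch; the variable names (original, alias, same_object, same_id) are chosen for illustration only.

```python
# Assignment binds a second name to the same object; nothing is copied.
original = ['name', ['savings', 100.00]]
alias = original            # no copy: both names refer to one list

same_object = alias is original          # identity comparison
same_id = id(alias) == id(original)      # equivalent check using id()

alias[0] = 'joe'            # a change made through one name...
print(same_object, same_id) # True True
print(original[0])          # ...is visible through the other: joe
```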

For example, let us say that you want to create a generic profile for a young couple; call it person. Then you copy this object for both of them. In the example below, we show two ways of copying an object: one uses a slice and the other a factory function. To show that we have three distinct objects, we use the id() built-in function to display each object's identity. (We can also use the is operator to do the same thing.)

>>> person = ['name', ['savings', 100.00]]
>>> hubby = person[:]       # slice copy
>>> wifey = list(person)    # factory function copy
>>> [id(x) for x in (person, hubby, wifey)]
[11826320, 12223552, 11850936]


Individual savings accounts are created for them with initial $100 deposits. The names are changed to customize each person's object. But when the husband withdraws $50.00, his action affects his wife's account even though separate copies were made. (Of course, this assumes that we want them to have separate accounts and not a single, joint account.) Why is that?

>>> hubby[0] = 'joe'
>>> wifey[0] = 'jane'
>>> hubby, wifey
(['joe', ['savings', 100.0]], ['jane', ['savings', 100.0]])
>>> hubby[1][1] = 50.00
>>> hubby, wifey
(['joe', ['savings', 50.0]], ['jane', ['savings', 50.0]])


The reason is that we have only made a shallow copy. A shallow copy of an object is defined to be a newly created object of the same type as the original object whose contents are references to the elements in the original object. In other words, the copied object itself is new, but the contents are not. Shallow copies of sequence objects are the default type of copy and can be made in any number of ways: (1) taking a complete slice [:], (2) using a factory function, e.g., list(), dict(), etc., or (3) using the copy() function of the copy module.
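The three ways of making a shallow copy listed above can be compared side by side. This is a minimal sketch; the names by_slice, by_factory, and by_module are illustrative and do not come from the original example.

```python
import copy

person = ['name', ['savings', 100.00]]

# Three ways to make a shallow copy of a list.
by_slice = person[:]
by_factory = list(person)
by_module = copy.copy(person)

copies = [by_slice, by_factory, by_module]

# Each copy is a new outer list...
outer_is_new = all(c is not person for c in copies)
# ...but its second element is the very same inner list as the original's.
inner_is_shared = all(c[1] is person[1] for c in copies)

print(outer_is_new, inner_is_shared)   # True True
```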

Your next question should be: when the wife's name was assigned, why did it not affect the husband's name? Shouldn't they both have the name 'jane' now? The names stay separate because assigning to hubby[0] or wifey[0] does not modify the shared string (strings are immutable and cannot be changed in place); it rebinds that slot of one list to a different string object and leaves the other list alone. A shallow copy copies only references, for the string and for the inner list alike, so immediately after copying, both lists refer to the same 'name' string and the same banking list. The difference is that the banking list is mutable: hubby[1][1] = 50.00 changes that shared list in place, and the change is visible through every reference to it. So changing the names is not an issue, but altering any part of their banking information is. Let us take a look at the object IDs for the elements of each list. Note that the banking object is exactly the same in both lists, which is why changes to one affect the other. Note also how, after we change their names, the new name strings replace the original 'name' string:

BEFORE:

>>> [id(x) for x in hubby]
[9919616, 11826320]
>>> [id(x) for x in wifey]
[9919616, 11826320]


AFTER:

>>> [id(x) for x in hubby]
[12092832, 11826320]
>>> [id(x) for x in wifey]
[12191712, 11826320]
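The difference between rebinding a list slot and mutating a shared object can be sketched as follows. This is only an illustration that reuses the names from the running example; shared_before is a helper name introduced here.

```python
person = ['name', ['savings', 100.00]]
hubby = person[:]
wifey = list(person)

# Before any change, both copies refer to the same string and the same list.
shared_before = hubby[0] is wifey[0] and hubby[1] is wifey[1]

# Rebinding: slot 0 of hubby now refers to a different string object.
# Neither wifey's slot 0 nor the original 'name' string is touched.
hubby[0] = 'joe'

# Mutation: the shared inner list is changed in place,
# so the change shows up through every reference to it.
hubby[1][1] = 50.00

print(shared_before)   # True
print(wifey)           # ['name', ['savings', 50.0]]
```

Only the second assignment reaches the wife's data, because it is the only one that modifies an object both lists refer to.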


If the intention was to create a joint account for the couple, then we have a great solution, but if we want separate accounts, we need to change something. To obtain a full or deep copy of the object (a new container whose elements are themselves completely new copies of the elements in the original object), we need to use the copy.deepcopy() function. Let us redo the entire example, this time using deep copies:

>>> person = ['name', ['savings', 100.00]]
>>> hubby = person
>>> import copy
>>> wifey = copy.deepcopy(person)
>>> [id(x) for x in (person, hubby, wifey)]
[12242056, 12242056, 12224232]
>>> hubby[0] = 'joe'
>>> wifey[0] = 'jane'
>>> hubby, wifey
(['joe', ['savings', 100.0]], ['jane', ['savings', 100.0]])
>>> hubby[1][1] = 50.00
>>> hubby, wifey
(['joe', ['savings', 50.0]], ['jane', ['savings', 100.0]])


Now it is just the way we want it. For kicks, let us confirm that all four objects are different:

>>> [id(x) for x in hubby]
[12191712, 11826280]
>>> [id(x) for x in wifey]
[12114080, 12224792]


There are a few more caveats to object copying. The first is that non-container types (i.e., numbers, strings, and other "atomic" objects like code, type, and xrange objects) are not copied; the original object is simply reused. The second is that shallow copies of sequences are all done using complete slices. Finally, deep copies of tuples are not made if they contain only atomic objects. If we changed the banking information to a tuple, we would get only a shallow copy even though we asked for a deep copy:

>>> person = ['name', ('savings', 100.00)]
>>> newPerson = copy.deepcopy(person)
>>> [id(x) for x in (person, newPerson)]
[12225352, 12226112]
>>> [id(x) for x in person]
[9919616, 11800088]
>>> [id(x) for x in newPerson]
[9919616, 11800088]
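The tuple caveat depends on what the tuple contains. The sketch below contrasts a tuple of atomic objects with a tuple that holds a list; it reflects the behavior of the copy module in CPython, and the variable names are illustrative.

```python
import copy

atomic_only = ('savings', 100.00)        # tuple holding only atomic objects
with_list = ('savings', [100.00])        # tuple holding a mutable list

atomic_copy = copy.deepcopy(atomic_only)
list_copy = copy.deepcopy(with_list)

# Nothing inside can change, so deepcopy hands back the original tuple.
reused = atomic_copy is atomic_only
# A mutable element forces a new tuple containing a new list.
new_tuple = list_copy is not with_list
new_inner = list_copy[1] is not with_list[1]

print(reused, new_tuple, new_inner)      # True True True
```

Reusing the original tuple is safe here because neither the tuple nor anything inside it can be modified in place.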


Core Module: copy

The shallow and deep copy operations that we just described are found in the copy module. There are really only two functions to use from this module: copy() creates a shallow copy, and deepcopy() creates a deep copy.
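The two functions also work on containers other than lists. As a minimal sketch, here they are applied to a dictionary; the accounts dictionary is a made-up example, not part of the text above.

```python
import copy

accounts = {'joe': ['savings', 100.00]}

shallow = copy.copy(accounts)       # new dict, same inner list
deep = copy.deepcopy(accounts)      # new dict, new inner list

accounts['joe'][1] = 50.00          # mutate the original's inner list

print(shallow['joe'][1])            # 50.0  (shared with the original)
print(deep['joe'][1])               # 100.0 (independent)
```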




Core Python Programming (2nd Edition)
ISBN: 0132269937
EAN: 2147483647
Year: 2004
Pages: 334
Authors: Wesley J Chun
