7.5 References Versus Copies

Section 4.6 in Chapter 4 mentioned that assignments always store references to objects, not copies. In practice, this is usually what you want. But because assignments can generate multiple references to the same object, you sometimes need to be aware that changing a mutable object in-place may affect other references to the same object elsewhere in your program. If you don't want such behavior, you'll need to tell Python to copy the object explicitly.

For instance, the following example creates a list assigned to X, and another assigned to L that embeds a reference back to list X. It also creates a dictionary D that contains another reference back to list X:

>>> X = [1, 2, 3]
>>> L = ['a', X, 'b']           # Embed references to X's object.
>>> D = {'x':X, 'y':2}

At this point, there are three references to the first list created: from name X, from inside the list assigned to L, and from inside the dictionary assigned to D. The situation is illustrated in Figure 7-2.

Figure 7-2. Shared object references

Since lists are mutable, changing the shared list object from any of the three references changes what the other two reference:

>>> X[1] = 'surprise'         # Changes all three references!
>>> L
['a', [1, 'surprise', 3], 'b']
>>> D
{'x': [1, 'surprise', 3], 'y': 2}
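The sharing described above can be checked directly with the is operator, which tests object identity rather than equality. The following is a minimal sketch that reuses the names X, L, and D from the example; the names shared_in_list and shared_in_dict are introduced here only for illustration.

```python
# Rebuild the example: one list object reachable through three references.
X = [1, 2, 3]
L = ['a', X, 'b']
D = {'x': X, 'y': 2}

# `is` is true only when both operands are the very same object.
shared_in_list = L[1] is X
shared_in_dict = D['x'] is X
print(shared_in_list, shared_in_dict)   # True True

# An in-place change made through X is visible through L and D as well.
X[1] = 'surprise'
print(L[1])      # [1, 'surprise', 3]
print(D['x'])    # [1, 'surprise', 3]
```

Because all three names lead to a single object, there is nothing to keep in sync: the change happens once, in one place.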

References are a higher-level analog of pointers in other languages. Although you can't grab hold of the reference itself, it's possible to store the same reference in more than one place: variables, lists, and so on. This is a feature: you can pass a large object around a program without generating copies of it along the way. If you really do want copies, you can request them:

  • Slice expressions with empty limits copy sequences.

  • The dictionary copy method copies a dictionary.

  • Some built-in functions such as list also make copies.

  • The copy standard library module makes full copies.
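The four techniques in this list can be compared side by side. This is a small sketch rather than a definitive recipe; the variable names (original_list, slice_copy, and so on) are hypothetical and chosen only for this illustration.

```python
import copy

original_list = [1, 2, 3]
original_dict = {'a': 1, 'b': 2}

slice_copy = original_list[:]                  # Empty-limit slice
dict_copy = original_dict.copy()               # Dictionary copy method
builtin_copy = list(original_list)             # Built-in list function
module_copy = copy.deepcopy(original_list)     # copy module, full copy

# Each copy is equal in value to its source but is a distinct object.
for made in (slice_copy, builtin_copy, module_copy):
    print(made == original_list, made is original_list)   # True False

print(dict_copy == original_dict, dict_copy is original_dict)  # True False
```

For a flat list of numbers such as this one, all three list techniques produce the same result; they differ only when the object contains nested mutable objects.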

For example, if you have a list and a dictionary, and don't want their values to be changed through other variables:

>>> L = [1,2,3]
>>> D = {'a':1, 'b':2}

simply assign copies to the other variables, not references to the same objects:

>>> A = L[:]              # Instead of: A = L (list(L) also copies)
>>> B = D.copy()          # Instead of: B = D

This way, changes made from other variables change the copies, not the originals:

>>> A[1] = 'Ni'
>>> B['c'] = 'spam'
>>>
>>> L, D
([1, 2, 3], {'a': 1, 'b': 2})
>>> A, B
([1, 'Ni', 3], {'a': 1, 'c': 'spam', 'b': 2})

In terms of the original example, you can avoid the reference side effects by slicing the original list, instead of simply naming it:

>>> X = [1, 2, 3]
>>> L = ['a', X[:], 'b']          # Embed copies of X's object.
>>> D = {'x':X[:], 'y':2}

This changes the picture in Figure 7-2: L and D now point to different lists than X. The net effect is that changes made through X will impact only X, not L and D; similarly, changes to L or D will not impact X.

One note on copies: empty-limit slices and the dictionary copy method make only a top-level copy; they do not copy nested data structures, if any are present. If you need a complete, fully independent copy of a deeply nested data structure, use the standard copy module: import copy, and say X = copy.deepcopy(Y) to fully copy an arbitrarily nested object Y. This call recursively traverses objects to copy all their parts. This is a much rarer case, though (which is why you have to say more to make it go). References are usually the behavior you want; when they are not, slices and copy methods are usually as much copying as you'll need to do.
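The difference between a top-level copy and a full copy shows up only when the object contains nested mutable objects. Here is a minimal sketch, assuming a nested list Y made up for this illustration; the names shallow and deep are likewise hypothetical.

```python
import copy

Y = [1, [2, 3], {'k': [4]}]

shallow = Y[:]             # New top-level list, same nested objects
deep = copy.deepcopy(Y)    # New top-level list and new nested objects

# Change the original at the top level and inside its nested parts.
Y[0] = 'top'
Y[1].append(99)
Y[2]['k'][0] = 'changed'

print(shallow)   # [1, [2, 3, 99], {'k': ['changed']}]
print(deep)      # [1, [2, 3], {'k': [4]}]
```

The top-level assignment to Y[0] does not reach the slice copy, but the nested changes do, because the slice copied only references to the inner list and dictionary. The deepcopy result is unaffected by any of the three changes.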



Learning Python
Learning Python: Powerful Object-Oriented Programming
ISBN: 0596158068
EAN: 2147483647
Year: 2003
Pages: 253
Authors: Mark Lutz
