4.8. Categorizing the Standard Types

If we were to be maximally verbose in describing the standard types, we would probably call them something like Python's "basic built-in data object primitive types."

  • "Basic," indicating that these are the standard or core types that Python provides

  • "Built-in," due to the fact that these types come by default in Python

  • "Data," because they are used for general data storage

  • "Object," because objects are the default abstraction for data and functionality

  • "Primitive," because these types provide the lowest-level granularity of data storage

  • "Types," because that's what they are: data types!

However, this description does not really tell you how each type works or what functionality applies to it. Some types share characteristics in how they function, and others share the way their data values are accessed. We should also be interested in whether the data that these types hold can be updated and what kind of storage they provide.

There are three different models we have come up with to help categorize the standard types, with each model showing us the interrelationships between the types. These models help us obtain a better understanding of how the types are related, as well as how they work.

4.8.1. Storage Model

The first way we can categorize the types is by how many objects can be stored in an object of this type. Python's types, as well as types from most other languages, can hold either single or multiple values. A type that holds a single literal object we will call atomic or scalar storage, and one that can hold multiple objects we will call container storage. (Container objects are also referred to as composite or compound objects in the documentation, but some of these terms refer to objects other than types, such as class instances.) Container types bring up the additional issue of whether different types of objects can be stored. All of Python's container types can hold objects of different types. Table 4.6 categorizes Python's types by storage model.

Table 4.6. Types Categorized by the Storage Model

Storage Model Category    Python Types That Fit Category
----------------------    ------------------------------
Scalar/atom               Numbers (all numeric types), strings (all are literals)
Container                 Lists, tuples, dictionaries


Although strings may seem like a container type since they "contain" characters (and usually more than one character), they are not considered as such because Python does not have a character type (see Section 4.8). Thus strings are self-contained literals.
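The two storage-model claims above can be checked with a short sketch. The variable names (mixed, kinds, word, first) are illustrative and do not come from the text, and print() is written in call form so that the snippet should run on either Python 2 or Python 3.

```python
# A container can hold objects of different types side by side.
mixed = ['ammonia', 83, 85.5, ('nested', 'tuple'), {'key': 'value'}]
kinds = [type(item).__name__ for item in mixed]
print(kinds)

# Python has no character type: indexing a string yields another string
# (of length 1), not a distinct "char" object.
word = 'Python'
first = word[0]
print(type(first) is type(word))   # both are strings
print(len(first))
```

The single "character" taken from the string is itself a string, which is why strings are treated here as self-contained literals rather than as containers of some smaller type.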

4.8.2. Update Model

Another way of categorizing the standard types is by asking the question, "Once created, can objects be changed, or can their values be updated?" When we introduced Python types early on, we indicated that certain types allow their values to be updated and others do not. Mutable objects are those whose values can be changed, and immutable objects are those whose values cannot be changed. Table 4.7 illustrates which types support updates and which do not.

Table 4.7. Types Categorized by the Update Model

Update Model Category    Python Types That Fit Category
---------------------    ------------------------------
Mutable                  Lists, dictionaries
Immutable                Numbers, strings, tuples


After looking at the table, a thought that may immediately come to mind is, "Wait a minute! What do you mean that numbers and strings are immutable? I've done things like the following:"

    x = 'Python numbers and strings'
    x = 'are immutable?!? What gives?'
    i = 0
    i = i + 1


"They sure as heck don't look immutable to me!" That is true to some degree, but looks can be deceiving. What is really happening behind the scenes is that the original objects are actually being replaced in the above examples. Yes, that is right. Read that again.

Rather than the original objects being modified, new objects with the new values were allocated and (re)assigned to the original variable names, and the old objects became eligible for garbage collection once nothing else referred to them. One can confirm this by using the id() BIF to compare object identities before and after such assignments.

If we add calls to id() to our example above, we should be able to see that each name ends up referring to a different object, as below:

    >>> x = 'Python numbers and strings'
    >>> print id(x)
    16191392
    >>> x = 'are immutable?!? What gives?'
    >>> print id(x)
    16191232
    >>> i = 0
    >>> print id(i)
    7749552
    >>> i = i + 1
    >>> print id(i)
    7749600


Your mileage will vary with regard to the object IDs as they will differ between executions. On the flip side, lists can be modified without replacing the original object, as illustrated in the code below:

    >>> aList = ['ammonia', 83, 85, 'lady']
    >>> aList
    ['ammonia', 83, 85, 'lady']
    >>>
    >>> aList[2]
    85
    >>>
    >>> id(aList)
    135443480
    >>>
    >>> aList[2] = aList[2] + 1
    >>> aList[3] = 'stereo'
    >>> aList
    ['ammonia', 83, 86, 'stereo']
    >>>
    >>> id(aList)
    135443480
    >>>
    >>> aList.append('gaudy')
    >>> aList.append(aList[2] + 1)
    >>> aList
    ['ammonia', 83, 86, 'stereo', 'gaudy', 87]
    >>>
    >>> id(aList)
    135443480


Notice how for each change, the ID for the list remained the same.
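Another way to see the difference, without comparing raw ID numbers, is to bind a second name to the same object. The sketch below is only an illustration: the names alias, y, aTuple, and errors are made up for this example, and print() is written in call form so that it should run on either Python 2 or Python 3.

```python
# Two names bound to the same list see the same in-place change.
aList = ['ammonia', 83, 85, 'lady']
alias = aList
aList[2] = aList[2] + 1
print(alias is aList)    # True: still one object
print(alias[2])          # 86: the change is visible through both names

# Rebinding a name to a new string leaves other references untouched.
x = 'Python numbers and strings'
y = x
x = 'are immutable?!? What gives?'
print(y)                 # still the original string

# Immutable types reject item assignment outright.
aTuple = ('ammonia', 83, 85, 'lady')
errors = []
for immutable in (aTuple, 'lady'):
    try:
        immutable[0] = 'stereo'
    except TypeError as exc:
        errors.append(type(exc).__name__)
print(errors)            # ['TypeError', 'TypeError']
```

Updating the list changes the one object that both names share, whereas "updating" the string only points x at a new object and leaves y alone.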

4.8.3. Access Model

Although the previous two models of categorizing the types are useful when being introduced to Python, they are not the primary models for differentiating the types. For that purpose, we use the access model. By this, we mean, how do we access the values of our stored data? There are three categories under the access model: direct, sequence, and mapping. The different access models and which types fall into each respective category are given in Table 4.8.

Table 4.8. Types Categorized by the Access Model

Access Model Category    Types That Fit Category
---------------------    -----------------------
Direct                   Numbers
Sequence                 Strings, lists, tuples
Mapping                  Dictionaries


Direct types indicate single-element, non-container types. All numeric types fit into this category.

Sequence types are those whose elements are sequentially accessible via index values starting at 0. Items can be accessed either individually or in groups, better known as slices. Types that fall into this category include strings, lists, and tuples. As we mentioned before, Python does not support a character type, so, although strings are literals, they are a sequence type because of the ability to access substrings sequentially.
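As a brief illustration of sequence access, the same indexing and slicing syntax applies to all three sequence types. The sample values and the names aString, aList, and aTuple are chosen only for this sketch.

```python
aString = 'Python'
aList = ['ammonia', 83, 85, 'lady']
aTuple = ('ammonia', 83, 85, 'lady')

for seq in (aString, aList, aTuple):
    print(seq[0])     # single element at offset 0
    print(seq[1:3])   # slice: the elements at offsets 1 and 2
```

Note that a slice has the same type as the sequence it came from: slicing a string gives a string, slicing a list gives a list, and slicing a tuple gives a tuple.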

Mapping types are indexed much like sequences, except that instead of a sequential numeric offset, elements (values) are accessed with a key. The elements are unordered, which makes a mapping type a collection of hashed key-value pairs.
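A small sketch of key-based access follows. The dictionary contents and the name aDict are hypothetical, chosen only to show the lookup behavior.

```python
aDict = {'host': 'earth', 'port': 80}
print(aDict['host'])          # value looked up by key, not by position
aDict['port'] = 8080          # dictionaries are mutable: update in place
print(aDict['port'])

# A numeric offset is just another key; with no such key, lookup fails.
try:
    aDict[0]
except KeyError:
    print('no key 0')

# Keys must be hashable, so a (mutable) list cannot serve as a key.
try:
    aDict[['a', 'list']] = 1
except TypeError:
    print('lists are unhashable')
```

The last two cases show where the analogy with sequences ends: a mapping has no notion of "the element at offset 0", and its keys are located by hashing rather than by position.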

We will use this primary model in the next chapter by presenting each access model type and what all types in that category have in common (such as operators and BIFs), then discussing each Python standard type that fits into those categories. Any operators, BIFs, and methods unique to a specific type will be highlighted in their respective sections.

So why this side trip to view the same data types from differing perspectives? Well, first of all, why categorize at all? Because of the high-level data structures that Python provides, we need to differentiate the "primitive" types from those that provide more functionality. Another reason is to be clear on what the expected behavior of a type should be. For example, if we minimize the number of times we ask ourselves, "What are the differences between lists and tuples again?" or "What types are immutable and which are not?" then we have done our job. And finally, certain categories have general characteristics that apply to all types in a certain category. A good craftsman (and craftswoman) should know what is available in his or her toolboxes.

The second part of our inquiry asks, "Why all these different models or perspectives?" It seems that there is no one way of classifying all of the data types. They all have crossed relationships with each other, and we feel it best to expose the different sets of relationships shared by all the types. We also want to show how each type is unique in its own right. No two types map the same across all categories. (Of course, all numeric subtypes do, so we are categorizing them together.) Finally, we believe that understanding all these relationships will ultimately play an important implicit role during development. The more you know about each type, the more you are apt to use the correct ones in the parts of your application where they are the most appropriate, and where you can maximize performance.

We summarize by presenting a cross-reference chart (see Table 4.9) that shows all the standard types, the three different models we use for categorization, and where each type fits into these models.

Table 4.9. Categorizing the Standard Types

Data Type       Storage Model    Update Model    Access Model
------------    -------------    ------------    ------------
Numbers         Scalar           Immutable       Direct
Strings         Scalar           Immutable       Sequence
Lists           Container        Mutable         Sequence
Tuples          Container        Immutable       Sequence
Dictionaries    Container        Mutable         Mapping
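The cross-reference in Table 4.9 can also be expressed as code. The following is a minimal sketch, not something taken from the text: the function name categorize is hypothetical, and it assumes Python 3, where the abstract base classes live in collections.abc. It covers only the five standard types discussed in this section.

```python
import numbers
from collections.abc import Mapping, MutableMapping, MutableSequence, Sequence


def categorize(obj):
    """Return (storage, update, access) for one of the standard types."""
    if isinstance(obj, numbers.Number):
        return ('scalar', 'immutable', 'direct')
    if isinstance(obj, str):
        # Strings are sequences for access but scalars for storage,
        # so they must be tested before the general Sequence check.
        return ('scalar', 'immutable', 'sequence')
    if isinstance(obj, Sequence):
        update = 'mutable' if isinstance(obj, MutableSequence) else 'immutable'
        return ('container', update, 'sequence')
    if isinstance(obj, Mapping):
        update = 'mutable' if isinstance(obj, MutableMapping) else 'immutable'
        return ('container', update, 'mapping')
    raise TypeError('not one of the standard types covered here')


for sample in (42, 3.14, 'Python', ['a', 1], ('a', 1), {'a': 1}):
    print(type(sample).__name__, categorize(sample))
```

The order of the checks matters. A string would also pass the Sequence test, so it has to be singled out first to be reported as scalar storage, which mirrors the point that no two types map the same across all three models.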




Core Python Programming (2nd Edition)
ISBN: 0132269937
Year: 2004
Pages: 334
Authors: Wesley J Chun
