4.6. Sequence OperationsPython supports a variety of operations applicable to all sequences, including strings, lists, and tuples. Some sequence operations apply to all containers (including, for example, sets and dictionaries, which are not sequences), and some apply to all iterables (meaning "any object on which you can loop," as covered in "Iterables" on page 40; all containers, be they sequences or otherwise, are iterable, and so are many objects that are not containers, such as files, covered in "File Objects" on page 216, and generators, covered in "Generators" on page 78). In the following, I use the terms sequence, container, and iterable, quite precisely and specifically, to indicate exactly which operations apply to each category. 4.6.1. Sequences in GeneralSequences are containers with items that are accessible by indexing or slicing. The built-in len function takes any container as an argument and returns the number of items in the container. The built-in min and max functions take one argument, a nonempty iterable whose items are comparable, and return the smallest and largest items, respectively. You can also call min and max with multiple arguments, in which case they return the smallest and largest arguments, respectively. The built-in sum function takes one argument, an iterable whose items are numbers, and returns the sum of the numbers. 4.6.1.1. Sequence conversionsThere is no implicit conversion between different sequence types, except that plain strings are converted to Unicode strings if needed. (String conversion is covered in detail in "Unicode" on page 198.) You can call the built-ins tuple and list with a single argument (any iterable) to get a new instance of the type you're calling, with the same items (in the same order) as in the argument. 4.6.1.2. Concatenation and repetitionYou can concatenate sequences of the same type with the + operator. You can multiply a sequence S by an integer n with the * operator. S*n or n*S is the concatenation of n copies of S. When n<=0, S*n is an empty sequence of the same type as S. 4.6.1.3. Membership testingThe x in S operator tests to check whether object x equals any item in the sequence (or other kind of container or iterable) S. It returns TRue if it does and False if it doesn't. The x not in S operator is just like not (x in S). In the specific case of strings, though, x in S is more widely applicable; in this case, the operator tests whether x equals any substring of string S, not just any single character. 4.6.1.4. Indexing a sequenceThe nth item of a sequence S is denoted by an indexing: S[n]. Indexing is zero-based (S's first item is S[0]). If S has L items, the index n may be 0, 1...up to and including L-1, but no larger. n may also be -1, -2...down to and including -L, but no smaller. A negative n indicates the same item in S as L+n does. In other words, S[-1], like S[L-1], is the last element of S, S[-2] is the next-to-last one, and so on. For example: x = [1, 2, 3, 4] x[1] # 2 x[-1] # 4 Using an index >=L or <-L raises an exception. Assigning to an item with an invalid index also raises an exception. You can add one or more elements to a list, but to do so you assign to a slice, not an item, as I'll discuss shortly. 4.6.1.5. Slicing a sequenceTo indicate a subsequence of S, you can use a slicing, with the syntax S[i:j], where i and j are integers. S[i:j] is the subsequence of S from the ith item, included, to the jth item, excluded. In Python, ranges always include the lower bound and exclude the upper bound. A slice is an empty subsequence if j is less than or equal to i, or if i is greater than or equal to L, the length of S. You can omit i if it is equal to 0, so that the slice begins from the start of S. You can omit j if it is greater than or equal to L, so that the slice extends all the way to the end of S. You can even omit both indices, to mean a copy of the entire sequence: S[:]. Either or both indices may be less than 0. A negative index indicates the same spot in S as L+n, just like in indexing. An index greater than or equal to L means the end of S, while a negative index less than or equal to -L means the start of S. Here are some examples: x = [1, 2, 3, 4] x[1:3] # [2, 3] x[1:] # [2, 3, 4] x[:2] # [1, 2] Slicing can also use the extended syntax S[i:j:k]. k is the stride of the slice, or the distance between successive indices. S[i:j] is equivalent to S[i:j:1], S[::2] is the subsequence of S that includes all items that have an even index in S, and S[::-1] has the same items as S, but in reverse order. 4.6.2. StringsString objects are immutable, so attempting to rebind or delete an item or slice of a string raises an exception. The items of a string object (corresponding to each of the characters in the string) are themselves strings, each of length 1. The slices of a string object are also strings. String objects have several methods, which are covered in "Methods of String Objects" on page 186. 4.6.3. TuplesTuple objects are immutable, so attempting to rebind or delete an item or slice of a tuple raises an exception. The items of a tuple are arbitrary objects and may be of different types. The slices of a tuple are also tuples. Tuples have no normal (nonspecial) methods, only some of the special methods covered in "Special Methods" on page 104. 4.6.4. ListsList objects are mutable, so you may rebind or delete items and slices of a list. The items of a list are arbitrary objects and may be of different types. The slices of a list are lists. 4.6.4.1. Modifying a listYou can modify a list by assigning to an indexing. For instance: x = [1, 2, 3, 4] x[1] = 42 # x is now [1, 42, 3, 4] Another way to modify a list object L is to use a slice of L as the target (LHS) of an assignment statement. The RHS of the assignment must then be an iterable. If the LHS slice is in extended form, then the RHS must have just as many items as the number of items in the LHS slice. However, if the LHS slice is in normal (nonextended) form, then the LHS slice and the RHS may each be of any length; assigning to a (nonextended) slice of a list can add or remove list items. For example: x = [1, 2, 3, 4] x[1:3] = [22, 33, 44] # x is now [1, 22, 33, 44, 4] x[1:4] = [8, 9] # x is now [1, 8, 9, 4] Here are some important special cases of assignment to slices:
You can delete an item or a slice from a list with del. For instance: x = [1, 2, 3, 4, 5] del x[1] # x is now [1, 3, 4, 5] del x[::2] # x is now [3, 5] 4.6.4.2. In-place operations on a listList objects define in-place versions of the + and * operators, which you can use via augmented assignment statements. The augmented assignment statement L+=L1 has the effect of adding the items of iterable L1 to the end of L. L*=n has the effect of adding n-1 copies of L to the end of L; if n<=0, L*=n empties the contents of L, like L[:]=[]. 4.6.4.3. List methodsList objects provide several methods, as shown in Table 4-3. Nonmutating methods return a result without altering the object to which they apply, while mutating methods may alter the object to which they apply. Many of the mutating methods behave like assignments to appropriate slices of the list. In Table 4-3, L indicates any list object, i any valid index in L, s any iterable, and x any object.
All mutating methods of list objects, except pop, return None. 4.6.4.4. Sorting a listA list's method sort causes the list to be sorted in-place (its items are reordered to place them in increasing order) in a way that is guaranteed to be stable (elements that compare equal are not exchanged). In practice, sort is extremely fast, often preternaturally fast, as it can exploit any order or reverse order that may be present in any sublist (the advanced algorithm sort uses, known as timsort to honor its developer, Tim Peters, is technically a "non-recursive adaptive stable natural mergesort/binary insertion sort hybrid"). In Python 2.3, a list's sort method takes one optional argument. If present, the argument must be a function that, when called with any two list items as arguments, returns -1, 0, or 1, depending on whether the first item is to be considered less than, equal to, or greater than the second item for sorting purposes. Passing the argument slows down the sort, although it makes it easy to sort small lists in flexible ways. The decorate-sort-undecorate idiom, covered in "Searching and sorting" on page 485 is faster, at least as flexible, and often less error-prone than passing an argument to sort. In Python 2.4, the sort method takes three optional arguments, which may be passed with either positional or named-argument syntax. Argument cmp plays just the same role as the only (unnamed) optional argument in Python 2.3. Argument key, if not None, must be a function that can be called with any list item as its only argument. In this case, to compare any two items x and y, Python uses cmp(key(x),key(y)) rather than cmp(x,y) (in practice, this is implemented in the same way as the decorate-sort-undecorate idiom presented in "Searching and sorting" on page 485, and can be even faster). Argument reverse, if TRue, causes the result of each comparison to be reversed; this is not the same thing as reversing L after sorting because the sort is stable (elements that compare equal are never exchanged) whether argument reverse is true or false. Python 2.4 also provides a new built-in function sorted (covered in sorted on page 167) to produce a sorted list from any input iterable. sorted accepts the same input arguments as a list's method sort. Moreover, Python 2.4 adds to module operator (covered in "The operator Module" on page 368) two new higher-order functions, attrgetter and itemgetter, which produce functions particularly suitable for the key= optional argument of lists' method sort and the new built-in function sorted. In Python 2.5, an identical key= optional argument has also been added to built-in functions min and max, and to functions nsmallest and nlargest in standard library module heapq, covered in "The heapq Module" on page 177. |