16.4. Array Objects

Numeric supplies a type array that represents a grid of items. An array object a has a given number of dimensions, known as its rank, up to some arbitrarily high limit (normally 30, when Numeric is built with default options). A scalar (i.e., a single number) has rank 0, a vector has rank 1, a matrix has rank 2, and so forth.

16.4.1. Typecodes

The values in the grid cells of an array object, known as the elements of the array, are homogeneous, meaning they are all of the same type, and all element values are stored within one memory area. This contrasts with a list, where items may be of different types, each stored as a separate Python object. This means a Numeric array occupies far less memory than a Python list with the same number of items. The type of a's elements is encoded as a's typecode, a one-character string, as shown in Table 16-2. Factory functions that build array instances (covered in "Factory Functions" on page 384) take a typecode argument that is one of the values in Table 16-2.

Table 16-2. Typecodes for Numeric arrays

Typecode  C type          Python type     Synonym
'c'       char            str (length 1)  Character
'b'       unsigned char   int             UnsignedInt8
'1'       signed char     int             Int8
's'       short           int             Int16
'w'       unsigned short  int             UnsignedInt16
'i'       int             int             Int32
'u'       unsigned        int             UnsignedInt32
'l'       long            int             Int
'f'       float           float           Float32
'F'       Two floats      complex         Complex32
'd'       double          float           Float
'D'       Two doubles     complex         Complex
'O'       PyObject*       any             PyObject

Numeric supplies readable attribute names for each typecode, as shown in the last column of Table 16-2. Numeric also supplies, on all platforms, the names Int0, Float0, Float8, Float16, Float64, Complex0, Complex8, Complex16, and Complex64. In each case, the name refers to the smallest type of the requested kind with at least that many bits. For example, Float8 is the smallest floating-point type of at least 8 bits (generally the same as Float0 and Float32, but some platforms might, in theory, supply very small floating-point types), while Complex0 is the smallest complex type. On some platforms, Numeric also supplies names Int64, Int128, Float128, and Complex128, with similar meanings. These names are not supplied on all platforms because not all platforms provide numbers with that many bits. A typecode of 'O' means that elements are references to Python objects. In this case, elements can be of different types. This lets you use Numeric array objects as Python containers for array-processing tasks that may have nothing to do with numeric processing.

When you build an array a with one of Numeric's factory functions, you can either specify a's typecode explicitly or accept a default data-dependent typecode. To get the typecode of an array a, call a.typecode( ). a's typecode determines how many bytes each element of a takes up in memory. Call a.itemsize( ) to get this information. When the typecode is 'O', the item size is small (e.g., 4 bytes on a 32-bit platform), but this size accounts only for the reference held in each of a's cells. The objects indicated by the references are stored elsewhere as separate Python objects; each such object, depending on its type, may occupy an arbitrary amount of extra memory, not accounted for in the item size of an array with typecode 'O'.

16.4.2. Shape and Indexing

Each array object a has an attribute a.shape, which is a tuple of ints. len(a.shape) is a's rank; for example, a one-dimensional array of numbers (also known as a vector) has rank 1, and a.shape has just one item. More generally, each item of a.shape is the length of the corresponding dimension of a. a's number of elements, known as its size, is the product of all items of a.shape. Each dimension of a is also known as an axis. Axis indices are from 0 and up, as is usual in Python. Negative axis indices are allowed and count from the right, so -1 is the last (rightmost) axis.
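
The relationship between shape, rank, and size can be sketched in plain Python, without Numeric. The helper names rank_of and size_of are hypothetical, introduced only for illustration, and the code targets Python 3 rather than the Python 2 used elsewhere in this chapter.

```python
from functools import reduce
from operator import mul

def rank_of(shape):
    """Rank is the number of axes, i.e., the length of the shape tuple."""
    return len(shape)

def size_of(shape):
    """Size is the product of all axis lengths (1 for the empty shape)."""
    return reduce(mul, shape, 1)

shape = (4, 2, 3)
print(rank_of(shape), size_of(shape))   # prints: 3 24
print(rank_of(()), size_of(()))         # prints: 0 1
```

The second call mirrors the rank-0 case: an empty shape tuple still describes exactly one element.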

Each array a is a Python sequence. Each item a[i] of a is a subarray of a, meaning it is an array with a rank one less than a's: a[i].shape==a.shape[1:]. For example, if a is a two-dimensional matrix (a is of rank 2), a[i], for any valid index i, is a one-dimensional subarray of a that corresponds to a row of the matrix. When a's rank is 1 or 0, a's items are a's elements (just one element, for rank-0 arrays). Since a is a sequence, you can index a with normal indexing syntax to access or change a's items. Note that a's items are a's subarrays; only for an array of rank 1 or 0 are the array's items the same thing as the array's elements.

You can also loop on a in a for, just as you can with any other sequence. For example:

    for x in a:
        process(x)

means the same thing as:

    for i in range(len(a)):
        x = a[i]
        process(x)

In these examples, each item x of a in the for loop is a subarray of a. For example, if a is a two-dimensional matrix, each x in either of these loops is a one-dimensional subarray of a that corresponds to a row of the matrix.

You can also index a by a tuple. For example, if a's rank is at least 2, you can write a[i][j] as a[i,j], for any valid i and j, for rebinding as well as for access. Tuple indexing is faster and more convenient. Do not put parentheses inside the brackets to indicate that you are indexing a by a tuple: just write the indices one after the other, separated by commas. a[i,j] means the same thing as a[(i,j)], but the form without parentheses is more natural and readable.

If the result of indexing is a single number, Numeric sometimes leaves the result as a rank-0 array, and sometimes as a scalar quantity of the appropriate Python type. In other words, as a result of such an indexing you sometimes get an array with just one number in it, and sometimes the number it contains. For example, consider the snippet:

 >>> for t in 'blswiufFdDO': print t, type(Numeric.array([0],t)[0]) 

The somewhat surprising output is:

    b <type 'array'>
    l <type 'int'>
    s <type 'array'>
    w <type 'array'>
    i <type 'int'>
    u <type 'array'>
    f <type 'array'>
    F <type 'array'>
    d <type 'float'>
    D <type 'complex'>
    O <type 'int'>

which shows that, for single-result indexing, array types that correspond exactly to a Python number type produce Python numbers, while other array types produce rank-0 arrays.

16.4.3. Storage

An array object a is usually stored in a contiguous area of memory, with elements one after the other in what is traditionally called row-major order. For example, when a's rank is 2, the elements of a's first row a[0] come first, immediately followed by those of a's second row a[1], and so on.

An array can be noncontiguous when it shares some of the storage of a larger array, as covered in "Slicing" on page 381. For example, when a's rank is 2, the slice b=a[:,0] is the first column of a, and is stored noncontiguously because it occupies some of the same storage as a. b[0] occupies the same storage as a[0,0], while b[1] occupies the same storage as a[1,0], which cannot be adjacent to the memory occupied by a[0,0] when a has more than one column.
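
To make the storage layout concrete, here is a minimal pure-Python sketch of row-major offsets. It does not use Numeric and says nothing about how Numeric computes addresses internally; the function name row_major_offset is hypothetical, and offsets are counted in elements rather than bytes.

```python
def row_major_offset(index, shape):
    """Offset (in elements) of `index` within a contiguous row-major block."""
    offset = 0
    for i, n in zip(index, shape):
        if not 0 <= i < n:
            raise IndexError("index out of range")
        # Moving one step along an axis skips everything to its right.
        offset = offset * n + i
    return offset

shape = (3, 4)                      # three rows, four columns
first_row = [row_major_offset((0, j), shape) for j in range(4)]
first_column = [row_major_offset((i, 0), shape) for i in range(3)]
print(first_row)       # prints: [0, 1, 2, 3]
print(first_column)    # prints: [0, 4, 8]
```

The first row occupies adjacent positions, while the elements of the first column sit four positions apart, which is why a column slice of a matrix with more than one column cannot be contiguous.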

Numeric handles contiguous and noncontiguous arrays transparently in most cases so that you can use the most natural approach without wasting memory or requiring avoidable copies. In the rest of this chapter, I point out the rare exceptions where a contiguous array is needed. When you want to copy a noncontiguous array b into a new contiguous array c, use method copy, covered in copy on page 387.

16.4.4. Slicing

Arrays may share some or all of their data with other arrays. Numeric shares data between arrays whenever feasible. If you want Numeric to copy data, explicitly ask for a copy. Data sharing, for Numeric, also applies to slices. For built-in Python lists and standard library array objects, slices are (shallow) copies, but for Numeric.array objects, slices share data with the array they're sliced from:

    from Numeric import *
    alist=range(10)
    list_slice=alist[3:7]
    list_slice[2]=22
    print list_slice, alist       # prints: [3,4,22,6] [0,1,2,3,4,5,6,7,8,9]
    anarray=array(alist)
    arr_slice=anarray[3:7]
    arr_slice[2]=33
    print arr_slice, anarray      # prints: [3 4 33 6] [0 1 2 3 4 33 6 7 8 9]

Rebinding an item of list_slice does not affect the list alist from which list_slice is sliced, since, for built-in lists, slicing performs a copy. However, because, for Numeric arrays, slicing shares data, assigning to an item of arr_slice does affect the array object anarray from which arr_slice is sliced. This behavior may be unexpected for a beginner, but was chosen to enable high performance.
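
If Numeric is not at hand, the same copy-versus-share distinction can be observed with the standard library alone. The following Python 3 sketch is only an analogy: it uses the built-in memoryview type over a standard library array to stand in for a data-sharing slice, and it is not a model of Numeric's implementation.

```python
from array import array

numbers = array('i', range(10))

# Slicing a standard library array copies, as described above.
copied = numbers[3:7]
copied[2] = 22
print(numbers.tolist())     # prints: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# Slicing a memoryview shares the underlying buffer instead.
view = memoryview(numbers)[3:7]
view[2] = 33
print(view.tolist())        # prints: [3, 4, 33, 6]
print(numbers.tolist())     # prints: [0, 1, 2, 3, 4, 33, 6, 7, 8, 9]
```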

16.4.4.1. Slicing examples

You can use a tuple to slice an array, just as you can use the tuple to index the array: for arrays, slicing and indexing blend into each other. Each item in a slicing tuple can be an integer, and for each such item, the slice has one fewer axis than the array being sliced: slicing removes the axis for which you give a number by selecting the indicated plane of the array.

A slicing tuple's item can also be a slice expression; the general syntax is start:stop:step, and you can omit one or more of the three parts (see "Sequence Operations" on page 53 and slice on page 156, for details on slice semantics and defaults). Here are some example slicings:

    # a is [[ 0, 1, 2, 3, 4, 5],
    #       [10,11,12,13,14,15],
    #       [20,21,22,23,24,25],
    #       [30,31,32,33,34,35],
    #       [40,41,42,43,44,45],
    #       [50,51,52,53,54,55]]
    a[0,2:4]                        # array([2,3])
    a[3:,3:]                        # array([[33,34,35],
                                    #        [43,44,45],
                                    #        [53,54,55]])
    a[:,4]                          # array([4,14,24,34,44,54])
    a[2::2,::2]                     # array([[20,22,24],
                                    #        [40,42,44]])

A slicing-tuple item can also use an ellipsis (...) to indicate that the following items in the slicing tuple apply to the last (rightmost) axes of the array you're slicing. For example, consider slicing an array b of rank 3:

    b.shape                            # (4,2,3)
    b[1].shape                         # (2,3)
    b[...,1].shape                     # (4,2)

When we slice with b[1] (equivalent to indexing), we give an integer index for axis 0, and therefore we select a specific plane along b's axis 0. By selecting a specific plane, we remove that axis from the result's shape. Therefore, the result's shape is b.shape[1:]. When we slice with b[...,1], we select a specific plane along b's axis -1 (the rightmost axis of b). Again, by selecting a specific plane, we remove that axis from the result's shape. Therefore, the result's shape is b.shape[:-1].

A slicing-tuple item can also be the pseudoindex NewAxis, which is a constant supplied by module Numeric. The resulting slice has an additional axis at the point at which you use NewAxis, with a value of 1 in the corresponding item of the shape tuple. Continuing the previous example:

 b[Numeric.NewAxis,...,Numeric.NewAxis].shape       # (1,4,2,3,1) 

Here, rather than selecting and thus removing some of b's axes, we have added two new axes, one at the start of the shape and one at the end, thanks to the ellipsis.

Axis removal and addition can both occur in the same slicing. For example:

 b[Numeric.NewAxis,:,0,:,Numeric.NewAxis].shape     # (1,4,3,1) 

Here, we both add new axes at the start and end of the shape, and select a specific index from the middle axis (axis 1) of b by giving an index for that axis. Therefore, axis 1 of b is removed from the result's shape. The colons (:) used as the second and fourth items in the slicing tuple in this example are slice expressions with both start and stop omitted, meaning that all of the corresponding axis is included in the slice. In all these examples, all slices share some or all of b's data. Slicing affects only the shape of the resulting array. No data is copied, and no operations are performed on the data.
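
The shape rules described in this section can be summarized in a short pure-Python sketch that computes only the resulting shape, never touching any data. The function name sliced_shape is hypothetical, None stands in for NewAxis as an assumption of this sketch, and the code is written for Python 3 without Numeric.

```python
NEW_AXIS = None   # assumed stand-in for Numeric.NewAxis in this sketch

def sliced_shape(shape, index):
    """Return the shape that results from applying `index` to `shape`."""
    if not isinstance(index, tuple):
        index = (index,)
    # Integers and slice expressions each consume one axis of the source.
    consuming = sum(1 for item in index
                    if item is not NEW_AXIS and item is not Ellipsis)
    if consuming > len(shape):
        raise IndexError("too many indices")
    result = []
    axis = 0
    seen_ellipsis = False
    for item in index:
        if item is NEW_AXIS:
            result.append(1)                  # adds an axis of length 1
        elif item is Ellipsis:
            if seen_ellipsis:
                raise IndexError("only one ellipsis is supported here")
            seen_ellipsis = True
            skipped = len(shape) - consuming  # axes kept whole
            result.extend(shape[axis:axis + skipped])
            axis += skipped
        elif isinstance(item, slice):
            result.append(len(range(*item.indices(shape[axis]))))
            axis += 1
        else:                                 # an integer removes its axis
            if not -shape[axis] <= item < shape[axis]:
                raise IndexError("index out of range")
            axis += 1
    result.extend(shape[axis:])               # untouched trailing axes
    return tuple(result)

b_shape = (4, 2, 3)
print(sliced_shape(b_shape, 1))                               # prints: (2, 3)
print(sliced_shape(b_shape, (Ellipsis, 1)))                   # prints: (4, 2)
print(sliced_shape(b_shape, (NEW_AXIS, Ellipsis, NEW_AXIS)))  # prints: (1, 4, 2, 3, 1)
print(sliced_shape(b_shape, (NEW_AXIS, slice(None), 0, slice(None), NEW_AXIS)))
# prints: (1, 4, 3, 1)
```

The four calls reproduce the shapes shown for b in the examples of this section.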

16.4.4.2. Assigning to array slices

Assignment to array slices is less flexible than assignment to list slices. Normally, the only thing you can assign to an array slice is another array of the same shape as the slice. However, if the righthand side (RHS) of the assignment is not an array, Numeric creates a temporary array from it. Each element of the RHS is coerced to the lefthand side (LHS) type. If the RHS array is not the same shape as the LHS slice, broadcasting applies, as covered in "Operations on Arrays" on page 389. For example, you can assign a scalar (meaning a single number) to any slice of a numeric array: the RHS number is coerced, then broadcast (replicated) as needed to make the assignment succeed.

When you assign to an array slice (or indexing) a RHS of a type different from that of the LHS, Numeric coerces the values to the LHS type (for example, by truncating floating-point numbers to integers). This does not apply if the RHS values are complex. Full coercion does not apply to in-place operators, which can only cast the RHS values upward (for example, an integer RHS is okay for in-place operations with a floating-point LHS, but not vice versa), as covered in "In-place operations" on page 391.

16.4.5. Truth Values and Comparisons of Arrays

Although an array object a is a Python sequence, a does not follow Python's normal rule for the truth value of sequences (a sequence is false when empty; otherwise, it is true). Rather, a is false when a has no elements or when all of a's elements are 0. Since comparisons between arrays produce arrays (whose items are 0 or 1), Numeric's rule is necessary to let you test for element-wise equality of arrays in the natural way:

 if a==b: 

Without this proviso, such an if condition would be satisfied by any nonempty comparable arrays a and b. Despite this rule, array comparison is still tricky, since the comparison of two arrays is true if any one of the corresponding elements is equal:

 print bool(Numeric.array([1,2])==Numeric.array([1,9])) # prints True (!) 

A better way to express such comparisons is offered by Numeric's functions alltrue and sometrue, covered in Table 16-4; I suggest you never rely on the confusing behavior of if a==b but rather make your intentions clear and explicit by coding either if Numeric.alltrue(a==b) or if Numeric.sometrue(a==b).
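
The difference between the two intentions can be illustrated without Numeric. In this Python 3 sketch, plain lists stand in for arrays, and the built-ins any and all play roles analogous to sometrue and alltrue; this is an analogy, not Numeric's implementation.

```python
a = [1, 2]
b = [1, 9]

# Element-wise comparison, yielding 0 or 1 per pair as array comparison does.
equal = [int(x == y) for x, y in zip(a, b)]
print(equal)        # prints: [1, 0]
print(any(equal))   # prints: True   (at least one pair matches)
print(all(equal))   # prints: False  (not every pair matches)
```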

Do remember, at any rate, that you have to be explicit when you want to test whether a has any items or whether a has any elements, which are different conditions:

    a = Numeric.array( [ [  ], [  ], [  ] ] )
    if a: print 'a is true'
    else: print 'a is false'                       # prints: a is false
    print bool(Numeric.alltrue(a))                 # prints: False
    print bool(Numeric.sometrue(a))                # prints: False
    if len(a): print 'a has some items'
    else: print 'a has no items'                   # prints: a has some items
    if Numeric.size(a): print 'a has some elements'
    else: print 'a has no elements'                # prints: a has no elements

In most cases, however, the best way to compare arrays of numbers is for approximate equality, using Numeric's function allclose, covered in allclose on page 391.

16.4.6. Factory Functions

Numeric supplies several factory functions that create array objects.

array, asarray

array(data,typecode=None,copy=True,savespace=False)
asarray(data,typecode=None,savespace=False)

Returns a new array object a. a's shape depends on data. When data is a number, a has rank 0 and a.shape is the empty tuple ( ). When data is a sequence of numbers, a has rank 1 and a.shape is the singleton tuple (len(data),). When data is a sequence of sequences of numbers, all of data's items (subsequences) must have the same length, a has rank 2, and a.shape is the pair (len(data),len(data[0])). This idea generalizes to any nesting level of data as a sequence of sequences, up to the arbitrarily high limit on rank mentioned earlier in this chapter. If data is nested over that limit, array raises TypeError. (The limit is unlikely to be a problem in practice: an array of rank 30, with each axis of length 2, would have over a billion elements.)

typecode can be any of the values shown in Table 16-2 or None. When typecode is None, array chooses a typecode depending on the types of the elements of data. When any one or more elements in data are long or are neither numbers nor plain strings (e.g., None or Unicode strings), the typecode is 'O', a.k.a. PyObject. When all elements are plain strings, the typecode is Character. When any one or more elements (but not all) are plain strings, all others are numbers (none of them long), and typecode is None, array raises TypeError. You must explicitly pass 'O' or PyObject as argument typecode if you want array to build an array from some plain strings and some ints or floats. When all elements are numbers (none of them long), the typecode depends on the "widest" numeric type among the elements. When any of the elements is a complex, the typecode is Complex. When no elements are complex but some or all are float, the typecode is Float. When all elements are int, the typecode is Int.

Function array, by default, returns an array object a that doesn't share data with any other object. If data is an array object, and you explicitly pass a false value for argument copy, array returns an array object a that shares data with data, if feasible. Function asarray is just like function array with argument copy passed as False.

By default, a numeric array is implicitly cast up when operated with numbers of wider numeric types. When you do not want this implicit casting, you can save some memory by explicitly passing a true value for argument savespace to the array factory function to set the resulting array object a into space-saving mode. For example:

    array(range(4),typecode='b')+2.0                  # array([2.,3.,4.,5.])
    array(range(4),typecode='b',savespace=True)+2.0   # array([2,3,4,5])
    array(range(4),typecode='b',savespace=True)+258.7 # array([2,3,4,5])

The first statement creates an array of floating-point values; savespace is not specified, so each element is implicitly cast up to a float when added to 2.0. The second and third statements create arrays of 8-bit integers; savespace is specified, so, instead of implicit casting up of the array's elements, we get implicit casting down of the float added to each element. 258.7 is cast down to 2; the fractional part .7 is lost because of the cast to an integer, and the resulting 258 becomes 2 because, since the cast is to 8-bit integers, only the lowest 8 bits are kept. The savespace mode can be useful for large arrays, but be careful lest you suffer unexpected loss of precision when using it.
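
The arithmetic behind the cast of 258.7 down to 2 can be checked in plain Python. The helper cast_to_uint8 is hypothetical and models only the two steps named in the text (dropping the fractional part, then keeping the lowest 8 bits) for nonnegative values; it does not claim to reproduce Numeric's casting in every case.

```python
def cast_to_uint8(value):
    """Truncate toward zero, then keep only the lowest 8 bits."""
    return int(value) & 0xFF

print(cast_to_uint8(2.0))     # prints: 2
print(cast_to_uint8(258.7))   # prints: 2
print([n + cast_to_uint8(258.7) for n in range(4)])   # prints: [2, 3, 4, 5]
```

Since 258 is 256 + 2, discarding everything above the lowest 8 bits leaves 2, which is why the second and third statements in the example give the same result.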

arrayrange, arange

arrayrange([start,]stop[,step=1],typecode=None)

Like array(range(start,stop,step),typecode), but faster. (See built-in function range, covered in "range", for details about start, stop, and step.) arrayrange allows floats for these arguments, not just ints. Be careful when exploiting this feature, since floating-point arithmetic may lead to a result with one more or fewer items than you might expect. arange is a synonym of arrayrange.
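
The following Python 3 sketch shows how a float step can yield one more item than expected. It assumes the item count is computed as ceil((stop - start)/step), which is the rule NumPy documents for its own arange; whether Numeric uses exactly this formula is an assumption, and the function name expected_count is hypothetical.

```python
import math

def expected_count(start, stop, step):
    """Item count under the assumed rule ceil((stop - start) / step)."""
    return max(0, math.ceil((stop - start) / step))

print(expected_count(0, 10, 1))        # prints: 10
print((1.3 - 1.0) / 0.1)               # prints: 3.0000000000000004
print(expected_count(1.0, 1.3, 0.1))   # prints: 4
```

Mathematically the last quotient is exactly 3, so you might expect the three items 1.0, 1.1, and 1.2; rounding error pushes the quotient just above 3, and under the assumed rule a fourth item appears.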

fromstring

fromstring(data,count=None,typecode=Int)

Returns a one-dimensional array a of shape (count,) with data copied from the bytes of string data. When count is None, len(data) must be a multiple of typecode's item size, and a's shape is (len(data)/a.itemsize( ),). When count is not None, len(data) must be greater than or equal to count*a.itemsize( ), and fromstring ignores data's trailing bytes, if any.
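
The length rules for count can be sketched with the standard library struct module. This is not Numeric's fromstring: the function from_bytes and its parameter code are hypothetical, it returns a plain list, and it uses struct's standard sizes (4 bytes for 'i') rather than Numeric typecodes.

```python
import struct

def from_bytes(data, code='i', count=None):
    """Decode `count` items of struct format `code` from the start of `data`."""
    itemsize = struct.calcsize('=' + code)
    if count is None:
        if len(data) % itemsize:
            raise ValueError("length is not a multiple of the item size")
        count = len(data) // itemsize
    elif len(data) < count * itemsize:
        raise ValueError("not enough bytes for the requested count")
    # unpack_from reads from the front and ignores trailing bytes.
    return list(struct.unpack_from('=%d%s' % (count, code), data))

raw = struct.pack('=4i', 10, 20, 30, 40)    # 16 bytes
print(from_bytes(raw))                      # prints: [10, 20, 30, 40]
print(from_bytes(raw + b'xyz', count=2))    # prints: [10, 20]
```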

Together with methods a.tostring and a.byteswapped (covered in "Attributes and Methods" on page 387), fromstring allows binary I/O of array objects. When you need to save arrays and later reload them, and don't need to use the saved form in non-Python programs, it's simpler and faster to use module cPickle, covered in "The pickle and cPickle Modules" on page 279. Many experienced users prefer to use portable, self-describing file formats such as netCDF (see http://met-www.cit.cornell.edu/noon/ncmodule.html).

identity

identity(n,typecode=Int)

Returns a two-dimensional array a of shape (n,n) (a square matrix). a's elements are 0, except those on the main diagonal (a[j,j] for j in range(n)), which are 1.

empty

empty(shapetuple,typecode=Int,savespace=False)

Returns an array a with a.shape==shapetuple. a's elements are not initialized, so their values are totally arbitrary (as in other languages that allow "uninitialized variables," and unlike any other situation in Python).

ones

ones(shapetuple,typecode=Int,savespace=False)

Returns an array a with a.shape==shapetuple. All of a's elements are 1.

zeros

zeros(shapetuple,typecode=Int,savespace=False)

Returns an array a with a.shape==shapetuple. All of a's elements are 0.

By default, identity, ones, and zeros all return arrays whose type is Int. If you want a different typecode, such as Float, pass it explicitly. A common mistake is:

    a = zeros(3)
    a[0] = 0.3                    # a is array([0,0,0])

Since a is Int in this snippet, the 0.3 we assign to one of its items gets truncated to the integer 0. Instead, you typically want something closer to the following:

    a = zeros(3, Float)
    a[0] = 0.3                    # a is array([0.3,0.,0.])

Here, we have explicitly specified Float as the typecode for a, and therefore no truncation occurs when we assign 0.3 to one of a's items.


16.4.7. Attributes and Methods

For most array manipulations, Numeric supplies functions you can call with array arguments, covered in "Functions" on page 391. Arguments can also be Python lists; this polymorphism offers more flexibility than functionality packaged up as array attributes and methods. Each array object a also supplies some methods and attributes for direct (and slightly faster) access to functionality that may not need polymorphism.

astype

a.astype(typecode)

Returns a new array b with the same shape as a. b's elements are a's elements coerced to the type indicated by typecode. b does not share a's data, even if typecode equals a.typecode( ).

byteswapped

a.byteswapped( )

Returns a new array object b with the same typecode and shape as a. Each element of b is copied from the corresponding element of a, inverting the order of the bytes in the value. This swapping transforms each value from little-endian to big-endian or vice versa. Together with function fromstring and method a.tostring, the swapping helps when you have binary data from one kind of machine and need them for the other kind. For example, all Apple Mac computers sold through 2005 had PowerPC CPUs, which are big-endian, but new Mac computers use Intel CPUs, which are little-endian; byteswapped can help you read, on a new Mac, a binary file written on an older Mac, or vice versa.

copy

a.copy( )

Returns a new contiguous array object b that is identical to a but does not share a's data.

flat

a.flat is an attribute that is an array of rank 1, with the same size as a, that shares a's data. Indexing or slicing a.flat lets you access or change a's elements through this alternate view of a. Trying to access a.flat raises a TypeError exception when a is noncontiguous. When a is contiguous, a.flat is in row-major order. For example, when a's shape is (7,4) (i.e., a is a two-dimensional matrix with seven rows and four columns), a.flat[i] is the same as a[divmod(i,4)] for all i in range(28).

imag, imaginary, real

Trying to access a.imag raises a ValueError exception unless a's typecode is complex. When a's typecode is not complex, a.real is an array with the same shape and typecode as a that shares data with a. When a's typecode is complex, a.real and a.imag are noncontiguous arrays with the same shape as a and a float typecode, and they share data with a. Accessing or modifying a.real or a.imag accesses or modifies the real or imaginary parts of a's complex elements. imaginary is a synonym of imag.

iscontiguous

a.iscontiguous( )

Returns true if a's data occupies contiguous storage; otherwise, False. This matters particularly when interfacing to C-coded extensions. a.copy( ) makes a contiguous copy of a. Noncontiguous arrays arise when slicing or transposing arrays, as well as for attributes a.real and a.imag of an array a with a complex typecode.

itemsize

a.itemsize( )

Returns the number of bytes of memory used by each of a's elements (despite the name, not by each of a's items, which in the general case are subarrays of a).

savespace

a.savespace(flag=True)

Sets or resets the space-saving mode of array a, depending on flag. When flag is true, a.savespace(flag) sets a's space-saving mode, so that a's elements are not implicitly cast up when operated with wider numeric types. (For more details on this, see the discussion of the savespace argument of function array in array on page 376.) When flag is false, a.savespace(flag) resets a's space-saving mode so that a's elements are implicitly cast up when needed.

shape

a.shape is a tuple with one item per axis of a, giving the length of that axis. You can assign a sequence of ints to a.shape to change the shape of a, but a's size (total number of elements) must remain the same. When you assign to a.shape a sequence s, one of s's items can be -1, meaning that the length along that axis is whatever is needed to keep a's size unchanged. The product of the other items of s must evenly divide a's size, or else the reshaping raises an exception. When you need to change the total number of elements in a, call function resize (covered in resize on page 397).
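
The rule for a -1 item in the new shape can be written out as a short pure-Python sketch. The function resolve_shape is hypothetical and only computes the resulting shape tuple; which exception Numeric actually raises on failure is not specified here, so ValueError is an arbitrary choice.

```python
def resolve_shape(size, new_shape):
    """Fill in a single -1 item so the product of the items equals `size`."""
    new_shape = list(new_shape)
    unknown = [k for k, n in enumerate(new_shape) if n == -1]
    if len(unknown) > 1:
        raise ValueError("at most one item may be -1")
    known = 1
    for n in new_shape:
        if n != -1:
            known *= n
    if unknown:
        # The other items must evenly divide the total size.
        if known == 0 or size % known:
            raise ValueError("size is not divisible by the given items")
        new_shape[unknown[0]] = size // known
    elif known != size:
        raise ValueError("total size must remain unchanged")
    return tuple(new_shape)

print(resolve_shape(28, (-1, 7)))      # prints: (4, 7)
print(resolve_shape(28, (2, -1, 7)))   # prints: (2, 2, 7)
```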

spacesaver

a.spacesaver( )

Returns true if space-saving mode is on for array a; otherwise, False. See the discussion of the savespace method earlier in this section.

tolist

a.tolist( )

Returns a list L equivalent to a. For example, if a.shape is (2,3) and a's typecode is 'd', L is a list of two lists of three float values each such that, for each valid i and j, L[i][j]==a[i,j]. list(a) converts only the top-level (axis 0) of array a into a list, and thus is not equivalent to a.tolist( ) if a's rank is 2 or more. For example:

    a=array([[1,2,3],[4,5,6]],typecode='d')
    print a.shape             # prints: (2,3)
    print a                   # prints: [[1. 2. 3.]
                              #          [4. 5. 6.]]
    print list(a)
    # prints: [array([1.,2.,3.]), array([4.,5.,6.])]
    print a.tolist( )
    # prints: [[1.0,2.0,3.0],[4.0,5.0,6.0]]

toscalar

a.toscalar( )

Returns the first element of a as a Python scalar (normally a number) of the appropriate type, depending on a's typecode.

tostring

a.tostring( )

Returns a binary string s whose bytes are a copy of the bytes of a's elements.

typecode

a.typecode( )

Returns the typecode of a as a one-character string.


16.4.8. Operations on Arrays

Arithmetic operators +, -, *, /, //, %, and **; comparison operators >, >=, <, <=, ==, and !=; and bitwise operators &, |, ^, and ~ (all covered in "Numeric Operations" on page 52) also apply to arrays. If both operands a and b are arrays with equal shapes and typecodes, the result is a new array c with the same shape (and the same typecode, except for comparison operators). Each element of c is the result of the operator on corresponding elements of a and b (element-wise operation). Order comparisons are not allowed between arrays whose typecode is complex, just as they are not allowed between complex numbers; in all other cases, comparison operators between arrays return arrays with integer typecode.

Arrays do not follow sequence semantics for * (replication) and + (concatenation): * and + perform element-wise arithmetic. Similarly, * does not mean matrix multiplication, but element-wise multiplication. Numeric supplies functions to perform replication, concatenation, and matrix multiplication; all operators on arrays work element-wise.

When the typecodes of a and b differ, the narrower numeric type is converted to the wider one, as for other Python numeric operations. Operations between numeric and nonnumeric values are disallowed. In the case of arrays, you can inhibit casting by setting an array into space-saving mode with method savespace. Use space-saving with care, since it can result in a silent loss of significant data. For more details on this, see the discussion of the savespace argument of function array in array on page 376.

16.4.8.1. Broadcasting

Element-wise operations between arrays of different shapes are generally not possible: attempting such operations raises an exception. Numeric allows some such operations by broadcasting (replicating) a smaller array up to the shape of the larger one when feasible. To make broadcasting efficient, the replication is only conceptual: Numeric does not physically copy the data (i.e., you need not worry that performance will be degraded because an operation involves broadcasting).

The simplest and most common case of broadcasting is when one operand, a, is a scalar (or an array of rank 0), while b, the other operand, is any array. In this case, Numeric conceptually builds a temporary array t, with shape b.shape, where each element of t equals a. Numeric then performs the requested operation between t and b. In practice, therefore, when you operate an array b with a scalar a, as in a+b or b-a, the resulting array has the same shape as b, and each element is the result of applying the operator to the corresponding element of b and the single number a.

More generally, broadcasting can also apply when both operands a and b are arrays. Conceptually, broadcasting works according to rather complicated general rules:

  • When a and b differ in rank, the one whose shape tuple is shorter is padded up to the other's rank by adding leading axes, each with a length of 1.

  • a.shape and b.shape, padded to the same length as per the first rule, are compared starting from the right (i.e., from the length of the last axis).

  • When the axis length along the axis being examined is the same for a and b, that axis is okay, and examination moves leftward to the previous axis.

  • When the lengths of the axes differ and both are >1, Numeric raises an exception.

  • When one axis length is 1, Numeric broadcasts the corresponding array by replication along that plane to the axis length of the other array.

The rules of broadcasting are complicated because of their generality, but most typical applications of broadcasting are simple. For example, say we compute a+b, and a.shape is (5,3) (a matrix of five rows and three columns). Typical values for b.shape include ( ) (a scalar), (3,) (a one-dimensional vector with three elements), and (5,1) (a matrix with five rows and one column). In each of these cases, b is conceptually broadcast up to a temporary array t with shape (5,3) by replicating b's elements along the needed axis (both axes when b is a scalar), and Numeric computes a+t. The simplest and most frequent case, of course, is when b.shape is (5,3), the same shape as a's. In this case, no broadcasting is needed.
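
The general rules can be condensed into a few lines of pure Python that compute only the broadcast shape. This is a sketch of the rules as stated above, not Numeric's code; the function name broadcast_shape is hypothetical, and ValueError is an arbitrary choice for the exception.

```python
def broadcast_shape(a_shape, b_shape):
    """Return the shape that results from broadcasting two shapes together."""
    rank = max(len(a_shape), len(b_shape))
    # Pad the shorter shape with leading axes of length 1.
    a = (1,) * (rank - len(a_shape)) + tuple(a_shape)
    b = (1,) * (rank - len(b_shape)) + tuple(b_shape)
    result = []
    # Compare axis lengths starting from the rightmost axis.
    for la, lb in zip(reversed(a), reversed(b)):
        if la == lb:
            result.append(la)
        elif la == 1:
            result.append(lb)      # a is replicated along this axis
        elif lb == 1:
            result.append(la)      # b is replicated along this axis
        else:
            raise ValueError("shapes cannot be broadcast together")
    return tuple(reversed(result))

print(broadcast_shape((5, 3), ()))       # prints: (5, 3)
print(broadcast_shape((5, 3), (3,)))     # prints: (5, 3)
print(broadcast_shape((5, 3), (5, 1)))   # prints: (5, 3)
```

Note that a shape of (5,) does not broadcast against (5,3): padded to (1,5), its last axis has length 5, which differs from 3 and is not 1.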

16.4.8.2. In-place operations

Arrays support in-place operations through augmented assignment operators +=, -=, and so on. The LHS array or slice cannot be broadcast, but the RHS can be. Similarly, the LHS cannot be cast up, but the RHS can be. In other words, in-place operations treat the LHS as rigid in both shape and type, but the RHS is subject to the normal, more lenient rules.

16.4.9. Functions

Numeric defines several functions that operate on arrays, or polymorphically on Python sequences, conceptually forming temporary arrays from nonarray operands.

allclose

allclose(x,y,rtol=1.e-5,atol=1.e-8)

Returns a single number: 1 when every element of x is close to the corresponding element of y; otherwise, 0. Two elements ex and ey are defined to be close if:

 abs(ex-ey) < atol + rtol*abs(ey) 

In other words, ex and ey are close if both are tiny (less than atol) or if the relative difference is small (less than rtol). allclose is generally a better way to check array equality than ==, since floating-point arithmetic requires some comparison tolerance. However, allclose is not applicable to complex arrays, only to floating-point and integer arrays. To compare two complex arrays x and y for approximate equality, use:

 allclose(x.real, y.real) and allclose(x.imag, y.imag) 
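As a minimal sketch of the closeness test, the following plain-Python code applies the inequality above to each pair of items of two flat sequences. The names is_close and every_pair_close are hypothetical, and the code does not use Numeric.

```python
# Pure-Python sketch of the element-wise closeness test; hypothetical names.

def is_close(ex, ey, rtol=1.e-5, atol=1.e-8):
    """The inequality given above for one pair of elements."""
    return abs(ex - ey) < atol + rtol * abs(ey)

def every_pair_close(x, y, rtol=1.e-5, atol=1.e-8):
    """True when each item of x is close to the corresponding item of y."""
    return all(is_close(ex, ey, rtol, atol) for ex, ey in zip(x, y))

x = [0.1 + 0.2, 1.0e-12]
y = [0.3, 0.0]
print(x == y)                    # exact comparison fails on rounding error
print(every_pair_close(x, y))    # the tolerant comparison succeeds
```

Note that the test is not symmetric: the relative tolerance is scaled by abs(ey) only, so swapping the arguments can change the result in borderline cases.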

argmax, argmin

argmax(a,axis=-1) argmin(a,axis=-1)

argmax returns a new integer array m whose shape tuple is a.shape minus the indicated axis. Each element of m is the index of a maximal element of a along axis. argmin is similar, but indicates minimal elements rather than maximal ones.
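For a rank 2 array and the default axis=-1, the result has one index per row. The following plain-Python sketch, with the hypothetical names argmax_rows and argmin_rows, shows the idea on nested lists. Picking the first index among ties is an assumption of the sketch, not something the text above specifies.

```python
# Pure-Python sketch of argmax/argmin along the last axis of a rank 2 grid.
# argmax_rows and argmin_rows are hypothetical names.

def argmax_rows(a):
    """Index of a maximal item in each row (first one among ties)."""
    return [max(range(len(row)), key=row.__getitem__) for row in a]

def argmin_rows(a):
    """Index of a minimal item in each row (first one among ties)."""
    return [min(range(len(row)), key=row.__getitem__) for row in a]

a = [[3, 9, 1],
     [7, 2, 8]]            # shape (2,3)

print(argmax_rows(a))      # shape (2,): a.shape minus the last axis
print(argmin_rows(a))
```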

argsort

argsort(a,axis=-1)

Returns a new integer array m with the same shape as a. Each vector of m along axis is the index sequence needed to sort the corresponding axis of a. In particular, if a has rank 1, the most common case, take(a,argsort(a))==sort(a). For example:

    x = [52, 115, 99, 111, 114, 101, 97, 110, 100, 55]
    print Numeric.argsort(x)   # prints: [0 9 6 2 8 5 7 3 4 1]
    print Numeric.sort(x)
    # prints: [52 55 97 99 100 101 110 111 114 115]
    print Numeric.take(x, Numeric.argsort(x))
    # prints: [52 55 97 99 100 101 110 111 114 115]

Here, the result of Numeric.argsort(x) tells us that x's smallest element is x[0], the second smallest is x[9], the third smallest is x[6], and so on. The call to Numeric.take in the last print statement takes x's items in this order, producing the same sorted array as the call to Numeric.sort in the second print statement.

around

around(a,decimals=0)

Returns a new float array m with the same shape as a. Each element of m is like the result of calling Python's built-in function round on the corresponding element of a.

array2string

array2string(a,max_line_width=77,precision=8, suppress_small=False,separator=' ', array_output=False)

Returns a string representation s of array a, with elements in brackets, separated by string separator. The last dimension is horizontal, the penultimate one vertical, and further dimensions are shown by bracket nesting. When array_output is true, s starts with 'array(' and ends with ')', or ",'X')" when X, which is a's typecode, is not Float, Complex, or Int (so you can later use eval(s) if separator is ',').

Lines longer than max_line_width get split. precision determines how many digits each element shows. If suppress_small is true, very small numbers are shown as 0. To change defaults, set attributes of module sys named output_line_width, float_output_precision, and float_output_suppress_small. For example:

    >>> Numeric.array2string(Numeric.array([1e-20]*3))
    '[  1.00000000e-20   1.00000000e-20   1.00000000e-20]'
    >>> import sys
    >>> sys.float_output_suppress_small = True
    >>> Numeric.array2string(Numeric.array([1e-20]*3))
    '[ 0.  0.  0.]'

str(a) is like array2string(a). repr(a) is like array2string(a, separator=',', array_output=True). You can also access these formatting functions by the names array_repr and array_str in module Numeric.

average

average(a,axis=0,weights=None,returned=False)

Returns a's average along axis. When axis is None, returns the average of all of a's elements. When weights is not None, weights must be an array with a's shape, or a one-dimensional array with the length of a's given axis, and average computes a weighted average. When returned is true, returns a pair: the first item is the average; the second item is the sum of weights (the count of values when weights is None).
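For the rank 1 case, the weighted average and the optional pair result can be sketched as follows. This is plain Python rather than Numeric, and average_1d is a hypothetical name.

```python
# Pure-Python sketch of a weighted average over a flat sequence.
# average_1d is a hypothetical name; it handles only the rank 1 case.

def average_1d(a, weights=None, returned=False):
    if weights is None:
        weights = [1.0] * len(a)       # unweighted: every item counts once
    total_weight = float(sum(weights))
    avg = sum(x * w for x, w in zip(a, weights)) / total_weight
    if returned:
        return avg, total_weight       # the average and the sum of weights
    return avg

print(average_1d([1, 2, 3, 4]))
print(average_1d([1, 2, 3, 4], weights=[4, 3, 2, 1]))
print(average_1d([1, 2, 3, 4], weights=[4, 3, 2, 1], returned=True))
```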

choose

choose(a,values)

Returns an array c with the same shape as a. values is any Python sequence. a's elements are integers between 0, included, and len(values), excluded. Each element of c is the item of values whose index is the corresponding element of a. For example:

    print Numeric.choose(Numeric.identity(3),'ox')
    # prints: [[x o o]
    #          [o x o]
    #          [o o x]]

clip

clip(a,min,max)

Returns an array c with the same typecode and shape as a. Each element ec of c is the corresponding element ea of a, where min<=ea<=max. Where ea<min, ec is min; where ea>max, ec is max. For example:

    print Numeric.clip(Numeric.arange(10),2,7)
    # prints: [2 2 2 3 4 5 6 7 7 7]

compress

compress(condition,a,axis=0)

Returns an array c with the same typecode and rank as a. c includes only the elements of a for which the item of condition, corresponding along the given axis, is true. For example, compress((1,0,1),a) == take(a,(0,2),0) since (1,0,1) has true values only at indices 0 and 2. Here's how to get only the even numbers from an array:

    a = Numeric.arange(10)
    print Numeric.compress(a%2==0, a)   # prints: [0 2 4 6 8]

concatenate

concatenate(arrays, axis=0)

arrays is a sequence of arrays, all with the same shape except possibly along the given axis. concatenate returns an array concatenating the arrays along the given axis. concatenate((s,)*n) has the same sequence replication semantics that s*n would have if s were a generic Python sequence rather than an array. For example:

    print Numeric.concatenate([Numeric.arange(5), Numeric.arange(3)])
    # prints: [0 1 2 3 4 0 1 2]

convolve

convolve(a,b,mode=2)

Returns an array c with rank 1, the linear convolution of rank 1 arrays a and b. Linear convolution is defined over unbounded sequences. convolve conceptually extends a and b to infinite length by padding with 0, then clips the infinite-length result to its central part, yielding c. When mode is 2, the default, convolve clips only the padding, so c's shape is (len(a)+len(b)-1,). Otherwise, convolve clips more. Say len(a) is greater than or equal to len(b). When mode is 0, len(c) is len(a)-len(b)+1; when mode is 1, len(c) is len(a). When len(a) is less than len(b), the effect is symmetrical. For example:

    a = Numeric.arange(6)
    b = Numeric.arange(4)
    print Numeric.convolve(a, b)          # prints: [0 0 1 4 10 16 22 22 15]
    print Numeric.convolve(a, b, 1)       # prints: [0 1 4 10 16 22]
    print Numeric.convolve(a, b, 0)       # prints: [4 10 16]
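To make the three modes concrete, here is a plain-Python sketch of linear convolution that computes the full result and then keeps its central part. The name convolve_sketch is hypothetical. The exact centering used for mode 1 is an assumption, chosen to match the lengths described above and the sample output.

```python
# Pure-Python sketch of linear convolution with the three clipping modes.
# convolve_sketch is a hypothetical name; this is not Numeric's code.

def convolve_sketch(a, b, mode=2):
    if len(a) < len(b):
        a, b = b, a                    # convolution is commutative
    n, m = len(a), len(b)
    full = [0] * (n + m - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            full[i + j] += x * y       # every pairwise product lands at i+j
    if mode == 2:
        return full                    # clip only the padding
    if mode == 1:
        start = (m - 1) // 2           # assumed centering; len(c) is len(a)
        return full[start:start + n]
    return full[m - 1:n]               # mode 0: len(c) is len(a)-len(b)+1

a = list(range(6))
b = list(range(4))
print(convolve_sketch(a, b))
print(convolve_sketch(a, b, 1))
print(convolve_sketch(a, b, 0))
```

Mode 0 keeps only the positions where b overlaps a completely, which is why it is the shortest of the three results.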

cross_correlate

cross_correlate(a,b,mode=0)

Like convolve(a,b[::-1],mode).

diagonal

diagonal(a,k=0,axis1=0,axis2=1)

Returns the elements of a whose indices along axis1 and axis2 differ by k. When a has rank 2, that's the main diagonal when k == 0, subdiagonals above the main one when k > 0, and subdiagonals below the main one when k < 0. For example:

    # a is [[ 0  1  2  3]
    #       [ 4  5  6  7]
    #       [ 8  9 10 11]
    #       [12 13 14 15]]
    print Numeric.diagonal(a)        # prints: [0 5 10 15]
    print Numeric.diagonal(a,1)      # prints: [1 6 11]
    print Numeric.diagonal(a,-1)     # prints: [4 9 14]

As shown, diagonal(a) is the main diagonal, diagonal(a,1) is the subdiagonal just above the main one, and diagonal(a,-1) is the subdiagonal just below the main one.

dot

dot(a,b)

Returns an array m with a times b in the matrix-multiplication sense, rather than element-wise multiplication. a.shape[-1] must equal b.shape[-2], and m.shape is the tuple a.shape[:-1]+b.shape[:-2]+b.shape[-1:].

indices

indices(shapetuple,typecode=None)

Returns an integer array x of shape (len(shapetuple),)+shapetuple. Each element of subarray x[i] is equal to the element's i index in the subarray. For example:

    print Numeric.indices((2,4))     # prints: [[[0 0 0 0]
                                     #           [1 1 1 1]]
                                     #          [[0 1 2 3]
                                     #           [0 1 2 3]]]

innerproduct

innerproduct(a,b)

Returns an array m with the result of the inner product of a and b, like matrixmultiply(a,transpose(b)). a.shape[-1] must equal b.shape[-1], and m.shape is the tuple a.shape[:-1]+b.shape[:-1].

matrixmultiply

matrixmultiply(a,b)

Returns an array m with a times b in the matrix-multiplication sense, rather than element-wise multiplication. a.shape[-1] must equal b.shape[0], and m.shape is the tuple a.shape[:-1]+b.shape[1:].
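For two rank 2 operands, the shape rule and the computation can be sketched in plain Python as follows. The helper name matmul_2d is hypothetical, and the code works on nested lists rather than Numeric arrays.

```python
# Pure-Python sketch of matrix multiplication for rank 2 operands.
# matmul_2d is a hypothetical name.

def matmul_2d(a, b):
    if len(a[0]) != len(b):            # a.shape[-1] must equal b.shape[0]
        raise ValueError("inner dimensions do not match")
    inner = len(b)
    return [[sum(a[i][k] * b[k][j] for k in range(inner))
             for j in range(len(b[0]))]
            for i in range(len(a))]

a = [[1, 2, 3],
     [4, 5, 6]]            # shape (2,3)
b = [[1, 0],
     [0, 1],
     [1, 1]]               # shape (3,2)

m = matmul_2d(a, b)        # shape (2,2): a.shape[:-1]+b.shape[1:]
print(m)
```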

nonzero

nonzero(a)

Returns the indices of those elements of a that are not equal to 0, like the expression:

    array([i for i in range(len(a)) if a[i] != 0])

a must be a sequence or vector (meaning a one-dimensional array).

outerproduct

outerproduct(a,b)

Returns an array m, which is the outer product of vectors a and b (in other words, for every valid pair of indices i and j, m[i,j] equals a[i]*b[j]).

put

put(a,indices,values)

a must be a contiguous array. indices is a sequence of integers, taken as indices into a.flat. values is a sequence of values that can be converted to a's typecode (if shorter than indices, values is repeated as needed). Each element of a indicated by an item in indices is replaced by the corresponding item in values. put is therefore similar to (but much faster than) the loop:

    for i,v in zip(indices,list(values)*len(indices)):
        a.flat[i]=v
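A small worked example may help show how a short values sequence is reused. The sketch below applies the same idea to a plain Python list standing in for a.flat; put_flat is a hypothetical name, and the code does not use Numeric.

```python
# Pure-Python sketch of put's replacement rule on a flat list.
# put_flat is a hypothetical name; flat stands in for a.flat.

def put_flat(flat, indices, values):
    values = list(values)
    for n, i in enumerate(indices):
        # Reuse values cyclically when it is shorter than indices.
        flat[i] = values[n % len(values)]

flat = [0] * 6
put_flat(flat, [1, 3, 5], [7, 8])
print(flat)
```

Here the third index receives 7 again, because values has only two items.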

putmask

putmask(a,mask,values)

a must be a contiguous array. mask is a sequence with the same length as a.flat. values is a sequence of values that can be converted to a's typecode (if shorter than mask, values is repeated as needed). Each element of a corresponding to a true item in mask is replaced by the corresponding item in values. putmask is therefore similar to (but faster than) the loop:

    for i,v in zip(xrange(len(mask)),list(values)*len(mask)):
        if mask[i]: a.flat[i]=v

rank

rank(a)

Returns the rank of a, just like len(array(a,copy=False).shape).

ravel

ravel(a)

Returns the flat form of a, just like array(a, copy=not a.iscontiguous( )).flat.

repeat

repeat(a,repeat,axis=0)

Returns an array with the same typecode and rank as a, where each of a's elements is repeated along axis as many times as the value of the corresponding item of repeat. repeat is an int or an int sequence of length a.shape[axis]. For example:

    print Numeric.repeat(range(4),range(4))   # prints: [1 2 2 3 3 3]

reshape

reshape(a,shapetuple)

Returns an array r that has shape shapetuple and shares a's data. r=reshape(a,shapetuple) is like r=a;r.shape=shapetuple. The product of shapetuple's items must equal the product of a.shape's; one of shapetuple's items may be -1 to ask for adaptation of that axis's length. For example:

    print Numeric.reshape(range(12),(3,-1))
    # prints: [[0 1 2 3]
    #          [4 5 6 7]
    #          [8 9 10 11]]

resize

resize(a,shapetuple)

Returns an array r with shape shapetuple and data copied from a. If r's size is smaller than a's size, r.flat is copied from the start of ravel(a); if r's size is larger, the data in ravel(a) is replicated as many times as needed. In particular, resize(s,(n*len(s),)) has the sequence replication semantics that s*n would have if s were a generic Python sequence rather than an array. For example:

    print Numeric.resize(range(5),(3,4))
    # prints: [[0 1 2 3]
    #          [4 0 1 2]
    #          [3 4 0 1]]

searchsorted

searchsorted(a,values)

a must be a sorted rank 1 array. searchsorted returns an array of integers s with the same shape as values. Each element of s is the index in a where the corresponding element of values would fit in the sorted order of a. For example:

    print Numeric.searchsorted([0,1], [0.2,-0.3,0.5,1.3,1.0,0.0,0.3])
    # prints: [1 0 1 2 1 0 1]

This specific idiom returns an array with 0 in correspondence to each element x of values when x is less than or equal to 0, 1 when x is greater than 0 and less than or equal to 1, and 2 when x is greater than 1. With slight generalization, and with appropriate thresholds as the elements of sorted array a, this idiom allows very fast classification of the subrange each element x of values falls into.
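The same classification can be sketched with the standard library's bisect module, whose bisect_left function finds the leftmost insertion point in a sorted list. The names search_sorted, thresholds, and labels are hypothetical, and the sketch works on plain lists rather than Numeric arrays.

```python
# Pure-Python sketch of the classification idiom, using the bisect module.
# search_sorted, thresholds, and labels are hypothetical names.
import bisect

def search_sorted(a, values):
    """For each value, the index where it would fit in sorted list a."""
    return [bisect.bisect_left(a, v) for v in values]

thresholds = [0, 1]
labels = ["at most 0", "in (0, 1]", "above 1"]
values = [0.2, -0.3, 0.5, 1.3, 1.0, 0.0, 0.3]

classes = search_sorted(thresholds, values)
print(classes)
print([labels[c] for c in classes])
```

Because bisect_left places a value equal to a threshold before that threshold, 0.0 falls in class 0 and 1.0 in class 1, in line with the "less than or equal to" wording above.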

shape

shape(a)

Returns the shape of a, just like array(a,copy=False).shape.

size

size(a,axis=None)

When axis is None, returns the total number of elements in a. Otherwise, returns the number of elements of a along axis, like array(a,copy=False).shape[axis].

sort

sort(a,axis=-1)

Returns an array s with the same typecode and shape as a, with elements along each plane of axis reordered so that the plane is sorted in increasing order. For example:

    # x is [[0 1 2 3]
    #       [4 0 1 2]
    #       [3 4 0 1]]
    print Numeric.sort(x)       # prints: [[0 1 2 3]
                                #          [0 1 2 4]
                                #          [0 1 3 4]]
    print Numeric.sort(x,0)     # prints: [[0 0 0 1]
                                #          [3 1 1 2]
                                #          [4 4 2 3]]

Here, sort(x) sorts each row, while sort(x,0) sorts each column.

swapaxes

swapaxes(a,axis1,axis2)

Returns an array s that has the same typecode, rank, and size as a, and shares a's data. s's shape is the same as a's, but with the lengths of axes axis1 and axis2 swapped. In other words, s=swapaxes(a,axis1,axis2) is like:

    swapped_shape=range(len(a.shape))
    swapped_shape[axis1]=axis2
    swapped_shape[axis2]=axis1
    s=transpose(a,swapped_shape)

take

take(a,indices,axis=0)

Returns an array t, with the same typecode and rank as a, that contains the subset of a's elements that would be in a slice along axis comprising the given indices. For example, after t=take(a,(1,3)), t.shape==(2,)+a.shape[1:], and t's items are copies of the second and fourth rows of a.
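For a rank 2 grid, taking along axis 0 selects rows and taking along axis 1 selects columns. The following plain-Python sketch illustrates both on nested lists; take_rows and take_columns are hypothetical names, and the code does not use Numeric.

```python
# Pure-Python sketch of take along each axis of a rank 2 grid.
# take_rows and take_columns are hypothetical names.

def take_rows(a, indices):
    """Like take along axis 0: copies of the selected rows."""
    return [list(a[i]) for i in indices]

def take_columns(a, indices):
    """Like take along axis 1: the selected items of every row."""
    return [[row[j] for j in indices] for row in a]

a = [[0, 1, 2],
     [3, 4, 5],
     [6, 7, 8],
     [9, 10, 11]]                  # shape (4,3)

print(take_rows(a, (1, 3)))        # shape (2,3): second and fourth rows
print(take_columns(a, (0, 2)))     # shape (4,2): first and third columns
```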

trace

trace(a,k=0)

Returns the sum of a's elements along the k diagonal, like sum(diagonal(a,k)).
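For a rank 2 grid, the relation between trace and diagonal can be sketched in plain Python. The names diagonal_2d and trace_2d are hypothetical, and the code works on nested lists rather than Numeric arrays.

```python
# Pure-Python sketch of diagonal and trace for a rank 2 grid.
# diagonal_2d and trace_2d are hypothetical names.

def diagonal_2d(a, k=0):
    """Items a[i][j] with j - i == k, in row order."""
    return [a[i][i + k] for i in range(len(a)) if 0 <= i + k < len(a[i])]

def trace_2d(a, k=0):
    """Sum of the items along the k diagonal."""
    return sum(diagonal_2d(a, k))

a = [[4 * i + j for j in range(4)] for i in range(4)]   # 0 to 15, shape (4,4)

print(diagonal_2d(a, 1))
print(trace_2d(a), trace_2d(a, 1), trace_2d(a, -1))
```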

transpose

transpose(a,axes=None)

Returns an array t, with the same typecode, rank, and size as a, that shares a's data. t's axes are permuted with respect to a's by the axis indices in sequence axes. When axes is None, t's axes invert the order of a's, as if axes were range(len(a.shape))[::-1].
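For the rank 2 case with axes left as None, the permutation simply exchanges rows and columns. The sketch below shows this on nested lists; transpose_2d is a hypothetical name. Unlike the function described above, the sketch copies the items instead of sharing data.

```python
# Pure-Python sketch of a rank 2 transpose; transpose_2d is a hypothetical
# name. It builds new lists, so it does not share data with its argument.

def transpose_2d(a):
    return [list(column) for column in zip(*a)]

a = [[0, 1, 2],
     [3, 4, 5]]            # shape (2,3)

t = transpose_2d(a)        # shape (3,2)
print(t)
print(t[2][1] == a[1][2])  # t[j][i] is a[i][j]
```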

vdot

vdot(a,b)

Returns a scalar that is the dot product of vectors a and b. If a is complex, this operation uses the complex conjugate of a.
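The effect of the conjugation is easiest to see next to a plain dot product. The following sketch uses Python's built-in complex type on flat lists; vdot_sketch and plain_dot are hypothetical names, and the code does not use Numeric.

```python
# Pure-Python sketch contrasting a conjugating dot product with a plain one.
# vdot_sketch and plain_dot are hypothetical names.

def vdot_sketch(a, b):
    """Dot product using the complex conjugate of a's items."""
    return sum(complex(x).conjugate() * y for x, y in zip(a, b))

def plain_dot(a, b):
    """Dot product without any conjugation."""
    return sum(x * y for x, y in zip(a, b))

a = [1 + 2j, 3 - 1j]
b = [2 + 0j, 1 + 1j]

print(vdot_sketch(a, b))
print(plain_dot(a, b))
```

For real operands, conjugation changes nothing, so the two functions agree.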

where

where(condition,x,y)

Returns an array w with the same shape as condition. Where an element of condition is true, the corresponding element of w is the corresponding element of x; otherwise, it is the corresponding element of y. For example, clip(a,min,max) is the same as where(greater(a,max),max,where(greater(a,min),a,min)).
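The equivalence with clip can be checked with a plain-Python sketch on flat lists. The names where_flat and clip_with_where are hypothetical. Since plain lists do not broadcast, the sketch replicates the scalar bounds explicitly.

```python
# Pure-Python sketch of where on flat lists, and of clip built from it.
# where_flat and clip_with_where are hypothetical names.

def where_flat(condition, x, y):
    """Pick from x where condition is true, otherwise from y."""
    return [ex if c else ey for c, ex, ey in zip(condition, x, y)]

def clip_with_where(a, lo, hi):
    n = len(a)
    # Inner call: keep items greater than lo, replace the rest with lo.
    inner = where_flat([e > lo for e in a], a, [lo] * n)
    # Outer call: replace items greater than hi with hi.
    return where_flat([e > hi for e in a], [hi] * n, inner)

print(clip_with_where(list(range(10)), 2, 7))
```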





Python in a Nutshell
Python in a Nutshell, Second Edition (In a Nutshell)
ISBN: 0596100469
EAN: 2147483647
Year: 2004
Pages: 192
Authors: Alex Martelli
