The Basics of Python | Ubuntu Unleashed 2011 Edition: Covering 10.10 and 11.04 (6th Edition)

Python is a language wholly unlike most others, and yet it is so logical that most people can pick it up quickly. You have already seen how easily you can assign strings, but in Python nearly everything is that easy, as long as you remember the syntax!

Numbers

The way Python handles numbers is more precise than in some other languages. It has all the normal operators, such as + for addition, - for subtraction, / for division, and * for multiplication, but it adds % for modulus (division remainder), ** for raise to the power, and // for floor division. It is also specific about which type of number is being used, as this example shows:

>>> a = 5
>>> b = 10
>>> a * b
50
>>> a / b
0
>>> b = 10.0
>>> a / b
0.5
>>> a // b
0.0

The first division returns 0 because both a and b are integers (whole numbers), so Python calculates the division as an integer, giving 0. By converting b to 10.0, Python considers it to be a floating-point number, and so the division is now calculated as a floating-point value, giving 0.5. Even with b being floating point, using // (floor division) rounds the result down.
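
As a quick check of the operators described above, here is a minimal sketch. It deliberately sticks to expressions that should give the same result on the interpreter used in this chapter and on later Python versions, so it uses a float operand wherever fractional division is wanted. The variable names are only for illustration.

```python
# Illustrative names only; the values follow the a = 5, b = 10 example above.
a = 5
b = 10

remainder = b % 3          # modulus: 10 divided by 3 leaves a remainder of 1
power = a ** 2             # 5 raised to the power of 2 is 25
floored_int = a // b       # floor division of two integers gives 0
true_div = a / 10.0        # a float operand makes this a floating-point division: 0.5
floored_float = a // 10.0  # floor division still rounds down, but returns a float: 0.0
```

Note that floor division keeps the type of its operands: the integer version returns 0, and the floating-point version returns 0.0.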

Using **, you can easily see how Python works with integers:

>>> 2 ** 30
1073741824
>>> 2 ** 31
2147483648L

The first statement raises 2 to the power of 30 (that is, 2 x 2 x 2 x 2 x 2 x …), and the second raises 2 to the power of 31. Notice how the second number has a capital L on the end of it: this is Python telling you that it is a long integer. The difference between long integers and normal integers is slight but important: Normal integers can be calculated using simple instructions on the CPU, whereas long integers, because they can be as big as you need them to be, need to be calculated in software and therefore are slower.

When specifying big numbers, you need not put the L at the end; Python figures it out for you. Furthermore, if a number starts off as a normal number and then exceeds its boundaries, Python automatically converts it to a long integer. The same is not true the other way around: If you have a long integer and then divide it by another number so that it could be stored as a normal integer, it remains a long integer:

>>> num = 999999999999999999999999999999999L
>>> num = num / 1000000000000000000000000000000
>>> num
999L

You can convert between number types using typecasting, like this:

>>> num = 10
>>> int(num)
10
>>> float(num)
10.0
>>> long(num)
10L
>>> floatnum = 10.0
>>> int(floatnum)
10
>>> float(floatnum)
10.0
>>> long(floatnum)
10L

You need not worry whether you are using integers or long integers; Python handles it all for you, so you can concentrate on getting the code right.
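
One detail worth seeing for yourself is what happens to the fractional part during a conversion. The following is a small sketch with made-up values; it leaves out long() because that function is specific to the Python 2 interpreters this chapter uses.

```python
# Hypothetical values chosen to show what each conversion keeps or drops.
whole = 10
fraction = 10.75

as_float = float(whole)   # 10 becomes 10.0
as_int = int(fraction)    # int() drops the fractional part rather than rounding: 10
negative = int(-10.75)    # the truncation is toward zero, so this gives -10
```

If you want rounding instead of truncation, you need to ask for it explicitly; int() on its own never rounds up.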

More on Strings

Python stores strings as an immutable sequence of characters, a jargon-filled way of saying that "it is a collection of characters that, once set, cannot be changed without creating a new string." Sequences are important in Python. There are three primary types, of which strings are one, and they share some properties. Mutability will make more sense when you learn about lists in the next section.

As you saw in the previous example, you can assign a value to strings in Python with just an equal sign, like this:

>>> mystring = 'hello'
>>> myotherstring = "goodbye"
>>> mystring
'hello'
>>> myotherstring
'goodbye'
>>> test = "Are you really Bill O'Reilly?"
>>> test
"Are you really Bill O'Reilly?"

The first example encapsulates the string in single quotation marks, and the second and third in double quotation marks. However, printing the first and second strings shows them both in single quotation marks because Python does not distinguish between the two. The third example is the exception: it uses double quotation marks because the string itself contains a single quotation mark. Here, Python prints the string with double quotation marks because it knows it contains the single quotation mark.

Because the characters in a string are stored in sequence, you can index into them by specifying the position of the character you are interested in. As in most other languages, these indexes are zero based, which means you need to ask for character 0 to get the first letter in a string. For example:

>>> string = "This is a test string"
>>> string
'This is a test string'
>>> string[0]
'T'
>>> string[0], string[3], string[20]
('T', 's', 'g')

The last line shows how, with commas, you can ask for several indexes at the same time. You could print the entire first word using this:

>>> string[0], string[1], string[2], string[3]
('T', 'h', 'i', 's')

However, for that purpose you can use a different concept: slicing. A slice of a sequence draws a selection of indexes. For example, you can pull out the first word like this:

>>> string[0:4]
'This'

The syntax there means "take everything from position 0 (including 0) and end at position 4 (excluding it)." So, [0:4] copies the items at indexes 0, 1, 2, and 3. You can omit either side of the indexes, and it will copy either from the start or to the end:

>>> string[:4]
'This'
>>> string[5:]
'is a test string'
>>> string[11:]
'est string'

You can also omit both numbers, and it will give you the entire sequence:

>>> string[:]
'This is a test string'
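
The slicing rules above can be gathered into one short sketch. The variable names here are illustrative, and the string is the same one used in the examples.

```python
text = "This is a test string"

first_word = text[0:4]   # indexes 0, 1, 2, and 3; index 4 is excluded
same_word = text[:4]     # omitting the start means "from the beginning"
rest = text[5:]          # omitting the end means "to the end"
whole = text[:]          # omitting both gives the entire sequence

# Because the end index is excluded, two slices that meet at the same
# index always rejoin into the original string.
rejoined = text[:4] + text[4:]
```

The last line shows why the "include the start, exclude the end" rule is convenient: you can cut a sequence at any position without losing or duplicating an item.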

Later you will learn precisely why you would want to do that, but for now there are a number of other string intrinsics that will make your life easier. For example, you can use the + and * operators to concatenate (join) and repeat strings, like this:

>>> mystring = "Python"
>>> mystring * 4
'PythonPythonPythonPython'
>>> mystring = mystring + " rocks! "
>>> mystring * 2
'Python rocks! Python rocks! '

In addition to working with operators, Python strings come with a selection of built-in methods. You can change the case of the letters with capitalize() (uppercases the first letter and lowercases the rest), lower() (lowercases them all), title() (uppercases the first letter in each word), and upper() (uppercases them all). You can also check whether strings match certain cases with islower(), istitle(), and isupper(); that also extends to isalnum() (returns True if the string is letters and numbers only) and isdigit() (returns True if the string is all numbers).

This example demonstrates some of these in action:

>>> string
'This is a test string'
>>> string.upper()
'THIS IS A TEST STRING'
>>> string.lower()
'this is a test string'
>>> string.isalnum()
False
>>> string = string.title()
>>> string
'This Is A Test String'

Why did isalnum() return False? Our string contains only alphanumeric characters, doesn't it? Well, no. There are spaces in there, which is what is causing the problem. More important, we were calling upper() and lower() and they were not changing the contents of the string; they just returned the new value. So, to change our string from This is a test string to This Is A Test String, we actually have to assign it back to the string variable.
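
Both points can be checked with a short sketch. The names are illustrative, and replace() is a string method not covered in the text above; it is used here only to strip the spaces.

```python
original = "This is a test string"

shouted = original.upper()   # a new string: 'THIS IS A TEST STRING'
unchanged = original         # the original still reads 'This is a test string'

# isalnum() is False while the spaces are present...
with_spaces = original.isalnum()
# ...and True once they are removed (replace() is an assumption beyond the text).
without_spaces = original.replace(" ", "").isalnum()

# To keep a changed value, assign the method's result back to a name.
original = original.title()  # 'This Is A Test String'
```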

Lists

Python's built-in list data type is a sequence, like strings. However, lists are mutable, which means they can be changed. Lists are like arrays in that they hold a selection of elements in a given order. You can cycle through them, index into them, and slice them:

>>> mylist = ["python", "perl", "php"]
>>> mylist
['python', 'perl', 'php']
>>> mylist + ["java"]
['python', 'perl', 'php', 'java']
>>> mylist * 2
['python', 'perl', 'php', 'python', 'perl', 'php']
>>> mylist[1]
'perl'
>>> mylist[1] = "c++"
>>> mylist[1]
'c++'
>>> mylist[1:3]
['c++', 'php']

The brackets notation is important: You cannot use parentheses, ( and ), or braces, { and }, for lists. Using + for lists is different from using + for numbers. Python detects you are working with a list and appends one list to another. This is known as operator overloading, and it is one of the reasons Python is so flexible.

Lists can be nested, which means you can put a list inside a list. However, this is where mutability starts to matter, and so this might sound complicated! If you recall, the definition of an immutable string sequence is that it is a collection of characters that, once set, cannot be changed without creating a new string. Lists are mutable as opposed to immutable, which means you can change your list without creating a new list.

This becomes important because Python, by default, copies only a reference to an object rather than the object itself. For example:

>>> list1 = [1, 2, 3]
>>> list2 = [4, list1, 6]
>>> list1
[1, 2, 3]
>>> list2
[4, [1, 2, 3], 6]

Here we have a nested list. list2 contains 4, then list1, then 6. When you print the value of list2, you can see it also contains list1. Now, proceeding on from that:

>>> list1[1] = "Flake"
>>> list2
[4, [1, 'Flake', 3], 6]

In line one, we set the second element in list1 (remember, sequences are zero based!) to be Flake rather than 2; then we print the contents of list2. As you can see, when list1 changed, list2 was updated, too. The reason for this is that list2 stores a reference to list1 as opposed to a copy of list1; they share the same value.

We can show that this works both ways by indexing twice into list2, like this:

>>> list2[1][1] = "Caramello"
>>> list1
[1, 'Caramello', 3]

The first line says, "Get the second element in list2 (list1) and the second element of that list, and set it to be 'Caramello'." Then list1's value is printed, and you can see it has changed. This is the essence of mutability: We are changing our list without creating a new list. On the other hand, editing a string creates a new string, leaving the old one unaltered. For example:

>>> mystring = "hello"
>>> list3 = [1, mystring, 3]
>>> list3
[1, 'hello', 3]
>>> mystring = "world"
>>> list3
[1, 'hello', 3]

Of course, this raises the question of how you copy without references when references are the default. The answer, for lists, is that you use the [:] slice, which we looked at earlier. This slices from the first element to the last, inclusive, creating a new list rather than a second reference to the old one. Note that the copy is shallow: any lists nested inside it are still shared by reference. Here is how that looks:

>>> list4 = ["a", "b", "c"]
>>> list5 = list4[:]
>>> list4 = list4 + ["d"]
>>> list5
['a', 'b', 'c']
>>> list4
['a', 'b', 'c', 'd']
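
To see the difference between a reference and a [:] copy more directly, it helps to change a list in place rather than build a new one with +. The following is a minimal sketch; the names inner, outer, alias, and snapshot are made up for illustration.

```python
inner = [1, 2, 3]
outer = [4, inner, 6]

alias = outer        # a second name for the same list
snapshot = outer[:]  # a new top-level list with the same elements

outer.append(7)      # an in-place change: alias sees it, snapshot does not

# The slice copies only the top level, so the nested list is still shared.
inner[1] = "Flake"
shares_inner = snapshot[1] is inner
```

After this runs, alias holds [4, [1, 'Flake', 3], 6, 7] and snapshot holds [4, [1, 'Flake', 3], 6]: the appended 7 is missing from the copy, but the change to the nested list shows up in both.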

Lists have their own collections of built-in methods, such as sort(), append(), and pop(). The latter two add and remove single elements from the end of the list, with pop() also returning the removed element. For example:

>>> list5 = ["nick", "paul", "julian", "graham"]
>>> list5.sort()
>>> list5
['graham', 'julian', 'nick', 'paul']
>>> list5.pop()
'paul'
>>> list5
['graham', 'julian', 'nick']
>>> list5.append("Rebecca")

In addition, one interesting method of strings returns a list: split(). This takes a character to split by and then gives you a list in which each element is a chunk from the string. For example:

>>> string = "This is a test string"
>>> string.split(" ")
['This', 'is', 'a', 'test', 'string']
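
Because split() returns an ordinary list, the list methods described earlier work on its result. Here is a small sketch combining them; the names are illustrative.

```python
sentence = "This is a test string"

words = sentence.split(" ")  # ['This', 'is', 'a', 'test', 'string']
words.sort()                 # sorts in place; uppercase letters sort before lowercase
sorted_words = words[:]      # keep a copy of the sorted order for inspection

last = words.pop()           # removes and returns the final element
words.append("list")         # adds a single element to the end
```

Note that sort() changes the list itself and returns nothing useful, which is the opposite of how the string methods behave.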

Lists are used extensively in Python, although this is slowly changing as the language matures.

Dictionaries

Unlike lists, dictionaries are collections with no fixed order. Instead, they have a key (the name of the element) and a value (the content of the element), and Python places them wherever it needs to for maximum performance. When defining dictionaries, you need to use braces ({ }) and colons (:). You start with an opening brace and then give each element a key and a value, separated by a colon, like this:

>>> mydict = { "perl" : "a language", "php" : "another language" }
>>> mydict
{'php': 'another language', 'perl': 'a language'}

This example has two elements, with keys perl and php. However, when the dictionary is printed, we find that php comes before perl: Python hasn't respected the order in which we entered them. We can index into a dictionary by key using the normal bracket syntax:

>>> mydict["perl"]
'a language'

However, because a dictionary has no fixed sequence, we cannot take a slice, or index by position.

Like lists, dictionaries are mutable and can also be nested; however, unlike lists, you cannot merge two dictionaries by using +. This is because dictionary elements are located using the key. Therefore, having two elements with the same key would cause a clash. Instead, you should use the update() method, which merges two dictionaries by overwriting clashing keys.

You can also use the keys() method to return a list of all the keys in a dictionary.
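
A short sketch shows update() and keys() together. The dictionary contents are made up for illustration, and sorted() is a built-in function not covered above; it is used here only so the keys come out in a predictable order.

```python
languages = {"perl": "a language", "php": "another language"}
extra = {"php": "a web language", "python": "yet another language"}

# update() merges extra into languages in place; the clashing "php" key is overwritten.
languages.update(extra)

# Dictionaries have no fixed order here, so sort the keys before relying on their order.
key_list = sorted(languages.keys())
```

After the merge, languages["php"] is "a web language", and the dictionary has gained the new "python" key.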

Conditionals and Looping

So far, we have just been looking at data types, which should show you how powerful Python's data types are. However, you simply cannot write complex programs without conditional statements and loops.

Python has most of the standard conditional checks, such as > (greater than), <= (less than or equal to), and == (equal), but it also adds some new ones, such as in. For example, we can use in to check whether a string or a list contains a given substring or element:

>>> mystring = "J Random Hacker"
>>> "r" in mystring
True
>>> "Hacker" in mystring
True
>>> "hacker" in mystring
False

The last example demonstrates how in is case sensitive. We can use the operator for lists, too:

>>> mylist = ["soldier", "sailor", "tinker", "spy"]
>>> "tailor" in mylist
False

Other comparisons on these complex data types are done item by item:

>>> list1 = ["alpha", "beta", "gamma"]
>>> list2 = ["alpha", "beta", "delta"]
>>> list1 > list2
True

list1's first element (alpha) is compared against list2's first element (alpha) and, because they are equal, the next element is checked. That is equal also, so the third element is checked, which is different. The g in gamma comes after the d in delta in the alphabet, so gamma is considered greater than delta and list1 is considered greater than list2.
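
These comparisons can be tried out in a few lines. This is a sketch with illustrative names; the case-insensitive check simply combines in with the lower() method described earlier.

```python
name = "J Random Hacker"

exact = "hacker" in name                # False: in is case sensitive
ignoring_case = "hacker" in name.lower()  # True once the string is lowercased

list1 = ["alpha", "beta", "gamma"]
list2 = ["alpha", "beta", "delta"]
greater = list1 > list2                 # decided by "gamma" versus "delta"

# When one list runs out before a difference is found, the shorter list is smaller.
shorter_is_less = ["alpha"] < ["alpha", "beta"]
```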

Loops come in two types, and both are equally flexible. For example, the for loop can iterate through letters in a string or elements in a list:

>>> string = "Hello, Python!"
>>> for s in string: print s,
...
H e l l o ,   P y t h o n !

The for loop takes each letter in string and assigns it to s. This then is printed to the screen using the print statement, but note the comma at the end; this tells Python not to insert a line break after each letter. The "..." prompt is there because Python allows you to enter more code in the loop; you need to press Enter again here to have the loop execute.

The exact same construct can be used for lists:

>>> mylist = ["andi", "rasmus", "zeev"]
>>> for p in mylist: print p
...
andi
rasmus
zeev

Without the comma after the print statement, each item is printed on its own line.
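
A for loop is not limited to printing. The sketch below collects results in lists instead, which also sidesteps the print syntax of any particular Python version. The names are illustrative, and len() is a built-in function not covered above.

```python
names = ["andi", "rasmus", "zeev"]

# Loop over a list, handling one element per pass.
lengths = []
for p in names:
    lengths.append(len(p))   # len() gives the number of characters in p

# Loop over a string, handling one character per pass.
letters = []
for s in "Hello":
    letters.append(s)
```

After both loops finish, lengths holds [4, 6, 4] and letters holds ['H', 'e', 'l', 'l', 'o'].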

The other loop type is the while loop, and it looks similar:

>>> while 1: print "This will loop forever!"
...
This will loop forever!
This will loop forever!
This will loop forever!
This will loop forever!
This will loop forever!
(etc)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
KeyboardInterrupt
>>>

That is an infinite loop (it will carry on printing that text forever), so you need to press Ctrl+C to interrupt it and regain control.

If you want to use multiline loops, you need to get ready to use your Tab key: Python handles loop blocks by recording the level of indent used. Some people find this odious; others admire it for forcing clean coding on users. Most of us, though, just get on with programming!

For example:

>>> i = 0
>>> while i < 3:
...     j = 0
...     while j < 3:
...         print "Pos: (" + str(i) + "," + str(j) + ")"
...         j += 1
...     i += 1
...
Pos: (0,0)
Pos: (0,1)
Pos: (0,2)
Pos: (1,0)
Pos: (1,1)
Pos: (1,2)
Pos: (2,0)
Pos: (2,1)
Pos: (2,2)

You can control loops using the break and continue keywords. break exits the loop and continues processing immediately afterward, and continue jumps to the next loop iteration.
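
Here is a minimal sketch of both keywords inside the kind of infinite while loop shown earlier. The variable names and the cutoff value of 7 are arbitrary choices for illustration.

```python
found = []
i = 0
while 1:
    i += 1
    if i % 2 == 0:
        continue     # even number: skip the rest of the block and start the next pass
    if i > 7:
        break        # leave the loop entirely; execution resumes after it
    found.append(i)

# Only the odd numbers up to 7 were collected.
total = len(found)
```

When the loop exits, found holds [1, 3, 5, 7]. Without the break, this loop would never end, because the condition while 1 is always true.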