Indexing Strings


By using a for loop, you're able to go through a string, one character at a time, in order. This is known as sequential access, which means you have to go through a sequence one element at a time, starting from the beginning. Sequential access is like going through a stack of heavy boxes that you can only lift one at a time. To get to the bottom box in a stack of five, you'd have to lift the top box, then the next box, followed by the next box, then one more to finally get to the last box.

Wouldn't it be nice to just grab the last box without messing with any of the others? This kind of direct access is called random access. Random access allows you to get to any element in a sequence directly.

Fortunately, there's a way to randomly access elements of a sequence. It's called indexing. Through indexing, you specify a position (or index) number in a sequence and get the element at that position. In the box example, you could get the bottom box directly, by asking for box number five.
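To make the contrast concrete, here is a minimal sketch of both kinds of access. It is written in Python 3 syntax (print as a function), whereas the chapter's own listings use the older print statement. The helper names last_char_sequential and last_char_indexed are hypothetical, made up for this illustration.

```python
def last_char_sequential(text):
    """Walk the string from the start, like lifting boxes one at a time."""
    current = None
    for char in text:        # visits every character in order
        current = char
    return current


def last_char_indexed(text):
    """Jump straight to the last character by its position number."""
    # Position numbers start at 0, so the last one is the length minus one.
    return text[len(text) - 1]


word = "index"
print(last_char_sequential(word))
print(last_char_indexed(word))
```

Both calls return the same letter. The difference is that the first one touches every character on the way, while the second goes directly to the position it wants.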

Introducing the Random Access Program

The Random Access program uses sequence indexing to directly access random characters in a string. The program picks a random position from the string "index", and prints the letter and the position number. The program does this 10 times to give a good sampling of random positions. Figure 4.5 shows the program in action.

Figure 4.5: You can directly access any character in a string through indexing.

The following is the code for the program:

 # Random Access
 # Demonstrates string indexing
 # Michael Dawson - 1/27/03

 import random

 word = "index"
 print "The word is: ", word, "\n"

 high = len(word)
 low = -len(word)

 for i in range(10):
     position = random.randrange(low, high)
     print "word[", position, "]\t", word[position]

 raw_input("\n\nPress the enter key to exit.")

Working with Positive Position Numbers

In this program, one of the first things I do is assign a string value to a variable:

 word = "index" 

Nothing new here. But by doing this, I create a sequence (like every time I create a string) where each character has a numbered position. The first letter, "i," is at position 0. (Remember, computers usually start counting from 0.) The second letter, "n," is at position 1. The third letter, "d," is at position 2, and so on.

Accessing an individual character of a string is easy. To access the letter in position 0 from the variable word, you'd just type word[0]. For any other position, you'd just substitute that number. To help cement the idea, take a look at part of an interactive session I had:

 >>> word = "index"
 >>> print word[0]
 i
 >>> print word[1]
 n
 >>> print word[2]
 d
 >>> print word[3]
 e
 >>> print word[4]
 x

TRAP

Since there are five letters in the string "index", you might think that the last letter, "x," would be at position 5. But you'd be wrong. There is no position 5 in this string, because the computer begins counting at 0. Valid positive positions are 0, 1, 2, 3, and 4. Any attempt to access position 5 will cause an error. Take a look at an interactive session for proof:

 >>> word = "index"
 >>> print word[5]
 Traceback (most recent call last):
   File "<pyshell#1>", line 1, in ?
     print word[5]
 IndexError: string index out of range

Somewhat rudely, the computer is saying there is no position 5. So remember, the last element in a sequence is at the position number of its length minus one.
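The "length minus one" rule can be checked directly. The sketch below uses Python 3 syntax rather than the print statement of the sessions above, and char_at is a hypothetical helper introduced only for this example. It catches the IndexError instead of letting the program stop.

```python
def char_at(text, position):
    """Return the character at position, or None if the position is invalid."""
    try:
        return text[position]
    except IndexError:
        return None


word = "index"
last_position = len(word) - 1            # five letters, so the last position is 4

print(last_position)
print(char_at(word, last_position))      # the last letter
print(char_at(word, len(word)))          # None, because there is no position 5
```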

Working with Negative Position Numbers

Except for the idea that the first letter of a string is at position 0 and not 1, working with positive position numbers seems pretty natural. But there's also a way to access elements of a sequence through negative position numbers. With positive position numbers, your point of reference is the beginning of the sequence. For strings, this means that the first letter is where you start counting. But with negative position numbers, you start counting from the end. For strings, that means you start counting from the last letter and work backwards.

The best way to understand how negative position numbers work is to see an example. Take a look at another interactive session I had, again, using the string "index":

 >>> word = "index"
 >>> print word[-1]
 x
 >>> print word[-2]
 e
 >>> print word[-3]
 d
 >>> print word[-4]
 n
 >>> print word[-5]
 i

You can see from this session that word[-1] accesses the last letter of "index", the "x." When using negative position numbers, -1 means the last element, -2 means the second-to-last element, -3 means the third-to-last element, and so on. Sometimes it makes more sense for your reference point to be the end of a sequence. For those times, you can use negative position numbers.

Figure 4.6 provides a nice way to see the string "index" broken up by position numbers, both positive and negative.

Figure 4.6: You can access any letter of "index" with a positive or negative position number.
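The pairing of positive and negative position numbers can also be worked out in code: subtracting the length of the string from a positive position gives the negative position for the same letter. This is a small sketch in Python 3 syntax, and position_pairs is a hypothetical helper name used only here.

```python
def position_pairs(text):
    """Pair each positive position with the negative position for the same letter."""
    pairs = []
    for positive in range(len(text)):
        negative = positive - len(text)      # for "index": 0 - 5 gives -5
        pairs.append((positive, negative, text[positive]))
    return pairs


word = "index"
for positive, negative, letter in position_pairs(word):
    # Both position numbers reach the same character.
    assert word[positive] == word[negative]
    print(positive, negative, letter)
```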

Accessing a Random String Element

It's time to get back to the Random Access program. To access a random letter from the string "index", I need to generate random numbers. So, the first thing I did in the program was import the random module:

 import random 

Next, I wanted a way to pick any valid position number in word, negative or positive. I wanted my program to be able to generate a random number between -5 and 4, inclusive, because those are all the possible position values of word. Luckily, the random.randrange() function can take two end points and produce a random number from between them. So, I created two end points:

 high = len(word)
 low = -len(word)

high gets the value 5, because "index" has five characters in it. The variable low gets the negative value of the length of the word (that's what putting a minus sign in front of a number does). So low gets the value of -5. This represents the range from which I want to grab a random number.

More precisely, I want a random number from -5 (included) up to, but not including, 5. And that's exactly the way the random.randrange() function works. If you pass it two arguments, it will produce a random number from and including the low end point, up to, but not including, the high end point. So in my program, the line:

    position = random.randrange(low, high) 

produces either -5, -4, -3, -2, -1, 0, 1, 2, 3, or 4. This is exactly what I want, since these are all the possible valid position numbers for the string "index".
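One way to convince yourself of this is to draw a large number of positions and record which values ever appear. The following sketch uses Python 3 syntax. The seeded random.Random(0) generator and the count of 10,000 draws are assumptions chosen for this illustration so that repeated runs behave the same way; the original program simply calls random.randrange() directly.

```python
import random

word = "index"
low = -len(word)     # -5
high = len(word)     # 5

# A separate, seeded generator (an assumption made for repeatable runs).
rng = random.Random(0)

# Draw many positions and collect the distinct values that show up.
seen = set()
for _ in range(10000):
    seen.add(rng.randrange(low, high))

print(sorted(seen))
print(high in seen)   # False: the high end point is never produced
```

Every value that turns up is a valid position for word, and 5 never does.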

Finally, I created a for loop that executes 10 times. In the loop body, the program picks a random position value and prints that position value and corresponding letter:

 for i in range(10):
     position = random.randrange(low, high)
     print "word[", position, "]\t", word[position]
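For readers on a newer interpreter, here is one possible Python 3 rendering of the same loop. It is a sketch, not the book's listing: the function random_access_lines and its count and rng parameters are hypothetical names added so the logic can be reused and tested, and the closing "press the enter key" pause is left out.

```python
import random


def random_access_lines(word, count=10, rng=random):
    """Build one output line for each of `count` random positions in word."""
    low = -len(word)
    high = len(word)
    lines = []
    for _ in range(count):
        position = rng.randrange(low, high)    # low included, high excluded
        lines.append("word[ {} ]\t{}".format(position, word[position]))
    return lines


word = "index"
print("The word is: ", word, "\n")
for line in random_access_lines(word):
    print(line)
```

Passing a seeded generator, for example random_access_lines(word, 10, random.Random(1)), gives the same sequence of positions on every run, which can help when checking the output by hand.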




Python Programming for the Absolute Beginner
Python Programming for the Absolute Beginner, 3rd Edition
ISBN: 1435455002
Year: 2003
Pages: 194
