10.1. Manipulating Text
You can't use arithmetic operators like + and - with text. If you try to perform a calculation by referring to one or more cells containing text, Excel displays a #VALUE error message. However, there's one operator that you can use: the concatenation operator (&), which joins together text. For example, imagine you have an individual's first name in cell A1, and a last name in cell A2. You could join the values from these two cells to create a full name with this formula:
=A1 & A2
There's one drawback to this approach. In all likelihood , the first and last name cells won't include any leading or trailing spaces. This means when you join them, they'll get fused into one word, like JohnSmith. One solution is to explicitly add a space (between quotation marks) into your formula, like so:
=A1 & " " & A2
The important concept in this example is that you can enter string literals fixed pieces of text (including spaces)as easily as you can enter literal numbers. The only difference between entering literal text and literal numbers is that you must place all text between quotation marks. There's no limit to how many pieces of text you can stitch together at a time. The next group of functions showcases the many different ways that Excel lets you manipulate text.
The CONCATENATE( ) function lets you join together text in exactly the same way the & operator does. CONCATENATE( ) joins all the parameters that you supply into one long piece of text, in the order you specify them.
Here's an equivalent way to rewrite the aforementioned name-joining formula by using CONCATENATE( ) with two pieces of text:
=CONCATENATE(A1, " ", A2)
Just as you can join together pieces of text, you can also split up a string of text. The LEFT( ), MID( ), and RIGHT( ) functions let you extract a portion from a larger text string. For example, the LEFT( ) function takes two arguments: the text you want to examine, and the number of characters that Excel should extract, starting from the string's left side:
To take the first four letters from the text in cell A1, you'd use the formula:
Assuming the cell contains the text tofurkey , the result of this formula would be the shortened text tofu .
The RIGHT( ) function performs the same operation, but it extracts letters starting from the right side of a string. For example, consider the following formula:
If you use this function with the same string, you'll simply end up with the text urkey .
The MID( ) function is more powerful than the LEFT( ) and RIGHT( ) functions, as it has the ability to extract a consecutive series of characters from anywhere inside a string. When using the MID( ) function, you need to supply three arguments: the text you're evaluating, the starting position, and the number of characters. Excel numbers each letter in a string (it also counts spaces), starting with 1 for the first letter, 2 for the second letter, and so on. That means if you specify a starting position of 3 and a length of 2, Excel extracts the third and fourth characters. The basic formula looks like this:
MID(text, start_position, number_of_characters)
Here's an example that copies characters from the middle of a string. If the cell A1 contains the text Swanky Franks , the following formula returns the value Frank .
=MID(A1, 8, 5)
The LEFT( ), RIGHT( ), and MID( ) functions let you copy specified segments of a string. But what happens if you don't know the exact length of the string you're searching? For example, if you're interested in retrieving last names from a column that contains full names, none of these functions will help since last names vary in length.
Fortunately, Excel provides some other tools that can help you out. Three of these are LEN( ), FIND( ), and SEARCH( ), all of which give you numeric information about text. This section first explains how to use each of these functions, then it shows how you can combine them with the LEFT( ), RIGHT( ), and MID( ) functions to perform some really powerful operations.
To begin with, LEN( )short for LENgthcounts the number of characters in a string of text. For example, the result of the following formula is 5:
The FIND( ) function is more sophisticated. It gives you a number representing the position of a given character or series of characters. For example, the function can tell you where a space is located in a phrase. If there's more than one match, the FIND( ) function returns the position of the first match only. The FIND( ) function can also take an optional third parameter indicating the position where the search should beginif left out, Excel starts at the beginning of the text. Here's what the basic formula looks like:
FIND(find_text, within_text, [start_position])
Now consider the following example. It gives you 5, indicating that the first space in the phrase "Mind the Gap" is in position 5.
=FIND(" ", "Mind the Gap")
SEARCH( ) works in almost exactly the same way, and it takes the same two or three arguments. The only difference is that FIND( ) performs a case-sensitive search, which means it looks at whether upper- and lowercase letters match, whereas SEARCH( ) doesn't care whether there are inconsistencies between the case of the text you're looking for and the text you're evaluating.
The LEN( ), FIND( ), and SEARCH( ) functions all become even more useful when used in conjunction with the LEFT( ), RIGHT( ), and MID( ) functions. For example, say you've got a column filled with full names, as shown in Figure 10-1, and you want to copy the first and last names and paste them into their own new columns (called "First Name" and "Last Name"). To copy the first name from a cell containing a full name, here's what to do.
Although you don't know how long the first name is, you can search for the position of the first space by using the FIND( ) function. Next, subtract one from the number you get from the FIND( ) function. You'll now end up with the length you need to copy the first name (using the LEFT( ) function). Here's a formula that puts all these steps together (this formula would work for cell B2 in Figure 10-1):
=LEFT(A2,FIND(" ", A2)-1)
You can use a similar trick to get the last name using the RIGHT( ) function:
=RIGHT(A2,LEN(A2)-FIND(" ", A2))
This formula is put to work in Figure 10-1.
In Figure 10-1, the formula in the formula bar can look overwhelming at first, so it helps to remember the two arguments the RIGHT( ) function is processing: first, which cell to evaluate (A2) and, second, the number of spaces over that it should count from the right (LEN(A2)-FIND(" ", A2)). The first argument is straightforward. Here's how the second argument breaks down: First, the FIND( ) function finds the location of the first space (this is the space separating the first and last name). Next, the LEN( ) function determines how many characters (including spaces) are in the full name string. Excel will then subtract the FIND( ) result from the LEN( ) result, which helps identify exactly where the last name begins. Then the RIGHT( ) function can use this number to do its job: copy out the last name.
Another way to manipulate text is by changing capitalization. Excel provides three functions for this purpose (see Figure 10-2):
UPPER( ) converts text to all capitals.
LOWER( ) converts text to all lowercase.
PROPER( ) converts text to initial-case. That means every letter will be lowercase, except for the first letter in each word. (Excel identifies where words start by looking for the spaces in between.)
All three of these functions need just one part, which is a string (a short bit of text, anywhere from a few letters to a lengthy phrase). You can use a reference to a cell that contains text, or a piece of literal textas long as you remember the quotation marks. For example, the following formula displays the text contained in cell A1, but changes all characters to lowercase:
| GEM IN THE ROUGH |
Using PROPER( ) to Change Names in All Caps to Initial Caps
Usually, functions like UPPER( ), LOWER( ), and PROPER( ) transform the appearance of text that's already in your spreadsheet. But figuring out exactly how to do this can be a little bit tricky.
Say you've got a list of first and last names in column A (beginning in cell A1) whose letters are all uppercase. Your goal is to change these names so that only the first letters of each name are capitalized. Here's what to do. First, insert a new column B. Next, enter the formula =PROPER(A1) in cell B1 and then copy this formula to the rest of the cells in column B. Because this formula uses a relative cell reference (as explained on Section 7.3), Excel automatically adjusts the formula for each cell that you paste it into.
Once you've taken this step, Excel displays the properly capitalized names in column B. But you can't just copy and paste these cells into column A (because they hold formulas that reference the cells you'd be pasting them into). Instead, you need to select the corrected names from column B, copy them (Edit Copy), move to cell A1, and then select Edit Paste Special (not the plain- vanilla Edit Paste command). Choose the Values option before clicking OK to paste the data.
This technique provides a quick way to clean up a number of problems. For example, you can also use this approach with many other text functions like TRIM( ), CLEAN( ), SUBSTITUTE( ), TEXT( ), FIXED( ), DOLLAR( ), all of which are explained in this chapter.
The TRIM( ) and CLEAN( ) functions perform minor cleanup on any strings of text you run through them. TRIM( ) removes any leading and trailing spaces; it also changes any series of more than one space to a single space. Thus, if you use TRIM( ) on the text string " Hello There " the altered text becomes "Hello There." TRIM( ) can be quite handy for fixing erratic spacing.
CLEAN( ) simply removes non-printable characters from a text string. Non-printable characters, which usually appear as empty-box icons in your text, tend to appear only if you import some text from another file format that Excel has difficulty understanding.
The SUBSTITUTE( ) function replaces a sequence of characters in a string with another set of characters. The function has three parts : the text you want to modify, the characters you're looking to replace, and the replacement text you want to insert. In addition, you can supply an optional occurrence number parameter, which Excel uses if it finds more than one match. For example, if your search text is matched three times, and you supply 2 for your occurrence number, Excel changes the second occurrence only. If you don't supply the occurrence number, Excel will change all occurrences. Here's what the function looks like:
SUBSTITUTE(text, old_text, new_text, [occurrence_number])
Consider the case where cell A1 contains the text: It was the best of times; it was the worst of times . You could use the following formula:
=SUBSTITUTE(A1, "times", "nanoseconds")
The result is the string: It was the best of nanoseconds; it was the worst of nanoseconds .
On the other hand, the following formula explicitly replaces just the second occurrence. The resulting string is: It was the best of times; it was the worst of crimes .
=SUBSTITUTE(A1, "times", "crimes", 2)
Sometimes, you might need to convert text into a number, or vice versa. For example, imagine you have a cell that contains the sentence "A good sandwich costs $5.95." Using the MID( ) function, you could copy just the part of this text that has the price"5.95." However, even though this text contains numeric information, to Excel it's still a piece of text, so you can't perform mathematical operations with it (like adding tax). On the other hand, you might have the reverse problem: you might want to show a string of text that includes a number from another cell. In these cases, the data conversion functions TEXT( ) and VALUE( ) are useful.
TEXT( ) converts an ordinary number into formatted text using the format that you specify. It always requires two arguments:
The first argument is the number you're converting; the second argument is the format you want to use. You can use any of the date, time, or numeric formatting codes described in Chapter 4. (See in particular Table 4-2.) For example, the following formula converts the number 434.2 to the formatted text $434.20.
On its own, this may not seem very practical. After all, you've already learned how you can control cell formatting using the Format Cells dialog box. What's the point of doing the same job with a formula? The answer is that you may want to do so when you're looking to perform some fancy text processing with your number.
For example, imagine you have the price of a product in cell A1 (which happens to be 300). You could use the following formula to change a number to formatted text and then put it into a complete sentence such as, $300.00 is way too expensive .
=TEXT(A1,"$#,##0.00") & " is way too expensive."
Now compare what happens if you tried to do this without using the TEXT( ) function:
=A1 & " is way too expensive."
You'd end up with an unformatted result: 300 is way too expensive . This is true regardless of what formatting you've used in cell A1.
In other words, you can use text-based functions with numbers, but unless you explicitly perform the conversion using the TEXT( ) function, Excel will always use General formattingwhich might not be what you want.
The VALUE( ) function performs the reverse transformationit converts a piece of text into a number you can manipulate in a formula. Here's an example:
This trick becomes useful if you need to extract a price out of a string of text, and then perform a calculation with it. Here's an example that gets the number 300 out of a sentence using the MID( ) function, and then converts it to a number using the VALUE( ) function:
=VALUE(MID("I suggest a price of 0.00 for the chair.",23,6))
The VALUE( ) function is fairly simple to use, but it isn't terribly bright. If you use it with content that contains both numeric characters and letters, like the string of text 42 bananas , it will fail. You'll see the error message #VALUE! instead of the desired content.
The FIXED( ) and DOLLAR( ) functions also convert numbers to text. The difference is that these functions are customized to use a set format (described below), so you don't need to specify the desired format as you would with the TEXT( ) function.
The FIXED( ) function actually performs several steps. First it rounds a number to a specified number of decimal places. Next, it formats the number with a decimal point and, optionally , with commas to separate the thousands. Finally, it converts the number to text. Here's what the function looks like:
FIXED(number, [number_of_decimals], [no_commas])
Only the first argument is required. If you don't specify the other arguments, the FIXED( ) function automatically uses two decimal places and includes commas if your number is large enough to warrant them. If you don't want commas, set the third argument to TRUE.
For example, the following formula gives you the text 5,450.59 :
The DOLLAR( ) function automatically applies the currency format before it converts a value, so your text appears with a currency symbol before it. If you want, you can specify the number of decimal places, or you can just accept the default of two decimal places:
So far you've had a tour of the most useful text manipulation functions. Excel also provides a few lesser-used functions, outlined in Table 10-1.
Returns the character for a specific character code. For example, CHAR(100) corresponds to the lowercase letter d.
Returns the numeric code for the first character in a text string. For example, CODE("d") returns 100.
Compares two pieces of text, and returns TRUE if they match, or FALSE if they don't. Usually, it's just easier to use the IF( ) function with the equal sign for comparison, as described in Chapter 12.
Creates a text string by repeating the text you specify the number of times you specify. This is quick way to repeat one or more characters in a cell.
T( ) ensures that certain content is in text form. If you use T( ) with a piece of text, it returns that piece of text. If you use T( ) with a number, it returns an empty string. Thus, =T("Hello") returns the text Hello , while =T(56) just returns a blank value.
N( ) ensures that certain content is a number. If you use N( ) with a numeric value, it returns that number. If you use N( ) with a text string, it returns 0, even if the text could be converted to a number. Thus, =N(2) returns the number 2, while =N("2") returns 0.