Finding Matches


Finding matches is all about searching text. You create a regular expression, also called a pattern, and then see whether that pattern matches any of the text you're searching. You can use regular expressions in JavaScript to find matches in several ways:

  • The String object's match method. This method finds regular expression matches in a string.

    Syntax: string.match(regExp). Returns an array of matches or null if there were no matches. See Chapter 18, "The Date, Time, and String Objects," for more information.

  • The String object's search method. This method returns the position of the first substring match in a regular expression search, or -1 if there is no match.

    Syntax: string.search(regExp), where regExp is the regular expression to match. See Chapter 18 for more information.

  • The regular expression's exec method. Like the String object's match method, this method finds regular expression matches in a string.

    Syntax: regularExpression.exec(string), where string is the string you're searching. Returns an array of matches or null if there were no matches.
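To compare the three methods side by side, here's a minimal sketch that can be run outside a browser. The sample text and the pattern /\d+/ are assumptions made for illustration and don't come from the chapter's listings.

```javascript
// Sample text and pattern are illustrative assumptions.
var text = "Order 66 shipped on day 12."
var digits = /\d+/

// String match method: returns an array holding the first match here.
var matchResult = text.match(digits)
console.log(matchResult[0])      // "66"

// String search method: returns the position of the first match.
var position = text.search(digits)
console.log(position)            // 6

// Regular expression exec method: returns an array, like match.
var execResult = digits.exec(text)
console.log(execResult[0])       // "66"

// With no match, match and exec return null and search returns -1.
console.log("no numbers".match(digits))    // null
console.log("no numbers".search(digits))   // -1
```

Note that match and exec signal failure with null, so it's worth testing the result before indexing into it.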

For example, we've already seen the match method at work in Chapter 18 (in Listing 18-05.html). In that example, I only allowed the user to enter date strings that are made up of one or two digits, followed by a slash (/), followed by one or two digits, followed by a slash, followed by two or four digits for the year. To do that, I used two regular expressions (we'll see how to create these regular expressions later in the chapter) and the match method; if both match expressions returned null, no valid dates were found:

<HTML>
    <HEAD>
        <TITLE>Checking Dates</TITLE>
        <SCRIPT LANGUAGE="JavaScript">
            <!--
            function checker()
            {
                var regExp1 = /^(\d{1,2})\/(\d{1,2})\/(\d{2})$/
                var regExp2 = /^(\d{1,2})\/(\d{1,2})\/(\d{4})$/
                var resultArray1 = document.form1.text1.value.match(regExp1)
                var resultArray2 = document.form1.text1.value.match(regExp2)
                if (resultArray1 == null && resultArray2 == null) {
                    alert("Sorry, that's not in valid date format.")
                } else {
                    alert("That's in valid date format.")
                }
            }
            //-->
        </SCRIPT>
    </HEAD>
    <BODY>
        <H1>Checking Dates</H1>
        <FORM NAME="form1">
            <INPUT TYPE="TEXT" NAME="text1"></INPUT>
            <INPUT TYPE="BUTTON" VALUE="Check Date" ONCLICK="checker()">
        </FORM>
    </BODY>
</HTML>
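The same check can be tried without a form. The following is only a sketch: isDateFormat is a hypothetical helper name, and it takes the text as an argument instead of reading document.form1.text1.value.

```javascript
// isDateFormat is a hypothetical helper, not part of the listing above.
function isDateFormat(text) {
    var regExp1 = /^(\d{1,2})\/(\d{1,2})\/(\d{2})$/
    var regExp2 = /^(\d{1,2})\/(\d{1,2})\/(\d{4})$/
    // match returns null when the pattern doesn't match
    return text.match(regExp1) != null || text.match(regExp2) != null
}

console.log(isDateFormat("9/2/57"))        // true
console.log(isDateFormat("09/02/1957"))    // true
console.log(isDateFormat("09/02/195"))     // false: three-digit year
console.log(isDateFormat("September 2"))   // false
```

These patterns check only the shape of the text, so a string such as "99/99/99" also passes.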

You can see the results in Chapter 18, in Figure 18.5. This brings up a question: If there were indeed matches to a regular expression, how do you get the matching text? In other words, how do you find out which substrings in the text you're searching matched your regular expression?

If your regular expression matches any substring in the searched string, you get an array holding those matches. To search for text that uses only lower case letters, for example, you can use the regular expression /\b[^A-Z]+\b/ (which means "a word boundary, followed by one or more characters that aren't upper case letters, followed by a word boundary"). We'll see how to create this regular expression in this chapter. Note that [^A-Z] accepts any character that isn't an upper case letter, including spaces, so a match can span several words. Here's an example that puts this regular expression to work, using the exec method, which returns an array holding the lower case text it matched in the string "JavaScript is the subject.":

(Listing 20-02.html on the web site)
<HTML>
    <HEAD>
        <TITLE>Getting Lower Case Words</TITLE>
        <SCRIPT LANGUAGE="JavaScript">
            <!--
            function getLowers()
            {
                var regexp = /\b[^A-Z]+\b/
                var matches = regexp.exec(document.form1.text1.value)
                if (matches) {
                    // Clear earlier results so repeated clicks don't accumulate text
                    document.form1.text2.value = ""
                    for (var loopIndex = 0; loopIndex < matches.length; loopIndex++){
                        document.form1.text2.value += matches[loopIndex] + " "
                    }
                } else {
                    document.form1.text2.value = "Sorry, no lower case words."
                }
            }
            //-->
        </SCRIPT>
    </HEAD>
    <BODY>
        <H1>Getting Lower Case Words</H1>
        <FORM NAME="form1">
            <INPUT TYPE="TEXT" NAME="text1" VALUE="JavaScript is the subject." SIZE="30">
            <BR>
            <INPUT TYPE="BUTTON" VALUE="Get Lower Case Words" ONCLICK="getLowers()">
            <BR>
            <INPUT TYPE="text" NAME="text2" SIZE="30">
        </FORM>
    </BODY>
</HTML>

You can see the results of this code in Figure 20.2, where the code is displaying all the lower case words by looping over the array returned by the exec method.

Figure 20.2. Getting matches to a regular expression.
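It can help to look at exactly what exec returns for this text. The sketch below runs the chapter's pattern outside a browser, then tries an assumed variant that isn't in the listing: the character class [a-z] combined with the g (global) flag, which makes the match method return every match.

```javascript
var text = "JavaScript is the subject."

// The chapter's pattern: [^A-Z] also accepts spaces, so the match
// is one run of text rather than a list of separate words.
var firstMatch = /\b[^A-Z]+\b/.exec(text)
console.log(firstMatch.length)   // 1
console.log(firstMatch[0])       // " is the subject"

// Assumed variant: [a-z] allows only lower case letters, and the
// g flag tells match to collect all the matches in the string.
var words = text.match(/\b[a-z]+\b/g)
console.log(words.join(", "))    // "is, the, subject"
```

The text displayed is the same either way, but only the second form gives you each word as its own array element.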


That technique is fine if your regular expression is intended to be used over and over to find a number of matches, as was the case in this example. However, you can also use regular expressions that have various submatches.

Suppose, for example, that you want to extract the month, day, and year from a "MM/DD/YYYY" string. You can use a single regular expression to match all three of those items. In this case, I'll match the date in "MM/DD/YYYY" format using this regular expression (we'll see how to create regular expressions like this in this chapter):

 var regexp = /^(0?[1-9]|1[0-2])\/(0?[1-9]|[12][0-9]|3[01])\/((18|19|20)\d{2})$/

Note the parentheses, which I've placed around parts of the regular expression that I want to treat as submatches. The first expression in parentheses is (0?[1-9]|1[0-2]), which will match the month; the next is (0?[1-9]|[12][0-9]|3[01]), which matches the day; and the last is ((18|19|20)\d{2}), which matches the year. (The parentheses around (18|19|20) are there to match either 18, 19, or 20 as the first two digits of the year; the submatch they produce is not used here, as discussed later in this chapter.) Because these expressions are enclosed in parentheses, we can access the text they matched. (Such text is called a submatch.)

Suppose, for example, that I use this regular expression with the date "09/02/1957". The exec method will return an array with the matches to our regular expression. The first element in the array will be the match to the entire regular expression ("09/02/1957"), the next will be the match to the expression in the first set of parentheses ("09"), the next will be the match to the expression in the second set of parentheses ("02"), and the next will be the match to the expression in the third set of parentheses ("1957"). (The array also holds one more element, "19", for the inner parentheses in the year expression; this example doesn't use it.) Here's an example putting this all to work by extracting the parts of a date the user has entered:

(Listing 20-03.html on the web site)
<HTML>
    <HEAD>
        <TITLE>Getting Match Data</TITLE>
        <SCRIPT LANGUAGE="JavaScript">
            <!--
            function getDate()
            {
                var regexp = /^(0?[1-9]|1[0-2])\/(0?[1-9]|[12][0-9]|3[01])\/((18|19|20)\d{2})$/
                var matches = regexp.exec(document.form1.text1.value)
                if (matches) {
                    var month = matches[1]
                    var date = matches[2]
                    var year = matches[3]
                    document.form1.text2.value = "Month: " + month +
                        " Day: " + date + " Year: " + year
                }
            }
            //-->
        </SCRIPT>
    </HEAD>
    <BODY>
        <H1>Getting Match Data</H1>
        <FORM NAME="form1">
            Type a date (MM/DD/YYYY):
            <BR>
            <INPUT TYPE="TEXT" NAME="text1">
            <BR>
            <INPUT TYPE="BUTTON" VALUE="Get Date" ONCLICK="getDate()">
            <BR>
            <INPUT TYPE="text" NAME="text2" SIZE="30">
        </FORM>
    </BODY>
</HTML>

You can see the results of this code in Figure 20.3, where we're able to dissect a date into its component parts using parenthesized submatches.

Figure 20.3. Dissecting a date.
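To inspect the array that exec returns for the date "09/02/1957" without a form, here's a minimal sketch. It assumes the pattern is written with alternation bars (|) between the choices, which is what the description "either 18, 19, or 20" calls for.

```javascript
// Alternation (|) separates the choices inside each set of parentheses.
var regexp = /^(0?[1-9]|1[0-2])\/(0?[1-9]|[12][0-9]|3[01])\/((18|19|20)\d{2})$/

var matches = regexp.exec("09/02/1957")
console.log(matches[0])   // "09/02/1957" (the entire match)
console.log(matches[1])   // "09" (month)
console.log(matches[2])   // "02" (day)
console.log(matches[3])   // "1957" (year)
console.log(matches[4])   // "19" (inner parentheses, unused in the listing)

// A month outside 1 through 12 doesn't match, so exec returns null.
console.log(regexp.exec("13/02/1957"))   // null
```

Submatches are numbered by the position of their opening parenthesis, which is why the year comes before the century in the array.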




Inside JavaScript
ISBN: 0735712859
EAN: 2147483647
Year: 2005
Pages: 492
Authors: Steve Holzner
