Regular Expressions: Quantifiers


What if you want to match an unknown number of characters? For example, what if you're trying to match words that can vary in length? You can do that with quantifiers.

For example, the pattern \w matches a single word character (any alphanumeric character or "_"). To match one or more word characters in succession (that is, a whole word), you can use the expression \w+, because + is a quantifier that means "one or more of."
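As a quick illustration outside a web page, here is a minimal sketch that collects every word in a string with \w+. The sample sentence and the variable names are made up for this example, and the g flag is used so that match returns all the matches instead of only the first.

```javascript
// Hypothetical sample text; any string works here.
const sentence = "Regular expressions are fun_to_use!";

// \w+ matches each run of one or more word characters;
// the g flag asks match() for every such run.
const words = sentence.match(/\w+/g);

console.log(words); // [ 'Regular', 'expressions', 'are', 'fun_to_use' ]
```

Note that the underscore counts as a word character, so fun_to_use comes back as one word, while the space and the exclamation point end a match.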

Here's another example. In this case, I want to replace one or more a characters with a single a character, changing the word JaaaaaavaScript to JavaScript. I can match all those a characters with the pattern a+:

(Listing 20-07.html on the web site)
    <HTML>
        <HEAD>
            <TITLE>Replacing Characters</TITLE>
            <SCRIPT LANGUAGE="JavaScript">
                <!--
                function replacer()
                {
                    var regexp = /a+/g
                    document.form1.text2.value = document.form1.text1.value.replace(regexp, "a")
                }
                //-->
            </SCRIPT>
        </HEAD>
        <BODY>
            <H1>Replacing Characters</H1>
            <FORM NAME="form1">
                <INPUT TYPE="TEXT" NAME="text1" VALUE="JaaaaaavaScript is the subject." SIZE="30">
                <BR>
                <INPUT TYPE="BUTTON" VALUE="Replace Characters" ONCLICK="replacer()">
                <BR>
                <INPUT TYPE="TEXT" NAME="text2" SIZE="30">
            </FORM>
        </BODY>
    </HTML>

You can see the results of this code in Figure 20.6.

Figure 20.6. Replacing characters.
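If you want to try the same replacement without a form, the following sketch runs the pattern from the listing on the listing's sample text directly. The variable names are assumptions made for this example; only the pattern /a+/g and the sample string come from the listing.

```javascript
// Same sample text the listing places in the first text field.
const original = "JaaaaaavaScript is the subject.";

// a+ matches each run of one or more a characters; with the g flag,
// replace() collapses every run, not just the first one.
const cleaned = original.replace(/a+/g, "a");

console.log(cleaned); // JavaScript is the subject.
```

The single a in "va" is also matched by a+ and replaced with itself, which leaves it unchanged.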


So which quantifiers are available? Here they are:

  • * Match zero or more times

  • + Match one or more times

  • ? Match one or zero times

  • {n} Match n times

  • {n,} Match at least n times

  • {n,m} Match at least n but not more than m times
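To see how these quantifiers differ, here is a small sketch that tests each one against short strings. The ^ and $ anchors are an addition for this example (they are not discussed in this section); they force the pattern to account for the whole string, so the quantifier alone decides whether the test passes. The variable names and the colou?r pattern are also just illustrations.

```javascript
// * accepts zero occurrences, so the empty string passes.
const starOnEmpty = /^a*$/.test("");          // true
// + needs at least one occurrence, so the empty string fails.
const plusOnEmpty = /^a+$/.test("");          // false
// ? makes the u optional: zero or one occurrence.
const optionalU = /^colou?r$/.test("color");  // true
// {3} means exactly three, so four a characters fail.
const exactlyThree = /^a{3}$/.test("aaaa");   // false
// {2,} means two or more.
const atLeastTwo = /^a{2,}$/.test("aaaa");    // true
// {2,3} means two or three, so four fail.
const twoToThree = /^a{2,3}$/.test("aaaa");   // false

console.log(starOnEmpty, plusOnEmpty, optionalU, exactlyThree, atLeastTwo, twoToThree);
```

Without the anchors, a pattern such as a{3} would still find a match inside aaaa, because three of the four characters satisfy it.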

Here's another example where I'm making sure the user types lines of at least 20 characters, matching any character with the dot (.) special character and using the quantifier {20,}:

(Listing 20-08.html on the web site)
    <HTML>
        <HEAD>
            <TITLE>Checking Text Length</TITLE>
            <SCRIPT LANGUAGE="JavaScript">
                <!--
                function checkText()
                {
                    var regexp = /.{20,}/
                    var matches = regexp.exec(document.form1.text1.value)
                    if (!matches) {
                        alert("Please type longer sentences...")
                    }
                }
                //-->
            </SCRIPT>
        </HEAD>
        <BODY>
            <H1>Checking Text Length</H1>
            <FORM NAME="form1">
                <INPUT TYPE="TEXT" NAME="text1">
                <BR>
                <INPUT TYPE="BUTTON" VALUE="Check Text" ONCLICK="checkText()">
            </FORM>
        </BODY>
    </HTML>
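The same check can be tried without a form. This is a minimal sketch, assuming a helper named isLongEnough (a hypothetical name, not part of the listing) that wraps the listing's pattern and reports whether exec found a match.

```javascript
// Hypothetical helper: true when the text contains a run of
// at least 20 characters that the dot can match.
function isLongEnough(line) {
  // exec() returns null when there is no match.
  return /.{20,}/.exec(line) !== null;
}

console.log(isLongEnough("Too short."));                              // false
console.log(isLongEnough("This sentence has plenty of characters.")); // true
```

Because the dot does not match line-break characters, the 20 characters must all fall on a single line.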

Regular expression quantifiers are "greedy," which means they'll return the longest match possible. What does that mean? Here's an example. Suppose we want to change "That is some text, isn't it?" to "That's some text, isn't it?" by replacing That is with That's. You might try searching for any number of characters followed by is, using the pattern .*is and hoping it will match That is so that you can replace the match with That's:

    <HTML>
        <HEAD>
            <TITLE>Replacing Characters</TITLE>
            <SCRIPT LANGUAGE="JavaScript">
                <!--
                function displayer()
                {
                    var regexp = /.*is/
                    document.form1.text2.value = document.form1.text1.value.replace(regexp, "That's")
                }
                //-->
            </SCRIPT>
        </HEAD>
        <BODY>
            <H1>Replacing Characters</H1>
            <FORM NAME="form1">
                <INPUT TYPE="TEXT" NAME="text1" VALUE="That is some text, isn't it?" SIZE="30">
                <BR>
                <INPUT TYPE="BUTTON" VALUE="Find First Word" ONCLICK="displayer()">
                <BR>
                <INPUT TYPE="TEXT" NAME="text2">
            </FORM>
        </BODY>
    </HTML>

The problem is that because quantifiers are greedy, they try to match as much as they can, which means JavaScript will use the .* preceding is to match all the characters up to the last is in the text (the one in isn't). That makes the result of the preceding code "That'sn't it?", which is not what we expected. However, you can fix the problem by making quantifiers less greedy; see the section "Quantifier Greediness" later in this chapter.
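To see exactly what the greedy pattern grabbed, this sketch runs the same pattern on the same text and prints the matched portion before doing the replacement. The variable names are assumptions for this example; the pattern and the text come from the listing.

```javascript
const text = "That is some text, isn't it?";
const greedy = /.*is/;

// exec() returns an array whose first element is the matched text.
const matched = greedy.exec(text)[0];
console.log(matched); // That is some text, is

// Replacing that whole match leaves only the tail of the sentence.
const replaced = text.replace(greedy, "That's");
console.log(replaced); // That'sn't it?
```

The match runs through the is in isn't, which is the last place in the text where the pattern can still end in is.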



Inside JavaScript
ISBN: 0735712859
EAN: 2147483647
Year: 2005
Pages: 492
Authors: Steve Holzner
