Form-Field Checking

   

As you can see, it isn't too hard to make a form a little more user-friendly. You can make the task of entering information easier on the user, but to make it worthwhile for yourself, you should check to see that the user has entered data that is actually useful. Otherwise, why collect it? Just checking to make sure that they filled in a field is not always adequate.

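That's where ereg() comes in. ereg() is a built-in PHP function whose name stands for "evaluate a regular expression." There is also a Perl-compatible function called preg_match() that does the same job (and may be a little faster). Both functions take the pattern first and the string second; the main difference is that preg_match() requires delimiters, such as slashes, around the pattern. Note that the ereg() family was deprecated in PHP 5.3 and removed in PHP 7, so on current versions of PHP you would use preg_match(). The syntax for ereg() is: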

 ereg(search, string)  

where search is the regular expression to look for, and string is the string that you wish to search. If the regular expression matches the string, ereg() returns a value that evaluates to true; otherwise, it returns false. You can therefore use it directly as the condition of an if statement.
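On PHP 7 and later, where ereg() is not available, the same kind of test can be sketched with preg_match(). The helper name matches_pattern() below is hypothetical and exists only to mirror the ereg(search, string) call shape; it assumes the pattern contains no literal "/" characters.

```php
<?php
// Hypothetical helper: wraps preg_match() so it reads like the
// ereg(search, string) call described above and returns a strict boolean.
function matches_pattern(string $search, string $string): bool
{
    // preg_match() needs delimiters around the pattern; "/" is used here,
    // so a literal "/" inside $search would have to be escaped first.
    return preg_match('/' . $search . '/', $string) === 1;
}

var_dump(matches_pattern('^Hello', 'Hello, world')); // bool(true)
var_dump(matches_pattern('^Hello', 'Say Hello'));    // bool(false)
```

Comparing the result with 1 keeps the return value a plain boolean, since preg_match() returns 1 for a match, 0 for no match, and false on error.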

Now, regular expressions are one of those things that all programmers who spend a lot of time wading around in text should really get to know. The different search criteria used in regular expressions can be cryptic, but they really get the job done.

Take a look at the example we used earlier:

   if(ereg("^ +", $last)) {
       $error['last'] = true;
       $print_again = true;
   }

The search criteria, "^ +", literally means, "A string that starts with one or more spaces." We search for this because we do not want users to slip the form past validation by entering spaces for the last name instead of characters.
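As a rough sketch, the same check can be written with preg_match() on current PHP versions. Both function names below are hypothetical, and the second, stricter variant is an assumption about what you might want rather than something taken from the example above.

```php
<?php
// Hypothetical helper: true when the field starts with a space,
// the same condition as the "^ +" search above.
function starts_with_space(string $value): bool
{
    return preg_match('/^ +/', $value) === 1;
}

// A stricter variant (an assumption, not from the original example):
// true when the field is empty or contains nothing but whitespace.
function is_blank(string $value): bool
{
    return preg_match('/^\s*$/', $value) === 1;
}

var_dump(starts_with_space('   '));    // bool(true)
var_dump(starts_with_space(' Smith')); // bool(true)
var_dump(is_blank(' Smith'));          // bool(false)
```

Note that "^ +" also flags a name such as " Smith" that merely begins with a space. Whether that is desirable depends on the form; trimming the input before checking it is a common alternative.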

However, we can get a lot more detailed about analyzing the data entered into a form using other regular expression operators. A list of common regular expression operators can be found in Table 4-1.

Table 4-1. Common Regular Expression Operators

| SYMBOL | DESCRIPTION | EXAMPLES |
| --- | --- | --- |
| * | Match zero or more occurrences of the preceding item. | Unlike the UNIX shell wildcard, * does not match anything by itself; it repeats whatever comes right before it. "ab*" matches "a", "ab", and "abb". Place it after a group, as in "(ab)*", to repeat a group of symbols. |
| + | Match one or more occurrences of the preceding item. | "e+" matches a string that has one or more occurrences of the character "e" in it. |
| ^ | Placed at the beginning of a pattern, anchors the match to the start of the string. | "^Hello" matches any string that starts with the word "Hello". |
| $ | Placed at the end of a pattern, anchors the match to the end of the string. | "goodbye!$" matches any string that ends with "goodbye!" |
| \| | Separates alternative searches. | "a\|b" matches a string that contains the character "a" or the character "b". |
| [] | Matches one character from a range of characters. | "[a-z]" matches a string that contains a lowercase character from a through z. "[A-D]" matches a string that contains an uppercase character from A through D. "[0-9]" matches a string that contains a numeral from 0 through 9. "[1-5]" matches a string that contains a numeral from 1 to 5. |
| () | Group operators in a sequence. | Parentheses work much as they do in most programming languages: they group items together. For example, "(5[0-9]+)" matches a string that contains a 5 followed by one or more numerals. "5" does not match, while "55" does. |
| . | Matches any single character. | "[0-9].[0-9]" matches a string that contains a numeral followed by any other character, followed by a numeral. |
| \ | Escape an operator so that it is taken literally. | You must escape the following operators if you want to search for them literally: * \ + ? { } . The same applies to the other operators in this table: ^ $ \| ( ) [ ] |
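The operators in Table 4-1 can be tried out with a short script. The sketch below uses preg_match() with "/" delimiters (an assumption made so that it runs on current PHP versions); the sample strings are made up for illustration.

```php
<?php
// Each entry: a pattern built from Table 4-1 operators, a sample string,
// and whether the pattern is expected to match it.
$checks = [
    ['e+',          'green',          true],
    ['^Hello',      'Hello there',    true],
    ['goodbye!$',   'well, goodbye!', true],
    ['a|b',         'xyz',            false],
    ['[1-5]',       '9',              false],
    ['(5[0-9]+)',   '5',              false],
    ['(5[0-9]+)',   '55',             true],
    ['[0-9].[0-9]', '3x7',            true],
    ['\.',          'no dot here',    false],
];

$results = [];
foreach ($checks as [$pattern, $subject, $expected]) {
    $matched = preg_match('/' . $pattern . '/', $subject) === 1;
    // Record whether the outcome agrees with the expectation above.
    $results[] = ($matched === $expected);
    printf("%-12s %-16s %s\n", $pattern, $subject, $matched ? 'match' : 'no match');
}
```

The patterns are written in single quotes so that PHP does not try to interpret "$" or "\" before the regular expression engine sees them.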

Checking for Valid Email Addresses

One of the most common things for a user to enter into a form is an email address. Having users' email addresses lets you do things like send users their password if they happen to forget it. This saves users from having to create a new account on your site, and it also saves you some space in your database.

But sometimes a user will fill out a form and not put in a valid email. Have you ever received a form submission with something along the lines of "qwert" in the email field? Using regular expression matching, you can at least check the user's email address entry to see if it is technically valid.

A technically valid email address consists of a username, the "@" sign, and a server name. Valid usernames can contain letters, numbers, the underscore ("_"), the minus sign ("-"), and periods ("."). Valid server names are almost the same, except that server names cannot contain an underscore. Finally, the end of the domain name must have a "." in it followed by two or more letters, such as ".com", ".it", or ".info". Using our regular expression operators from Table 4-1, we can build a regular expression that matches against valid email addresses.

Let's state that a valid username must start with at least one letter or one number:

 ^[a-z0-9]+  

This is followed by zero or more letters or numbers, underscores "_", or minus signs "-":

 [a-z0-9_-]*  

Then it can be followed by zero or more groups, each made up of a "." followed by one or more letters, numbers, underscores "_", or minus signs "-":

 (\.[a-z0-9_-]+)*  

then followed by the @ sign:

 @  

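which is then followed by at least one letter, number, underscore "_", or minus sign "-" (the expression is a little more lenient here than the rule stated earlier, which does not allow underscores in server names):

 [a-z0-9_-]+  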

The server name can then continue with zero or more groups of the same form, a "." followed by one or more letters, numbers, underscores "_", or minus signs "-":

 (\.[a-z0-9_-]+)*  

Finally, it must be followed by another ".", then two or more letters (the end of the domain name), and then the email string must end. The expression below accepts the same strings as the simpler \.[a-z]{2,}$:

 \.([a-z]+){2,}$  

Put it all together and you get:

 ^[a-z0-9]+[a-z0-9_-]*(\.[a-z0-9_-]+)*@[a-z0-9_-]+(\.[a-z0-9_-]+)*\.([a-z]+){2,}$  
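One way to keep a long expression like this readable is to assemble it from its pieces. The following is only a sketch: the variable names are made up, the server-name piece is taken as it appears in the combined expression, and the "/" delimiters are there because the sketch uses preg_match() so that it runs on current PHP versions.

```php
<?php
// The pieces walked through above, in order.
$parts = [
    '^[a-z0-9]+',        // username starts with a letter or number
    '[a-z0-9_-]*',       // then more letters, numbers, "_" or "-"
    '(\.[a-z0-9_-]+)*',  // optional ".something" groups
    '@',                 // the @ sign
    '[a-z0-9_-]+',       // first part of the server name
    '(\.[a-z0-9_-]+)*',  // optional further ".something" groups
    '\.([a-z]+){2,}$',   // final "." plus two or more letters, then the end
];

// Join the pieces and wrap them in the delimiters preg_match() expects.
$emailPattern = '/' . implode('', $parts) . '/';

var_dump(preg_match($emailPattern, 'joe.smith@works.it')); // int(1)
var_dump(preg_match($emailPattern, 'qwert'));              // int(0)
```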

But we haven't taken into account the case of the letters! The expressions above only allow for lowercase letters in the email address, but uppercase letters are valid as well. To save some time, we didn't include the "A-Z" ranges in our expressions, because we can use a slightly different form of ereg(), which is called eregi(). eregi() works exactly the same as ereg() except that it is not case-sensitive. That way you don't have to search for both uppercase and lowercase characters in your expressions, using the [a-zA-Z] syntax. You only need to use [a-z].
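With preg_match(), the counterpart of eregi() is the "i" modifier placed after the closing delimiter. The short sketch below, using a deliberately tiny pattern and a made-up sample string, shows the difference the modifier makes.

```php
<?php
$word = 'Busted';

// Case-sensitive: [a-z] does not accept the uppercase "B".
$sensitive = preg_match('/^[a-z]+$/', $word) === 1;

// Case-insensitive: the trailing "i" lets [a-z] match "B" as well.
$insensitive = preg_match('/^[a-z]+$/i', $word) === 1;

var_dump($sensitive);   // bool(false)
var_dump($insensitive); // bool(true)
```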

Our eregi() function now looks like this:

 eregi("^[a-z0-9]+[a-z0-9_-]*(\.[a-z0-9_-]+)*@[a-z0-9_-]+(\.[a-z0-9_-]+)*\.([a-z]+){2,}$", $email); 

It's very ugly, but it does a very good job of checking for valid email addresses.

Here is a short script that demonstrates the prowess of this function:

Script 4-2 checkemail.php
  1.  <?php
  2.  $email = array("chris_2@company.com", "-fred@broken.org", "joe.smith@works.it", "Busted@bad.a", "strange@44.44", "works.fine.all_day@x.y.z.com", "is-dashed-line@d-a-s-h-e-d.com", "CC@c.com", "this.works.fine@also.ok");
  3.
  4.  for($i = 0; $i < sizeof($email); $i++) {
  5.    if(eregi("^[a-z0-9]+[a-z0-9_-]*(\.[a-z0-9_-]+)*@[a-z0-9_-]+(\.[a-z0-9_-]+)*\.([a-z]+){2,}$", $email[$i])) {
  6.      echo "<p>$email[$i] is valid.";
  7.    } else {
  8.      echo "<p>$email[$i] is <b>not</b> valid.";
  9.    }
 10.  }
 11.  ?>

Script 4-2. checkemail.php Line-by-Line Explanation

| LINE | DESCRIPTION |
| --- | --- |
| 1 | Tell the server to start parsing as PHP code. |
| 2 | Create an array of valid and invalid email addresses. |
| 4 | Create a for loop to loop through each email address in the array. |
| 5 | Test the email address received from the array against the eregi() function to check for valid email addresses. If the email is valid, line 6 executes. Otherwise, the script executes line 8. |
| 6 | Executed if the email address matches the eregi() expression. |
| 7-9 | Executed if the email address does not match the eregi() expression. |
| 10 | End the for loop. |
| 11 | Tell the server to stop parsing the page as PHP code. |
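Because eregi() is not available in PHP 7 and later, here is a sketch of how Script 4-2 might be ported to preg_match(). It keeps the same addresses and the same expression; the $results array, the foreach loop, and the htmlspecialchars() call are additions made for this sketch and are not part of the original script.

```php
<?php
// The addresses from Script 4-2.
$emails = [
    'chris_2@company.com', '-fred@broken.org', 'joe.smith@works.it',
    'Busted@bad.a', 'strange@44.44', 'works.fine.all_day@x.y.z.com',
    'is-dashed-line@d-a-s-h-e-d.com', 'CC@c.com', 'this.works.fine@also.ok',
];

// Same expression as line 5 of the script; the "/" delimiters and the "i"
// modifier take the place of eregi().
$pattern = '/^[a-z0-9]+[a-z0-9_-]*(\.[a-z0-9_-]+)*@[a-z0-9_-]+(\.[a-z0-9_-]+)*\.([a-z]+){2,}$/i';

$results = [];
foreach ($emails as $email) {
    $results[$email] = preg_match($pattern, $email) === 1;
    // Escape the address before echoing it into HTML.
    $safe = htmlspecialchars($email);
    echo $results[$email]
        ? "<p>$safe is valid.</p>\n"
        : "<p>$safe is <b>not</b> valid.</p>\n";
}
```

With this expression, "-fred@broken.org", "Busted@bad.a", and "strange@44.44" are reported as not valid, and the other six addresses are reported as valid.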

There is also an additional syntax that you can use with ereg() and eregi() to search strings. See Table 4-2 below.

Table 4-2. Portable Regular Expression Operators

| SYMBOL | DESCRIPTION |
| --- | --- |
| [[:alpha:]] | equal to [a-zA-Z] |
| [[:alnum:]] | equal to [a-zA-Z0-9] |
| [[:digit:]] | equal to [0-9] |
| [[:lower:]] | equal to [a-z] |
| [[:upper:]] | equal to [A-Z] |
| [[:space:]] | matches any whitespace character, such as a blank space (" "), a tab, or a newline |
| [[:print:]] | matches any printable character, including the space; control characters such as \n (newline) and \t (tab) do not match |
| [[:graph:]] | matches any graphical (visible) character; unlike [[:print:]], a space does not match |
| [[:xdigit:]] | hexadecimal digit, equal to [a-fA-F0-9] |
| [[:punct:]] | matches any punctuation character |
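These classes also work inside bracket expressions in preg_match(), so they can be tried on current PHP versions. The sample strings in this sketch are made up for illustration.

```php
<?php
// Each entry: a pattern using a class from Table 4-2, a sample string,
// and whether the pattern is expected to match it.
$samples = [
    ['/^[[:digit:]]{5}$/', '90210', true],   // exactly five numerals
    ['/^[[:xdigit:]]+$/',  'ff09',  true],   // only hexadecimal digits
    ['/^[[:xdigit:]]+$/',  'xyz',   false],
    ['/[[:space:]]/',      "a\tb",  true],   // a tab counts as whitespace
    ['/^[[:upper:]]/',     'hello', false],  // does not start with A-Z
    ['/[[:punct:]]/',      'wait!', true],   // contains punctuation
];

$results = [];
foreach ($samples as [$pattern, $subject, $expected]) {
    $matched = preg_match($pattern, $subject) === 1;
    // Record whether the outcome agrees with the expectation above.
    $results[] = ($matched === $expected);
    echo $pattern, ' => ', $matched ? 'match' : 'no match', "\n";
}
```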

The same expression that we used for the email syntax checking:

 eregi("^[a-z0-9]+[a-z0-9_-]*(\.[a-z0-9_-]+)*@[a-z0-9_-]+(\.[a-z0-9_-]+)*\.([a-z]+){2,}$", $email); 

is written like this using the portable syntax:

 eregi("^[[:alnum:]]+[[:alnum:]_-]*(\.[[:alnum:]_-]+)*@[[:alnum:]_-]+(\.[[:alnum:]_-]+)*\.([[:alpha:]]+){2,}$", $email); 
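The portable form can be carried over to preg_match() in the same way. In this sketch no "i" modifier is used, on the assumption that [[:alnum:]] and [[:alpha:]] already cover both uppercase and lowercase letters; the variable name is made up.

```php
<?php
// Portable-syntax version of the email expression, wrapped in "/" delimiters.
$portable = '/^[[:alnum:]]+[[:alnum:]_-]*(\.[[:alnum:]_-]+)*'
          . '@[[:alnum:]_-]+(\.[[:alnum:]_-]+)*\.([[:alpha:]]+){2,}$/';

var_dump(preg_match($portable, 'CC@c.com'));      // int(1)
var_dump(preg_match($portable, 'strange@44.44')); // int(0)
```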

   


Advanced PHP for Web Professionals
ISBN: 0130085391
EAN: 2147483647
Year: 2005
Pages: 92
