Hack 23. Validate Email Syntax


Check email syntax on the client side before the server component takes over.

Many web sites ask their users to register their email addresses as usernames. This hack makes sure the syntax of the entered email address is valid, before the server component finds out whether the email address has already been used as a user identifier. "Validate Unique Usernames" [Hack #24] takes care of the second step of this task.

The server component that receives the email address should always implement its own validation step, in order to deal with, for example, the disabling of JavaScript in the user's browser or a direct connection with the server by a hacker.


The Longest Wait

When registering with a web site, users typically type in an email address, make up a password, click Submit, and then often experience a long wait staring at the browser as the page is slowly reconstructed (if they're lucky). To add insult to injury, even though email addresses are supposed to be unique, sometimes the address is rejected: people often try to register at a site more than once with the same email address (guilty as charged!), forgetting that they've already visited. Therefore, the application often has to check both the email syntax and whether the identifier is already being used.

Ajax techniques can validate the email address on the client side and initiate a trip to the server behind the scenes to find out whether it is already in use, without disrupting the current view of the page. "Validate Unique Usernames" [Hack #24] ensures the uniqueness of the username. Both hacks share the same code base, a mix of JavaScript and other Ajax techniques.

Checking Out the Email Syntax

Web sites often use email addresses as usernames because they are guaranteed to be unique, as long as they are valid. In addition, the organizations can use the email addresses to communicate with their users later. You do not have to initiate a server round trip just to validate an email address, however. This task can be initiated in the client, which cancels the submission of the username to the server if the email syntax is invalid.

What criteria can you use for validation? A fairly dry technical document, RFC 2822 is a commonly accepted guideline from 2001 that organizations can use to validate email addresses. Let's look at an example email address to briefly summarize the typical syntax: hackreader@oreilly.com. Here, hackreader is the local part of the address, which typically identifies the user. This is followed by the commercial at sign (@), which precedes the Internet domain. Internet domains are those often well-known addresses of computer locations that handle in-transit emails; google.com and yahoo.com come to mind.

All of this is common knowledge. However, you may not know that RFC 2822 specifies that the local part cannot contain spaces (unless it's quoted, which is rare, as in "bruce perry"@gmail.com). The local part also cannot contain various special characters, such as the following: ( ) < > , ' @ : ; \ [ ]. Maybe if someone tries to create an email address that looks like <(([[))>@yoursite.com you should reject it outright, rather than give points for originality!

The local part can and often does contain period characters, as in bruce.perry@google.com, but the periods have to be preceded and followed by alphanumeric characters (i.e., you cannot use an email address such as bruce.@google.com). The domain can contain more than one period, as in bruce@lists.myorg.net, but it cannot begin or end with a period (as in bruce@.lists.myorg.net). Finally, the guidelines permit but discourage a domain literal, as in bruce@[192.168.0.1]. These are the criteria you can check for in your validation code.
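The criteria above can be sketched as a handful of plain string checks. This is only an illustration, not the hack's own code (which relies on regular expressions and is shown later); the function name checkCriteria and the strings it returns are hypothetical.

```javascript
// Hypothetical helper: returns "" if the address passes the criteria
// summarized above, or a short description of the first problem found.
function checkCriteria(addr) {
    var at = addr.lastIndexOf("@");
    if (at < 1 || at === addr.length - 1) {
        return "missing local part, @ sign, or domain";
    }
    var local = addr.substring(0, at);
    var domain = addr.substring(at + 1);

    // Unquoted local parts cannot contain spaces.
    var quoted = /^".*"$/.test(local);
    if (!quoted && local.indexOf(" ") !== -1) {
        return "unquoted local part contains a space";
    }
    // Periods in the local part need a character on both sides.
    if (!quoted && (local.charAt(0) === "." ||
            local.charAt(local.length - 1) === "." ||
            local.indexOf("..") !== -1)) {
        return "misplaced period in local part";
    }
    // A domain literal such as [192.168.0.1] is permitted but discouraged.
    if (/^\[.*\]$/.test(domain)) {
        return "";
    }
    // The domain cannot begin or end with a period.
    if (domain.charAt(0) === "." || domain.charAt(domain.length - 1) === ".") {
        return "domain begins or ends with a period";
    }
    return "";
}
```

Calling checkCriteria("bruce.@google.com") reports a misplaced period, while checkCriteria("bruce@[192.168.0.1]") returns the empty string because domain literals are allowed.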

Looking at the Code

First, take a look at the page that imports the JavaScript code:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
        "http://www.w3.org/TR/2000/REC-xhtml1-20000126/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
    <script type="text/javascript" src="/books/4/254/1/html/2/js/http_request.js"></script>
    <script type="text/javascript" src="/books/4/254/1/html/2/js/email.js"></script>
    <meta http-equiv="content-type" content="text/html; charset=utf-8" />
    <title>Enter email</title>
</head>
<body>
<form action="javascript:void%200">
    <div id="message"></div>
    Enter email: <input type="text" name="email" size="25" /><br />
    <button type="submit" name="submit" value="Send">Send</button>
</form>
</body>
</html>

Figure 3-1 shows a simple web page with a text field for entering an email address and a Send button.

Figure 3-1. Enter your email address, please


The user types an email address into the text field and then clicks the Send button. This action does not send the email address to a server component yet, though. First, the code has to validate the syntax. The HTML code for the page imports two JavaScript files with the script tag. email.js is responsible for a thorough email-syntax check. http_request.js sends the email address to a server component as a username, but you can find this bit of Ajax in "Validate Unique Usernames" [Hack #24].

Figure 3-2 shows what the browser window looks like if the user types in an invalid email address. The page dynamically prints out a red message summarizing what appears to be wrong with the entered email address.

If, on the other hand, the email address is okay, the application sends it to a server component to determine if the address has already been used as a username. Here is the code from email.js:

var user, domain, regex, _match;

window.onload=function(  ){
    document.forms[0].onsubmit=function(  ) {
        checkAddress(this.email.value);
        return false;
    };
};

/* Define an Email constructor */
function Email(e){
    this.emailAddr=e;
    this.message="";
    this.valid=false;
}

function validate(  ){
    //do a basic check for null, zero-length string, ".", "@",
    //and the absence of spaces
    if (this.emailAddr == null || this.emailAddr.length == 0 ||
        this.emailAddr.indexOf(".") == -1 ||
        this.emailAddr.indexOf("@") == -1 ||
        this.emailAddr.indexOf(" ") != -1){
        this.message="Make sure the email address does " +
            "not contain any spaces "+
            "and is otherwise valid (e.g., contains the \"commercial at\" @ sign).";
        this.valid=false;
        return;
    }
    /* The local part cannot begin or end with a "."
    Regular expression specifies: the group of characters before the @
    symbol must be made up of at least two word characters, followed by zero
    or one period char, followed by at least 2 word characters. */
    regex=/(^\w{2,}\.?\w{2,})@/;
    _match = regex.exec(this.emailAddr);
    if ( _match){
        user=RegExp.$1;
        //alert("user: "+user);
    } else {
        this.message="Make sure the user name is more than two characters, "+
            "does not begin or end with a period (.), or is not otherwise "+
            "invalid!";
        this.valid=false;
        return;
    }
    //get the domain after the @ char
    //first take care of domain literals like @[19.25.0.1], however rare
    //(every period is escaped so that it matches only a literal ".")
    regex=/@(\[\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\])$/;
    _match = regex.exec(this.emailAddr);
    if( _match){
        domain=RegExp.$1;
        this.valid=true;
    } else {
        /* The @ character followed by at least two chars that are not a
        period (.), followed by a period, followed by zero or one instances
        of two or more characters ending with a period, followed by
        two-three chars that are not periods */
        regex=/@(\w{2,}\.(\w{2,}\.)?[a-zA-Z]{2,3})$/;
        _match = regex.exec(this.emailAddr);
        if( _match){
            domain=RegExp.$1;
            //alert("domain: "+domain);
        } else {
            this.message="The domain portion of the email had less than 2 chars "+
                "or was otherwise invalid!";
            this.valid=false;
            return;
        }
    }//end domain check
    this.valid=true;
}

//make validate(  ) an instance method of the Email object
Email.prototype.validate=validate;

function eMsg(msg,sColor){
    var div = document.getElementById("message");
    div.style.color=sColor;
    div.style.fontSize="0.9em";
    //remove old messages
    if(div.hasChildNodes(  )){
        div.removeChild(div.firstChild);
    }
    div.appendChild(document.createTextNode(msg));
}

//a pull-it-all-together function
function checkAddress(val){
    var eml = new Email(val);
    var url;
    eml.validate(  );
    if (! eml.valid) {eMsg(eml.message,"red")};
    if(eml.valid)
    {
        //www.parkerriver.com
        url="http://www.parkerriver.com/s/checker?email="+
            encodeURIComponent(val);
        httpRequest("GET",url,true,handleResponse);
    }
}

//event handler for XMLHttpRequest
//see Hack #24
function handleResponse(  ){
    //snipped...
}

First, the code sets up the handling for the user's click on the Send button. window.onload specifies an event handler that is called when the browser completes the loading of the web page:

window.onload=function(  ){
    document.forms[0].onsubmit=function(  ) {
        checkAddress(this.email.value);
        return false;
    };
};

The reason the code uses window.onload is that for the code to control form-related behavior, the form tag has to be able to be referenced from JavaScript; that is, it must be fully loaded into the browser.

Event handlers are designed to assign functions or blocks of code that specify the application's behavior (i.e., "Take this action when this happens in the browser."). For example, the onsubmit event handler indicates which function should be called when the user submits the form.


The previous code also sets up the form element's onsubmit event handler, a function that calls checkAddress( ). The onsubmit event handler intercepts the form submission because you want to validate what the user entered into the text field before the application does anything else. checkAddress( ) takes as a parameter the address that the user typed (if anything).
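The interception pattern can be tried outside a browser with a stand-in for the form. In this sketch, makeSubmitHandler and fakeForm are hypothetical names used only for illustration; in the real page, this inside the handler is the actual form element.

```javascript
// Build an onsubmit-style handler around any checking function.
function makeSubmitHandler(check) {
    return function () {
        // Inside the handler, "this" is the form, so this.email is
        // the input whose name attribute is "email".
        check(this.email.value);
        // Returning false cancels the browser's default form submission.
        return false;
    };
}

// Stand-in for the form element (assumption: no browser is available).
var fakeForm = { email: { value: "hackreader@oreilly.com" } };
var seen = [];
var handler = makeSubmitHandler(function (val) { seen.push(val); });
var result = handler.call(fakeForm);
```

Because the handler always returns false, the page never navigates away; whatever happens next is entirely up to the checking function.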

Checking Email at the Door

Let's take a closer look at the checkAddress( ) function:

function checkAddress(val){
    var eml = new Email(val);
    var url;
    eml.validate(  );
    if (! eml.valid) {eMsg(eml.message,"red")};
    if(eml.valid)
    {
        url="http://www.parkerriver.com/s/checker?email="+
            encodeURIComponent(val);
        httpRequest("GET",url,true,handleResponse);
    }
}

This function creates a new Email object, validates the user's email address, and, if it's valid, submits it to a server component. You may be wondering, what the heck is an Email object? An Email object is a code template you can use over and over again every time you want to check the syntax of an email address. In fact, if you write a lot of JavaScript that handles email addresses, you'd likely break this code off into its own file (say, emailObject.js) so that it isn't tangled up with hundreds of lines of additional complex code in future applications. Here is the Email object definition:

/* Define an Email constructor */
function Email(e){
    this.emailAddr=e;
    this.message="";
    this.valid=false;
}

An Email object is constructed using a JavaScript function definition that takes the email address as the one function parameter, stored here as e.

This is a special kind of function that is called a constructor in object-oriented parlance, because it is used to construct an object.


An Email object has three properties: an email address (emailAddr), a message, and a boolean or true/false property named valid. When you use the new keyword in JavaScript to create a new Email object, as follows, the emailAddr property is set to the passed-in email address (stored in e):

var email = new Email("brucew@yahoo.com");

The message is initialized to the empty string because new Email objects do not have any special messages associated with them. The validity of the email address, somewhat pessimistically, is initialized as false. The this keyword refers to the instance of Email that the browser creates in memory when the code generates a new Email object. To look at this in a different way, a bicycle company might create a mold for new bicycle helmets. Conceptually, the mold is like our Email constructor. When the company makes new helmets, these helmets are instances of the mold or template that was developed for them.
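To see the mold-and-helmet idea in code, the following sketch builds two instances from the same constructor and changes only one of them. The variable names first and second, and the message text, are made up for the example.

```javascript
function Email(e) {
    this.emailAddr = e;
    this.message = "";
    this.valid = false;
}

var first = new Email("brucew@yahoo.com");
var second = new Email("hackreader@oreilly.com");

// Changing one instance leaves the other untouched, because "this"
// referred to a different object during each call to new Email(...).
first.valid = true;
first.message = "looks fine";
```

After these statements, second.valid is still false and second.message is still the empty string.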

On to Validation

An Email object validates the email address it is passed, which in our application takes place when the user clicks the Send button. The checkAddress( ) function contains code such as eml.validate( ) and if(eml.valid), indicating that our application validates individual email addresses and checks their valid properties. This happens because the code defines a validate( ) function and then signals that the Email object owns or is linked with that function.

Using code such as Email.prototype.validate=validate; is a special way in JavaScript to specify that you've defined this function, validate( ), and that every Email object can call it as its own validate( ) method. Using object-oriented techniques is not mandated, but it makes the code a little more tidy, concise, readable, and potentially reusable.
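A small sketch of how the prototype assignment behaves: each object can call validate( ), while the function itself is stored once on the prototype. The validate( ) body here is a deliberately simplified stand-in, not the hack's real logic.

```javascript
function Email(e) {
    this.emailAddr = e;
    this.valid = false;
}

// Simplified stand-in for the hack's validate(): only looks for an @.
function validate() {
    this.valid = this.emailAddr.indexOf("@") !== -1;
}
Email.prototype.validate = validate;

var a = new Email("bruce@lists.myorg.net");
var b = new Email("no-at-sign");
a.validate();
b.validate();

// Both objects reach the same function through the prototype chain;
// neither one carries a private copy of it.
var shared = (a.validate === b.validate);
var ownCopy = Object.prototype.hasOwnProperty.call(a, "validate");
```

Here shared is true and ownCopy is false, yet a.valid and b.valid differ because this refers to whichever object the method was called on.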


Now let's examine the validation code, which contains a few regular expressions for checking email syntax. The code, included in the prior code sample for email.js, is fairly complex, but the embedded comments are designed to help you along the way in figuring out what the code accomplishes. In order, here are the rules for our validation logic (partly based on RFC 2822 and partly on our own criteria for proper email syntax):

  1. If the email address is the empty string, if the emailAddr property value is null, if the address contains a space, or if it does not contain an @ character or any periods at all, it is rejected. No surprises there.

  2. The code then uses a regular expression to grab the local part of the email, which is the username, or the chunk of characters preceding the @. This regular expression checks for at least two "word characters" (the \w predefined character class; i.e., [a-zA-Z_0-9]), followed by zero or one period characters, followed by at least two word characters.

  3. The code then grabs all characters after the @ and checks whether the character string represents either a domain literal (however rare that is) or a typical domain syntax. The rule for the latter syntax is expressed as "the @ character followed by at least two word characters, followed by a period, followed by zero or one instances of at least two characters ending with a period, followed by two to three characters that fall into the character class [a-zA-Z]."

JavaScript's built-in RegExp object's exec( ) method returns an array if it finds a match, or null otherwise. The RegExp.$1 part contains the first group of parenthesized matched characters after exec( ) is called; in this case, the local part/username before the @ character.
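The return value of exec( ) can be inspected directly. This sketch reuses the local-part pattern from email.js; reading the group from the returned array (index 1) is shown here as an alternative to RegExp.$1 that does not depend on the static property. The variable names are illustrative only.

```javascript
var regex = /(^\w{2,}\.?\w{2,})@/;

var hit = regex.exec("bruce.perry@google.com");
var miss = regex.exec("b@google.com");

// On success, index 0 holds the whole match and index 1 holds the
// first parenthesized group, the same text that RegExp.$1 reports.
var localPart = hit ? hit[1] : null;
```

For the first address, hit[0] is "bruce.perry@" and localPart is "bruce.perry"; for the second, miss is null because the local part is too short for the pattern.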

You can try different email addresses with the validation code and look at the returned values for debugging purposes.
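One way to experiment is a small table-driven harness. The sketch below mirrors the three patterns in email.js (with every period in the domain-literal pattern escaped); the inspect( ) helper and the sample list are hypothetical additions for debugging, not part of the hack.

```javascript
// Patterns mirroring those used by validate() in email.js.
var localRe = /(^\w{2,}\.?\w{2,})@/;
var literalRe = /@(\[\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\])$/;
var domainRe = /@(\w{2,}\.(\w{2,}\.)?[a-zA-Z]{2,3})$/;

// Hypothetical debugging helper: reports what each pattern captured,
// using null for a part that did not match.
function inspect(addr) {
    var local = localRe.exec(addr);
    var dom = literalRe.exec(addr) || domainRe.exec(addr);
    return {
        address: addr,
        local: local ? local[1] : null,
        domain: dom ? dom[1] : null
    };
}

var samples = [
    "hackreader@oreilly.com",
    "bruce@lists.myorg.net",
    "bruce@[192.168.0.1]",
    "bruce.@google.com",
    "bruce@.lists.myorg.net"
];
var report = [];
for (var i = 0; i < samples.length; i++) {
    report.push(inspect(samples[i]));
}
```

In the resulting report, the first three addresses yield both a local part and a domain, the fourth has a null local part (the trailing period), and the fifth has a null domain (the leading period).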

The User Message

If users include illegal characters, type in otherwise invalid addresses, or leave the text field blank, they are greeted with a message like the one shown in Figure 3-2.

Figure 3-2. Communicating with the user


The following code inside validate( ) creates another such message if the email address does not include a domain (the part after the @) that matches the regular expression:

/* The @ character followed by at least two chars that are not a period (.),
followed by a period, followed by zero or one instances of at least two
characters ending with a period, followed by two-three chars that are
not periods */
regex=/@(\w{2,}\.(\w{2,}\.)?[a-zA-Z]{2,3})$/;
_match = regex.exec(this.emailAddr);
if( _match){
    domain=RegExp.$1;
} else {
    this.message="The domain portion of the email had less than 2 chars "+
                 "or was otherwise invalid!";
    this.valid=false;
    return;
}

Notice that the code also sets the Email object's valid property to false. checkAddress( ) then checks the valid property before the email address heads off to the server (we'll look at that part in "Validate Unique Usernames" [Hack #24]):

//inside checkAddress(  )...
eml.validate(  );
if (! eml.valid) {eMsg(eml.message,"red")};
if(eml.valid)
{
    url="http://www.parkerriver.com/s/checker?email="+
        encodeURIComponent(val);
    httpRequest("GET",url,true,handleResponse);
}
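The call to encodeURIComponent( ) matters because an email address contains characters, such as @, that should be escaped inside a query string. A minimal sketch of the URL assembly follows, with buildCheckerUrl as a hypothetical helper name that is not part of email.js:

```javascript
// Hypothetical helper mirroring the URL assembly in checkAddress().
function buildCheckerUrl(base, email) {
    // encodeURIComponent escapes characters such as "@", "&", "=",
    // and spaces so the value stays a single query parameter.
    return base + "?email=" + encodeURIComponent(email);
}

var url = buildCheckerUrl("http://www.parkerriver.com/s/checker",
    "bruce.perry@google.com");
// url is "http://www.parkerriver.com/s/checker?email=bruce.perry%40google.com"
```

Periods pass through unchanged, while the @ sign becomes %40.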

The eMsg( ) function generates the message. eMsg( ) uses a little DOM, a little dynamic CSS programming, and some JavaScript:

function eMsg(msg,sColor){
    var div = document.getElementById("message");
    div.style.color=sColor;
    div.style.fontSize="0.9em";
    //remove old messages
    if(div.hasChildNodes(  )){
        div.removeChild(div.firstChild);
    }
    div.appendChild(document.createTextNode(msg));
}

The parameters to this function are the text message and the color of the text. The application uses red for error messages and blue for notifications about usernames (this is discussed in the next hack). The code dynamically generates the message inside a div that the HTML reserves for that purpose:

var div = document.getElementById("message");
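The same idea can be exercised without a browser by passing the element in as a parameter. This is a sketch under that assumption: showMessage and fakeDiv are hypothetical names, and assigning textContent is used here as a stand-in for the removeChild/appendChild pair in eMsg( ).

```javascript
// Variant of eMsg() that receives the element as a parameter
// (an assumption made so the sketch can run without a browser).
function showMessage(el, msg, sColor) {
    el.style.color = sColor;
    el.style.fontSize = "0.9em";
    // Assigning textContent replaces any earlier message in one step.
    el.textContent = msg;
}

// Stand-in for the div with id "message".
var fakeDiv = { style: {}, textContent: "old message" };
showMessage(fakeDiv, "Make sure the email address is valid.", "red");
```

In a page, the first argument would be the element returned by document.getElementById("message").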

On Deck

As the user attempts to enter an email address with valid syntax, the page itself doesn't change; only the message shows different content. During the syntax validation step, the application responds rapidly because the work is done on the client side, and the server component does not participate (although a server role does come into play when the email address is valid).

Although we have not gone into very much detail about what's happening on the server end, the server component keeps a database of unique usernames for its web application. Once this hack gives the green light on the syntax, the application sends the email address to the server, which checks to see whether that address is already in its database. "Validate Unique Usernames" [Hack #24] dives into this related functionality.




Ajax Hacks
Ajax Hacks: Tips & Tools for Creating Responsive Web Sites
ISBN: 0596101694
Year: 2006
Pages: 138
