Processing User Input

One of the most tedious tasks in implementing web interfaces is the need to accept user input from form submissions and use it to populate domain objects.

The central problem is that all values resulting from HTTP form submissions are strings. When domain object properties aren't strings, we need to convert values, checking parameter types as necessary. This means that we need to validate both the type of parameter submissions and their semantics. For example, an age parameter value of 23x isn't valid as it isn't numeric; but a numeric value of -1 is also invalid as ages cannot be negative. We might also need to check that the age is within a prescribed range.
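The two kinds of check described above can be sketched in a few lines of Java. This is only an illustration: the class name AgeParameter, the error codes, and the 0 to 150 range are assumptions made for the example, not part of any framework.

```java
/** Hypothetical helper: validates an "age" form parameter in two stages. */
final class AgeParameter {

    // Assumed bounds, chosen purely for illustration
    static final int MIN_AGE = 0;
    static final int MAX_AGE = 150;

    /** Returns an error code, or null if the raw value is acceptable. */
    static String validate(String raw) {
        if (raw == null || raw.trim().length() == 0) {
            return "age.required";
        }
        int age;
        try {
            // Syntactic check: is the string numeric at all?
            age = Integer.parseInt(raw.trim());
        } catch (NumberFormatException ex) {
            return "age.typeMismatch";
        }
        // Semantic check: is the number a plausible age?
        if (age < MIN_AGE || age > MAX_AGE) {
            return "age.outOfRange";
        }
        return null;
    }
}
```

With these assumptions, "23x" fails the syntactic check, "-1" passes it but fails the semantic check, and "23" passes both.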

We also need to check for the presence of mandatory form fields, and may need to set nested properties of domain objects.

A web application framework can provide valuable support here. Usually we want to take one of the following actions on invalid data:

Reject the input altogether, treating this as an invalid submission. This approach is appropriate, for example, when a user could only arrive at this page from a link within the application. As the application, not the user, controls the parameters, invalid parameter values indicate an internal error.
Send the user back to the form, allowing them to correct the errors and resubmit. In this case, the user will expect to see the actual data they entered, regardless of whether that input was valid or invalid, or even of the correct type.

The second action is the more commonly required of the two, and demands the most support from a web application framework.

Note

The discussion in this section concentrates on the submission of a single input form, rather than "wizard" style submissions spread over several pages. Many of the principles applicable to single form submission apply to wizard style submission, although it is naturally harder to implement multi-page forms than single-page forms (it may be modeled as several individual form submissions onto different JavaBeans, or multiple submissions updating different properties of the same bean stored in the session).

Data Binding and Displaying Input Errors for Resubmission

Often, rather than perform low-level processing of individual request parameters, we want to transparently update properties on a JavaBean. The operation of updating object properties on HTTP form submission is often called data binding.

In this approach, a JavaBean - typically a command - will expose properties with the same names as the expected request parameters. Web application framework code will use a bean manipulation package to populate these properties based on the request parameters. This is the same approach as that used by the <jsp:useBean> action, although, as we've noted, this doesn't provide a sophisticated enough implementation. Ideally, such a command bean will not depend on the Servlet API, so once its properties are set it can be passed as a parameter to business interface methods.

Many frameworks, such as Struts and WebWork, use data binding as their main approach to request parameter processing. However, data binding isn't always the best approach. When there are few parameters, it may be inappropriate to create an object. We may want to invoke business methods that don't take arguments, or take one or two primitive arguments. In such cases, application controllers can themselves process request parameters using the getParameter(String name) method of the HttpServletRequest interface. Creating a command object here would waste a little time and memory, but - more seriously - make a very simple operation seem more complex. With a pure data-binding approach, if we have a separate command per bean, as in Struts, we can easily end up with dozens of action form classes that add very little.

Also, the dynamic data for a request isn't always contained in request parameters. For example, we might want to include dynamic information in a virtual URL: a news article URL might include an ID in the servlet path, without requiring a query string, as in the following example:

    /articles/article3047.html

This approach may facilitate content caching: for example, by a servlet filter or in the browser.
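A controller that receives such a path has to extract the ID itself. The following sketch assumes the path has already been obtained as a string (for example, from HttpServletRequest.getServletPath()); the class name ArticlePath and the convention of returning -1 for a non-matching path are illustrative assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that extracts the numeric ID from an article path. */
final class ArticlePath {

    // Matches paths of the form /articles/article<digits>.html
    private static final Pattern ARTICLE =
            Pattern.compile("^/articles/article(\\d+)\\.html$");

    /** Returns the article ID, or -1 if the path does not match. */
    static int articleId(String path) {
        if (path == null) {
            return -1;
        }
        Matcher m = ARTICLE.matcher(path);
        if (!m.matches()) {
            return -1;
        }
        try {
            return Integer.parseInt(m.group(1));
        } catch (NumberFormatException ex) {
            // All digits, but too large to fit in an int
            return -1;
        }
    }
}
```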

Note that processing individual request parameters places the onus on application controller code to check that the parameters are of the required type: for example, before using a method such as Integer.parseInt(String s) to perform type conversion.

Where data binding does shine is where there are many request parameters, or where resubmission is required on invalid input. Once user data is held in a JavaBean, it becomes easy to use that object to populate a resubmission form.

Approaches to Data Binding in MVC Frameworks

There seem to be two basic approaches to data binding in Java web application frameworks, both based on JavaBeans:

Keep the data (valid or invalid) in a single JavaBean, allowing a resubmission form's field values to be populated easily if necessary
This is the Struts ActionForm approach. It has the disadvantage that the data in the ActionForm bean isn't typed. Since all properties of the ActionForm are strings, the ActionForm can't usually be a domain object, as few domain objects hold only string data. This means that when we've checked that the data the ActionForm contains is valid (and can be converted to the target types) we'll need to perform another step to populate a domain object from it.

In this model, validation will occur on the form bean, not the domain object; we can't attempt to populate a domain object from the form bean until we're sure that all the form bean's string property values can be converted to the necessary type. Note that we will need to store error information, such as error codes, somewhere, as the form bean can store only the rejected values. Struts holds error information for each rejected field in a separate ActionErrors object in the request.
Keep errors in a separate object
This approach attempts to populate the domain object, without using an intermediate holding object such as an "action form". The domain object we're trying to bind to will have its fields updated with the inputs of the correct type, while any inputs of an incorrect type (which couldn't be set on the domain object) will be accessible from a separate errors object added to the request. Semantic (rather than syntactic) validation can be performed by domain objects after population is complete.

WebWork uses a variant of this approach. If a WebWork action implements the webwork.action.IllegalArgumentAware interface, as does the webwork.action.ActionSupport convenience superclass, it will receive notification of any type mismatches when its properties are being set, making it possible to store the illegal value for display if necessary. In this model, type mismatches are transparently handled by the framework, and application code doesn't need to consider them. Application validation can work with domain objects, not dumb storage objects.

Note

Since WebWork combines the roles of command and controller (or "action" in WebWork terms), a WebWork controller is not a true domain object because it depends on the WebWork API. However, this is a consequence of WebWork's overall design, not its data binding approach.

The second approach is harder to implement in a framework, but can simplify application code, as it places the onus on the framework to perform type conversion. Such type conversion can be quite complex, as it can use the standard JavaBeans PropertyEditor machinery, enabling complex objects to be created from request parameters. The second approach is used in the framework for the sample application, although this framework also supports an action form-like approach like that of Struts, in which string properties are used to minimize the likelihood of errors.
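To make the second approach concrete, here is a deliberately minimal sketch of a binder that sets typed properties directly on a target object and records any value it could not convert in a separate map. It is not the sample framework's implementation: the names SimpleBinder and Person are hypothetical, only String and int properties are handled, and plain reflection stands in for the PropertyEditor machinery mentioned above.

```java
import java.lang.reflect.Method;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical binder: typed values go onto the target, rejected ones into a map. */
final class SimpleBinder {

    /** Rejected raw values keyed by property name (the "separate errors object"). */
    final Map<String, String> rejected = new LinkedHashMap<String, String>();

    void bind(Object target, Map<String, String> params) {
        for (Map.Entry<String, String> e : params.entrySet()) {
            Method setter = findSetter(target.getClass(), e.getKey());
            if (setter == null) {
                continue; // no matching property: ignore the parameter
            }
            try {
                Object value = convert(e.getValue(), setter.getParameterTypes()[0]);
                setter.invoke(target, value);
            } catch (NumberFormatException ex) {
                // Type mismatch: keep the raw input so a form can redisplay it
                rejected.put(e.getKey(), e.getValue());
            } catch (ReflectiveOperationException ex) {
                throw new IllegalStateException(ex);
            }
        }
    }

    private static Method findSetter(Class<?> type, String property) {
        if (property == null || property.length() == 0) {
            return null;
        }
        String name = "set" + Character.toUpperCase(property.charAt(0))
                + property.substring(1);
        for (Method m : type.getMethods()) {
            if (m.getName().equals(name) && m.getParameterTypes().length == 1) {
                return m;
            }
        }
        return null;
    }

    private static Object convert(String raw, Class<?> type) {
        if (type == String.class) {
            return raw;
        }
        if (type == int.class || type == Integer.class) {
            // Throws NumberFormatException for null, blank, or non-numeric input
            return Integer.valueOf(raw == null ? "" : raw.trim());
        }
        throw new IllegalArgumentException("Unsupported property type: " + type);
    }
}

/** Hypothetical domain object used to exercise the binder. */
final class Person {
    private String name;
    private int age;

    public void setName(String name) { this.name = name; }
    public void setAge(int age) { this.age = age; }
    public String getName() { return name; }
    public int getAge() { return age; }
}
```

Binding the parameters name=Ann and age=23x onto a new Person sets the name, leaves the age untouched, and records "23x" under "age" in the rejected map, ready for redisplay on a resubmission form.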

JSP Custom Tags

How do we know what data to display on a form? Usually we'll want to use the same template (JSP or other) for both fresh input forms and resubmissions, as the prospect of different forms containing hundreds of lines of markup getting out of synch is unappetizing. Thus the form must be able to be populated with no object data behind it; with data from an existing object (for example, a user profile retrieved from a database); or with data that may include errors following a rejected form submission. In all cases, additional model data may be required that is shared between users: for example, reference data to populate dynamic lists such as dropdowns.

In such cases, where data may come from different sources - or simply be blank if there is no bean from which to obtain data - JSP custom tags can be used to move the problem of data acquisition from template into helper code behind the scenes in tag handlers. Many frameworks use a similar approach here, regardless of how they store form data.

If we want to perform data binding, the form usually needs to obtain data using special tags. The tags in the framework that accompanies the sample application are conceptually similar to those provided by Struts and other frameworks. Whether or not the form has been submitted, we use custom tags to obtain values.

For each field, we can use custom tags to obtain any error message resulting from a failed data-binding attempt along with the rejected property value. The tags cooperate with the application context's internationalization support to display the error message for the correct locale automatically. For example, the following fragment of a JSP page uses cooperating custom tags to display an error message if the e-mail property of the user bean on the page failed validation, along with a pre-populated input field containing the rejected value (if the submitted value was invalid) or current value (if there was no error in the e-mail value submitted). Note that only the outer <i21:bind> tag is specific to this framework. It exposes data to tags nested within it that enable it to work with JSP Standard Tag Library tags such as the conditional <c:if> and output <c:out> tags. This means that the most complex operations, such as conditionals, are performed with standard tags. (We discuss the JSTL in detail in the next chapter.)

    <i21:bind value="user.email">
        <c:if test="${bind.error}">
            <font color="red"><b>
                <c:out value="${bind.errorMessage}"/>
            </b></font><br>
        </c:if>
        <input type="text" length="2" size="30" name="email"
               value="<c:out value="${bind.value}" />" />
    </i21:bind>

The <i21:bind> tag uses a value attribute to identify the bean and property that we are interested in. The bean prefix is necessary because these tags and the supporting infrastructure can support multiple bind operations on the same form: a unique capability, as far as I know. The <i21:bind> tag defines a bind scripting variable, which can be used anywhere within its scope, and which exposes information about the success or failure of the bind operation for this property and the display value.

The tag arrives at the string value to display by evaluating the following in order: any rejected input value for this field; the value of this attribute of a bean with the required name on the page (the same behavior as the <jsp:getProperty> standard action); and the empty string if there was no bean and no errors object (the case on a new form with no backing data). The bind scripting variable created by the <i21:bind> tag also exposes methods indicating which of these sources the displayed value came from.
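The order of evaluation described above amounts to a simple fallback chain. The sketch below illustrates that order only; it is not the tag handler's actual code, and the class and parameter names are assumptions.

```java
/** Hypothetical illustration of how a bind tag might choose the value to display. */
final class DisplayValue {

    /**
     * @param rejectedInput the raw value rejected during binding, or null if none
     * @param beanValue     the current property value of the bean, or null if
     *                      there is no bean (or no value) on the page
     */
    static String resolve(String rejectedInput, Object beanValue) {
        if (rejectedInput != null) {
            return rejectedInput;             // 1. the rejected input wins
        }
        if (beanValue != null) {
            return String.valueOf(beanValue); // 2. then the bean's current value
        }
        return "";                            // 3. otherwise a blank field
    }
}
```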

Another convenient tag evaluates its contents only if there were data binding errors on the form:

    <i21:hasBindErrors>
        <font color="red" size="4">
            There were <%=count%> errors on this form
        </font>
    </i21:hasBindErrors>

We'll look at custom tags in more detail in the next chapter, but this demonstrates a very good use for them. They completely conceal the complexity of data binding from JSP content.

Note that it is possible to handle form submission using Struts or the framework designed in this chapter without using JSP custom tags; custom tags just offer a simple, convenient approach.

Data Validation

If there's a type mismatch, such as a non-numeric value for a numeric property, application code shouldn't need to perform any validation. However, we may need to apply sophisticated validation rules before deciding whether to make the user resubmit input. To ensure that our display approach works, errors raised by application code validation (such as age under 18 unacceptable) must use the same reporting system as errors raised by the framework (such as type mismatch).

Where Should Data Validation be Performed?

The problem of data validation refuses to fit neatly into any architectural tier of an application. We have a confusing array of choices:

Validation in JavaScript running in the browser. In this approach, validity checks precede form submission, with JavaScript alerts prompting the user to modify invalid values.
Validation in the web tier. In this approach a web-tier controller or helper class will validate the data after form submission, and return the user to the form if the data is invalid, without invoking any business objects.
Validation in business objects, which may be EJBs.

Making a choice can be difficult. The root of the problem is the question "Is validation business logic?" The answer varies in different situations.

Validation problems generally fall into the categories of syntactic and semantic validation. Syntactic validation encompasses simple operations such as checks that data is present, of an acceptable length, or in the valid format (such as a number). This is not usually business logic. Semantic validation is trickier, and involves some business logic, and even data access.

Consider the example of a simple registration form. A registration request might contain a user's preferred username, password, e-mail address, country, and post or zip code. Syntactic validation could be used to ensure that all required fields are present (assuming that they're all required, which might be governed by a business rule, in which case semantics is involved). However, the processing of each field is more complicated. We can't validate the post or zip code field without understanding the country selection, as UK postcodes and US zip codes, for example, have different formats. This is semantic validation, and approaching business logic.
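The dependency between the country and postcode fields can be sketched on the server side as follows. The class name, the country codes, and both regular expressions are assumptions made for illustration; in particular, the UK pattern checks only the general shape of a postcode and will accept values that do not correspond to any real address.

```java
import java.util.regex.Pattern;

/** Hypothetical format check whose rule depends on the selected country. */
final class PostalCodeFormat {

    // US: five digits, optionally followed by a hyphen and four more digits
    private static final Pattern US_ZIP = Pattern.compile("\\d{5}(-\\d{4})?");

    // UK: a loose shape check only, e.g. "SW1A 1AA"
    private static final Pattern UK_POSTCODE =
            Pattern.compile("[A-Z]{1,2}\\d[A-Z\\d]? ?\\d[A-Z]{2}");

    /** Returns true if the code has an acceptable format for the given country. */
    static boolean isValid(String country, String code) {
        if (country == null || code == null) {
            return false;
        }
        String normalized = code.trim().toUpperCase();
        if ("US".equals(country)) {
            return US_ZIP.matcher(normalized).matches();
        }
        if ("UK".equals(country)) {
            return UK_POSTCODE.matcher(normalized).matches();
        }
        return false; // unknown country: no rule available in this sketch
    }
}
```

The same input ("90210") is therefore valid or invalid depending on another field of the form, which is why this check cannot be treated as purely syntactic.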

Worse, the rules for validating UK postcodes are too complex and require too much data to validate on the client-side. Even if we settle for accepting input that looks like a UK postcode but is semantic nonsense (such as Z10 8XX), JavaScript will still prove impracticable if we intend to support multiple countries. We can validate e-mail address formats in JavaScript, and perform password length and character checks. However, some fields will pose a more serious problem. Let's assume that usernames must be unique. It is impossible to validate a requested username without access to a business object that can connect to the database.

All the above approaches have their advantages and disadvantages. Using JavaScript reduces the load on the server and improves perceived response time. If multiple corrections must be made before a form is submitted, the load reduction may be significant and the user may perceive the system to be highly responsive.

On the other hand, complex JavaScript can rapidly become a maintainability nightmare, cross-browser problems are likely, and page weight may be significantly increased by client-side scripts. In my experience, maintaining complex JavaScript is likely to prove much more expensive than maintaining comparable functionality in Java. JavaScript being a completely different language from Java, this approach also has the serious disadvantage that, while Java developers write the business logic, JavaScript developers must write the validation rules. JavaScript validation is also useless for non-web clients, such as remote clients of EJBs with remote interfaces or web services clients.

Important

Do not rely on client-side JavaScript validation alone. The user, not the server, controls the browser. It's possible for the user to disable client-side scripting, meaning that it's always necessary to perform the same checks on the server anyway.

Validation in the web tier, on the other hand, has the severe disadvantage of tying validation logic - which may be business logic - to the Servlet API and perhaps also a web application framework. Unfortunately, Struts tends to push validation in the direction of the web tier, as validation must occur on Struts ActionForm objects, which depend on the Servlet API, and hence cannot be passed into an EJB container and should not be passed to any business object. For example, validation is often accomplished by overriding the org.apache.struts.action.ActionForm validate method like this:

    public final class MyStrutsForm extends org.apache.struts.action.ActionForm {
      ...

      public ActionErrors validate(ActionMapping mapping,
          HttpServletRequest request) {
        ActionErrors errors = new ActionErrors();
        // Reject a missing or empty e-mail value
        if (email == null || "".equals(email)) {
          errors.add("email", new ActionError("email.required"));
        }
        return errors;
      }
    }

I consider this - and the fact that ActionForm objects must extend a Struts superclass dependent on the Servlet API - to be a major design flaw.

Struts 1.1 also provides declarative validation, controlled through XML configuration files. This is powerful and simple to use (it's based on regular expressions), but whether validation rules are in Java code or XML they're still often in the wrong place in the web tier.

Important

Validation should depend on business objects, rather than the web tier. This maximizes the potential to reuse validation code.

However, there is one situation in which validation code won't necessarily be collocated with business objects: in architectures in which business objects are EJBs. Every call into the EJB tier is potentially a remote call: we don't want to waste network roundtrips exchanging invalid data.

Thus the best place to perform validation is in the same JVM as the web container. However, validation need not be part of the web interface logical tier. We should set the following goals for validation:

Validation code shouldn't be contained in web-tier controllers or any objects unique to the web tier. This allows the reuse of validation objects for other client types.
To permit internationalization, it's important to separate error messages from Java code. Resource bundles provide a good, standard, way of doing this.
Where appropriate we should allow for parameterization of validation without recompiling Java code. For example, if the minimum and maximum password lengths on a system are 6 and 64 characters respectively, this should not be hard-coded into Java classes, even in the form of constants. Such business rules can change, and it should be possible to effect such a change without recompiling Java code.
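As a small illustration of the last goal, the following sketch reads password length limits from externalized configuration rather than from constants. It uses java.util.Properties purely as a stand-in (it is not the configuration mechanism of the framework discussed in this chapter), and the class name and property keys are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

/** Hypothetical rule object whose limits come from configuration, not code. */
final class PasswordRules {

    private final int minLength;
    private final int maxLength;

    PasswordRules(Properties config) {
        // Fall back to 6 and 64 only if the configuration omits the keys
        this.minLength = Integer.parseInt(config.getProperty("password.minLength", "6"));
        this.maxLength = Integer.parseInt(config.getProperty("password.maxLength", "64"));
    }

    /** Builds rules from properties-format text, e.g. the contents of a file. */
    static PasswordRules fromText(String text) {
        Properties config = new Properties();
        try {
            config.load(new StringReader(text));
        } catch (IOException ex) {
            throw new IllegalStateException("Cannot read configuration", ex);
        }
        return new PasswordRules(config);
    }

    boolean lengthOk(String password) {
        return password != null
                && password.length() >= minLength
                && password.length() <= maxLength;
    }
}
```

Changing the minimum from 6 to 8 then means editing one line of configuration, with no Java code recompiled.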

We can meet these goals if validators are JavaBeans that don't depend on web APIs, but may access whatever business objects they need. This means that validators must act on domain objects, rather than Servlet API-specific concepts such as HttpServletRequests or framework-specific objects such as Struts ActionForms.

Data Validation in the Framework Described in this Chapter

Let's look at how this approach works in the framework described in this chapter, and so in the sample application. All validators depend only on two non web-specific framework interfaces, com.interface21.validation.Validator and com.interface21.validation.Errors. The Validator interface requires implementing classes to confirm which classes they can validate, and to implement a validate method that takes a domain object to be validated and reports any errors to an Errors object. Validator implementations must cast the object to be validated to the correct type, as it is impossible to invoke validators with typed parameters in a consistent way. The complete interface is:

    public interface Validator {
      boolean supports(Class clazz);
      void validate(Object obj, Errors errors);
    }

Errors are registered with an Errors object by invoking the following method:

    void rejectValue(String field, String code, String message);

Errors objects expose error information that is used to back display by the custom tags shown above. The same errors object passed to an application validator is also used by the framework to note any type conversion failures, ensuring that all errors can be displayed in the same way.

Let's consider a partial listing of the Validator implementation used in the sample application to validate user profile information held in RegisteredUser objects. A RegisteredUser object exposes e-mail and other properties including postcode on an associated object of type Address.

The DefaultUserValidator class is an application-specific JavaBean, exposing minEmail and maxEmail properties determining the minimum and maximum length acceptable for e-mail addresses:

    package com.wrox.expertj2ee.ticket.customer;

    import com.interface21.validation.Errors;
    import com.interface21.validation.FieldError;
    import com.interface21.validation.Validator;

    public class DefaultUserValidator implements Validator {

      public static final int DEFAULT_MIN_EMAIL = 6;
      public static final int DEFAULT_MAX_EMAIL = 64;

      private int minEmail = DEFAULT_MIN_EMAIL;
      private int maxEmail = DEFAULT_MAX_EMAIL;

      public void setMinEmail(int minEmail) {
        this.minEmail = minEmail;
      }

      public void setMaxEmail(int maxEmail) {
        this.maxEmail = maxEmail;
      }

The implementation of the supports() method from the Validator interface indicates that this validator can handle only RegisteredUser objects:

      public boolean supports(Class clazz) {
        return clazz.equals(RegisteredUser.class);
      }

The implementation of the validate() method from the Validator interface performs a number of checks on the object parameter, which can safely be cast to RegisteredUser:

      public void validate(Object o, Errors errs) {
        RegisteredUser u = (RegisteredUser) o;
        validateEmail(u.getEmail(), errs);
        //More check method invocations omitted
      }

      private void validateEmail(String email, Errors errs) {
        // A missing or empty value is reported separately from a bad length
        if (email == null || "".equals(email)) {
          errs.rejectValue("email", "emailRequired",
                           "E-mail Address is required");
          return;
        }
        if (email.length() < this.minEmail || email.length() > this.maxEmail) {
          errs.rejectValue("email", "emailLengthInvalid",
                           "E-mail Address is invalid");
          return;
        }
        // Other checks omitted, including checks on min
      }
    }

The validator's bean properties are set using the same file format we discussed in the last chapter, meeting our goal of externalizing business rules:

    <bean name="userValidator"
          >
      <property name="minEmail">6</property>
      <property name="maxEmail">64</property>
      ...
    </bean>

Thus we can validate RegisteredUser objects, regardless of what interface we use (web or otherwise). Error information in interface-independent errors objects can be displayed in any interface we choose.
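Because a validator of this kind has no web dependencies, it can be exercised from plain Java code such as a unit test. The sketch below is self-contained and therefore declares its own local stand-ins for the two interfaces, mirroring the method signatures quoted earlier (with Class<?> in place of the raw Class type); CollectingErrors, Account, and AccountValidator are hypothetical names invented for the example, not classes from the framework or the sample application.

```java
import java.util.ArrayList;
import java.util.List;

/** Local stand-in mirroring the rejectValue() signature quoted in the text. */
interface Errors {
    void rejectValue(String field, String code, String message);
}

/** Local stand-in mirroring the Validator interface quoted in the text. */
interface Validator {
    boolean supports(Class<?> clazz);
    void validate(Object obj, Errors errors);
}

/** Hypothetical Errors implementation that simply collects "field:code" strings. */
final class CollectingErrors implements Errors {
    final List<String> codes = new ArrayList<String>();

    public void rejectValue(String field, String code, String message) {
        codes.add(field + ":" + code);
    }
}

/** Hypothetical domain object with a single property. */
final class Account {
    private final String email;

    Account(String email) { this.email = email; }

    String getEmail() { return email; }
}

/** Hypothetical validator: no Servlet API, no framework form classes. */
final class AccountValidator implements Validator {

    public boolean supports(Class<?> clazz) {
        return Account.class.equals(clazz);
    }

    public void validate(Object obj, Errors errors) {
        Account account = (Account) obj;
        if (account.getEmail() == null || "".equals(account.getEmail())) {
            errors.rejectValue("email", "emailRequired", "E-mail Address is required");
        }
    }
}
```

A test, a command-line tool, or a web controller can each create an errors object, pass it to validate(), and then inspect the collected codes in whatever way suits that interface.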

Let's now look at how we can use our web framework to invoke this validation code.

The com.interface21.web.servlet.mvc.FormController superclass is a framework web controller designed both to display a form based around a single JavaBean and to handle form submission, automatically returning the user to the original form if resubmission is necessary.

Subclasses need only specify the class of the form bean (passed to the superclass constructor). The name of the "form view" and "success view" should be set as bean properties in the bean definition (shown below).

The following simple example, from our MVC demo, shows a subclass of FormController that can display a form based on a RegisteredUser object, allowing the user to input postcode (from the associated Address object), birth year, and e-mail address properties. Postcode and e-mail address will be text inputs; birth year should be chosen from a dropdown of birth years accepted by the system:

    package form;

    import java.util.HashMap;
    import java.util.Map;

    import javax.servlet.http.HttpServletRequest;

    import com.interface21.web.servlet.ModelAndView;
    import com.interface21.web.servlet.mvc.FormController;
    import com.wrox.expertj2ee.ticket.customer.RegisteredUser;

    public class CustomerInput extends FormController {

By default, FormController will use Class.newInstance() to create a new instance of the form bean, whose properties will be used to populate the form (the form bean must have a no argument constructor). However, by overriding the following method, we can create an object ourselves. In the present example, I've simply pre-populated one property with some text: we would normally override this method only if it were likely that a suitable object existed in the session, or we knew how to retrieve one, for example, by a database lookup:

      protected Object formBackingObject(HttpServletRequest request) {
        RegisteredUser user = new RegisteredUser();
        user.setEmail("Enter your email");
        return user;
      }

This method will be invoked if the controller is handling a request for the form, rather than a form submission.

If there are validation errors - type mismatches and/or errors raised by the validator object - subclasses will not need to do anything. The FormController class will automatically return the user to the submission form, making the bean and error information available to the view.

We must override one of several overloaded onSubmit() methods to take whatever action is necessary if the object passes validation. Each of these methods is passed the populated domain object. In our simple example, I've just made one of the object's properties available to the view; a real controller would pass the populated command to a business interface and choose a view depending on the result. Note that this method doesn't take request or response objects. These are unnecessary unless we need to manipulate the user's session (the request and response objects are available through overriding other onSubmit() methods): the request parameters have already been extracted and bound to command properties, while by returning a model map we leave the view to do whatever it needs to do to the response:

      protected ModelAndView onSubmit(Object command) {
        RegisteredUser user = (RegisteredUser) command;
        return new ModelAndView(getSuccessView(),
                                "email", user.getEmail());
      }
    }

We may wish to override the isFormSubmission() method to tell FormController whether it's dealing with a request to display the form or a form submission. The default implementation (shown below) assumes that an HTTP request method of POST indicates a form submission. This may not always be true; we might want to distinguish between two URLs mapped to this controller, or check for the presence of a special request parameter:

      protected boolean isFormSubmission(HttpServletRequest request) {
        return "POST".equals(request.getMethod());
      }

If the form requires shared reference data in addition to the bound object, this data can be returned as a map (like a model map in our framework) from the referenceData() method. In this case, we return a static array of the birth years displayed on the form, although reference data will usually come from a database:

      private static final int[] BIRTH_YEARS = {
        1970, 1971, 1972, 1973, 1974 };

      protected Map referenceData(HttpServletRequest request) {
        Map m = new HashMap();
        m.put("BIRTHYEARS", BIRTH_YEARS);
        return m;
      }

This is all the application-specific Java code required. This controller is configured by the following bean definition in the servlet's XML configuration file. Note the line that sets the validator property inherited from the FormController generic superclass to a reference to the validator bean, and how the inherited beanName, formView, and successView properties are set:

    <bean name="customerController"
          >
      <property name="validator" beanRef="true">customerValidator</property>
      <property name="beanName">user</property>
      <property name="formView">customerForm</property>
      <property name="successView">displayCustomerView</property>
    </bean>

Let's now look at the listing for the complete form, using the custom tags shown above. Note that the bean name of "user" matches the bean name set as a property on the controller. As postcode is a property of the billingAddress property of the RegisteredUser, note the nested property syntax:

    <form method="POST">
    <i21:hasBindErrors>
        There were <%=count%> errors
        <p>
    </i21:hasBindErrors>
    Postal code:
    <br>

    <i21:bind value="user.billingAddress.postcode">
        <c:if test="${bind.error}">
            <font color="red"><b>
                <c:out value="${bind.errorMessage}"/>
            </b></font>
            <br>
        </c:if>
        <input type="text" length="16" name="billingAddress.postcode"
               value="<c:out value="${bind.value}" />">
    </i21:bind>

The birthYear property should be set to a choice from a dropdown of birth years populated by the reference data returned by the referenceData() method. We use a nested JSTL conditional tag to select the value from the list that the bind value matches. If the bind value is blank (as it will be when the form is first displayed) the first, "Please select," value will be selected:

    Birth year:
    <br>
    <i21:bind value="user.birthYear">
        <c:if test="${bind.error}">
            <font color="red"><b>
                Birth year is required
            </b></font>
            <br>
        </c:if>
        <select size="1" name="birthYear">
            <option value="">Please select</option>
            <c:forEach var="birthYear" items="${BIRTHYEARS}">
                <option value="<c:out value="${birthYear}"/>"
                    <c:if test="${bind.value == birthYear}">SELECTED</c:if>
                >
                    <c:out value="${birthYear}"/>
                </option>
            </c:forEach>
        </select>
    </i21:bind>

The e-mail address field requires similar code to the postcode text field:

    Email:
    <br>
    <i21:bind value="user.email">
        <c:if test="${bind.error}">
            <font color="red"><b>
                <c:out value="${bind.errorMessage}"/>
            </b></font>
            <br>
        </c:if>
        <input type="text" length="2" size="30" name="email"
               value="<c:out value="${bind.value}"/>" />
    </i21:bind>

When the user requests cust.html, which is mapped onto the controller, the form will appear with the e-mail address field pre-populated from the formBackingObject() method. The postcode field will be blank, as this field was left null in the new bean, while the first (prompt) value will be selected in the birth-year dropdown:


On invalid submission, the user will see error messages in red, with the actual user input (valid or invalid) redisplayed:


If we want more control over the validation process, our framework enables us to use the data binding functionality behind the FormController ourselves. (Most frameworks don't allow "manual" binding.) It's even possible to bind the same request onto several objects. The following code from the sample application shows such use of the com.interface21.web.bind.HttpServletRequestDataBinder object:

    RegisteredUser user = (RegisteredUser) session.getAttribute("user");
    ...
    HttpServletRequestDataBinder binder = null;
    try {
      binder = new HttpServletRequestDataBinder(user, "user");
      binder.bind(request);
      validator.validate(user, binder);
      PurchaseRequest purchaseRequest = new PurchaseRequest(reservation, user);
      binder.newTarget(purchaseRequest, "purchase");
      binder.bind(request);
      // May throw exception
      binder.close();
      // INVOKE BUSINESS METHODS WITH VALID DOMAIN OBJECTS
    }
    catch (BindException ex) {
      // Bind failure
      return new ModelAndView("paymentForm", binder.getModel());
    }

The HttpServletRequestDataBinder close() method throws a com.interface21.validation.BindException if there were any bind errors, whether type mismatches or errors raised by validators. From this exception it's possible to get a model map by invoking the getModel() method. This map will contain both all objects bound (user and purchase) and an errors object. Note that this API isn't web specific. While the HttpServletRequestDataBinder knows about HTTP requests, other subclasses of com.interface21.validation.DataBinder can obtain property values from any source. All this functionality is, of course, built on the com.interface21.beans bean manipulation package - the core of our infrastructure.

The validation approach shown here successfully populates domain objects, rather than web-specific objects such as Struts ActionForms, and makes validation completely independent of the web interface, allowing validation to be reused in different interfaces and tested outside the J2EE server.