Garbage in, garbage out (GIGO) is an axiom of the information technology industry. Organizations spend billions of dollars per year cleansing databases of erroneous and inconsistent information. In many cases, the errors remain undetected until database users discover them usually inadvertentlyor data analysis software (called online analytical processing , or OLAP) returns obviously unreasonable results. Most mistakes are typographical, but many errors result from missing, incomplete, or duplicate entries.
Validation during data entry has proven to be the best and easiest method of minimizing data errors and inconsistencies. Limiting users data entry choices with selections from drop-down lists is another way to minimize bad data collection. If youve designed Microsoft Access tables, youve probably applied at least some data validation tests for data entry errors, such as missing field values, incorrect or improbable date entries, and invalid or unreasonable numeric values. InfoPaths approach to declarative (design mode) validation methodology differs from Accesss, but InfoPath can accomplish similar results. However, you must add programming code to handle some data entry tests, such as numeric date differences, that are easy to express as Access validation rules but not with InfoPaths declarative approach.
The section Applying Conditional Formatting, in Chapter 7, introduced you to logical expressions that warn users of missing or mismatched text data. That chapter s On Your Own section included exercises for flagging some erroneous date values. You use similar expressions as data validation rules, but data validation doesn t just highlight text box controls with bad or suspect data. Data validation errors display a red asterisk or apply a dashed red border to controls containing entries that violate the rules. All standard controls ”except buttons ”have data validation features enabled.
An error message appears when you try to save or submit a data document that contains validation errors, as shown in Figure 8-1. You can t submit a data document with data entry errors to an Access or a SQL Server database, an XML Web service, a Microsoft SharePoint forms library, or a Web site. You can save the offending form as an XML data document ”presumably for subsequent correction ”by clicking Yes in the message box.
Figure 8-1: This error message appears when you attempt to save a form with a validation error or a missing required field value.
SP-1 You apply validation to data in a selected control by choosing Format, Data Validation or by opening the ControlType Properties dialog box and clicking Data Validation to open the Data Validation ( ControlName ) dialog box. Clicking the Data Validation ( ControlName ) dialog box s Add button for a new rule or clicking the Modify button for an existing rule opens a second Data Validation ( ControlName ) dialog box, shown in Figure 8-2. This dialog box is quite similar to the Conditional Format dialog box. Instead of text formatting controls, the Data Validation dialog box has a drop-down list that lets you specify whether to display a ScreenTip and delay error messages until requested , or display the error message when the offending control loses the focus. You add the ScreenTip and error message, which usually are identical, in the two text boxes. If you delay error messages, users can find multiple errors on complex forms by choosing Tools, Go To Next Error (Ctrl+Shift+E). They can display the error message by choosing Tools, Show Error Message (Ctrl+Shift+S).
Figure 8-2: The Data Validation dialog box is similar to the Conditional Formatting dialog box.
An important difference between data validation and conditional formatting is where InfoPath stores the rules in the template s .xsn file. The manifest.xsf file contains data validation expressions and message definitions, which apply to the data document and, thus, to all views. View transform files (view1.xsl for the default view) store conditional formatting expressions, which apply only to the view to which you add the expressions. This design preserves the separation of content-related actions (data validation) and presentation- related changes (conditional formatting).
Data vs. Schema Validation
The XML Schema recommendation incorporates data validation capabilities, which are enforced by constraining facets , such as the minLength= 1 example in the section Hiding Controls Conditionally, in Chapter 7 or the maxLength= ## value you specify by the Limit Text Box To ## Characters spin box on the Text Box Properties dialog box s Display tab. Almost all XML Schema datatypes support enumeration facets , which restrict values to members of a list of valid values, and pattern facets , which require values to adhere to a format specified by a regular expression . (InfoPath SP-1 data validation can test conformance to a regular expression with the Matches Pattern or Does Not Match Pattern condition. The section Validating Text Box Patterns near the end of the chapter describes how to validate a text box with a regular expression.)
If you base your form on an existing schema that includes constraining, enumeration, or pattern facets, InfoPath s schema validation feature identifies erroneous field values with a dashed red border around the offending control. You don t need to duplicate schema-based constraints with data validation expressions.
A few software analysts have complained that InfoPath doesn t add facets to the form s schema for declarative data validation, which would conform to the open XML Schema standard; they denigrate InfoPath s template-based validation rules as proprietary, meaning not standards-based. It s exceedingly difficult to design XML schemas that include constraints on a field value based on the value of another field, and (perhaps) impossible if the constraints depend on values of more than one other field. InfoPath uses XPath 1.0 ”a W3C recommendation ”to define data validation rules. Thus, the not standards-based argument is tenuous, at best.
Order entry forms and invoice forms are good examples of InfoPath forms that require validation of several text box values. Most order entry and invoicing operations use online transaction processing (OLTP) with a direct network connection to the database. If the data entry program doesn t include client-side data validation tests and relies on database constraints to ensure server-side data accuracy and consistency, users receive an error message from the server. Sending data that produces a server-side error and returns a message is called a round-trip . One of the goals of client-side data validation is to minimize round-trips to the database server.
When order entry or invoicing operations take place through an intermediary over a wide area network, such as an XML Web service accessed via the Internet, round-trips caused by bad input data consume substantially more computing resources and usually are much slower than round-trips in a traditional OLTP environment.
The rules you specified in Chapter 7 for the Northwind Traders form s shipping address text boxes also apply to data validation, except the rules that alert users to differences between billing and shipping addresses ”these rules are advisory, not mandatory. There are a few additional rules you might want to apply by data validation, such as requiring Postal Code values for all Country values except Ireland. (In Ireland, only Dublin City has postal codes.) Another important data validation rule is based on the fact that most addresses in the Northwind sample database don t have Region entries, but U.S., Canadian, and Brazilian addresses require Region values for states and provinces . The first validation test is simple, so it serves as an easy introduction to the data validation process.
To add the Postal Code data validation rule to a copy of the final version of the Northwind Traders form from Chapter 7 and test the result, follow these steps.
Add the required PostalCode rule
Test the PostalCode validation rule
Note |
Working around data validation s case-sensitivity problem |
The conditional formatting expressions you added in Chapter 7 s procedures and the Postal Code validation rule you added in the previous section use the XPath and operator. This section introduces you to using the XPath or operator in conjunction with an and operator. If you re familiar with logical expressions in VBA, you would expect to be able to use parentheses to specify logical operator precedence, as in the statement Region = And (Country = USA Or Country = Canada Or Country = Brazil ) , which would deliver the rule you need. InfoPath translates your selections and entries in the Data Validation and Conditional Format dialog boxes to XPath expressions. The problem is that XPath 1.0 doesn t recognize parentheses for operator precedence. XPath s operator precedence is fixed, and or is at the bottom of the precedence pecking order. Thus, the data validation expression shown in Figure 8-3 works for USA, but it doesn t work for Canada or Brazil. If a Region value is present, typing Canada or Brazil in the Country text box adds a red border to the Region textbox, because the second or operator disregards the and operator
Figure 8-3: The XPath expression represented by the conditions shown here doesn t work correctly because of operator precedence issues.
See Also |
To learn more than you might want to know about XPath s operator precedence rules, see the section 3.4, Booleans in the XML Path Language (XPath) Version 1.0 recommendation at www.w3.org/TR/xpath . |
Reversing the order by deleting the first And condition and adding it after the last Or condition doesn t solve the precedence problem. In this case, the expression works for Brazil only. This means that you must duplicate the Region Is Not Blank expression for each Country value. InfoPath limits you to a maximum of five conditions in an XPath expression, so you need multiple expression groups to handle the Region validation rule. One alternative is to add three Region Is Blank And Country = CountryName expression groups. You can save a bit of work by adding one expression group for USA and Canada, and a second expression group for Brazil.
Follow these steps to add the first validation expression set for USA and Canada to the Region field of the NWOrdersCh07 Test form s Customer Information section.
Add the Region validation rule for USA and Canada
As mentioned in Chapter 7, InfoPath applies the equivalent of a VBA Or operator to multiple expression groups. To add the second expression group for Brazil, follow these steps.
Add the Region validation rule for Brazil
InfoPath s declarative logical operators include several choices that you haven t encountered in previous conditional formatting or data validation exercises: Contains, Does Not Contain, Begins With, and Does Not Begin With. Contains tests a field value for an occurrence anywhere in the value of one or more consecutive characters you specify in the Type Text box. Begins With tests a field value for the occurrence of characters at the start of the value.
The next exercise uses the Contains and Begins With operators to test if values in the Shipping Information field contain the defaults when you clear the Same As Bill To Address check box. Adding Contains and Begins With validation rules demonstrates the effect of adding a validation rule to a field with conditional formatting.
To apply Contains and Begins With validation rules, follow these procedures.
Apply a Contains condition to the Company (ShipName) field
Apply a Begins With expression to other Shipping Information fields
Whether conditional formatting of Shipping Information values is applicable at this point is a judgment call. If most shipping address fields are likely to be the same as billing address fields, the conditional formatting tests are useful. Otherwise, conditional formatting might distract data entry operators.
XPath automatically detects whether a node value is a string or number. Comparison values, such as [Not] Less Than, [Not] Less Than Or Equal To, [Not] Equal To, [Not] Greater Than or Equal To, and [Not] Greater Than, apply to numeric values, as you d expect, and also to string values. When testing string values, these operators test the numeric Unicode value of the first character. If the string has more than one character, the test continues until a mismatch occurs with a successive character. If no character mismatch occurs, the values are equal; otherwise , the result is greater or less than the comparison value. If the comparison value you type in the text box is BB , a BC field value returns true for the Is Greater Than comparison, and BA returns true for Is Less Than.
The Line Items repeating list in the Northwind Traders Order Entry Form contains several numeric values that require validation to ensure data consistency. Here are the validation rules for LineItem field values, expressed in positive-logic terms, that you apply in this section s exercise:
Follow these steps to add and test the preceding data validation rules for the Line Items list.
Validate the OrderID attribute and the QuanOrdered and QuanShipped fields
If the QuanOrdered rule doesn t work correctly, you probably forgot to change And to Or in step 6.
The following exercise shows you how to combine two sets of And and Or operators in a single data validation rule for the BackOrdered field, and validate a field of the Decimal ( double ) data type.
Validate the BackOrdered and Discount fields
List boxes return text or numeric values; data validation expressions for list boxes are the same as those for text boxes of equivalent data types. Alternatively, if you have a text box bound to the same field as the list box, you can apply the data validation rule to the text box.
The following example assumes that Speedy Express (list box value = 1) makes deliveries to U.S. destinations only. The Same As Bill To Address option complicates the validation expression. If you select the Same As Bill To address check box, the expression must test the Customer Information sections Country field instead of the ShipCountry field. The data validation expression in VBA syntax is shown here:
(ShipVia = 1 And IsShipToSameAsBillTo = True And Country <> USA) Or (ShipVia = 1 And IsShipToSameAsBillTo = False And ShipCountry <> USA)
The Data Validation dialog box has a maximum of five conditions and the preceding requires six conditions. Thus, you must add a second expression group for the expression to the right of the Or operator in the VBA example above.
To add the Speedy Express constraint to the Ship Via drop-down list and test the result, follow these steps.
Validate the Ship Via drop-down list
Test your validation rule
The section Conditional Formatting Based on a Non-Blank Value, in Chapter 7, showed you how to test for the presence of values in date picker controls, and that chapter s On Your Own section suggested an additional exercise for conditionally formatting the Order ID text box if the Required date value is present and less than or equal to the Ordered date. Conditional formatting and data validation logical expressions are identical.
SP-1 You can compare date picker values with a fixed date value you type in a text box, which is seldom useful, or with other Date ( date ) and DateAndTime ( dateTime ) InfoPath field data types. (The Time ( time ) field data type is a special case and isn t applicable to date picker control values.) InfoPath treats dates as numeric values in declarative expressions, but uses a special string-comparison function ( msxsl :string-compare ) to determine date equality or inequality. The Today function is a useful comparison value for values of the date datatype. You can compare a date with your computer s system date by choosing Use A Formula in the third condition list to open the Insert Formula dialog box, clicking Insert Function, and selecting Today from the Functions list.
SP-1 InfoPath s pattern-matching feature lets you validate or conditionally format special text formatting. The Data Entry Pattern dialog box offers prebuilt regular expressions for U.S. and Canadian telephone numbers, social security numbers , 5-digit ZIP Codes, and ZIP+4 Codes, as shown on the left in Figure 8-4. You can also create your own regular expression by selecting a custom pattern and adding special characters from an Insert Special Character drop-down list. The dialog box on the right in Figure 8-4 shows a pattern for matching the five-letter CustomerID field.
Figure 8-4: The Data Entry Pattern dialog box provides prebuilt regular expressions (left) and lets you create custom regular expressions by adding elements from a drop-down list (right).
See Also |
For more information about regular expressions, visit msdn.microsoft.com/library/en-us/cpgenref/html/ cpconRegularExpressionsLanguageElements.asp . This page has links to a wide range of regular expression topics. |
To validate the format of U.S. and Canadian telephone numbers, follow these steps.
Use a regular expression to test phone number formatting
Data validation by data entry applications such as InfoPath is critical to maintaining organization-wide data accuracy and consistency, especially if the data isnt submitted immediately to a database that has built-in data validation rules. InfoPaths data validation features closely resemble those for applying conditional formatting. Both features use standards-based XPath expressions. View .xsl files store expressions for conditional formatting; the manifest.xsf file holds data validation expressions, which apply to all views of a form. The primary difference in the data validation UI is substitution of a ScreenTip and message box for conditional formatting property values and the inability to add multiple expression groups.
If a validation rule requires more than five expressions or contains both And and Or operators, you must add expression groups to complete the rule. XPath includes common comparison operators for numeric and text values; InfoPath treats date values as numeric for comparison purposes. You use less than, less than or equal to, greater than or equal to, and greater than operators primarily with numeric and date values. Pattern matching with regular expressions lets you validate the format of text box values; you also can use pattern matching with conditional formatting.
1. |
Can I validate rich text boxes and other basic control types that weren t covered in this chapter? |
|
2. |
Is there a limit to the number of expression groups that I can include in a data validation rule? |
|
3. |
Can I compare a Date ( date ) field value with a DateAndTime ( dateTime ) value? |
Answers
1. |
Yes. Data validation applies to all controls except the button picture, ink, file attachment, and expression box types. It s a good practice to validate check box and option button group values where possible, but validating rich text box and list (bulleted, numbered, or plain) content isn t a common practice. |
2. |
There s no published limitation on the number of expression groups for data validation or conditional formatting. The practical limit is your ability to avoid conflicting conditions in multiple expression groups. |
3. |
Yes. You can verify that the comparison behaves as expected in the first On Your Own exercise. |
Here are some additional exercises to test your expertise with data validation rules:
Part I - Introducing Microsoft Office InfoPath 2003 SP-1
Part II - Designing InfoPath Forms
Part III - Working with Databases and Web Services
Part IV - Programming InfoPath Forms