Parsing Functions

Sometimes you will find it useful to take strings apart or convert them into numbers, and these parsing functions can help you accomplish these tasks.

String Subset accesses a particular section of a string. It returns the substring beginning at offset and containing length number of characters. Remember, the first character's offset is zero (see Figure 9.18). Figure 9.19 shows an example of how String Subset can be used to return a subset of an input string.

Figure 9.18. String Subset

Figure 9.19. String Subset used to return a subset of an input string
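
For readers who think more easily in text-based code, the behavior of String Subset can be approximated with a few lines of Python. This is only a sketch for comparison, not part of LabVIEW: the function name string_subset and its handling of negative offsets are assumptions made for illustration.

```python
from typing import Optional


def string_subset(string: str, offset: int = 0, length: Optional[int] = None) -> str:
    """Return `length` characters of `string`, starting at zero-based `offset`.

    If `length` is omitted, the rest of the string is returned.
    """
    offset = max(offset, 0)  # assumption: treat negative offsets as zero
    if length is None:
        return string[offset:]
    return string[offset:offset + length]


# Offset 6 is the seventh character, because counting starts at zero.
print(string_subset("VOLTS DC+1.28E+2", 6, 2))  # DC
```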

Scan From String, the "opposite" of Format Into String, converts a string containing valid numeric characters (0 to 9, +, -, e, E, and period) to numeric data (see Figure 9.20). This function starts scanning the input string at initial search location and converts the data according to the specifications in format string (to learn more about the specifications, see the LabVIEW manuals or "String Function Overview" in the Online Reference). Scan From String can be resized to convert multiple values simultaneously.

Figure 9.20. Scan From String

The characters a through f and A through F are also valid if a hex format is specified, and a comma may be valid if it is the localized decimal point.

In this example, Scan From String converts the string "VOLTS DC+1.28E+2" to the number 128.00 (see Figure 9.21). It starts scanning at offset 8 of the string, which is the + in this case (remember that the first character's offset is zero).

Figure 9.21. Scan From String used to extract a floating point numeric from an input string
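
The same conversion can be sketched in Python for comparison. The helper scan_float below is hypothetical, not a LabVIEW function, and it handles only a single floating-point value; it assumes the input string contains no spaces inside the number.

```python
import re
from typing import Tuple

# Optional sign, digits with an optional decimal point, optional exponent.
_FLOAT = re.compile(r"[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?")


def scan_float(string: str, initial_search_location: int = 0) -> Tuple[float, int]:
    """Convert the number that starts at `initial_search_location`.

    Returns the value and the offset just past the scanned characters.
    """
    match = _FLOAT.match(string, initial_search_location)
    if match is None:
        raise ValueError("no number at offset %d" % initial_search_location)
    return float(match.group()), match.end()


value, offset_past_scan = scan_float("VOLTS DC+1.28E+2", 8)
print(value)  # 128.0
```

Scanning from offset 0 instead would raise an error in this sketch, because the characters "VOLTS" are not part of a number.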

Scan From String is also capable of converting string data to more than just numeric data. You can extract strings, paths, enums, Booleans, and time stamps from strings. For example, using the format string %s will extract a "FALSE" or "TRUE" from a string, as a Boolean. Similarly, you can use the format string %d to extract a "0" or "1" from a string, as a Boolean. However, you must be careful with scanning non-numeric data from strings, as the Scan From String function will often stop scanning when it encounters a space (or other whitespace) character. Therefore, it is not quite as flexible as Format Into String is for the inverse operation.

Both Format Into String and Scan From String have a dialog box (Edit Format String and Edit Scan String, respectively) that you can use to create the format string. In this dialog box, you can specify format, precision, data type, and width of the converted value. Double-click on the function, or pop up on it and select Edit Format String or Edit Scan String, to access the dialog box (see Figure 9.22).

Figure 9.22. Edit Scan String dialog

After you create the format string and click the Create String button, the dialog box creates the string constant and wires it to the format string input for you.

Match Pattern and Regular Expressions

Match Pattern is used to look for a given pattern of characters in a string (see Figure 9.23). It searches for and returns a matched substring. Match Pattern looks for the regular expression (or pattern) in string, beginning at offset; if it finds a match, it splits the string into three substrings. If no match is found, the match substring is empty and offset past match is set to -1.

Figure 9.23. Match Pattern

A regular expression is a string that uses a special syntax (called regular expression syntax) to describe a set of strings that match a pattern (see Figure 9.24). The syntax, along with some useful examples, is described shortly.

Figure 9.24. Match Pattern used to find a pattern in an input string
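
The three-way split that Match Pattern performs can be approximated in Python as follows. This is a sketch under assumptions: the name match_pattern is hypothetical, Python's re syntax differs from Match Pattern's in several details, and returning the whole input as the "before" substring when nothing matches is an assumption rather than something stated above.

```python
import re
from typing import Tuple


def match_pattern(string: str, pattern: str, offset: int = 0) -> Tuple[str, str, str, int]:
    """Return (before substring, match substring, after substring, offset past match)."""
    match = re.compile(pattern).search(string, offset)
    if match is None:
        # No match: empty match substring and an offset past match of -1.
        return string, "", "", -1
    return (string[:match.start()], match.group(),
            string[match.end():], match.end())


print(match_pattern("VOLTS DC+1.28E+2", r"[0-9]+"))
# ('VOLTS DC+', '1', '.28E+2', 10)
print(match_pattern("VOLTS DC+1.28E+2", r"AMPS"))
# ('VOLTS DC+1.28E+2', '', '', -1)
```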

The Match Pattern function allows you to use some special characters to give you more powerful and flexible searches. Table 9.4 shows you the special characters you can use in the Match Pattern function.

Table 9.4. Special Characters Used by the Match Pattern Function
Special Character	Interpreted by the Match Pattern Function
.	Matches any character. For example, l.g matches lag, leg, log, and lug.
?	Matches zero or one instance of the expression preceding ?. For example, be?t matches bt and bet but not best.
\	Cancels the interpretation of any special character in this list. For example, \? matches a question mark and \. matches a period. You also can use the following constructions for the space and non-displayable characters:
	\b	backspace
	\f	form feed
	\n	new line
	\s	space
	\r	carriage return
	\t	tab
	\xx	any character, where xx is the hex code using 0 through 9 and uppercase A through F
^	If ^ is the first character of regular expression, it anchors the match to the offset in string. The match fails unless regular expression matches that portion of string that begins with the character at offset. If ^ is not the first character, it is treated as a regular character.
[]	Encloses alternates. For example, [abc] matches a, b, or c. The following characters have special significance when used within the brackets in the following manner:
	-	Indicates a range when used between digits, or lowercase or uppercase letters; for example, [0-5], [a-g], or [L-Q]. The following characters have significance only when they are the first character within the brackets.
	~	Matches any character, including non-displayable characters, except for the characters or range of characters in brackets. For example, [~0-9] matches any character other than 0 through 9.
	^	Matches any displayable character, including the space and tab characters, except the characters or range of characters enclosed in the brackets. For example, [^0-9] matches all displayable characters, including the space and tab characters, except 0 through 9.
+	Matches the longest number of instances of the expression preceding +; there must be at least one instance to constitute a match. For example, be+t matches bet and beet but not bt.
*	Matches the longest number of instances of the expression preceding * in regular expression, including zero instances. For example, be*t matches bt, bet, and beet.
$	If $ is the last character of regular expression, it anchors the match to the last element of string. The match fails unless regular expression matches up to and including the last character in the string. If $ is not last, it is treated as a regular character.
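
Several of the special characters in Table 9.4 (the period, ?, +, and *) behave the same way in most regular expression dialects, so the examples from the table can be tried out in Python. This is only an illustration: the helper name matches is hypothetical, and it tests whole words with re.fullmatch, whereas Match Pattern searches for the pattern anywhere in the string.

```python
import re
from typing import List


def matches(pattern: str, candidates: List[str]) -> List[str]:
    """Return the candidates that the pattern matches in their entirety."""
    return [c for c in candidates if re.fullmatch(pattern, c)]


print(matches(r"l.g", ["lag", "leg", "log", "lug", "lg"]))  # ['lag', 'leg', 'log', 'lug']
print(matches(r"be?t", ["bt", "bet", "best"]))              # ['bt', 'bet']
print(matches(r"be+t", ["bt", "bet", "beet"]))              # ['bet', 'beet']
print(matches(r"be*t", ["bt", "bet", "beet"]))              # ['bt', 'bet', 'beet']
```

Other entries in the table do not carry over directly. For example, Python has no ~ negation inside brackets, and its \s matches any whitespace character rather than only a space.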

If you have used regular expressions before in other programming languages or with UNIX command-line utilities like grep, you know how powerful and useful regular expressions can be.

If you are not familiar with regular expressions, they are a very powerful set of rules for matching and parsing text. Table 9.5 shows you some examples of pattern matching using regular expressions. For more information on regular expression syntax, consult the LabVIEW documentation or a good reference.

Table 9.5. Regular Expressions and the Characters That They Can Be Used to Find
Characters to Find	Regular Expression
VOLTS	VOLTS
All uppercase and lowercase versions of volts; that is, VOLTS, Volts, volts, and so on	[Vv][Oo][Ll][Tt][Ss]
A space, a plus sign, or a minus sign	[ +-]
A sequence of one or more digits	[0-9]+
Zero or more spaces	\s* or * (that is, a space followed by an asterisk)
One or more spaces, tabs, new lines, or carriage returns	[\t \r \n \s]+
One or more characters other than digits	[~0-9]+
The word "Level" only if it begins at the offset position in the string	^Level
The word "Volts" only if it appears at the end of the string	Volts$
The longest string within parentheses	(.*)
The longest string within parentheses but not containing any parentheses within it	([~()]*)
The character [	[[]
cat, dog, cot, dot, cog, and so on	[cd][ao][tg]
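
If you want to experiment with the patterns in Table 9.5 outside LabVIEW, most of them translate to Python's re module with small changes. The sketch below is an approximation rather than an exact equivalent: the sample text is made up, the ~ negation becomes ^ inside brackets, literal parentheses must be escaped, and anchoring at an offset is done with the pos argument of match instead of a leading ^.

```python
import re

text = "Level (high) reads +12 Volts"  # made-up sample string

# [Vv][Oo][Ll][Tt][Ss] and [0-9]+ work unchanged.
any_case_volts = re.search(r"[Vv][Oo][Ll][Tt][Ss]", text).group()
digits = re.search(r"[0-9]+", text).group()

# [~0-9]+ in Match Pattern corresponds to [^0-9]+ in Python.
non_digits = re.search(r"[^0-9]+", text).group()

# (.*) in Match Pattern corresponds to \(.*\) in Python, where bare
# parentheses form a group instead of matching literally.
in_parens = re.search(r"\(.*\)", text).group()

# Volts$ anchors the match to the end of the string in both dialects.
ends_with_volts = re.search(r"Volts$", text) is not None

# ^Level anchored at an offset: match() only succeeds at position pos.
level = re.compile(r"Level")
level_at_0 = level.match(text, 0) is not None
level_at_6 = level.match(text, 6) is not None

print(any_case_volts, digits, in_parens)  # Volts 12 (high)
print(repr(non_digits))                   # 'Level (high) reads +'
print(ends_with_volts, level_at_0, level_at_6)  # True True False
```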

Match Pattern is a relatively fast and powerful way to search for patterns in a string. It does not, however, incorporate every aspect of the regular expression syntax. If you need more specialized options to match strings with regular expressions, you can use the Match Regular Expression function.

The Match Regular Expression function (see Figure 9.25) incorporates a larger set of options and special characters for string matching, but it is slower than Match Pattern. It uses the Perl Compatible Regular Expressions (PCRE) library, an open source library written by Philip Hazel at the University of Cambridge.

Figure 9.25. Match Regular Expression

Because of its additional complexity, Match Regular Expression should only be used if Match Pattern won't work for what you are trying to do.

Match Regular Expression is expandable. Resize it vertically (using the Positioning tool to drag the resize handles up or down) to show submatches.
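
Submatches correspond to what most regular expression libraries call capture groups: each parenthesized part of the pattern is returned separately, alongside the whole match. The Python sketch below illustrates the idea. Python's re module is not the PCRE library, although the two share this syntax, and the pattern itself is a made-up example.

```python
import re

# Two parenthesized groups: the unit name and the signed number.
pattern = r"([A-Z]+) DC([+-][0-9.]+E[+-][0-9]+)"
match = re.search(pattern, "VOLTS DC+1.28E+2")

whole_match = match.group(0)   # the entire matched text
submatches = match.groups()    # one entry per parenthesized group

print(whole_match)  # VOLTS DC+1.28E+2
print(submatches)   # ('VOLTS', '+1.28E+2')
```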

If you want to learn more about regular expressions (and there is a vast amount of information available), use your favorite search engine to find resources on the Web. There are even regular expression libraries (online databases of regular expressions) where you can find some very useful regular expressions. For example, here is a regular expression (which we found on the Web) that matches an email address:

([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)

Figure 9.26 shows this regular expression in action. As you can see, regular expressions are an extremely powerful tool!
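
You can also try an email pattern of this kind in a text-based language. The Python sketch below assumes that the repetition operators in the domain part of the pattern are + signs and that the pattern contains no spaces; the function name is_email and the sample addresses are made up for illustration. Like any pattern of this size, it accepts only a subset of the addresses that are valid in practice.

```python
import re

EMAIL = re.compile(
    r"([a-zA-Z0-9_\-\.]+)@"                      # local part
    r"((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)"  # bracketed IP address prefix
    r"|(([a-zA-Z0-9\-]+\.)+))"                   # or one or more domain labels
    r"([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)"           # top-level domain or last IP octet
)


def is_email(candidate: str) -> bool:
    """Return True if the whole candidate string matches the pattern."""
    return EMAIL.fullmatch(candidate) is not None


print(is_email("jane.doe@example.com"))  # True
print(is_email("user@[192.168.0.1]"))    # True
print(is_email("not an email"))          # False
```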