Returns the position and length of a substring that matches a pattern
Category: Character String Matching
Restriction: Use with the PRXPARSE function
CALL PRXSUBSTR (regular-expression-id, source, position <, length>);
regular-expression-id
specifies a numeric identification number that is returned by the PRXPARSE function.
source
specifies the character expression that you want to search.
position
is a numeric variable that receives the position in source where the matched substring begins. If no match is found, CALL PRXSUBSTR sets position to zero.
length
is a numeric variable that receives the length of the substring that is matched by the pattern. If no match is found, CALL PRXSUBSTR sets length to zero.
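The no-match behavior of the two returned arguments can be sketched as follows. This is a minimal illustration, not part of the original examples: the pattern /planet/ and the source string are assumptions chosen so that nothing matches.

```sas
data _null_;
   /* Compile a pattern that does not occur in the source string. */
   patternID = prxparse('/planet/');

   /* Nothing matches, so both returned arguments are expected to be 0. */
   call prxsubstr(patternID, 'Hello world!', position, length);

   /* Test position before using the values, for example in SUBSTR. */
   if position = 0 then put 'No match: ' position= length=;
run;
```

Testing position for zero before calling SUBSTR avoids passing an invalid starting position to that function.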
The CALL PRXSUBSTR routine searches source for the pattern that was compiled by PRXPARSE, returns the position of the start of the matched substring, and optionally returns the length of the substring that is matched. By default, when a pattern matches more than one character that begins at a specific position, CALL PRXSUBSTR selects the longest match.
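Because length is optional, the routine can be called with three arguments when only the starting position is needed. The following is a minimal sketch under that assumption; the pattern and source string are illustrative.

```sas
data _null_;
   /* Compile the pattern once and keep its identifier. */
   patternID = prxparse('/world/');

   /* The optional length argument is omitted; only position is returned. */
   call prxsubstr(patternID, 'Hello world!', position);

   put position=;
run;
```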
For more information about pattern matching, see 'Pattern Matching Using SAS Regular Expressions (RX) and Perl Regular Expressions (PRX)' on page 260.
CALL PRXSUBSTR performs the same matching as PRXMATCH, but CALL PRXSUBSTR additionally enables you to use the length argument to receive more information about the match.
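The relationship between the two can be sketched by running both against the same compiled pattern. The variable names matchPos and text are illustrative and not part of either interface.

```sas
data _null_;
   patternID = prxparse('/world/');
   text = 'Hello world!';

   /* PRXMATCH returns only the starting position of the match. */
   matchPos = prxmatch(patternID, text);

   /* CALL PRXSUBSTR reports the same position and also the match length. */
   call prxsubstr(patternID, text, position, length);

   put matchPos= position= length=;
run;
```

In this sketch, matchPos and position should hold the same value, and length carries the additional information that PRXMATCH does not return.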
The Perl regular expression (PRX) functions and CALL routines work together to manipulate strings that match patterns. To see a list and short description of these functions and CALL routines, see the Character String Matching category in 'Functions and CALL Routines by Category' on page 270.
The following example searches a string for a substring, and returns its position and length in the string.
data _null_;
      /* Use PRXPARSE to compile the Perl regular expression. */
   patternID = prxparse('/world/');
      /* Use PRXSUBSTR to find the position and length of the string. */
   call prxsubstr(patternID, 'Hello world!', position, length);
   put position= length=;
run;
The following line is written to the SAS log:
position=7 length=5
The following example searches for addresses that contain avenue, drive, or road, and extracts the text that was found.
data _null_;
   if _N_ = 1 then
   do;
      retain ExpressionID;
         /* The i option specifies a case insensitive search. */
      pattern = "/ave|avenue|dr|drive|rd|road/i";
      ExpressionID = prxparse(pattern);
   end;
   input street $80.;
   call prxsubstr(ExpressionID, street, position, length);
   if position ^= 0 then
   do;
      match = substr(street, position, length);
      put match:$QUOTE. "found in " street:$QUOTE.;
   end;
   datalines;
153 First Street
6789 64th Ave
4 Moritz Road
7493 Wilkes Place
;
run;
The following lines are written to the SAS log:
"Ave" found in "6789 64th Ave"
"Road" found in "4 Moritz Road"
Functions and CALL routines:
'CALL PRXCHANGE Routine' on page 354
'CALL PRXDEBUG Routine' on page 356
'CALL PRXFREE Routine' on page 358
'CALL PRXNEXT Routine' on page 359
'CALL PRXPOSN Routine' on page 361
'PRXCHANGE Function' on page 739
'PRXPAREN Function' on page 747
'PRXMATCH Function' on page 743
'PRXPARSE Function' on page 748
'PRXPOSN Function' on page 750