SOUNDEX Function


Encodes a string to facilitate searching

Category: Character

Restriction: The SOUNDEX algorithm is English-biased.

Syntax

SOUNDEX(argument)

Arguments

argument

  • specifies any SAS character expression.

Details

The SOUNDEX function encodes a character string according to an algorithm that was originally developed by Margaret K. Odell and Robert C. Russell (US Patents 1261167 (1918) and 1435663 (1922)). The algorithm is described in Knuth, The Art of Computer Programming, Volume 3 (see 'References' on page 926). Because the algorithm is English-biased, it is less useful for languages other than English.

The SOUNDEX function returns a copy of the argument that is encoded by using the following steps:

  1. Retain the first letter in the argument and discard every later occurrence of these letters:

    • A E H I O U W Y

  2. Assign the following numbers to these classes of letters:

    • 1: B F P V

    • 2: C G J K Q S X Z

    • 3: D T

    • 4: L

    • 5: M N

    • 6: R

  3. If two or more adjacent letters have the same classification from Step 2, then discard all but the first. (Adjacent refers to the position in the word prior to discarding letters.)
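The steps above can be traced by hand in a DATA step. The following is only a sketch under stated assumptions, not the function's actual implementation: it assumes that the input contains letters only, with no leading blanks, and that the retained first letter takes part in the adjacency check of the last step. All variable names are hypothetical.

```sas
data _null_;
  length word up mapped code $40 c prev $1;
  word = 'amnesty';
  up = upcase(word);

  /* Map each letter to its class number. The discarded letters    */
  /* (A E H I O U W Y) map to '0', which serves as a placeholder.  */
  mapped = translate(up, '01230120022455012623010202',
                         'ABCDEFGHIJKLMNOPQRSTUVWXYZ');

  /* Retain the first letter as is. */
  code = substr(up, 1, 1);

  /* Assumption: the first letter's class counts for adjacency. */
  prev = substr(mapped, 1, 1);

  do i = 2 to length(up);
    c = substr(mapped, i, 1);
    /* Keep a class number only when it is not a placeholder and   */
    /* differs from the class of the letter immediately before it. */
    if c ne '0' and c ne prev then code = trim(code) || c;
    prev = c;
  end;

  put code=;   /* writes code=A523 to the log for 'amnesty' */
run;
```

Because prev is updated for every position, including the discarded letters, two letters of the same class that are separated by a vowel are both kept. This follows the note that adjacency refers to the position in the word before any letters are discarded.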

The algorithm that is described in Knuth adds trailing zeros and truncates the result to the length of 4. You can perform these operations with other SAS functions.
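As one possible way to do this, the sketch below pads the result with zeros and keeps the first four characters by using the TRIM and SUBSTR functions and the concatenation operator. The variable name code4 is hypothetical, and the sketch covers only the padding and truncation that are mentioned above.

```sas
data _null_;
  length code4 $4;
  /* Remove trailing blanks, append three zeros, keep four characters. */
  code4 = substr(trim(soundex('Paul')) || '000', 1, 4);
  put code4=;   /* given the documented result P4, writes code4=P400 */
run;
```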

Examples

SAS Statements            Results

x=soundex('Paul');
put x;                    P4

word='amnesty';
x=soundex(word);
put x;                    A523
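To show how the encoding can facilitate searching, the sketch below keeps the rows whose SOUNDEX code equals the code of a search term. The data set names, the variable name, and the sample values are hypothetical and serve only as an illustration.

```sas
/* Hypothetical sample data. */
data names;
  length name $20;
  input name $;
  datalines;
Smith
Smythe
Jones
Brown
;
run;

data matches;
  set names;
  /* Keep only the names that encode the same way as the search term. */
  if soundex(name) = soundex('Smith');
run;

proc print data=matches;
run;
```

Under the steps listed in Details, Smith and Smythe both encode to S53, so both rows are kept, whereas Jones and Brown encode differently and are dropped.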




SAS 9.1 Language Reference Dictionary, Volumes 1, 2 and 3
Year: 2004
Pages: 704
