Recipe 6.13 Approximate Matching

6.13.1 Problem

You want to match fuzzily, that is, allowing for a margin of error, where the string doesn't quite match the pattern. Whenever you want to be forgiving of misspellings in user input, you want fuzzy matching.

6.13.2 Solution

Use the String::Approx module, available from CPAN:

    use String::Approx qw(amatch);

    if (amatch("PATTERN", @list)) {
        # matched
    }

    @matches = amatch("PATTERN", @list);

6.13.3 Discussion

String::Approx calculates the difference between the pattern and each string in the list. If fewer than a certain number (by default, 10 percent of the pattern length) of one-character insertions, deletions, or substitutions are required to make the string fit the pattern, it still matches. In scalar context, amatch returns the number of successful matches. In list context, it returns the strings matched.

    use String::Approx qw(amatch);

    open(DICT, "/usr/dict/words")               or die "Can't open dict: $!";
    while(<DICT>) {
        print if amatch("balast");
    }

    ballast
    balustrade
    blast
    blastula
    sandblast
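
To make the edit counting described above more concrete, here is a rough sketch in plain Perl. It is not how String::Approx is implemented; it is only a textbook dynamic-programming table that counts the fewest one-character insertions, deletions, or substitutions needed to make the pattern fit somewhere inside a string. The names min_edits and fuzzy_match are invented for this illustration, and rounding the 10 percent allowance up to a whole number of edits is an assumption.

```perl
use strict;
use warnings;
use POSIX qw(ceil);

# Hypothetical helper: fewest one-character edits needed to turn
# $pattern into some substring of $text.
sub min_edits {
    my ($pattern, $text) = @_;
    my @p = split //, $pattern;
    my $m = length $pattern;

    # Column for the empty text: matching $i pattern characters
    # against nothing costs $i edits.
    my @prev = (0 .. $m);
    my $best = $prev[$m];

    for my $c (split //, $text) {
        # Row 0 is always free, so a match may begin anywhere in $text.
        my @cur = (0);
        for my $i (1 .. $m) {
            my $cost = $p[$i - 1] eq $c ? 0 : 1;
            my $min  = $prev[$i - 1] + $cost;                  # match or substitution
            $min = $prev[$i] + 1    if $prev[$i] + 1    < $min; # extra character in text
            $min = $cur[$i - 1] + 1 if $cur[$i - 1] + 1 < $min; # pattern character missing
            $cur[$i] = $min;
        }
        $best = $cur[$m] if $cur[$m] < $best;   # a match may end anywhere too
        @prev = @cur;
    }
    return $best;
}

# Hypothetical helper: allow 10 percent of the pattern length,
# rounded up (assumption), as the edit budget.
sub fuzzy_match {
    my ($pattern, $text) = @_;
    my $allowed = ceil(0.10 * length $pattern);
    return min_edits($pattern, $text) <= $allowed;
}

for my $word (qw(ballast balustrade blast blastula sandblast zebra)) {
    print "$word\n" if fuzzy_match("balast", $word);
}
```

Under these assumptions the pattern "balast" gets a budget of one edit, which is enough for each word in the listing above but not for "zebra".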

Options passed to amatch control case-sensitivity and the permitted number of insertions, deletions, or substitutions. These are fully described in the String::Approx documentation.
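
As a brief illustration of such options, the sketch below passes a modifier list as an array reference between the pattern and the input strings, which is the form shown in the String::Approx documentation. The word list and variable names are made up for the example, and the exact set of modifiers supported by your installed version should be checked against its documentation.

```perl
use strict;
use warnings;
use String::Approx qw(amatch);

# Made-up sample input for illustration.
my @words = qw(Ballast BLAST balustrade zebra);

# "i" asks for case-insensitive matching; the default
# 10 percent allowance for edits still applies.
my @loose = amatch("balast", [ "i" ], @words);

# A bare number sets the maximum number of edits;
# "2" permits up to two insertions, deletions, or substitutions.
my @looser = amatch("balast", [ "i", "2" ], @words);

print "case-insensitive: @loose\n";
print "up to two edits:  @looser\n";
```

Raising the edit allowance widens the net quickly on short patterns, so it is worth trying a setting against representative input before relying on it.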

The module's matching function seems to run between 10 and 40 times slower than Perl's built-in pattern matching. So use String::Approx only if you're after a fuzziness in your matching that Perl's patterns can't provide.

6.13.4 See Also

The documentation for the CPAN module String::Approx; Recipe 1.22



Perl Cookbook, Second Edition
ISBN: 0596003137
EAN: 2147483647
Year: 2003
Pages: 501