There's a hidden language somewhere in the world of .NET. It's a language that few know about, yet thousands could benefit from. It's a language so potentially complex it can take hours to understand just one line of code.
It's the language of regular expressions.
So, what is a regular expression? Microsoft rather blandly defines it as a concise and flexible notation for finding and replacing patterns of text. That's like saying Big Ben is an interesting clock. You see, expressions are so much more than that. They're a highly cool method of taking a specially written piece of text, an expression (similar to the way you'd use the *.doc expression when searching for Word documents), then using it to manipulate your text in a special way.
For example, you may want to find all matches for a particular expression in a piece of text (such as finding all the U.S. telephone numbers in a document). You may wish to check whether a piece of text adheres to a particular expression (such as checking that a credit card number is of the correct format). You may wish to replace certain text using your expression (such as removing expletives from forum posts), or reorganize the text using your expression (such as changing an American-styled date into a British one).
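To make those four uses concrete, here is a minimal sketch built on the System.Text.RegularExpressions classes. The sample text and all four patterns are illustrative assumptions of mine, deliberately simplified; a real phone number or credit card check would need something stricter.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module RegexTour
    Sub Main()
        ' Hypothetical sample text, made up for this sketch.
        Dim sample As String = "Call 555-123-4567 or 555-987-6543 before 12/25/2003."

        ' 1. Find all matches: a simplified U.S. phone pattern (illustrative only).
        For Each m As Match In Regex.Matches(sample, "\d{3}-\d{3}-\d{4}")
            Console.WriteLine(m.Value)
        Next

        ' 2. Check that a string adheres to a format: exactly 16 digits.
        '    This tests the shape only, not whether the card number is real.
        Console.WriteLine(Regex.IsMatch("4111111111111111", "^\d{16}$"))

        ' 3. Replace: mask a short, made-up list of unwanted words.
        Console.WriteLine(Regex.Replace("That darn bug", "\b(darn|heck)\b", "****", RegexOptions.IgnoreCase))

        ' 4. Reorganize: swap month and day, turning mm/dd/yyyy into dd/mm/yyyy.
        Console.WriteLine(Regex.Replace(sample, "\b(\d{1,2})/(\d{1,2})/(\d{4})\b", "$2/$1/$3"))
    End Sub
End Module
```

In the last call, $1, $2, and $3 refer to the three bracketed groups in the pattern, so the date 12/25/2003 in the sample comes back as 25/12/2003.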
It's powerful, and can save you hours of development time.
I'll give you an example. Previously, I've written function upon function, each dozens of lines long, to perform the complex task of searching a chunk of HTML for links and extracting as appropriate. It was in-depth. It took me at least two days and was buggy as hell.
If I'd known about regular expressions, however, I could've implemented a little code to use the expression a.*href\s*=\s*(?:""(?<1>[^""]*)""|(?<1>\S+)). It looks confusing. It is confusing. But it also does the bug-free job of line upon line of my weird Mid, InStr, and Like code.
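Here is roughly what that "little code" might look like. Treat it as a sketch under a few assumptions: the HTML string is invented for the demonstration, and I've used only the href portion of the expression, because a leading a.* is greedy and, when several links share one line, would skip ahead to the last href on that line.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module LinkExtractor
    Sub Main()
        ' Hypothetical HTML: one quoted link and one unquoted link.
        Dim html As String = "<p><a href=""http://example.com/"">Home</a> <a href=about.html class=nav>About</a></p>"

        ' Group 1 captures either the text between double quotes
        ' or, failing that, a run of non-whitespace characters.
        ' (Each "" inside a VB string literal stands for one quote.)
        Dim pattern As String = "href\s*=\s*(?:""(?<1>[^""]*)""|(?<1>\S+))"

        For Each m As Match In Regex.Matches(html, pattern, RegexOptions.IgnoreCase)
            Console.WriteLine(m.Groups(1).Value)
        Next
    End Sub
End Module
```

Both alternatives store their result in the same group, named 1, so the loop reads m.Groups(1) without caring which branch matched. Note that the unquoted branch stops only at whitespace, so an address written as href=about.html> with no space before the bracket would drag the rest of the tag along with it.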