Unmatched Tags


In the previous matching tag examples, my input has had matching p and H2 tags, and the Regex finds them just fine. However, there's nothing in the regular expression itself that requires the opening and closing tags to match. I'm going to add a test that shows that unmatched tags will still pass this Regex, and then see if I can figure out how to require them to match. I seem to remember that there's a way to do that with the Regex class. Here's the new test:

    [Test] public void InvalidXmlNotHandledYet() {
      Regex r = new Regex("<(?<prefix>.*)>(?<body>.*)</(?<suffix>.*)>");
      Match m = r.Match("<p>this is a para</H2>");
      Assert(m.Success);
      AssertEquals("p", m.Groups["prefix"].Value);
      AssertEquals("H2", m.Groups["suffix"].Value);
    }

Just as expected, the same Regex matches a p followed by an H2. Not what we really want, but we want to be sure we understand what our code does. This test now motivates the next extension, to a Regex that does force the tags to match. I'm not sure we will need this; we may already have gone beyond our current need for regular expressions, but my mission here is to learn as much as I can, in a reasonable time, about how Regex works. Now I'll have to search the Help a bit. Hold on...

The documentation seems to suggest that you can have named backreferences, using \k. I'll write a test. Hold on again... All right! Worked almost the first time: just a simple mistake away from perfect. Here's the new test:

    [Test] public void Backreference() {
      Regex r = new Regex("<(?<prefix>.*)>(?<body>.*)</\\k<prefix>.*>");
      Match m = r.Match("<p>this is a para</p>");
      Assert(m.Success);
      m = r.Match("<p>this is a para</H2>");
      Assert(!m.Success);
    }

In this test, notice that we had to type \\k to get the \k into the expression. This is because C# strings, like strings in most languages, already use the backslash to prefix newlines and other special characters. We have to type two of them to get one backslash into the string. The amazing thing is that I actually remembered to do that the first time! The mistake? I left the word suffix there instead of saying \k<prefix>, as was my intent.
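As a side note on the doubled backslash, C# also offers verbatim string literals (an @ before the opening quote), in which a backslash stands for itself. The sketch below is not from the test above; it only illustrates that the escaped and verbatim spellings produce the same pattern, and that the pattern behaves as the Backreference test expects. The class name BackreferenceSketch and the helper TagsMatch are hypothetical names invented for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical illustration class; not part of the original test fixture.
public static class BackreferenceSketch
{
    // Regular string literal: each regex backslash must be typed twice.
    public const string Escaped = "<(?<prefix>.*)>(?<body>.*)</\\k<prefix>.*>";

    // Verbatim string literal: the backslash is taken literally.
    public const string Verbatim = @"<(?<prefix>.*)>(?<body>.*)</\k<prefix>.*>";

    // Hypothetical helper: true when the closing tag starts with the
    // same text that the "prefix" group captured from the opening tag.
    public static bool TagsMatch(string input)
    {
        return new Regex(Verbatim).IsMatch(input);
    }

    public static void Main()
    {
        // Both spellings yield the same pattern text.
        Console.WriteLine(Escaped == Verbatim);
        Console.WriteLine(TagsMatch("<p>this is a para</p>"));
        Console.WriteLine(TagsMatch("<p>this is a para</H2>"));
    }
}
```

Which spelling to use is a matter of taste; the verbatim form tends to be easier to read once a pattern contains several backslashes.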




Extreme Programming Adventures in C#
ISBN: 735619492
Year: 2006
Pages: 291
