But I Want Users to Post HTML to My Web Site

Sometimes you simply want to allow a small subset of HTML tags so that your users can add some formatting to their comments. Accepting HTML from untrusted sources is highly discouraged because it's extremely difficult to get right. Allowing tags like <EM>, <PRE>, <BR>, <P>, <I>, and <B> is safe, so long as you use regular expressions to look for these character sequences explicitly. The following regular expression will allow some tags, as well as other safe characters:

if (/^(?:[\s\w\?\!\,\.\'\"]*|(?:\<\/?(?:i|b|p|br|em|pre)\>))*$/i) {
    # Cool, it's valid input!
}

This regular expression will allow spaces (\s), A-Za-z0-9 and _ (\w), a limited subset of punctuation, and < followed by an optional /, then the letter or letters i, b, p, br, em, or pre, followed by a >. The i at the end of the expression makes the check case-insensitive. Note that this regular expression does not validate that the input is well-formed HTML. For example, Hello, </i>World!<i> is legal input to the regular expression, but it is not well-formed HTML even though the tags are not malicious.
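The gap between "allowed" and "well-formed" can be illustrated with a small Perl sketch. It is not from the book: the helper names is_allowed_input and is_balanced are hypothetical, the pattern is a variant of the one above that consumes one character or one tag per loop iteration, and the rule that every tag except <br> needs a matching close is an assumption made for the example rather than a full statement of HTML's rules.

```perl
use strict;
use warnings;

# Variant of the allow-list pattern from the text. Each iteration of the
# outer loop consumes exactly one safe character or one permitted tag.
my $ALLOWED = qr/^(?:[\s\w?!,.'"]|<\/?(?:i|b|p|br|em|pre)>)*$/i;

# Assumption for this sketch: <br> is the only tag that never takes a close.
my %VOID = (br => 1);

# Hypothetical helper: true when the input contains only allowed
# characters and allowed tags.
sub is_allowed_input {
    my ($text) = @_;
    return $text =~ $ALLOWED ? 1 : 0;
}

# Hypothetical helper: true when every opening tag is closed in the
# reverse order it was opened. Uses a simple stack of open tag names.
sub is_balanced {
    my ($text) = @_;
    my @open;
    while ($text =~ m{<(/?)(i|b|p|br|em|pre)>}gi) {
        my ($closing, $tag) = ($1, lc $2);
        if ($VOID{$tag}) {
            return 0 if $closing;    # a closing </br> makes no sense
            next;
        }
        if ($closing) {
            # The most recently opened tag must be the one being closed.
            return 0 unless @open && pop(@open) eq $tag;
        }
        else {
            push @open, $tag;
        }
    }
    return @open ? 0 : 1;            # leftover open tags mean unbalanced
}

for my $input ('Hello, <i>World!</i>', 'Hello, </i>World!<i>') {
    printf "%-22s allowed=%d balanced=%d\n",
        $input, is_allowed_input($input), is_balanced($input);
}
```

Both sample strings pass the allow-list check, but only the first passes the nesting check. Running the allow-list check first keeps the second pass simple, because by then the only tags it can encounter are the six permitted ones with no attributes.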

CAUTION
Be careful when accepting HTML input. It can lead to compromise unless the solution is bulletproof. This issue became so bad for the distributed crypto-cracking site http://www.distributed.net that they took radical action in January 2002. You can read about the issues they faced and their remedy at http://n0cgi.distributed.net/faq/cache/268.html. By the way, the URL starts with n-zero-cgi.



Writing Secure Code
Writing Secure Code, Second Edition
ISBN: 0735617228
Year: 2001
Pages: 286