Redemption Steps | Writing Secure Code

There are two steps on the road to XSS redemption:

1. Restrict the input to valid input only. Most likely you will use regular expressions for this.
2. HTML encode the output.

You really should do both steps in your code; the following code examples outline how to perform one or both steps.
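As a rough illustration of the two steps working together, here is a minimal C++ sketch. The names EncodeForHtml and BuildGreeting are hypothetical, the 5-to-25 word-character policy is simply borrowed from the samples in this section, and std::regex (C++11) is assumed to be available.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (step 2): HTML-encode a few characters that are
// commonly abused in XSS attacks. Not an exhaustive encoder.
std::string EncodeForHtml(const std::string &raw) {
    std::string out;
    out.reserve(raw.size());
    for (char c : raw) {
        switch (c) {
            case '<': out += "&lt;";   break;
            case '>': out += "&gt;";   break;
            case '&': out += "&amp;";  break;
            case '"': out += "&quot;"; break;
            default:  out += c;        break;
        }
    }
    return out;
}

// Hypothetical helper: step 1 (restrict the input), then step 2 (encode).
// Returns false and leaves `page` untouched when the name is rejected.
bool BuildGreeting(const std::string &name, std::string &page) {
    // Assumed policy: 5 to 25 word characters, nothing else.
    static const std::regex valid("\\w{5,25}");
    if (!std::regex_match(name, valid)) {
        return false;
    }
    page = "Hello, " + EncodeForHtml(name);
    return true;
}
```

The encoding step still runs on input that passed validation. That is deliberate: if the validation rule is loosened later, the output remains encoded.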

ISAPI C/C++ Redemption

Calling code like the code below prior to writing data out to the browser will encode the output.

    #include <cstring>
    #include <string>

    ///////////////////////////////////////////////////////////////////
    // HtmlEncode
    // Converts a raw HTML stream to an HTML-encoded version
    // Args
    //   strRaw: Pointer to the HTML data
    //   result: A reference to the result, held in std::string
    // Returns
    //   false: failed to encode all HTML data
    //   true: encoded all HTML data
    bool HtmlEncode(const char *strRaw, std::string &result) {
        size_t iLen = 0;
        size_t i = 0;
        if (strRaw && (iLen = strlen(strRaw))) {
            for (i = 0; i < iLen; i++)
                switch (strRaw[i]) {
                    case '\0': break;
                    case '<' : result.append("&lt;");   break;
                    case '>' : result.append("&gt;");   break;
                    case '(' : result.append("&#40;");  break;
                    case ')' : result.append("&#41;");  break;
                    case '#' : result.append("&#35;");  break;
                    case '&' : result.append("&amp;");  break;
                    case '"' : result.append("&quot;"); break;
                    default  : result.append(1, strRaw[i]); break;
                }
        }
        return i == iLen;
    }

If you want to use regular expressions in C/C++, you should use either Microsoft's CAtlRegExp class or Boost.Regex, explained at http://boost.org/libs/regex/doc/syntax.html.
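Whichever library you pick, the pattern has to cover the whole input, not just the start of it. The sketch below illustrates the difference. It uses std::regex from the C++11 standard library rather than either library named above, purely to stay self-contained; the function names are hypothetical.

```cpp
#include <regex>
#include <string>

// Whole-input check: std::regex_match succeeds only when the pattern
// consumes the entire string, so explicit ^ and $ anchors are not needed.
bool IsValidName(const std::string &input) {
    static const std::regex pattern("\\w{5,25}");
    return std::regex_match(input, pattern);
}

// Pitfall, shown for contrast: std::regex_search with a pattern anchored
// only at the start accepts any input that merely begins with valid text.
bool StartsWithValidName(const std::string &input) {
    static const std::regex pattern("^\\w{5,25}");
    return std::regex_search(input, pattern);
}
```

With these definitions, an input such as Michael<script> is rejected by IsValidName but accepted by StartsWithValidName, which is why a validation pattern should be anchored at both ends or applied with a whole-string match.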

ASP Redemption

Use a combination of regular expressions (in this case, the VBScript RegExp object) and HTML encoding to sanitize the incoming HTML data:

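    <%
        name = Request.Querystring("Name")
        Set r = new RegExp
        r.Pattern = "^\w{5,25}$"
        r.IgnoreCase = True

        ' Execute returns an empty collection when nothing matches,
        ' so check Count before touching any match.
        Set m = r.Execute(name)
        If m.Count > 0 Then
            Response.Write(Server.HTMLEncode(name))
        End If
    %>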

ASP.NET Forms Redemption

This code is similar to the above example, but it uses the .NET Framework libraries and C# to perform the regular expression and HTML encoding.

    using System.Text.RegularExpressions;
    using System.Web; // Make sure you add the System.Web.dll assembly
    ...
    private void btnSubmit_Click(object sender, System.EventArgs e)
    {
        // Anchor both ends so the whole value must match
        Regex r = new Regex(@"^\w{5,25}$");
        if (r.Match(txtValue.Text).Success) {
            Application.Lock();
            Application[txtName.Text] = txtValue.Text;
            Application.UnLock();
            lblName.Text = "Hello, " +
                HttpUtility.HtmlEncode(txtName.Text);
        } else {
            lblName.Text = "Who are you?";
        }
    }

JSP Redemption

In JSP, you would probably use a custom tag. This is the code to an HTML encoder tag:

    import java.io.IOException;
    import javax.servlet.jsp.JspException;
    import javax.servlet.jsp.tagext.BodyTagSupport;

    public class HtmlEncoderTag extends BodyTagSupport {
        public HtmlEncoderTag() {
            super();
        }

        public int doAfterBody() throws JspException {
            if (bodyContent != null) {
                String contents = bodyContent.getString();
                // The backslash must be doubled inside a Java string literal
                String regExp = "^\\w{5,25}$";
                try {
                    // Do a regex to find the good stuff;
                    // anything else is HTML encoded
                    if (contents.matches(regExp)) {
                        bodyContent.getEnclosingWriter().write(contents);
                    } else {
                        bodyContent.getEnclosingWriter().write(encode(contents));
                    }
                } catch (IOException e) {
                    System.out.println("Io Error");
                }
            }
            // The body has been written; do not evaluate it again
            return SKIP_BODY;
        }

        // JSP has no HTML encode function
        public static String encode(String str) {
            if (str == null)
                return null;
            StringBuilder s = new StringBuilder();
            for (int i = 0; i < str.length(); i++) {
                char c = str.charAt(i);
                switch (c) {
                    case '<': s.append("&lt;");   break;
                    case '>': s.append("&gt;");   break;
                    case '(': s.append("&#40;");  break;
                    case ')': s.append("&#41;");  break;
                    case '#': s.append("&#35;");  break;
                    case '&': s.append("&amp;");  break;
                    case '"': s.append("&quot;"); break;
                    default:  s.append(c);
                }
            }
            return s.toString();
        }
    }

And finally, here is some sample JSP that calls the tag code defined above:

    <%@ taglib uri="/tags/htmlencoder" prefix="htmlencoder"%>
    <html>
    <head>
        <title>Watch out you sinners...</title>
    </head>
    <body bgcolor="white">
        <htmlencoder:htmlencode><script type="javascript">BadStuff()</script></htmlencoder:htmlencode>
        <htmlencoder:htmlencode>testin</htmlencoder:htmlencode>
        <script type="badStuffNotWrapped()"></script>
    </body>
    </html>

PHP Redemption

Just like in the earlier examples, you're applying both remedies: checking validity and then HTML encoding the output using htmlentities():

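    <?php
        // Test the request variable itself so a missing parameter
        // does not raise an undefined-index notice
        if (isset($_GET['name'])) {
            $name = $_GET['name'];
            if (preg_match('/^\w{5,25}$/', $name)) {
                echo "Hello, " . htmlentities($name);
            } else {
                echo "Go away!";
            }
        }
    ?>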

CGI Redemption

This is the same idea as in the previous code samples: restrict the input using a regular expression, and then HTML encode the output.

    #!/usr/bin/perl
    use CGI;
    use HTML::Entities;
    use strict;

    my $cgi = new CGI;
    print CGI::header();
    my $name = $cgi->param('name');
    if ($name =~ /^\w{5,25}$/) {
        print "Hello, " . HTML::Entities::encode_entities($name);
    } else {
        print "Go away!";
    }

If you don't want to load, or cannot load, HTML::Entities, you could use the following code to achieve the same task:

    sub html_encode {
        my $in = shift;
        $in =~ s/&/&amp;/g;
        $in =~ s/</&lt;/g;
        $in =~ s/>/&gt;/g;
        $in =~ s/\"/&quot;/g;
        $in =~ s/#/&#35;/g;
        $in =~ s/\(/&#40;/g;
        $in =~ s/\)/&#41;/g;
        return $in;
    }
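One detail of this substitution-based approach is easy to miss: the & replacement comes first. If it ran last, it would re-encode the ampersands that the earlier substitutions had just produced. The following C++ sketch illustrates the effect on a reduced set of characters; ReplaceAll, EncodeAmpFirst, and EncodeAmpLast are hypothetical names used only for this illustration.

```cpp
#include <string>

// Replace every occurrence of `from` in `text` with `to`.
// Assumes `from` is not empty.
void ReplaceAll(std::string &text, const std::string &from,
                const std::string &to) {
    for (std::size_t pos = text.find(from); pos != std::string::npos;
         pos = text.find(from, pos + to.size())) {
        text.replace(pos, from.size(), to);
    }
}

// Correct order: ampersands first, so later entities stay intact.
std::string EncodeAmpFirst(std::string in) {
    ReplaceAll(in, "&", "&amp;");
    ReplaceAll(in, "<", "&lt;");
    ReplaceAll(in, ">", "&gt;");
    return in;
}

// Wrong order, shown for contrast: the final pass also rewrites the
// ampersands inside the &lt; and &gt; entities produced just before it.
std::string EncodeAmpLast(std::string in) {
    ReplaceAll(in, "<", "&lt;");
    ReplaceAll(in, ">", "&gt;");
    ReplaceAll(in, "&", "&amp;");
    return in;
}
```

A character-by-character encoder, like the switch-based ones earlier in this section, avoids the ordering question entirely because each input character is visited exactly once.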

mod_perl Redemption

Like all the code above, this example checks that the input is valid and well formed, and if it is, encodes the output.

    #!/usr/bin/perl
    use Apache::Util;
    use Apache::Request;
    use strict;

    my $apr = Apache::Request->new(Apache->request);
    my $name = $apr->param('name');
    $apr->content_type('text/html');
    $apr->send_http_header;
    if ($name =~ /^\w{5,25}$/) {
        # Apache::Util's HTML encoding routine is escape_html
        $apr->print("Hello, " . Apache::Util::escape_html($name));
    } else {
        $apr->print("Go away!");
    }

A Note on HTML Encode

Simply HTML encoding all output is a little draconian for some web sites, because some tags, such as <I> and <B>, are harmless. To temper things a little, consider unencoding known safe constructs. The following C# code shows an example: it un-HTML encodes italic, bold, paragraph, emphasis, and heading tags.

    s = Regex.Replace(s,
        @"&lt;(/?)(i|b|p|em|h\d{1})&gt;",
        "<$1$2>",
        RegexOptions.IgnoreCase);
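The same allowlist idea can be sketched in C++ with std::regex. The function name RestoreSafeTags is hypothetical, and the tag list is taken from the description above (italic, bold, paragraph, emphasis, and heading tags); treat it as an illustration rather than a complete filter.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: given text that has ALREADY been HTML-encoded,
// turn a short allowlist of encoded tags back into real tags.
// Only bare tags such as <b> or </h1> qualify; a tag carrying
// attributes does not match the pattern and therefore stays encoded.
std::string RestoreSafeTags(const std::string &encoded) {
    static const std::regex safe("&lt;(/?)(i|b|p|em|h\\d)&gt;",
                                 std::regex::icase);
    // $1 is the optional slash, $2 is the tag name.
    return std::regex_replace(encoded, safe, "<$1$2>");
}
```

The order of operations matters here: encode everything first, then restore the few constructs known to be safe. Working the other way around, by trying to strip out the dangerous tags, means enumerating every dangerous construct, which is much harder to get right.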