So, I have described all the dangers that can appear as a result of the XSS vulnerability and the methods an attacker can use to exploit this vulnerability. Now I'd like to tell you how you can avoid this vulnerability when writing your Web applications.
The cause of the XSS vulnerability is insufficient filtration of the entered data. You should prohibit users from placing tags in their messages; therefore, you should filter the < and > characters that enclose tags.
If the use of tags is required, you can introduce pseudotags that will be replaced with actual tags during output. For example, suggest that your users use constructions like [A=href]text[/A] or [B]text[/B].
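A minimal sketch of pseudotag replacement follows. The function name render_pseudotags() is hypothetical, and restricting links to http, https, or site-relative paths is an assumption made for this example. The text is escaped first, so the only real tags in the output are the ones the script itself generates.

```php
<?php
// Hypothetical helper (illustrative name): escape everything, then turn
// a small set of pseudotags into real tags generated by the script.
function render_pseudotags(string $message): string
{
    // Escape first, so no user-supplied markup or quotation mark survives.
    $safe = htmlspecialchars($message, ENT_QUOTES, 'UTF-8');

    // [B]text[/B] becomes <b>text</b>.
    $safe = preg_replace('#\[B\](.*?)\[/B\]#is', '<b>$1</b>', $safe);

    // [A=href]text[/A] becomes a link, but only for URLs that start with
    // http://, https://, or a single slash (assumed allowlist).
    $safe = preg_replace_callback(
        '#\[A=([^\]\s]+)\](.*?)\[/A\]#is',
        function (array $m): string {
            if (!preg_match('#^(?:https?://|/(?!/))#i', $m[1])) {
                return $m[2]; // drop the link, keep the already escaped text
            }
            // $m[1] is already escaped, so it cannot leave the attribute value.
            return '<a href="' . $m[1] . '">' . $m[2] . '</a>';
        },
        $safe
    );

    return $safe;
}
```

For example, render_pseudotags('[B]hi[/B] <script>') keeps the bold pseudotag and neutralizes the real tag.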
Filtration can be of several types:
It can remove the < and > characters. This can distort the meaning of a message.
It can bar messages with these characters. This method is also unsuitable because it can reject legitimate messages.
It can convert the < and > characters to a safe form. They can be replaced with the &lt; and &gt; sequences. This method is the most interesting because it doesn't change the meaning of the displayed message.
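The three types of filtration can be compared with a short sketch. The function names below are hypothetical and chosen only for this illustration.

```php
<?php
// 1. Removal: the characters disappear, and the message may change its meaning.
function filter_remove(string $text): string
{
    return str_replace(['<', '>'], '', $text);
}

// 2. Rejection: returns true only if the message has no < or > characters.
function message_is_acceptable(string $text): bool
{
    return strpbrk($text, '<>') === false;
}

// 3. Conversion: the characters are shown to the reader but are no longer markup.
function filter_convert(string $text): string
{
    return str_replace(['<', '>'], ['&lt;', '&gt;'], $text);
}
```

With the input 'a < b', removal yields 'a  b' (the comparison is lost), rejection refuses the message, and conversion yields 'a &lt; b', which the browser displays as the original text.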
However, I demonstrated earlier that filtering only these characters is insufficient when some text is output as a tag attribute.
To eliminate XSS in this case, you should prevent the inserted text from overrunning the boundaries of the tag attribute value.
This can be implemented as follows:
You can prohibit the use of spaces in the text output as a value of a tag attribute. Even if the attribute value isn't between quotation marks, an attacker won't be able to overrun the value.
You can enclose the attribute value in quotation marks and restrict the use of them inside the attribute value.
The second variant seems the best because its proper implementation won't limit the attribute values. However, you should restrict the use of quotation marks. Otherwise, an attacker will be able to insert a quotation mark into the attribute value and embed the tag attributes he or she wishes, for example, onMouseOver or Style.
Escaping ("screening") quotation marks or apostrophes with a backslash is ineffective within a tag attribute value, because HTML doesn't treat the backslash as an escape character.
Consider an HTML document.
<a href="x\" onClick=alert(String.fromCharCode(72,101,108,108,111));return/**/false; \"">click me</a><br>
<a href='x\' onClick=alert(String.fromCharCode(72,101,108,108,111));return/**/false; \''>click me</a>
In the first case, the following text will be embedded as a value of the href attribute:
x" onClick=alert(String.fromCharCode(72,101,108,108,111));return/**/false; "
It will be inserted despite escaping the quotation marks with backslashes.
In the second case, escaping the apostrophes will also be ineffective.
This example doesn't include the < and > characters that also should be filtered.
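The difference can be reproduced with a short sketch. It uses the same injected text as above. Using addslashes() to stand in for backslash escaping is an assumption made for this example.

```php
<?php
// The text an attacker supplies as the link address.
$payload = 'x" onClick=alert(String.fromCharCode(72,101,108,108,111));return/**/false; "';

// Backslash escaping: the quotation mark is still a quotation mark for the
// browser, so the href value ends after x\ and onClick becomes a new attribute.
$screened = '<a href="' . addslashes($payload) . '">click me</a>';

// Entity conversion: the quotation marks become &quot;, so the whole text
// stays inside the href value.
$encoded = '<a href="' . htmlspecialchars($payload, ENT_QUOTES, 'UTF-8') . '">click me</a>';

echo $screened, "\n", $encoded, "\n";
```

In the second line of output, the only literal quotation marks left are the two that delimit the attribute value.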
Quotation marks and apostrophes can be filtered using the following methods:
They can be deleted from the text. Although this method gives the expected result, it isn't suitable because messages with quotation marks or apostrophes will be processed improperly.
You can bar messages with quotation marks or apostrophes. This method is even worse than the previous one.
You can convert messages to a safe form by replacing the dangerous characters with the &quot; and &#039; sequences. This method is the best option because it doesn't limit the values of the tag attributes.
PHP offers you the htmlspecialchars() function. It is just what you need to implement filtration. By default, this function converts the <, >, and & characters and quotation marks to safe forms.
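By default, the htmlspecialchars() function doesn't convert apostrophes in PHP versions before 8.1. Since PHP 8.1, the default flags include ENT_QUOTES, so apostrophes are converted as well; passing the flag explicitly gives the same behavior in every version.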
In other words, if you use this function to process tag attribute values, the values should be delimited with quotation marks. Note that if you use this function to process tag attribute values and the attributes aren't delimited with quotation marks (or apostrophes), the attacker will be able to exploit the XSS vulnerability as I demonstrated earlier.
If you have to delimit attribute values with apostrophes, you can use the second parameter of the htmlspecialchars() function.
A call to this function, such as htmlspecialchars("text", ENT_QUOTES), will convert both apostrophes and quotation marks to a safe form.
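The effect of the second parameter can be seen in a short sketch. The sample string and variable names are illustrative. The flags and the character set are passed explicitly, so the result doesn't depend on the defaults of a particular PHP version.

```php
<?php
$value = "O'Reilly said \"hi\" & <left>";

// ENT_COMPAT converts quotation marks but leaves apostrophes alone.
$double_only = htmlspecialchars($value, ENT_COMPAT, 'UTF-8');

// ENT_QUOTES converts both quotation marks and apostrophes.
$both = htmlspecialchars($value, ENT_QUOTES, 'UTF-8');

// Safe with either kind of delimiter, because both are converted.
$field = "<input type='text' value='" . $both . "'>";

echo $double_only, "\n", $both, "\n", $field, "\n";
```

Only the ENT_QUOTES result is suitable for an attribute value delimited with apostrophes.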
To summarize, you should stick to the following rules:
Any text that can be affected by an outsider should be processed before it is displayed.
Processing text that is not a part of a tag (that is, not a tag attribute value or a portion of one) is a matter of filtering the < and > characters by replacing them with the &lt; and &gt; sequences. Filtration of ampersands by replacing them with the &amp; sequence will help you avoid discrepancies between the entered text and the text displayed by the browser. In PHP, you should use the htmlspecialchars() function. Note that filtering quotation marks outside tag attribute values isn't necessary.
Each tag attribute that can be affected by an outsider should be between quotation marks (see the warning about processing apostrophes, given earlier in this section).
If the value of a tag attribute is a URL, it should begin with the name of one of the valid protocols or with a slash, indicating that the document is located on the same site.
Values of tag attributes should be filtered for the < and > characters, ampersands, and quotation marks. In PHP, you should use the htmlspecialchars() function.
If users are allowed to change the names of tag attributes, the set of allowable names should be announced explicitly (and thought out beforehand).
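The rules for URLs and attribute names can be sketched as follows. All function names are hypothetical, and the lists of protocols and attribute names are assumptions to be adapted to your application. The check also rejects addresses that begin with two slashes; this is an addition to the rule above, made because browsers treat such addresses as pointing to another host.

```php
<?php
// Hypothetical check: the URL must begin with an allowed protocol
// (assumed list: http, https, ftp) or with a single slash.
function is_allowed_url(string $url): bool
{
    return preg_match('#^(?:https?://|ftp://|/(?!/))#i', $url) === 1;
}

// Hypothetical explicit allowlist of attribute names users may set.
function is_allowed_attribute(string $name): bool
{
    static $allowed = ['href', 'title', 'alt'];
    return in_array(strtolower($name), $allowed, true);
}

// Builds a link whose attribute value is quoted and converted to a safe form.
function build_link(string $url, string $text): string
{
    if (!is_allowed_url($url)) {
        $url = '/'; // fall back to the site root instead of a dangerous address
    }
    return '<a href="' . htmlspecialchars($url, ENT_QUOTES, 'UTF-8') . '">'
        . htmlspecialchars($text, ENT_QUOTES, 'UTF-8') . '</a>';
}
```

A call such as build_link('javascript:alert(1)', 'click') produces a link to the site root rather than a script URL.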
Now I'd like to say a few words about the exploitation of undocumented features when there is no XSS vulnerability.
As I demonstrated earlier, when users are allowed to include images in their messages, a malicious person can abuse this feature. Because the server cannot verify that an image is harmless, you should be aware of this risk. You should allow the insertion of images only if the users' convenience outweighs the risk of eavesdropping or of users being tricked into revealing their authentication data.
In addition, I'd like to say a few words about performing concealed actions on behalf of the administrator when the XSS vulnerability is eliminated.
To avoid such an attack, you can check the HTTP Referer header when the administrator performs certain actions. However, this approach will be inconvenient if the administrator's browser doesn't send this header or if the header is removed by a proxy server.
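A Referer check could look like the following sketch. The function name and the host name in the usage note are hypothetical. A missing header is treated as a failed check here, which is exactly the inconvenience described above.

```php
<?php
// Hypothetical helper: true only if the Referer header names the expected host.
function referer_matches_host(?string $referer, string $expectedHost): bool
{
    if ($referer === null || $referer === '') {
        return false; // header not sent by the browser or removed by a proxy
    }
    $host = parse_url($referer, PHP_URL_HOST);
    return is_string($host) && strcasecmp($host, $expectedHost) === 0;
}
```

In a request handler, the call might be referer_matches_host($_SERVER['HTTP_REFERER'] ?? null, 'admin.example.com').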
A better solution could involve inserting some additional data, which would identify the administrator, into every link or form. In addition, these data should be dynamic. For example, the session ID can be used as such data, but you shouldn't send it with the HTTP GET method. Rather, insert the hash of the session ID or other information related to the session ID.
The script responsible for authentication should check these data in addition to user authentication. If the attacker doesn't know the data, he or she won't be able to create a malicious form or URL.
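One way to derive and check such data is sketched below. The function names are hypothetical. The sketch uses a keyed hash (HMAC) of the session ID with a server-side secret rather than a plain hash; this is a choice made for the example, not something the text above prescribes.

```php
<?php
// Hypothetical helper: derives a token from the session ID and a secret
// known only to the server.
function action_token(string $sessionId, string $secret): string
{
    return hash_hmac('sha256', $sessionId, $secret);
}

// Compares the submitted token with the expected one in constant time.
function token_is_valid(string $submitted, string $sessionId, string $secret): bool
{
    return hash_equals(action_token($sessionId, $secret), $submitted);
}
```

The token would be placed in a hidden field of every administrative form and checked before the action is performed. An attacker who doesn't know the session ID or the secret cannot produce a valid value.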
In more complicated cases, the system can require the administrator to confirm dangerous actions with his or her password, which should be sent with the HTTP POST method.
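A password confirmation step might be sketched as follows. The function name and the form field name password are assumptions, and the sketch presumes that the stored value was created with password_hash().

```php
<?php
// Hypothetical helper: accepts the confirmation only if it arrives by POST
// and the submitted password matches the stored hash.
function confirm_dangerous_action(string $method, array $post, string $storedHash): bool
{
    if ($method !== 'POST') {
        return false; // never accept the password from the query string
    }
    $password = $post['password'] ?? '';
    return is_string($password) && password_verify($password, $storedHash);
}
```

In a request handler, the call might be confirm_dangerous_action($_SERVER['REQUEST_METHOD'], $_POST, $storedHash).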
In general, the rule requiring all URLs that can be affected by users (in messages and in other places) to begin with the name of a valid protocol (HTTP or FTP) or with a slash will protect the system against this type of attack.