Thank you for choosing my book. I have much to tell you, and I'd like to share my experience with you. In my opinion, the information in this book will be interesting both for novice Web programmers and for experts. You haven't read the book yet, so I'd like to tell you a little about it.
As you might have guessed, the key issues of the book are protection of and attack on a Web application.
You will probably agree that coverage of the security of Web applications should involve a detailed analysis of an attacker's actions. Without knowledge of the attacker's
This book doesn't
To protect your system well, you need to know your enemy. This is why each problem is described in this book from two sides: the attacker's and the defender's.
Note that Chapter 8 describes a conceptual virus. Creating this virus doesn't entail any consequences because it cannot be reproduced. It is a purely theoretical issue. I insist that you don't treat it as a malicious program but study it to learn useful information about Web security.
Thank you for your interest in my book.
Marcel Nizamutdinov
Consider a system or a script. As with any other object in the world, its behavior depends on external and internal conditions. Among internal conditions are the server settings, the type of server, the type of database used in the system, the content of the environment
External conditions are the data sent to the server using the HyperText Transfer Protocol (HTTP). Examples of such data are the GET, POST, and COOKIE parameters, as well as some of the headers that a client sends to the server according to HTTP. These values are specified and changed by the client, and the script receives them as environment variables.
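As a rough illustration of these external conditions, the sketch below gathers the client-controlled values that a PHP script typically sees. The function name collect_external_input is hypothetical; the arrays passed to it stand for PHP's $_GET, $_POST, $_COOKIE, and $_SERVER superglobals.

```php
<?php
// Hypothetical helper: collects every value that the client can set.
function collect_external_input(array $get, array $post, array $cookie, array $server): array
{
    $headers = [];
    foreach ($server as $name => $value) {
        // PHP exposes request headers as HTTP_* entries (e.g., HTTP_USER_AGENT).
        if (strncmp((string) $name, 'HTTP_', 5) === 0) {
            $headers[$name] = $value;
        }
    }
    return ['get' => $get, 'post' => $post, 'cookie' => $cookie, 'headers' => $headers];
}

// In a real script: collect_external_input($_GET, $_POST, $_COOKIE, $_SERVER);
```

Everything returned by such a helper comes from the client and can therefore contain any value.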
Fortunately, an external
Consider a complex system consisting of many interrelated
| Definition |
In this context, content means the content of a HyperText Markup Language (HTML) page. |
Dynamic content can be defined as a response of the system to changes in its external conditions. This response can be documented (i.e., explicitly described or logically
Dynamic content is fraught with threat. A site based completely on static content, that is, including only static HTML pages, will not be vulnerable to attacks on scripts because it has no scripts. By definition, a static system doesn't respond to changes in external conditions; therefore, it has a documented response.
However, you shouldn't think that a static site is invulnerable to all types of attacks. For example, it is possible for a malicious person to attack the site through other services, such as vulnerabilities in other Web sites that are physically located on the same server but are components of another system. In addition, attacks on the Web server are possible. In this book, I describe only Web attacks, that is, attacks on scripts and applications accessible using HTTP.
So, dynamic content is the origin of all holes in Web applications. One obvious solution to the problem could involve abandoning dynamics in the Web. However, the contemporary Web would be
| Definition |
A stable system is a system with a documented response to any change in external conditions. |
It appears that this definition, which I learned as a student, is the key to writing secure Web applications. A system can work well in normal conditions.
Messages will be added to a forum, a search in a database will return results, and so on. What's more, the system will pass all tests for functioning in normal conditions, that is, in conditions in which a user doesn't interfere between the browser and the server but just clicks links and sends forms with valid data. In such conditions, the system will work well.
As you can see, interaction between a user and a system, or, in other words, changing the external conditions of the system, can be of two types.
The first type is valid HTTP
| Definition |
An HTTP request is a data set sent by a client to a Web server in accordance with HTTP. The data contain the address of the
|
It is recommended that you test the system's behavior in a situation in which a user examines the HTML code received from the Web server and sends abnormal requests, for example, by entering invalid data into form fields.
Consider a few examples.
The script http://localhost/1/1.php returns a person's name stored in a database with an ID. The ID is sent as a GET parameter with the name id.
The system responds normally to valid ID values:
http://localhost/1/1.php?id=1
http://localhost/1/1.php?id=2
http://localhost/1/1.php?id=100
Therefore, you could say the system works correctly. To be more precise, it correctly responds to correct requests.
In this example, you can see that a request without parameters causes a field to appear, into which you should enter a person's ID. If you enter an integer, you'll see either the person's name or the message telling you that no record was found.
This is an implementation of a simple procedure of retrieving information from the simplest database, a table with two
How will this script behave in other conditions? What will happen if somebody enters data other than an integer into the ID field? The documentation to the system doesn't describe the system's response. You could expect the system to detect the invalid ID and return an error message. However, you should test it.
Try http://localhost/1/1.php?id=a, and you'll see the following message:
Warning: mysql_fetch_object(): supplied argument is not a valid MySQL result resource in x:\localhost.php on line 15
No records were found.
You might be wondering what this means, how an attacker can use this information, and how you should defend your system. I'll comprehensively explain these issues in
This warning message shows that the system improperly responds to an ID that isn't an integer.
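The source of 1.php isn't shown here, so the following is only a sketch of one way to give the script a documented response to a non-integer ID. The names parse_person_id and describe_request, the nine-digit limit, and the message texts are assumptions made for illustration.

```php
<?php
// Hypothetical helper: returns the ID as an integer, or null if the raw
// value is not a short sequence of decimal digits.
function parse_person_id(string $raw): ?int
{
    // ctype_digit() accepts only non-empty strings made of the digits 0-9.
    // The length limit (an assumption) keeps the value within integer range.
    if (!ctype_digit($raw) || strlen($raw) > 9) {
        return null;
    }
    return (int) $raw;
}

// Hypothetical wrapper: every input leads to a message written by the script.
function describe_request(string $raw): string
{
    $id = parse_person_id($raw);
    if ($id === null) {
        return 'Invalid ID: an integer is expected.';
    }
    return "Looking up person #$id";
}
```

With a check like this placed before the database query, a request such as id=a would produce the script's own error message rather than an interpreter warning.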
Consider another example.
The script http://localhost/1/2.php produces almost the same result as the first one, but it looks for a name in a file rather than in a database. The file name is an ID with the TXT extension.
Test this script by sending the following requests:
http://localhost/1/2.php?id=1
http://localhost/1/2.php?id=2
http://localhost/1/2.php?id=3
You'll see that the script normally responds to normal requests that contain IDs of people whose files are available on the disk.
Test the script's behavior in abnormal situations:
http://localhost/1/2.php?id=9999
http://localhost/1/2.php?id=a
You'll get messages like the following:
Warning: fopen(data/5.txt): failed to open stream: No such file or directory in x:\localhost.php on line 12
Warning: fread(): supplied argument is not a valid stream resource in x:\localhost.php on line 13
Warning: fclose(): supplied argument is not a valid stream resource in x:\localhost.php on line 15
As you can see, the system responds improperly to a request containing an ID that isn't an integer or an ID that doesn't
How can an attacker use the information contained in these messages? Again, I'll provide answers in subsequent chapters.
Both examples
If the scripts returned messages saying that the requests were invalid, this would be a documented response. Instead, you receive the interpreter's messages saying that the scripts contain errors.
You could see a lot of such examples in everyday life. People focus attention on how a system works in normal external conditions and almost always ignore that the external conditions can be illogical.
Filtration is most important when writing stable systems.
The notion of filtration is often used when discussing vulnerabilities.
| Definition |
Filtration involves changing the contents of a parameter to avoid an undocumented response from the script. |
Sometimes, the script
A character or a sequence of
To demonstrate how SQL responds to the backslash character, I suggest that you make a few SQL requests:
mysql> select 'test - \'tested\' ';
+-----------------+
| test - 'tested' |
+-----------------+
| test - 'tested' |
+-----------------+
1 row in set (0.00 sec)
mysql>
As you can see, the quotation marks preceded by backslashes were displayed normally. In contrast, the following request will cause an error message:
mysql> select 'test - 'tested' ';
ERROR 1064: You have an error in your SQL syntax. Check the manual that corresponds to your MySQL server version for the right syntax to use near '' '' at line 1
mysql>
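The backslash escaping shown in these requests can be reproduced in PHP. The sketch below is an illustration only: quote_sql_literal is a hypothetical name, and it relies on the built-in addslashes() function, which puts a backslash before single quotes, double quotes, backslashes, and NUL bytes.

```php
<?php
// Illustration only: wraps a value in single quotes after escaping it
// with addslashes(), so that embedded quotes don't end the string early.
function quote_sql_literal(string $value): string
{
    return "'" . addslashes($value) . "'";
}

// prints: select 'test - \'tested\' ';
echo 'select ' . quote_sql_literal("test - 'tested' ") . ";\n";
```

In practice, the escaping function of the database driver or a parameterized query is usually a more reliable choice than addslashes(), because the exact escaping rules depend on the database and the connection's character set.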
Obviously, different parameters should be filtered differently. For example, an unmatched back quotation mark in a string can be crucial in some cases. In other cases, an improper parameter type can cause a system error. This
In essence, filtration can be of two types. These are filtration by barring suspicious parameter values and filtration by setting parameters to safe values.
Filtration by barring is a matter of
This behavior of the protection would seem normal if you remember that a quotation mark makes an SQL request invalid. However, it cannot be justified by common sense.
In my opinion, filtration by setting to safe values is the best. However, it sets all suspicious parameters to a safe form, thus changing their values.
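The two types of filtration can be contrasted in a short sketch. All function names below are hypothetical, and the rule used for barring (refuse any value containing a quotation mark or a backslash) is an assumption chosen only to make the difference visible.

```php
<?php
// Filtration by barring: refuse the value outright if it looks suspicious.
// Returns the value unchanged, or null when the request should be rejected.
function filter_by_barring(string $value): ?string
{
    // Hypothetical rule: bar anything containing a quote or a backslash.
    if (strpbrk($value, "'\"\\") !== false) {
        return null;
    }
    return $value;
}

// Filtration by setting to a safe value: always return something usable,
// at the cost of changing the original value.
function filter_to_safe_string(string $value): string
{
    return addslashes($value);
}

function filter_to_safe_int(string $value): int
{
    // intval() yields 0 for a string that doesn't start with a number.
    return intval($value);
}
```

With these rules, barring rejects a legitimate surname such as O'Brien, whereas setting to a safe value accepts it but stores an altered string.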
You might think that filtration is the key to the problem of Web application safety. However, this is not the case.
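Consider an example: http://localhost/1/3.php. A design specification for this script could be as follows:
Write a script that displays the name of a person whose ID is entered. The data are stored in files that have names identical to IDs and the TXT extension. For example, the data of a person whose ID is 3 are stored in the file 3.TXT.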
If no person with the specified ID is found, an appropriate message should be returned.
The ID is sent using the HTTP GET method. If an ID is missing, the script should display a form suggesting that the user enter his or her ID.
Here is the code of this script:
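<?php
// Read the ID from the GET parameter named "id".
$id = isset($_GET['id']) ? $_GET['id'] : '';

// No ID was supplied: show the input form and stop.
if(empty($id))
{
    echo "
<form>
enter id (integer)<input type=text name=id>
<input type=submit>
</form>
";
    exit;
}

// Display the record if the corresponding file exists.
if(file_exists("data/$id.txt"))
{
    $f=fopen("data/$id.txt", "r");
    $s=fread($f, 1024);
    echo $s;
    fclose($f);
}
else
    echo "records not found";
?>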
Does this script conform to the design specification? It
However, the design specification doesn't say whether the ID should be an integer.
The script completely implements the design specification. For example, if the ID is omitted, an appropriate form is displayed. When the script receives the ID, it looks for a file with the corresponding name.
If the file isn't found, the "records not found" message is displayed, and the script doesn't try to read any data.
Finally, if the file is found, its contents are sent to the browser.
This behavior seems invulnerable. It seems impossible to imagine a situation that would cause an error. If the file isn't found or the name is invalid, the script sends a message to the browser. Note that this message is generated by the script rather than by the interpreter.
You should test this. Make the following requests:
http://localhost/1/3.php?id=1
http://localhost/1/3.php?id=2
http://localhost/1/3.php?id=3
As a result, you'll receive the corresponding records. Even if you send an ID that isn't an integer but a corresponding file exists (e.g., http://localhost/1/3.php?id=abc), you'll receive the record you would expect.
Now specify IDs that are missing from the database or contain characters invalid in a file name (in the file allocation table, or FAT).
Try the following requests:
http://localhost/1/3.php?id=999
http://localhost/1/3.php?id=abcde
http://localhost/1/3.php?id=%3F
http://localhost/1/3.php?id=%3C
http://localhost/1/3.php?id=%7C
Note that the sequences %3F, %3C, and %7C encode the characters ?, <, and |, respectively. So, these characters are sent as IDs.
As you can see, the system's responses are adequate. It returns an error message telling you that no record was found.
However, despite such a stable behavior, the script has a vulnerability
Remember that some special character sequences are used to change the directory and that nothing
Suppose you know that the file TEST.TXT is located in the parent directory of the current subdirectory. You cannot access it using HTTP, but you're
To test how this trick works, make the following request: http://localhost/1/3.php?id=../test. You'll see the contents of the file in the browser window. So, why did the protection let you read the file rather than return a message telling you that the file hadn't been found? The reason is that the file is present in the system. What's more, this file name is valid for file functions such as file_exists() or fopen().
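The effect can be reproduced outside a Web server. The sketch below builds a throwaway directory tree and uses the same path construction as the script ("data/$id.txt"). The names make_demo_tree and read_by_id, the temporary directory, and the file contents are assumptions made for this demonstration.

```php
<?php
// Builds a throwaway layout: <base>/data/1.txt and <base>/test.txt.
function make_demo_tree(): string
{
    $base = sys_get_temp_dir() . '/traversal_demo_' . uniqid();
    mkdir($base . '/data', 0700, true);
    file_put_contents($base . '/data/1.txt', 'Record 1');
    file_put_contents($base . '/test.txt', 'Outside the data directory');
    return $base;
}

// Same path construction as the script: "data/$id.txt".
function read_by_id(string $base, string $id): ?string
{
    $path = "$base/data/$id.txt";
    return file_exists($path) ? file_get_contents($path) : null;
}

$base = make_demo_tree();
var_dump(read_by_id($base, '1'));        // the intended record
var_dump(read_by_id($base, '../test'));  // a file one level above data/
var_dump(read_by_id($base, '999'));      // null: no such file
```

The second call succeeds because the file system resolves data/../test.txt to test.txt in the parent directory, so file_exists() sees an ordinary, existing file.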
This is a crucial vulnerability. I'll try to explain the cause of this vulnerability. The system seems safe, all erroneous situations being excluded. Nevertheless, there is an obvious hole in the system.
The incorrect design specification is responsible for this hole. A perfect one would be as follows:
Write a script that displays the name of a person whose ID is entered. The data are stored in files that have names identical to IDs and the TXT extension. For example, the data of a person whose ID is 3 are stored in the file 3.TXT. The ID is a sequence of digits, uppercase or lowercase letters, underscores, minuses, or periods. If an invalid ID is received, the script should return an error message.
You could specify more valid characters.
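A check that follows this specification could look like the sketch below. The names is_valid_id and read_record and the "invalid id" message are assumptions; the character set is the one listed in the specification.

```php
<?php
// Accepts only digits, ASCII letters, underscores, minuses, and periods.
function is_valid_id(string $id): bool
{
    return preg_match('/\A[A-Za-z0-9_.\-]+\z/', $id) === 1;
}

// Hypothetical lookup: validates the ID before touching the file system.
function read_record(string $dataDir, string $id): string
{
    if (!is_valid_id($id)) {
        return 'invalid id';
    }
    $path = "$dataDir/$id.txt";
    if (!is_file($path)) {
        return 'records not found';
    }
    return file_get_contents($path);
}
```

Because a slash is not among the valid characters, an ID such as ../test is rejected before any file function is called.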
A script
I will now summarize the main principles of writing secure code and the main causes of vulnerabilities.
In fact, there is only one cause. A user can interfere between the browser and the server, and he or she can send illogical values of parameters to the server.
The principle that follows from this is simple: Don't trust the data received from outside the server.
A design specification for a script should be brief, but it should take into account all dangerous situations. A script that complies with a correct design specification will be invulnerable to Web attacks.
If a programmer decides to write a script on his or her own, or if a design specification is written by a person incompetent in security issues who uses the wrong terms, the programmer should write or at least keep in mind a detailed design specification that takes into account all security aspects.
All this entails the following principle: The security of a Web application should be thought out at the stage of writing the design specification, before the first line of code is written.
A person who
From the
There are a few types of vulnerabilities that are entirely the programmer's fault.
These vulnerabilities cannot be foreseen in a design specification
For example, in C and C++, such a
In PHP, a popular programming language for Web applications, a similar problem
The next chapters describe how you can use vulnerabilities of this type, how you should eliminate them, and how you can write secure code.