Threats and Vulnerabilities | Extreme Exploits: Advanced Defenses Against Hardcore Hacks (Hacking Exposed)

Now it is time to discuss the specifics of how your applications and services are vulnerable: what to look for within your applications and how to solve these problems in advance of an attack. Throughout this section, the vulnerabilities are grouped into several classifications. Each grouping discusses how the vulnerability works, what it is in detail, and how to either resolve the issue entirely or mitigate your risk. Keep in mind that these vulnerabilities exist within your applications because of programmatic weaknesses: unexpected conditions, lack of error handling, or flawed or nonexistent input validation. All of these defects or potential defects in your application are the result of careless software development. In many cases, examples are provided to help you better visualize the process or details of the vulnerability.

Different types of vulnerabilities have common characteristics. For example, language injection (SQL injection) and buffer overflows are both caused by similar problems with application code: failure to validate data inputs from users and/or the program itself (that is, making assumptions about the inputs that the user or applications will provide). There are other types of vulnerabilities as well, though not all of them result in a root shell or a crash of the application. The following classes of vulnerabilities are common:

Attacks on sensitive information
Attacks on the application from local users
Attacks on the application from libraries, third parties, and the application runtime environment

The following vulnerabilities will be discussed, which in general fall into one of these three categories. Remember, this isn't a complete list of every type of vulnerability that is out there:

Input Validation: Almost all vulnerabilities are caused by this area of programmatic deficiency. Put simply, this means validating user or application inputs properly.
SQL Injection and String Concatenation: A specialized form of input validation error where inputs from one language are used to create outputs in another. Proper validation and generation of these output languages is critical.
Buffer Overflows/Overruns: Also usually the result of input validation errors, in which program execution is subverted using special knowledge of how the language allocates and stores memory on the specific target platform.
Race Conditions: Attacks on assumptions about the order of flow of an application, or synchronization of interdependent resources.
Memory and Resource Exhaustion: Memory allocation or resource allocation errors that, if exploited, often result in denial of service.
Future Vulnerabilities: Other advanced attacks on applications, and future concerns.

Input Validation

One of the most important aspects in writing defensive code (or breaking it) is the validation of user inputs, whether they come from the user's interactive shell via a remote login, from command line parameters, from a network socket, or from a third-party application.

Below is an example of a simple but worthless program with faulty input validation, written in C:

 #include <stdio.h>

 /* Reads one line from stdin into buf, trusting the caller's max. */
 char* get_line(int max, char* buf) {
         return fgets(buf, max, stdin);
 }

 int main(int argc, char* argv[]) {
         char buf[100];
         get_line(1024, buf);   /* intentional flaw: buf holds only 100 bytes */
         fprintf(stdout, "%s", buf);
         return 0;
 }

When you execute this program and provide more than 99 characters of input, the program will usually crash, and most operating systems produce a "Segmentation Fault" error message on the command line where it was executed. This isn't a useful program, as it does nothing more than mismanage memory (insert witty pun about gigantic software companies here). It does, however, demonstrate what happens in the real world by making obvious one of the worst consequences of not validating user inputs: in this case, allowing up to 1024 bytes to be written into a buffer that can hold only 100 (including the terminating null ('\0') character). By injecting shellcode at the right location in the program, a buffer overflow exploit can be created, one of the most severe consequences of input validation errors. This is especially dangerous when paired with the strategy outlined above concerning local privilege escalation. If this file were Set-UID to root, you could have just provided the malicious user with a root-level command shell.

Buffer overflows will be discussed in greater depth later in this section. Input validation problems, however, aren't limited to stack-based overruns. Problems can occur with overwriting function pointers, virtual table addresses, and any other function that writes to memory and doesn't validate the size of the memory and/or input before doing so.

SQL Injection

Although a relatively new and lately popular form of attack, SQL injection is caused strictly by data input validation errors. The result of this exploit is the generation and execution of unexpected SQL. The most common way of exploiting SQL injection errors is through programs that build SQL commands by concatenating strings containing SQL commands, or portions thereof, to form one command to be executed.

By understanding how the strings are being concatenated together and by taking advantage of the robust syntax of these languages, entire blocks of valid SQL calls can be unexpectedly sent to a database server, revealing sensitive information or destroying data.

Does your application make calls to a SQL database, and if so, are SQL calls being generated using string concatenation? Unfortunately, this is a common software development practice among less experienced or non-security-savvy software developers. For example:

 sqlCmd = "SELECT * FROM Users WHERE UserID = '" + UserID + "'"

If your code looks like this, you should be worried! Let's take a closer look. How is the UserID field processed? What is its value? What happens if UserID contains something other than the usual alphabet characters you might expect it to contain? Were the contents of the UserID value coming from a web application and inspected/stripped of invalid content before this command (as listed earlier), or have you relied upon the database manager to handle the input validation?

Because many SQL database manager applications support inline comments, the exploiter can often invalidate whole blocks of SQL text after the input parameter by commenting them out with SQL comment tags. For example, the UserID field, instead of providing a value such as "10234", could be changed to:

 '; DROP TABLE somedata --

Given that the UserID variable holds the data presented above, the full SQL statement would be generated as follows, where everything after the opening tick mark is the new (injected) SQL:

 SELECT * FROM Users WHERE UserID = ''; DROP TABLE somedata --'

This surprises many people, as they would think that the ending tick mark would make it hard to do anything else with this. Unfortunately, this is a perfectly valid SQL statement in many database engines and won't generate any errors (assuming the user has access to the DROP TABLE command and can access the table somedata). Look closer at what happened: the value of the UserID variable, which was passed into your application from a variable in a web application (from a form field via an HTTP POST), is now a dangerous command to the SQL engine to delete data. The injected part of the listing above could very well have been any valid SQL statement, not just something destructive.

Many SQL query libraries can also redirect output to a file or read input from a file, very easily giving the attacker a method of querying for data they should not have access to and a method of reporting on that data. For example, we could replace our injected SQL with:

 SELECT * FROM customer_cc \o /var/www/htdocs/.../.ccl ; --'

This, for example, may very well write all of your customers' credit card numbers (hopefully they were encrypted!) to the file /var/www/htdocs/.../.ccl, which the exploiter hopes to reach and download via a web browser because it might be under the document root and not be protected. This simple maneuver completely subverted your database and application security controls, all because of the lack of adequate (simple) input validation.

Because of the interactive nature of these exploits, and because few administrators actively monitor the SQL error messages or warnings that are logged, the exploiter is free to attempt several different types of commands and options until he or she learns enough to find out what types of commands are available and what files are exploitable. A command such as sp_help or /h may be all the miscreant needs to see what type of SQL libraries and databases are being used.

So what can you do to prevent these types of exploits? The answer falls into three categories:

Check the inputs. Do not send arbitrary inputs to the database. Check them first and determine them to be valid and appropriate. Using libraries that support prepared statements is the easiest way. For example, create a statement such as:
```
 queryStatement = "SELECT * FROM Users WHERE UserID=?";
 // then provide that value to the prepared statement:
 queryStatement.execute(userID);
```
If the user attempted any of the exploit examples provided above, they would fail because the statement processor would accept the supplied value only as data for the UserID field, checked against that field's data type, meaning tick marks can no longer end the string and start a new command. One field and one field only, please.

If the library doesn't support prepared statements, create your own functions that handle the validation, so that if there are any invalid inputs (for example, tick marks), program execution fails immediately (before going to the SQL database), throwing an exception or error message.

Tip

It is far safer and more efficient to have a "deny all" policy, with specific allowances for reasonable characters, in your applications and input validation techniques. It is far too difficult (especially in multi-character-set environments and with internationalization/localization) to enumerate every possible "bad" character. Therefore, it is normally a better practice to describe (and filter for) what is allowed and then throw everything else away (and/or throw an exception).

Protect the database. Create different users for different roles. Don't allow database tables to be deleted by users who shouldn't be allowed to delete data. This access control should occur both within the application and at the database level for the best security. Give your application a controlled environment when working with the database, and don't rely on application-level security to protect your database operations. The database should allow your application (the service account your application utilizes in order to access the database) to do only certain things, and your application should, in turn, allow the end user to do only certain things.
Protect the operating system. Don't allow the database server to do things it shouldn't by having access to files and directories it doesn't need access to. In the previous example the database server would have created a file in the webroot. Disallow the database server from doing this by running it as a special user, or with a service account, and give that user just the privileges necessary to function and nothing more. This strategy has been pursued well by the OpenBSD development team. Privilege separation has been achieved by breaking apart common applications into multiple pieces and running each piece as a different user that has only the minimum amount of privilege necessary to perform its functions. The qmail and Postfix mail transport agent software packages use this same strategy.

Buffer Overflows

In recent years buffer overflows have been well documented and have become a common attack vector for malicious exploitation of computer software and systems. Despite the increase in good information about how these exploits work, few people really understand them or can explain them in layperson's terms. Buffer overflows are common because the mistakes that make them possible are easy to make, and once found they become an instant target for automated attacks, such as viruses, worms, bots, Trojans, and other malicious software. These overflows aren't easy to find, but they have almost instantaneous and usually very serious security ramifications.

What Exactly Is a Buffer Overrun or Overflow?

A buffer overflow is a specific application vulnerability that is the result of memory management errors in computer applications that use low-level libraries and language features that are not protected properly. One common attack is to overfill a memory buffer that is not large enough to hold its contents, thereby overflowing the buffer (the space that has been set aside for it in memory) and causing execution of unintended and usually malicious code. These types of application defects have been around as long as these operating systems and languages have existed. In developing applications using these languages, usually C and the C-derivative languages, memory management errors are common. While C is a powerful and versatile language, it can be argued that the popularity of C in particular has been a major cause of security problems, because the language makes it easy to code poorly. Under normal execution, these errors usually generate application defects that cause the program to crash, but not infect your computer or reveal sensitive information. In Windows, this was commonly presented to the user as an "Unhandled Exception" or "General Protection Fault" (GPF) message, after which the program terminated unexpectedly. In UNIX the message is often just the string "Segmentation Fault," followed by immediate program termination. These are the same errors that make buffer overflows possible. They are, essentially, the lack of input validation and the mismanagement of computer memory. Basically, what the operating system is saying is that the program tried to access memory outside of its allocated address space, usually because the buffer overflow operation attempted to write to memory beyond its confined spaces.

To understand more clearly, you need to know some of the general housekeeping and protection the operating system uses. One is confinement for all of the processes that execute in user mode, that is, any regular user program that executes in protected mode, including those executed with root or super-user privileges. Each program has different memory requirements for execution, and the operating system manages the execution and memory through operating system calls, or special programmatic interfaces. There must be enough memory for the program code itself (it has to fit into memory in order to be initially executed), plus all the static data (data that doesn't change), plus room for data that is global to the program and accessible by all functions, plus room for data that is allocated and deallocated as the program executes and as requested by the program at runtime. These areas of memory are typically referred to as text, data (part of which is sometimes referred to as BSS), and stack, respectively. All of this memory space is well defined. When the program starts up, it is basically fixed. Now, if additional memory is requested during program operation and it is available, it will be provided, thereby increasing the size of memory available to the program. This dynamically allocated memory is called heap memory. No matter what the program does, if it attempts to access memory outside of these confines, the operating system will not let that happen and will generate an exception and terminate the program. This is what happens when you see the "General Protection Fault" messages in Windows, or a bus error or segmentation fault in UNIX. What has likely occurred is that the program contains a bug, has mismanaged memory, and has attempted to access memory beyond the end of the stack section (at the top of the address space).
Another possibility is that data accessible from the heap has overwritten stack memory, including some special areas of stack memory that are used for flow control of the program, called the stack pointer and return address. Data for the heap grows up (in memory) as it is allocated, and data for the stack grows down as needed, or as the flow of the program executes. These two special variables in memory (on the stack side) simply hold the location in memory of the next function/code to call, and the location to return to when the function is complete. What happens if data is written to the heap that overflows the buffer allocated for it and the data written extends beyond the heap? If the heap memory overlaps the stack variables, what happens if the return address variable is overwritten with some other location in memory, not the address that it should be to continue normal flow of the program? This is a simple programmatic error and happens quite frequently, but it can be exploited for more devious trickery.

What usually happens when these programmatic errors are made by legitimate developers is that the program simply crashes, because the memory location that is executed (or attempted to be executed) is outside the confines of the protected spaces allocated for the program, and the operating system enforces the confinement, as it should. You don't want the whole operating system to crash and lock up, as was common with operating systems that didn't run in protected mode, do you? (Having the program crash based upon user input might in itself be really bad, which we will talk about a bit later in this chapter.) Suppose for a moment that, instead of crashing, the program is tricked into executing some other code: perfectly valid code that is designed for your computer and operating system. The real code is stored in memory in the buffer that was overwritten, and the return address points to this location in your buffer (inside the program's address space). It could have been a regular computer program if it wanted to be, but right at the moment of that function call return, the operating system is tricked into executing this tricky code, called shellcode. Shellcode is the common name used to refer to this injected computer program, which, as the name would have it, was originally written with the goal of producing a shell, or an interface to a command interpreter, so the exploiter could type more commands and see what else was available on your system. This is commonly used against programs with higher privileges (Set-UID programs, for example) because it can lead to local privilege escalation.

Why does the operating system allow this to occur within the program at all? Shouldn't it better protect itself? That could be a really long discussion, one which we don't have room for here, but the answer is likely yes. However, suffice it to say most of these buffer overflow attacks are targeted at programs written in the C or C++ programming languages, and these languages are simple and powerful, sometimes too powerful. They were designed to be fast, at a time when fast and small was really important. If the program wasn't fast and small, the computer wouldn't function at all, or would have been so expensive you couldn't afford it. Now it seems that matters much less, but nevertheless, operating systems and much of the most popular software written today are still developed using these lower-level languages (C and C++). Further, these languages contain certain unsafe functions, mostly related to input and output of string data for user inputs, and they do virtually nothing (in their default form) to ensure they are used properly. These are basically low-level mechanisms for input and output, all of which involve reading and writing data to memory. In reality these vulnerabilities exist because the interface between the language and the program isn't safe, meaning the language doesn't protect itself, and thereby you as the application developer and/or user aren't protected.

A buffer overflow takes advantage of the way in which the C language translates the code for function calls (these unsafe ones) into machine instructions, and of the alignment or location of the memory variables that it needs to make the call to this function. This includes passing the input parameters, returning a value, and then jumping back to the original code for continued execution. To understand how the shellcode ultimately gets executed, it is important to understand how variable data is stored in the stack area of memory. Data written into a local stack-based variable fills upward in memory (into higher memory locations), even though the stack as a whole grows down and dynamically allocated heap memory grows up, as described earlier. Unfortunately, special variables (the return address and the Stack Frame Pointer (SFP)) are stored in the area of memory above these user-defined stack variables. As mentioned, since writes to these stack variables proceed upward, there exists a case where, if more data is put into a user variable than memory was allocated for it, the data can extend into these special and critical variables. The diagram in Figure 18-5 demonstrates the allocation of memory to support a normal function call, the overflow of the user-defined variables, and the jump to and execution of the shellcode because the return address of the function has been overwritten to point to another memory location.

Figure 18-5: Allocation and alignment of memory for function call

Tip

Several operating system and/or compiler extensions assist in reducing or preventing buffer overflow attacks. For more information see Chapter 7's section "Buffer Overflow Prevention."

If the diagram in Figure 18-5 doesn't make sense, you will need a deeper understanding of how the language is translated into machine instructions, an exercise we defer to other fine texts on the subject. Try reading one of the excellent books mentioned in the "Recommended Reading" section at the end of the chapter.

Writing shellcode, and finding and exploiting buffer overflow vulnerabilities on a particular platform, is not an easy task, but it happens frequently enough. These are often the reports you read from CERT, or the reason you are getting a Windows Update. A popular vulnerability scanner, as discussed in greater detail in Chapters 12 through 15, currently contains modules to test for hundreds of applications that contain or have been found (at one point in time) to contain buffer overflows. These modules also test for other types of exploits, but buffer overflows in particular tend to be very severe problems, especially if they can be triggered remotely. The problem is compounded by the fact that once a buffer overflow is found, the exploiter doesn't necessarily have to write all the shellcode themselves, as plenty has already been written for most platforms and is readily available, and new shellcode is shared rapidly among the miscreant underworld.

So what does this buffer overflow attack really look like? What is this shellcode exactly? Here is a very simple example, one that exploits a mock program developed only to be exploited. Adapting the previous very simple C program that allocates a buffer and then reads into it, we have changed it slightly to read data from the command line. Then, shellcode is injected into a buffer that isn't large enough to hold all the data, by way of code that doesn't check the length of the input properly. This buffer overflow overwrites the return address of the function call, and the new address points to code that does something else (creates our shell). First, look at the code and then see what happens with a long string of input values (the usual crash):

 #include <stdio.h>
 #include <string.h>   /* strcpy */

 int main(int argc, char* argv[]) {
         char buf[100];
         if (argc > 1) {
                  strcpy(buf, argv[1]);   /* intentional flaw: no length check */
         }
         fprintf(stdout, "buf=%s\n", buf);
         return 0;
 }

 $ cc -o ex3 ex3.c
 $ ./ex3 234343434343434343434343434343434343434343434333434343434343434343434343434343434343434343434343433434343434343434343434344
 buf=234343434343434343434343434343434343434343434333434343434343434343434343434343434343434343434343433434343434343434343434344
 Segmentation fault

Now if the input is changed to something like the following shellcode:

 $CODE =$'047704770477047704770477047704770477047704770477047704770477047704770477047704770477047704770477047704770477047704770477047704770477047704770477047704770477047704770477047704770477047704770477047'

Below, the program is executed again, this time with the shellcode in the environment variable passed into the program as argument/parameter 1. It is subsequently read and copied into the same buffer:

 $ ./ex3 $CODE
 buf=