Defensive Programming | Network Programming for Microsoft Windows, Second Edition (Microsoft Programming Series)

Defensive programming is the practice of anticipating where errors can occur in a program and adding code to detect or work around them, so that they do not become program failures or security holes. This makes programs more reliable and secure, and in the early stages of development it also makes them easier to debug. In many cases an error does not become apparent where it actually occurred, but only at some later point in the system to which it has propagated.

Because this practice is so important, a variety of techniques are collected and discussed here together.

Using Safe Functions to Avoid Buffer Overflow

Buffer overflows (or overruns) are a huge concern in the security industry because in some cases they can permit external applications to take control of a system. A buffer overrun occurs when an application writes more data into a buffer than was actually reserved for it. Buffer overruns affect not only heap variables but also stack variables, which allows a malicious application to manipulate the stack.

Let’s look at an example of how a buffer overrun can occur. Consider the following fragment of code that is used to extract a username from a buffer received through a socket:

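    void getUsername( ... ) {
      char buffer[40];
      /* Unsafe: asks for up to 256 bytes, but buffer holds only 40 */
      retBytes = recv( sock, buffer, 256, 0 );
      if (!strncmp(buffer, "Username: ", 10)) {
        /* Unsafe: strcpy copies without any bound */
        strcpy( username, &buffer[10] );
        ...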

What happens in the previous code if more than the 40 bytes reserved for buffer arrive from the socket? The recv call accepts up to 256 bytes, so the extra data overruns the buffer and writes over whatever follows its allocation in memory (in this case, the stack). The unbounded strcpy into username can overrun its destination in the same way. We could correct this situation as follows, using both error checking and the bounded version of strcpy:

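    void getUsername( ... ) {
      char buffer[MAX_BUFFER];
      /* Never ask recv for more than the buffer holds; keep 1 byte for the terminator */
      retBytes = recv( sock, buffer, MAX_BUFFER - 1, 0 );
      if (retBytes <= 0) return;   /* error or connection closed */
      buffer[retBytes] = 0;
      if (!strncmp(buffer, "Username: ", 10)) {
        /* Assumption: username holds 40 bytes, as the constants in this example imply */
        if ( strlen(&buffer[10]) > 39 ) {
          printf("username overflow detected in getUserName\n");
        }
        /* Truncate the username... */
        strncpy( username, &buffer[10], 39 );
        username[39] = 0;
        ...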

In this case, we detect the overflow situation and alert the user via a printf. Additionally, we use strncpy, the bounded counterpart of strcpy, whose third argument defines the maximum number of characters that can be copied into username. Because strncpy does not add a terminator when it truncates, the string is terminated explicitly after the copy.
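The same idea can be sketched without a socket, so that it is easy to test in isolation. This is a minimal sketch, not the book's code: the helper name extract_username, its return convention, and the PREFIX constants are assumptions made for illustration. It treats the input as a length-delimited block of bytes, as recv delivers it, and never reads or writes past either buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PREFIX      "Username: "
#define PREFIX_LEN  10              /* strlen(PREFIX) */

/*
 * Hypothetical helper: copies the username that follows PREFIX in msg
 * into out. msg holds msg_len bytes and need not be null-terminated.
 * Returns 0 on success, 1 if the username was truncated, -1 on bad input.
 */
int extract_username(const char *msg, size_t msg_len,
                     char *out, size_t out_size)
{
    size_t name_len = 0;
    size_t copy_len;
    int truncated = 0;

    assert(out != NULL && out_size > 0);

    if (msg == NULL || msg_len < PREFIX_LEN ||
        memcmp(msg, PREFIX, PREFIX_LEN) != 0) {
        out[0] = '\0';
        return -1;
    }

    /* The name runs to the first terminator or to the end of the data. */
    while (PREFIX_LEN + name_len < msg_len &&
           msg[PREFIX_LEN + name_len] != '\0') {
        name_len++;
    }

    copy_len = name_len;
    if (copy_len > out_size - 1) {
        copy_len = out_size - 1;    /* leave room for the terminator */
        truncated = 1;
    }

    memcpy(out, msg + PREFIX_LEN, copy_len);
    out[copy_len] = '\0';
    return truncated;
}
```

A caller would pass the byte count returned by recv as msg_len and sizeof the destination array as out_size, then decide whether a truncated name should be rejected outright.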

Other bounded functions exist in the standard C library, including snprintf (for sprintf), strncmp (for strcmp), strncat (for strcat), and others. Wherever a bounded variant exists, it should be used instead.
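As a brief illustration of two of these functions, the following sketch wraps snprintf and strncat so that truncation is reported rather than silently ignored. The helper names format_endpoint and append_bounded are hypothetical, and the code assumes a C99-conforming snprintf, which always terminates its output when the size is non-zero and returns the length that would have been written.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "host:port" into out.
 * Returns 0 if the text fit, -1 if it was truncated or formatting failed. */
int format_endpoint(char *out, size_t out_size, const char *host, int port)
{
    int needed;

    assert(out != NULL && out_size > 0 && host != NULL);

    needed = snprintf(out, out_size, "%s:%d", host, port);
    if (needed < 0 || (size_t)needed >= out_size) {
        return -1;
    }
    return 0;
}

/* Hypothetical helper: appends src to the string already in dst.
 * Note that strncat's third argument is the number of characters to
 * append, not the total size of dst, so the free space is computed first.
 * Returns 0 if all of src fit, -1 if it was truncated. */
int append_bounded(char *dst, size_t dst_size, const char *src)
{
    size_t used, room;

    assert(dst != NULL && src != NULL && dst_size > 0);

    used = strlen(dst);
    assert(used < dst_size);        /* dst must already be terminated */

    room = dst_size - used - 1;     /* keep one byte for the terminator */
    strncat(dst, src, room);
    return strlen(src) > room ? -1 : 0;
}
```

Returning a status from both helpers lets the caller choose between accepting a truncated string and treating truncation as an error.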

Rigorously Checking Error Returns

This item was covered previously in this chapter, but from the perspective of defensive programming it is one of the most important for maintaining internal consistency. Failing to check error returns from internal functions makes debugging very difficult, because isolating an error that has propagated through other functions is never easy.
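The following sketch shows what checking every error return can look like for a small parsing task. The functions parse_port and load_config are hypothetical names chosen for illustration; the point is that each way strtol can fail is tested where it happens, and the caller in turn checks and reports the failure instead of continuing with a bad value.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: parses a TCP port number from text.
 * Returns 0 and stores the port on success; returns -1 and leaves
 * *port untouched on any failure. */
int parse_port(const char *text, int *port)
{
    char *end = NULL;
    long value;

    assert(port != NULL);

    if (text == NULL) {
        return -1;
    }

    errno = 0;
    value = strtol(text, &end, 10);
    if (errno != 0) {
        return -1;                  /* out of range for long */
    }
    if (end == text || *end != '\0') {
        return -1;                  /* no digits, or trailing garbage */
    }
    if (value < 1 || value > 65535) {
        return -1;                  /* not a valid port number */
    }

    *port = (int)value;
    return 0;
}

/* Hypothetical caller: checks the return value and reports the failure
 * at the point where it is first seen. */
int load_config(const char *port_text, int *port_out)
{
    if (parse_port(port_text, port_out) != 0) {
        fprintf(stderr, "load_config: invalid port \"%s\"\n",
                port_text ? port_text : "(null)");
        return -1;
    }
    return 0;
}
```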

Rigorously Checking Input and Output Parameters

In addition to error returns, a function should validate both what it receives and what it provides. From an error propagation perspective, debugging is simpler when an error is identified where it first occurs rather than through its later effects. C provides the assert macro, which can be used to detect an erroneous condition and abort the program. Other less catastrophic mechanisms can be used as well. Consider the following example:

    int validateUser( char *username ) {
      int ret = -1;
      assert( username );
      // or...
      if ( username == NULL ) {
        printf("validateUser:  NULL received as input\n");
        return( -1 );
      }
      ...
      if ( ret == -1 ) {
        printf("validateUser:  Returning failure for %s\n", username );
      }
      return( ret );
    }

The function could assert if a NULL username was provided, or it could simply test the username and report the situation to the user in addition to returning an error. Later in the function, a failure is also reported just before the function exits, in case the caller does not check the return status.

Note that some languages, such as Eiffel, provide language features for this type of checking called interface contracts.

One additional point here is validating the internal consistency of a function or application. Reaching a switch default case that should be unreachable, or entering a nonexistent state within an internal state machine, should emit an error message immediately so that the fault is localized.
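A small state machine makes this concrete. The sketch below is illustrative only: the states, events, and the next_state function are hypothetical. The default case should never run, so if it does, the function reports the corrupt state at once rather than carrying on.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical connection states and events, for illustration only. */
typedef enum { ST_IDLE, ST_CONNECTED, ST_CLOSED } conn_state;
typedef enum { EV_CONNECT, EV_DISCONNECT } conn_event;

/* Applies ev to *state. Returns 0 on a valid transition and -1 otherwise;
 * *state is left unchanged on failure. */
int next_state(conn_state *state, conn_event ev)
{
    assert(state != NULL);

    switch (*state) {
    case ST_IDLE:
        if (ev == EV_CONNECT) { *state = ST_CONNECTED; return 0; }
        break;
    case ST_CONNECTED:
        if (ev == EV_DISCONNECT) { *state = ST_CLOSED; return 0; }
        break;
    case ST_CLOSED:
        break;                      /* no events are valid once closed */
    default:
        /* Not one of the declared states: report it immediately. */
        fprintf(stderr, "next_state: impossible state %d\n", (int)*state);
        return -1;
    }

    fprintf(stderr, "next_state: event %d not valid in state %d\n",
            (int)ev, (int)*state);
    return -1;
}
```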

Declaring String Arrays

A very simple point that is missed in many applications is the usable size of a declared string. If a developer declares a string of 10 bytes, only 9 bytes are actually usable, because 1 byte is needed for the null terminator. If the string fills up, it is then very easy to overrun the buffer.

A simple solution to this problem is shown in the following code:

    #define MAX_BUFFER_SIZE         10
    char myString[MAX_BUFFER_SIZE+1];

We declare the size that we want for our buffer as a symbolic constant (MAX_BUFFER_SIZE) and then when we declare our actual string, we simply use this constant and add one to it. We can now use the MAX_BUFFER_SIZE constant in our application to identify the maximum size of the buffer, but because we’ve added one to the buffer size, there is always room for the null terminator and a potential error is removed.
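One way the constant might then be used when filling the string is sketched below. The helper set_name is a hypothetical name; it copies at most MAX_BUFFER_SIZE characters and writes the terminator into the extra byte that the declaration reserved.

```c
#include <assert.h>
#include <string.h>

#define MAX_BUFFER_SIZE 10

/* Hypothetical helper: dst must have room for MAX_BUFFER_SIZE + 1 bytes.
 * Copies at most MAX_BUFFER_SIZE characters and always terminates. */
void set_name(char dst[MAX_BUFFER_SIZE + 1], const char *src)
{
    assert(dst != NULL && src != NULL);

    strncpy(dst, src, MAX_BUFFER_SIZE);
    dst[MAX_BUFFER_SIZE] = '\0';    /* the "+1" byte holds the terminator */
}
```

With this arrangement, a full-length value of MAX_BUFFER_SIZE characters still fits, which is what a reader of the constant's name would expect.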

Minimizing Protocol Feedback

Where possible, an application should minimize any feedback that identifies its version. If an external party can identify the version of your networked application, that party can then take advantage of any known exploits for it.

Consider the SSH protocol (secure shell). If an external hacker telnets to the SSH port on a host, the server immediately emits its version information, which can then be used to identify whether any exploits exist.

    telnet theirdomain.com 22
    Trying 192.168.1.1...
    Connected to theirdomain.com.
    Escape character is '^]'.
    SSH-2.01-OpenSSH_4.1.3p2

The purpose of this version information could be to identify the server to the client so that it knows if it’s compatible, or how to tailor the conversation for a given scenario. The issue is that the hacker then knows the version and can work to exploit it with known version-specific exploits.

This issue exists not only in SSH, but in NNTP, HTTP, SMTP, and many other Application layer protocols.

Not emitting version information won’t stop an external hacker trying to take advantage of an exploit, but it will make it more difficult for them because the information won’t be immediately available.
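For an application protocol of your own design, one possible approach is to separate the protocol version, which a client may genuinely need, from the implementation name and version, which it usually does not. The sketch below is purely illustrative: the EXAMPLE banner format, the constants, and the functions build_banner and banner_leaks_impl are all hypothetical and do not correspond to SSH or any real protocol.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical values, for illustration only. */
#define PROTO_VERSION  "2.0"            /* clients need this to interoperate */
#define IMPL_NAME      "ExampleServer"
#define IMPL_VERSION   "4.1.3"          /* useful mainly to an attacker */

/* Builds the greeting sent to a new client. When reveal_impl is zero,
 * only the protocol version is disclosed.
 * Returns 0 on success, -1 if the banner did not fit in out. */
int build_banner(char *out, size_t out_size, int reveal_impl)
{
    int n;

    assert(out != NULL && out_size > 0);

    if (reveal_impl) {
        n = snprintf(out, out_size, "EXAMPLE-%s-%s_%s\r\n",
                     PROTO_VERSION, IMPL_NAME, IMPL_VERSION);
    } else {
        n = snprintf(out, out_size, "EXAMPLE-%s\r\n", PROTO_VERSION);
    }
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}

/* Returns non-zero if a banner mentions the implementation version.
 * A check like this could run in a unit test for release builds. */
int banner_leaks_impl(const char *banner)
{
    assert(banner != NULL);
    return strstr(banner, IMPL_VERSION) != NULL;
}
```

Keeping the switch as a parameter lets a debug build stay verbose while a production build discloses only what clients need.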

Initializing All Variables

Although this seems to be a simple problem, it’s still very widespread. If a local variable isn’t initialized, its starting value is whatever happens to be in its memory location at the time. Consider the following contrived example:

    int validateUser( char *username ) {
      int validated;
      // Test the username with known users...
      // set validated to one if trusted
      return( validated );
    }

Because the validated variable isn’t initialized, the function can erroneously return a validated status regardless of whether the user was actually trusted. This is because a caller can arrange for the stack location that validated will occupy to hold a non-zero value, and then call the function knowing what it will return. This is, of course, a contrived example, but it illustrates how an application can manipulate other functions.
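A corrected version might look like the following sketch. The trusted_users table is a hypothetical stand-in for whatever user store the application really consults. The variable starts at zero, so every path that does not explicitly find a match returns "not validated".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical list of known users, for illustration only. */
static const char *const trusted_users[] = { "alice", "bob" };

/* Returns 1 if username is trusted, 0 otherwise. */
int validateUser(const char *username)
{
    int validated = 0;              /* deny by default */
    size_t i;

    if (username == NULL) {
        return 0;
    }

    for (i = 0; i < sizeof trusted_users / sizeof trusted_users[0]; i++) {
        if (strcmp(username, trusted_users[i]) == 0) {
            validated = 1;
            break;
        }
    }

    /* Check what the function provides: only 0 or 1 may leave it. */
    assert(validated == 0 || validated == 1);
    return validated;
}
```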

Enabling All Compiler Warnings

A final suggestion for developers is to always enable compiler warnings when building the application (or enable warnings for the interpreter, as is done with ‘perl -w’). Many problems can be found, such as a failure to initialize variables, in the compilation stage as long as warnings are enabled.
