Preventing Buffer Overruns

Preventing Buffer Overruns

The first line of defense is simply to write solid code! Although some aspects of writing secure code are a little arcane, preventing buffer overruns is mostly a matter of writing a robust application. Writing Solid Code (Microsoft Press, 1993), by Steve Maguire, is an excellent resource. Even if you're already a careful, experienced programmer, this book is still worth your time.

Always validate all your inputs the world outside your function should be treated as hostile and bent upon your destruction. Likewise, nothing about the function's internal implementation, nothing other than the function's expected inputs and output, should be accessible outside the function. I recently exchanged mail with a programmer who had written a function that looked like this:

void PrintLine(const char* msg) { char buf[255]; sprintf(buf, "Prefix %s suffix\n", msg);  }

When I asked him why he wasn't validating his inputs, he replied that he controlled all the code that called the function, he knew how long the buffer was, and he wasn't going to overflow it. Then I asked him what he thought might happen if someone else who wasn't that careful needed to maintain his code. Oh, he said. This type of construct is just asking for trouble functions should always fail gracefully, even if unexpected input is passed into the function.

Another interesting technique I learned from a programmer at Microsoft is something I think of as offensive programming. If a function takes an output buffer and a size argument, insert a statement like this:

#ifdef _DEBUG memset(dest, 'A', buflen); //buflen = size in bytes #endif

Then, when someone calls your function and manages to pass in a bad argument for the buffer length, their code will blow up. Assuming you're using the latest compiler, the problem will show up very quickly. I think this is a great way to embed testing inside the application and find bugs without relying on complete test coverage. You can accomplish the same effect with the extended variants of the Strsafe.h functions, which are covered later in this chapter.

Safe String Handling

String handling is the single largest source of buffer overruns, so a review of the commonly used functions is in order. Although I'm going to cover the single-byte versions, the same problems apply to the wide-character string-handling functions. To complicate matters even further, Windows systems support lstrcpy, lstrcat, and lstrcpyn, and the Windows shell contains similar functions, such as StrCpy, StrCat, and StrCpyN exported from Shlwapi.dll. Although the lstr family of calls varies a little in the details and the calls work with both single-byte and multibyte character sets depending on how an LPTSTR ends up being defined by the application, they suffer from the same problems as the more familiar ANSI versions. Once I've covered the classic functions, I'll show how the new strsafe functions are used.

strcpy

The strcpy function is inherently unsafe and should be used rarely, if at all. Let's take a look at the function declaration:

char *strcpy( char *strDestination, const char *strSource );

The number of ways that this function call can blow up is nearly unlimited. If either the destination or the source buffer is null, you end up in the exception handler. If the source buffer isn't null-terminated, the results are undefined, depending on how lucky you are about finding a random null byte. The greatest problem is that if the source string is longer than the destination buffer, an overflow occurs. This function can be used safely only in trivial cases, such as copying a fixed string into a buffer to prefix another string.

Here's some code that handles this function as safely as possible:

/ *This function shows how to use strcpy as safely as possible.*/ bool HandleInput(const char* input) { char buf[80]; if(input == NULL) { assert(false); return false; } //The strlen call will blow up if input isn't null-terminated. //Note that strlen and sizeof both return a size_t type, so the //comparison is valid in all cases. //Also note that checking to see if a size_t is larger than a //signed value can lead to errors more on this in Chapter 20 //on conducting a security code review. if(strlen(input) < sizeof(buf)) { //Everything checks out. strcpy(buf, input); } else { return false; } //Do more processing of buffer. return true; }

As you can see, this is quite a bit of error checking, and if the input string isn't null-terminated, the function will probably throw an exception. I've had programmers argue with me that they've checked dozens of uses of strcpy and that most of them were done safely. That may be the case, but if they always used safer functions, there would be a lower incidence of problems. Even if a programmer is careful, it's easy for the programmer to make mistakes with strcpy. I don't know about you, but I write enough bugs into my code without making it any easier on myself to add even more bugs. I know of several software projects in which strcpy was banned and the incidence of reported buffer overruns dropped significantly.

Consider placing the following into your common headers:

#define strcpy Unsafe_strcpy

This statement will cause any instances of strcpy to throw compiler errors. The new strsafe header will undefine functions like this for you, unless you set a #define STRSAFE_NO_DEPRECATE before including the header. I look at it as a safety matter I might not get tossed off my horse often, but I always wear a helmet in case I am. (Actually, I did get tossed off my horse in September 2001, and it's possible the helmet saved my life.) Likewise, if I use only safe string-handling functions, it's much less likely that an error on my part will become a catastrophic failure. If you eliminate strcpy from your code base, it's almost certain that you'll remove a few bugs along with it.

strncpy

The strncpy function is much safer than its cousin, but it also comes with a few problems. Here's the declaration:

char *strncpy( char *strDest, const char *strSource, size_t count );

The obvious problems are still that passing in a null or otherwise illegal pointer for source or destination will cause exceptions. Another possible way to make a mistake is for the count value to be incorrect. Note, however, that if the source buffer isn't null-terminated, the code won't fail. You might not anticipate the following problem: no guarantee exists that the destination buffer will be null-terminated. (The lstrcpyn function does guarantee this.) I also normally consider it a severe error if the user input passed in is longer than my buffers allow that's usually a sign that either I've screwed up or someone is trying to hack me. The strncpy function doesn't make it easy to determine whether the input buffer was too long. Let's take a look at a couple of examples.

Here's the first:

/*This function shows how to use strncpy. A better way to use strncpy will be shown next.*/ bool HandleInput_Strncpy1(const char* input) { char buf[80]; if(input == NULL) { assert(false); return false; } strncpy(buf, input, sizeof(buf) - 1); buf[sizeof(buf) - 1] = '\0'; //Do more processing of buffer. return true; }

This function will fail only if input or buf is an illegal pointer. You also need to pay attention to the use of the sizeof operator. If you use sizeof, you can change the buffer size in one place, and you won't end up having unexpected results 100 lines down. Moreover, you should always set the last character of the buffer to a null character. The problem here is that we're not sure whether the input was too long. The documentation on strncpy helpfully notes that no return value is reserved for an error. Some people are quite happy just to truncate the buffer and continue, thinking that some code farther down will catch the error. This is wrong. Don't do it! If you're going to end up throwing an error, do it as close as possible to the source of the problem. It makes debugging a lot easier when the error happens near the code that caused it. It's also more efficient why execute more instructions than you have to? Finally, the truncation might just happen in a place that causes unexpected results ranging from a security hole to user astonishment. (According to The Tao of Programming [Info Books, 1986], by Jeffrey James, user astonishment is always bad.) Take a look at the following code, which fixes this problem:

/*This function shows a better way to use strncpy. It assumes that input should be null-terminated.*/ bool HandleInput_Strncpy2(const char* input) { char buf[80]; if(input == NULL) { assert(false); return false; } buf[sizeof(buf) - 1] = '\0'; //Some advanced code scanning tools will flag this //as a problem best to place a comment or pragma //so that no one is surprised at seeing sizeof(buf) //and not sizeof(buf) 1. strncpy(buf, input, sizeof(buf)); if(buf[sizeof(buf) - 1] != '\0') { //Overflow! return false; } //Do more processing of buffer. return true; }

The HandleInput_Strncpy2 function is much more robust. The changes are that I set the last character to a null character first as a test and then allow strncpy to write the entire length of the buffer, not sizeof(buf) 1. Then I check for the overflow condition by testing to see whether the last character is still a null. A null is the only possible value we can use as a test; any other value could occur by coincidence.

sprintf

The sprintf function is right up there with strcpy in terms of the mischief it can cause. There is almost no way to use this function safely. Here's the declaration:

int sprintf( char *buffer, const char *format [, argument] ... );

Except in trivial cases, it isn't easy to verify that the buffer is long enough for the data before calling sprintf. Let's take a look at an example:

/* Example of incorrect use of sprintf */ bool SprintfLogError(int line, unsigned long err, char* msg) { char buf[132]; if(msg == NULL) { assert(false); return false; } //How many ways can sprintf fail??? sprintf(buf, "Error in line %d = %d - %s\n", line, err, msg); //Do more stuff such as logging the error to file //and displaying it to user. return true; }

How many ways can this function fail? If msg isn't null-terminated, Sprintf LogError will probably throw an exception. I've used 21 characters to format the error. The err argument can take up to 10 characters to display, and the line argument can take up to 11 characters. (Line numbers shouldn't be negative, but something could go wrong.) So it's safe to pass in only 89 characters for the msg string. Remembering the number of characters that can be used by the various format codes is difficult. The return from sprintf isn't a lot of help either. It tells you how many characters were written, so you could write code like this:

if(sprintf(buf, "Error in line %d = %d - %s\n", line, err, msg) >= sizeof(buf)) exit(-1);

There is no graceful recovery. You've overwritten who knows how many bytes with who knows what, and you might have just overwritten your exception handler pointer! You cannot use exception handling to mitigate a buffer overflow; your attacker can cause your exception-handling routines to do their work for them. The damage has already been done the game is over, and the attacker won. If you're determined to use sprintf, a nasty hack will allow you to do it safely. (I'm not going to show an example.) Open the NUL device for output with fopen and call fprintf and the return value from fprintf tells you how many bytes would be needed. You could then check that value against your buffer or even allocate as much as you need. The _output function underlies the entire printf family of calls, and it has considerable overhead. Calling _output twice just to format some characters into a buffer isn't efficient.

_snprintf

The _snprintf function is one of my favorites. It has the following declaration:

int _snprintf( char *buffer, size_t count, const char *format [, argument] ... );

You have all the flexibility of _sprintf, and it's safe to use. Here's an example:

/*Example of _snprintf usage*/ bool SnprintfLogError(int line, unsigned long err, char * msg) { char buf[132]; if(msg == NULL) { assert(false); return false; } //Make sure to leave room for the terminating null! //Remember the off-by-one exploit? if(_snprintf(buf, sizeof(buf)-1,  "Error in line %d = %d - %s\n", line, err, msg) < 0) { //Overflow! return false; } else { buf[sizeof(buf)-1] = '\0'; } //Do more stuff, such as logging the error to a file //and displaying it to user. return true; }

It seems that you must worry about something no matter which of these functions you use: _snprintf doesn't guarantee that the destination buffer is null-terminated at least not as it's implemented in the Microsoft C run-time library so you have to check that yourself. To make matters even worse, this function wasn't part of the C standard until the ISO C99 standard was adopted. Because _snprintf is a nonstandard function, which is why it starts with an underscore, four behaviors are possible if you're concerned about writing cross-platform code. It can return a negative number if the buffer was too small, it can return the number of bytes that it should have written, and it might or might not null-terminate the buffer. If you're concerned about writing portable code, it is usually best to write a macro or wrapper function to check for errors that will isolate the differences from the main line of code. Other than remembering to write portable code, just remember to specify the character count as one less than the buffer size to always allow room for the trailing null character, and always null-terminate the last character of the buffer.

Concatenating strings can be unsafe using the more traditional functions. Like strcpy, strcat is unsafe except in trivial cases, and strncat is difficult to use because the length specifier is the amount of room remaining in the buffer, not the actual size of the buffer. Using _snprintf makes concatenating strings easy and safe. As a result of a debate I had with one of my developers, I once tested the performance difference between _snprintf and strncpy followed by strncat. It isn't substantial unless you're in a tight loop doing thousands of operations.

Standard Template Library Strings

One of the coolest aspects of writing C++ code is using the Standard Template Library (STL). The STL has saved me a lot of time and made me much more efficient. My earlier complaint about there not being a native string type in C is now answered. A native string type is available in C++. Here's an example:

/*Example of STL string type*/ #include <string> using namespace std; void HandleInput_STL(const char* input) { string str1, str2; //Use this form if you're sure that the input is null-terminated. str1 = input; //If you're not sure whether input is null-terminated, you can //do the following: str2.append(input, 132); // 132 == max characters to copy in //Do more processing here. //Here's how to get the string back. printf("%s\n", str2.c_str()); }

I can't think of anything easier than this! If you want to concatenate two strings, it's as simple as

string s1, s2; s1 = "foo"; s2 = "bar" //Now s1 = "foobar" s1 += s2;

The STL also has several really useful member functions you can use to find characters and strings within another string and truncate the string. It comes in a wide-character version too. Microsoft Foundation Classes (MFC) CStrings work almost exactly the same way. The only real caveat I need to point out about using the STL is that it can throw exceptions under low-memory conditions or if you encounter errors. For example, assigning a NULL pointer to an STL string will land you in the exception handler. This can be somewhat annoying. For example, inet_ntoa takes a binary Internet address and returns the string version. If the function fails, you get back a NULL.

On the other hand, a large server application at Microsoft recently used a string class for all strings. An expensive and thorough code review by a well-respected consulting company failed to find even a single buffer overrun in the code where the string handling was done by a string class. It's also possible to take advantage of object typing to declare a wrapper over a string named User Input. Now any place in your app where you see a UserInput object referenced, you know exactly what you're dealing with and know to handle it with care.

gets and fgets

A chapter on unsafe string handling wouldn't be complete without a mention of gets. The gets function is defined as

char *gets( char *buffer );

This function is just a disaster waiting to happen. It's going to read from the stdin stream until it gets a linefeed or carriage return. There's no way to know whether it's going to overflow the buffer. Don't use gets use fgets or a C++ stream object instead.

Using Strsafe.h

During the Windows Security Push conducted during the early part of 2002, we realized that the existing string-handling functions all have some problem or another and we wanted a standard library that we could start using on our internal applications. We thought that the following properties (excerpted from the SDK documentation) were desirable:

  • The size of the destination buffer is always provided to the function to ensure that the function does not write past the end of the buffer.

  • Buffers are guaranteed to be null-terminated, even if the operation truncates the intended result.

  • All functions return an HRESULT, with only one possible success code (S_OK).

  • Each function is available in a corresponding character count (cch) or byte count (cb) version.

  • Most functions have an extended ( Ex ) version available for advanced functionality.

    NOTE
    You can find a copy of Strsafe.h in the companion content in the folder Secureco2\Strsafe.

Let's consider why each of these requirements is important. First, we'd always like to know the size of the buffer. This is readily available by using sizeof or msize. One of the most common problems with functions like strncat is that people don't always do their math properly always taking the total buffer size gets us out of all those confusing calculations. Always null-terminating buffers is just general goodness why the original functions don't do this is something I can't understand. Next, we have a number of possible results. Maybe we truncated the string, or maybe one of the source pointers was null. With the normal library functions, this is hard to determine. Note the gyrations we go through to safely use strncpy. As I pointed out previously, truncating the input is normally a serious failure now we can tell for sure what the problem was.

One of the next most common problems, especially if you're dealing with mixed Unicode and ANSI strings, is that people mistakenly think that the size of the buffer in bytes is the same as the size in characters. To overcome this, all the strsafe functions come in two flavors: number of bytes and number of characters. One cool feature is that you can define which of the two you want to allow in your code. If you'd like to standardize using one or the other, set STRSAFE_NO_CB_FUNCTIONS or STRSAFE_NO_CCH_FUNCTIONS (but obviously not both).

Next, there are extended functions that do nearly anything you can think of. Let's take a look at some of the available flags:

  • STRSAFE_FILL_BEHIND_NULL

    Sets a fill character that pads out the rest of the available buffer. This is great for testing your callers to check whether the buffer is really as large as they claim.

  • STRSAFE_IGNORE_NULLS

    Treats a null input pointer as an empty string. Use this to replace calls like lstrcpy.

  • STRSAFE_FILL_ON_FAILURE

    Fills the output buffer if the function fails.

  • STRSAFE_NULL_ON_FAILURE

    Sets the output buffer to the null string ( ) if the function fails.

  • STRSAFE_NO_TRUNCATION

    Treats truncation as a fatal error. Combine this with one of the two flags listed above.

The extended functions do incur a performance hit. I'd tend to use them in debug code to force errors to show up and when I absolutely need the extra functionality. They also have some other convenient features, like outputting the number of characters (or bytes) remaining in the buffer and providing a pointer to the current end of the string.

Here's one of the best features of Strsafe.h: unless you define STRSAFE_NO_DEPRECATE, all those nasty old unsafe functions will now throw compiler errors! The only caution I have is that doing this on a large code base late in a development cycle will cause a lot of thrash and possibly destabilize your app. If you're going to get rid of all the old functions, it's probably best to do it early in a release cycle. On the other hand, I'm more afraid of security bugs than any other kind of bug, so prioritize your risks as you think appropriate. See http://msdn.microsoft.com/library/en-us/winui/winui/windowsuserinterface/resources/strings/usingstrsafefunctions.asp for full details and a place you can download this update.

The following code samples show a before and after scenario, converting C run-time code to use strsafe:

// CRT code utterly unsafe void UnsafeFunc(LPTSTR szPath,DWORD cchPath) { TCHAR szCWD[MAX_PATH]; GetCurrentDirectory(ARRAYSIZE(szCWD), szCWD); strncpy(szPath, szCWD, cchPath); strncat(szPath, TEXT("\\"), cchPath); strncat(szPath, TEXT("desktop.ini"),cchPath); }

// Safer strsafe code bool SaferFunc(LPTSTR szPath,DWORD cchPath) { TCHAR szCWD[MAX_PATH]; if (GetCurrentDirectory(ARRAYSIZE(szCWD), szCWD) && SUCCEEDED(StringCchCopy(szPath, cchPath, szCWD)) && SUCCEEDED(StringCchCat(szPath, cchPath, TEXT("\\"))) && SUCCEEDED(StringCchCat(szPath, cchPath, TEXT("desktop.ini")))) { return true; } return false; }

A Word of Caution About String-Handling Functions

Safer string-handling functions, such as those offered by strsafe, still require you to engage the gray matter. Take a look at the following strsafe code fragment. Can you spot the flaw?

char buff1[N1]; char buff2[N2]; HRESULT h1 = StringCchCat(buff1, ARRAYSIZE(buff1), szData); HRESULT h2 = StringCchCat(buff2, ARRAYSIZE(buff1), szData);

Look at the second argument to both calls to StringCchCat. The second call is incorrect. It is populating the buff2 variable, based on the size of buff1. The corrected code should read

char buff1[N1]; char buff2[N2]; HRESULT h1 = StringCchCat(buff1, ARRAYSIZE(buff1), szData); HRESULT h2 = StringCchCat(buff2, ARRAYSIZE(buff2), szData);

The same applies to the n versions of the C run-time functions. Michael and I often joke about spending a month converting all calls to strcpy and strcat to strncpy and strncat, respectively, and then spending the next month fixing the bugs because of the massive code change. What's wrong with this code?

#define MAXSTRLEN(s) (sizeof(s)/sizeof(s[0])) if (bstrURL != NULL) { WCHAR szTmp[MAX_PATH]; LPCWSTR szExtSrc; LPWSTR szExtDst; wcsncpy( szTmp, bstrURL, MAXSTRLEN(szTmp) ); szTmp[MAXSTRLEN(szTmp)-1] = 0; szExtSrc = wcsrchr( bstrURL, '.' ); szExtDst = wcsrchr( szTmp , '.' ); if(szExtDst) { szExtDst[0] = 0; if(IsDesktop()) { wcsncat( szTmp, L"__DESKTOP", MAXSTRLEN(szTmp) ); wcsncat( szTmp, szExtSrc , MAXSTRLEN(szTmp) );

The code looks fine, but it's a buffer overrun waiting to happen. The problem is the last argument to the string concatenation functions. The argument should be, at most, the amount of space left in the szTmp buffer, but it is not. The code always passes in the total size of the buffer; however, the effective size of szTmp is shrinking as data is added by the code.



Writing Secure Code
Writing Secure Code, Second Edition
ISBN: 0735617228
EAN: 2147483647
Year: 2001
Pages: 286

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net