All C/C++ String Buffers Annotated with SAL | Writing Secure Code for Windows Vista (Best Practices (Microsoft))

All C/C++ String Buffers Annotated with SAL

The goal of the Standard Annotation Language (SAL) is to enable programmers to explicitly state the contracts between implementations (callees) and clients (callers) that are implicit in the C and C++ source code. The main benefit of SAL is that, for some upfront work, you can find more bugs in your code. We have found that the process of adding SAL annotations to existing code can also find bugs, because the developer questions the assumptions previously made about how the function being annotated works. By this we mean that as a developer adds annotations to a function, she must think about how the function works in more detail than simply assuming it was written correctly. This process exposes flawed assumptions.

Important

You should know that SAL comes in two flavors: one is called declspec syntax, and the other is attribute syntax. Windows Vista uses the former, and the examples in this book also follow the declspec syntax. However, Microsoft Visual Studio 2005 does support both syntaxes.

When you annotate a function with SAL, any code that calls that function will get the benefit of the annotation. To this end, we have annotated the majority of C runtime functions included with Visual Studio 2005 and the Windows functions included in the Windows Vista Software Development Kit. This means you’ll get the benefit of the annotations added by Microsoft, and you might find bugs lurking in your code.

SAL defines proper use of buffers. A C/C++ pointer can be used to represent the address of a single object, or an array of objects. Sometimes the size is known at compile time, and sometimes it’s only known at run time. Because C/C++ pointer types are overloaded, you can’t rely on the type system to help you program with buffers properly! That’s why we have SAL.

The Windows Vista security quality gate stated that “all mutable non-constant string buffers for new features” must be annotated with SAL. In reality, this translated into all functions that read from or write to a buffer.

SAL by Example

Probably the best way to explain SAL is by showing an example. Let’s say you have a C/C++ function:

    void FillString(
        TCHAR* buf,
        size_t cchBuf,
        TCHAR ch)
    {
        for (size_t i = 0; i < cchBuf; i++)
            buf[i] = ch;
    }

We won’t insult your intelligence by explaining what the function does, but what makes this code interesting is that two of the arguments, buf and cchBuf, are closely linked: buf should be at least cchBuf characters long. If buf is not as big as cchBuf claims it is, then the FillString function could overflow the buf buffer. If you compile the code below with Visual Studio 2005 at warning level 4 (/W4), you’ll see no warnings and no errors, yet the code clearly contains a buffer overrun vulnerability:

    TCHAR *b = (TCHAR*)malloc(200*sizeof(TCHAR));
    FillString(b,210,'x');

What SAL does is allow a C or C++ developer to inform the compiler of the relationship between the two arguments by using syntax such as this:

    void FillString(
        __out_ecount(cchBuf) TCHAR* buf,
        size_t cchBuf,
        TCHAR ch)
    {
        for (size_t i = 0; i < cchBuf; i++)
            buf[i] = ch;
    }

Note the use of __out_ecount(n) just before buf in the argument list. This is a macro that wraps some very low-level SAL constructs you should never have to worry about, but in essence __out_ecount(n) means this:

“buf is an out parameter, which means it will be written to by the callee, and buf cannot be NULL. The length of buf is ‘n’ elements, in this case cchBuf TCHARs.”

When this code is compiled with a Microsoft C++ compiler that has the /analyze option, such as the version shipped with Visual Studio 2005 Team Suite or any of the Visual Studio Team System products, or cl.exe included with the Windows Vista Software Development Kit, you’ll see output like this:

    c:\code\saltest\saltest.cpp(54) : warning C6203: Buffer overrun for non-stack buffer 'b' in call to 'FillString': length '420' exceeds buffer size '400'
    c:\code\saltest\saltest.cpp(54) : warning C6386: Buffer overrun: accessing 'argument 1', the writable size is '200*2' bytes, but '420' bytes might be written: Lines: 53, 54
    c:\code\saltest\saltest.cpp(54) : warning C6387: 'argument 1' might be '0': this does not adhere to the specification for the function 'FillString': Lines: 53, 54
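One way to address the size mismatch and the possible NULL pointer reported in these warnings is to let a single variable drive both the allocation and the count passed to FillString. The following is a minimal sketch under stated assumptions, not code from the Windows headers: wchar_t stands in for TCHAR, an empty fallback definition of __out_ecount is supplied so the code builds with compilers that have no sal.h, and the helper name MakeFilledBuffer is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <wchar.h>

#if defined(_MSC_VER)
#include <sal.h>              /* real SAL macros with a Microsoft compiler */
#endif
#ifndef __out_ecount
#define __out_ecount(n)       /* assumption: no-op fallback elsewhere */
#endif

/* Same function as in the text, with wchar_t standing in for TCHAR. */
void FillString(__out_ecount(cchBuf) wchar_t *buf, size_t cchBuf, wchar_t ch)
{
    assert(buf != NULL);      /* the annotation says buf cannot be NULL */
    for (size_t i = 0; i < cchBuf; i++)
        buf[i] = ch;
}

/* Hypothetical helper: cch sizes the allocation AND is the count
   handed to FillString, so the two values cannot drift apart. */
wchar_t *MakeFilledBuffer(size_t cch, wchar_t ch)
{
    if (cch > SIZE_MAX / sizeof(wchar_t))   /* guard the multiplication */
        return NULL;

    wchar_t *b = (wchar_t *)malloc(cch * sizeof(wchar_t));
    if (b == NULL)                          /* malloc can fail */
        return NULL;

    FillString(b, cch, ch);
    return b;   /* caller frees; the buffer is not NUL-terminated */
}
```

A caller would write something like `wchar_t *b = MakeFilledBuffer(200, L'x');` and release the result with free().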

There are many other SAL macros, including:

__in

The function using __in will only read from the single-element buffer, and the buffer must be initialized (not NULL); as such __in is exactly the same as __in_ecount(1) and __in is implied if the argument is a const. In fact, __in is somewhat redundant, and it’s better to use const because the compiler can perform better optimizations in some cases.

The following function prototype shows how you can use __in:

 BOOL AddElement(__in ELEMENT *pElement);

__out

The function using __out fills a valid (not NULL) buffer, and the buffer can be dereferenced by the calling code. The following function prototype shows how you can use __out.

    BOOL GetFileVersion(
        LPCWSTR lpsFile,
        __out FILE_VERSION *pVersion);

__in_opt

The function using __in_opt expects an optional buffer, meaning the buffer can be NULL. The following code shows how you could use __in_opt. In this example, if szMachineName is NULL, the code will return operating system information about the local computer.

    BOOL GetOsType(
        __in_opt char *szMachineName,
        __out MACHINE_INFO *pMachineInfo);
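To make the "optional" contract concrete, here is a hedged sketch of how a function body might honor __in_opt. Everything beyond the prototype is an assumption made for illustration: the MACHINE_INFO layout is invented, int replaces BOOL, the parameter is made const, the "(local)" placeholder is arbitrary, and the SAL macros fall back to empty definitions when sal.h is unavailable.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#if defined(_MSC_VER)
#include <sal.h>
#endif
#ifndef __in_opt
#define __in_opt              /* assumption: no-op fallback */
#endif
#ifndef __out
#define __out                 /* assumption: no-op fallback */
#endif

/* Hypothetical structure; the real MACHINE_INFO is not shown in the text. */
typedef struct {
    char szName[64];
    int  fIsLocal;
} MACHINE_INFO;

/* Returns 1 on success. szMachineName may be NULL (the "opt" part);
   pMachineInfo must not be NULL because it is a plain __out parameter. */
int GetOsType(__in_opt const char *szMachineName,
              __out MACHINE_INFO *pMachineInfo)
{
    assert(pMachineInfo != NULL);

    /* The callee, not the caller, is responsible for the NULL check. */
    const char *szName = (szMachineName != NULL) ? szMachineName : "(local)";

    pMachineInfo->fIsLocal = (szMachineName == NULL);
    snprintf(pMachineInfo->szName, sizeof pMachineInfo->szName, "%s", szName);
    return 1;
}
```

The point of the sketch is the division of labor: because the parameter is marked optional, callers may pass NULL freely and the function body must test for it before dereferencing.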

__inout

The function using __inout expects a readable and writeable buffer, and the buffer must be initialized by the caller. Here is some sample code that shows how you might use __inout.

    size_t EncodeStream(
        __in HANDLE hStream,
        __inout STREAM *pStream);

__inout_bcount_full(n)

The function expects a buffer that is n bytes long and fully initialized on both entry and exit. Note the use of bcount rather than ecount: “b” means bytes and “e” means elements; for example, a Unicode string in Windows that is 12 characters (elements, in SAL parlance) long is 24 bytes long. There is an element variant too: __inout_ecount_full(n). The following code example takes a BYTE * that points to a buffer to switch from big-endian format to little-endian format, so it makes sense that the buffer is fully initialized on entry and remains fully initialized on exit. You’ll also see another SAL macro in the function prototype, __out_opt, which means the function writes to the argument, but the argument can be NULL. If the exception pointer is NULL, the function will not return exception data to the caller.

    void ConvertToLittleEndian(
        __inout_bcount_full(cbInteger) BYTE *pbInteger,
        DWORD cbInteger,
        __out_opt EXCEPTION *pException);
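A possible body for this prototype is sketched below, assuming that the conversion is a simple in-place byte reversal. The EXCEPTION structure and its code field are hypothetical, BYTE and DWORD are declared locally instead of coming from windows.h, and the SAL macros fall back to empty definitions when sal.h is not available.

```c
#include <assert.h>
#include <stddef.h>

#if defined(_MSC_VER)
#include <sal.h>
#endif
#ifndef __inout_bcount_full
#define __inout_bcount_full(n)   /* assumption: no-op fallback */
#endif
#ifndef __out_opt
#define __out_opt                /* assumption: no-op fallback */
#endif

typedef unsigned char BYTE;      /* normally provided by windows.h */
typedef unsigned long DWORD;
typedef struct { int code; } EXCEPTION;   /* hypothetical layout */

void ConvertToLittleEndian(__inout_bcount_full(cbInteger) BYTE *pbInteger,
                           DWORD cbInteger,
                           __out_opt EXCEPTION *pException)
{
    assert(pbInteger != NULL);

    /* __out_opt: write the result only if the caller asked for it. */
    if (pException != NULL)
        pException->code = 0;

    if (cbInteger < 2)
        return;                  /* nothing to swap */

    /* Reverse the bytes in place. All cbInteger bytes are read and all
       remain initialized afterwards, which is what "_full" promises. */
    DWORD lo = 0, hi = cbInteger - 1;
    while (lo < hi) {
        BYTE tmp = pbInteger[lo];
        pbInteger[lo++] = pbInteger[hi];
        pbInteger[hi--] = tmp;
    }
}
```

Calling the function with the bytes 12 34 56 78 leaves 78 56 34 12 in the same buffer, whether or not an EXCEPTION pointer is supplied.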

__inout_bcount_part(n,m)

The __inout_bcount_part(n,m) annotation is a variant of __inout_bcount_full, but rather than initializing the entire buffer, the code will only fill up to m bytes in the destination buffer. The following function prototype implies that Read() will copy no more than cbCount bytes into pBuff, and the actual number of bytes is reflected in pcbHowMuchRead. Note that the function returns a single size_t value through the last argument, unless that argument is NULL (the “opt” part of the SAL macro name).

    HRESULT Read(
        __inout_bcount_part(cbCount, *pcbHowMuchRead) LPVOID pBuff,
        __in size_t cbCount,
        __out_ecount_opt(1) size_t *pcbHowMuchRead);

There is an element variant too: __inout_ecount_part(n,m).
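The partial-fill contract can be illustrated with a small sketch. This is not the Read() function from the text: the name ReadFrom, the extra pSrc/cbSrc parameters that model the data source as an in-memory block, and the int return value (0 for success) are all assumptions made so the example is self-contained. The SAL macros fall back to empty definitions when sal.h is unavailable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#if defined(_MSC_VER)
#include <sal.h>
#endif
#ifndef __in_bcount
#define __in_bcount(n)               /* assumption: no-op fallback */
#endif
#ifndef __inout_bcount_part
#define __inout_bcount_part(n, m)    /* assumption: no-op fallback */
#endif
#ifndef __out_ecount_opt
#define __out_ecount_opt(n)          /* assumption: no-op fallback */
#endif

/* Hypothetical variant of Read(): copies at most cbCount bytes from an
   in-memory source into pBuff and reports how many bytes were copied. */
int ReadFrom(__in_bcount(cbSrc) const void *pSrc,
             size_t cbSrc,
             __inout_bcount_part(cbCount, *pcbHowMuchRead) void *pBuff,
             size_t cbCount,
             __out_ecount_opt(1) size_t *pcbHowMuchRead)
{
    assert(pSrc != NULL && pBuff != NULL);

    /* Never write more than the destination capacity (n = cbCount). */
    size_t cbCopy = (cbSrc < cbCount) ? cbSrc : cbCount;
    memcpy(pBuff, pSrc, cbCopy);

    /* The count is optional, so test it before writing through it. */
    if (pcbHowMuchRead != NULL)
        *pcbHowMuchRead = cbCopy;    /* m = bytes actually filled */

    return 0;
}
```

Only the first cbCopy bytes of the destination are touched; the remainder of the buffer keeps whatever the caller put there, which is the difference between the "part" and "full" annotations.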

__deref_out_bcount(n)

When dereferenced, the function argument marked with __deref_out_bcount(n) will be set to an uninitialized buffer of n bytes. In other words, *p is initialized, but **p is not.

    HRESULT StringCbAlloc(
        size_t cb,
        __deref_out_bcount(cb) char **ppsz)
    {
        *ppsz = (char*)LocalAlloc(LPTR, cb);
        return *ppsz ? S_OK : E_OUTOFMEMORY;
    }

There is an element variant too: __deref_out_ecount(n).
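From the caller's side, the double-pointer contract might be used as in the following sketch. It is a portable approximation rather than the Windows code above: malloc replaces LocalAlloc, an int status (0 for success) replaces HRESULT, the helper DuplicateString is hypothetical, and __deref_out_bcount falls back to an empty definition when sal.h is unavailable.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#if defined(_MSC_VER)
#include <sal.h>
#endif
#ifndef __deref_out_bcount
#define __deref_out_bcount(n)    /* assumption: no-op fallback */
#endif

/* Portable stand-in for the function in the text: on success *ppsz
   points at cb bytes whose contents are NOT initialized. */
int StringCbAlloc(size_t cb, __deref_out_bcount(cb) char **ppsz)
{
    assert(ppsz != NULL);
    *ppsz = (char *)malloc(cb);
    return (*ppsz != NULL) ? 0 : -1;
}

/* Hypothetical caller: the pointer is valid after a successful call,
   but the bytes behind it must be written before they are read. */
char *DuplicateString(const char *szSrc)
{
    assert(szSrc != NULL);

    size_t cb = strlen(szSrc) + 1;   /* byte count, including the NUL */
    char *psz = NULL;

    if (StringCbAlloc(cb, &psz) != 0)
        return NULL;

    memcpy(psz, szSrc, cb);          /* initialize all cb bytes */
    return psz;                      /* caller frees */
}
```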

SAL’s usefulness extends beyond function arguments. It can also be used to detect errors on function return. If you look closely at the list of warnings earlier in this document, you’ll notice a third one:

 c:\code\saltest\saltest.cpp(54) : warning C6387: 'argument 1' might be '0': this does not adhere to the specification for the function 'FillString': Lines: 53, 54

This bug really has little to do with any function argument; rather, it occurs because the code calls malloc() but does not check that the return value from malloc() is non-NULL. If you look at the function prototype for malloc() in malloc.h in Visual Studio 2005 or the Windows Vista Software Development Kit, you’ll see this:

 __checkReturn __bcount_opt(_Size) void *__cdecl malloc(__in size_t _Size);

Because the return value from malloc() could be NULL (because the function may fail), we use a __bcount_opt(n) SAL macro (note the use of opt in the macro name). If we change the code that calls malloc() to check that the return value is non-NULL prior to calling FillString, the C6387 warning goes away. Don’t confuse an optional NULL return value with __checkReturn; the latter detects whether you ignored the result altogether. For example:

    size_t cb = 10 * 12;
    malloc(cb);

This code will yield this warning when compiled with /analyze:

 c:\code\saltest\saltest.cpp(30) : warning C6031: Return value ignored: 'malloc'
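A call site that satisfies both contracts keeps the return value and tests it before use. The sketch below shows one way to do that; the helper name AllocZeroed is hypothetical, and the comments describe the intent of each step rather than guaranteed analyzer output.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocates cb bytes and zeroes them. */
unsigned char *AllocZeroed(size_t cb)
{
    assert(cb > 0);     /* a zero-byte request is treated as a caller bug */

    /* The result is stored, so it is not ignored (__checkReturn). */
    unsigned char *pb = (unsigned char *)malloc(cb);

    /* The result is tested, because the "opt" in __bcount_opt means
       the returned pointer may be NULL. */
    if (pb == NULL)
        return NULL;

    memset(pb, 0, cb);  /* safe: pb is known to be non-NULL here */
    return pb;          /* caller frees */
}
```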

How to Use SAL in Existing Code

This section describes the recommended work flow for adding SAL annotations to existing code.

1. Make sure you have a solid code baseline that compiles cleanly.
2. Determine ahead of time which SAL annotations you are going to use. For example, are you going to focus on annotating read and write buffers, or write buffers exclusively?
3. Make sure to #include "sal.h" in your code.
4. Annotate the necessary function prototypes in your headers.

A great SAL documentation source is the comment block at the start of the sal.h header file, as well as the SAL annotations already in the Windows Vista Software Development Kit headers and the C runtime headers included with Visual Studio 2005.

In addition to using SAL, proper use of Hungarian notation (Simonyi 1999) around string buffers can help prevent problems. If you know that cbCount is a count of bytes, and cchCount is a count of characters, this can be helpful in spotting problems during code review. While it’s possible to go overboard with Hungarian notation, using it with buffer size variables will save you a lot of trouble.
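As an illustration of how the cb and cch prefixes keep byte counts and character counts apart, consider the following sketch. The function CopyWide is hypothetical and uses wchar_t so that a character is wider than a byte; its return convention (1 for success, 0 if the string does not fit) is also an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: copies a NUL-terminated wide string into a
   destination whose capacity is given in CHARACTERS (cchDest). */
int CopyWide(wchar_t *pszDest, size_t cchDest, const wchar_t *pszSrc)
{
    assert(pszDest != NULL && pszSrc != NULL);

    size_t cchSrc = wcslen(pszSrc);          /* cch: count of characters */
    if (cchSrc + 1 > cchDest)
        return 0;                            /* compare cch with cch */

    /* cb: count of bytes. Only memcpy needs the byte count, and the
       prefix makes the unit conversion visible at the point of use. */
    size_t cbCopy = (cchSrc + 1) * sizeof(wchar_t);
    memcpy(pszDest, pszSrc, cbCopy);
    return 1;
}
```

During code review, an expression such as `cchSrc + 1 > cbDest` would stand out immediately because the prefixes do not match.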