General-Purpose Annotations


General-purpose annotations tell PREfast how information flows between a caller and the function that is being called. They describe the direction of the flow and supply size and type information that PREfast can check to detect potential buffer overflows.

Inside Out 

You can use the general-purpose annotations in both driver and nondriver code. General-purpose annotations are defined in Specstrings.h and described with extensive comments in Specstrings_strict.h. Both files are in %wdk%\inc\api.

This section provides guidelines and examples for using the general-purpose annotations and modifiers listed in Table 23-1.

Table 23-1: General-Purpose Annotations

Annotation                                                            Usage
--------------------------------------------------------------------  ---------------------------
__in, __out, __inout                                                  Input and output parameters
_opt, _deref                                                          Annotation modifiers
_ecount(size), _bcount(size)                                          Parameter size
_full(size), _part(size, length)                                      Partial parameter size
__nullterminated, __nullnullterminated, __possibly_notnullterminated  String annotations
__reserved                                                            Reserved parameters
__checkReturn                                                         Function return value

Inside Out 

PREfast does not interpret certain annotations such as __fallthrough, but these can still be useful when they are applied as comments in code. See the comments in %wdk%\inc\api\Specstrings_strict.h for a complete list of annotations.

 Primitive versus composite annotations  Annotations can be either primitive or composite. A composite annotation is composed of two or more primitive annotations and other composite annotations. This chapter explains some annotations in terms of primitive annotations; however, you should choose composite annotations instead of primitive annotations whenever possible because composite annotations are more resilient to future changes.

Important 

For simplicity, the examples in this chapter show standard functions that are not completely annotated. You should apply all appropriate annotations to your code, as described in this chapter, to ensure that your functions are completely annotated and can be fully analyzed by PREfast. Do not limit the annotations in your source code to only those shown in these examples.

Input and Output Parameter Annotations

C passes parameters by value, so parameters themselves are always input parameters to a function. However, because C can pass a pointer by value, it is impossible to tell just by looking at the function prototype whether the value that is passed by using a pointer is intended as input to the function, output from the function, or both input and output.

The __in, __out, and __inout annotations enable PREfast to check parameters that have these annotations and report any errors if it finds that uninitialized values are used incorrectly:

  • If a parameter is marked __in, then PREfast checks to be sure that the parameter is initialized before the call.

    You can also annotate scalar parameters such as integers or enumerated types with __in. The __in annotation is optional for scalar parameters, but it helps to make code more consistent and readable.

  • If a parameter is marked __out, then PREfast does not check that it is initialized before the call, but assumes that it is initialized after the call and thus is safe to use as an __in parameter to a subsequent function.

  • If a parameter is marked __inout, then PREfast checks to be sure that the parameter is initialized before the call and assumes that it is safe to use as an __in parameter to a subsequent function.
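The three cases above can be sketched in a few lines of C. This is a minimal illustration, not WDK code: the annotation macros are defined here as empty stand-ins so that the fragment compiles without Specstrings.h, and the function names (get_default_timeout, double_timeout, timeout_is_valid, configure_timeout) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: empty stand-ins so the sketch compiles without Specstrings.h. */
#define __in
#define __out
#define __inout

/* __out: the caller may pass an uninitialized int; the function sets it. */
void get_default_timeout(__out int *timeout)
{
    assert(timeout != NULL);
    *timeout = 30;
}

/* __inout: the value must be initialized on entry and is updated in place. */
void double_timeout(__inout int *timeout)
{
    assert(timeout != NULL);
    *timeout *= 2;
}

/* __in: the value must be initialized on entry and is only read. */
int timeout_is_valid(__in const int *timeout)
{
    assert(timeout != NULL);
    return *timeout > 0;
}

/* Typical sequence: an __out call initializes the variable, after which
   it is safe to pass as an __inout or __in argument. */
int configure_timeout(void)
{
    int timeout;                    /* uninitialized: acceptable only for __out */
    get_default_timeout(&timeout);  /* considered initialized after this call */
    double_timeout(&timeout);
    return timeout_is_valid(&timeout) ? timeout : -1;
}
```

If the call to get_default_timeout were removed, the uninitialized variable would reach an __inout parameter, which is the kind of misuse these annotations are meant to expose.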

Tip 

The __in annotation is typically all that is needed for opaque types, including KMDF handles such as WDFDEVICE.

The __in and __out Contract

The term "initialized" is used informally when discussing __in and __out, but these annotations actually describe what makes the annotated parameters valid in relation to pre-state and post-state:

  • "Valid" means that all levels of the data structure are initialized and that pointers at all levels of dereference except for the last pointer are non-NULL values, unless the parameter has been explicitly annotated differently.

  • "Pre-state" refers to the state of the analysis just before the call is made. If a variable has been given a value in a prior assignment, then it has that value in the pre-state.

  • "Post-state" refers to the state just after the call has returned. For an __out parameter, the post-state is the one in which the parameter has a new value or a value at all.

Formally, __in means that the value that is being passed as an argument must be valid in pre-state and the function does not change it, and __out means that the function has contracted to return a valid value in post-state and that PREfast can ignore the pre-state.

In pre-state, __out implies that any intermediate pointers that lead to the final value to be modified are individually valid and not NULL, but the parameter that is annotated with __out is not required to be recursively valid in pre-state. For example, suppose you want to pass a pointer to a pointer to a structure s (that is, you want to pass **p). In pre-state, this requires that both p and *p be valid, non-NULL pointers. However, for an __out parameter, **p is not required to be valid; that is, in pre-state, **p is not required to be fully initialized.

For the __out annotation in post-state, the structure at **p must be valid (that is, the structure is expected to be filled in). The _opt modifier can be used to indicate whether any intermediate levels can be omitted, for example, if *p can be NULL.
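The pre-state and post-state requirements for a double pointer can be pictured with a small sketch. The __out macro below is an empty stand-in (an assumption, so the fragment builds without the WDK headers), and struct s and init_s are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: empty stand-in for the Specstrings.h macro. */
#define __out

struct s {
    int id;
    int flags;
};

/* Pre-state:  p and *p are valid, non-NULL pointers; **p may be garbage.
   Post-state: every field of **p has been given a value. */
void init_s(__out struct s **p)
{
    assert(p != NULL && *p != NULL);
    (*p)->id = 1;
    (*p)->flags = 0;
}
```

A caller can therefore pass the address of a pointer to storage that it has allocated but not yet filled in, and treat the structure as initialized once init_s returns.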

__inout means that the parameter must be valid in pre-state and valid in post-state and that both should be checked. It also means that any assumptions that PREfast has made about the value of the parameter, beyond simple validity, are no longer true in post-state. That is, the value is presumed to have changed.

As stated earlier in this chapter, PREfast analyzes both sides of the contract. The __in, __out, and __inout annotations are a clear illustration of that statement. For the function that is being analyzed (the callee), PREfast assumes that the pre-state is true at the entry point and checks that the post-state is achieved at the exit point. Specifically, the analysis of a function starts with the initial state of the parameters. PREfast assumes that any __in parameters are valid at the beginning of the function and checks that __out parameters are valid at the return from the function.

The following two examples show how PREfast analyzes contracts according to annotations. In the first example, the __in annotation in the following declaration for strlen causes PREfast to check that the s input parameter is initialized before any call to strlen:

 size_t strlen(__in PCSTR s); 

Another example of checking both sides of the contract is the __nullterminated annotation that is implicit in the string types such as PSTR, PWSTR and so on, as shown in the following example:

 PSTR substitute(
     __inout_ecount(len) PSTR str,
     __in int len,
     __in PSTR oldstr,
     __in PSTR newstr);

In this example, a function named substitute takes a string as input and substitutes all instances of oldstr with the value of newstr. The function never overruns len bytes, and it ensures that the resulting value of str is always null terminated. Assuming the substitute function does something, the __inout part of the annotation applied to str indicates that the value of str before and after the call is different, but it is valid both before and after the call. PREfast performs the following analysis of this contract:

  • For the caller part of the contract, PREfast checks that str, oldstr, and newstr are all null terminated at the point of the call, as far as can be determined statically. It also checks that the buffer at str is big enough for len bytes.

  • For the callee part of the contract, PREfast checks that no access is made past len bytes into str. PREfast assumes that str, oldstr, and newstr are null terminated. The critical check that PREfast makes is that the final result value of str is null terminated.
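To make the callee side of this contract concrete, here is one possible body for substitute. It is only a sketch under several assumptions: the source shows the declaration but not an implementation, plain char pointers replace the PSTR types, the annotation macros are empty stand-ins, and truncating the result when it does not fit in len characters is a choice made for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: empty stand-ins for the Specstrings.h macros. */
#define __in
#define __inout_ecount(size)

/* Replaces every occurrence of oldstr in str with newstr. Never writes more
   than len characters into str and always leaves str null terminated.
   Returns str, or NULL if the scratch buffer cannot be allocated. */
char *substitute(__inout_ecount(len) char *str, __in int len,
                 __in const char *oldstr, __in const char *newstr)
{
    assert(str != NULL && oldstr != NULL && newstr != NULL);
    size_t oldlen = strlen(oldstr);
    size_t newlen = strlen(newstr);
    if (len <= 0 || oldlen == 0)
        return str;                      /* nothing to do */

    char *tmp = malloc((size_t)len);     /* scratch copy of the result */
    if (tmp == NULL)
        return NULL;

    size_t cap = (size_t)len - 1;        /* keep one slot for the terminator */
    size_t out = 0;
    const char *in = str;
    while (*in != '\0' && out < cap) {
        if (strncmp(in, oldstr, oldlen) == 0) {
            size_t room = cap - out;
            size_t n = newlen < room ? newlen : room;
            memcpy(tmp + out, newstr, n);    /* truncated if it does not fit */
            out += n;
            in += oldlen;
        } else {
            tmp[out++] = *in++;
        }
    }
    tmp[out] = '\0';                     /* the critical post-condition */
    memcpy(str, tmp, out + 1);           /* out + 1 <= len, so no overrun */
    free(tmp);
    return str;
}
```

Both checks that the text describes are visible here: no write reaches past len characters, and the final assignment before the copy guarantees null termination.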

Most of the annotations described later in this chapter, including the IRQL, memory, and nonmemory resource driver annotations, have both caller and callee semantics to assure that both sides of the contract are met.

__in, __out, and __inout versus IN, OUT, and IN OUT

In general, you should replace all instances of IN, OUT, and IN OUT with __in, __out, and __inout, respectively. However, do not simply redefine these older macros in terms of the newer annotations.

Although the IN and OUT macros often appear in source code, they have never been given a value and are never validated by any tool or compiler, so they do not always reflect the actual usage of the parameters and could be incorrect. Therefore, these macros might be incorrectly used or placed in existing source code. You should review functions that use IN, OUT, and IN OUT and make sure to place the correct __in, __out, and __inout annotations in the appropriate locations.
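The following sketch shows why the older macros deserve a review rather than a blind replacement. The macro definitions are local stand-ins (an assumption, so the fragment compiles on its own), and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* The older macros expand to nothing, so no tool ever validates them. */
#define IN
#define OUT

/* Assumption: empty stand-in for the Specstrings.h macro. */
#define __out

/* Mislabeled: the parameter is marked IN, yet the function writes through
   it. The compiler accepts this without complaint. */
void get_version_old(IN int *version)
{
    *version = 3;
}

/* After review: the annotation now matches what the function really does. */
void get_version(__out int *version)
{
    assert(version != NULL);
    *version = 3;
}
```

Mechanically redefining IN as __in would have carried the mistake forward and caused PREfast to report callers that pass an uninitialized variable, even though that usage is correct.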

Annotation Modifiers

For various reasons related to implementation, many annotations that must be applied to function parameters must be represented as a single macro, rather than as a series of adjacent macros. In particular, this is true for most of the various basic annotations, which should appear as a single composite macro for each parameter.

This is accomplished by adding modifiers to the annotation to compose a more complete annotation. The two most common modifiers, _opt and _deref, are examples of how to create more complex annotations by combining simpler annotations.

The _opt Modifier

The __in annotation does not allow null pointers, but often a function can take a NULL in the place of an actual parameter. The _opt modifier indicates that the parameter is optional; that is, it can be NULL. For example, an optional input parameter, such as a pointer to a structure, would be annotated as __in_opt, whereas an optional output parameter would be coded as __out_opt.

Typically, __in_opt and __out_opt are used for pointers to structures with a fixed size. Additional modifiers can be applied to annotate variable-sized objects, as described in "Buffer-Size Annotations" later in this chapter.

In general, you should replace all instances of the OPTIONAL macro with the _opt modifier. However, because no tool or compiler validates OPTIONAL, check the code carefully to ensure that parameters labeled OPTIONAL actually are optional.
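A short sketch of a function with one optional input and one optional output follows. The macros are empty stand-ins (an assumption), and parse_digit and struct options are hypothetical names. The point is that the body must test each _opt pointer before dereferencing it.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: empty stand-ins for the Specstrings.h macros. */
#define __in
#define __in_opt
#define __out_opt

struct options {
    int base;   /* numeric base, 2 through 16 */
};

/* options may be NULL, in which case base 10 is used.
   value may be NULL when the caller only wants to know whether c is valid.
   Returns 1 if c is a digit in the selected base, otherwise 0. */
int parse_digit(__in char c,
                __in_opt const struct options *options,
                __out_opt int *value)
{
    int base = (options != NULL) ? options->base : 10;
    int digit;

    if (c >= '0' && c <= '9')
        digit = c - '0';
    else if (c >= 'a' && c <= 'f')
        digit = c - 'a' + 10;
    else
        return 0;

    if (digit >= base)
        return 0;
    if (value != NULL)      /* optional output: write only if requested */
        *value = digit;
    return 1;
}
```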

The _deref Modifier

User-defined types such as structures can be declared as parameter types, so it is sometimes necessary to annotate the dereferenced value of a parameter. The _deref modifier indicates that an annotation should be applied to the dereferenced value of a parameter, and not the parameter itself.

For example, consider the following function:

 int myFunction(struct s **p); 

When you pass a pointer such as struct s *p to a function, the memory that p points to is passed by reference; p itself is passed by value. In this example, the p parameter has type struct s **, so it is the pointer *p that is being passed by reference. **p is a variable of type struct s, *p is a pointer to that variable, and p is a pointer to that pointer.

In this example, the myFunction function is defined to modify *p, the pointer to the variable of type struct s. The function requires that p not be NULL. However, the function allows *p to be NULL; if *p is NULL, the function simply takes no action on *p.

Annotating p as __inout would require that *p be non-NULL. Annotating p as __inout_opt would allow p to be NULL. However, neither of these annotations correctly expresses the intended behavior of myFunction.

Adding the _deref modifier to the annotation applies __inout_opt to the proper dereferenced value of p, as shown in the following example:

 int myFunction(__inout_deref_opt struct s **p); 

This annotation specifies that the _opt annotation applies to *p, which is the dereferenced value of p; that is, *p can be NULL. The _opt annotation does not apply to p itself; that is, p cannot be NULL. Put another way, _deref_opt applies to the parameter that is passed by reference (*p) instead of the address of the reference (p).

The _deref modifier can appear more than once in an annotation, to indicate multiple levels of dereference. For example, __in_deref_deref_opt indicates that **p can be NULL. Many of the examples later in this chapter show the use of _deref with other annotations.
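One possible body for myFunction that honors this contract is sketched below. The source gives only the declaration, so the behavior (advancing *p along a linked list) is invented for illustration, the next member is an assumed field, and the annotation macro is an empty stand-in spelled the way this section spells it.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: empty stand-in, spelled as in this section. */
#define __inout_deref_opt

struct s {
    int value;
    struct s *next;   /* hypothetical field, used only by this sketch */
};

/* p must not be NULL. *p may be NULL, in which case nothing happens.
   Otherwise *p is advanced to the next node. Returns 1 if *p was modified. */
int myFunction(__inout_deref_opt struct s **p)
{
    assert(p != NULL);      /* the parameter itself is required */
    if (*p == NULL)         /* the dereferenced value is optional */
        return 0;
    *p = (*p)->next;
    return 1;
}
```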

 Note  The __null and __notnull annotations, which explicitly indicate that a particular parameter can be NULL or must not be NULL, are built in to the composite general-purpose annotations such as __inout. It is not necessary to include __null and __notnull in annotations such as the ones in this example.

Buffer-Size Annotations

A variable-sized object is any object that does not carry its own size with it. Many bugs in code, particularly security bugs, are caused by buffer overflows in which a variable-sized object is being passed. The following annotations can be used to express the contract between the caller and the callee about the size of buffers:

  • _ecount(size)

  • _bcount(size)

  • _full(size)

  • _part(size, length)

The contract must specify the size (or where to find it) and how it will be used. This information ensures that neither the caller nor the callee accesses data outside the bounds of the buffer, but the contract can also express the difference between available memory and initialized memory, so that access to uninitialized memory can be detected.

In C, buffers are typically arrays of something. When you describe the size of a buffer in your code, the size can be measured in two ways: as the number of bytes in the buffer or the number of elements in the buffer. For arrays of anything other than char, the size in elements differs from the size in bytes. In most cases, even for arrays of char, the size in elements is more useful and easier to express.

 Note  The sizes of wide character strings such as wchar_t are usually expressed in elements, not bytes. UNICODE_STRING is a notable exception.

Fixed-Size Buffer Annotations

The _ecount(size) and _bcount(size) annotations are used to express the size of a buffer. Use _ecount(size) to express the size of a buffer as a number of elements. Use _bcount(size) to express the size of the buffer as a number of bytes. The size parameter can be any general expression that makes sense at compile time. It can be a number, but it is usually the name of some parameter in the function that is being annotated.

The following example of the memset function shows a typical buffer annotation:

 void * memset(
     __out_bcount(s) char *p,
     __in int v,
     __in size_t s);

In this example, __out_bcount(s) specifies that the content of the memory at p is set by the function and that the value of s is the number of bytes to be set. Nothing in the C source code tells the compiler that p and s are related in this way. The annotation provides this useful information.

With this information provided by the annotation, PREfast can check the implementation of memset to be sure it never accesses past the end of the buffer; that is, it never accesses more than s bytes into the buffer. Often, PREfast can also check that the value of p+s is within the declared bounds of the array when memset is called. In this case, the buffer size is expressed in bytes because that is what memset expects.

Compare memset with a similar function, wmemset:

 wchar_t * wmemset(
     __out_ecount(s) wchar_t *p,
     __in wchar_t v,
     __in size_t s);

This example uses __out_ecount to indicate that s is represented in elements of the wchar_t type. If some incorrect code called this function with a byte count (which is an easy mistake to make), the value of s is likely to be twice as large as it should be. With the __out_ecount annotation, PREfast has a good chance of detecting a buffer overrun in the caller and identifying a probable bug.
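The byte-count mistake is easy to picture in caller code. In the sketch below, fill_wide is a hypothetical stand-in for wmemset, COUNT_OF is a helper macro introduced for this example, and the annotation macros are empty stand-ins (an assumption).

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: empty stand-ins for the Specstrings.h macros. */
#define __in
#define __out_ecount(size)

/* Number of elements in a true array (not a pointer). */
#define COUNT_OF(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical stand-in for wmemset: s is a count of wchar_t elements. */
wchar_t *fill_wide(__out_ecount(s) wchar_t *p, __in wchar_t v, __in size_t s)
{
    assert(p != NULL || s == 0);
    for (size_t i = 0; i < s; i++)
        p[i] = v;
    return p;
}

size_t blank_name(void)
{
    wchar_t name[16];

    /* Wrong: sizeof(name) is a byte count. Wherever wchar_t is wider than
       one byte, this would write past the end of the array:
       fill_wide(name, L' ', sizeof(name)); */

    fill_wide(name, L' ', COUNT_OF(name));   /* right: an element count */
    return COUNT_OF(name);
}
```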

Note that for __in parameters, the definition of "valid" requires that the whole parameter being passed must be initialized. This also applies to arrays, which are passed by reference. Thus, when you use __in for a parameter that is an array, the whole array must be initialized up to the limit specified by _bcount or _ecount. See "String Annotations" later in this chapter for details about how this applies to null-terminated strings.

The _bcount and _ecount annotations are sufficient to describe __in buffers that are not modified or __out buffers that are fully initialized. For buffers that are partially initialized and that might have the initialized portion extended or reduced in place, you can combine these annotations with the _part and _full modifiers:

  • The _full modifier applies to the entire buffer. For an output buffer, the _full modifier indicates that the function initializes the entire buffer. For an input buffer, the _full modifier indicates that the buffer is already initialized, although this is redundant with other annotations.

  • The _part modifier indicates that the function initializes part of the buffer and explicitly indicates how much.

When you combine these modifiers with __inout buffers and _full(size) or _part(size, length) annotations, you can use them to represent the "before" and "after" sizes of a buffer. Size and length can be constants, or they can be parameters of the function being annotated. The following examples show the use of size and length in buffer annotations:

  • __inout_bcount_full(cb) describes a buffer that is cb bytes in size, is fully initialized at entry and exit, and might be written to by this function.

  • __out_ecount_part(count, *countOut) describes a buffer that is count elements in size and is to be partially initialized by this function. The function indicates the number of elements it initialized by setting *countOut.
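A function that fits the __out_ecount_part(count, *countOut) pattern might look like the following sketch. The function collect_digits is hypothetical, and the macros are empty stand-ins (an assumption) so that the fragment compiles without the WDK headers.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: empty stand-ins for the Specstrings.h macros. */
#define __in
#define __out
#define __out_ecount_part(size, length)

/* Copies the decimal digits found in text into digits, which has room for
   count elements. Only the first *countOut elements are initialized on
   return; the rest of the buffer is left untouched. */
void collect_digits(__in const char *text,
                    __out_ecount_part(count, *countOut) int *digits,
                    __in size_t count,
                    __out size_t *countOut)
{
    assert(text != NULL && digits != NULL && countOut != NULL);
    size_t n = 0;
    for (; *text != '\0' && n < count; text++) {
        if (*text >= '0' && *text <= '9')
            digits[n++] = *text - '0';
    }
    *countOut = n;   /* tells the caller how much of the buffer is valid */
}
```

After the call, the caller should read only digits[0] through digits[*countOut - 1]; anything beyond that is still uninitialized.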

Summary of Annotations for Buffers

This section summarizes the annotations that can be combined to describe a buffer. Table 23-2 lists these annotations.

Table 23-2: Annotations for Buffers

Level        Usage     Size            Output    Optional    Parameters
-----------  --------  --------------  --------  ----------  --------------
omitted      omitted   omitted         omitted   omitted     omitted
_deref       _in       _ecount         _full     _opt        (size)
_deref_opt   _out      _bcount         _part                 (size, length)
             _inout    _xcount(expr)

The headings in Table 23-2 are described in the following list.

Level in Table 23-2 describes the buffer pointer's level of dereference from the parameter or return value p. Level can be one of the following:

omitted

p is the buffer pointer.

_deref

*p is the buffer pointer. p must not be NULL.

_deref_opt

*p is the buffer pointer. p can be NULL, in which case the rest of the annotation is ignored.

Usage in Table 23-2 describes how the function uses the buffer. Usage can be one of the following:

omitted

The buffer is not accessed. If used on the return value or with _deref, the function provides the buffer and the buffer is uninitialized at exit. Otherwise, the caller must provide the buffer. This should be used only for alloc and free functions.

_in

The buffer is used for input only. The caller must provide the buffer and initialize it.

_out

The buffer is used only for output. If used on the return value or with _deref, the function provides the buffer and initializes it. Otherwise, the caller must provide the buffer and the function initializes it.

_inout

The function may freely read from and write to the buffer. The caller must provide the buffer and initialize it. If used with _deref, the buffer may be reallocated by the function.

Size in Table 23-2 describes the total size of the buffer. This can be less than the space that is actually allocated for the buffer, in which case it describes the accessible amount. Size can be one of the following:

omitted

No buffer size is given. If the type specifies the buffer size, such as with LPSTR and LPWSTR, that amount is used. Otherwise, the buffer is one element long. This must be used with __in, __out, or __inout.

_ecount

The buffer size is an explicit element count.

_bcount

The buffer size is an explicit byte count.

_xcount(expr)

The buffer size cannot be expressed as a simple byte or element count. For example, the count might be in a global variable, in a structure member, or implied by an enumeration. PREfast treats expr as a comment and does not use it to check buffer size. expr can be anything that is meaningful to the reader, such as an actual expression or a quoted string.

Important 

_xcount satisfies the need to annotate a buffer, but causes PREfast to skip actual size checks. You can use _xcount as a placeholder for annotations that become meaningful in future tools. However, use _xcount with caution and restraint because it suppresses potential warnings and analysis.

Output in Table 23-2 describes how much of the buffer is initialized by the function. For __inout buffers, this partial annotation also describes how much is initialized at entry. Omit this category for __in buffers-they must be fully initialized by the caller.

Output can be one of the following:

omitted

The type specifies how much is initialized. For example, a function that is initializing an LPWSTR must null-terminate the string.

_full

The function initializes the entire buffer.

_part

The function initializes part of the buffer and explicitly indicates how much.

Optional in Table 23-2 describes whether the buffer itself is optional. This annotation modifier can be one of the following:

omitted

The pointer to the buffer must not be NULL.

_opt

The pointer to the buffer might be NULL. It is checked by PREfast before being dereferenced.

Parameters in Table 23-2 gives explicit counts for the size and length of the buffer. Size and length can be either constant expressions or an expression that involves a parameter, usually other than the one being annotated. Length should refer to the resulting value of an __out parameter. Parameters can be one of the following:

omitted

There is no explicit count. Use when neither _ecount nor _bcount is used.

(size)

This is the buffer's total size. Use with _ecount or _bcount but not with _part.

(size, length)

This is the buffer's total size and initialized length. Use with _ecount_part and _bcount_part.

Tips for Applying Annotations to Buffers

When applying annotations to buffers, remember the following:

  • Each buffer annotation describes a single buffer with which the function interacts: where it is, how large it is, how much is initialized, and what the function does with it.

    The buffer can be a string, a fixed-length or variable-length array, or just a pointer.

  • You should use only a single buffer annotation for each parameter.

  • Some combinations do not make sense as buffer annotations. See the buffer annotation definitions in Specstrings.h for a list of meaningful combinations.

Buffer Annotation Examples

The examples in Listings 23-8 and 23-9 show uses of buffer annotations.

Listing 23-8: Example of annotations for in-place substitution on a counted array of characters

 void substUCS8(
     __inout_ecount_part(*s, *s) wchar_t *buffer,
     __inout size_t *s);

Listing 23-9: Example of annotations for a buffer size that cannot be expressed as a simple expression

 GetString(
     __out_xcount("23, 42, or 26, depending on 'which'")
     LPSTR *msgBuffer,
     __in which);

The example in Listing 23-8 does an in-place substitution on a counted array of characters. The old size is the input value for *s, and the new size is the output. This function might be used to substitute UCS-8 for non-ASCII characters.

The example in Listing 23-9 shows the use of _xcount to annotate a buffer size that cannot be expressed as a simple expression. There are several better ways to implement this kind of function. This example simply illustrates the use of _xcount. The function returns a string of one of three known lengths, depending on the input parameter which. This annotation causes PREfast to skip size checks on *msgBuffer.

Tip 

Specstrings.h defines a number of similar annotations to specify buffer sizes. If you do not find what you need, read the comments in Specstrings_strict.h to see if the problem you are trying to solve has already been addressed.

String Annotations

C null-terminated strings represent a special case of buffers. The following annotations describe null-terminated strings:

  • __nullterminated

  • __nullnullterminated

  • __possibly_notnullterminated

String annotations are useful when applied to typedef declarations. These annotations enable PREfast to check that the type is used correctly in a function without requiring the programmer to annotate every function parameter that uses the type. See "Annotations on typedef Declarations" earlier in this chapter for information about applying annotations to types.


 Note  The examples in this section are intended only to illustrate the use of annotations for null-terminated strings. You should always use the safe string functions declared in %wdk%\inc\ddk\ntstrsafe.h for string and UNICODE_STRING manipulations instead of writing your own string manipulation functions.

__nullterminated

Many of the types declared in system header files are already annotated. If you use the appropriate STR type for all functions that take strings described as char * or wchar_t * parameters, it is not necessary to apply the __nullterminated annotations to these types. PSTR, PCSTR, and their "L" and "W" variations all imply that a string is null terminated. You should explicitly apply __nullterminated only to types that are not already annotated with __nullterminated.

The implied use of __nullterminated through use of PSTR or PCSTR types is sufficient for input string buffers. If the parameter is strictly for input, use the CSTR forms because placement of the const modifier must be done inside the typedef.

If a function can create or add to a string, the function must have the actual size of the output buffer so it can avoid overruns. Output buffers should also have an additional _ecount or _bcount count annotation that gives the actual buffer size because __nullterminated by itself does not provide that information. For __out parameters, the count annotation specifies that the resulting string is null terminated and that PREfast should check for buffer overruns. For __inout parameters, the count annotation implies that the buffer is initialized up to the null terminator and that the updated value is also null terminated.

For example, the StringCchCopy function copies up to cchDest elements. The __out_ecount annotation specifies that although the string is null terminated (as indicated by the LPSTR type), it does not overflow cchDest characters, as shown in the following example:

 StringCchCopyA(
     __out_ecount(cchDest) LPSTR pszDest,
     __in size_t cchDest,
     __in LPCSTR pszSrc);
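The contract on the output buffer can be illustrated with a simplified bounded copy. This is not the StrSafe implementation: copy_string is a hypothetical function, its return convention (0 for success, -1 for truncation or a zero-sized buffer) is an assumption, and the macros are empty stand-ins.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: empty stand-ins for the Specstrings.h macros. */
#define __in
#define __out_ecount(size)

/* Copies src into dest, writing at most cchDest characters including the
   terminator. dest is null terminated whenever cchDest is nonzero.
   Returns 0 if the whole string fit, -1 otherwise. */
int copy_string(__out_ecount(cchDest) char *dest,
                __in size_t cchDest,
                __in const char *src)
{
    assert(dest != NULL && src != NULL);
    if (cchDest == 0)
        return -1;               /* no room even for the terminator */

    size_t i = 0;
    while (i < cchDest - 1 && src[i] != '\0') {
        dest[i] = src[i];
        i++;
    }
    dest[i] = '\0';              /* always terminate within the bound */
    return (src[i] == '\0') ? 0 : -1;
}
```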

__nullnullterminated

The __nullnullterminated annotation is intended for the occasional "string of strings" that is terminated by a double null, such as a registry value whose type is REG_MULTI_SZ. Currently, PREfast does not check __nullnullterminated, so this annotation should be considered advisory only. However, __nullnullterminated might be enabled in a future version of PREfast. In the meantime, use a #pragma warning directive or the _xcount annotation to silence PREfast noise related to strings terminated with a double NULL.
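For readers who have not met the double-null layout, the following sketch walks such a buffer and counts the strings in it. The function count_multi_sz is hypothetical and the __in macro is an empty stand-in (an assumption); the code makes no use of the registry API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: empty stand-in for the Specstrings.h macro. */
#define __in

/* Counts the strings in a double-null-terminated sequence, laid out as
   "first\0second\0\0". An empty string marks the end of the sequence. */
size_t count_multi_sz(__in const char *multi)
{
    assert(multi != NULL);
    size_t count = 0;
    while (*multi != '\0') {
        multi += strlen(multi) + 1;   /* skip this string and its null */
        count++;
    }
    return count;
}
```

Because the loop trusts the final empty string to stop it, a buffer that lacks the second null leads to a read past the end, which is the class of bug that a checked __nullnullterminated annotation would target.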

__possibly_notnullterminated

Several older functions usually return null-terminated strings but occasionally do not. The classic examples are _snprintf and strncpy, where the function omits the null terminator if the buffer is exactly full. These functions are considered deprecated and should not be used. Instead, you should use the equivalent functions declared in StrSafe.h for user-mode applications or NtStrSafe.h for kernel-mode code because they guarantee a null-terminated buffer on success.

However, it might not be possible to completely eliminate this kind of function in existing code, so you should annotate these functions by applying __possibly_notnullterminated to their output parameters, as shown in the following example:

 int _snprintf(
    __out_ecount(count) __possibly_notnullterminated LPSTR buffer,
    __in size_t count,
    __in LPCSTR format
    [, argument] ... );

When PREfast encounters a __possibly_notnullterminated annotation, it attempts to determine whether an action was taken to assure null termination of the output string. If it cannot find one, PREfast generates a warning.
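The kind of action that assures null termination is typically a single explicit assignment after the call, as in this sketch. The wrapper copy_name is hypothetical and the macros are empty stand-ins (an assumption); strncpy is the standard C library function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: empty stand-ins for the Specstrings.h macros. */
#define __in
#define __out_ecount(size)

/* Copies src into dest, which holds count characters (count > 0).
   strncpy leaves dest unterminated when src has count or more characters,
   so the last element is set explicitly. */
void copy_name(__out_ecount(count) char *dest,
               __in size_t count,
               __in const char *src)
{
    assert(dest != NULL && src != NULL && count > 0);
    strncpy(dest, src, count);   /* may fill dest without a terminator */
    dest[count - 1] = '\0';      /* explicit termination */
}
```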

Reserved Parameters

Occasionally a function has a parameter that is intended for future use. The __reserved annotation ensures that in future versions, old callers to a function can be reliably detected. This annotation insists that the provided parameter be 0 or NULL, as appropriate to the type.

For example, someday the following function will take a second parameter, but all current use of that parameter should be coded with NULL. The following annotation enables PREfast to check that current callers do not misuse the reserved parameter:

 void do_stuff(struct a *pa, __reserved void *pb); 
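A runtime counterpart of the same rule is sketched below. The body of do_stuff and the calls member of struct a are invented for illustration, and the macros are empty stand-ins (an assumption); PREfast performs the equivalent check statically at each call site.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: empty stand-ins for the Specstrings.h macros. */
#define __inout
#define __reserved

struct a {
    int calls;   /* hypothetical field, used only by this sketch */
};

void do_stuff(__inout struct a *pa, __reserved void *pb)
{
    assert(pa != NULL);
    assert(pb == NULL);   /* reserved: current callers must pass NULL */
    pa->calls++;
}
```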

Function Return Values

Many functions return a status that indicates whether the function was successful. However, it is common to find code that assumes that a function call is always successful and that does not check the return value. Memory allocators are often in this class, but there are quite a few others as well. For example, malloc is the classic function that should be marked with __checkReturn, as shown in the following example:

 __checkReturn void *malloc(__in size_t s); 

The __checkReturn annotation indicates that the function return value should be checked. PREfast can detect two different errors for a function annotated with __checkReturn:

  • The function return value is simply ignored.

  • The function return value is placed into a variable and the variable is then ignored.

To avoid a warning when calling a function that is annotated with __checkReturn, either use the return value directly in a conditional expression or assign it to a variable that is subsequently used in a conditional expression. Although __checkReturn is traditionally applied to return values, PREfast can detect a __checkReturn annotation that is applied to an __out parameter to insist that the value be examined.

In the rare case when it might make sense to ignore the return value, call the function in an explicit void context: (void)mustCheckReturn(param).

Returning the value to a caller qualifies as successfully checking the return value; however, that parameter or the return value should itself be annotated with __checkReturn so that the caller checks the value.
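These rules can be seen together in a small sketch. The functions duplicate_string and copy_length are hypothetical, and the macros are empty stand-ins (an assumption); malloc, memcpy, strlen, and free are the standard C library functions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: empty stand-ins for the Specstrings.h macros. */
#define __in
#define __checkReturn

/* Returns a heap copy of s, or NULL if the allocation fails. The result of
   malloc is checked here, and because the failure is passed on to the
   caller, this function carries __checkReturn as well. */
__checkReturn char *duplicate_string(__in const char *s)
{
    assert(s != NULL);
    size_t size = strlen(s) + 1;
    char *copy = malloc(size);
    if (copy == NULL)
        return NULL;
    memcpy(copy, s, size);
    return copy;
}

/* A caller that honors the contract: the result is tested before use. */
int copy_length(__in const char *s)
{
    char *copy = duplicate_string(s);
    if (copy == NULL)
        return -1;               /* fail gracefully */
    int length = (int)strlen(copy);
    free(copy);
    return length;
}
```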

Kernel-mode drivers should be annotated to check all memory allocations, and the driver should attempt to fail gracefully if a memory allocation fails.

 Note  If you have used the /analyze option in Visual Studio, you might notice that PREfast behaves slightly differently when analyzing functions that are annotated with __checkReturn. Both /analyze and PREfast issue a warning if the function's return value is discarded at the point of the function call. However, PREfast also issues a warning if the function's return value is assigned to a variable and that variable is not used in subsequent code.




Developing Drivers with the Microsoft Windows Driver Foundation
Developing Drivers with the Windows Driver Foundation (Pro Developer)
ISBN: 0735623740
EAN: 2147483647
Year: 2007
Pages: 224
