The Inline Directive | Inside Delphi 2006 (Wordware Delphi Developers Library)

The inline directive, introduced in Delphi 2005, gives you finer control over how procedures and functions are compiled. It lets you speed up small procedures and functions: inlined routines avoid the overhead of a call, but they increase the size of the executable.

To successfully use the inline directive, you have to know what happens when you call a procedure or a function and what effect the inline directive has on your code.

When a procedure or function is called, memory has to be allocated for all of its local variables and parameters. This memory is taken from the stack, and the allocation is called stack frame creation. The stack is a region of memory that the application uses for parameters, local variables, and return addresses. It is set up automatically at application startup. By default, a Delphi application starts with 16 KB of stack and can grow it to a maximum of 1 MB. Although you can change these values, as shown in Figure 5-10, you shouldn't do so unless there is a very good reason.

Figure 5-10: Stack size settings
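
The same limits can also be set in source code with the $M compiler directive instead of the dialog box. The following is a minimal sketch, assuming a console project; the program name StackSizeDemo is made up for illustration, and the values simply restate the defaults mentioned above:

```pascal
program StackSizeDemo; // hypothetical program name

{$APPTYPE CONSOLE}

// Syntax: $M minstacksize,maxstacksize (both in bytes).
// 16384 bytes = 16 KB, 1048576 bytes = 1 MB, i.e. the default values.
{$M 16384,1048576}

begin
  WriteLn('Stack limits are set by the M directive above.');
end.
```

As with the dialog box settings, leaving the defaults alone is usually the right choice.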

The stack frame is a piece of stack memory where the parameters and local variables of the procedure temporarily exist. When the procedure finishes, the parameters and local variables on the stack frame are automatically deallocated. This is the reason you can't use local variables and parameters outside of the procedure or function in which they are declared.
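
One way to see that every call gets its own stack frame is a recursive function: each active call holds a separate copy of its parameter and local variable. The following is a small sketch; the names FrameDemo, SumDown, and Local are invented for this example:

```pascal
program FrameDemo; // hypothetical names, for illustration only

{$APPTYPE CONSOLE}

function SumDown(N: Integer): Integer;
var
  Local: Integer; // belongs to this particular call only
begin
  Local := N;
  if N > 0 then
    // The recursive call works with its own N and Local;
    // it cannot overwrite the Local that belongs to this call.
    Result := Local + SumDown(N - 1)
  else
    Result := 0;
end;

begin
  WriteLn(SumDown(4)); // 4 + 3 + 2 + 1 + 0 = 10
end.
```

If all calls shared one copy of Local, the additions performed after each recursive call returned would use the wrong values.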

All these allocations and deallocations aren't done magically. While the process may appear magical from the Delphi programmer's point of view, the Delphi compiler has to generate machine language instructions to do these things at run time. So, when a procedure or a function is compiled, the executable contains not only the machine language version of the procedure's logic but also some entry and exit code that manages these stack allocations and deallocations. Also, for every procedure call we make, Delphi has to generate machine language instructions that do parameter passing and the actual procedure call.

Let's take a look at what machine language instructions are generated for the very simple MyAbs function that returns an absolute value of an integer. The Delphi code of the function is displayed in Listing 5-16.

Listing 5-16: An inlined function

function MyAbs(I: Integer): Integer;
begin
  if I < 0 then
    Result := -I
  else
    Result := I;
end;

var
  x: Integer;

begin
  x := MyAbs(-20);
end.

For the MyAbs call in the main block, Delphi generates the following three instructions (compiler optimizations are turned off):

mov eax,$ffffffec
call MyAbs
mov [$0040565c],eax

The first instruction moves the number -20 into the eax register. This is necessary because the MyAbs function expects the value for the I parameter to be located in the eax register. The second instruction calls the MyAbs function, and the last instruction copies the result of the function to the x variable. The $0040565c value is the address of the x variable in memory (this address is not fixed and can differ from one build to another).
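
The operand $ffffffec is simply -20 written as a 32-bit two's complement value. One quick way to check this is sketched below, using the standard IntToHex function from the SysUtils unit (the program name HexDemo is made up):

```pascal
program HexDemo; // hypothetical program name

{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  Value: Integer;
begin
  Value := -20;
  // IntToHex formats the 32-bit pattern of Value as 8 hexadecimal digits.
  WriteLn(IntToHex(Value, 8)); // prints FFFFFFEC
end.
```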

The instructions generated for the MyAbs function are:

// begin
push ebp
mov ebp,esp
add esp,-$08
mov [ebp-$04],eax
// if I < 0 then
cmp dword ptr [ebp-$04],$00
jnl $00403a19
// Result := -I
mov eax,[ebp-$04]
neg eax
mov [ebp-$08],eax
jmp $00403a1f
// Result := I;
mov eax,[ebp-$04]
mov [ebp-$08],eax
mov eax,[ebp-$08]
// end;
pop ecx
pop ecx
pop ebp
ret

The first two instructions set up the stack frame. The third instruction, add esp,-$08, allocates 8 bytes of memory from the stack (enough space to save two integer values). The more obvious instruction would be sub esp,$08, but adding a negative value gives the same result as subtracting a positive value, so the two forms are equivalent and the compiler simply emits the add form. The fourth instruction moves the value of the I parameter to the stack.

The second section of the generated code performs a test to see if the value of the I parameter needs to be converted to a positive value.

// if I < 0 then
cmp dword ptr [ebp-$04],$00
jnl $00403a19

The first instruction compares the I parameter on the stack with 0. The second instruction is the jump-not-less instruction that determines whether the value of the I parameter needs to be converted to a positive value. If the value of the I parameter is greater than or equal to 0, the value doesn't need to be converted, and the function copies the original parameter value to the function's Result:

// Result := I;
mov eax,[ebp-$04]
mov [ebp-$08],eax
mov eax,[ebp-$08]

The first instruction loads the value of the parameter into the eax register. The second instruction stores it in the other stack location that the function allocated at the beginning, ebp-$08, which holds the function's Result. The third instruction loads that value back into eax, because the function, in the end, returns the value located at the ebp-$08 memory location and the caller expects to find it in eax.

If the parameter value has to be converted to a positive value, the function executes the third section of the code:

// Result := -I
mov eax,[ebp-$04]
neg eax
mov [ebp-$08],eax
jmp $00403a1f

The function uses the neg instruction to change the sign of the parameter value and moves the negated value to the ebp-$08 stack location. Then it executes the jmp (jump) instruction, which skips the else branch and continues with the instruction that loads the Result value into eax for the caller:

mov eax,[ebp-$08]

The remaining instructions tear down the stack frame. The two pop ecx instructions release the 8 bytes reserved at the beginning of the function (the values popped into ecx are simply discarded), and pop ebp restores the caller's frame pointer. This returns the stack to the state it was in before the function call, so the caller can continue executing as if nothing had happened to the stack. The final instruction, ret, returns control to the caller.

A large number of instructions in the generated code are involved in reading and writing values to the stack. Actually, there are more instructions that deal with reading and writing values to the stack than there are instructions that perform the function logic.

The inline directive completely changes the way Delphi generates code for the function. When we mark a function or procedure with the inline directive, Delphi no longer generates a call to the function but copies the function body to the place of the call. This way, the compiler doesn't have to generate code that creates and fills the stack frame, moves values from the stack to the Result, returns to the caller, or restores the caller's stack frame.

Using the inline directive on small functions and procedures can noticeably increase the speed of the application, especially if these functions and procedures are used in tight loops. When a function marked with the inline directive is expanded, its instructions are copied to every place it is called, which makes the final executable larger. Keep in mind that the directive is a request: the compiler can still emit a normal call in situations where it cannot inline the routine. When you mark only small functions with the inline directive and turn on compiler optimizations in the Project Options dialog box, the size increase is usually negligible.

Here is the inlined version of the MyAbs function:

function MyAbs(I: Integer): Integer; inline;
begin
  if I < 0 then
    Result := -I
  else
    Result := I;
end;
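
Conceptually, inlining makes the call site behave as if the function body had been written there by hand. The sketch below shows that equivalence at the source level. It only illustrates the idea and is not what the compiler literally emits; the program name InlineIdea and the variables y and Tmp are invented for the example:

```pascal
program InlineIdea; // hypothetical program name

{$APPTYPE CONSOLE}

function MyAbs(I: Integer): Integer; inline;
begin
  if I < 0 then
    Result := -I
  else
    Result := I;
end;

var
  x, y, Tmp: Integer;
begin
  // What you write: an ordinary-looking call.
  x := MyAbs(-20);

  // Roughly what an inlined call amounts to: the body, without a call.
  Tmp := -20;
  if Tmp < 0 then
    y := -Tmp
  else
    y := Tmp;

  WriteLn(x, ' ', y); // both values are 20
end.
```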

When the function is inlined, no call is generated for it; its instructions appear only at the call site. If you also turn on compiler optimizations, inlining a small function like this one yields very compact, fast code:

mov eax,$ffffffec
test eax,eax
jnl $00403a6c
neg eax
mov ebx,eax

Normally, you would have three instructions that only call the function, plus many more instructions in the function itself. With inlining and optimizations, you get five instructions that do the whole thing.
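
The saving matters most in a tight loop, where the same small function is called many times and the call overhead would otherwise be paid on every iteration. The following is a minimal sketch of such a loop; SumOfAbs, its parameters, and the loop bounds are illustrative assumptions, not code from the original listing:

```pascal
program InlineLoop; // hypothetical program name

{$APPTYPE CONSOLE}

function MyAbs(I: Integer): Integer; inline;
begin
  if I < 0 then
    Result := -I
  else
    Result := I;
end;

// Hypothetical helper: adds up the absolute values in First..Last.
function SumOfAbs(First, Last: Integer): Integer;
var
  I: Integer;
begin
  Result := 0;
  for I := First to Last do
    Result := Result + MyAbs(I); // candidate for inline expansion
end;

begin
  WriteLn(SumOfAbs(-1000, 1000)); // 2 * (1 + 2 + ... + 1000) = 1001000
end.
```

To judge the actual gain in your own code, compile the loop with and without the inline directive and compare the generated instructions in the CPU window.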