Before we get too far along, we should discuss the methods of passing arguments on a stack. In essence, a function call has to push its arguments (if the function takes any) onto the stack and perform a subroutine call, which pushes the return address (the value of the instruction pointer, EIP or RIP, for the instruction following the call) onto the stack. The function then uses the stack yet again for any local data, and finally returns to where the caller left off while unwinding the stack. There are three basic methods of doing this: the C declaration (CDECL) convention, the standard calling convention, and the fast calling convention. From a high-level language such as C/C++ this is taken for granted, but at the low level of assembly language it has to be done carefully or the stack and program counter will be corrupted.
We are going to examine function calls using a 32-bit processor, as that is what most of you are currently using. Thus, each argument that gets pushed onto the stack is 4 bytes in size. An item such as a double-precision floating-point value, which uses 8 bytes, is actually pushed as two halves: the lower 4 bytes and the upper 4 bytes. When the processor is in 64-bit mode, each push places 8 bytes on the stack.
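To make the "two halves" idea concrete, here is a minimal C sketch that splits a double into its lower and upper 4 bytes. The names `DoubleHalves` and `split_double` are hypothetical, and the sketch assumes a 64-bit IEEE-754 `double`; it only shows what the two halves contain, not how any particular compiler emits the pushes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type: the two 32-bit halves of a 64-bit double. */
typedef struct {
    uint32_t lower;  /* bits 0..31  */
    uint32_t upper;  /* bits 32..63 */
} DoubleHalves;

static DoubleHalves split_double(double value) {
    uint64_t bits;
    DoubleHalves halves;

    /* Assumption: double is 8 bytes, as on the 80x86. */
    assert(sizeof(double) == sizeof(uint64_t));

    /* memcpy copies the raw bit pattern without breaking aliasing rules. */
    memcpy(&bits, &value, sizeof bits);

    halves.lower = (uint32_t)(bits & 0xFFFFFFFFu);
    halves.upper = (uint32_t)(bits >> 32);
    return halves;
}
```

For example, `split_double(1.0)` yields an upper half of 3FF00000h and a lower half of 00000000h, and each half occupies one 4-byte stack slot.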
Consider the following function and the call to it:

    int hello(int a, int b)
    {
        int c = a + b;
        return c;
    }

    int i = hello(1, 2);

The function call to hello is straightforward:

    00401118  push  2
    0040111A  push  1
    0040111C  call  hello
    00401121  add   esp,8

Once the instruction pointer (EIP) arrives at the first byte of the function hello, the stack will look similar to this:
| Register | Address (NN+3) | HexValue | Description |
|---|---|---|---|
| | 0012FF00h | 00000002 | Arg#2 |
| | 0012FEFCh | 00000001 | Arg#1 |
| ESP= | 0012FEF8h | 00401121 | Return address |
| | 0012FEF4h | | |
| EIP= | 004010D0 | | hello() |
The function hello looks similar to the following. I have left the addresses for each line of assembly for reference but they are not needed.

    ; Set up stack frame
    004010D0  push  ebp         ; Save old ebp
    004010D1  mov   ebp,esp     ; Set local frame base
    004010D3  sub   esp,4       ; Reserve 4 bytes for local 'c'

Let us peek at the stack one more time and note the changes:
| Register | Address (NN+3) | HexValue | Description |
|---|---|---|---|
| | 0012FF00h | 00000002 | Arg#2 |
| | 0012FEFCh | 00000001 | Arg#1 |
| | 0012FEF8h | 00401121 | Return address |
| EBP= | 0012FEF4h | ??? | (old EBP) |
| ESP= | 0012FEF0h | | Local variable 'c' |
| | 0012FEECh | | |
| EIP= | 004010E8 | | hello() |
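The EBP register is used to remember where ESP pointed just after the old EBP was saved, and ESP is moved lower in memory, leaving room for the local variables and positioning it for the next needed push. From here on, the arguments are found at positive offsets from EBP (a at [ebp+8], b at [ebp+0Ch]) and the local variable at a negative offset (c at [ebp-4]).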

    ; Do the calculation a+b
    004010E8  mov   eax,dword ptr [ebp+8]
    004010EB  add   eax,dword ptr [ebp+0Ch]

    ; Restore stack frame
    004010F1  mov   esp,ebp     ; Restore esp
    004010F3  pop   ebp         ; Restore ebp
    004010F4  ret               ; Restore eip

Upon returning, anything lower than ESP in stack memory is essentially garbage, and the instruction pointer (EIP) is back where it can continue in the calling code. The stack pointer, however, still needs to be corrected for the two arguments that were pushed.

    00401118  push  2
    0040111A  push  1
    0040111C  call  hello
    00401121  add   esp,8       ; 2*sizeof(int)

They can either be popped:

    pop   ecx
    pop   ecx

or, more simply, just adjust the stack pointer for two arguments, four bytes each:

    add   esp,8

So in a C declaration (CDECL) type function call, the calling function corrects the stack pointer for the arguments it pushed. One other item to note is that the immediate values {1, 2} were pushed on the stack in reverse order (2 first, then 1). So the stack was used for both the arguments and the return address.
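The whole CDECL sequence can be traced with a small simulation. What follows is only a sketch under stated assumptions: the `Cpu` structure and the helpers `slot`, `push`, `pop`, `hello_cdecl`, and `call_hello_cdecl` are hypothetical names, the stack is modeled as an array of 4-byte words, and the starting ESP of 0012FF04h is inferred from the stack tables (Arg#2 lands at 0012FF00h). It models the register and stack bookkeeping only, not real machine code.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_TOP   0x0012FF04u  /* assumed ESP before the caller pushes anything */
#define STACK_WORDS 64u          /* size of the simulated stack, in 4-byte words  */

typedef struct {
    uint32_t esp, ebp, eax;
    uint32_t mem[STACK_WORDS];
} Cpu;

/* Map a simulated stack address to its word in mem[]. */
static uint32_t *slot(Cpu *cpu, uint32_t addr) {
    assert(addr < STACK_TOP && addr % 4u == 0u);
    uint32_t index = (STACK_TOP - addr) / 4u - 1u;
    assert(index < STACK_WORDS);
    return &cpu->mem[index];
}

static void push(Cpu *cpu, uint32_t value) {
    cpu->esp -= 4u;                 /* the stack grows toward lower addresses */
    *slot(cpu, cpu->esp) = value;
}

static uint32_t pop(Cpu *cpu) {
    uint32_t value = *slot(cpu, cpu->esp);
    cpu->esp += 4u;
    return value;
}

/* Simulated body of hello(): prologue, a + b, epilogue, plain RET. */
static uint32_t hello_cdecl(Cpu *cpu) {
    push(cpu, cpu->ebp);                        /* push ebp            */
    cpu->ebp = cpu->esp;                        /* mov  ebp,esp        */
    cpu->esp -= 4u;                             /* sub  esp,4          */
    cpu->eax  = *slot(cpu, cpu->ebp + 8u);      /* mov  eax,[ebp+8]    */
    cpu->eax += *slot(cpu, cpu->ebp + 0x0Cu);   /* add  eax,[ebp+0Ch]  */
    cpu->esp = cpu->ebp;                        /* mov  esp,ebp        */
    cpu->ebp = pop(cpu);                        /* pop  ebp            */
    return pop(cpu);                            /* ret: pops the return address */
}

/* Simulated caller: returns a + b and reports the final ESP. */
static uint32_t call_hello_cdecl(uint32_t a, uint32_t b, uint32_t *final_esp) {
    Cpu cpu = { STACK_TOP, 0u, 0u, {0u} };
    push(&cpu, b);                              /* push 2              */
    push(&cpu, a);                              /* push 1              */
    push(&cpu, 0x00401121u);                    /* call: pushes the return address */
    uint32_t return_to = hello_cdecl(&cpu);
    assert(return_to == 0x00401121u);
    cpu.esp += 8u;                              /* add esp,8 (caller cleanup) */
    *final_esp = cpu.esp;
    return cpu.eax;
}
```

Calling `call_hello_cdecl(1, 2, &esp)` returns 3 and leaves `esp` equal to `STACK_TOP`. Removing the `cpu.esp += 8u` line leaves the simulated ESP 8 bytes too low, which is exactly the correction the caller is responsible for under CDECL.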
Let us now examine the standard calling convention (STDCALL) using this same code sample:

    00401118  push  2
    0040111A  push  1
    0040111C  call  hello

You will note that there is no stack correction upon returning. This means that the function must handle the stack frame correction upon returning.

    ; Restore stack frame
    004010F1  mov   esp,ebp     ; Restore esp
    004010F3  pop   ebp         ; Restore ebp
    004010F4  ret   8           ; Restore eip

In reality, the return instruction RET handles the stack correction: after popping the return address into EIP, it adjusts the stack pointer (ESP) by the number of bytes specified by the immediate value. In the previous snippet, it was adjusted by 8 bytes, the size of the two arguments.
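The arithmetic is simple enough to write down as a sketch. The function name `esp_after_ret` is hypothetical, and the addresses in the usage note are taken from the stack tables earlier (ESP points at the return address, 0012FEF8h, when RET executes).

```c
#include <assert.h>
#include <stdint.h>

/* ESP after a RET with the given immediate: the 4-byte return address
 * is popped first, then arg_bytes more bytes are discarded.
 * A plain RET corresponds to arg_bytes == 0. */
static uint32_t esp_after_ret(uint32_t esp_at_ret, uint16_t arg_bytes) {
    assert(esp_at_ret % 4u == 0u);  /* this model keeps the stack 4-byte aligned */
    return esp_at_ret + 4u + arg_bytes;
}
```

With `esp_at_ret` equal to 0012FEF8h, a plain RET leaves ESP at 0012FEFCh (still pointing at Arg#1, so the caller must add 8), while `ret 8` leaves it at 0012FF04h, where it was before the arguments were pushed.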
Let us now examine the fast calling convention using this same code sample. On a MIPS or PowerPC processor this is actually a very fast method of calling functions, but on an 80x86 it is not quite so fast. Those platforms have 32 general-purpose registers, a portion of which are used to pass arguments in place of the stack. As long as the number of arguments is reasonable, only registers are used; when there are too many, a stack-like mechanism is used for the overage. On the 80x86 there are very few general-purpose registers available in place of stack arguments in 16/32-bit mode. For example, under VC6 only two registers, ECX and EDX, are available, after which the stack is used for any additional arguments.

    mov   edx,2     ; arg#2 Register used
    mov   ecx,1     ; arg#1 Register used
    call  hello

You will notice that the arguments were actually assigned to registers and the stack was only used to retain the program counter (EIP) for the function return. Since the values are already in registers, there is no need for the function to access them from the stack or copy them to a register.
When three arguments are used, however:

    i = hello(1, 2, 3);

    push  3         ; arg#3 Stack used
    mov   edx,2     ; arg#2 Register used
    mov   ecx,1     ; arg#1 Register used
    call  hello

the arguments that were pushed on the stack are corrected upon return by the function itself; this is the same as the standard call mechanism!

    mov   esp,ebp
    pop   ebp
    ret   4         ; One 4-byte arg to be popped.

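The rule for where each argument travels, and how many bytes the function must pop, can be sketched as follows. This assumes the two-register scheme described above (ECX, then EDX, then the stack) and that every argument is a 4-byte integer; the names `ArgLocation`, `fastcall_location`, and `fastcall_ret_bytes` are hypothetical.

```c
#include <assert.h>

typedef enum { IN_ECX, IN_EDX, ON_STACK } ArgLocation;

/* Where the argument at zero-based position `index` travels. */
static ArgLocation fastcall_location(unsigned index) {
    if (index == 0u) return IN_ECX;
    if (index == 1u) return IN_EDX;
    return ON_STACK;
}

/* Immediate operand for the function's RET: bytes of stack arguments to discard. */
static unsigned fastcall_ret_bytes(unsigned arg_count) {
    unsigned on_stack = 0u;
    for (unsigned i = 0u; i < arg_count; ++i) {
        if (fastcall_location(i) == ON_STACK) {
            on_stack += 1u;
        }
    }
    /* RET takes a 16-bit immediate, so the total must fit in 16 bits. */
    assert(on_stack * 4u <= 0xFFFFu);
    return on_stack * 4u;
}
```

For the two-argument call the result is 0 (a plain RET), and for `hello(1, 2, 3)` it is 4, matching the `ret 4` shown above.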
It is very important to realize that both the calling routine and the function itself must be written using the same calling convention. All three can be used within a single application, but it can get very confusing as to which was used where, so consistency is important or your code will fail.
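To see why a mismatch fails, consider a small sketch that tracks only the net change in ESP across one call. The `Convention` enum and the function `esp_drift` are hypothetical names, and only the CDECL and standard conventions are modeled, with all arguments passed on the stack.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { CONV_CDECL, CONV_STDCALL } Convention;

/* Net change in ESP after one call; 0 means the stack is balanced.
 * `caller` is the convention the calling code was written for,
 * `callee` is the convention the function was written for. */
static int32_t esp_drift(Convention caller, Convention callee, int32_t arg_bytes) {
    assert(arg_bytes >= 0 && arg_bytes % 4 == 0);

    int32_t esp = 0;
    esp -= arg_bytes;                    /* caller pushes the arguments      */
    esp -= 4;                            /* call pushes the return address   */
    esp += 4;                            /* ret pops the return address      */
    if (callee == CONV_STDCALL) {
        esp += arg_bytes;                /* ret n: the function cleans up    */
    }
    if (caller == CONV_CDECL) {
        esp += arg_bytes;                /* add esp,n: the caller cleans up  */
    }
    return esp;
}
```

When both sides agree, the drift is 0. If the caller assumes CDECL but the function ends in `ret 8`, the arguments are removed twice and ESP ends up 8 bytes too high; in the opposite case nobody removes them and ESP ends up 8 bytes too low. Either way, the next access relative to ESP reads the wrong data.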