| ||
In the previous few sections, I tried to explain to you how to restore the program source code on the basis of the disassembled listing. Now, you can evaluate and conclude whether I have succeeded when accomplishing this task. This can be achieved by comparing the restored code to the source code. You'll immediately notice that I mistakenly used the DO loop instead of the WHILE loop, which was used in the source code. However, if you compile and build the resulting program, it will work in the same way as the original one. The difference between the texts of the two programs (both are simple) is because the code is optimized by the translator. Because of this optimization, it is principally impossible to restore the source code using the executable code. However, it is still possible to get the program code that correctly describes the program's algorithm. This means that code analysis allows you to understand the idea of the program and what operations it executes. This is the main problem solved in this chapter.
Now, consider a small and trivial C program.
void main () { int i; char a[10]; char b[11]; for(i=0; i<10; i++) { a[i]>= 'a'; *(b+i) = 'a'; printf("%c %c\n", a[i], b[i]); } ExitProcess (0); }
Now, consider the Assembly code of this program obtained using Borland C++ 5.0.
.text:00401108 _main proc near ; DATA XREF: .data:0040A0B8 .text:00401108 .text:00401108 var__18 = BYTE PTR -18H .text:00401108 var_C = BYTE PTR -0CH .text:00401108 argc = DWORD PTR 8 .text:00401108 argv = DWORD PTR 0CH .text:00401108 envp = DWORD PTR 10H .text:00401108 .text:00401108 PUSH EBP .text:00401109 MOV EBP, ESP .text:0040110B ADD ESP, 0FFFFFFE8H .text:0040110E PUSH EBX .text:0040110F XOR EBX, EBX .text:00401111 .text:00401111 loc_401111: ; CODE XREF: _main+30 .text:00401111 MOV [EBP+EBX+VAR_C], 61H .text:00401116 MOV [EBP+EBX+VAR_18], 61H .text:0040111B MOVSX EAX, [EBP+EBX+VAR_C] .text:00401120 PUSH EAX .text:00401121 Movsx EDX, [EBP+EBX+VAR_C] .text:00401126 PUSH EDX ; CHAR .text: 00401127 PUSH OFFSET ACC ; __VA_ARGS .text:0040112C CALL _PRINTF .text:00401131 ADD ESP, 0CH .text:00401134 INC EBX .text:00401135 CMP EBX, 0AH .text:00401138 J1 Short Loc_401111 .text:0040113A PUSH 0 ; uExitCode .text:0040113C CALL ExitProcess .text:00401141 POP EBX .text:00401142 MOV ESP, EBP .text:00401144 POP EBP .text:00401145 RETN .text:00401145 _main ENDP
By the way, the ExitProcess (0) function was introduced into the program text for a quick search of the required fragments in a debugger or in the disassembled code. Even with the naked eye, you'll notice that there was no optimization. Using this text, it doesn't make it difficult to restore the original C program. Now, consider the code optimized by Visual C++ 7.0.
.text:00401000 sub_401000 proc near ; CODE XREF: start+AF .text:00401000 PUSH ESI .text:00401001 MOV ESI, OAH .text:00401006 .text:00401006 loc_401006: ; CODE XREF: sub_401000+18 .text:00401006 PUSH 61H .text:00401008 PUSH 61H .text:0040100A PUSH OFFSET ACC ; "%C %C\N" .text:0040100F CALL SUB_401030 ; PRINTF .text:00401014 ADD ESP, 0CH .text:00401017 DEC ESI .text:00401018 JNZ SHORT LOC_401006 .text:0040101A PUSH 0 ; uExitCode .text:0040101C CALL DS:ExitProcess .text:00401022 POP ESI .text:00401023 RETN .text:00401023 sub_401000 ENDP
I guess that the code presented in Listing 25.7 will surprise you. But what's surprising here? Consider the code of the original C program. Two arrays of characters specified in the source code are not needed. The Visual C++ compiler immediately noticed this and modified the code so that it became impossible to restore the original code despite all efforts. Naturally, the optimization became possible only because these arrays are used in a limited area.
Now, it is time to consider some methods of optimization that might be useful when writing programs in Assembly language.
When dealing with the CPU, the topic of speed versus size is important. For example, the MOV EAX, EBX command executes faster than XCHG EAX, EBX ; however, its code is longer. Being aware of this feature, you can either set the size of your program or make it execute faster. For example, such operations as MUL are often replaced by other commands, such as SHL , and DIV operations are commonly replaced by SHR . Doing so can considerably increase the program's performance but increase its size. The interesting point is that arithmetical operations can also be carried out using the LEA command. Contemporary translators actively use this feature. Thus, the MUL command isn't encountered in the translated code as frequently as it might seem at first glance (based on the program's source code). Generally, it is a good practice to investigate properties of specific commands because you'll discover lots of interesting facts. For example, unsigned modulo-4 division is carried out as follows : AND EAX, 0003h an original approach, isn't it?
As you'll detect, there are some reserves . For example, the conditional check can be organized to minimize the number of jumps. This will make the program fragment under consideration run faster. Assume that you have the following code fragment:
... CMP EAX, 100 JB L1 Fragment 1 JMP L2 L1: Fragment 2 L2:
You know, however, that the EAX register most often contains the values smaller than 100. Consequently, this fragment can be replaced by the following:
... CMP EAX, 100 JNB L1 Fragment 2 JMP L2 L1: Fragment 1 L2:
Consider the following code fragment:
P1 PROC . . . CALL P2 RET P1 ENDP . . . P2 PROC . . . RET P2 ENDP
This fragment can be replaced by a mote efficient one. This approach can often be encountered when viewing program code using a debugger:
P1 PROC . . . JMP P2 P1 ENDP . . . P2 PROC . . . RET P2 ENDP
The code becomes shorter and faster; however, it becomes less understandable. To conclude this description of optimization techniques, I'd like to recommend that you read a book [12], in which you'll find more information on this topic.
| ||