5.2. Built-In IDA Pro Programming Language


The IDA Pro disassembler has a built-in programming language, through which it is possible to extend the disassembler's functionality by writing small programs for analyzing the disassembled code.

5.2.1. About the IDA Pro Built-In Programming Language

The built-in IDA Pro programming language is a simplified variant of the classical C. The name of this language is IDC (short for Interactive Disassembler C). The IDC subdirectory contains several programs written in this language. IDA Pro uses these programs for analyzing disassembled texts. All of these programs are easily analyzable, so you can use them for studying the IDC language.

General Information

There are two methods of executing IDC commands.

  • The first method consists of using the command window. To call the command window, use either the File | IDC command... menu items or the <Shift>+<F2> shortcut. The command window is shown in Fig. 5.8. You can use the edit field in this window to enter the sequence of IDC commands, separating commands with a semicolon. After you enter the commands and click OK, IDA Pro will interpret the supplied commands and try to execute them. Thus, using this window, it is possible to write simple programs in the IDC language.

  • A more fundamental approach is creating a file with the IDC file name extension, which would contain the code written in IDC. To load a program, use the File | Idc file menu. In this case, the program is compiled and then executed immediately. In addition, the new window (Fig. 5.9) with buttons for editing and executing a program appears in the main IDA Pro window.

Figure 5.8: The command window that allows execution of the sequence of the IDC language constructs

Figure 5.9: Toolbar for editing and executing an IDC program

Now, consider the program structure and the IDC language syntax.

Program Structure and IDC Language Syntax

Functions

Similar to the C programming language, programs written in IDC are made up of functions. As usual, program execution starts from the main function. The function structure appears as in Listing 5.11.

Listing 5.11: Structure of the IDC function

    static func(arg1, arg2, ...) { ... }

All functions must be declared as static. When specifying arguments, it is not necessary to specify their types because there are only two variable types in IDC: string variables and numeric variables. Thus, the variable type can be easily determined by the first assignment operation. All types are converted automatically.
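As a minimal sketch of this structure, the fragment below defines one helper function and calls it from main. The helper name max2 and its arguments are hypothetical and chosen only for illustration; they do not come from the programs in the IDC subdirectory.

```idc
#include <idc.idc>

// Hypothetical helper: returns the larger of its two numeric arguments.
// No argument types are declared; they follow from the values passed in.
static max2(a, b)
{
    if (a > b)
        return a;
    return b;
}

static main()
{
    // Prints 7 to the message window.
    Message("%d\n", max2(3, 7));
}
```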

Variables

All variables are local. They are declared using the auto keyword. Again, there are two types of variables: numeric and string. The maximum length of a string variable is 255 characters. Numeric variables are subdivided into two types: 32-bit signed integers and floating-point numbers. The translator determines the variable type by the first assignment operator that assigns some value to it.

Type conversion deserves special attention. Consider several typical situations:

First, there is conversion of a string variable to an integer type. If the left part of the string is a decimal number, the conversion result is equal to that number; otherwise, the result will be zero, as in Listing 5.12.

Listing 5.12: Fragment of an IDC program illustrating conversion of a string variable to an integer type

    auto a, b, c, d;
    c = "w";
    d = "q";
    a = "451";
    b = "123qwert234";
    c = a;
    d = b;
    Message("%d:%d\n", c, d);

The program will output the following: 451:123.

Note 

The Message function of the IDC built-in language outputs information into the message window (or message console). IDA Pro opens this window at start-up. In particular, IDA Pro outputs to this window all messages about executable code loading and analysis. The IDC Message function is an analogue of the standard printf function in C.

Another possibility is conversion of an integer to a string type. This conversion appears unusual if you expect the number simply to be replaced by its textual form without changing the value (2345 → "2345"). Instead, each byte of the number, taken from right to left (starting with the least significant byte), is converted to the character with that code in the appropriate encoding, and the characters are placed in the resulting string from left to right, as in Listing 5.13.

Listing 5.13: Fragment of an IDC program illustrating conversion of an integer to a string type

    auto i1;
    auto a;
    a = 0x4241;
    i1 = "q";
    i1 = a;
    Message("%s\n", i1);

The AB string will be output as the result of executing this fragment.
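The byte order of this conversion can be made explicit by extracting the bytes by hand. The following sketch relies only on the bitwise operators and on the %c format of Message, so it does not depend on automatic conversion at all; the variable name is arbitrary.

```idc
static main()
{
    auto a;
    a = 0x4241;
    // The least significant byte (0x41, 'A') is output first,
    // followed by the next byte (0x42, 'B').
    Message("%c%c\n", a & 0xFF, (a >> 8) & 0xFF);
}
```

Like the fragment in Listing 5.13, this prints AB: the rightmost byte of the number becomes the leftmost character.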

You may also see conversion of a string type to a floating-point number. This type is converted in the same way string data is converted to numeric type.

Finally, there is conversion of floating-point numbers to the string data type. In this case, the numbers are converted according to a simple method: Each digit of the number, as well as the decimal point, is converted to the corresponding string character. Some loss of precision, however, is possible. For example, see Listing 5.14.

Listing 5.14: IDC program fragment converting floating-point numbers to the string data type

    auto i1;
    auto a;
    a = " ";
    i1 = 3.5;
    a = char(i1);
    Message("%s\n", i1);

As a result of executing this fragment, the following string will be output:

    3.5000000000000018318681 

The conversions described above take place in the course of assignment, when the value in the right-hand part of the assignment operator is converted to the type of the variable in the left-hand part. Type conversion also takes place in the following cases:

  • If the arithmetic expression contains at least one floating-point data item, then all variables that participate in this expression will be converted to this type so that the expression will operate over floating-point numbers.

  • If bitwise operations are executed over the variable, then this variable is considered a numeric integer variable.

The following operations can be executed over numeric variables: assignment, comparison, addition, subtraction, multiplication, and division. In addition, it is possible to carry out bitwise operations over integer variables: shifts (>> and <<), bitwise AND (&), bitwise OR (|), bitwise NOT (~), and bitwise exclusive OR (^). It is also possible to increment (++) and decrement (--) integer numbers. String variables allow the following operations: assignment (=), comparison (==), and concatenation (+).
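These operators combine as they do in C. The following sketch uses only literal values, so it does not depend on any loaded program; the variable names are arbitrary.

```idc
static main()
{
    auto n, s;
    n = 0x0F;
    n = (n << 4) | 0x03;     // 0xF0 | 0x03 gives 0xF3
    n = n ^ 0xFF;            // inverting the low eight bits gives 0x0C
    n++;                     // 0x0D
    Message("%x\n", n);      // prints d
    s = "IDA" + " " + "Pro"; // concatenation
    if (s == "IDA Pro")      // string comparison
        Message("%s\n", s);  // prints IDA Pro
}
```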

Main Constructs

The IDC language supports the main C constructs that modify the execution order.

  • Conditional constructs, such as if/else

  • Loops, such as while and do, along with the break and continue statements

  • Loops with counters, such as for

  • Operators for returning from functions (return)

The IDC language lacks such C statements as goto and switch.
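Because switch is not available, a multiway branch can be written as a chain of if/else statements. A minimal sketch follows; the classify function and the strings it returns are hypothetical.

```idc
// Hypothetical helper that stands in for a C switch statement.
static classify(n)
{
    if (n == 0)
        return "zero";
    else if (n == 1)
        return "one";
    return "many";
}

static main()
{
    auto i;
    // A counted loop calls the helper for 0, 1, and 2.
    for (i = 0; i < 3; i++)
        Message("%d: %s\n", i, classify(i));
}
```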

Directives

The IDC language supports the following preprocessor directives used in C:

  • #define

  • #undef

  • #include

  • #error

  • #ifdef, #ifndef, #else, #endif
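A short sketch of how these directives might be combined is shown below. The names START_EA and VERBOSE are hypothetical (idc.idc does not define them), and the sketch assumes that conditional directives may appear inside a function body, as they may in C.

```idc
#include <idc.idc>

// Hypothetical names introduced only for this example.
#define START_EA 0x401020
#define VERBOSE  1

static main()
{
#ifdef VERBOSE
    Message("start address: %x\n", START_EA);
#else
    Message("quiet mode\n");
#endif
}
```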

Controlling Strings

The IDC language supports the minimal set of operations for controlling string variables. In contrast to the C language, in IDC strings are not sequences of characters. Rather, strings are some closed elements (or objects) of an undefined structure, for which the concatenation operation and some simple functions are defined.

For the concatenation operation, the + character is defined, for example, as in Listing 5.15.

Listing 5.15: IDC program fragment illustrating the concatenation operation

image from book

 auto  s1, s2, s3; s1 = "Hello"; s2 = "world!"; s3 = s1 + " " + s2; Message("%s\n", s3); 

image from book

As the result, the Hello world! string will be output to the console.

The main functions for working with strings are as follows:

  • strlen: Return the string length. The only parameter of this function is a string variable or a constant.

  • strstr: Search for a substring within a string. The first argument of this function is the string to be searched, and the second argument is the substring to search for. The function returns the number of the character from which the found substring starts. Numbering of characters within the string starts from zero. If the specified substring is not found, the function returns -1.

  • substr: Select and return the specified substring of a string. The first parameter of this function is the source string. The second and the third parameters are the starting and the ending characters of the selected fragment, respectively. Character numbers are counted from zero. The function returns the selected fragment of the string.

  • ltoa: Convert an integer number to a string. The first argument is the numeric variable or constant, and the second argument specifies the numeral system in which the number will be represented. The function returns the string representing the supplied number in the numeral system specified. In case of error, an empty string will be returned.

  • atol: Convert a string to an integer number. The only argument of this function is the string. In case of error, this function returns zero.
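The following sketch exercises these functions on a literal string. The string value is arbitrary, the string-to-integer conversion is spelled atol as in the prototypes of idc.idc, and no outputs are listed for substr and ltoa because the exact boundary handling and digit case are best confirmed against the comments in idc.idc for the IDA Pro version in use.

```idc
static main()
{
    auto s, pos;
    s = "sub_401050";
    // Length of the string in characters.
    Message("%d\n", strlen(s));
    // Zero-based position of the first underscore, or -1 if absent.
    pos = strstr(s, "_");
    Message("%d\n", pos);
    if (pos != -1)
    {
        // Fragment starting at character 0 and ending at index pos;
        // whether the character at pos itself is included should be
        // checked for the IDA Pro version in use.
        Message("%s\n", substr(s, 0, pos));
    }
    // The number 255 written in base 16.
    Message("%s\n", ltoa(255, 16));
    // A decimal string converted back to a number.
    Message("%d\n", atol("451"));
}
```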

5.2.2. Built-In Functions and IDC Programming Examples

This section is not a reference on built-in IDC functions, because the IDA Pro online help system contains a list of these functions. I'll provide a small overview of the functions most important for analysis of program code. Also, I'll provide several examples of their use. Based on these examples, you'll be able to write a small program for code analysis on your own.

In addition to the IDA Pro online help system, you can obtain reference information from the idc.idc file stored in the IDC subdirectory. This file contains constant definitions and function prototypes, along with brief comments. This file is a header file to include with programs written in the IDC language. This is done in a standard manner, using the #include directive. In addition, the IDC subdirectory contains several simple but useful programs in the IDC language.

Virtual Memory Access

Recall that, before analyzing an executable module, IDA Pro creates virtual memory, to which it then loads that module. By accessing individual cells of this virtual memory, you access the program code loaded there. Note that the code loaded into virtual memory is previously analyzed by IDA Pro.

Navigating the Memory

Consider the program in Listing 5.16, written in the IDC language.

Listing 5.16: Example IDC program illustrating memory access and navigation

    #include <idc.idc>
    static main()
    {
        auto ad;
        ad = 0x401020;
        while(ad <= 0x401041)
        {
            Message("%x\n", ad);
            ad = NextAddr(ad);
        }
    }

Anyone accustomed to writing C programs will have no difficulty understanding this program. The Message function was covered earlier. It only remains to describe the NextAddr function. This function has a self-explanatory name: It returns the next linear address after the one passed as the function's argument. If such an address doesn't exist, the function returns -1. For this value, there is the BADADDR constant in the idc.idc file.

The result of executing this program is a column of addresses from 0x401020 to 0x401041, inclusive. Clearly, the same result will be obtained if you add one to the ad variable at each loop iteration. Also, there is the PrevAddr function, which is similar to the NextAddr function but returns the previous address.

Finally, there is another helpful function that can be used to search for a specified byte sequence within the disassembled text (and thus to navigate it). This is the FindBinary function. The first argument of this function is the starting address of the search operation. The second argument is the search-mode flag. Bit 0 of this flag defines the search order (the 0 value is for direct search order, and the 1 value stands for searching in the inverse order). Bit 1 sets the case-sensitive search mode (0 for a case-insensitive search and 1 for a case-sensitive search). The third argument of the function is the sequence of codes of the bytes to search for. The bytes must be separated by blank characters, and the whole sequence must be enclosed in quotation marks. The current numeral system is used in the course of searching. The function returns the starting address of the found sequence. If the sequence hasn't been found, the function returns -1. The function call appears as follows: ad = FindBinary(0x404020, 0, "34 AF 56 30").

Reading and Writing

As already mentioned, the NextAddr or PrevAddr function can return -1 if the next or previous address, respectively, does not exist. This means that the respective address either is not available or has not been initialized. What should you do if the command simply tries to access some address? How is it possible to know beforehand whether that address is available? For this purpose, there is the GetFlags function, the only argument of which is a virtual address. The function returns the flags of this address (the attribute). The required flags are checked using the FF_IVL constant (Listing 5.17). The value of this constant is defined in the idc.idc file.

Listing 5.17: Simple IDC program that illustrates memory reading

    #include <idc.idc>
    static main()
    {
        auto ad, i;
        for(ad = 0x401020; ad <= 0x401041; ad++)
        {
            Message("%x........", ad);
            if(GetFlags(ad) & FF_IVL)
            {
                // Output the value of the byte read from the memory.
                i = Byte(ad);
                if (i > 31)
                    Message("%x...%c\n", i, i);
                else
                    Message("%x...\n", i);
            } else
            {
                // The byte value is undefined.
                Message("Error!\n");
            }
        }
    }

The IDC language provides three functions for reading from virtual memory: Byte, Word, and Dword. The argument of all three functions is a virtual address. According to their names, these functions return byte, word, and double word values. The program in Listing 5.17 reads the block of virtual memory and outputs it into the message window.

The result of executing the program in Listing 5.17 is presented in Listing 5.18.

Listing 5.18: Output of the program presented in Listing 5.17

    401020........8b...<
    401021........44...D
    401022........24...$
    401023........4...
    401024........6a...j
    401025........0...
    401026........68...h
    401027........0...
    401028........10...
    401029........40...@
    40102a........0...
    40102b........6a...j
    40102c........0...
    40102d........68...h
    40102e........ec...ì
    40102f........50...P
    401030........40...@
    401031........0...
    401032........50...P
    401033........ff...Ÿ
    401034........15...
    401035........c8...È
    401036........50...P
    401037........40...@
    401038........0...
    401039........6a...j
    40103a........0...
    40103b........ff...Ÿ
    40103c........15...
    40103d........0...
    40103e........50...P
    40103f........40...@
    401040........0...
    401041........cc...Ì
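Listing 5.17 uses only the Byte function. The sketch below reads the same starting address with all three functions, which shows how the wider reads are composed from the individual bytes. The address is the one used in Listing 5.17; the availability check covers only the first byte.

```idc
#include <idc.idc>

static main()
{
    auto ad;
    ad = 0x401020;    // first address of the block read in Listing 5.17
    if (GetFlags(ad) & FF_IVL)
    {
        Message("byte:  %x\n", Byte(ad));
        Message("word:  %x\n", Word(ad));
        Message("dword: %x\n", Dword(ad));
    }
    else
        Message("No value at %x\n", ad);
}
```

For the bytes shown in Listing 5.18 (8b, 44, 24, 4), and assuming a little-endian target such as x86, the word would be expected to read 448b and the double word 424448b.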

For writing into virtual memory, three functions are used: PatchByte, PatchWord, and PatchDword. The first argument of these functions is the virtual memory address, and the second argument is the value written into the memory. Listing 5.19 shows a simple program that analyzes the specified memory block and changes the values of some bytes. This program is so simple that it doesn't need any comments.

Listing 5.19: Simple IDC program that analyzes the specified memory block and patches some bytes

    #include <idc.idc>
    static main()
    {
        auto ad, i, j;
        j = 0x91;
        for(ad = 0x401020; ad <= 0x401041; ad++)
        {
            if(GetFlags(ad) & FF_IVL)
            {
                i = Byte(ad);
                if(i == 0x50)
                    PatchByte(ad, j);
            }
        }
    }

The Structure of the Listing Line

In a line of an IDA Pro listing, you can find the following elements: processor instructions or data items, comments, labels, and cross-references. Unlike the raw bytes you were dealing with in the previous section, these elements carry meaning. At the same time, specific virtual memory cells are related to each listing line. These cells store instruction codes or data items.

Consider functions that can be used for analyzing lines of the disassembled listing. I intentionally use the term "listing line" to join dissimilar elements. These elements are dissimilar, first, in the locations where they are stored. In contrast to instructions and data, which are located in the virtual memory (the file with the ID1 file name extension), the other elements listed previously are stored in special virtual arrays, which are located in the file with the ID0 file name extension. Nevertheless, they are all line elements, so I joined them within the same section.

Selecting Instructions

The program presented in Listing 5.20 outputs to the console the Assembly code located in the specified address range.

Listing 5.20: Simple IDC program that outputs the Assembly code in the specified address range

    #include <idc.idc>
    static main()
    {
        auto ad, i, j;
        ad = 0x401000;
        while(ad <= 0x401042)
        {
            // Represent operands in hex mode.
            OpHex(ad, -1);
            // Output the instruction address.
            Message("%10x ", ad);
            // Obtain the operand types.
            i = GetOpType(ad, 0);
            j = GetOpType(ad, 1);
            // Output the instruction name.
            Message("%s ", GetMnem(ad));
            if (i > 0)
            {
                // Output the first operand (if present).
                Message("%s", GetOpnd(ad, 0));
                if (j > 0)
                {
                    // Output the second operand (if present).
                    Message(",%s \n", GetOpnd(ad, 1));
                } else
                    Message("\n");
            } else
                Message("\n");
            // Go to the next instruction.
            ad = NextHead(ad, BADADDR);
        }
    }

Consider some of the functions in the preceding listing:

  • The NextHead function is the main function in this program. The first argument of this function is some virtual address. The second argument is the address that limits the range of addresses to return. I have used the BADADDR constant, which in this case is interpreted as a positive integer number — in other words, as FFFFFFFFH (not as -1). The function returns the address of the first byte of the next instruction or data item. There is a similar function that returns the address of the previous instruction or data item — PrevHead.

  • The GetMnem function returns the instruction name (a string) located at the specified address. The argument of this function is the address of the first instruction byte.

  • The GetOpnd function returns the instruction operand in the form of a string value. This function has two arguments: the instruction address and the zero-based number of the operand (its ordinal number minus 1), counted from left to right.

  • For formatting the output table, I had to use the GetOpType function. This function returns the type of operand in the processor instruction. The first argument of this function is the instruction address, and the second argument is the zero-based number of the operand, counted from left to right. If the operand is present, then the value returned by the function is greater than zero.

  • Finally, I used the OpHex function to specify the hex format for outputting numeric operands (if the corresponding operand is a number). The second argument of the function specifies the operand number. The -1 value in the function means that it must process all operands of the instruction.
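The PrevHead function mentioned above can be used in the same way to walk a range backward. The following sketch assumes the address range of Listing 5.20 and assumes that the second argument of PrevHead is the lowest address to consider, mirroring the upper limit taken by NextHead.

```idc
#include <idc.idc>

static main()
{
    auto ad;
    ad = 0x401042;    // last instruction address in the range of Listing 5.20
    while (ad != BADADDR && ad >= 0x401000)
    {
        // Output the address and the instruction name.
        Message("%10x %s\n", ad, GetMnem(ad));
        // Step to the previous instruction or data item.
        ad = PrevHead(ad, 0);
    }
}
```

The explicit comparison with BADADDR stops the loop if no previous item exists.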

Listing 5.21 presents the result of executing the program shown in Listing 5.20.

Listing 5.21: Result of executing the program shown in Listing 5.20

    401000 push ebp
    401001 mov ebp, esp
    401003 sub esp, 0Ch
    401006 mov dword ptr [ebp - 4], 0Ah
    40100d mov dword ptr [ebp - 8], 0Bh
    401014 mov dword ptr [ebp - 0Ch], 0Ch
    40101b mov eax, [ebp - 0Ch]
    40101e push eax
    40101f mov ecx, [ebp - 8]
    401022 push ecx
    401023 mov edx, [ebp - 4]
    401026 push edx
    401027 call sub_401050
    40102c add esp, 0Ch
    40102f push eax
    401030 push 4060D0h
    401035 call _printf
    40103a add esp, 8
    40103d xor eax, eax
    40103f mov esp, ebp
    401041 pop ebp
    401042 retn

Parsing Data

Consider how to parse the data shown in the disassembled listing. Each data item takes at least 1 byte. The type of data item that starts from the specified address can be determined by the bits of the attributes byte located at that address. Listing 5.22 provides the list of data types and flags that correspond to them, as defined in the idc.idc file.

Listing 5.22: Data types and flags of the attributes byte as defined in the idc.idc file

    #define FF_BYTE       0x00000000L   // Byte
    #define FF_WORD       0x10000000L   // Word
    #define FF_DWRD       0x20000000L   // Dword
    #define FF_QWRD       0x30000000L   // Qword
    #define FF_TBYT       0x40000000L   // Tbyte
    #define FF_ASCI       0x50000000L   // ASCII?
    #define FF_STRU       0x60000000L   // Struct?
    #define FF_OWRD       0x70000000L   // Octaword (16 bytes)
    #define FF_FLOAT      0x80000000L   // Float
    #define FF_DOUBLE     0x90000000L   // Double
    #define FF_PACKREAL   0xA0000000L   // Packed decimal real
    #define FF_ALIGN      0xB0000000L   // Alignment directive

The program shown in Listing 5.23 outputs to the console the addresses of data items and their lengths and types.

Listing 5.23: IDC program that outputs to the console addresses of data items and their lengths and types

    #include <idc.idc>
    static main()
    {
        auto ad, i, j;
        ad = 0x4055d6;
        while(ad <= 0x405Aff)
        {
            ad = NextHead(ad, BADADDR);
            // Output the instruction address.
            Message("%10x ", ad);
            // Obtain the flag value.
            i = GetFlags(ad);
            // Check whether this is a data item.
            if((i & MS_CLS) == FF_DATA)
            {
                Message("Data: size - %d, type - ", ItemSize(ad));
                if((i & 0xF0000000) == FF_BYTE)
                {
                    Message("byte\n");
                    continue;
                }
                if((i & 0xF0000000) == FF_WORD)
                {
                    Message("word\n");
                    continue;
                }
                if((i & 0xF0000000) == FF_DWRD)
                {
                    Message("dword\n");
                    continue;
                }
                if((i & 0xF0000000) == FF_QWRD)
                {
                    Message("qword\n");
                    continue;
                }
                if((i & 0xF0000000) == FF_TBYT)
                {
                    Message("tbyte\n");
                    continue;
                }
                if((i & 0xF0000000) == FF_ASCI)
                {
                    Message("string ASCII\n");
                    continue;
                }
                if((i & 0xF0000000) == FF_STRU)
                {
                    Message("structure\n");
                    continue;
                }
                if((i & 0xF0000000) == FF_OWRD)
                {
                    Message("octaword\n");
                    continue;
                }
                if((i & 0xF0000000) == FF_FLOAT)
                {
                    Message("float\n");
                    continue;
                }
                if((i & 0xF0000000) == FF_DOUBLE)
                {
                    Message("double\n");
                    continue;
                }
                if((i & 0xF0000000) == FF_PACKREAL)
                {
                    Message("packed decimal real\n");
                    continue;
                }
                if((i & 0xF0000000) == FF_ALIGN)
                {
                    Message("align\n");
                    continue;
                }
                Message("??\n");
            }
            else
                Message("?\n");
        }
    }

As you can see, this program uses the previously-mentioned NextHead function, which is the most convenient one for navigating the disassembled text.

To determine the data type, the flags of the first byte of the data item are used. For this purpose, the table in Listing 5.22 is used. The required bits are selected using the expression i & 0xF0000000.

Finally, the length of the data is determined using the ItemSize function. The only argument of this function is the address of the first byte of the data item.
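The same mask can be used to pick out a single data type. The following sketch counts the ASCII strings in the address range of Listing 5.23 and adds up their lengths; the counter names are arbitrary, and the loop tests for BADADDR explicitly so that it stops when no further item exists.

```idc
#include <idc.idc>

static main()
{
    auto ad, f, count, bytes;
    count = 0;
    bytes = 0;
    ad = NextHead(0x4055d6, BADADDR);
    while (ad != BADADDR && ad <= 0x405Aff)
    {
        f = GetFlags(ad);
        // A data item whose type bits equal FF_ASCI is a string.
        if ((f & MS_CLS) == FF_DATA && (f & 0xF0000000) == FF_ASCI)
        {
            count++;
            bytes = bytes + ItemSize(ad);
        }
        ad = NextHead(ad, BADADDR);
    }
    Message("%d strings, %d bytes in total\n", count, bytes);
}
```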

Listing 5.24 presents the results of executing the program shown in Listing 5.23.

Listing 5.24: Results of executing the program in Listing 5.23

    4055d8 Data: size - 160, type - string ASCII
    405678 Data: size - 25, type - string ASCII
    405698 Data: size - 177, type - string ASCII
    405749 Data: size - 3, type - align
    40574c Data: size - 35, type - string ASCII
    40576f Data: size - 1, type - align
    405770 Data: size - 12, type - structure
    405a82 Data: size - 66, type - string ASCII
    405c84 Data: size - 2, type - word

Other Elements of the Code Line

Other elements of the code line are comments (automatically created or entered by the user), labels (software labels and variables), and cross-references. You not only can obtain these elements programmatically but also can add such elements into the line of code.

Listing 5.25 provides a fragment of the idc.idc file, containing the list of all possible elements and values of the flags of the first byte of an instruction or data item, supplied with my comments.

Listing 5.25: All possible flag elements and values for byte 1 of a data item or instruction (idc.idc file)

    #define FF_COMM 0x00000800L // Has a comment?
                                // Comment
    #define FF_REF  0x00001000L // Has references?
                                // Cross-reference
    #define FF_LINE 0x00002000L // Has the next or previous comment lines?
                                // Line of a multiline comment
    #define FF_NAME 0x00004000L // Has a user-defined name?
                                // User-defined label or name
    #define FF_LABL 0x00008000L // Has a dummy name?
                                // Label (name)
    #define FF_FLOW 0x00010000L // Execute flow from the previous instruction?
                                // Cross-reference to the previous instruction
    #define FF_VAR  0x00080000L // Is a byte variable?
                                // Variable (label for a data item)

The program in Listing 5.26 walks through the listing generated by IDA Pro and finds software labels, which it outputs to the message console. The code lines that contain labels are supplied with comments (the Label! string).

Listing 5.26: Program that views the IDA Pro listing and finds software labels for console output

    #include <idc.idc>
    static main()
    {
        auto ad, i, j;
        ad = 0x401cfe;
        while(ad <= 0x401d41)
        {
            ad = NextHead(ad, BADADDR);
            // Output the instruction address.
            Message("%10x ", ad);
            i = GetFlags(ad);
            if(i & FF_LABL)
            {
                Message("%s  \n", GetTrueName(ad));
                MakeComm(ad, "Label!");
            } else
                Message("\n");
        }
    }

The code lines with labels are sought by going from line to line and checking the appropriate bit of the first byte of the element (an instruction or a data item) using the FF_LABL constant. To create a comment, the MakeComm function is used. The first argument of this function is the address of the line, and the second argument is the comment string.
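The other flags in Listing 5.25 can be tested in the same way. The following sketch reports, for the same address range, which lines carry a user-defined name, a comment, or a cross-reference. It only reads the flags and changes nothing in the database; the message texts are arbitrary.

```idc
#include <idc.idc>

static main()
{
    auto ad, f;
    ad = 0x401cfe;
    while (ad != BADADDR && ad <= 0x401d41)
    {
        f = GetFlags(ad);
        if (f & FF_NAME)
            Message("%x: user-defined name %s\n", ad, GetTrueName(ad));
        if (f & FF_COMM)
            Message("%x: has a comment\n", ad);
        if (f & FF_REF)
            Message("%x: has cross-references\n", ad);
        // Go to the next instruction or data item.
        ad = NextHead(ad, BADADDR);
    }
}
```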

Working with Functions

A function is a listing object that can be made up of several code lines containing instructions. The function has its starting and ending addresses, as well as other properties (Listing 5.27). Dividing the disassembled code into functions allows considerable improvement of the listing's readability and simplifies understanding of the program's operating logic.

Listing 5.27: Fragment of the idc.idc file containing the list of flags defining the function properties

    #define FUNC_NORET    0x00000001L // Function doesn't return.
                                      // The function doesn't return control
                                      // with the ret instruction.
    #define FUNC_FAR      0x00000002L // Far function
                                      // The function returns control
                                      // with the retf instruction.
    #define FUNC_LIB      0x00000004L // Library function
                                      // The library function
    #define FUNC_STATIC   0x00000008L // Static function
                                      // A static function
    #define FUNC_FRAME    0x00000010L // Function uses a frame pointer (BP).
                                      // The function uses the EBP register as a pointer
                                      // to local variables and parameters.
    #define FUNC_USERFAR  0x00000020L // User has specified farness.
                                      // The function is defined as far by the user.
    #define FUNC_HIDDEN   0x00000040L // Hidden function
                                      // A hidden (collapsed) function
    #define FUNC_THUNK    0x00000080L // Thunk (jump) function
                                      // A stub function containing only
                                      // the jump instruction
    #define FUNC_BOTTOMBP 0x00000100L // BP points to the bottom
                                      // of the stack frame;
                                      // the EBP register points to the "bottom"
                                      // of the stack frame.

Listing 5.27 presents the list of flags that define the function properties. This list is a fragment of the idc.idc file supplied with my comments.

The program in Listing 5.28 outputs to the console names of the functions within the specified interval of addresses, and it sets comments for library functions.

Listing 5.28: Outputting function names within the address interval; setting library function comments

    #include <idc.idc>
    static main()
    {
        auto ad, s, i;
        ad = 0x401000;
        while(ad <= 0x4030bc)
        {
            s = GetFunctionName(ad);
            Message("%s\n", s);
            i = GetFunctionFlags(ad);
            if(i & FUNC_LIB)
            {
                SetFunctionCmt(ad, "This is a library function", 1);
            }
            ad = NextFunction(ad);
        }
    }

To navigate the functions of the listing generated by IDA Pro, the NextFunction and PrevFunction functions are used. The only parameter of these functions is an address. Both functions return an address: The NextFunction function returns the address of the next function (the one used in the program), and PrevFunction returns the address of the previous function.

The program outputs to the console the names of all functions it has encountered. They are returned by the GetFunctionName IDC function. Any address belonging to a function can serve as the argument of GetFunctionName.

For obtaining the function flags, the GetFunctionFlags function is used. The flags were listed in Listing 5.27.

The program sets comments for all library functions that it has encountered (that is, those that IDA Pro considers library functions). For this purpose, the SetFunctionCmt function is used. This function has three arguments: the function address, the comment string, and the comment type. Two types of comments can be set for functions: regular (parameter 0) and repeatable (parameter 1). A regular comment is present only before the function definition, while a repeatable comment is duplicated at all calls to this function.
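The same navigation functions can be used simply to gather statistics without modifying the listing. The sketch below counts all functions and the library functions among them. It assumes that no function starts at address 0, so that NextFunction(0) returns the first function, and it stops when NextFunction returns BADADDR.

```idc
#include <idc.idc>

static main()
{
    auto ad, total, lib;
    total = 0;
    lib = 0;
    // Assumption: the first function lies above address 0.
    ad = NextFunction(0);
    while (ad != BADADDR)
    {
        total++;
        if (GetFunctionFlags(ad) & FUNC_LIB)
            lib++;
        ad = NextFunction(ad);
    }
    Message("%d functions, %d of them library functions\n", total, lib);
}
```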

User Interface Elements

The IDA Pro disassembler provides a minimal set of functions for automating input and output. These include output to the message console (the Message function, which has already been used several times), control of the cursor in the disassembled listing, several types of dialogs, and a few other functions.

The program in Listing 5.29 searches for three sequential PUSH instructions within the specified address range and moves the cursor to that group of commands. The cursor is moved with the Jump command, whose argument is a virtual address.

Listing 5.29: Locating three sequential PUSH commands and moving the cursor to that group


    #include <idc.idc>

    static main()
    {
        auto ad, s;

        ad = 0x401000;
        while (ad <= 0x4030bc)
        {
            if (GetMnem(ad) == "push" &&
                GetMnem(NextHead(ad, BADADDR)) == "push" &&
                GetMnem(NextHead(NextHead(ad, BADADDR), BADADDR)) == "push")
            {
                // Move the cursor to the located address.
                Jump(ad);
                // Exit the loop.
                break;
            }
            ad = NextHead(ad, BADADDR);
        }
    }
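The dialogs mentioned earlier can replace the hard-coded address range. The sketch below is a simplified variant that looks for a single PUSH instruction. It assumes the classic IDC dialog functions AskAddr, AskYN, and Warning as declared in idc.idc for IDA Pro versions of this period; the prompt texts are arbitrary.

```idc
#include <idc.idc>

static main()
{
    auto start, end, ad;

    // AskAddr returns BADADDR if the user cancels the dialog.
    start = AskAddr(0x401000, "Start address");
    if (start == BADADDR) return;
    end = AskAddr(0x4030bc, "End address");
    if (end == BADADDR) return;

    ad = start;
    while (ad != BADADDR && ad <= end)
    {
        if (GetMnem(ad) == "push")
        {
            // AskYN returns 1 for Yes, 0 for No, and -1 for Cancel.
            if (AskYN(1, "PUSH found. Move the cursor to it?") == 1)
                Jump(ad);
            return;
        }
        ad = NextHead(ad, BADADDR);
    }

    Warning("No PUSH instruction was found in the range");
}
```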


Other Possibilities of Code Analysis in IDA Pro

Although I have no room to consider the entire range of the IDC functional capabilities or, to be more precise, the library of functions provided by IDA Pro, I'd like to cover several interesting and important issues.

Structures and Enumerations

In the IDA Pro disassembler, there are built-in capabilities that allow you to automatically recognize and determine such important high-level language constructs as structures and enumerations. In IDA Pro, both structures and enumerations are characterized by three specific features that allow you to identify them:

  • Identifier of a structure or enumeration

  • Name of the structure or enumeration

  • Index of a structure or enumeration

The program presented in Listing 5.30 outputs to the message console the list of identifiers and names of all structures that IDA Pro has recognized when analyzing the executable code.

Listing 5.30: Outputting names and identifiers of all structures detected by IDA Pro


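    #include <idc.idc>

    static main()
    {
        auto n, i, s;

        n = 0;
        while (n != -1)
        {
            i = GetStrucId(n);        // identifier of the structure with index n
            s = GetStrucName(i);      // name of the structure with identifier i
            n = GetNextStrucIdx(n);   // index of the next structure, or -1
            Message("%x %s\n", i, s);
        }
    }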


The GetNextStrucIdx function returns the index of the structure that follows the specified index. The GetStrucId function returns the structure identifier by its index, and the GetStrucName function returns the structure name by its identifier. Bear in mind that the indexes of structures and enumerations can change in the course of analysis because new structures can be added and existing ones can be deleted; identifiers, however, remain unchanged.
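Enumerations can be listed in a similar way. The following sketch assumes the idc.idc functions GetEnumQty (number of enumerations), GetnEnum (identifier by index), and GetEnumName (name by identifier); it is an illustration rather than part of the original listing.

```idc
#include <idc.idc>

static main()
{
    auto n, i, id;

    n = GetEnumQty();                 // number of enumerations in the database
    i = 0;
    while (i < n)
    {
        id = GetnEnum(i);             // identifier of the enumeration with index i
        Message("%x %s\n", id, GetEnumName(id));
        i = i + 1;
    }
}
```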

Working with Files

Built-in functions also allow you to work with files. Using the GenerateFile function, it is possible to generate a report file. This function is equivalent to the File | Produce File menu commands.
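A minimal sketch of a GenerateFile call is shown below. It assumes the signature GenerateFile(type, file_handle, start_address, end_address, flags) and the OFILE_ASM constant from idc.idc; the output file name and the address range are hypothetical.

```idc
#include <idc.idc>

static main()
{
    auto h;

    h = fopen("listing.asm", "w");    // hypothetical output file name
    if (h == 0)
    {
        Warning("Cannot create the output file");
        return;
    }

    // Write an assembly listing for the given address range.
    GenerateFile(OFILE_ASM, h, 0x401000, 0x4030bc, 0);
    fclose(h);
}
```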

The IDA Pro disassembler supports a set of functions for controlling files of an arbitrary structure. This set of functions in general corresponds to the set of standard library functions for working with files, which are defined in the stdio.h and io.h header files. These functions are as follows:

  • fopen — Open a file. This function returns the descriptor, which is then used in other functions.

  • fclose — Close the file descriptor.

  • filelength — Return the length of the file previously opened by the fopen function.

  • fgetc — Read one character from the file.

  • fputc — Write one character into the file.

  • ftell — Obtain the current position of the pointer.

  • fseek — Move the pointer to the specified position within a file.
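The functions listed above can be combined as in the following sketch, which counts the line-feed characters in a file. The file name is hypothetical, and the fragment assumes that fopen returns 0 on failure, that fgetc returns -1 at the end of the file, and that an origin of 0 in fseek means the start of the file.

```idc
#include <idc.idc>

static main()
{
    auto h, len, c, count;

    h = fopen("notes.txt", "rb");     // hypothetical input file name
    if (h == 0)
    {
        Warning("Cannot open the file");
        return;
    }

    len = filelength(h);
    Message("File length: %d bytes\n", len);

    count = 0;
    fseek(h, 0, 0);                   // move the pointer to the start of the file
    while (ftell(h) < len)
    {
        c = fgetc(h);
        if (c == -1) break;           // end of file or read error
        if (c == 0x0A) count = count + 1;
    }
    fclose(h);

    Message("Line feeds: %d\n", count);
}
```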




Disassembling Code: IDA Pro and SoftICE
ISBN: 1931769516
EAN: 2147483647
Year: 2006
Pages: 63
Authors: Vlad Pirogov
