5.1. Introduction to IDA Pro

IDA stands for Interactive Disassembler, although the About window displays a beautiful young lady. This instrument is so elegant that its name makes you imagine someone like her.

5.1.1. Getting Started

First, it is necessary to mention that the IDA Pro distribution set includes both console (idaw.exe) and graphical (idag.exe) variants of the program. All further sections will mainly relate to the GUI variant.

General Information About Virtual Memory

If you load an executable module into IDA Pro, two auxiliary files with the ID0 and ID1 file name extensions are created in the directory from which you loaded that module. These are virtual memory files that IDA Pro uses for storing intermediate data. After you unload the module (using the File | Close menu command), both files disappear. The file that has the same name as the loaded executable module and the ID1 file name extension holds the image of that module. This image is identical to the image loaded into the 32-bit flat memory model of the Windows operating system, which ensures that the module being investigated is identical to the module executed by the operating system. This feature brings IDA Pro close to a debugger. For each address, the file stores a 32-bit characteristic: an 8-bit cell corresponding to the given address and a 24-bit attribute defining various properties of this cell. In particular, this attribute specifies whether the given memory cell belongs to an instruction or to data (and, in the latter case, the type of the data item). Furthermore, this attribute specifies whether other objects, such as comments, cross-references, or labels, are attached to the line.
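The 32-bit characteristic described above can be modeled in a few lines of Python. This is only a sketch: the text fixes the 8-bit and 24-bit split, but the position of the two fields within the word and the individual attribute bits used here are assumptions made for illustration, not the layout IDA Pro actually uses.

```python
VALUE_BITS = 8            # width of the cell value (one byte of the image)
VALUE_MASK = 0xFF
ATTR_MASK = 0xFFFFFF      # width of the attribute field (24 bits)

# Hypothetical attribute bits; the real bit assignments are not given in the text.
ATTR_IS_CODE = 0x000001
ATTR_IS_DATA = 0x000002
ATTR_HAS_COMMENT = 0x000004
ATTR_HAS_XREF = 0x000008
ATTR_HAS_LABEL = 0x000010


def pack_cell(value, attrs):
    """Combine an 8-bit value and a 24-bit attribute into one 32-bit word."""
    if not 0 <= value <= VALUE_MASK:
        raise ValueError("value must fit into 8 bits")
    if not 0 <= attrs <= ATTR_MASK:
        raise ValueError("attributes must fit into 24 bits")
    # Assumed layout: value in the low byte, attributes in the upper 24 bits.
    return (attrs << VALUE_BITS) | value


def unpack_cell(cell):
    """Split a 32-bit word back into (value, attributes)."""
    return cell & VALUE_MASK, (cell >> VALUE_BITS) & ATTR_MASK


# The byte C3h (RETN), marked as code that also carries a label.
cell = pack_cell(0xC3, ATTR_IS_CODE | ATTR_HAS_LABEL)
value, attrs = unpack_cell(cell)
```

Storing the attributes next to every byte is what lets the disassembler answer questions such as "is this byte part of an instruction?" without consulting a separate table.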

The mechanisms IDA Pro uses for working with virtual memory are similar to those used by the Windows operating system. When an individual cell is accessed, the entire page containing this cell is loaded into the main memory (buffer). If the memory cell is modified, the entire virtual memory page is rewritten. IDA Pro holds part of the memory pages in random access memory. Modified cells are periodically flushed to the disk. When it is necessary to load a page but the page buffer is full, IDA Pro searches the buffer to find the page that was modified first, flushes it to the disk, and loads the required page into the freed space.
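The replacement policy just described can be sketched as follows. The class name PageBuffer, the page size, and the use of a dictionary in place of the disk file are all assumptions chosen to keep the example self-contained. The text also does not say what happens when the buffer is full and no page has been modified; dropping the oldest loaded page in that case is my own choice.

```python
PAGE_SIZE = 4096  # assumed page size; the text does not specify one


class PageBuffer:
    """Toy page buffer that evicts the page that was modified first."""

    def __init__(self, capacity, backing):
        self.capacity = capacity   # maximum number of pages held in memory
        self.backing = backing     # dict page_no -> bytes, stands in for the disk file
        self.pages = {}            # page_no -> bytearray currently in memory
        self.dirty_order = []      # page numbers in order of first modification

    def _load(self, page_no):
        if page_no in self.pages:
            return self.pages[page_no]
        if len(self.pages) >= self.capacity:
            self._evict()
        data = bytearray(self.backing.get(page_no, bytes(PAGE_SIZE)))
        self.pages[page_no] = data
        return data

    def _evict(self):
        if self.dirty_order:
            # Flush the page that was modified first, then free its slot.
            victim = self.dirty_order.pop(0)
            self.backing[victim] = bytes(self.pages[victim])
        else:
            # Assumption: with no modified pages, drop the oldest loaded page.
            victim = next(iter(self.pages))
        del self.pages[victim]

    def read(self, addr):
        return self._load(addr // PAGE_SIZE)[addr % PAGE_SIZE]

    def write(self, addr, value):
        page_no = addr // PAGE_SIZE
        self._load(page_no)[addr % PAGE_SIZE] = value
        if page_no not in self.dirty_order:
            self.dirty_order.append(page_no)

    def flush(self):
        """Periodic flush: write every modified page back to the backing store."""
        for page_no in self.dirty_order:
            self.backing[page_no] = bytes(self.pages[page_no])
        self.dirty_order.clear()


disk = {}
buf = PageBuffer(capacity=2, backing=disk)
buf.write(0, 0xAA)                 # page 0 is modified first
buf.write(PAGE_SIZE, 0xBB)         # page 1 is modified second
buf.read(2 * PAGE_SIZE)            # buffer is full, so page 0 is flushed and replaced
```

After the last call, page 0 has been written to the backing store, while page 1 is still held only in memory.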

In addition to storing the image of the loadable module, IDA Pro requires memory for information such as labels, function names, and comments. This information is stored in the file with the ID0 file name extension. In official documentation, this memory is called memory for b-tree.

The Program Interface

General Information

In Fig. 5.1, the main IDA Pro window with the loaded executable program is shown. The background analysis of the loaded program has been completed, as designated by the message in the bottom left corner: The initial autoanalysis is finished.

Figure 5.1: The IDA Pro main window with the loaded executable module

The IDA Pro main window has lots of tabs. By default, there are nine tabs, although in reality there might be more. You can add new tabs using the Views | Open subviews... menu. Two windows can be duplicated: IDA View and Hex View. Thus, different sections of code and data can be viewed in different windows. These windows are supplied with suffixes (A, B, C, etc.) so that the user can easily distinguish them.

The main window is IDA View. This window displays the main result of executable code analysis. In this window, the user can participate in further analysis of the code.

When working with IDA Pro, do not forget that there are three main methods of controlling this program: menu commands, toolbar buttons, and hotkeys. Hotkeys do not cover all IDA Pro capabilities; however, there are hotkeys for the most frequently used operations. For example, if some data block raises your suspicion, you can always convert it into code (disassemble it) by pressing the <C> key (short for CODE). On the other hand, if some block of Assembly commands seems meaningless, you can always convert it into data by pressing the <D> key (short for DATA).

IDA Pro uses the following configuration files: ida.cfg is the common configuration file, idatui.cfg is the configuration file for the console variant of the program, and idagui.cfg is the configuration file for the GUI variant of the program. Configuration files must reside in the CFG subdirectory of the IDA Pro main directory.

Loading the Executable Code

After you load an executable module into IDA Pro, you'll see the window shown in Fig. 5.2. Using this window, you can configure the loading process and the initial analysis. This window provides lots of configuration settings that will be described in the next few sections. In most cases, IDA Pro suggests optimal settings so that the user doesn't need to configure anything. You only need to click the OK button and rely on your luck and the disassembler's capabilities. However, because these options are used occasionally, I'll provide brief descriptions.

Load file directory/name as contains the list of formats that can be recognized by the current IDA Pro version for the chosen module. In most cases, IDA Pro recognizes the type of file chosen for loading. Other options available in this window are set automatically depending on the chosen type of the loadable module. For example, carry out the following simple experiment. Disassemble the MS-DOS stub of some PE module (see Section 1.5.1). To achieve this, choose the MS-DOS executable option from the list. To confirm your choice, click the Set button. I'd like to point out again that this list corresponds to the choice of the PE module. PE modules can be interpreted both as normal PE modules and as MS-DOS programs, or even as binary files. If you choose a new executable (NE) module, for example, the content of this list will be different.
Processor type is a dropdown list that allows you to choose the processor, for which the chosen module was compiled.
Loading segment and Loading offset are fields that allow you to load the module into a specific segment with a specific offset, which might be useful both for MS-DOS modules and for binary files. These parameters are not used for PE modules.
Enabled is a flag from the Analysis group that allows you to disable the initial analysis of the executable code. This flag is set by default, which means that the initial analysis will be carried out after loading.
Indicator enabled specifies whether the analysis process indication should be carried out. By default, this flag is set.
Create segments is not used for PE modules. If this flag is set, IDA Pro creates the required segments.
If the Load resources flag is set, the resources of the PE module will be loaded. For binary modules, this flag is called Load as code segment and is used, for example, for COM programs.
If the Rename DLL entries flag is not set, IDA Pro provides additional comments for functions imported by ordinals; otherwise, functions are renamed at the disassembler's discretion.
If the Manual load flag is set, the disassembler will consult the user at every step of the loading process.
Fill segment gaps is a flag important only for NE modules. It instructs the disassembler to fill the intersegment space, thus creating one large segment.
Make imports segment — when this flag is set, it instructs the disassembler to interpret the .idata section only as related to the imported information. In this case, the disassembler would ignore the data that might also be contained in this section.
Don't align segments instructs the disassembler not to align segments. This flag is not used for the modules under consideration.
Kernel options1 is a button that displays the window enabling the user to configure options used when analyzing executable code, by setting the following flags:
- Using the Create offsets and segments using fixup info flag, you can instruct the disassembler to use the information from the relocations table in the course of code analysis.
- Mark typical code sequence as code instructs the disassembler to use typical processor command sequences in the course of analysis.
- Delete instructions with no xrefs allows the disassembler to ignore microprocessor instructions, for which there are no cross-references.
- Trace execution flow allows tracing, so that you can discover the processor instructions.
- Create functions if call is present instructs the disassembler to recognize functions by calls.
- Analyze and create all xrefs is one of the main options that makes the disassembler use cross-references in the main analysis.
- Use FLIRT signatures instructs the disassembler to use fast library identification and recognition technology (FLIRT) for recognizing library functions using signatures.
- Create function if data xref data->code32 exists instructs the disassembler to check the references to executable code in the data area.
- Rename jump functions as j_... allows IDA Pro to rename simple functions containing only the jmp somewhere command as j_somewhere.
- Rename empty functions as nullsub_... allows IDA Pro to rename functions containing one RET command as nullsub_....
- Create stack variables instructs the disassembler to create (define) local variables and parameters of the functions.
- Trace stack pointer instructs IDA Pro to trace the value of the ESP register.
- Create ASCII string if data xref exists instructs the disassembler to consider the data item referenced as ASCII string if its length exceeds a certain value.
- Convert 32-bit instruction operand to offset instructs the disassembler to consider a direct data item in the processor instruction as an address, provided that its value falls into the predefined interval.
- Create offset if data xref to seg32 exists instructs the disassembler to consider values stored in the data area as addresses, provided that their values fall into the predefined interval.
- Make final analysis pass instructs the disassembler to convert all uninvestigated bytes into data or instructions when carrying out the final stage of the analysis.
Kernel options2 is a button that calls another window with another set of flags for the options used in the course of executable code analysis:
- Locate and create jump tables instructs IDA Pro to draw conclusions about the address and size of the jump table.
- If the Coagulate data in the final pass flag is off, then only bytes of the code segment are converted at the last stage of analysis (see the Make final analysis pass flag).
- Automatically hide library functions instructs the disassembler to hide (collapse) library functions detected using FLIRT.
- Propagate stack argument information instructs the disassembler to save information about stack parameters of the call in case of future calls (such as a function call from another function).
- Propagate register argument information instructs the disassembler to save information about register parameters of the call in case of further calls (such as function calls from another function).
- Check for Unicode strings allows the disassembler to check the program for the presence of Unicode strings.
- Comment anonymous library functions instructs the disassembler to mark anonymous library functions using the library name and the signature, with which a specific function was detected.
- Multiple copy library function recognition allows the disassembler to recognize several copies of the same function within a program.
- Create function tails allows you to search for function tails and add them to function definitions.
Processor options is a button that calls the window with the option flags.
- Convert immediate operand of "push" to offset indicates the possibility of converting the direct operand in the PUSH command to an offset (an address).
- Convert db 90h after "jmp" to "nop" specifies to the disassembler that 90H bytes that follow the JMP command must be interpreted as NOP commands.
- Convert immediate operand of "mov reg,..." to offset indicates the possibility of converting the direct operand in the MOV reg,... command (reg stands for the register) into an offset (an address).
- Convert immediate operand of "mov memory,..." to offset indicates the possibility of converting the direct operand in the MOV mem,... command to an offset (an address).
- Disassemble zero opcode instructions instructs the disassembler to decode a pair of zero bytes as the instruction 00 00 ADD [EAX], AL. By default, this flag is off.
- Advanced analysis of Borland's RTTI (RTTI stands for run-time type information) allows IDA Pro to check and create RTTI structures.
- Check "unknown_libname" for Borland's RTTI allows the disassembler to check names marked as unknown_libname for the presence of RTTI structures.
- Advanced analysis of catch/finally block after function allows the disassembler to search for catch/finally exception processing blocks.
- Allow references with different segment bases allows the disassembler to specify references to characters even when the value stored by the specified address is not a character (doesn't represent a character code).
- Don't display redundant instruction prefixes instructs the disassembler to hide some command prefixes to improve the listing's readability.
- Interpret int 20 as VxDcall instructs the disassembler to interpret INT 20H as VxDcall/jump.
- Enable FPU emulation instructions specifies that commands such as INT 3?H must be interpreted as emulations of arithmetic coprocessor commands.
- If the Explicit RIP-addressing flag is set, it is assumed that relative instruction pointer (RIP) addressing is used in the program. This flag is in force for 64-bit processors.
System DLL directory is a field that specifies the directory where IDA Pro would search for DLLs, provided that the file with the .ids file name extension corresponds to the given library.

Figure 5.2: The window controlling executable code loading
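Several of the kernel options listed above (for example, Convert 32-bit instruction operand to offset) apply the same test: a value is treated as an address only if it falls into a predefined interval. The following Python sketch illustrates that test. The function names, the loc_ label form, and the interval bounds (borrowed from the code section of a small sample program) are assumptions for illustration; IDA Pro's actual heuristics are more involved.

```python
def looks_like_offset(immediate, min_ea, max_ea):
    """Return True if the immediate value falls into the address interval."""
    return min_ea <= immediate <= max_ea


def format_number(value):
    """Format a number in Assembly style: 3, 200h, 0A0h."""
    if value < 10:
        return str(value)
    digits = f"{value:X}"
    if digits[0] in "ABCDEF":
        digits = "0" + digits   # hex literals must start with a decimal digit
    return digits + "h"


def render_operand(immediate, min_ea, max_ea):
    """Render an immediate operand either as an offset or as a plain number."""
    if looks_like_offset(immediate, min_ea, max_ea):
        # Hypothetical label form; a real listing might use another prefix.
        return f"offset loc_{immediate:X}"
    return format_number(immediate)


# Assumed interval: a code section occupying 401000h..40102Ch.
MIN_EA, MAX_EA = 0x401000, 0x40102C

as_offset = render_operand(0x40101E, MIN_EA, MAX_EA)
as_number = render_operand(3, MIN_EA, MAX_EA)
```

The interval check explains why a small constant such as 3 stays a number, while a value that lands inside the loaded image is shown as an offset.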

The Disassembler Window

Because most work with IDA Pro is carried out in the disassembler window, it is expedient to consider this window in detail. It is necessary to point out that the developers of this disassembler have carefully considered representation of the disassembled function and methods of navigating it. Consider some key aspects related to this topic:

Hiding functions — Functions in the disassembler window can be shown in a collapsed form (hide) or an expanded form (unhide). In the collapsed form, the function is represented by a single line. This useful feature allows you to considerably improve the disassembled code's readability. To expand and collapse functions, use the <+> and <-> keys on the numeric keypad or the View | Unhide and View | Hide menu options.
Indicating jumps — Fig. 5.3 shows the disassembler window. Pay special attention to the leftmost section of the window. This section is intended for simplifying navigation of the listing. Commands are marked by dots. If a line doesn't contain a dot, the line contains a comment. When the user clicks a dot with the mouse, IDA Pro sets a breakpoint at the respective address. Jumps are designated by continuous or dashed lines. Continuous lines designate unconditional jumps, and dashed lines correspond to conditional jumps.
Using special comments — Addresses within a program that are targets of jumps (conditional and unconditional jumps or the CALL command) or that are referenced as data carry special comments. The comment starts either with CODE XREF, if the reference is a jump to the specified address, or with DATA XREF, if the address is referenced as data (for example, as follows: MOV EAX, OFFSET L1). These comments are called cross-references because the given address represents the crossing where references from other program locations meet. The cross-reference mark is followed by a colon, which, in turn, is followed by the address, counted from the start of the function or section, from which this reference originates. By clicking this address with the mouse, you can call the pop-up window with the code fragment that refers to the given instruction. The address is followed by the ↑ or ↓ character, which specifies the direction to the line of code that references this instruction. To jump to the line from which the reference originates, double-click the address with the mouse. If there are fewer than four references to the given line, they are listed; otherwise, the references are designated as dots. In this case, you can right-click one of these addresses and choose the required item from the Jump to cross reference context menu. After that, a window will appear with the list of all addresses that reference the requested code line. Choose the address you need by clicking it with the mouse (or by clicking the OK button after positioning the cursor on the required item), and you'll find yourself at the required position within the listing. The fragment of the disassembler window containing the cross-references is shown in Fig. 5.4.
Designating an address — The listing shown in the disassembler window demonstrates various methods of designating an address. For example, if you are dealing with an API function, the name of that function is explicitly specified. In addition, IDA Pro usually bases the names of references to detected strings on the content of that string. For example, if the string contains the text You are wrong!, then IDA Pro would designate the reference to that string as aYouAreWrong. In this case, the a prefix means that IDA Pro considers this string an ASCII string. All other names designating functions or data addresses are composed of a prefix and an address. For example, you can encounter the following prefixes:
- sub_ — Function
- locret_ — Address of the return instruction
- loc_ — Instruction address
- off_ — Data specifying the address (offset)
- seg_ — Data specifying the segment address
- asc_ — Address of an ASCII string
- byte_ — Byte address
- word_ — Word address
- dword_ — Double word address
- qword_ — Address of a 64-bit value
- flt_ — Address of a 32-bit floating-point number
- dbl_ — Address of a 64-bit floating-point number
- tbyte_ — Address of an 80-bit floating-point number
- stru_ — Structure address
- algn_ — Alignment directive
- unk_ — Address of an uninvestigated area
Using the context menu — When working with the disassembler window, it is convenient to use the context menu that pops up when you click the right mouse button within a window. Some menu items differ for different parts of the listing, such as function names, instructions, comments, and selected blocks. Some menu items relate to IDA Pro operation as a debugger (Run to cursor, Add breakpoint, and Add execution trace). These items will be described later in this chapter. In particular, pay attention to the Rename menu item. This item allows you to edit command contents (operands).
Navigating a listing — The most important issue is navigation of the listing. Jumps to locations pointed to by cross-references have already been covered. The same approach (double-clicking the cross-reference with the mouse) can be used for returning (for example, to the conditional jump, to the CALL command, or to the address in a command like MOV EAX, OFFSET address). Note that IDA Pro remembers all of your jumps so that you can always move forward or backward along the chain (as you would follow the links in an Internet browser) using the corresponding toolbar buttons.
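The naming scheme described in the Designating an address item can be approximated in a few lines of Python. This is a simplified sketch: the helper names are hypothetical, and the rules IDA Pro really applies to string names (length limits, handling of unusual characters, and so on) are not given in the text, so only the aYouAreWrong example is reproduced.

```python
import re


def address_name(prefix, address):
    """Build an automatic name such as sub_401021 from a prefix and an address."""
    return f"{prefix}_{address:X}"


def string_name(text):
    """Approximate the name given to an ASCII string: 'a' plus capitalized words."""
    words = re.findall(r"[A-Za-z0-9]+", text)
    return "a" + "".join(word[:1].upper() + word[1:] for word in words)


function_name = address_name("sub", 0x401021)
label_name = address_name("loc", 0x40101E)
text_name = string_name("You are wrong!")
```

With these rules, a function at 00401021h becomes sub_401021, and the string You are wrong! becomes aYouAreWrong, as in the example above.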

Figure 5.3: Indication of jumps in the disassembler window

Figure 5.4: Cross-references
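When listings are saved to a file, cross-reference comments can be picked out mechanically. The sketch below parses comments of the form described earlier (the CODE XREF or DATA XREF mark, a colon, the originating address, and a direction arrow). The names XrefComment and parse_xref are hypothetical, and the single letter that follows the arrow in real listings is simply preserved because the text does not explain it.

```python
import re
from typing import NamedTuple, Optional


class XrefComment(NamedTuple):
    kind: str        # "CODE" or "DATA"
    source: str      # where the reference originates, e.g. "start+18"
    direction: str   # "up" if the referencing line is above, "down" if below
    type_code: str   # trailing letter, kept as-is


_XREF_RE = re.compile(r";\s*(CODE|DATA) XREF:\s*(\S+?)([↑↓])(\w)")


def parse_xref(line: str) -> Optional[XrefComment]:
    """Extract the first cross-reference comment from a listing line, if any."""
    match = _XREF_RE.search(line)
    if match is None:
        return None
    kind, source, arrow, type_code = match.groups()
    direction = "up" if arrow == "↑" else "down"
    return XrefComment(kind, source, direction, type_code)


data_ref = parse_xref(".text:0040101E loc_40101E:    ; DATA XREF: start+18↑o")
code_ref = parse_xref(".text:00401021 sub_401021 proc near ; CODE XREF: start:loc_40101E↑p")
no_ref = parse_xref(".text:00401020                retn")
```

Only the first reference on a line is returned; a line that lists several references would need a loop over all matches.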

Other Windows

Hex View — This window contains the hex dump of the loaded module, as well as ASCII characters corresponding to this dump. This window is an auxiliary one in relation to the disassembler window and can be easily synchronized with it. To achieve this, it is enough to click the right mouse button somewhere within the window and choose the Synchronize with | IDA View... item from the context menu. After switching to the disassembler window, you'll find yourself in the program location that corresponds exactly to the address in the dump window. In addition, IDA Pro tracks the addresses, with which you are working, in the disassembler window. When you switch to the dump, you automatically jump to the required location.
Exports — This window contains the list of exported functions. It is helpful for working with DLLs. For normal executable modules, the list is made up of a single element, namely, the start function.
Imports — This window contains the list of imported functions and the modules, from which they are imported. When you double-click the imported function, you switch to the disassembler window and find yourself in the entry point. Thus, you can easily locate all cross-references to this function within the program.
Names — This window contains the list of all imported and library functions, as well as the names of variables and labels recognized by IDA Pro. On the left side of each name is a character that defines the name type:
- L — Library function
- F — Regular functions and API functions
- C — Instruction (label)
- A — ASCII string
- D — Data
- I — Imported function

Double-clicking the name with the mouse jumps you to the program location where that name is used. To create a new name (for example, for the label) and specify the address corresponding to that name, press <Insert>. The entered name will also appear in the disassembler window.

Functions — This window contains the entire list of functions recognized by IDA Pro, including library functions and imported user functions.
Strings — This window contains all strings found by the disassembler. If you double-click a string, you'll automatically jump to the location within the listing where that string was defined. By default, only C-style strings are presented in this window. If you right-click this window and choose the Setup command from the context menu, you can display other types of strings in this window, for example, Unicode strings or Pascal strings.
Structures — This window contains all structures found by the disassembler. To add a new structure to the list, press <Insert>.
Enums — This window is intended for displaying all enumerations located within the program being investigated.

In addition to the preceding windows, the disassembler can use other windows. In particular, note the libraries window. In the online help system, this window is called the signatures window. This window contains the list of signatures used for recognizing library functions. The signatures window is shown in Fig. 5.5. As you can see, the list specifies the name of the file containing the function signatures, the number of functions found using these signatures, and the name of the library to whose functions these signatures were applied. By pressing <Insert>, you can add the required signature file from the displayed list. The signatures of that file will be immediately used for recognizing new functions.

Figure 5.5: The signatures window

Menus and Toolbars

I am not going to provide a detailed explanation of all IDA Pro menu items and all toolbar buttons. In most cases, you won't encounter any difficulties in studying IDA Pro functional capabilities on your own. It is only necessary to pay special attention to some important functions:

The File menu items are as follows:
- Open — Load the executable module to be disassembled.
- Load — Load different files: Reload the input file reloads the disassembled module; Additional binary file loads an additional binary file into the database; IDS file loads an IDS file containing information about the functions of a specific import library (all IDS files located within the IDS directory are loaded automatically); PDB file loads the PDB file containing debug information; DBG file loads the file containing debug information; FLIRT signature file loads and applies the signatures file (the same operation is executed in the signatures window, as shown in Fig. 5.5); and Parse C header file reads the type definitions from the header file for further declarations of new structures and enumerations (see the description of the Enums and Structures windows).
- Produce file — Create new files of different structures on the basis of the disassembled code: a MAP file that can be used by debuggers, an Assembly file (having the ASM file name extension), an LST file (listing), a listing in the HTML format, etc.
- IDC file — Load and execute the script file (see Section 5.2.1).
- IDC command — Call the window for immediate script execution.
- Save... — Save the current disassembling database in the file with the IDB file name extension.
- Save as... — Save the current disassembling database under the specified name.
- Close — Close the disassembled file, saving the disassembling database.
The Edit menu items are as follows:
- Copy — Copy the selected fragment into the clipboard.
- CODE — Convert the block to the executable code.
- DATA — Convert the selected block to data.
- Struct var... — Convert the block to the selected structure.
- Strings — Convert to a string (string types can be chosen from the submenu).
- Array — Convert to the array with the predefined parameters.
- Undefine — Mark the selected block as data of an undefined structure.
- Name — Rename.
- Operand type — Specify the operand type.
- Comments — Control comments.
- Segments — Control segments.
- Structs — Control structures.
- Functions — Control functions.
- Other — Perform other functional capabilities, such as specifying the alignment directive, entering instructions or data, or highlighting with a color.
- Plugins — Use external plug-in modules.
The items of the Jump menu are intended for various jumps in the disassembled code, such as jumping to the specified address, jumping to the specified function (which can be chosen from the list), jumping to the program's entry point, marking a code line, and jumping to the specified label.
Items of the Search menu are intended for various search operations in the disassembled text, such as searching for text, searching for the next data block, searching for the next Assembly instruction, and searching for the next byte sequence.
Using items of the View menu, it is possible to customize the look of the IDA Pro disassembler: Open new windows (Open Subviews), create and delete toolbars (Toolbars), hide and unhide functions (Hide and Unhide commands, respectively), open the calculator window, etc.
Commands from the Debugger menu allow you to use various IDA Pro debugging capabilities: control breakpoints (Breakpoints), control watches (Watches), control tracing (Tracing), view the contents of various registers (General registers, Segment registers, and FPU registers), etc.
The Options menu items allow you to change various IDA Pro settings, some of which were covered earlier when I described the loading control window.
Using items of the Windows menu, you can control IDA Pro windows.
The Help menu items allow you to display help topics and obtain technical support.

Program Start-Up Keys

When starting IDA Pro, you can use the following start-up keys:

-a — Disable automatic analysis.
-A — Start IDA Pro in autonomous mode, in which no dialog boxes are displayed.
-b#### — Specify the address for loading a module.
-B — Start IDA Pro and automatically generate IDB and ASM files.
-c — Remove the old disassembling database.
-ddirective — Start IDA Pro and specify the loading directive for the first-pass analysis.
-Ddirective — Start IDA Pro and specify the loading directive for the second-pass analysis.
-f — Exclude FPU instructions.
-h — Open the IDA Pro help window.
-i#### — Specify the address of the program's entry point.
-M — Disable the mouse (for the console variant of loading).
-O#### — Pass options for the plug-in module, -Oplug1:opt1:opt2:opt3, where plug1 is the name of the plug-in module and opt1, opt2, and opt3 are module options.
-o#### — Specify the database name (used in combination with the -c key).
-p#### — Specify the processor type.
-P+ — Pack the database.
-P- — Do not pack the database.
-R — Load resources from the executable file.
-S#### — Execute the specified IDC file.
-W#### — Specify the Windows directory.
-x — Do not create segments.
-? — Display help about IDA Pro start-up keys.
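The start-up keys are convenient when IDA Pro is launched from a script. The following Python sketch only assembles the command line from a few of the keys listed above; it does not start the program. The helper name build_ida_command and the file names sample.exe and dump.idc are hypothetical, and I assume that keys precede the name of the file to be loaded.

```python
def build_ida_command(target, exe="idag.exe", batch=False,
                      new_database=False, script=None):
    """Assemble an IDA Pro command line as a list of arguments."""
    args = [exe]
    if batch:
        args.append("-B")            # generate IDB and ASM files automatically
    if new_database:
        args.append("-c")            # remove the old disassembling database
    if script is not None:
        args.append(f"-S{script}")   # execute the specified IDC file
    args.append(target)              # assumption: keys come before the file name
    return args


command = build_ida_command("sample.exe", batch=True,
                            new_database=True, script="dump.idc")
# The list could then be passed to subprocess.run(command) on a machine
# where IDA Pro is installed.
```

Building the command as a list rather than a single string avoids quoting problems when a path contains spaces.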

5.1.2. Simple Examples of Code Investigation

In this section, I'll return to the examples written in Assembly language considered in Section 1.6. The reason I decided to do so is straightforward. Using Assembly, it is easy to model the required situation to demonstrate specific patterns of code investigation in IDA Pro.

About IDA Pro Capabilities

Easy Examples

In the preceding chapters, lots of examples were considered that illustrate the capabilities of IDA Pro in analyzing executable code. Consider the program shown in Listing 5.1.

Listing 5.1: Easy Assembly program (see Listing 1.43)

 .586P
 .MODEL FLAT, STDCALL
 includelib e:\masm32\lib\user32.lib
 EXTERN        MessageBoxA@16:NEAR
 ; Data segment
 _DATA SEGMENT
 TEXT1 DB 'No problem!', 0
 TEXT2 DB 'Message', 0
 _DATA ENDS
 ; Code segment
 _TEXT SEGMENT
 START:
        PUSH OFFSET 0
        PUSH OFFSET TEXT2
        PUSH OFFSET TEXT1
        PUSH 0
        CALL MessageBoxA@16
        MOV ESI, 3
        ADD ESI, OFFSET L2
 L2:
        CALL ESI
        RETN
 L1:
        XOR EAX, EAX
        RETN
 _TEXT ENDS
 END START

You should immediately notice the small trick hidden behind the code fragment, shown in Listing 5.2.

Listing 5.2: Fragment of the program in Listing 5.1 that hides a small trick

        MOV ESI, 3
        ADD ESI, OFFSET L2
 L2:
        CALL ESI
        RETN
 L1:

The CALL ESI command jumps to the L1 label. How would IDA Pro react to this situation? Consider the disassembled code produced by IDA Pro (Listing 5.3).

Listing 5.3: Disassembled listing of the program in Listing 5.1, produced by IDA Pro

 .text:00401000 _text          segment para public 'CODE' use32
 .text:00401000                assume cs:_text
 .text:00401000                ; org 401000h
 .text:00401000                assume es:nothing, ss:nothing, ds:_data, fs:nothing, gs:nothing
 .text:00401000 ; ------- S U B R O U T I N E ------------------------------
 .text:00401000                public start
 .text:00401000 start          proc near
 .text:00401000                push    0               ; uType
 .text:00401002                push    offset Caption  ; lpCaption
 .text:00401007                push    offset Text     ; lpText
 .text:0040100C                push    0               ; hWnd
 .text:0040100E                call    MessageBoxA
 .text:00401013                mov     esi, 3
 .text:00401018                add     esi, offset loc_40101E
 .text:0040101E loc_40101E:                            ; DATA XREF: start+18↑o
 .text:0040101E                call    esi             ; sub_401021
 .text:00401020                retn
 .text:00401020 start          endp
 .text:00401021 ; ------- S U B R O U T I N E ------------------------------
 .text:00401021
 .text:00401021 sub_401021     proc near               ; CODE XREF: start:loc_40101E↑p
 .text:00401021                xor     eax, eax
 .text:00401023                retn
 .text:00401023 sub_401021     endp
 .text:00401023
 .text:00401024 ; [00000006 BYTES: COLLAPSED FUNCTION MessageBoxA.
 .text:00401024 ; PRESS KEYPAD "+" TO EXPAND]
 .text:0040102A                align 200h
 .text:0040102A _text          ends

From Listing 5.3, it is evident that IDA Pro traces the value of the ESI register and, thus, determines the start of the sub_401021 procedure. The arithmetic here is easy. It is only necessary to add three to the address of the loc_40101E label to obtain the exact address of the called procedure. Having located the start of the procedure, it is easy to determine its end. In this case, the end of the procedure is defined by the RETN command nearest the start.
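The arithmetic can be checked with a short Python sketch. The function name resolve_register_call is hypothetical. The constant three is the combined length of the two commands that separate the labels: CALL ESI occupies two bytes (FF D6), and RETN occupies one (C3), which agrees with the addresses 0040101Eh, 00401020h, and 00401021h in Listing 5.3.

```python
CALL_ESI = bytes([0xFF, 0xD6])   # machine code of CALL ESI
RETN = bytes([0xC3])             # machine code of RETN


def resolve_register_call(base_address, constant):
    """Compute the call target for MOV ESI, constant / ADD ESI, OFFSET label."""
    return (base_address + constant) & 0xFFFFFFFF   # 32-bit wraparound


loc_address = 0x40101E           # address of the loc_40101E label
target = resolve_register_call(loc_address, 3)

# The target lies directly behind CALL ESI and RETN.
assert target == loc_address + len(CALL_ESI) + len(RETN)

print(f"sub_{target:X}")         # prints sub_401021
```

The same calculation is all the disassembler needs once it knows the constant loaded into ESI and the address added to it.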

Now, modify the program from Listing 5.1. The modified code is shown in Listing 5.4. As it turns out, even the slightest modification complicates disassembly.

Listing 5.4: Modified code of the program shown in Listing 5.1

 .586P
 .MODEL FLAT,STDCALL
 includelib e:\masm32\lib\user32.lib
 EXTERN        MessageBoxA@16:NEAR
 ; Data segment
 _DATA SEGMENT
 TEXT1 DB 'No problem!', 0
 TEXT2 DB 'Message', 0
 _DATA ENDS
 ; Code segment
 _TEXT SEGMENT
 START:
        PUSH OFFSET 0
        PUSH OFFSET TEXT2
        PUSH OFFSET TEXT1
        PUSH 0
        CALL MessageBoxA@16
        MOV  ESI, 3
        ADD  ESI, OFFSET L2
        PUSH ESI
        POP  EDI
 L2:
        CALL EDI
        RETN
 L1:
        XOR  EAX, EAX
        RETN
 _TEXT ENDS
 END START

Consider the disassembled code that IDA Pro produces after analyzing the executable created by the Assembly translator from the program shown in Listing 5.4. The result is provided in Listing 5.5.

Listing 5.5: Disassembled text of the program shown in Listing 5.4

 .text:00401000 _text          segment para public 'CODE' use32
 .text:00401000                assume cs:_text
 .text:00401000                ; org 401000h
 .text:00401000                assume es:nothing, ss:nothing, ds:_data, fs:nothing, gs:nothing
 .text:00401000 ; ---------- S U B R O U T I N E ------------------------------
 .text:00401000                public start
 .text:00401000 start          proc near
 .text:00401000                push    0               ; uType
 .text:00401002                push    offset Caption  ; lpCaption
 .text:00401007                push    offset Text     ; lpText
 .text:0040100C                push    0               ; hWnd
 .text:0040100E                call    MessageBoxA
 .text:00401013                mov     esi, 3
 .text:00401018                add     esi, offset loc_401020
 .text:0040101E                push    esi
 .text:0040101F                pop     edi
 .text:00401020 loc_401020:                            ; DATA XREF: start+18↑o
 .text:00401020                call    edi
 .text:00401022                retn
 .text:00401022 start          endp
 .text:00401023 ; --------- S U B R O U T I N E ------------------------------
 .text:00401023                xor     eax, eax
 .text:00401025                retn
 .text:00401026 ; [00000006 BYTES: COLLAPSED FUNCTION MessageBoxA.
 .text:00401026 ; PRESS KEYPAD "+" TO EXPAND]
 .text:0040102C                align 200h
 .text:0040102C _text          ends
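The addresses in Listing 5.5 follow directly from the instruction sizes. The sketch below lays the instructions out from the base address 00401000h; the sizes are read off the address gaps in the listing itself, and the helper name layout is an assumption made for illustration.

```python
BASE = 0x401000  # start of the .text segment in Listing 5.5

# (instruction text, size in bytes); sizes taken from the address gaps
# between consecutive lines of the listing
code = [
    ("push 0", 2),
    ("push offset Caption", 5),
    ("push offset Text", 5),
    ("push 0", 2),
    ("call MessageBoxA", 5),
    ("mov esi, 3", 5),
    ("add esi, offset loc_401020", 6),
    ("push esi", 1),
    ("pop edi", 1),
    ("call edi", 2),
    ("retn", 1),
    ("xor eax, eax", 2),
    ("retn", 1),
]


def layout(base, instructions):
    """Assign an address to every instruction; return rows and end address."""
    rows, addr = [], base
    for text, size in instructions:
        rows.append((addr, text))
        addr += size
    return rows, addr


rows, end = layout(BASE, code)
for addr, text in rows:
    print(f".text:{addr:08X}  {text}")

call_edi = next(addr for addr, text in rows if text == "call edi")
print(hex(call_edi + 3))  # address reached by CALL EDI
```

The two added one-byte instructions (push esi and pop edi) move the L2 label from 0040101Eh to 00401020h, so the called code moves from 00401021h to 00401023h.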

As Listing 5.5 shows, after this slight modification of the source code the disassembler no longer recognizes the procedure located at the 00401023 address. To understand exactly how the analysis was carried out, it would be necessary to study the algorithm used by IDA Pro. However, some conclusions can be drawn even without viewing the algorithm. As I already mentioned, IDA Pro is a careful program that avoids drawing premature conclusions. In this case, there is a certain probability that control reaches the loc_401020 label through a jump (possibly an indirect one) from some other location in the program, in which case EDI could hold a different value and the address of the called procedure would be different. This is hard to assess; to be on the safe side, the disassembler takes such a possibility into account and relies on interactive work with the user. Nevertheless, consider the code fragment shown in Listing 5.6.

Listing 5.6: Code example, for which IDA Pro correctly identifies the procedure address

        PUSH ESI
        POP  ESI
 L2:
        CALL ESI

In this example, IDA Pro doesn't encounter any difficulties and correctly identifies the procedure address.

Interactive Work with IDA Pro

Consider examples of interactive work of the code investigator with IDA Pro.

The example shown in Listing 5.7 contains a simple Assembly program. As can be easily seen, the CALL EDI command calls the address corresponding to the L1 label.

Listing 5.7: Simple Assembly program illustrating interactive work of the code investigator and IDA Pro

 .586P
 .MODEL FLAT, STDCALL
 includelib e:\masm32\lib\user32.lib
 EXTERN MessageBoxA@16:NEAR
 ; Data segment
 _DATA SEGMENT
 TEXT1 DB 'No problem!', 0
 TEXT2 DB 'Message', 0
 _DATA ENDS
 ; Code segment
 _TEXT SEGMENT
 START:
        MOV  ESI, 3
        PUSH ESI
        PUSH OFFSET 0
        PUSH OFFSET TEXT2
        PUSH OFFSET TEXT1
        PUSH 0
        CALL MessageBoxA@16
        POP  EDI
        ADD  EDI, OFFSET L2
 L2:
        CALL EDI
        RETN
 L1:
        XOR  EAX, EAX
        RETN
 _TEXT ENDS
 END START

Translate the program, and then load it into the IDA Pro disassembler. The disassembled code is presented in Listing 5.8.

Listing 5.8: Disassembled code of the program shown in Listing 5.7

 .text:00401000 _text          segment para public 'CODE' use32
 .text:00401000                assume cs:_text
 .text:00401000                ; org 401000h
 .text:00401000                assume es:nothing, ss:nothing, ds:_data, fs:nothing, gs:nothing
 .text:00401000 ; ------ S U B R O U T I N E --------------------------------
 .text:00401000
 .text:00401000                public start
 .text:00401000 start          proc near
 .text:00401000                mov     esi, 3
 .text:00401005                push    esi
 .text:00401006                push    0               ; uType
 .text:00401008                push    offset Caption  ; lpCaption
 .text:0040100D                push    offset Text     ; lpText
 .text:00401012                push    0               ; hWnd
 .text:00401014                call    MessageBoxA
 .text:00401019                pop     edi
 .text:0040101A                add     edi, offset loc_401020
 .text:00401020
 .text:00401020 loc_401020:                            ; DATA XREF: start + 1A↑o
 .text:00401020                call    edi
 .text:00401022                retn
 .text:00401022 start          endp
 .text:00401023                xor     eax, eax
 .text:00401025                retn
 .text:00401026 ; [00000006 BYTES: COLLAPSED FUNCTION MessageBoxA.
 .text:00401026 ; PRESS KEYPAD "+" TO EXPAND]
 .text:0040102C                align 200h
 .text:0040102C _text          ends

As could be expected, the disassembler doesn't recognize the address that the CALL EDI command calls.
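To see where CALL EDI actually transfers control in Listing 5.8, the stack traffic can be followed by hand. The sketch below models it in Python under two stated assumptions: MessageBoxA uses the stdcall convention (the @16 suffix in Listing 5.7 means that it removes its own 16 bytes of arguments), and the return address pushed by CALL is balanced by the callee's return and therefore omitted. The function name run_listing_5_7 is hypothetical.

```python
def run_listing_5_7(l2_address):
    """Model the stack and register traffic of Listing 5.7 and return
    the value of EDI at the CALL EDI instruction."""
    regs = {"esi": 0, "edi": 0}
    stack = []

    regs["esi"] = 3                  # MOV  ESI, 3
    stack.append(regs["esi"])        # PUSH ESI

    # Four arguments for MessageBoxA (symbolic values are enough here)
    for arg in (0, "TEXT2", "TEXT1", 0):
        stack.append(arg)            # PUSH ...

    # CALL MessageBoxA@16: a stdcall callee removes its 4 arguments
    del stack[-4:]

    regs["edi"] = stack.pop()        # POP  EDI  (the 3 saved earlier)
    regs["edi"] += l2_address        # ADD  EDI, OFFSET L2
    return regs["edi"]


target_5_7 = run_listing_5_7(0x401020)
print(hex(target_5_7))  # 0x401023
```

The constant 3 reaches EDI only after travelling through the stack and across an API call, which is a much longer path than the single register used in Listing 5.3.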

To begin code investigation, create a function at the 00401023 address. It is clear that you are dealing with some function even without determining the address that CALL EDI calls: the sequence XOR EAX, EAX / RETN is clear evidence of the body of some function. Set the cursor to the first command of the assumed function and press <P> or use the Edit | Functions | Create function menu commands. IDA Pro will create the function automatically (Listing 5.9).

Listing 5.9: Function automatically created by IDA Pro

 .text:00401023 sub_401023     proc near
 .text:00401023                xor     eax, eax
 .text:00401025                retn
 .text:00401025 sub_401023     endp
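The observation that XOR EAX, EAX followed by RETN marks the body of a function can be turned into a simple byte search. The sketch below is only an illustration of that idea: the helper name find_stub_candidates is hypothetical, the sample bytes are an assumed encoding of the last four instructions of Listing 5.8, and a raw byte search can also match inside operands or data, so it yields candidates rather than certainties.

```python
# Both standard encodings of XOR EAX, EAX, and the one-byte near RETN
XOR_EAX_EAX = (b"\x33\xC0", b"\x31\xC0")
RETN = b"\xC3"


def find_stub_candidates(image, base):
    """Return addresses where 'xor eax, eax / retn' appears in image."""
    hits = []
    for encoding in XOR_EAX_EAX:
        pattern = encoding + RETN
        pos = image.find(pattern)
        while pos != -1:
            hits.append(base + pos)
            pos = image.find(pattern, pos + 1)
    return sorted(hits)


# Assumed bytes starting at 00401020h:
# call edi (FF D7), retn (C3), xor eax, eax (33 C0), retn (C3)
tail = bytes.fromhex("FFD7" "C3" "33C0" "C3")

candidates = find_stub_candidates(tail, 0x401020)
print([hex(a) for a in candidates])  # ['0x401023']
```

Each address found this way is a place where pressing <P> is worth trying.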

Now you can use references to the function in the disassembled text. The disassembler will automatically pick up your edits and continue the analysis, taking into account the corrections you have introduced. Move to the line of code located at the 00401020 address (CALL EDI). Press <;> to enter a comment. Alternatively, you can use the Edit | Comments | Enter comments menu commands. As a result, the window that allows you to enter the comment will appear (Fig. 5.6). You can enter any comment here.

Figure 5.6: The IDA Pro window that allows the user to enter comments

Comments in IDA Pro have one specific feature: some comments contain information not only for the code investigator but also for the disassembler. Enter the following line into the edit window: DATA XREF: sub_401023. By doing so, you specify that the procedure being called is located at the address corresponding to the sub_401023 label. The result obtained is interesting. You obtain not only a comment that can be clicked to jump to the appropriate reference; the line with the 0040101A address is also automatically supplied with the comment. Consider the fragment shown in Listing 5.10.

Listing 5.10: Code fragment illustrating automatic generation of cross-references

 .text:00401019                pop     edi
 .text:0040101A                add     edi, offset loc_401020 ; DATA XREF: sub_401023
 .text:00401020 loc_401020:                            ; DATA XREF: start + 1A↑o
 .text:00401020                call    edi             ; DATA XREF: sub_401023
 .text:00401022                retn
 .text:00401022 start          endp
 .text:00401023 ; ------ S U B R O U T I N E -----------------------------
 .text:00401023
 .text:00401023 sub_401023     proc near
 .text:00401023                xor     eax, eax
 .text:00401025                retn
 .text:00401025 sub_401023     endp

Debugging in IDA Pro

Although debugging is not the primary function of IDA Pro, this function is quite usable and deserves attention.

After loading an executable module into the IDA Pro disassembler, it is possible to start the debugger. However, it is necessary to define the first breakpoint beforehand. The simplest way of doing this is to execute the program up to the current cursor position, using the Debugger | Run to cursor menu command or pressing <F4>. The best approach is setting the first breakpoint ^[1] at the first instruction of the main or WinMain function, after which you can use step-by-step tracing to step into procedures (<F7>) or over them (<F8>). It is also possible to use the Debugger | Start process command (or press <F9>), having previously set one or more breakpoints. You can set breakpoints directly in the disassembled code using <F2>, in which case the line where the instruction is located will be highlighted (red by default). Finally, it is possible to use the Debugger setup window by choosing the Debugger | Debugger options... menu commands (Fig. 5.7). In the Events group of flags, you can define the events to which the debugger should react; it is advisable to set the Stop on debugging start checkbox (the debugger stops at the moment of its start-up) or the Stop on process entry points checkbox (the debugger stops at the first executable instruction of the program ^[2]).

Figure 5.7: The Debugger setup window

Note

The Stop on thread start/exit checkbox seems somewhat strange to me. When a process is created, at least one thread, usually called the main thread, is created with it. However, the developers have ignored this, so the checkbox refers only to threads explicitly created in the program.

Thus, it is clear where to set the first breakpoint. What possibilities are at your disposal when using IDA Pro as a debugger? The key issues are listed here:

After application start-up, the IDA Pro interface changes. Debug windows will appear that make it possible to control the debugging process. The View EIP window contains the code being debugged, the View ESP window contains the stack contents and the current ESP value, the General registers window contains the current contents of the general-purpose registers and the flags register, and the Threads window contains information about the application threads. The debugger always displays in the list the thread where the debug events take place. In addition, it is possible to open the FPU registers window containing the contents of the coprocessor registers (this window will be opened automatically if floating-point instructions are executed). The Modules window displays the list of loaded modules.
You can run the debugged program step by step using the Debugger | Step over command (<F8>) and the Debugger | Step into command (<F7>), execute the program to the first return command encountered (<Ctrl>+<F7>), and suspend application execution (Debugger | Pause process).
You can watch the specified memory cells in the Watch list window (Debugger | Watches | Watch list). To specify memory cells for watching, use the Debugger | Watches | Add watch menu commands.
You can use tracing; in other words, you can log the state of the program being debugged at each debugging step. To control tracing, use the Debugger | Tracing submenu. All tracing events are displayed in the Trace window. It is possible to display the following tracing events: instruction execution, function execution, and memory read or write operations.

^[1]Recall that earlier in this book, such breakpoints were called nonpermanent breakpoints (see Section 4.1.3).

^[2]I hope you understand that this instruction probably doesn't match the first instruction of the main or WinMain functions.