More Instructions You Need to Know | Debugging Applications for Microsoft® .NET and Microsoft Windows® (Pro-Developer)


The instructions covered in this section apply to data and pointer manipulation, comparing and testing, jumping and branching, looping, and string manipulation.

Data Manipulation

AND Logical-AND

OR Logical-OR (inclusive)

The AND and OR instructions perform the bitwise operations that should be familiar to everyone because they are the basis of bit manipulation.

NOT One's complement negation

NEG Two's complement negation

The NOT and NEG instructions sometimes cause confusion because they look similar, but they don't perform the same operation. The NOT instruction is a bitwise operation that turns each binary 1 into a 0 and each 0 into a 1. The NEG instruction is the equivalent of subtracting the operand from 0. The following code snippet shows the differences between these two instructions:

    void NOTExample ( void )
    {
        __asm
        {
            MOV EAX , 0FFh
            MOV EBX , 1
            NOT EAX         // EAX now holds 0FFFFFF00h.
            NOT EBX         // EBX now holds 0FFFFFFFEh.
        }
    }

    void NEGExample ( void )
    {
        __asm
        {
            MOV EAX , 0FFh
            MOV EBX , 1
            NEG EAX         // EAX now holds 0FFFFFF01h ( 0 - 0FFh ).
            NEG EBX         // EBX now holds 0FFFFFFFFh ( 0 - 1 ).
        }
    }
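For readers more at home in C than in assembly language, the same two operations can be sketched with ordinary operators. This is only an illustration: the function names `not32` and `neg32` are made up for this example, and the sketch assumes 32-bit unsigned arithmetic stands in for the register contents.

```c
#include <stdint.h>

/* NOT: flip every bit of the operand (one's complement). */
uint32_t not32(uint32_t value)
{
    return ~value;
}

/* NEG: subtract the operand from 0 (two's complement). Unsigned
   arithmetic wraps modulo 2^32, which matches what the register holds. */
uint32_t neg32(uint32_t value)
{
    return 0u - value;
}
```

With these definitions, `not32(0xFF)` yields 0xFFFFFF00 and `neg32(0xFF)` yields 0xFFFFFF01, matching the comments in the assembly snippet. The two results always differ by exactly 1.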

XOR Logical-OR (exclusive)

You'll see the XOR instruction used quite a bit, not because people are keenly interested in exclusive OR operations but because it's a compact, fast way to zero out a value. XOR sets each result bit to 1 if the corresponding bits in the two operands differ and to 0 if they are the same, so using XOR on a register with itself always produces 0. Because "XOR EAX, EAX" has a shorter encoding than "MOV EAX, 0" (2 bytes versus 5) and is at least as fast, the Microsoft compilers use it to zero out registers.
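A small C sketch of the XOR rule, with hypothetical function names chosen only for this illustration, makes the zeroing trick easy to verify by hand:

```c
#include <stdint.h>

/* XOR sets a result bit only where the two operand bits differ. */
uint32_t xor32(uint32_t a, uint32_t b)
{
    return a ^ b;
}

/* XOR of a value with itself: every bit matches its partner,
   so every result bit is 0, whatever the starting value was. */
uint32_t zero_via_xor(uint32_t reg)
{
    return reg ^ reg;
}
```

For example, `xor32(0xC, 0xA)` is binary 1100 XOR 1010, which gives 0110, or 6.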

INC Increment by 1

DEC Decrement by 1

These instructions are straightforward, and you can figure out what they do just from their names. The compiler often uses these instructions when optimizing certain code sequences because each of them executes in a single clock cycle. Additionally, these instructions map directly to the C integer ++ and the -- arithmetic operators.

SHL Shift left, multiply by 2

SHR Shift right, divide by 2

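Bit shifts are faster than the corresponding multiplication and division instructions on x86 CPUs. Each position shifted left multiplies the value by 2, and each position shifted right divides an unsigned value by 2, discarding the remainder. These instructions are akin to the C << and >> bitwise operators, respectively.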
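The relationship between shifting and multiplying or dividing by powers of 2 can be sketched in C as follows. The function names are hypothetical, and the sketch assumes unsigned values and a shift count below 32.

```c
#include <stdint.h>

/* SHL: each position shifted left multiplies by 2 (bits shifted out
   of the top are lost). Assumes count < 32. */
uint32_t shl_by(uint32_t value, unsigned count)
{
    return value << count;
}

/* SHR: each position shifted right divides an unsigned value by 2,
   discarding the remainder. Assumes count < 32. */
uint32_t shr_by(uint32_t value, unsigned count)
{
    return value >> count;
}
```

So `shl_by(5, 3)` is 5 * 8 = 40, and `shr_by(7, 1)` is 7 / 2 = 3 with the remainder dropped.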

DIV Unsigned division

MUL Unsigned multiplication

These seemingly straightforward instructions are in fact a little odd. Both instructions perform their unsigned operations on the EAX register, but the output implicitly uses the EDX register as well. For a double-word MUL, the low 32 bits of the 64-bit product are placed in EAX and the high 32 bits in EDX. A double-word DIV divides the 64-bit value in EDX:EAX by the operand and stores the quotient in EAX and the remainder in EDX. The explicit operand for both instructions must be a register or memory value, not an immediate value.
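To make the implicit use of EDX concrete, here is a minimal C model of the double-word forms. The `RegPair` type and the function names are hypothetical stand-ins for the registers, not anything the compiler or CPU defines.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two registers MUL and DIV write. */
typedef struct {
    uint32_t eax;
    uint32_t edx;
} RegPair;

/* MUL src: the 64-bit unsigned product of EAX and src lands in EDX:EAX. */
RegPair mul32(uint32_t eax, uint32_t src)
{
    uint64_t product = (uint64_t)eax * src;
    RegPair result = { (uint32_t)product, (uint32_t)(product >> 32) };
    return result;
}

/* DIV src: the 64-bit value EDX:EAX is divided by src. The quotient goes
   to EAX and the remainder to EDX. This sketch assumes src is not 0 and
   that the quotient fits in 32 bits; a real CPU raises a divide error
   otherwise. */
RegPair div32(uint32_t edx, uint32_t eax, uint32_t src)
{
    uint64_t dividend = ((uint64_t)edx << 32) | eax;
    RegPair result = { (uint32_t)(dividend / src), (uint32_t)(dividend % src) };
    return result;
}
```

Multiplying 0xFFFFFFFF by 2 leaves 0xFFFFFFFE in EAX and 1 in EDX, which is why code that divides usually clears or sign-extends EDX first: whatever is in EDX becomes the high half of the dividend.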

IDIV Signed division

IMUL Signed multiplication

These instructions are similar to the DIV and MUL instructions except that they treat operands as signed values. The same result gyrations happen with the IDIV and IMUL instructions as with the DIV and MUL instructions. An IMUL instruction sometimes has three operands. The first operand is the destination, and the last two are source operands. IMUL is one of the few instructions in the x86 instruction set that can take three operands.

LOCK Assert LOCK# signal prefix

LOCK isn't an actual instruction but rather a prefix to other instructions. The LOCK prefix tells the CPU that the memory access made by the following instruction needs to be atomic, so the CPU executing the instruction locks the memory bus and prevents any other CPUs on the system from accessing that memory.

MOVSX Move with sign-extend

MOVZX Move with zero-extend

These two instructions copy smaller size values to larger size values and dictate how the larger values fill the upper bits. MOVSX indicates that the sign value on the source operand will extend through the upper bits of the destination register. MOVZX fills the upper bits of the destination register with 0. These are two instructions to watch for when you're tracking down sign errors.
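The difference between the two extensions is easiest to see on a byte whose top bit is set. The following C sketch models the byte-to-double-word forms; the function names are hypothetical, and the bit manipulation is spelled out rather than relying on C's own conversion rules.

```c
#include <stdint.h>

/* MOVSX: copy the source's sign bit (bit 7) into all 24 upper bits. */
uint32_t movsx8(uint8_t src)
{
    return (src & 0x80u) ? (0xFFFFFF00u | src) : (uint32_t)src;
}

/* MOVZX: fill the 24 upper bits with 0. */
uint32_t movzx8(uint8_t src)
{
    return (uint32_t)src;
}
```

The byte 0xFF becomes 0xFFFFFFFF (that is, -1) under MOVSX but 0x000000FF (255) under MOVZX. A value that was meant to be signed but was extended with MOVZX, or the reverse, is a typical source of the sign errors mentioned above.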

Pointer Manipulation

LEA Load effective address

LEA loads the destination register with the address of the source operand. The following code snippet shows two examples of the LEA instruction. The first example shows how to assign an address to an integer pointer. The second shows how to retrieve the address of a local character array with the LEA instruction and pass the address as a parameter to the GetWindowsDirectory API function.

    void LEAExamples ( void )
    {
        int * pInt ;
        int iVal ;

        // The following instruction sequence is identical to the C code
        // pInt = &iVal ;.
        __asm
        {
            LEA EAX , iVal
            MOV [pInt] , EAX
        }

        ////////////////////////////////////////////////////////////////////
        char szBuff [ MAX_PATH ] ;

        // Another example of accessing a pointer through LEA. This
        // instruction sequence is identical to the C code
        // GetWindowsDirectory ( szBuff , MAX_PATH ) ;.
        __asm
        {
            PUSH 104h           // Push MAX_PATH as the second parameter.
            LEA ECX , szBuff    // Get the address of szBuff.
            PUSH ECX            // Push the address of szBuff as the first
                                // parameter.
            CALL DWORD PTR [GetWindowsDirectory]
        }
    }

Comparing and Testing

CMP Compare two operands

The CMP instruction compares the first and second operands by subtracting the second operand from the first operand, discarding the results, and setting the appropriate flags in the EFLAGS register. You can think of the CMP instruction as the conditional part of the C if statement. Table 6-4 shows the different flags and the values they correspond to when the CMP instruction executes.

Table 6-4 Result Values and Their Flag Settings

Result (First Operand Compared to Second Operand)	Register Window Flag Settings	Intel Manual Flag Settings
Equal	ZR = 1	ZF = 1
Less than	PL != OV	SF != OF
Greater than	ZR = 0 and PL = OV	ZF = 0 and SF = OF
Not equal	ZR = 0	ZF = 0
Greater than or equal	PL = OV	SF = OF
Less than or equal	ZR = 1 or PL != OV	ZF = 1 or SF != OF
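The flag combinations in Table 6-4 can be checked by modeling CMP as a 32-bit subtraction whose result is thrown away. The sketch below is an illustration only: the `CmpFlags` type and all function names are hypothetical, and only the three flags used in the table are modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* The three flags Table 6-4 relies on (Intel manual names). */
typedef struct {
    bool zf;   /* Zero flag (ZR in the Registers window) */
    bool sf;   /* Sign flag (PL) */
    bool of;   /* Overflow flag (OV) */
} CmpFlags;

/* CMP first, second: compute first - second, keep only the flags. */
CmpFlags cmp32(uint32_t first, uint32_t second)
{
    uint32_t diff = first - second;
    CmpFlags flags;
    flags.zf = (diff == 0);
    flags.sf = (diff >> 31) != 0;
    /* Signed overflow: the operands have different signs and the
       result's sign differs from the first operand's sign. */
    flags.of = (((first ^ second) & (first ^ diff)) >> 31) != 0;
    return flags;
}

bool is_equal(CmpFlags f)         { return f.zf; }
bool is_less(CmpFlags f)          { return f.sf != f.of; }
bool is_greater(CmpFlags f)       { return !f.zf && f.sf == f.of; }
bool is_less_or_equal(CmpFlags f) { return f.zf || f.sf != f.of; }
```

Comparing 80000000h (the most negative signed value) with 1 shows why the table tests SF against OF instead of SF alone: the subtraction overflows, so SF is 0 even though the first operand is less than the second.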

TEST Logical compare

The TEST instruction does a bitwise logical AND of the operands and sets the PL, ZR, and PE (SF, ZF, and PF for the Intel manuals) flags accordingly. The TEST instruction checks whether a bit value was set.
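In C terms, a TEST followed by a check of the zero flag is the familiar mask-and-compare idiom. The sketch below uses hypothetical function names and models only the zero flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* TEST value, mask: AND the operands, discard the result, and set the
   flags. The zero flag ends up set when no masked bit is on. */
bool test_zero_flag(uint32_t value, uint32_t mask)
{
    return (value & mask) == 0;
}

/* Checking a single bit: the bit is set when TEST leaves ZF clear.
   Assumes bit < 32. */
bool bit_is_set(uint32_t value, unsigned bit)
{
    return !test_zero_flag(value, 1u << bit);
}
```

Unlike AND, TEST leaves both operands untouched, which is why it suits checks of this kind.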

Jump and Branch Instructions

JMP Absolute jump

The JMP instruction unconditionally moves execution to the target address, which the Disassembly window shows as an absolute address.

JE Jump if equal

JL Jump if less than

JG Jump if greater than

JNE Jump if not equal

JGE Jump if greater than or equal

JLE Jump if less than or equal

The CMP and TEST instructions aren't much good if you don't have a way to act on their results. The conditional jumps allow you to branch accordingly. The instructions above are the most common ones you'll see in the Disassembly window, though there are 31 different conditional jumps, many of which perform the same action except that the mnemonic is expressed with "not." For example, JLE (jump if less than or equal) has the same opcode as JNG (jump if not greater than). If you're using a disassembler other than the Visual C++ debugger, you might see some of the other instructions. You should get the Intel manuals and look up the "Jcc" codes so that you can decode all the jump instructions.

I listed the conditional jump instructions in the same order as they're shown in Table 6-4 so that you can match them up. One of the conditional jumps closely follows any CMP or TEST instructions. Optimized code might have a few instructions interspersed between the check and the jump, but those instructions are guaranteed not to change the flags.

When you're looking at a disassembly, you'll notice that the conditional check is generally the opposite of what you typed in. The first section in the following code shows an example.

    void JumpExamples ( int i )
    {
        // Here is the C code statement. Notice that the conditional is
        // "i > 0," but the compiler generates the opposite. The assembly
        // language that I show is similar to what the compiler generates.
        // Different optimization methods generate different code.
        // if ( i > 0 )
        // {
        //     printf ( "i > 0\n" ) ;
        // }
        char szGreaterThan[] = "i > 0\n" ;
        __asm
        {
            CMP i , 0               // Compare i to 0 by subtracting (i - 0).
            JLE JE_LessThanOne      // If i is less than or equal to 0, jump to
                                    // the label.
            PUSH i                  // Push the parameter on the stack.
            LEA EAX , szGreaterThan // Push the format string.
            PUSH EAX
            CALL DWORD PTR [printf] // Call printf. Notice that you can
                                    // tell printf probably comes from a DLL
                                    // because I'm calling through a pointer.
            ADD ESP , 8             // printf is __cdecl, so I need to clean up
                                    // the stack in the caller.
        JE_LessThanOne:             // With the inline assembler, you can jump
                                    // to any C label.
        }

        ////////////////////////////////////////////////////////////////////
        // Take the absolute value of the parameter and check again.
        // The C code:
        // int y = abs ( i ) ;
        // if ( y >= 5 )
        // {
        //     printf ( "abs(i) >= 5\n" ) ;
        // }
        // else
        // {
        //     printf ( "abs(i) < 5\n" ) ;
        // }
        char szAbsGTEFive[] = "abs(i) >= 5\n" ;
        char szAbsLTFive[] = "abs(i) < 5\n" ;
        __asm
        {
            MOV EBX , i             // Move i's value into EBX.
            CMP EBX , 0             // Compare EBX to 0 (EBX - 0).
            JG JE_PosNum            // If the result is greater than 0, EBX
                                    // is positive.
            NEG EBX                 // Turn negative into positive.
        JE_PosNum:
            CMP EBX , 5             // Compare EBX to 5 (EBX - 5).
            JL JE_LessThan5         // Jump if less than 5.
            LEA EAX , szAbsGTEFive  // Get the pointer to the correct format
                                    // string into EAX.
            JMP JE_DoPrintf         // Go to the printf call.
        JE_LessThan5:
            LEA EAX , szAbsLTFive   // Get the pointer to the correct format
                                    // string into EAX.
        JE_DoPrintf:
            PUSH EAX                // Push the string.
            CALL DWORD PTR [printf] // Print it.
            ADD ESP , 4             // Restore the stack.
        }
    }

As you can see, the result in the first example is correct. The idea to remember is that it's more efficient to check the opposite condition and jump around than to jump someplace to execute what's inside the if statement and then jump back.
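The inverted check is easier to recognize once you have seen it written out in C with a goto standing in for the conditional jump. This is a sketch with a hypothetical function name; it returns a value instead of calling printf so that the control flow is the only thing on display.

```c
/* Models "if ( i > 0 ) { body }" the way the compiler lays it out:
   test the opposite condition and jump around the body.
   Returns 1 when the body ran and 0 when it was skipped. */
int body_runs_for(int i)
{
    int ran = 0;

    if (i <= 0)          /* CMP i , 0 followed by JLE skip_body */
        goto skip_body;
    ran = 1;             /* The body of the original if statement */
skip_body:
    return ran;
}
```

Only one jump is needed, and when the condition holds, execution simply falls through into the body.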

JA Jump if above

JBE Jump if below or equal

JC Jump if carry

JNC Jump if not carry

JNZ Jump if not 0

JZ Jump if 0

These conditional branch instructions aren't as common as the ones listed earlier, but you might see them in the Disassembly window. You should be able to intuit the condition from the jump names.

Looping

LOOP Loop according to ECX counter

You might not run into too many LOOP instructions because the Microsoft compilers don't generate them that much. In some parts of the operating system core, however (parts that look as if Microsoft wrote them in assembly language), you'll occasionally see them. Using the LOOP instruction is easy. Set ECX equal to the number of times to loop, and then execute a block of code. Immediately following the code is the LOOP instruction, which decrements ECX and then jumps to the top of the block if ECX isn't equal to 0. When ECX reaches 0, the LOOP instruction falls through.
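The LOOP behavior described above corresponds to a C do-while loop that counts down. In this sketch, the function name is hypothetical and the "block of code" is reduced to counting how many times it ran.

```c
#include <stdint.h>

/* Models: MOV ECX , count / top: <block> / LOOP top.
   Returns the number of times the block executed. */
uint32_t loop_iterations(uint32_t ecx)
{
    uint32_t runs = 0;

    do {
        runs++;              /* The block of code being repeated */
        ecx--;               /* LOOP decrements ECX first... */
    } while (ecx != 0);      /* ...then jumps back while ECX isn't 0 */

    return runs;
}
```

Because the decrement happens before the test, the block always runs at least once. Starting with ECX equal to 0 wraps around to 0FFFFFFFFh instead of skipping the loop, so this sketch should not be called with 0 unless you want to wait for about four billion iterations.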

Most of the loops you'll see are a combination of conditional jumps and absolute jumps. In many ways, these loops look like the if statement code presented a moment ago except that the bottom of the if block is a JMP instruction back to the top. The following example is representative of your average code-generation loop.

    void LoopingExample ( int q )
    {
        // Here's the C code:
        // for ( ; q < 10 ; q++ )
        // {
        //     printf ( "q = %d\n" , q ) ;
        // }
        char szFmt[] = "q = %d\n" ;
        __asm
        {
            JMP LE_CompareStep      // First time through, check against
                                    // 10 immediately.
        LE_IncrementStep:
            INC q                   // Increment q.
        LE_CompareStep:
            CMP q , 0Ah             // Compare q to 10.
            JGE LE_End              // If q is >= 10, this function is done.
            MOV ECX , DWORD PTR [q] // Get the value of q into ECX.
            PUSH ECX                // Get the value onto the stack.
            LEA ECX , szFmt         // Get the format string.
            PUSH ECX                // Push the format string onto the stack.
            CALL DWORD PTR [printf] // Print the current iteration.
            ADD ESP , 8             // Clean up the stack.
            JMP LE_IncrementStep    // Increment q, and start again.
        LE_End:                     // The loop is done.
        }
    }

String Manipulation

The Intel CPUs are adept at manipulating strings. In the vernacular of CPUs, being good at string manipulation means that the CPU can manipulate large chunks of memory in a single instruction. All the string instructions I'll show you have several mnemonics, which you'll see if you look them up in the Intel reference manuals, but the Visual C++ Disassembly window always disassembles string instructions into the forms I show. All these instructions can work on byte, word, and double-word size memory.

MOVS Move data from string to string

The MOVS instruction copies the data at the memory address in ESI to the memory address in EDI. The MOVS instruction operates only on values that ESI and EDI point to. You can think of the MOVS instruction as the implementation of the C memcpy function. The Visual C++ Disassembly window always shows the size of the operation with the size specifier, so you can tell at a glance how much memory is being moved. After the move is completed, the ESI and EDI registers are incremented or decremented depending on the Direction Flag in the EFLAGS register (shown as the UP field in the Visual C++ Registers window). If the UP field is 0, the registers are incremented. If the UP field is 1, the registers are decremented. The increment and decrement amounts depend on the size of the operation: 1 for bytes, 2 for words, and 4 for double words.
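The pointer adjustment rule can be written as a one-line C function. The helper below is hypothetical and models only how ESI or EDI changes after a single MOVS, treating the register as a plain 32-bit number.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns the value ESI or EDI holds after one MOVS of the given
   operand size in bytes (1, 2, or 4). */
uint32_t movs_advance(uint32_t address, uint32_t size, bool direction_flag)
{
    /* UP = 0 (flag clear): walk forward through memory.
       UP = 1 (flag set): walk backward. */
    return direction_flag ? address - size : address + size;
}
```

For a double-word move with UP at 0, an address of 1000h becomes 1004h. With UP at 1, a word move takes 1000h down to 0FFEh.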

SCAS Scan string

The SCAS instruction compares the value at the memory address specified by the EDI register with the value in AL, AX, or EAX, depending on the requested size. The various flag values in EFLAGS are set to indicate the comparison values. The flag settings are the same as those shown in Table 6-4. If you scan the string for a NULL terminator, the SCAS instruction can be used to duplicate the functionality of the C strlen function. Like the MOVS instruction, the SCAS instruction autoincrements or autodecrements the EDI register.

STOS Store string

The STOS instruction stores the value in AL, AX, or EAX, depending on the requested size, into the address specified by the EDI register. The STOS instruction is similar to the C memset function. Like both the MOVS and SCAS instructions, the STOS instruction autoincrements or autodecrements the EDI register.

CMPS Compare strings

The CMPS instruction compares two string values and sets the flags in EFLAGS accordingly. Whereas the SCAS instruction compares a string with a single value, the CMPS instruction walks the characters in two strings. The CMPS instruction is similar to the C memcmp function. Like the rest of the string manipulators, the CMPS instruction compares different size values and autoincrements or autodecrements the pointers to both strings (ESI and EDI).

REP Repeat for ECX count

REPE Repeat while equal and ECX count isn't 0

REPNE Repeat while not equal and ECX count isn't 0

The string instructions, though convenient, aren't worth a great deal if they can manipulate only a single unit at a time. The repeat prefixes allow the string instructions to iterate for a set number of times (in ECX) or until the specified condition is met. If you use the Step Into key when a repeat instruction is executing in the Disassembly window, you'll stay on the same line because you're executing the same instruction. If you use the Step Over key, you'll step over the entire iteration. If you're looking for a problem, you might want to use the Step Into key to check the strings in ESI or EDI as appropriate. Another trick when looking at a crash in a repeat prefixed string instruction is to look at the ECX register to see which iteration crashed.
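To see how a repeat prefix, ECX, and the flags interact, here is a C model of a REPNE SCAS scan for a NULL terminator, followed by the NOT and DEC arithmetic that turns the leftover count into a length. The function name is hypothetical, and the sketch assumes byte-size operands with the direction flag clear.

```c
#include <stdint.h>

/* Models: XOR EAX , EAX / MOV ECX , 0FFFFFFFFh /
   REPNE SCAS BYTE PTR [EDI] / NOT ECX / DEC ECX.
   Returns the string length, or 0FFFFFFFFh if ECX ran out first. */
uint32_t strlen_via_repne_scas(const char *edi)
{
    const uint8_t al = 0;            /* The value being searched for */
    uint32_t ecx = 0xFFFFFFFFu;      /* Maximum number of bytes to scan */
    int found = 0;

    while (ecx != 0) {               /* The prefix stops when ECX hits 0 */
        uint8_t current = (uint8_t)*edi++;  /* SCAS reads [EDI], then
                                               advances EDI */
        ecx--;                       /* Each pass counts ECX down by 1 */
        if (current == al) {         /* REPNE also stops once ZF = 1 */
            found = 1;
            break;
        }
    }

    if (!found)
        return 0xFFFFFFFFu;
    return ~ecx - 1u;                /* NOT ECX, then DEC ECX */
}
```

The same bookkeeping explains the ECX trick for crashes: the starting count minus the current ECX value tells you how many passes have completed.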

In talking about the string instructions, I mentioned which C run-time library function each was similar to. The following code shows, without obvious error checking, what the assembly-language equivalents could look like:

    void MemCPY ( char * szSrc , char * szDest , int iLen )
    {
        __asm
        {
            MOV ESI , szSrc     // Set the source string.
            MOV EDI , szDest    // Set the destination string.
            MOV ECX , iLen      // Set the length to copy.
            // Copy away!
            REP MOVS BYTE PTR [EDI] , BYTE PTR [ESI]
        }
    }

    int StrLEN ( char * szSrc )
    {
        int iReturn ;
        __asm
        {
            XOR EAX , EAX               // Zero out EAX.
            MOV EDI , szSrc             // Move the string to check into EDI.
            MOV ECX , 0FFFFFFFFh        // The maximum number of characters to
                                        // check.
            REPNE SCAS BYTE PTR [EDI]   // Compare until ECX=0 or found.
            CMP ECX , 0                 // If ECX is 0, a
            JE StrLEN_NoNull            // NULL wasn't found in the string.
            NOT ECX                     // ECX was counted down, so convert it
                                        // to a positive number.
            DEC ECX                     // Account for hitting the NULL.
            MOV EAX , ECX               // Return the count.
            JMP StrLEN_Done             // Return.
        StrLEN_NoNull:
            MOV EAX , 0FFFFFFFFh        // Because NULL wasn't found, return -1.
        StrLEN_Done:
        }
        __asm MOV iReturn , EAX ;
        return ( iReturn ) ;
    }

    void MemSET ( char * szDest , int iVal , int iLen )
    {
        __asm
        {
            MOV EAX , iVal              // EAX holds the fill value.
            MOV EDI , szDest            // Move the string into EDI.
            MOV ECX , iLen              // Move the count into ECX.
            REP STOS BYTE PTR [EDI]     // Fill the memory.
        }
    }

    int MemCMP ( char * szMem1 , char * szMem2 , int iLen )
    {
        int iReturn ;
        __asm
        {
            MOV ESI , szMem1            // ESI holds the first memory block.
            MOV EDI , szMem2            // EDI holds the second memory block.
            MOV ECX , iLen              // The maximum bytes to compare
            // Compare the memory blocks.
            REPE CMPS BYTE PTR [ESI], BYTE PTR [EDI]
            // memcmp compares bytes as unsigned values, so use the
            // unsigned jumps (JB/JA) rather than the signed ones (JL/JG).
            JB MemCMP_LessThan          // If szMem1 < szMem2
            JA MemCMP_GreaterThan       // If szMem1 > szMem2
            // The memory blocks are equal.
            XOR EAX , EAX               // Return 0.
            JMP MemCMP_Done
        MemCMP_LessThan:
            MOV EAX , 0FFFFFFFFh        // Return -1.
            JMP MemCMP_Done
        MemCMP_GreaterThan:
            MOV EAX , 1                 // Return 1.
            JMP MemCMP_Done
        MemCMP_Done:
        }
        __asm MOV iReturn , EAX
        return ( iReturn ) ;
    }