Working with Strings and Data Arrays | Visual C++ Optimization with Assembly Code

Assembly language also has many advantages for processing strings and data arrays. A whole group of commands has been developed for such operations; in Intel terminology, they are referred to as string commands. While discussing string processing, we will consider the following operations:

Comparing two strings
Copying the source string to the destination string
Reading a string from a device or a file
Writing a string to a device or a file
Determining the length of the string
Finding a substring in the given string
Merging two strings (concatenation)

The string operations are widely used in high-level languages. By implementing these operations in assembly language, you can achieve considerable gain in performance of high-level language applications, especially for those processing a great number of operating system strings or arrays.

First, we will consider the main assembly language commands for manipulating the strings.

String Operations in Assembly Language

A string of characters or a set of numbers, which the program treats as a group, is a common data type. The program may move a string from one location to another, compare it with other strings, or search it for a given value.

The program represents every word, sentence, or other structure as a sequence of characters in memory. Text editors, for example, rely mainly on search and transfer operations. The processor's string commands can perform these operations with minimal overhead and in minimal time.

First, we will address the main principles of working with strings.

The program may perform string operations over bytes, words, and double words.

The string commands do not use the addressing methods used by other commands. They address their operands through the ESI and EDI registers: ESI is used for source operands, and EDI for destination operands. All string commands adjust the address after performing the operation.

A string may consist of many elements, but a string command processes only one element at a time. Automatic incrementing or decrementing of the operand address makes it possible to process string data quickly. The Direction Flag (DF) in the EFLAGS register determines the direction of string processing.

If the Direction Flag is set to 1, the address is decremented; if it is set to 0, the address is incremented. The increment or decrement value itself is determined by the size of the operand. For example, for character strings with 1-byte operands, the string commands change the address by 1 after each operation. If you process an array of integers in which each operand is 4 bytes in size, the string commands change the address by 4. When the operation is completed, the address in the ESI or EDI register points to the next element of the string.
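The address arithmetic described above can be modeled in portable C++. This is only a minimal sketch, not processor code: the helper name next_address and its parameters are hypothetical and serve to show how the direction flag and the operand size combine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model of the address update performed by a string command.
// df           - value of the direction flag (false = 0, true = 1)
// operand_size - 1, 2, or 4 bytes (byte, word, double word)
std::uintptr_t next_address(std::uintptr_t address, bool df, std::size_t operand_size)
{
    assert(operand_size == 1 || operand_size == 2 || operand_size == 4);
    return df ? address - operand_size   // DF = 1: the address is decremented
              : address + operand_size;  // DF = 0: the address is incremented
}
```

For example, with DF = 0 a double-word operation starting at address 100 leaves the register pointing at 104, while with DF = 1 it leaves it at 96.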

Now, we will explore the representation of strings in various programming languages. The most common are null-terminated strings. They are used in the C language and in the Windows operating systems. In assembly language, such a string looks like this:

 String_0  DB "NULL-TERMINATED STRING", 0

We will deal mainly with the null-terminated strings.

Basic String Commands

There are five basic commands for processing the strings, which are often referred to as string commands:

movs: the command for moving a data string from one memory location to another.
lods: loads the string element at the address stored in the ESI register into the EAX (AX, AL) accumulator.
stos: stores the contents of the EAX (AX, AL) accumulator at the address specified in the EDI register.
cmps: the command for comparing the strings located at the addresses stored in the ESI and EDI registers.
scas: the string scanning command, which compares the contents of the EAX (AX, AL) register with the memory value addressed by the EDI register.

Each of the string processing commands has three possible formats. For example, the movs command comes in the following forms: movsb, movsw, and movsd. The movsb command can be used for processing 1-byte operands only, the movsw command deals with words, and movsd is intended for processing double words. The b, w, and d suffixes determine the value of the increment or decrement for the ESI and EDI index registers. If the command is used in the general format, the type of operands should be defined explicitly.

Before performing string commands, make sure to load the addresses of the needed memory areas to the ESI and/or EDI registers.

To perform repeated operations over strings, the repetition prefix (rep) is used in virtually all cases. The number of repetitions of the string operation is determined by the ECX register.

To move data from one location to another, you could combine the lods and stos commands. However, there is a dedicated command for this purpose: the movs command for string transfer. It reads the data at the memory address specified in the ESI register and places it at the address indicated by the EDI register. At the same time, the values of the ESI and EDI registers change so that they point to the next elements of the strings. The movs command does not load the accumulator while transferring the data.
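The relationship between lods, stos, and movs can be illustrated with a small C++ model. It is a sketch under stated assumptions: the Regs structure and the three functions are hypothetical names that only mimic the byte-sized commands with the direction flag cleared; they are not how the processor is implemented.

```cpp
#include <cassert>

// Hypothetical register model; the names mirror the x86 registers for clarity only.
struct Regs {
    const unsigned char* esi;  // source address
    unsigned char*       edi;  // destination address
    unsigned char        al;   // low byte of the accumulator
};

// Every step assumes DF = 0, so the addresses move forward by one byte.
void lodsb(Regs& r) { r.al = *r.esi++; }      // load the byte at [ESI] into AL
void stosb(Regs& r) { *r.edi++ = r.al; }      // store AL at [EDI]
void movsb(Regs& r) { *r.edi++ = *r.esi++; }  // copy [ESI] to [EDI]; AL is untouched
```

Calling lodsb followed by stosb moves one byte and leaves a copy of it in al, whereas movsb moves the byte without changing al.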

The movs command takes two operands, both located in memory. Apart from movs, only one other string command (cmps) can work with two memory operands. All the other string commands require one of the operands to be held in a processor register. The movs command, like the lods and stos commands, can work with bytes, words, and double words.

Since the string commands use fixed addressing through ESI and EDI, it is the developer who must control what type of operands are passed to the command. Both operands of the command should be of the same type. You can also add a suffix to the command to specify the transfer type: the movsb command is used for byte strings, and movsw for strings consisting of words. If the program uses the command in its basic form (movs), the assembler verifies the variables, checking whether the segment addressing is correct and the operands are of the same type.

The movs command with the rep prefix is an efficient way to transfer a block. With the character counter in the ECX register, and the DF flag specifying the transfer direction, the rep movs command is a very quick way to transfer data from one memory location to another.

With regard to the scanning operation (scas) repeated with the repne prefix, if the ECX register reaches zero without a match, the last scanning iteration leaves the zero flag (ZF) cleared. This indicates that there is no corresponding element in the array.
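The outcome of a repeated scan can be modeled in C++ as follows. This is a sketch, assuming the scan is performed byte by byte with the repne prefix and DF = 0; the ScanResult structure and the repne_scasb function are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical snapshot of the state after the scan.
struct ScanResult {
    bool zf;                   // true if the last comparison found a match
    std::size_t ecx;           // remaining repetition count
    const unsigned char* edi;  // address just past the last element examined
};

ScanResult repne_scasb(const unsigned char* edi, std::size_t ecx, unsigned char al)
{
    bool zf = false;  // modeling choice: report "no match" if ECX is 0 on entry
    while (ecx != 0) {
        zf = (*edi == al);  // compare AL with the byte at [EDI]
        ++edi;              // the address is adjusted after every comparison
        --ecx;
        if (zf) break;      // repne stops as soon as a match is found
    }
    return ScanResult{zf, ecx, edi};
}
```

When the value is found, zf is true and edi points one element past the match, which is why code that needs the matching position steps the address back by one element.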

The Use of String Commands

Copying the Strings

The following program illustrates the process of copying one string to another, with both strings having the CString type. We will use the C++ .NET Application Wizard to develop a dialog box application. In the main application form, place two Edit Control controls. These will be linked to the variables for the source string (cSrc) and the destination string (cDst), both of the CString type. In the cSrc editing field, enter the string that you need to copy to cDst and display on the screen. We will also add two Static Text controls and a Button control to the form. When the button is clicked, the contents of the cSrc string are copied to the cDst string. For clarity, we will use the interim variables (s1 and s2) of the CString type. The source code for the button click handler is shown in Listing 2.12.

Listing 2.12: Using the assembly-language commands for copying a string in a button click handler in a C++ .NET program

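 . . .
 void CCP_STRINGDlg::OnBnClickedButton1()
 {
   // TODO: Add your control notification handler code here.
   UpdateData(TRUE);
   CString s1, s2;
   LPTSTR lps1, lps2;
   s1 = cSrc;
   s2 = cDst;
   lps1 = s1.GetBuffer(32);
   int lsrc = s1.GetLength();
   lps2 = s2.GetBuffer(lsrc);   // destination buffer must hold lsrc characters
   // The byte-wise copy assumes a build with 1-byte characters (ANSI/MBCS).
   _asm {
         mov      ESI, lps1     // load the pointer values, not their addresses
         mov      EDI, lps2
         mov      ECX, lsrc
         cld
   next:
         lodsb
         stosb
         loop     next
   };
   s1.ReleaseBuffer();
   s2.ReleaseBuffer(lsrc);      // set the new length and terminate the string
   cDst = s2;
   UpdateData(FALSE);
 }
 . . .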

Now, we will analyze the code of the handler. The contents of the cSrc variable are placed in the s1 string. Then, the s1 string is copied to s2. Finally, the contents of the s2 string are displayed in the editing field corresponding to the cDst variable. When working with strings of the CString type, it is convenient to treat them as null-terminated strings. To do this, you need to know the address of the string buffer and the length of the string. To obtain these parameters, use the following operators:

 CString s1, s2;
 LPTSTR lps1, lps2;
 s1 = cSrc;
 s2 = cDst;
 lps1 = s1.GetBuffer(32);
 int lsrc = s1.GetLength();
 lps2 = s2.GetBuffer(lsrc);

The string buffer size has been made equal to 32 for convenience only.

After that, we copy the string with the lps1 pointer to the string determined by the lps2 pointer. To do this, use the following assembly-language commands:

 . . .
 _asm {
       mov     ESI, lps1     // pointer to the source buffer
       mov     EDI, lps2     // pointer to the destination buffer
       mov     ECX, lsrc
       cld
 next:
       lodsb
       stosb
       loop    next
 }
 . . .

Before starting the copying operation, you need to set up the Direction Flag so that the source and destination addresses are incremented after each operation. To do this, clear the flag (set it to 0) with the cld command. The length of the source string (lsrc) is placed in the ECX register.

There are two commands that perform the copying operation: lodsb and stosb . The lodsb command loads the byte from the memory location determined by the ESI register (the lps1 string) to the AL accumulator. And the stosb command writes the resulting byte from the accumulator to the memory address stored in EDI (the lps2 string).

After the data transfer operation, the values of the ESI and EDI registers are incremented by 1 automatically. In this case, the increment value is determined by the type of string command. Our program performs the copying operations over bytes; that is why the addresses will be incremented by 1.

For this code fragment to operate properly, the memory space allocated for the destination string (lps2) must be at least as large as the source string.
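The buffer-size requirement can be made explicit in code. The following portable C++ sketch is not part of the original program: the function name copy_string_checked is hypothetical, and std::memcpy stands in for the byte-by-byte copy loop.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copies n bytes from src and appends a terminating null.
// Returns false, leaving dst untouched, if dst_capacity is too small.
bool copy_string_checked(char* dst, std::size_t dst_capacity,
                         const char* src, std::size_t n)
{
    if (dst_capacity < n + 1) return false;  // room for n bytes plus the null
    std::memcpy(dst, src, n);                // counterpart of the copy loop
    dst[n] = '\0';
    return true;
}
```

Checking the capacity before the copy is a design choice of this sketch; the assembly fragment itself performs no such check and relies on the caller to provide a large enough buffer.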

Fig. 2.9 shows the results of the copying operation.

Fig. 2.9: Application that copies one string to another

To simplify the previous program, you can replace the two lodsb and stosb commands with a single movsb command for copying the strings (Listing 2.13).

Listing 2.13: Using the movsb command for copying the strings

 . . .
 _asm {
       mov      ESI, lps1     // pointer to the source buffer
       mov      EDI, lps2     // pointer to the destination buffer
       mov      ECX, lsrc
       cld
 next:
       movsb
       loop     next
 }
 . . .

To further simplify the source code of the program, you can use the movsb command with the repetition prefix (rep). The rep prefix uses the contents of the ECX register as the repetition count:

 _asm {
       mov      ESI, lps1     // pointer to the source buffer
       mov      EDI, lps2     // pointer to the destination buffer
       mov      ECX, lsrc
       cld
       rep      movsb
 }

Copying the Arrays

The copying operations can also be applied to arrays of integers or floating-point numbers. The following console program (Listing 2.14) uses assembly commands to copy the contents of one integer array (sarray) to another (darray).

Listing 2.14: Using assembly language for copying an array of integers

 // COPY_INT_ARRAYS.cpp : Defines the entry point
 // for the console application
 #include "stdafx.h"
 int _tmain(int argc, _TCHAR* argv[])
 {
   int sarray[6] = {245, 11, 34, 56, 7, 19};
   int darray[8] = {0, 0, 14, 45, 56, 7, 21, 56};
   int lenarray = sizeof(sarray) / 4;
   printf("sarray: ");
   for (int cnt = 0; cnt < sizeof(sarray)/4; cnt++)
       printf("%d\t", sarray[cnt]);
   printf("\ndarray: ");
   for (int cnt = 0; cnt < sizeof(darray)/4; cnt++)
       printf("%d\t", darray[cnt]);
   _asm {
         cld
         mov  ECX, DWORD PTR lenarray
         lea  ESI, DWORD PTR sarray
         lea  EDI, DWORD PTR darray
         rep  movsd
   };
   printf("\ndarray after copy: ");
   for (int cnt = 0; cnt < sizeof(darray)/4; cnt++)
       printf("%d\t", darray[cnt]);
   getchar();
   return 0;
 }

The source code of the program is easy to analyze. Note the use of the rep movsd command for copying double words. Fig. 2.10 shows the application window.
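For comparison, the same copy can be written in portable C++. This is a sketch rather than the program's actual implementation: the template copy_ints is a hypothetical helper, and it derives the element count from the array types instead of dividing the byte size by 4.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Copies all NS elements of src to the beginning of dst.
// The array sizes are deduced at compile time, so a destination that is
// too small is rejected by the compiler instead of being overrun.
template <std::size_t NS, std::size_t ND>
void copy_ints(const int (&src)[NS], int (&dst)[ND])
{
    static_assert(NS <= ND, "destination array is too small");
    std::copy(src, src + NS, dst);
}
```

With the arrays from Listing 2.14, the first six elements of darray are overwritten and the last two (21 and 56) are left unchanged.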

Fig. 2.10: Application copying the elements of one array to another

Concatenation of Strings

The movs command can also be used for another useful operation, concatenation. This operation appends the elements of the source string to the destination string. In this case, make sure that the destination buffer is large enough to hold the merged string. The following technique is often used: place space characters at the end of the destination string, and then replace them with the elements of the source string. The size of the destination buffer must be at least equal to the sum of the lengths of the concatenated strings. You can see this technique illustrated in the source code of a C++ .NET console application (Listing 2.15).

Listing 2.15: Using assembly language for string concatenation in a C++ .NET program

 // STRINGS_CONCAT.cpp : Defines the entry point
 // for the console application
 #include "stdafx.h"
 #include <string.h>
 int _tmain(int argc, _TCHAR* argv[])
 {
   char s1[] = "Visual            ";
   char s2[] = " C++ .NET";
   printf("String-destination: %s\n", s1);
   printf("String-source: %s\n", s2);
   int ls2 = (int)strlen(s2);
   _asm {
         lea   ESI, DWORD PTR s2
         lea   EDI, DWORD PTR s1
         cld
         mov   AL, ' '
   again:
         scasb              // look for the first space in s1
         je    next
         jmp   again
   next:
         dec   EDI          // step back to the space that was found
         mov   ECX, ls2
         rep   movsb
   };
   printf("Result of concatenation : %s\n", s1);
   getchar();
   return 0;
 }

Look carefully at the following assembly commands:

 lea   ESI, DWORD PTR s2
 lea   EDI, DWORD PTR s1
 cld
 mov   AL, ' '

They load the addresses of the s2 and s1 strings to the ESI and EDI registers, respectively, and clear the direction flag so that the addresses are incremented. The space character is placed in the AL register; the scasb loop uses it to find the position in the s1 buffer at which the elements of the s2 string will be placed.

To copy the s2 string over the space characters in s1, use the rep movsb command. At this point, the ECX register holds the length of the s2 string.
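The same padding technique can be sketched in portable C++. The function name concat_into_padding is hypothetical, and unlike the assembly fragment this version checks that a space exists and that the padding is long enough, which is an addition made for safety.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Overwrites the padding of dst, starting at its first space, with src.
// Returns false if dst contains no space or the room after it is too short.
bool concat_into_padding(char* dst, const char* src)
{
    std::size_t dst_len = std::strlen(dst);
    std::size_t src_len = std::strlen(src);
    // Counterpart of the scasb loop: find the first space in dst.
    char* first_space = static_cast<char*>(std::memchr(dst, ' ', dst_len));
    if (first_space == nullptr) return false;
    std::size_t room = dst_len - static_cast<std::size_t>(first_space - dst);
    if (src_len > room) return false;
    // Counterpart of rep movsb: copy src over the padding.
    std::memcpy(first_space, src, src_len);
    return true;
}
```

Any padding that is not overwritten stays in place, so the result may end with trailing spaces, just as in the assembly version.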

Fig. 2.11 illustrates the window with this application running.

Fig. 2.11: Application that performs concatenation of two strings

Concatenation of Arrays

The concatenation of arrays of integers or floating-point numbers is slightly different from the corresponding operation over character strings, although they have much in common. The destination array should be large enough to hold the new elements from the source array. When computing the offset into the destination array, take the byte size of the array element into account.

In Listing 2.16, note the source code of a console application that performs concatenation of two integer arrays.

Listing 2.16: Concatenation of two integer arrays

 // CONCAT_INT_AARAYS.cpp : Defines the entry point
 // for the console application
 #include "stdafx.h"
 int _tmain(int argc, _TCHAR* argv[])
 {
   int i1[] = {23, 44, 8, 0, 0, 0, 0};
   int i2[] = {56, 7, 3, 7};
   int ilen = sizeof(i2)/4;
   printf("Source array i2:\t ");
   for (int cnt = 0; cnt < sizeof(i2)/4; cnt++)
     printf("%d\t", i2[cnt]);
   printf("\nDest array i1:\t\t ");
   for (int cnt = 0; cnt < sizeof(i1)/4; cnt++)
     printf("%d\t", i1[cnt]);
   _asm {
         lea   ESI, DWORD PTR i2
         lea   EDI, DWORD PTR i1
         cld
         mov   EAX, 0
   again:
         scasd              // look for the first zero element in i1
         je    next
         jmp   again
   next:
         sub   EDI, 4       // step back one 4-byte element
         mov   ECX, ilen
         rep   movsd
   };
   printf("\nConcatenated arrays:\t ");
   for (int cnt = 0; cnt < sizeof(i1)/4; cnt++)
     printf("%d\t", i1[cnt]);
   getchar();
   return 0;
 }

This code fragment writes the elements of the source array (i2) to the destination array (i1), beginning with the fourth element. The addresses of the first elements of these arrays are placed in the ESI and EDI registers, and the ECX register holds the number of elements to be written (it is equal to the size of the i2 array).
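A portable C++ counterpart is sketched here, assuming as the listing does that a zero element marks the start of the free space in the destination array. The function name concat_into_zeros is hypothetical. Pointer arithmetic in C++ already works in units of elements, so the 4-byte adjustment seen in the assembly code does not appear explicitly.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Copies src over dst starting at the first zero element of dst.
// Returns false if fewer than src_count elements remain from that position
// (this includes the case where dst has no zero element and src is non-empty).
bool concat_into_zeros(int* dst, std::size_t dst_count,
                       const int* src, std::size_t src_count)
{
    int* dst_end = dst + dst_count;
    int* first_zero = std::find(dst, dst_end, 0);  // counterpart of the scasd loop
    if (static_cast<std::size_t>(dst_end - first_zero) < src_count) return false;
    std::copy(src, src + src_count, first_zero);   // counterpart of rep movsd
    return true;
}
```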

Fig. 2.12 shows the application window.

Fig. 2.12: Application that performs concatenation of arrays

Comparing Two Strings

Another frequently used array operation is comparison. To compare the elements of strings and arrays, you can use the cmps command and its modifications. The following program code fragment in the MASM assembly language (Listing 2.17) compares two character strings.

Listing 2.17: Using the assembly commands for comparing two strings

 . . .
   SRC    DB "STRING 1 "
   LSRC   EQU $ - SRC
   DST    DB "STRING 1"
   LDST   EQU $ - DST
   FLAG   DD 0
 . . .
   cld
   lea    ESI, SRC
   lea    EDI, DST
   mov    ECX, LSRC
   mov    EDX, LDST
   cmp    ECX, EDX
   je     next_check
   jmp    continue
 next_check:
   repe   cmpsb
   je     equal
   mov    EAX, FLAG
   jmp    continue
 equal:
   mov    FLAG, 1
 continue:
 . . .

Because the strings are compared byte by byte, this code fragment uses the cmpsb command with the repe repetition prefix. If the strings are identical, the FLAG variable is set to 1; otherwise, it keeps its initial value of 0. In our example, the strings differ in length, so the byte comparison is skipped and FLAG remains 0. The result is the same if the strings contain the same number of elements but differ in at least one of them.
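The comparison logic (lengths first, then contents) can be sketched in portable C++. The function name equal_bytes is hypothetical, and std::memcmp is used here as a stand-in for repe cmpsb.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Returns true only if both byte sequences have the same length and contents.
bool equal_bytes(const char* a, std::size_t a_len,
                 const char* b, std::size_t b_len)
{
    if (a_len != b_len) return false;      // counterpart of cmp ECX, EDX
    return std::memcmp(a, b, a_len) == 0;  // counterpart of repe cmpsb
}
```

For the data in Listing 2.17, the lengths are 9 and 8, so the function returns false without examining the contents.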

Comparing Two Arrays

Now, we can modify the previous code fragment, adjusting it to compare the arrays of integers. The resulting program code is shown in Listing 2.18.

Listing 2.18: Comparing the arrays of integers

 . . .
   ISRC   DD  3, 16, 89, 11
   LISRC  EQU ($ - ISRC)/4
   IDST   DD  3, 16, 89, 11, 9
   LIDST  EQU ($ - IDST)/4
   FLAG   DD  0
 . . .
   cld
   lea    ESI, ISRC
   lea    EDI, IDST
   mov    ECX, LISRC
   mov    EDX, LIDST
   cmp    ECX, EDX
   je     next_check
   jmp    continue
 next_check:
   repe   cmpsd
   je     equal
   mov    EAX, FLAG
   jmp    continue
 equal:
   mov    FLAG, 1
 continue:
 . . .

The difference between the program code for processing integer arrays and that for processing bytes is mainly related to the size of the operands. Since an integer takes up 4 bytes in the memory, you should replace the cmpsb command with the cmpsd command to compare double words. As before, we write the size of the original array to the ECX register, but this size is now measured as the number of double words. That is why we need to divide the resulting values by 4:

 ISRC   DD  3, 16, 89, 11
 LISRC  EQU ($ - ISRC)/4
 IDST   DD  3, 16, 89, 11, 9
 LIDST  EQU ($ - IDST)/4

Filling a String or Array

One more example that is often useful is the task of filling a certain memory location with the given character or number. For example, if you need to fill a character string with space characters, you can use the following code (Listing 2.19).

Listing 2.19: Using the assembly-language commands to fill a string with space characters

 . . .
 SRC    DB "This string will be filled with space characters"
 LSRC   EQU $ - SRC
 . . .
 cld
 mov    AL, ' '
 mov    ECX, LSRC
 lea    EDI, SRC
 rep    stosb
 . . .

To fill an integer array with zeroes, you can use the code fragment presented in Listing 2.20.

Listing 2.20: Filling an integer array with zeroes

 . . .
 ISRC    DD 3, 16, 89, 11, 99, 4
 LISRC   EQU ($ - ISRC)/4
 . . .
 cld
 lea     EDI, ISRC
 mov     ECX, LISRC
 mov     EAX, 0
 rep     stosd
 . . .
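For reference, both fill operations have direct counterparts in the C++ standard library. The sketch below uses hypothetical wrapper names (fill_with_spaces and fill_with_zeroes); std::memset and std::fill play the roles of rep stosb and rep stosd.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// Replaces every character of a null-terminated string with a space.
void fill_with_spaces(char* s)
{
    std::memset(s, ' ', std::strlen(s));
}

// Sets the first count elements of an integer array to zero.
void fill_with_zeroes(int* a, std::size_t count)
{
    std::fill(a, a + count, 0);
}
```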

The string commands of assembly language can be very useful for optimizing programs created in Visual C++ .NET. Any high-level language has its own facilities for copying strings, concatenation, searching for elements, and filling a memory location with certain values. When implemented in assembly language, however, such operations often require less program code and can run faster.

Conversion between Lowercase and Uppercase Characters

Now, we will consider one more example dealing with string operations. You may often need to convert lowercase characters to the corresponding uppercase ones. For this task, string commands would complicate the code without a clear benefit, so we will use ordinary commands instead. Listing 2.21 shows the full source code of the corresponding C++ console application.

Listing 2.21: Converting lowercase characters to uppercase ones

 // CONVERT_TO_UPPER.cpp : Defines the entry point
 // for the console application
 #include "stdafx.h"
 #include <string.h>
 int _tmain(int argc, _TCHAR* argv[])
 {
   char s1[] = "this string must be converted to uppercase";
   int ls1 = (int)strlen(s1);
   printf("Before: %s\n", s1);
   _asm {
         lea    ESI, DWORD PTR s1
         mov    ECX, DWORD PTR ls1
   next:
         mov    AL, BYTE PTR [ESI]
         cmp    AL, 'a'
         jb     next_addr
         cmp    AL, 'z'
         ja     next_addr
         and    AL, 0dfh       // clear bit 5 to get the uppercase letter
         mov    BYTE PTR [ESI], AL
   next_addr:
         inc    ESI
         loop   next
   }
   printf("After: %s\n", s1);
   getchar();
   return 0;
 }

To convert the characters to uppercase, we use a block of commands in assembly language. Before the conversion, the address of the s1 string is loaded to the ESI register, and its length (ls1) to the ECX register. As we deal with letters here, we need to analyze only the elements in the 'a' to 'z' range. The conversion algorithm is implemented in the following code fragment:

 . . .
 next:
   mov  AL, BYTE PTR [ESI]
   cmp  AL, 'a'
   jb   next_addr
   cmp  AL, 'z'
   ja   next_addr
   and  AL, 0dfh
   mov  BYTE PTR [ESI], AL
 next_addr:
   inc  ESI
   loop next
 . . .
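The mask works because, in ASCII, a lowercase letter and its uppercase counterpart differ only in bit 5 ('a' is 61h and 'A' is 41h), so clearing that bit with 0DFh yields the uppercase code. The same algorithm can be sketched in portable C++; the function name to_upper_ascii is hypothetical, and the code assumes plain ASCII text.

```cpp
#include <cassert>
#include <cstring>

// Converts the ASCII letters 'a'..'z' of a null-terminated string to uppercase
// in place; all other characters are left unchanged.
void to_upper_ascii(char* s)
{
    for (; *s != '\0'; ++s) {
        if (*s >= 'a' && *s <= 'z')
            *s = static_cast<char>(*s & 0xDF);  // clear bit 5
    }
}
```

The range check matters: applying the mask to characters outside 'a' to 'z' would corrupt them (for example, '{' would turn into '[').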

Fig. 2.13 illustrates the window with this application running.

Fig. 2.13: Application converting lowercase characters to uppercase ones

Optimizing String Operations without Using String Commands

You may have the impression that string operations are efficient only if implemented through string commands. But now, we will consider an example that contains no string commands at all. You will see that the string operations can be implemented efficiently by using the ordinary assembly commands as well. This approach is illustrated in the following console application that compares two strings (Listing 2.22).

Listing 2.22: An implementation of string operations without using the string commands

 // CMP_STRINGS_WITHOUT_PRIMITIVES.cpp : Defines the entry point
 // for the console application
 #include "stdafx.h"
 int _tmain(int argc, _TCHAR* argv[])
 {
   char s1[] = "string 1";
   char s2[] = "stRing 1";
   int result;   // 4 bytes wide, because the assembly block stores EAX into it
   printf("String 1: %s\n", s1);
   printf("String 2: %s\n", s2);
   _asm {
         lea  ESI, DWORD PTR s1 // The address of the s1 string
         lea  EDI, DWORD PTR s2 // The address of the s2 string
   again:
         mov  AL, BYTE PTR [ESI]
         mov  DL, BYTE PTR [EDI]
         push EAX
         push EDX
         xor  AL, DL
         pop  EDX
         pop  EAX
         jz   streq
         jmp  strnot_eq
   streq:
         test AL, DL
         jz   succ
         inc  ESI
         inc  EDI
         jmp  again
   strnot_eq:
         mov  EAX, 0
         jmp  quit
   succ:
         mov  EAX, 1
   quit:
         mov  DWORD PTR result, EAX
   };
   if (result)
      printf("Equal\n");
   else
      printf("Not equal\n");
   getchar();
   return 0;
 }

In the assembly block, the address of the first string (s1) is placed into the ESI register, and the address of the second string (s2) into EDI. The string elements are loaded into the AL and DL registers, where their values are compared:

 . . .
 mov  AL, BYTE PTR [ESI]
 mov  DL, BYTE PTR [EDI]
 push EAX
 push EDX
 xor  AL, DL
 pop  EDX
 pop  EAX
 jz   streq
 jmp  strnot_eq
 . . .

If the characters are not equal, the program leaves the loop and sets the result to 0. If the characters are equal, the program checks whether they are equal to 0 (see the jz streq jump command):

 . . .
 streq:
   test AL, DL
   jz   succ
   inc  ESI
   inc  EDI
   jmp  again
 . . .

If the elements are equal to 0, the end of the strings has been reached, and the comparison is completed successfully (the strings are equal). In this case, the result variable is set to 1. However, if the elements are not equal to 0 (though they are equal to each other), the program moves on to the next address in the strings and repeats the comparison loop.
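The control flow of this loop maps directly onto a few lines of portable C++. The sketch below uses a hypothetical function name (strings_equal) and follows the same two checks per character: inequality first, then the end-of-string test.

```cpp
#include <cassert>

// Compares two null-terminated strings character by character.
bool strings_equal(const char* s1, const char* s2)
{
    for (;; ++s1, ++s2) {
        if (*s1 != *s2) return false;  // characters differ: strings are not equal
        if (*s1 == '\0') return true;  // both characters are 0: end reached
    }
}
```

A string that is a prefix of the other is also reported as not equal, because its terminating 0 differs from the other string's next character.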

In our example, the strings are not equal, so the application window displays the corresponding message (Fig. 2.14).

Fig. 2.14: Application for comparing two strings

As you can see, you can perform string manipulations even without using the specialized string commands, but the resulting code appears somewhat bulky because of the additional operations for incrementing or decrementing the addresses, as well as the additional commands comparing the characters and analyzing the end of strings. The highest performance for string operations is usually achieved in copying one string to another or in moving the string elements from one memory location to another. This is especially evident when you have to move large amounts of data.

A smaller gain in performance (as compared to ordinary commands) can also be achieved by search and scanning commands. In particular, the performance of these string operations is influenced by the size of the operands.