4.11 HLA Strings


As the previous section notes, HLA strings consist of four components: a maximum length, a current string length, character data, and a zero terminating byte. However, HLA never requires you to create string data by manually emitting these components yourself. HLA is smart enough to automatically construct this data for you whenever it sees a string literal constant. So if you use a string constant like the following, understand that somewhere HLA is creating the four-component string in memory for you:

 stdout.put( "This gets converted to a four-component string by HLA" ); 

HLA doesn't actually work directly with the string data described in the previous section. Instead, when HLA sees a string object it always works with a pointer to that object rather than working directly with the object. Without question, this is the most important fact to know about HLA strings and is the biggest source of problems beginning HLA programmers have with strings in HLA: strings are pointers! A string variable consumes exactly four bytes, the same as a pointer (because it is a pointer!). Having said all that, let's take a look at a simple string variable declaration in HLA:

 static
     StrVariable: string;

Because a string variable is a pointer, you must initialize it before you can use it. There are three general ways you may initialize a string variable with a legal string address: using static initializers, using the stralloc routine, or calling some other HLA Standard Library routine that initializes a string or returns a pointer to a string.

In one of the static declaration sections that allow initialized variables (static and readonly), you can initialize a string variable using the standard initialization syntax, e.g.,

 static
     InitializedString: string := "This is my string";

Note that this does not initialize the string variable with the string data. Instead, HLA creates the string data structure (see the previous section) in a special, hidden memory segment and initializes the InitializedString variable with the address of the first character in this string (the "T" in "This"). Remember, strings are pointers! The HLA compiler places the actual string data in a read-only memory segment. Therefore, you cannot modify the characters of this string literal at runtime. However, because the string variable (a pointer, remember) is in the static section, you can change the string variable so that it points at different string data.
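The last point, that the variable can be redirected even though the literal's characters cannot be changed, can be sketched as follows. This is a minimal illustration, not a listing from the text; the program name ptrDemo and the variable names first and second are hypothetical.

```hla
// Sketch: a string variable is a pointer, so assigning one string variable
// to another copies only the four-byte address, not the characters.
// (ptrDemo, first, and second are made-up names for illustration.)

program ptrDemo;
#include( "stdlib.hhf" );

static
    first:  string := "First string";
    second: string := "Second string";

begin ptrDemo;

    stdout.put( "Before: first = '", first, "'", nl );

    mov( second, eax );     // Load the address held in second...
    mov( eax, first );      // ...and store it into first.

    // first and second now refer to the same string data.

    stdout.put( "After:  first = '", first, "'", nl );

end ptrDemo;
```

No character data moves here. Only the pointer stored in first changes, which is why the assignment works even though the literal itself sits in read-only memory.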

Because string variables are pointers, you can load the value of a string variable into a 32-bit register. The pointer itself points at the first character position of the string. You can find the current string length in the double word four bytes prior to this address, and the maximum string length in the double word eight bytes prior to this address. The program in Listing 4-8 demonstrates one way to access this data.[13]

Listing 4-8: Accessing the Length and Maximum Length Fields of a String.

 // Program to demonstrate accessing Length and Maxlength fields of a string.

 program StrDemo;
 #include( "stdlib.hhf" );

 static
     theString:string := "String of length 19";

 begin StrDemo;

     mov( theString, ebx );  // Get pointer to the string.

     mov( [ebx-4], eax );    // Get current length
     mov( [ebx-8], ecx );    // Get maximum length

     stdout.put
     (
         "theString = '", theString, "'", nl,
         "length( theString )= ", (type uns32 eax ), nl,
         "maxLength( theString )= ", (type uns32 ecx ), nl
     );

 end StrDemo;

When accessing the various fields of a string variable it is not wise to access them using fixed numeric offsets as done in Listing 4-8. In the future, the definition of an HLA string may change slightly. In particular, the offsets to the maximum length and length fields are subject to change. A safer way to access string data is to coerce your string pointer using the str.strRec data type. The str.strRec data type is a record data type (see the section on records a little later in this chapter) that defines symbolic names for the offsets of the length and maximum length fields in the string data type. Were the offsets to the length and maximum length fields to change in a future version of HLA, then the definitions in str.strRec would also change, so if you use str.strRec then recompiling your program would automatically make any necessary changes to your program.

To use the str.strRec data type properly, you must first load the string pointer into a 32-bit register, e.g., "mov( SomeString, EBX );". Once the pointer to the string data is in a register, you can coerce that register to the str.strRec data type using the HLA construct "(type str.strRec [EBX])". Finally, to access the length or maximum length field, you would use "(type str.strRec [EBX]).length" or "(type str.strRec [EBX]).MaxStrLen", respectively. Although there is a little more typing involved (versus using simple offsets like "-4" or "-8"), these forms are far more descriptive and much safer than straight numeric offsets. The program in Listing 4-9 corrects the example in Listing 4-8 by using the str.strRec data type.

Listing 4-9: Correct Way to Access Length and MaxStrLen Fields of a String.

 // Program to demonstrate accessing Length and Maxlength fields of a string.

 program LenMaxlenDemo;
 #include( "stdlib.hhf" );

 static
     theString:string := "String of length 19";

 begin LenMaxlenDemo;

     mov( theString, ebx );  // Get pointer to the string.

     mov( (type str.strRec [ebx]).length, eax );     // Get current length
     mov( (type str.strRec [ebx]).MaxStrLen, ecx );  // Get maximum length

     stdout.put
     (
         "theString = '", theString, "'", nl,
         "length( theString )= ", (type uns32 eax ), nl,
         "maxLength( theString )= ", (type uns32 ecx ), nl
     );

 end LenMaxlenDemo;
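Footnote 13 recommends going one step further and letting the HLA String Library fetch the length for you. A minimal sketch of that approach, assuming the Standard Library's str.length routine (which returns the current length in EAX); the program name strLengthDemo is hypothetical:

```hla
// Sketch: obtain the current length through a library call instead of
// reading the length field directly.
// Assumes str.length( s ) returns the current length in EAX.

program strLengthDemo;
#include( "stdlib.hhf" );

static
    theString: string := "String of length 19";

begin strLengthDemo;

    str.length( theString );    // Current length comes back in EAX.

    stdout.put
    (
        "theString = '", theString, "'", nl,
        "length( theString )= ", (type uns32 eax ), nl
    );

end strLengthDemo;
```

With this form the program never mentions the field layout at all, so a change to the string format would require only relinking against the updated library.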

A second way to manipulate strings in HLA is to allocate storage on the heap to hold string data. String code accesses the eight bytes immediately preceding the address a string pointer holds, so a pointer returned by malloc cannot be used directly as a string pointer, and you shouldn't use malloc to allocate storage for string data. Fortunately, the HLA Standard Library memory module provides a memory allocation routine specifically designed to allocate storage for strings: stralloc. Like malloc, stralloc expects a single double word parameter. This value specifies the (maximum) number of characters needed for the string. The stralloc routine will allocate the specified number of bytes of memory, plus between 9 and 13 additional bytes to hold the extra string information.[14]

The stralloc routine will allocate storage for a string, initialize the maximum length to the value passed as the stralloc parameter, initialize the current length to zero, and store a zero (terminating byte) in the first character position of the string. After all this, stralloc returns the address of the zero terminating byte (that is, the address of the first character element) in the EAX register.
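One way to see this initialization for yourself is to print the two fields immediately after the call. The sketch below reuses the str.strRec coercion from Listing 4-9; the program name strallocFields and the variable s are hypothetical, and the exact values printed depend on your version of the Standard Library.

```hla
// Sketch: inspect the fields of a freshly allocated string.
// (strallocFields and s are made-up names for illustration.)

program strallocFields;
#include( "stdlib.hhf" );

static
    s: string;

begin strallocFields;

    stralloc( 16 );         // Request room for up to 16 characters.
    mov( eax, s );          // Save the pointer in the string variable.
    mov( eax, ebx );        // Keep a copy in EBX for the field accesses.

    mov( (type str.strRec [ebx]).length, eax );     // Current length
    mov( (type str.strRec [ebx]).MaxStrLen, ecx );  // Maximum length

    stdout.put
    (
        "length    = ", (type uns32 eax ), nl,
        "MaxStrLen = ", (type uns32 ecx ), nl
    );

    strfree( s );           // Release the storage stralloc returned.

end strallocFields;
```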

Once you've allocated storage for a string, you can call various string manipulation routines in the HLA Standard Library to manipulate the string. The next section will discuss the HLA string routines in detail; this section will introduce a couple of string-related routines for the sake of example. The first such routine is "stdin.gets( strvar );". This routine reads a string from the user and stores the string data into the string storage pointed at by the string parameter (strvar in this case). If the user attempts to enter more characters than the maximum the string allows, then stdin.gets raises the ex.StringOverflow exception. The program in Listing 4-10 demonstrates the use of stralloc.

Listing 4-10: Reading a String from the User.

 // Program to demonstrate stralloc and stdin.gets.

 program strallocDemo;
 #include( "stdlib.hhf" );

 static
     theString:string;

 begin strallocDemo;

     stralloc( 16 );           // Allocate storage for the string and store
     mov( eax, theString );    // the pointer into the string variable.

     // Prompt the user and read the string from the user:

     stdout.put( "Enter a line of text (16 chars, max): " );
     stdin.flushInput();
     stdin.gets( theString );

     // Echo the string back to the user:

     stdout.put( "The string you entered was: ", theString, nl );

 end strallocDemo;

If you look closely, you see a slight defect in the program above. It allocates storage for the string by calling stralloc, but it never frees the storage allocated. Even though the program immediately exits after the last use of the string variable, and the operating system will deallocate the storage anyway, it's always a good idea to explicitly free up any storage you allocate. Doing so keeps you in the habit of freeing allocated storage (so you don't forget to do it when it's important) and, also, programs have a way of growing such that an innocent defect that doesn't affect anything in today's program becomes a showstopping defect in tomorrow's version.

To free storage you allocate via stralloc, you must call the strfree routine, passing the string pointer as the single parameter. The program in Listing 4-11 is a version of Listing 4-10 with this defect corrected.

Listing 4-11: Corrected Program That Reads a String from the User.

 // Program to demonstrate stralloc, strfree, and stdin.gets.

 program strfreeDemo;
 #include( "stdlib.hhf" );

 static
     theString:string;

 begin strfreeDemo;

     stralloc( 16 );            // Allocate storage for the string and store
     mov( eax, theString );     // the pointer into the string variable.

     // Prompt the user and read the string from the user:

     stdout.put( "Enter a line of text (16 chars, max): " );
     stdin.flushInput();
     stdin.gets( theString );

     // Echo the string back to the user:

     stdout.put( "The string you entered was: ", theString, nl );

     // Free up the storage allocated by stralloc:

     strfree( theString );

 end strfreeDemo;

When looking at this corrected program, please take note that the stdin.gets routine expects you to pass it a string parameter that points at an allocated string object. Without question, one of the most common mistakes beginning HLA programmers make is to call stdin.gets and pass it a string variable that they have not initialized. This may be getting old now, but keep in mind that strings are pointers! Like pointers, if you do not initialize a string with a valid address, your program will probably crash when you attempt to manipulate that string object. The call to stralloc plus moving the returned result into theString is how the programs above initialize the string pointer. If you are going to use string variables in your programs, you must ensure that you allocate storage for the string data prior to writing data to the string object.

Allocating storage for a string object is such a common operation that many HLA Standard Library routines will automatically do the allocation to save you the effort. Generally, such routines have an "a_" prefix as part of their name. For example, the stdin.a_gets routine combines a call to stralloc and stdin.gets into the same routine. This routine, which doesn't have any parameters, reads a line of text from the user, allocates a string object to hold the input data, and then returns a pointer to the string in the EAX register. Listing 4-12 presents an adaptation of the two programs in Listings 4-10 and 4-11 that uses stdin.a_gets.

Listing 4-12: Reading a String from the User with stdin.a_gets.

 // Program to demonstrate strfree and stdin.a_gets.

 program strfreeDemo2;
 #include( "stdlib.hhf" );

 static
     theString:string;

 begin strfreeDemo2;

     // Prompt the user and read the string from the user:

     stdout.put( "Enter a line of text: " );
     stdin.flushInput();
     stdin.a_gets();            // Allocates the string; pointer returned in EAX.
     mov( eax, theString );

     // Echo the string back to the user:

     stdout.put( "The string you entered was: ", theString, nl );

     // Free up the storage allocated by stdin.a_gets:

     strfree( theString );

 end strfreeDemo2;

Note that, as before, you must still free up the storage stdin.a_gets allocates by calling the strfree routine. One big difference between this program and the previous two is that stdin.a_gets automatically allocates exactly enough space for the string read from the user. In the previous programs, the call to stralloc allocates room for only 16 characters. If the user types more than 16 characters, then the program raises an exception and quits. If the user types fewer than 16 characters, then some space at the end of the string is wasted. The stdin.a_gets routine, on the other hand, always allocates the minimum necessary space for the string read from the user. Because it allocates the storage, there is little chance of overflow.[15]
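If you stay with a fixed-size buffer as in Listings 4-10 and 4-11, the program need not quit on overlong input; it can trap the exception instead. The following is a minimal sketch, assuming stdin.gets raises ex.StringOverflow as described earlier in this section and using HLA's try..exception..endtry statement. The program name overflowDemo and the message text are hypothetical.

```hla
// Sketch: recover from ex.StringOverflow rather than letting the program quit.
// (overflowDemo and the error message are made up for illustration.)

program overflowDemo;
#include( "stdlib.hhf" );

static
    theString: string;

begin overflowDemo;

    stralloc( 16 );             // Room for at most 16 characters.
    mov( eax, theString );

    stdout.put( "Enter a line of text (16 chars, max): " );
    stdin.flushInput();

    try

        stdin.gets( theString );
        stdout.put( "The string you entered was: ", theString, nl );

      exception( ex.StringOverflow )

        // Control transfers here if the input did not fit.

        stdout.put( "That line is longer than 16 characters.", nl );

    endtry;

    strfree( theString );       // Free the storage on both paths.

end overflowDemo;
```

Placing the strfree call after endtry means the storage is released whether or not the exception occurs.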

[13] Note that this scheme is not recommended. If you need to extract the length information from a string, use the routines provided in the HLA String Library for this purpose.

[14] Stralloc may allocate more than nine bytes for the overhead data because the memory allocated to an HLA string must always be double word aligned, and the total length of the data structure must be an even multiple of four.

[15] Actually, there are limits on the maximum number of characters that stdin.a_gets will allocate. This is typically between 1,024 bytes and 4,096 bytes; see the HLA Standard Library source listings and your operating system documentation for the exact value.




The Art of Assembly Language
ISBN: 1593272073
EAN: 2147483647
Year: 2005
Pages: 246
Authors: Randall Hyde
