5.8 Functions and Function Results

Functions are procedures that return a result. In assembly language, there are very few syntactical differences between a procedure and a function, which is why HLA doesn't provide a specific declaration for a function. Nevertheless, there are considerable semantic differences between the two. That is, although you declare them the same way in HLA, you use them differently.

A procedure is a sequence of machine instructions that fulfills some task. The end result of executing a procedure is the accomplishment of that task. A function, on the other hand, executes a sequence of machine instructions specifically to compute some value to return to the caller. Of course, a function can perform some activity as well, and a procedure can certainly compute values, but the main difference is that the purpose of a function is to return some computed result; procedures don't have this requirement.

A good example of a procedure is the stdout.puti32 procedure. This procedure requires a single int32 parameter. The purpose of this procedure is to print the decimal conversion of this integer value to the standard output device. Note that stdout.puti32 doesn't return any kind of value that is usable by the calling program.

A good example of a function is the cs.member function. This function expects two parameters: the first is a character value, and the second is a character set value. This function returns true (1) in EAX if the character is a member of the specified character set. It returns false (0) if the character is not a member of the character set.

Logically, the fact that cs.member returns a usable value to the calling code (in EAX) while stdout.puti32 does not is a good example of the main difference between a function and a procedure. So, in general, a procedure becomes a function by virtue of the fact that you explicitly decide to return a value somewhere upon procedure return. No special syntax is needed to declare and use a function. You still write the code as a procedure.

5.8.1 Returning Function Results

The 80x86's registers are the most common place to return function results. The cs.member routine in the HLA Standard Library is a good example of a function that returns a value in one of the CPU's registers. It returns true (1) or false (0) in the EAX register. By convention, programmers try to return 8-, 16-, and 32-bit (non-real) results in the AL, AX, and EAX registers, respectively.^[7] For example, this is where most high level languages return these types of results.

Of course, there is nothing particularly sacred about the AL/AX/EAX register. You can return function results in any register if it is more convenient to do so. However, if you don't have a good reason for not using AL/AX/EAX, then you should follow the convention. Doing so will help others understand your code better because they will generally assume that your functions return small results in the AL/AX/EAX register set.

If you need to return a function result that is larger than 32 bits, you obviously must return it somewhere other than in EAX (which can hold values 32 bits or less). For values slightly larger than 32 bits (e.g., 64 bits or maybe even as many as 128 bits) you can split the result into pieces and return those parts in two or more registers. It is very common to see programs returning 64-bit values in the EDX:EAX register pair (e.g., the HLA Standard Library stdin.geti64 function returns a 64-bit integer in the EDX:EAX register pair).
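As a minimal sketch of the register-pair convention, the following program returns a 64-bit product in EDX:EAX. The names MulWide, WideResultDemo, and product are hypothetical and chosen only for illustration; the sketch assumes the standard "stdlib.hhf" include file and the Standard Library's stdout.putu64 routine.

```hla
program WideResultDemo;
#include( "stdlib.hhf" )

static
    product: qword;     // Receives the 64-bit result from EDX:EAX.

// MulWide is a hypothetical function, not part of the HLA Standard
// Library. It multiplies two unsigned 32-bit values and leaves the
// 64-bit product in the EDX:EAX register pair.
procedure MulWide( a:dword; b:dword );
begin MulWide;

    mov( a, eax );
    mul( b );           // EDX:EAX := EAX * b (unsigned multiply)

end MulWide;

begin WideResultDemo;

    // 100,000 * 100,000 does not fit in 32 bits, so the high-order
    // dword in EDX is needed to hold the full result.
    MulWide( 100_000, 100_000 );

    // Save both halves: low dword from EAX, high dword from EDX.
    mov( eax, (type dword product) );
    mov( edx, (type dword product[4]) );

    stdout.put( "product = " );
    stdout.putu64( product );
    stdout.newln();

end WideResultDemo;
```

The caller must know that the result occupies two registers; nothing in the procedure declaration itself records that fact, so a comment stating where the result lives is worth the extra line.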

If you need to return a really large object as a function result — say, an array of 1,000 elements — you obviously are not going to be able to return the function result in the registers. There are two common ways to deal with really large function return results: Either pass the return value as a reference parameter or allocate storage on the heap (using malloc) for the object and return a pointer to it in a 32-bit register. Of course, if you return a pointer to storage you've allocated on the heap, the calling program must free this storage when it is done with it.

5.8.2 Instruction Composition in HLA

Several HLA Standard Library functions allow you to call them as operands of other instructions. For example, consider the following code fragment:

      if( cs.member( al, {'a'..'z'}) ) then
           .
           .
           .
      endif;

As your high level language experience (and HLA experience) should suggest, this code calls the cs.member function to check whether the character in AL is a lowercase alphabetic character. If the cs.member function returns true, then this code fragment executes the then section of the if statement; however, if cs.member returns false, this code fragment skips the if..then body. There is nothing spectacular here except for the fact that HLA doesn't support function calls as boolean expressions in the if statement (look back at Chapter 1 to see the complete set of allowable expressions). How, then, does this program compile and run, producing the intuitive results?

The very next section will describe how you can tell HLA that you want to use a function call in a boolean expression. However, to understand how this works, you need to first learn about instruction composition in HLA.

Instruction composition lets you use one instruction as the operand of another. For example, consider the mov instruction. It has two operands, a source operand and a destination operand. Instruction composition lets you substitute a valid 80x86 machine instruction for either (or both) operands. The following is a simple example:

      mov( mov( 0, eax ), ebx );

Of course, the immediate question is, "What does this mean?" To understand what is going on, you must first realize that most instructions "return" a value to the compiler while they are being compiled. For most instructions, the value they "return" is their destination operand. Therefore, "mov( 0, eax );" returns the string "eax" to the compiler during compilation because EAX is the destination operand. Most of the time, specifically when an instruction appears on a line by itself, the compiler ignores the string result the instruction returns. However, HLA uses this string result whenever you supply an instruction in place of some operand; specifically, HLA uses that string in place of the instruction as the operand. Therefore, the mov instruction above is equivalent to the following two-instruction sequence:

      mov( 0, eax );      // HLA compiles interior instructions first.
      mov( eax, ebx );

When processing composed instructions (that is, instruction sequences that have other instructions as operands), HLA always works in a "left-to-right then depth-first (inside-out)" manner. To make sense of this, consider the following instruction:

      add( sub( mov( i, eax ), mov( j, ebx )), mov( k, ecx ));

To interpret what is happening here, begin with the source operand. It consists of the following:

      sub( mov( i, eax ), mov( j, ebx ))

The source operand for this instruction is "mov( i, eax )" and this instruction does not have any composition, so HLA emits this instruction and returns its destination operand (EAX) for use as the source to the sub instruction. This effectively gives us the following:

      sub( eax, mov( j, ebx ))

Now HLA compiles the instruction that appears as the destination operand ("mov( j, ebx )") and returns its destination operand (EBX) to substitute for this mov in the sub instruction. This yields the following:

      sub( eax, ebx )

This is a complete instruction, without composition, that HLA can compile. So it compiles this instruction and returns its destination operand (EBX) as the string result to substitute for the sub in the original add instruction. So the original add instruction now becomes:

      add( ebx, mov( k, ecx ));

HLA next compiles the mov instruction appearing in the destination operand. It returns its destination operand as a string that HLA substitutes for the mov, finally yielding the simple instruction:

      add( ebx, ecx );

The compilation of the original add instruction, therefore, yields the following instruction sequence:

      mov( i, eax );
      mov( j, ebx );
      sub( eax, ebx );
      mov( k, ecx );
      add( ebx, ecx );

Whew! It's rather difficult to look at the original instruction and easily see that this sequence is the result. As you can easily see in this example, overzealous use of instruction composition can produce nearly unreadable programs. You should be very careful about using instruction composition in your programs. With only a few exceptions, writing a composed instruction sequence makes your program harder to read.

Note that the excessive use of instruction composition may make errors in your program difficult to decipher. Consider the following HLA statement:

      add( mov( eax, i ), mov( ebx, j ) );

This instruction composition yields the 80x86 instruction sequence:

      mov( eax, i );
      mov( ebx, j );
      add( i, j );

Of course, the compiler will complain that you're attempting to add one memory location to another. However, the instruction composition effectively masks this fact and makes it difficult to comprehend the cause of the error message. Moral of the story: Avoid using instruction composition unless it really makes your program easier to read. The few examples in this section demonstrate how not to use instruction composition.

There are two main areas where using instruction composition can help make your programs more readable. The first is in HLA's high level language control structures. The other is in procedure parameters. Although instruction composition is useful in these two cases (and probably a few others as well), this doesn't give you a license to use extremely convoluted instructions like the add instruction in the previous example. Instead, most of the time you will use a single instruction or a function call in place of a single operand in a high level language boolean expression or in a procedure/function parameter.
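As a small sketch of the procedure-parameter case, the following program uses a single mov instruction in place of a parameter to stdout.puti32. The program name ComposeDemo and the variable i are hypothetical; the sketch assumes the standard "stdlib.hhf" include file.

```hla
program ComposeDemo;
#include( "stdlib.hhf" )

static
    i: int32 := 5;

begin ComposeDemo;

    // Composition in a procedure parameter: HLA first emits
    // "mov( i, eax );" and then substitutes the string "eax"
    // for the mov, so this is the same as writing
    //
    //      mov( i, eax );
    //      stdout.puti32( eax );
    //
    stdout.puti32( mov( i, eax ) );
    stdout.newln();

end ComposeDemo;
```

Only one level of composition appears here, so a reader can still see at a glance which instruction runs first and which register carries the value.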

While we're on the subject, exactly what does a procedure call return as the string that HLA substitutes for the call in an instruction composition? For that matter, what do statements like if..endif return? How about instructions that don't have a destination operand? Well, function return results are the subject of the very next section so you'll read about that in a few moments. As for all the other statements and instructions, you should check out the HLA Reference Manual. It lists each instruction and its "returns" value. The "returns" value is the string that HLA will substitute for the instruction when it appears as the operand to another instruction. Note that many HLA statements and instructions return the empty string as their "returns" value (by default, so do procedure calls). If an instruction returns the empty string as its composition value, then HLA will report an error if you attempt to use it as the operand of another instruction. For example, the if..endif statement returns the empty string as its "returns" value, so you may not bury an if..endif inside another instruction.

5.8.3 The HLA @RETURNS Option in Procedures

HLA procedure declarations allow a special option that specifies the string to use when a procedure invocation appears as the operand of another instruction: the @returns option. The syntax for a procedure declaration with the @returns option is as follows:

      procedure ProcName ( optional parameters ); @returns( string_constant );
           << Local declarations >>
      begin ProcName;
           << procedure statements >>
      end ProcName;

If the @returns option is not present, HLA associates the empty string with the @returns value for the procedure. This effectively makes it illegal to use that procedure invocation as the operand to another instruction.

The @returns option requires a single string expression surrounded by parentheses. HLA will substitute this string constant for the procedure call if it ever appears as the operand of another instruction. Typically this string constant is a register name; however, any text that would be legal as an instruction operand is okay here. For example, you could specify a memory address or a constant. For purposes of clarity, you should always specify the location of a function's return value in the @returns parameter.

As an example, consider the following boolean function, which returns true in the EAX register if its single character parameter is an alphabetic character and false otherwise:^[8]

      procedure IsAlphabeticChar( c:char ); @returns( "EAX" );
      begin IsAlphabeticChar;
           // Note that cs.member returns true/false in EAX
           cs.member( c, {'a'..'z', 'A'..'Z'} );
      end IsAlphabeticChar;

Once you tack the @returns option on the end of this procedure declaration you can legally use a call to IsAlphabeticChar as an operand to other HLA statements and instructions:

      mov( IsAlphabeticChar( al ), EBX );
           .
           .
           .
      if( IsAlphabeticChar( ch ) ) then
           .
           .
           .
      endif;

The last example above demonstrates that, via the @returns option, you can embed calls to your own functions in the boolean expression field of various HLA statements. Note that the code above is equivalent to

      IsAlphabeticChar( ch );
      if( EAX ) then
           .
           .
           .
      endif;

Not all HLA high level language statements expand composed instructions before the statement. For example, consider the following while statement:

      while( IsAlphabeticChar( ch ) ) do
           .
           .
           .
      endwhile;

This code does not expand to the following:

      IsAlphabeticChar( ch );
      while( EAX ) do
           .
           .
           .
      endwhile;

Instead, the call to IsAlphabeticChar expands inside the while's boolean expression so that the program calls this function on each iteration of the loop.

You should exercise caution when entering the @returns parameter. HLA does not check the syntax of the string parameter when it is compiling the procedure declaration (other than to verify that it is a string constant). Instead, HLA checks the syntax when it replaces the function call with the @returns string.

So if you had specified "EAZ" instead of "EAX" as the @returns parameter for IsAlphabeticChar in the previous examples, HLA would not have reported an error until you actually used IsAlphabeticChar as an operand. Then of course, HLA complains about the illegal operand and it's not at all clear what the problem is by looking at the IsAlphabeticChar invocation. So take special care not to introduce typographical errors in the @returns string; figuring out such errors later can be very difficult.
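To tie the pieces of this section together, here is a minimal sketch of a complete program that declares a function with the @returns option and then uses a call to it as the boolean expression of an if statement. IsVowel and ReturnsDemo are hypothetical names used only for illustration; the sketch assumes the standard "stdlib.hhf" include file and relies on cs.member returning its result in EAX, as described earlier.

```hla
program ReturnsDemo;
#include( "stdlib.hhf" )

// IsVowel is a hypothetical function, not part of the HLA Standard
// Library. The @returns string tells HLA to substitute "eax" for
// any call to IsVowel that appears as an operand.
procedure IsVowel( c:char ); @returns( "eax" );
begin IsVowel;

    // cs.member leaves true (1) or false (0) in EAX, so there is
    // nothing more to do before returning.
    cs.member( c, {'a','e','i','o','u','A','E','I','O','U'} );

end IsVowel;

begin ReturnsDemo;

    mov( 'e', al );

    // HLA emits the call first, then compiles "if( eax ) then".
    if( IsVowel( al ) ) then

        stdout.put( "vowel", nl );

    else

        stdout.put( "not a vowel", nl );

    endif;

end ReturnsDemo;
```

Because HLA checks the @returns string only when a call is used as an operand, the if statement in the main program is the first point at which a misspelled register name in the declaration would be reported.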

^[7]In the next chapter, you'll see where most programmers return real results.

^[8]Before you run off and actually use this function in your own programs, note that the HLA Standard Library provides the char.isAlpha function that provides this test. See the HLA documentation for more details.