15.2 Mixing HLA and MASM/Gas Code in the Same Program

It may seem kind of weird to mix MASM or Gas and HLA code in the same program. After all, they're both assembly languages, and almost anything you can do with MASM or Gas can be done in HLA. So why bother trying to mix the two in the same program? Well, there are three reasons:

You already have a lot of code written in MASM or Gas and you don't want to convert it to HLA's syntax.
There are a few things MASM and Gas do that HLA cannot, and you happen to need to do one of those things.
Someone else has written some MASM or Gas code and s/he wants to be able to call code you've written using HLA.

In this section, we'll discuss two ways to merge MASM/Gas and HLA code in the same program: via in-line assembly code and through linking object files.

15.2.1 In-Line (MASM/Gas) Assembly Code in Your HLA Programs

As you're probably aware, the HLA compiler doesn't actually produce machine code directly from your HLA source files. Instead, it first compiles the code to a MASM- or Gas-compatible assembly language source file and then calls MASM or Gas to assemble this code to object code. If you're interested in seeing the MASM or Gas output HLA produces, just open the filename.asm file that HLA creates after compiling your filename.hla source file. The output assembly file isn't amazingly readable, but it is fairly easy to correlate the assembly output with the HLA source file.

HLA provides two mechanisms that let you inject raw MASM or Gas code directly into the output file it produces: the #asm..#endasm sequence and the #emit statement. The #asm..#endasm sequence copies all text between these two clauses directly to the assembly output file, e.g.,

 #asm
      mov eax, 0      ;MASM/Gas syntax for MOV( 0, EAX );
      add eax, ebx    ; "     "     "  ADD( ebx, eax );
 #endasm

The #asm..#endasm sequence is how you inject in-line (MASM or Gas) assembly code into your HLA programs. For the most part there is very little need to use this feature, but in a few instances it is invaluable. Note, when using Gas, that HLA specifies the ".intel_syntax" directive, so you should use Intel syntax when supplying Gas code between #asm and #endasm.

For example, if you're writing structured exception handling code under Windows, you'll need to access the double word at address FS:[0] (offset zero in the segment pointed at by the 80x86's FS segment register). You can drop into MASM for a statement or two to handle memory accesses as follows:

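 #asm
      mov ebx, fs:[0]      ; Loads the pointer stored at FS:[0] into EBX
 #endasm

At the end of this instruction sequence, EBX will contain the pointer to the first exception registration record, that is, the head of the list of structured exception handlers that Windows maintains for the current thread.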

HLA blindly copies all text between the #asm and #endasm clauses directly to the assembly output file. HLA does not check the syntax of this code or otherwise verify its correctness. If you introduce an error within this section of your program, the assembler will report the error when HLA assembles your code by calling MASM or Gas.

The #emit statement also writes text directly to the assembly output file. However, this statement does not simply copy the text from your source file to the output file; instead, this statement copies the value of a (constant) string expression to the output file. The syntax for this statement is

      #emit( string_expression );

This statement evaluates the expression and verifies that it's a string expression. Then it copies the string data to the output file. Like the #asm/#endasm statement, the #emit statement does not check the syntax of the MASM/Gas statement it writes to the assembly file. If there is a syntax error, MASM or Gas will catch it later on when the assembler processes the output file.

One advantage of the #emit statement is that it lets you construct MASM or Gas statements under (compile time) program control. You can write an HLA compile time program that generates a sequence of strings and emits them to the assembly file via the #emit statement. The compile time program has access to the HLA symbol table; this means that you can extract the identifiers that HLA emits to the assembly file and use these directly, even if they aren't external objects.
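As a minimal sketch of this idea, the following program uses a compile-time #while loop to build and emit four instructions. The program name emitLoop and the compile-time variable n are illustrative only, and the emitted text assumes the Intel-style syntax that HLA's output file uses:

```hla
program emitLoop;
#include( "stdlib.hhf" )

begin emitLoop;

    mov( 0, eax );

    // Compile-time loop: it runs while HLA compiles the program,
    // not when the program executes.
    ?n:int32 := 1;
    #while( n <= 4 )

        // Build one line of raw assembly text (for example,
        // "add eax, 3") and write it to the assembly output file.
        #emit( "add eax, " + string( n ) );
        ?n := n + 1;

    #endwhile

    // At run time, EAX holds the sum 1+2+3+4.
    stdout.put( "eax = ", (type int32 eax), nl );

end emitLoop;
```

The loop itself leaves no trace in the output file; only the four add instructions it generates appear there.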

When HLA compiles your programs into assembly language, it does not use the same symbols in the assembly language output file that you use in the HLA source files. There are several technical reasons for this, but the bottom line is this: You cannot easily reference your HLA identifiers in your in-line assembly code. The only exception to this rule is external identifiers. HLA external identifiers use the same name in the assembly file as in the HLA source file. Therefore, you can refer to external objects within your in-line assembly sequences or in the strings you output via #emit.

The @staticname compile time function returns the name that HLA uses to refer to most static objects in your program. The program in Listing 15-1 demonstrates a simple use of this compile time function to obtain the assembly name of an HLA procedure.

Listing 15-1: Using the @StaticName Compile Time Function.

 program emitDemo;
 #include( "stdlib.hhf" )
      procedure myProc;
      begin myProc;
           stdout.put( "Inside MyProc" nl );
      end myProc;
 begin emitDemo;
      ?stmt:string := "call " + @StaticName( myProc );
      #emit( stmt );
 end emitDemo;

This example creates a string value (stmt) that contains something like "call ?741_myProc" and emits this assembly instruction directly to the assembly output file ("?741_myProc" is typical of the name mangling that HLA applies to static names it writes to the output file). If you compile and run this program, it should display "Inside MyProc" and then quit. If you look at the assembly file that HLA emits, you will see that it has given the myProc procedure the same name it appends to the call instruction.^[1]

The @StaticName function is only valid for static symbols. This includes static, readonly, and storage variables, and procedures. It does not include var objects, constants, macros, or methods.

You can access var variables by using the [EBP+offset] addressing mode, specifying the offset of the desired local variable. You can use the @offset compile time function to obtain the offset of a var object or a parameter. Listing 15-2 demonstrates how to do this:

Listing 15-2: Using the @Offset Compile Time Function.

 program offsetDemo;
 #include( "stdlib.hhf" )
 var
      i:int32;
 begin offsetDemo;
      mov( -255, i );
      ?stmt := "mov eax, [ebp+(" + string( @offset( i )) + ")]";
      #print( "Emitting '", stmt, "'" )
      #emit( stmt );
      stdout.put( "eax = ", (type int32 eax), nl );
 end offsetDemo;

This example emits the statement "mov eax, [ebp+(-8)]" to the assembly language source file. It turns out that -8 is the offset of the i variable in the offsetDemo program's activation record.

Of course, the examples of #emit up to this point have been somewhat ridiculous because you can achieve the same results by using HLA statements. One very useful purpose for the #emit statement, however, is to create some instructions that HLA does not support. For example, at one time HLA did not support the les instruction because you can't really use it under most 32-bit operating systems.^[2] However, if you found a need for this instruction, you could easily write a macro to emit this instruction and appropriate operands to the assembly source file. Using the #emit statement gives you the ability to reference HLA objects, something you cannot do with the #asm..#endasm sequence.
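One way such a macro might look is sketched below. The macro name rawInstr and its two parameters are hypothetical, and bswap serves only as a harmless stand-in for an instruction that HLA lacks (HLA does support bswap directly), so that the program can run safely:

```hla
program emitMacro;
#include( "stdlib.hhf" )

    // Hypothetical helper macro: writes "<mnemonic> <operands>" to the
    // assembly output file as one raw line. Both arguments are
    // compile-time string expressions. The macro body omits the final
    // semicolon; the semicolon at the point of invocation supplies it.
    #macro rawInstr( mnemonic, operands );
        #emit( mnemonic + " " + operands )
    #endmacro

begin emitMacro;

    mov( $1234_5678, eax );

    // Stand-in for an instruction HLA does not support.
    rawInstr( "bswap", "eax" );

    stdout.put( "eax after bswap = $", eax, nl );

end emitMacro;
```

Because the operand is an ordinary compile-time string expression, it could also be built from @StaticName or @offset, as in Listings 15-1 and 15-2, when the instruction needs to reference an HLA object.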

15.2.2 Linking MASM/Gas-Assembled Modules with HLA Modules

Although you can do some interesting things with HLA's in-line assembly statements, you'll probably never use them. Further, future versions of HLA may not even support these statements, so you should avoid them as much as possible even if you see a need for them. Of course, HLA does most of the stuff you'd want to do with the #asm/#endasm and #emit statements anyway, so there is very little reason to use them at all. If you're going to combine MASM/Gas (or other assembler) code and HLA code together in a program, most of the time this will occur because you've got a module or library routine written in some other assembly language and you would like to take advantage of that code in your HLA programs. Rather than convert the other assembler's code to HLA, the easy solution is to simply assemble that other code to an object file and link it with your HLA programs.

Once you've compiled or assembled a source file to an object file, the routines in that module are callable from almost any machine code that can handle the routines' calling sequences. If you have an object file that contains a sqrt function, for example, it doesn't matter whether you compiled that function with HLA, MASM, TASM, NASM, Gas, or even a high level language; if it's object code and it exports the proper symbols, you can call it from your HLA program.

Compiling a module in MASM or Gas and linking that with your HLA program is little different than linking other HLA modules with your main HLA program. In the assembly source file, you will have to export some symbols (using the PUBLIC directive in MASM or the .GLOBAL directive in Gas), and in your HLA program you have to tell HLA that those symbols appear in a separate module (using the @external option).

Because the two modules are written in assembly language, there is very little language-imposed structure on the calling sequence and parameter passing mechanisms. If you're calling a function written in MASM or Gas from your HLA program, then all you've got to do is to make sure that your HLA program passes parameters in the same locations where the MASM/Gas function is expecting them.

About the only issue you've got to deal with is the case of identifiers in the two programs. By default, Gas is case sensitive and MASM is case insensitive. HLA, on the other hand, enforces case neutrality (which, essentially, means that it is case sensitive). If you're using MASM, there is a MASM command line option ("/Cp") that tells MASM to preserve case in all public symbols. It's a very good idea to use this option when assembling modules you're going to link with HLA so that MASM doesn't alter the case of your identifiers during assembly.

Of course, because MASM and Gas process symbols in a case-sensitive manner, it's possible to create two separate identifiers that are the same except for alphabetic case. HLA enforces case neutrality so it won't let you (directly) create two different identifiers that differ only in case. In general, this is such a bad programming practice that one would hope you never encounter it (and God forbid you actually do this yourself). However, if you inherit some MASM or Gas code written by a C hacker, it's quite possible the code uses this technique. The way around this problem is to use two separate identifiers in your HLA program and use the extended form of the @external directive to provide the external names. For example, suppose that in MASM you have the following declarations:

           public AVariable
           public avariable
                .
                .
                .
           .data
 AVariable dword     ?
 avariable byte      ?

If you assemble this code with the "/Cp" or "/Cx" (total case sensitivity) command line options, MASM will emit these two external symbols for use by other modules. Of course, were you to attempt to define variables by these two names in an HLA program, HLA would complain about the duplicate-symbol definition. However, you can connect two different HLA variables to these two identifiers using code like the following:

 static
      AVariable: dword; @external( "AVariable" );
      AnotherVar: byte; @external( "avariable" );

HLA does not check the strings you supply as parameters to the @external clause. Therefore, you can supply two names that are the same except for case, and HLA will not complain. Note that when HLA calls MASM to assemble its output file, HLA specifies the "/Cp" option that tells MASM to preserve case in public and global symbols. Of course, you would use this same technique in Gas if the Gas programmer has exported two symbols that are identical except for case.

The programs in Listings 15-3 and 15-4, respectively, demonstrate how to call a MASM subroutine from an HLA main program:

Listing 15-3: Main HLA Program to Link with a MASM Program.

 // To compile this module and the attendant MASM file, use the following
 // command line:
 //
 //      ml -c masmupper.masm
 //      hla masmdemo1.hla masmupper.obj
 program MasmDemo1;
 #include( "stdlib.hhf" )
      // The following external declaration defines a function that
      // is written in MASM to convert the character in AL from
      // lower case to upper case.
      procedure masmUpperCase( c:char in al ); @external( "masmUpperCase" );
 static
      s: string := "Hello World!";
 begin MasmDemo1;
      stdout.put( "String converted to uppercase: '" );
      mov( s, edi );
      while( mov( [edi], al ) <> #0 ) do
           masmUpperCase( al );
           stdout.putc( al );
           inc( edi );
      endwhile;
      stdout.put( "'" nl );
 end MasmDemo1;

Listing 15-4: Calling a MASM Procedure from an HLA Program: MASM Module.

 ; MASM source file to accompany the MasmDemo1.HLA source
 ; file. This code compiles to an object module that
 ; gets linked with an HLA main program. The function
 ; below converts the character in AL to upper case if it
 ; is a lower case character.
                .586
                .model flat, pascal
                .code
                public masmUpperCase
 masmUpperCase  proc   near32
                .if al >= 'a' && al <= 'z'
                and al, 5fh
                .endif
                ret
 masmUpperCase  endp
                end

It is also possible to call an HLA procedure from a MASM or Gas program (this should be obvious because HLA compiles its source code to an assembly source file and that assembly source file can call HLA procedures such as those found in the HLA Standard Library). There are a few restrictions when calling HLA code from some other language. First of all, you can't easily use HLA's exception handling facilities in the modules you call from other languages (including MASM or Gas). The HLA main program initializes the exception handling system. This initialization is probably not done by your non-HLA assembly programs. Further, the HLA main program exports a couple of important symbols needed by the exception handling subsystem; again, it's unlikely your non-HLA main assembly program provides these public symbols. Until you get to the point you can write code in MASM or Gas to properly set up the HLA exception handling system, you should not execute any code that uses the try..endtry, raise, or any other exception handling statements.

Caution

A large percentage of the HLA Standard Library routines include exception handling statements or call other routines that use exception handling statements. Unless you've set up the HLA exception handling subsystem properly, you should not call any HLA Standard Library routines from non-HLA programs.

Other than the issue of exception handling, calling HLA procedures from standard assembly code is really easy. All you've got to do is put an @external prototype in the HLA code to make the symbol you wish to access public and then include an EXTERN (or EXTERNDEF) statement in the MASM/Gas source file to provide the linkage. Then just compile the two source files and link them together.

About the only issue you need concern yourself with when calling HLA procedures from assembly is the parameter passing mechanism. Of course, if you pass all your parameters in registers (the best place), then communication between the two languages is trivial. Just load the registers with the appropriate parameters in your MASM/Gas code and call the HLA procedure. Inside the HLA procedure, the parameter values will be sitting in the appropriate registers (sort of the converse of what happened in Listing 15-4).
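The register-based case might be sketched as follows. The unit name hlaUpperUnit and the procedure name hlaUpperCase are hypothetical, and the MASM caller appears only as comments describing the assumed other half of the program:

```hla
unit hlaUpperUnit;

    // Assumed MASM caller (not part of this HLA file):
    //
    //      externdef hlaUpperCase:near32
    //           .
    //      mov  al, 'q'
    //      call hlaUpperCase       ; AL holds the converted character

    // The @external prototype makes the symbol public under the
    // name "hlaUpperCase".
    procedure hlaUpperCase( c:char in al ); @external( "hlaUpperCase" );

    // @noframe: HLA builds no stack frame, so the procedure must
    // execute its own ret().
    procedure hlaUpperCase( c:char in al ); @nodisplay; @noframe;
    begin hlaUpperCase;

        if( al in 'a'..'z' ) then
            and( $5f, al );     // clear bit 5: lower case -> upper case
        endif;
        ret();

    end hlaUpperCase;

end hlaUpperUnit;
```

This procedure calls no HLA Standard Library routines, so it does not depend on the exception handling setup that the HLA main program normally provides.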

If you decide to pass parameters on the stack, note that HLA normally uses the Pascal language-calling model. Therefore, you push parameters on the stack in the order they appear in a parameter list (from left to right), and it is the called procedure's responsibility to remove the parameters from the stack. Note that you can specify the Pascal calling convention for use with MASM's INVOKE statement using the ".model" directive. For example:

 .586
 .model flat, pascal
      .
      .
      .

Of course, if you manually push the parameters on the stack yourself, then the specific language model doesn't really matter. Gas users, of course, don't have the INVOKE statement, so they have to manually push the parameters themselves anyway.
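For stack-based parameters, a sketch of the HLA side might look like the following. The names hlaSumUnit, addPair, addend1, and addend2 are hypothetical, and the manually pushing MASM caller is shown only in comments:

```hla
unit hlaSumUnit;

    // Assumed MASM caller. Pascal order: push the parameters from
    // left to right and do no cleanup after the call.
    //
    //      externdef addPair:near32
    //           .
    //      push 10             ; addend1
    //      push 20             ; addend2
    //      call addPair        ; sum returned in EAX

    procedure addPair( addend1:int32; addend2:int32 ); @external( "addPair" );

    procedure addPair( addend1:int32; addend2:int32 ); @nodisplay;
    begin addPair;

        // HLA builds the standard stack frame, so the parameters
        // are accessible by name.
        mov( addend1, eax );
        add( addend2, eax );

    end addPair;    // the standard exit code removes the parameters

end hlaSumUnit;
```

Because the called procedure removes its own parameters, the caller must not adjust ESP after the call returns.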

This section is not going to attempt to go into gory details about MASM or Gas syntax. Presumably, you already know that syntax if you want to combine HLA with MASM or Gas code. An alternative is to read a copy of the DOS/16-bit edition of this text (available on the accompanying CD-ROM) that uses the MASM assembler. That text describes MASM syntax in much greater detail, albeit from a 16-bit perspective. Finally, this section isn't going to go into any further detail because, quite frankly, the need to call MASM or Gas code from HLA (or vice versa) just isn't that great. After all, most of the stuff you can do with MASM and Gas can be done directly in HLA, so there really is little need to spend much more time on this subject. Better to move on to more important questions, such as, "How do you call HLA routines from C or Pascal?"

^[1]HLA may assign a different name than "?741_myProc" when you compile the program. The exact symbol HLA chooses varies from version to version of the assembler (it depends on the number of symbols defined prior to the definition of myProc). In this example, there were 741 static symbols defined in the HLA Standard Library before the definition of myProc.

^[2]Support was added for this instruction in HLA 1.33; we'll pretend for the sake of example that HLA still does not support this instruction.