HLA v1.102 Beta Release

Randall Hyde · March 27, 2008, 04:21:25 PM

Hi All,
Well, Webster is being fussy again, so I can't get a beta version of HLA up on Webster. So I will publish it on my .Mac page instead. The beta version of HLA v1.102 can be found here:

http://homepage.mac.com/randyhyde/FileSharing2.html

This is a zip file containing the Win32 versions of HLA.EXE and HLAPARSE.EXE only.
DO NOT REPLACE AN EXISTING VERSION OF HLA WITH THIS CODE!

HLA v1.102 is a beta release and although it has passed all of my regression and instruction test suites, all that this means is that my test suites are not good enough to catch the defects present in the system. This beta version was created in order to allow the HLA user community to help me track down the problems in the code prior to an official release.

It is important to realize that I've completely rewritten somewhere between 1/3 and 1/2 of the compiler. This is not a "minor tweak" of HLA v1.101. It is, effectively, a brand-new system (most other people would have updated the major version number for a release such as this). Though this rewrite has fixed a large number of previously-existing defects, it has (undoubtedly) introduced a whole host of new defects. Worse, the previous defects were in code that wasn't executed very often (else they would have been caught and fixed by now); the new defects are probably evenly distributed throughout the new code. Moral of the story -- don't simply replace an older version of HLA with this new version and go about your business. Save your old executables (hla.exe and hlaparse.exe) so you can restore them when you're done playing around with HLA v1.102. I really do *not* recommend using HLA v1.102 for any mission-critical work (of course, this suggestion applies to the entire v1.x prototype series, but it especially applies to HLA v1.102). Use HLA v1.102 for testing purposes, and then build your real code with an older version.

Here's what you get with HLA v1.102:

1) The behavior of the FADD(); FSUB(); FSUBR(); FMUL(); FDIV(); and FDIVR(); instructions (with no operands) has changed. Previously, these instructions, without operands, generated opcodes for faddp, fsubp, fsubrp, fmulp, fdivp, and fdivrp, respectively. This behavior is inconsistent with Intel documentation (and most other assemblers). This has been changed in HLA v1.102 so that these instructions, without operands, are equivalent to fadd( st1, st0 ); fsub( st1, st0 ); fsubr( st1, st0 ); fmul( st1, st0 ); fdiv( st1, st0 ); and fdivr( st1, st0 ); (i.e., no pop after the arithmetic operation). THIS WILL BREAK ANY EXISTING CODE THAT DEPENDED ON THESE INSTRUCTIONS POPPING ST0. If you use any of these instructions, you should add the explicit "p" suffix to them (i.e., faddp, fsubp, fsubrp, fmulp, fdivp, and fdivrp). This fix is portable across all versions of HLA (that is, older versions will also generate the pop version of the instruction with the "p" suffix). As all versions of HLA starting with v1.102 will use these new semantics, this is a change you should make to your source code regardless of whether you intend to use the beta version of HLA v1.102.

Another alternative is to create a macro that will produce the original semantics. Such a macro might look something like this for "fadd":

#id( fadd ) // Turn FADD into an identifier
#macro fadd( _args_[] );

    #if( @elements( _args_ ) = 0 ) // zero operands?

        faddp()

    #elseif( @elements( _args_ ) = 1 ) // one operand

        ~fadd( @text( _args_[0] ) ) // Emit fadd( xxx ); instruction

    #elseif( @elements( _args_ ) = 2 ) // two operands

        ~fadd( @text( _args_[0] ), @text( _args_[1] ) ) // Emit fadd( xxx, yyy ); instruction

    #else

        #error( "Illegal # of arguments to fadd" )

    #endif

#endmacro

(This was typed into the mail program; forgive me for any errors or omissions.)
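As a rough illustration of the change described in item 1, here is a small sketch of the explicit "p" suffix form. The program name and variables are made up for this example, and it assumes the HLA standard library (stdlib.hhf) from an existing installation is available:

```hla
program fpuPopDemo;
#include( "stdlib.hhf" )

static
    a:   real64 := 1.5;
    b:   real64 := 2.5;
    sum: real64;

begin fpuPopDemo;

    fld( a );       // st0 = 1.5
    fld( b );       // st0 = 2.5, st1 = 1.5

    // Explicit "p" suffix: add, then pop. Per the note above, this
    // form means the same thing on older versions and on v1.102.
    faddp();        // st0 = 4.0, one value left on the FPU stack

    // Under v1.102, a bare fadd(); at this point would instead act
    // like fadd( st1, st0 ); and leave two values on the FPU stack.

    fstp( sum );    // store the result and empty the FPU stack
    stdout.put( "sum = ", sum:10:2, nl );

end fpuPopDemo;
```

If the macro above is in effect instead, an invocation written as fadd(); should expand to faddp(); (the semicolon comes from the invocation), so code written for the old semantics keeps popping.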

2) HLA v1.102, by default, generates hexadecimal encodings for almost all instructions. If you produce a source file (e.g., MASM-compatible output), mostly what you will see are DB/DW/DD statements. Not very readable. If you supply the "-test" command-line parameter to HLA, HLA will emit comments into this output that describe the encoded instructions. Encoding instructions in hexadecimal speeds up the back-end assembler a tiny amount and (more importantly), allows me to ignore encoding errors found in various back-end assemblers (and they all seem to have a few).

3) HLA v1.102 supports a new command-line parameter: "-sourcemode". This argument tells HLA to synthesize human-readable machine instruction statements, as much as possible, rather than encoding the instructions in hexadecimal form (which is the default). In a sense, the "-sourcemode" command-line parameter tells HLA to behave like it used to (v1.101 and earlier) -- emitting source statements to be compiled by the back-end assembler. However, it is important to realize that the source code generation module has been completely rewritten and the format of the source code HLA v1.102 produces is quite different (and, hopefully, far more readable) than the source code produced by previous versions. The whole point of "-sourcemode" is to produce code that is human-readable, so a bit of effort has gone into making the source output as readable as possible. Note that not all assemblers support all machine instructions in source form, and some assemblers have some encoding defects, so *some* instructions will still be emitted in hexadecimal form if the back-end assembler cannot handle the instruction. Supplying the "-test" command-line parameter will attach a comment to such instructions so you can still see what was originally present. Do not confuse "-sourcemode" with the "-s*" command-line parameters. "-sourcemode" tells HLA to produce source code rather than hexadecimal instruction encodings. "-sourcemode" does *NOT* tell HLA to stop processing once it has compiled the HLA source file to a source file compatible with some back-end assembler. That is, if the only command-line argument is "-sourcemode", HLA.EXE will run the back-end assembler and linker to produce an executable file.

4) There is a new "-sh" command-line parameter. This tells HLA to produce a source output file using "pseudo-HLA" syntax. The output isn't exactly HLA-compatible (that is, you cannot run it back through HLA), but the purpose of this output is not to be processed by machine. Instead, it is intended for human consumption only. The purpose of "-sh" is to let you see macro and HLL-like statement expansions in a low-level form. For example, if you're interested in translating HLA HLL-like code in your source files to low-level assembly, you can use the "-sh" option to see what the compiler is doing with those statements. You can also use this option to help you optimize your code or learn about how the HLA compiler operates.

5) There is a command-line option "-sx". This is similar to "-sg" (produce Gas source file) except that the output file is formatted in the syntax required by the Mac OSX version of Gas. This option is intended for use in developing the Mac OSX version of HLA. Currently, there are no "-cx" or "-xx" command-line options; someday (along with the Mac OSX version of HLA), these will appear. Note that HLA does not currently run under Mac OSX. If you want to produce an assembly file for use under Mac OSX you would have to create a Gas file under Windows and then assemble that output on the Mac. This is painful and error-prone, so I don't recommend this unless you *really* know what you are doing (this option exists so I can compile the HLAASM.HLA source file, which is part of the HLA compiler, for use on the Mac).

6) There is a new set of command-line options, "-sn", "-cn", and "-xn", that tell HLA to produce NASM v2.02 output and compile the output (-cn/-xn) using NASM.

7) HLA v1.102 produces the same "object-code signature" as the back-end assembler you use. That is, if you're using FASM, it produces object code that is identical to FASM's code emission; if you're using MASM, it produces the same object code that MASM would produce, etc. (and no, these assemblers do *not* all produce the same code). Largely, this is done for testing purposes (it's convenient to test HLA's code emission by comparing its output against that produced by several other assemblers). However, there are some cryptographic/hacker/anti-hacker reasons why you might want the same object code signature as another assembler.

For this beta test, I would really appreciate it if you would compile some of your existing HLA projects (especially larger ones) in both sourcemode and hexadecimal mode. Here is a typical test procedure I follow (assume the file is named "test1.hla"):

hla -sourcemode -xm test1
dumpbin -disasm test1.exe >test1sm.txt
hla -xm test1
dumpbin -disasm test1.exe >test1hm.txt
fc test1sm.txt test1hm.txt

The above uses MASM as the back-end assembler (the "-xm" option). You can use any of -xm, -xn, -xf, or -xt (depending on what back-end assemblers you have available).

Note that this command sequence compiles the program twice -- once in source mode, once in hexadecimal mode. It then disassembles the executables and compares the disassemblies ("fc" is "file compare").

For the most part, if there are any differences then there is something to be concerned about. However, some differences are harmless. Specifically, if you have two labels at the same address in memory, one disassembly may use one symbol and the second disassembly the other. Obviously, we can ignore such differences. (Indeed, if you supply the -release option to link/polink, you can eliminate all the symbol information from the executable; the easy way to do this is to edit the ".link" file that HLA produces and add a "-release" line to it.)

It would be really cool if you could compile your program across multiple back-end assemblers (if you have them available). But I'll take what I can get.

As the NASM code generation is a brand-new feature, I'd love to see some people compiling their projects with NASM to see if we can find any problems in that code generation.

Note that you *cannot* compare object/executable files produced by one back-end assembler with an executable/object file produced by some other back-end assembler. Each assembler has its own unique "object code generation signature" and the code will *not* be the same, even if it is semantically equivalent.

My plan is to put the final HLA v1.102 release up on Webster in a couple of weeks. In the meantime, if you find any defects, post a notice to the official HLA defect tracker on SourceForge (http://sourceforge.net/projects/hlav1).

hLater,
Randy Hyde

Evenbit · March 28, 2008, 12:25:29 AM

Here is a complete? macro include file:


//  Macros to be used with HLA versions 1.102 and later
//  so these FPU instructions match the behavior
//  described in existing AoA and HLA Ref. material.
//  March 27, 2008

#id( fadd )
#macro fadd( _args_[] );

  #if( @elements( _args_ ) = 0 )
    faddp()
  #elseif( @elements( _args_ ) = 1 )
    ~fadd( @text( _args_[0] ) )
  #elseif( @elements( _args_ ) = 2 )
    ~fadd( @text( _args_[0]), @text( _args_[1] ))
  #else
    #error( "Illegal # of arguments to fadd" )
  #endif

#endmacro

#id( fsub )
#macro fsub( _args_[] );

  #if( @elements( _args_ ) = 0 )
    fsubp()
  #elseif( @elements( _args_ ) = 1 )
    ~fsub( @text( _args_[0] ) )
  #elseif( @elements( _args_ ) = 2 )
    ~fsub( @text( _args_[0]), @text( _args_[1] ))
  #else
    #error( "Illegal # of arguments to fsub" )
  #endif

#endmacro

#id( fsubr )
#macro fsubr( _args_[] );

  #if( @elements( _args_ ) = 0 )
    fsubrp()
  #elseif( @elements( _args_ ) = 1 )
    ~fsubr( @text( _args_[0] ) )
  #elseif( @elements( _args_ ) = 2 )
    ~fsubr( @text( _args_[0]), @text( _args_[1] ))
  #else
    #error( "Illegal # of arguments to fsubr" )
  #endif

#endmacro

#id( fmul )
#macro fmul( _args_[] );

  #if( @elements( _args_ ) = 0 )
    fmulp()
  #elseif( @elements( _args_ ) = 1 )
    ~fmul( @text( _args_[0] ) )
  #elseif( @elements( _args_ ) = 2 )
    ~fmul( @text( _args_[0]), @text( _args_[1] ))
  #else
    #error( "Illegal # of arguments to fmul" )
  #endif

#endmacro

#id( fdiv )
#macro fdiv( _args_[] );

  #if( @elements( _args_ ) = 0 )
    fdivp()
  #elseif( @elements( _args_ ) = 1 )
    ~fdiv( @text( _args_[0] ) )
  #elseif( @elements( _args_ ) = 2 )
    ~fdiv( @text( _args_[0]), @text( _args_[1] ))
  #else
    #error( "Illegal # of arguments to fdiv" )
  #endif

#endmacro

#id( fdivr )
#macro fdivr( _args_[] );

  #if( @elements( _args_ ) = 0 )
    fdivrp()
  #elseif( @elements( _args_ ) = 1 )
    ~fdivr( @text( _args_[0] ) )
  #elseif( @elements( _args_ ) = 2 )
    ~fdivr( @text( _args_[0]), @text( _args_[1] ))
  #else
    #error( "Illegal # of arguments to fdivr" )
  #endif

#endmacro

Nathan.
