HLA v1.90 Progress Report

Randall Hyde · March 14, 2007, 02:01:48 PM

Hi All,
Just a brief notification about the upcoming HLA v1.90 release.

HLA v1.90 will combine FASM with HLAPARSE to produce a single executable that converts HLA source files directly to OBJ files (.o files under Linux). I have converted the FASM source code to HLA syntax (so I can more easily maintain it myself from here on out) and I've done a bit of refactoring of the code. Largely, the refactoring has consisted of moving functions and procedures from the main FASM source body into HLA procedures (quite an undertaking as the FASM procedures are not well-defined at all -- you have procedures in the middle of larger ones, procedures that overlap, and sections of one procedure that turn out to be instructions used by another procedure as well; some might call this "assembly programming", I call it unstructured). In a few cases I've converted some obvious assembly sequences to HLL-like control structures, where the code generation is roughly identical. I was also able to convert one of Tomasz' procedures into an HLA iterator (I was actually impressed that he used the equivalent of an iterator at one point in his source code).

Currently, FASM consumes a *tremendous* amount of memory when maintaining the symbol table. Tomasz uses a hash tree that winds up consuming a *lot* of memory if you have a fair number of symbols. For example, when compiling FASM under itself, this tree structure alone consumes over four megabytes. As FASM has fewer than 10,000 symbols, this works out to about 430 bytes per symbol, just for the hash tree entries (i.e., not including the actual symbol table entry). This is a bit excessive, but it's not my main concern (who can't afford 4 MB these days?). The big problem is that this number is not bounded, and compiling a source file with a lot of symbols (e.g., typical HLA output) could wind up consuming a lot more memory. This unbounded memory usage is not wise if FASM is going to run alongside another application (i.e., HLA) that is already a memory hog in its own right. So over the next couple of days I intend to replace the hash tree that FASM uses with a more traditional hash table whose size I can control (I'm thinking of fixing it at 256K). I'll report back here when that happens.
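As a rough illustration of what "a more traditional hash table whose size I can control" could look like, here is a minimal sketch in Python (not HLA, and not the code that will go into FASM). The class name, the FNV-1a hash function, the default bucket count, and the chaining scheme are all assumptions made for the example. The point is only that the bucket array is allocated once and never grows, so the table's own overhead stays fixed while each symbol costs one chain node.

```python
class SymbolTable:
    """Chained hash table with a fixed number of buckets (hypothetical sketch)."""

    def __init__(self, bucket_count=1 << 16):
        # Assumption: a power-of-two bucket count so a mask can replace modulo.
        if bucket_count <= 0 or bucket_count & (bucket_count - 1):
            raise ValueError("bucket_count must be a power of two")
        # Allocated once and never resized: this is the bounded part.
        self.buckets = [None] * bucket_count
        self.mask = bucket_count - 1
        self.count = 0

    @staticmethod
    def _hash(name):
        # 32-bit FNV-1a, chosen only because it is short and well known.
        h = 0x811C9DC5
        for byte in name.encode("utf-8"):
            h ^= byte
            h = (h * 0x01000193) & 0xFFFFFFFF
        return h

    def insert(self, name, value):
        index = self._hash(name) & self.mask
        node = self.buckets[index]
        while node is not None:
            if node[0] == name:
                node[1] = value  # symbol already present: update in place
                return
            node = node[2]
        # Each node is [name, value, next]; push onto the front of the chain.
        self.buckets[index] = [name, value, self.buckets[index]]
        self.count += 1

    def lookup(self, name):
        node = self.buckets[self._hash(name) & self.mask]
        while node is not None:
            if node[0] == name:
                return node[1]
            node = node[2]
        return None
```

With a layout like this, a source file with many symbols makes the chains longer (slower lookups) rather than making the index structure itself larger, which is the trade-off a fixed-size table accepts.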

HLA v1.89 (an experimental version, http://webster.cs.ucr.edu/hla.zip and http://webster.cs.ucr.edu/hla.tar.gz) paved the way for merging HLAPARSE and FASM. HLA v1.89 writes out a single .asm file rather than a multitude of asm and inc files (as previous versions of HLA have done). Currently, HLA v1.89 "writes" the entire source file to memory and then writes this file to disk in one I/O operation. This will dovetail with FASM quite well. FASM currently reads the entire source file into memory and then processes the memory-based image. Therefore, to combine HLA and FASM, the main thing I will be doing is disabling the code on the HLA side that writes the file to disk, disabling the code on the FASM side that reads the file from disk, and passing FASM a pointer to the memory-based file that HLA has produced (okay, it's not quite that simple, but you get the idea).
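The handoff described above can be sketched roughly as follows, in Python rather than HLA. Everything here is a stand-in: generate_asm plays the HLA side, assemble plays the FASM side (it only counts non-blank lines instead of producing an object file), and the function names are hypothetical. The sketch shows only that once both sides already work on a whole-file memory image, the disk round trip can be dropped without changing what the back end sees.

```python
import io
import os
import tempfile


def generate_asm(lines):
    """Stand-in for the HLA side: build the entire output in memory."""
    buf = io.BytesIO()
    for line in lines:
        buf.write(line.encode("ascii") + b"\n")
    return buf.getvalue()


def assemble(image):
    """Stand-in for the FASM side: process an in-memory source image.

    This placeholder just counts non-blank lines.
    """
    return sum(1 for line in image.splitlines() if line.strip())


def via_disk(lines):
    """Two-step path: write the image in one operation, then read it all back."""
    image = generate_asm(lines)
    fd, path = tempfile.mkstemp(suffix=".asm")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(image)
        with open(path, "rb") as f:
            return assemble(f.read())
    finally:
        os.remove(path)


def via_memory(lines):
    """Merged path: hand the in-memory image straight to the back end."""
    return assemble(generate_asm(lines))
```

Both paths give the back end the same bytes; the merged path simply skips the write and the read.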

I'm planning on working on the FASM memory usage a bit over the next couple of days and then I'm going to attempt to merge HLAPARSE and FASM over the weekend. With luck, I should have a version I can post sometime next week.
Cheers,
Randy Hyde

gedumer · March 16, 2007, 06:36:36 PM

I haven't followed HLA recently, so perhaps I missed the leadup to this version, but does this mean that you're taking on the responsibility of upgrading your internal version of FASM in parallel with Tomasz's standalone version? I also thought this concept wasn't supposed to occur until v.2.0? This is essentially what I've been waiting for, i.e. HLA with its own builtin assembler. BTW... weren't you going to write your own? This seems a much more prudent approach though and gives you a proven and fairly well debugged product.

I just thought of something else... a successful release of 1.90 and upward spells the end to your support of MASM correct? I certainly don't see this as a bad thing at all:)

Good luck. I'm really looking forward to this version.

DarkWolf · March 17, 2007, 02:45:57 AM

Is GAS still used in the Linux version ?
I was not aware of a Linux version of FASM or did you write that support too ?
Is HLA still dependent on binutils/glibc or has this been replaced ?

Randall Hyde · March 18, 2007, 05:43:12 AM

Quote from: gedumer on March 16, 2007, 06:36:36 PM
I haven't followed HLA recently, so perhaps I missed the leadup to this version, but does this mean that you're taking on the responsibility of upgrading your internal version of FASM in parallel with Tomasz's standalone version?

Yes, I have created a branch. I've branched off FASM v1.66. I'll probably do the diffs to bring it (mostly) in sync with the latest version before too much longer, but for HLA purposes keeping in sync isn't all that important. After all, I only use a small set of the features in FASM as it is. Other than bug fixes, I won't really "need" anything new added to FASM. OTOH, I am spinning the "C-callable" version of FASM off as a sort of separate product, so maintaining that will be important if anyone else decides to use it.

Quote
I also thought this concept wasn't supposed to occur until v.2.0?

Despite what a lot of people think, HLA v2.0 is *not* about generating object code directly. That's something that HLA v2.0 will definitely do, but that's not the purpose of HLA v2.0. The purpose is to completely rewrite HLA so that it's a commercial-quality product. There are some *real* design problems with HLA v1.x -- not generating object code is *not* one of those defects. Generating object code directly seems to be a big issue with the ALA crowd, but for actual HLA users what difference will that really make? Oh, it might give them "bragging rights". But the past argument about "HLA is not an assembler because it uses FASM/MASM/Gas/TASM as a back end" is sooooo weak. It's like, those people can't figure out the *real* problems with HLA so they have to invent synthetic problems. Believe me, when HLA v1.90 appears, you won't hear the end of the complaints about HLA -- the people who are doing all the complaining will find something else to complain about. It goes with the territory.

Quote
This is essentially what I've been waiting for, i.e. HLA with its own builtin assembler.

Seriously, why? What difference will it make to you that FASM is moved into HLAPARSE? Perhaps you'll compile your code a few milliseconds faster, but other than that it won't really make much of a difference. You'll still feed HLA a source file and you'll still get an EXE file out of it. How HLA accomplishes the task really doesn't make a bit of difference.

Quote
BTW... weren't you going to write your own? This seems a much more prudent approach though and gives you a proven and fairly well debugged product.

Sure, HLA v2.0 will still be written (someday) with its own native code generator.

BTW, one thing you ought to be aware of is that HLA is already doing a fair amount of direct code generation. To support assemblers like MASM and TASM that don't support the new SSE/3 (and whatever) instructions, I've had to put the code generation facilities into HLA to handle those instructions. If I really wanted to take the time, it wouldn't be that hard to extend this to all the machine instructions. Then the back-end assemblers would really be there simply for the object file format processing.

And now, btw, having gone through a fair percentage of the process, I'm not sure that "prudent" and "fairly well debugged" are going to apply. First of all, I've refactored the code to make it easier to maintain. No doubt, that has introduced several defects that I've yet to find. And I will say this: it may be fairly well tested and debugged, but it's not easy to modify (that's what drove me to refactor it in the first place). To say that the code is "unstructured" is being nice. I've cleaned up some of that, and added a bunch of comments (there were almost none in the original source code) to make it easier to work on, but there's still a lot of cleaning up and commenting left to be done.

Quote
I just thought of something else... a successful release of 1.90 and upward spells the end to your support of MASM correct? I certainly don't see this as a bad thing at all:)

Not at all. HLA will continue to support MASM, FASM, TASM, and Gas. Indeed, I've recently made a few changes to the compiler to clean up some TASM problems. MASM continues to be absolutely essential because MASM is the only back-end that provides symbolic debugging information for OllyDbg. What will change is that the standard release will automatically produce OBJ files; to use MASM, you'll have to run an "mhla" program (like you run fhla now). Actually, what I'm going to do is the following:

1) Create a new environment variable, maybe "hlabackend", that controls the default back-end processing.
2) Modify HLA.EXE (hla under Linux) so that it looks at the program name (e.g., hla.exe, fhla.exe, mhla.exe, ghla.exe, or thla.exe) to determine what processing should take place. This way, all you have to do is rename the program to get the behavior you want -- you won't need to keep multiple copies of the program lying around (I use this trick, for example, to set the language levels; you can rename hla.exe to mla.exe, lla.exe, or vlla.exe to have the same effect as specifying the "-level=x" command-line parameter).
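The two steps above could be sketched like this, in Python rather than HLA. The program names and the "hlabackend" variable come from the list above; the back-end labels, the function name select_backend, and the precedence (a recognised program name wins, otherwise the environment variable, otherwise a built-in default) are assumptions for illustration only, since the final behaviour has not been decided.

```python
import os
import re

# Hypothetical mapping from program name to back end. "hla" is left out on
# purpose: under that name the default (or the environment variable) applies.
NAME_TO_BACKEND = {
    "fhla": "fasm",
    "mhla": "masm",
    "ghla": "gas",
    "thla": "tasm",
}


def select_backend(argv0, environ):
    """Pick a back end from the name the program was invoked under."""
    # Accept both Windows and Linux path separators, then drop any extension.
    filename = re.split(r"[\\/]", argv0)[-1]
    stem = os.path.splitext(filename)[0].lower()
    if stem in NAME_TO_BACKEND:
        return NAME_TO_BACKEND[stem]
    # Assumed fallback order: environment variable, then direct OBJ output.
    return environ.get("hlabackend", "internal")
```

For example, select_backend("C:\\hla\\mhla.exe", os.environ) would return "masm" under these assumptions, while a copy named hla.exe would fall back to whatever "hlabackend" is set to.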

Quote
Good luck. I'm really looking forward to this version.

Me too. Should be a couple of days before I have an experimental version up.
Look for my cFASM announcement elsewhere.
Cheers,
Randy Hyde

Randall Hyde · March 18, 2007, 05:44:47 AM

Quote from: DarkWolf on March 17, 2007, 02:45:57 AM
Is GAS still used in the Linux version ?

Yes.

Quote
I was not aware of a Linux version of FASM or did you write that support too ?

Yes, FASM supports Linux. And HLA v1.90 will take advantage of that support, though it will still support Gas as well.

Quote
Is HLA still dependent on binutils/glibc or has this been replaced ?

It's never been dependent on glibc. It does use as and ld from the binutils package and even if you elect to go with FASM support, you'll still need to use ld (the linker).
Cheers,
Randy Hyde

DarkWolf · March 24, 2007, 03:03:20 AM

Strange, it must have been a flaw in RedHat.
The first Linux distro I used HLA in responded that glibc 2.3.3 or newer was needed.
Must have been another link in dependency hell (Linux's answer to DLL Hell) :toothy

I look forward to the ability to natively generate object code for the same reason I liked HIDE.
The environment is already built, all the tools are available, we know everything works.
Otherwise it's like what I have in Linux right now where I have to make sure I got everything right or it all craps out on me :(

(What we have here is a failure to communicate) :)
