BSS assembling speed

Started by Rockoon, February 02, 2007, 07:53:00 PM

Rockoon


I'd like to begin by saying "Hello World!" This is my first post on this forum.

I used to bang out a lot of asm back in the old days (DOS ERA with TASM) and recently I've tried to pick the habit back up for some Win32 development.

I currently have MASM v8.whatever and MASM32 (which is v6.whatever), and both of these assemblers seem to take an irrational amount of time assembling a "large" BSS segment. By "large" I mean only a few symbols totaling a dozen megabytes or so.

Now, I don't have the fastest development machine on the block, but it's no slouch either. AMD64x2 3800+.

Starting with a naked BSS (well, other included modules/libs have small BSS's), my machine assembles a file in the blink of an eye.

But just adding:

.data?
surface    dd 307200 dup (?)

..causes both MASM6 and MASM8 to take over a second .. I'm like WTF this can't possibly take that long.. it's just a single BSS symbol with a length, for Christ's sake!

I realize that I can allocate the memory in other ways, but it's nice to have memory auto-allocated by the OS loader.. is there a way around this speed issue while still using something fully MASM compatible (preferably with support for SSE3 instructions)?
When C++ compilers can be coerced to emit rcl and rcr, I *might* consider using one.

hutch--

It's a well-known limitation in every version of MASM. They have never fixed it, mainly because there is almost no point in allocating memory in that manner; it just enlarges the executable size for no purpose. You would normally allocate dynamic memory, which is limited only by what you have in your system.
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

tenkey

In case you still want to preallocate it...

If the new linkers can still invoke the OMF to COFF converter, generate OMF object files.

The other option is to separate large data blocks into a separate ASM file and generate a separate OBJ file in COFF format. Use EXTERNDEF in the defining file to make it accessible to other files, and use EXTERNDEF in the files that reference the data. You can use nmake or some other "make" program to ensure it builds the large block only when the defining file is changed.
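
For anyone following along, a minimal sketch of that two-file layout. The file names bigdata.asm and main.asm are hypothetical, and the include paths and command lines in the comments assume a default MASM32 install; adjust to taste.

```asm
; bigdata.asm (hypothetical name): holds only the large block.
; Assemble once:  ml /c /coff bigdata.asm
.386
.model flat, stdcall
option casemap:none

EXTERNDEF surface:DWORD         ; makes the symbol public, since it is defined here

.data?
surface dd 307200 dup (?)

end
```

```asm
; main.asm (hypothetical name): references the block defined in bigdata.asm.
; Assemble:  ml /c /coff main.asm
; Link:      link /subsystem:console main.obj bigdata.obj
.386
.model flat, stdcall
option casemap:none

include \masm32\include\kernel32.inc    ; assumed MASM32 path, for ExitProcess
includelib \masm32\lib\kernel32.lib

EXTERNDEF surface:DWORD         ; same declaration, resolved by the linker

.code
start:
    mov ecx, 307199             ; top of array
next:
    mov eax, [surface + 4*ecx]
    ; process it
    mov [surface + 4*ecx], eax
    dec ecx
    jns next
    invoke ExitProcess, 0
end start
```

Only bigdata.asm pays the slow assembly cost, and a make rule that depends on that file alone keeps it from being reassembled on every build.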
A programming language is low level when its programs require attention to the irrelevant.
Alan Perlis, Epigram #8

gfalen

Try the COMM directive.

.data?
COMM surface   dword:307200

Rockoon

Quote from: hutch-- on February 02, 2007, 09:53:18 PM
It's a well-known limitation in every version of MASM. They have never fixed it, mainly because there is almost no point in allocating memory in that manner; it just enlarges the executable size for no purpose. You would normally allocate dynamic memory, which is limited only by what you have in your system.

BSS data does not significantly affect the EXE size (small overhead per reference.. I suspect it amortizes to 4 bytes per reference), and that leads to the point of allocating it that way. The OS loader will fix up all references to it, allowing implicit reference..

mov ecx, 307199 ; top of array
@@do:
mov eax, [surface + 4 * ecx]

; process it

mov [surface + 4 * ecx], eax
dec ecx
jns @@do

..without the overhead of having the whole array stored in the EXE (as with DGROUP)

Note that the loop counter doubles as an index into the table. This is in fact one of AMD's latest optimisation tips for its 64's and better. The alternatives require more registers or more incrementing/decrementing.

Rockoon

Quote from: gfalen on February 02, 2007, 10:26:08 PM
Try the COMM directive.

.data?
COMM surface   dword:307200


works! thanks

cept the syntax is:

COMM surface:dword:307200

hutch--



> BSS data does not significantly affect the EXE size (small overhead per reference).

Yes, you are right, it's the initialised .DATA section that enlarges the exe size. MASM is still slow in the uninitialised data section with large allocations, and there seems to be little motivation to fix it as it's rarely used in production code. I would still prefer dynamic memory to memory allocated in an executable program, as it is only limited by available memory from the system.
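
A rough sketch of the dynamic route for the same 307200-dword buffer, assuming a default MASM32 install for the include paths. The name pSurface is made up for the example, and VirtualAlloc is only one of several allocation calls that would do.

```asm
.386
.model flat, stdcall
option casemap:none

include \masm32\include\windows.inc     ; assumed MASM32 paths
include \masm32\include\kernel32.inc
includelib \masm32\lib\kernel32.lib

.data?
pSurface dd ?                   ; hypothetical name: holds the buffer address

.code
start:
    ; ask the OS for 307200 dwords of zero-filled, read/write memory
    invoke VirtualAlloc, NULL, 307200*4, MEM_COMMIT or MEM_RESERVE, PAGE_READWRITE
    test eax, eax
    jz no_memory                ; unlike BSS, this can fail at run time
    mov pSurface, eax

    mov edx, pSurface           ; base pointer now lives in a register
    mov ecx, 307199             ; top of array
next:
    mov eax, [edx + 4*ecx]
    ; process it
    mov [edx + 4*ecx], eax
    dec ecx
    jns next

    invoke VirtualFree, pSurface, 0, MEM_RELEASE
    invoke ExitProcess, 0
no_memory:
    invoke ExitProcess, 1
end start
```

The trade-off is that the base address ties up a register (edx here) instead of being a fixed address baked into each instruction.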

Rockoon

Quote from: hutch-- on February 03, 2007, 08:33:58 AM
I would still prefer dynamic memory to memory allocated in an executable program as it is only limited by available memory from the system.

Ah, to each his own. One of the other benefits of using BSS is that there is no need for graceful failure when there isn't enough memory available.

There are some downsides to the 'comm' (communal) directive, such as that memory ordering apparently cannot be guaranteed.

.data?
comm foo:dword:1024
comm bar:dword:512

There is no guarantee here that bar will follow right behind foo in memory, and it may in fact come before foo!

Also, the symbol is obviously global to the project instead of private to the module so hello funky long naming.

Anyways.. thanks for the help guys!