Tricky problem with CodeAlign macro

Started by jj2007, February 16, 2010, 11:25:14 PM


dedndave

the best way to fix it is to tear into the assembler and patch it in the parser
but - that is a lot of work - maybe i will play with it this summer
it is not a huge bug - but sometimes, those little ones will make you pull your hair out

jj2007

I got it working for this particular example only, but it is soooo odd...! Compare the two db nn dup(90h) lines: The only difference is that the successful one has nopct as counter while the failing one has the 5 - and nopct is obviously 5.
Mysteries of Masm ::)

As an extra service for Dave and Lingo who both like inserting bytes before prog start instead of aligning before the loop, the macro now signals how many bytes had to be inserted :bg

CodeAlign MACRO ct
LOCAL nopct
  nopct = ct-($-start) mod ct      ; we take start as reference
  if ct eq nopct
   nopct = 0
  endif
  tmp$ CATSTR <Line >, %@Line, <: >, %nopct, < nops inserted>
  % echo tmp$
  if nopct eq 5
   db nopct dup(90h)   ; succeeds with ml and Jwasm
   ; db 5 dup(90h)      ; fails with ml and Jwasm
   ; db 2Eh, 8Dh, 44h, 20h, 0h         ; lea eax, cs:[eax], fails
  else
   align ct      ; standard
  endif
ENDM
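
The pad-count arithmetic in the macro can be checked outside the assembler. Below is a minimal sketch in Python (not MASM, so it runs without ml or JWasm). The names `pad_count` and `offset` are illustrative only, with `offset` standing in for `$-start`. It assumes that MASM's `mod` binds tighter than binary minus, so the macro line reads as `ct - (($-start) mod ct)`.

```python
def pad_count(offset: int, ct: int) -> int:
    """Model of nopct in CodeAlign; offset stands in for $-start (assumption)."""
    nopct = ct - (offset % ct)   # ct-($-start) mod ct
    if nopct == ct:              # already aligned: insert nothing
        nopct = 0
    return nopct


# Print the pad count for a few sample offsets with a 16-byte alignment.
for offset in (0, 9, 11, 13, 16):
    print(offset, pad_count(offset, 16))
```

This only models what the macro is meant to compute. It says nothing about why the assembler emits a different number of bytes.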

qWord

Quote from: jj2007 on February 17, 2010, 11:17:04 PM
The only difference is that the successful one has nopct as counter while the failing one has the 5 - and nopct is obviously 5.

that's masm as we love it  :bg
FPU in a trice: SmplMath
It's that simple!

dedndave

i have no idea how you figured that out, Jochen
both you guys did great   :U
i find it odd that it was adding 5 extra nops and the nop we are having trouble with is the 5 byte version
probably just a coincidence
the issue seems to be when it expands the macro - that would seem the logical place for the miscount

jj2007

Bad news: It's not working. I added a proc to test that insert bytes before proc start feature, and found one more oddity...

Line 52: 7 nops inserted
Line 72: 3 nops inserted
Line 89: 4 nops inserted

if you look at the code in Olly, you'll find 7, 3 and ...8 nops :red
For testing, I changed the treatment from align ct to db nopct dup
The macro echoes 4 nops, and inserts 8. Long live Masm :toothy

  tmp$ CATSTR <Line >, %@Line, <: >, %nopct, < nops inserted>
  % echo tmp$
  if nopct eq 5
   db nopct dup(90h)   ; succeeds with ml and Jwasm
   ; db 5 dup(90h)      ; fails with ml and Jwasm
   ; db 2Eh, 8Dh, 44h, 20h, 0h         ; lea eax, cs:[eax], fails
  else
   db nopct dup(90h)   ; succeeds with ml and Jwasm
;   align ct      ; standard
  endif


dedndave

maybe you could write the macro so that it inserts fixed instruction "strings"
then, decrements the ct value (by the length added) and loops until done...

90                      nop                        1-byte nop
8BFF                    mov edi,edi                2-byte nop
8D4900                  lea ecx,[ecx+00]           3-byte nop
8D642400                lea esp,[esp+00]           4-byte nop
368D642400              lea esp,ss:[esp+00]        5-byte nop (corrected)
8D9B00000000            lea ebx,[ebx+00000000]     6-byte nop
8DA42400000000          lea esp,[esp+00000000]     7-byte nop
368DA42400000000        lea esp,ss:[esp+00000000]  8-byte nop (added)

it seems that the "variable-dup" directive is not behaving as expected inside a macro
also - the user (programmer) could put whatever instructions in there he likes   :bg
of course, we can run comparisons to see which seem to work best
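
The "insert fixed strings, decrement, loop until done" idea can be sketched as follows. This is Python rather than MASM, purely to show the loop. The byte strings are copied from the table above, and `NOP_FILL` and `fill_bytes` are hypothetical names. Taking the longest available string first is an assumption, not something the thread settles on.

```python
# Byte strings copied from the table above, indexed by their length in bytes.
NOP_FILL = {
    1: bytes.fromhex("90"),                # nop
    2: bytes.fromhex("8BFF"),              # mov edi,edi
    3: bytes.fromhex("8D4900"),            # lea ecx,[ecx+00]
    4: bytes.fromhex("8D642400"),          # lea esp,[esp+00]
    5: bytes.fromhex("368D642400"),        # lea esp,ss:[esp+00]
    6: bytes.fromhex("8D9B00000000"),      # lea ebx,[ebx+00000000]
    7: bytes.fromhex("8DA42400000000"),    # lea esp,[esp+00000000]
    8: bytes.fromhex("368DA42400000000"),  # lea esp,ss:[esp+00000000]
}


def fill_bytes(count: int) -> bytes:
    """Emit `count` filler bytes, taking the longest table entry that fits each time."""
    out = b""
    remaining = count
    while remaining > 0:
        step = min(remaining, max(NOP_FILL))  # at most 8 bytes per instruction
        out += NOP_FILL[step]
        remaining -= step                     # decrement by the length added
    return out
```

In a macro, the same loop would emit each string as plain `db` bytes, which avoids a DUP count altogether.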

japheth

Quote from: dedndave on February 18, 2010, 09:21:16 AM
it seems that the "variable-dup" directive is not behaving as expected inside a macro
also - ...

Sounds somewhat unlikely. With the -EP cmdline option one will get a source version with all macros resolved - then try to assemble this version. I would be surprised if the results differ.

hutch--

Dave,

I have seen on at least some hardware that the very long instructions used for alignment produce stalls, at times bad ones, that affect the timing of the algorithm. There are instances where filling up an alignment with bare nops or int3 calls works better than a single large instruction, and if the execution of multiple nops is a problem, jumping to the aligned location is still a reasonable option.
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

dedndave

Andreas - we have more or less verified that masm has problems inserting the right number of elements
macros aren't expanded as one would expect
if it makes you feel any better, we have also verified that JWasm doesn't exhibit the problem   :bg

Hutch - that table is more or less what masm inserts, with one correction and one addition
i am not saying it is the best way to go
but, i would think carefully chosen instructions could prevent dependency stalls
not that alignment code should be executed several times, anyways

at this point, the problem seems to be use of the DUP operator
you can stick whatever bytes you like into the DB's

dedndave

what would be really nice   :P
would be a directive pair like this....

labelX  PAD_BYTES_HERE directive

;a bunch of code here

        ALIGN_AT labelX AlignmentSize directive

the assembler inserts pad bytes at labelX until the code after ALIGN_AT is aligned to AlignmentSize
labelX can be a place in the code that isn't executed, or is executed only once   :bg
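
The arithmetic behind such a directive pair can be sketched like this. It is Python, not an assembler feature, and `resolve_pad` with its parameters is a hypothetical name. It assumes that the code between labelX and ALIGN_AT keeps the same size when it is shifted, and that there is only one such pair (no nesting).

```python
def resolve_pad(label_offset: int, align_offset: int, alignment: int):
    """Return (pad, new_align_offset) for one PAD_BYTES_HERE / ALIGN_AT pair.

    label_offset: offset of labelX before padding
    align_offset: offset of the ALIGN_AT point before padding
    Assumption: the code in between does not change size when moved.
    """
    if label_offset > align_offset:
        raise ValueError("labelX must come before ALIGN_AT")
    # Bytes inserted at labelX shift everything after it by the same amount,
    # so the pad is whatever brings align_offset up to the next multiple.
    pad = (-align_offset) % alignment
    return pad, align_offset + pad


pad, aligned = resolve_pad(0x10, 0x35, 16)
print(pad, hex(aligned))
```

The pad at labelX depends on an offset that is only known further down, so a single pass cannot emit it. The assembler would need to revisit labelX once the ALIGN_AT point has been seen.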

jj2007

Quote from: dedndave on February 18, 2010, 10:03:56 AM
what would be really nice   :P
would be a directive pair like this....

Difficult because it implies forward referencing: error A2006:undefined symbol : lblMyLoop


Pad MACRO arg
LOCAL opa, nopct
  opa = (opattr(arg)) AND 127
  if opa eq 36
   tmp$ CATSTR <align ct=>, <arg>
   % echo tmp$
   nopct = arg-(LastLabel-start) mod arg      ; we take start as reference
   tmp$ CATSTR <NopCount=>, %nopct
   % echo tmp$
  else
   LastLabel CATSTR <lbl>, <arg>
   % echo LastLabel
   lbl&arg&:
   tmp$ CATSTR <label inserted: lbl>, <arg>
   % echo tmp$
  endif
ENDM
...
Pad 16
MyTest proc arg1:DWORD, arg2:DWORD
  print "Hello from MyTest", 13, 10
  mov ecx, 888
  Pad MyLoop
  .Repeat
   dec ecx
  .Until Sign?
  ret
MyTest endp
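
The `opa eq 36` test in Pad relies on the OPATTR bit flags. A small decoding sketch follows, in Python so it can be run as is. The bit meanings are given as I recall them from the MASM documentation, so treat them as an assumption and check them against the reference before relying on them. `OPATTR_BITS` and `decode_opattr` are illustrative names.

```python
# OPATTR low-byte flags (assumed from the MASM documentation, please verify).
OPATTR_BITS = {
    0x01: "code label",
    0x02: "memory expression or relocatable data label",
    0x04: "immediate expression",
    0x08: "direct memory addressing",
    0x10: "register expression",
    0x20: "no undefined symbols, no error",
    0x40: "SS-relative memory expression",
    0x80: "external label",
}


def decode_opattr(value: int) -> list:
    """List the flag names set in an OPATTR value."""
    return [name for bit, name in OPATTR_BITS.items() if value & bit]


# Pad masks with 127, which keeps bits 0..6 and drops the external-label bit.
print(decode_opattr(36 & 127))
```

Under that reading, 36 = 4 + 32 means "a constant with everything defined", so `Pad 16` takes the alignment branch and `Pad MyLoop` (not yet a defined symbol) takes the label branch.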

dedndave

more like - "difficult if we can't get regular align to work right"   :P
i knew it was a tricky one
but, i think it is do-able if you were writing your own assembler
it might get sticky if there were nested alignments
the assembler would have to test for invalid conditions to prevent the thing from hanging

qWord

jj,
for me this simplified version works (also no unnecessary nops):
CodeAlign MACRO ct
  db (ct-($-start) mod ct)*(1 AND ((($-start) mod ct) NE 0)) dup(90h)
ENDM

I've tested it in your latest test program.
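
To see why the single expression needs no `if` block, here is a small model of it in Python (not MASM). It assumes that MASM relational operators such as NE yield -1 (all bits set) for true and 0 for false, so `1 AND (... NE 0)` is 1 when padding is needed and 0 when the location is already aligned. `masm_ne` and `one_line_pad` are illustrative names, and `offset` again stands in for `$-start`.

```python
def masm_ne(a: int, b: int) -> int:
    """Model of MASM's NE: -1 for true, 0 for false (assumed behavior)."""
    return -1 if a != b else 0


def one_line_pad(offset: int, ct: int) -> int:
    """Model of (ct-($-start) mod ct)*(1 AND ((($-start) mod ct) NE 0))."""
    rem = offset % ct
    return (ct - rem) * (1 & masm_ne(rem, 0))


# The multiplier zeroes the count in the already-aligned case.
for offset in (0, 11, 16, 29):
    print(offset, one_line_pad(offset, 16))
```

The result matches the if/endif version of the count for every offset. The only change is that the "already aligned" case is folded into the multiplication.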

dedndave

that one seems to work ok, qWord
i will test it more a little later   :bg

dedndave

i used the forum search tool to find this older thread (2005)
this post by MichaelW is of particular interest...

http://www.masm32.com/board/index.php?topic=1622.msg13124#msg13124