Inspired by DednDave, I am trying to implement a CodeAlign macro:
CodeAlign MACRO ct
int 3
MyLabel LABEL NEAR
oml = MyLabel
divl = oml ; transfer, no error
; tmp$ CATSTR <MyDivl=>, %oml ; error, oml undefined
; % echo tmp$
; divl = divl mod ct ; error, divl undefined
mov eax, offset MyLabel ; works fine
mov eax, THIS NEAR ; works fine, loads EIP
ENDM
Apparently, oml = MyLabel does create a result: an offset of 16h relative to the start of the code segment. That would be a basis for calculating the necessary number of nops. However, the variable oml, while containing the right value as shown below, cannot be used as a normal integer variable. Any idea why??
00000015 CC 1 int 3
00000016 1 MyLabel LABEL NEAR
= 00000016 1 oml = MyLabel
= 00000016 1 divl = oml ; transfer, no error
1 ; tmp$ CATSTR <MyDivl=>, %oml ; error, oml undefined
1 ; % echo tmp$
1 ; divl = divl mod ct ; error, divl undefined
00000016 B8 00000016 R 1 mov eax, offset MyLabel ; works fine
0000001B B8 0000001B R 1 mov eax, THIS NEAR ; works fine, loads EIP
Full code:
.nolist
include \masm32\include\masm32rt.inc
.listall
CodeAlign MACRO ct
int 3
MyLabel LABEL NEAR
oml = MyLabel
divl = oml ; transfer, no error
; tmp$ CATSTR <MyDivl=>, %oml ; error, oml undefined
; % echo tmp$
; divl = divl mod ct ; error, divl undefined
mov eax, offset MyLabel ; works fine
mov eax, THIS NEAR ; works fine, loads EIP
ENDM
.code
start:
print "Masm32 is easy!", 13, 10
nop
CodeAlign 4
.Repeat
inc eax
.Until eax!=123
; inkey "Press any key"
invoke ExitProcess, 0
; .err
end start
Do not ask me why, but before using labels for arithmetic calculations you have to form a difference (or sum) of two:
(OFFSET B - OFFSET A)
If label A is at position 0 (in the segment), you get the offset of B relative to the start of the segment.
I've uploaded a macro in this thread (http://www.masm32.com/board/index.php?topic=13388.msg104447#msg104447)
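A minimal sketch of that rule, assuming a flat 32-bit model and ml or JWasm; the label `here` and the symbol `dist` are made up for illustration. A single label is a relocatable address, which is presumably why `%oml` was rejected in the first post, while the difference of two labels in the same segment is an ordinary assembly-time number:

```asm
.386
.model flat, stdcall
option casemap:none

.code
start:
    nop
    nop
here:                       ; hypothetical second label, two bytes after start
; A lone label is relocatable; the difference of two labels in the
; same segment is a plain constant that can be used in arithmetic.
dist = here - start
tmp$ CATSTR <dist=>, %dist  ; same echo idiom as used elsewhere in the thread
% echo tmp$                 ; should echo dist=2 (the two nop bytes)
    ret
end start
```

The same holds for `$-start`, since `$` is just the current location in the segment.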
Thanks, qWord. In the meantime I discovered that odd behaviour myself - $-start works fine:
.nolist
include \masm32\include\masm32rt.inc
CodeAlign MACRO ct
LOCAL nops
nops = ct-($-start) mod ct ; we take the start: label as reference
if nops ne ct
REPEAT nops
nop ; for testing, we just insert nops
ENDM
endif
tmp$ CATSTR <MyNops=>, %nops ; watch in output window
% echo tmp$
ENDM
.code
start:
REPEAT 10
nop
ENDM
CodeAlign 16
; nop ; use for testing
mov eax, $ ; same as THIS NEAR
and eax, 15
print str$(eax), ": should be zero", 13, 10
REPEAT 6
nop
ENDM
CodeAlign 16
mov eax, THIS NEAR ; same as $
and eax, 15
inkey str$(eax), ": should be zero", 13, 10
exit
end start
The code assembles, and the first alignment produces the desired result, but the second one doesn't. Probably something really stupid, but right now I am too tired... good night everybody :8)
get a good night's sleep, Jochen
this one has me baffled - lol
but, you and qWord will make it work :8)
I am making progress, and there is good and bad news. First, the good news:
It works perfectly, and it is so simple:
CodeAlign MACRO ct
LOCAL nops, diff
diff = $-start
nops = ct-($-start) mod ct ; we take the start: label as reference
if nops ne ct
REPEAT nops
nop ; for testing, we just insert nops
ENDM
endif
tmp$ CATSTR <MyNops=>, %nops, <, d=>, %diff ; watch in output window
% echo tmp$
ENDM
Now the bad news: It works only for JWasm, because ML.exe has a feature™ that occasionally adds several bytes to the location counter, in a not so predictable way. If the fathers of this design are still around, they should show up now, please :8)
ummmmmm - JWasm probably doesn't have the problem, to begin with
if it does, we can get Japheth to fix it :bg
nice macro, though :U
but, i think qWord is on to something
if you combine his macro with the assemblers ALIGN, you might have something that works in all cases
for example:
a secondary routine in a library is one place where he says it won't work
but, if you use a non-executed ALIGN directive at the beginning of that routine,
you might be able to make the macro method work because it has a known reference point - i dunno
i.e., use an ALIGNed label as a reference point instead of Start:
Quote from: dedndave on February 17, 2010, 09:57:20 AM
ummmmmm - JWasm probably doesn't have the problem, to begin with
if it does, we can get Japheth to fix it :bg
JWasm works perfectly, it's only ml that has a problem
Quote
i.e., use an ALIGNed label as a reference point instead of Start:
start: is aligned, unless you have put code of your own before the label. In that case,
align 16
start:
will provide you again with a perfectly aligned reference point.
The issue is not the reference, it's that ml arbitrarily calculates the program counter in a wrong way. Have a look at the testbed in Olly...
Quote
Quote from: dedndave on Today at 01:57:20 AM
ummmmmm - JWasm probably doesn't have the problem, to begin with
if it does, we can get Japheth to fix it :bg
JWasm works perfectly, it's only ml that has a problem
my point was that JWasm doesn't need the CodeAlign macro :red
Quote from: dedndave on February 17, 2010, 10:15:10 AM
my point was that JWasm doesn't need the CodeAlign macro :red
That's true, good point :bg
00401032 B832104000 mov eax, 00401032
00401037 BA37104000 mov edx, 00401037
0040103C 83EA05 sub edx, 005 ;5 accounts for "mov edx,immed32"
0040103F 3BC2 cmp eax, edx
shouldn't that be sub edx,10 ?
mov eax,immed32 = 5 bytes
mov edx,immed32 = 5 bytes
total = 10 bytes
EDIT - btw - i like the trick with inkey :bg
Quote from: dedndave on February 17, 2010, 10:39:18 AM
shouldn't that be sub edx,10 ?
No, it just tests what would have been the result if edx had been used in the same position as eax, so it's 5 bytes.
mov eax, THIS NEAR ; loads... what?
mov edx, $ ; loads EIP
sub edx, 5 ; account for mov edx, $
00401032 B832104000 mov eax, 00401032
00401037 BA37104000 mov edx, 00401037
0040103C 83EA05 sub edx, 005
0040103F 3BC2 cmp eax, edx
00401041 7414 je 00401057
00401043 6827304000 push 00403027 ;(StringData)"$ and THIS NEAR seem to be different"
00401048 E8BB000000 call 00401108
0040104D 684C304000 push 0040304C ;(StringData)" <cr><lf>"
00401052 E8B1000000 call 00401108
;this is the macro expansion - it does not appear to have generated the correct number of NOP's
00401057 90 nop
00401058 90 nop
00401059 90 nop
0040105A 90 nop
0040105B 90 nop
0040105C 90 nop
0040105D B85D104000 mov eax, 0040105D
00401062 83E007 and eax, 007
00401065 684F304000 push 0040304F
0040106A 50 push eax
0040106B E830000000 call 004010A0
00401070 684F304000 push 0040304F
00401075 E88E000000 call 00401108
0040107A 6864304000 push 00403064 ;(StringData)": should be zero <cr><lf>"
0040107F E884000000 call 00401108
00401084 E8B7000000 call 00401140
it should only generate 1 NOP
it generated 6
6-1 = 5
Quote from: dedndave on February 17, 2010, 11:25:09 AM
;this is the macro expansion - it does not appear to have generated the correct number of NOP's
Yes, for ML.exe it fails because it assumes a much higher offset, 131 instead of 88 bytes. JWasm handles it correctly...
i see that one nop is put in there by you (not sure why)
so - there should be only one nop
i am still not understanding the problem completely, other than it is generating the wrong number of nops
it seems to be something along the line of - when the macro gets expanded, masm is assigning a different PC value than we expect
EDIT - i seem to recall having seen this problem before
but i may have a way around it - let me try it
well - my novice macro skills are a major limitation :bg
but, maybe my results will give you a clue...
here is the macro i tried
CodeAlign MACRO ct
LOCAL nearlabel,nops
nearlabel:
nops = ct-((nearlabel-start) and (ct-1))
echo ct
echo start
echo nearlabel
echo nops
if nops
REPEAT nops
nop ; for testing, we just insert nops
ENDM
endif
ENDM
here is the assembly-time output
8
start
??0020
??0021
i can't get it to show me the address of "start"
it's probably because start has not been defined when the macro is
the hope was to figure out how masm PC behaves when the macro is expanded and compensate for it with a math expression
don't know why, but the problem seems to be the 'start' label. This works:
CodeAlign MACRO ct
LOCAL nops, diff
IFNDEF align_mark_code
_TEXT SEGMENT
align_mark_code:
org 0
algin_zero_lbl_code:
org align_mark_code
_TEXT ENDS
ENDIF
nops = ct-(($-algin_zero_lbl_code) mod ct)
if nops ne ct
REPEAT nops
nop
ENDM
endif
ENDM
It generates a label in the code segment at position 0 (org 0). Generally this is better than the start label, because not everybody uses a label named 'start' (some use a proc as the entry point). Also, there may be code before the start label, e.g. through includes.
EDIT - error corrected - thx2 dedndave :)
and - it is defined before the macro that references it :U
shouldn't it be this ?
CodeAlign MACRO ct
LOCAL nops
IFNDEF align_mark_code
_TEXT SEGMENT
align_mark_code:
org 0
align_zero_lbl_code:
org align_mark_code
_TEXT ENDS
ENDIF
nops = ct-(($-align_zero_lbl_code) mod ct)
if nops ne ct
REPEAT nops
nop
ENDM
endif
ENDM
Quote from: dedndave on February 17, 2010, 02:09:17 PM
shouldn't it be this ?
yes, you're right :bg
I've also followed one of your suggestions and modified my macro: it now uses masm's align except for a padding of 5
The problem is not the start label, it yields exactly the same offset as Lbl0 in this case. The problem is that ml.exe shoots in the dark when it comes to calculating the current offset...
Echos from JWasm:
Nops_S=7, diff_S=1
Nops_L=7, diff_L=1
Main code: diff=95 ; correct - we need exactly one nop to arrive at 96
Nops_S=1, diff_S=95
Nops_L=1, diff_L=95
Echos from ml.exe:
Nops_S=7, diff_S=1
Nops_L=7, diff_L=1
Main code: diff=138 ; rubbish, this is far from the true position
Nops_S=6, diff_S=138
Nops_L=6, diff_L=138
CodeAlign MACRO ct
LOCAL nops, diff
IFNDEF Current_ORG
Current_ORG: ; save origin by creating a label
ORG 0 ; set origin to zero
Lbl0:
ORG Current_ORG ; restore origin
ENDIF
diff_S = $-start
diff_L = $-Lbl0
nops_L = ct-(($-Lbl0) mod ct) ; we take Lbl0 as reference
nops_S = ct-($-start) mod ct ; we take start as reference
nops = nops_L
if nops ne ct
REPEAT nops
nop ; for testing, we just insert nops
ENDM
endif
tmp$ CATSTR <Nops_S=>, %nops_S, <, diff_S=>, %diff_S ; watch in output window
% echo tmp$
tmp$ CATSTR <Nops_L=>, %nops_L, <, diff_L=>, %diff_L
% echo tmp$
ENDM
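The two sets of echoes in this post differ only in the offset that goes into the formula. As a worked check, assuming ct = 8 (inferred from the `and eax, 007` in the disassembly and from Nops=7 at diff=1), the fragment below redoes the arithmetic with the echoed offsets as literal numbers. The symbol names are invented for this sketch, and it can be pasted anywhere before `end`:

```asm
ct         = 8              ; assumed alignment, see the note above
diff_jwasm = 95             ; offset echoed by JWasm
diff_ml    = 138            ; offset echoed by ml.exe

nops_jwasm = ct - (diff_jwasm mod ct)   ; 95 mod 8 = 7, so 8 - 7 = 1
nops_ml    = ct - (diff_ml mod ct)      ; 138 mod 8 = 2, so 8 - 2 = 6

tmp$ CATSTR <JWasm: >, %nops_jwasm, <, ml: >, %nops_ml
% echo tmp$                 ; should echo JWasm: 1, ml: 6
```

So the formula itself gives the echoed nop counts in both cases; only the value ml reports for `$` is off.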
I am also not sure why JWasm insists that the first THIS NEAR is different from $.
Well jj, trying to fix one bug ends up finding an another one :toothy
Quote from: jj2007 on February 17, 2010, 03:27:21 PM
The problem is that ml.exe shoots in the dark when it comes to calculating the current offset...
The very strange thing is that it works fine if you replace the loop (REPEAT nops) with 'db nops dup (090h)'. Possibly such offset expressions are not evaluated in the same pass as the rest of the macro (???).
well - then it isn't a problem, as we intend to use if/then constructs to place our code
the repeat nop's was for test only :bg
EDIT - in fact - most of the masm alignment length combinations are ok
we only need to replace the ones that use add eax,0
that could simplify the macro logic considerably
Quote from: dedndave on February 17, 2010, 05:44:17 PM
in fact - most of the masm alignment length combinations are ok
we only need to replace the ones that use add eax,0
that could simplify the macro logic considerably
That is what the macro I uploaded in the post above does. Here is a simplified version without .data/.data? support:
code_align macro nAlign:req
IFNDEF algin_zero_lbl_code
align_mark_code:
org 0
algin_zero_lbl_code:
org align_mark_code
ENDIF
;; check if power of two
align_msk = 2
align_flag = 0
REPEAT 31
IF align_msk EQ nAlign
align_flag = 1
EXITM
ENDIF
align_msk = align_msk SHL 1
ENDM
IF align_flag EQ 0
.err <code_align: ALIGN only to power of 2: 2,4,8,16,32,64...>
EXITM
ENDIF
;; generate the bit mask
align_msk = nAlign-1
align_fill = 090h
IFIDNI @CurSeg,<_TEXT>
IF nAlign LE 16
IF (nAlign-(align_msk AND ($-OFFSET algin_zero_lbl_code))) EQ 5 ; correct masm's align bug
nop
db 08Dh,74h,26h,0 ; lea esi,[esi+imm8(0)]
ELSE
align nAlign
ENDIF
ELSE ;; align > 16
db (nAlign-(align_msk AND ($-OFFSET algin_zero_lbl_code)))*(1 AND ((align_msk AND ($-OFFSET algin_zero_lbl_code)) NE 0)) dup (align_fill)
ENDIF
ELSE
.err <current segment not supported by code_align-macro>
ENDIF
endm
i created a code segment with PAGE align-type
it seems ALIGN 256 will use 7-byte NOP's (lea esp,[esp+00000000]) to fill in most of the space
they could have used Michael's SS: override to make them 8-byte NOP's - but oh well - it works without modifying flags
i guess it wouldn't make much difference in terms of speed
so - if the (align distance MOD 7) = 5, you need to take over alignment
otherwise, the ALIGN directive works ok
you could just add the 5-byte NOP, then use ALIGN :bg (oops - that logic doesn't fly)
ok - i see you just added a NOP instruction
i guess if you use executed alignment to 256, you aren't too concerned about speed or dependencies - lol
i glued your macro into Jochen's test program...
0: should be zero
5: should be zero
same result - not sure how to fix it
Quote from: dedndave on February 17, 2010, 09:17:48 PM
i glued your macro into Jochen's test program...
for me it works (ml 10.x) - I'm using my friend Olly for checking. Are you building it as release?
yes - i have no debugger installed on this hard drive
i am about to rebuild, so haven't taken the time to set it up
let me attach...
If the alignment will require more than a small number of nop instructions it would be faster to jump around them than to execute them. GAS does this starting with a 15-byte nop.
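A sketch of the jump-around idea in MASM syntax, assuming the code segment itself is aligned to at least 16 bytes (otherwise ml rejects `align 16`). The loop body and the count in ecx are placeholders only:

```asm
    mov ecx, 1000       ; placeholder loop count
    jmp @F              ; executed once: skips whatever padding follows
    align 16            ; assembler pads up to the next 16-byte boundary
@@:                     ; aligned loop entry
    dec ecx
    jnz @B              ; loop back to the aligned label
```

The jump costs a couple of bytes itself, so this only pays off when the padding would otherwise be long.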
dedndave, with your example it doesn't work for me either. Currently it seems to me that it is not possible to fix this align bug :'( - maybe jj will get it.
However, my original macro (using simply db x dup()) seems to work properly.
qWord
the best way to fix it is to tear into the assembler and patch it in the parser
but - that is a lot of work - maybe i will play with it this summer
it is not a huge bug - but sometimes, those little ones will make you pull your hair out
I got it working for this particular example only, but it is soooo odd...! Compare the two db nn dup(90h) lines: The only difference is that the successful one has nopct as counter while the failing one has the 5 - and nopct is obviously 5.
Mysteries of Masm ::)
As an extra service for Dave and Lingo who both like inserting bytes before prog start instead of aligning before the loop, the macro now signals how many bytes had to be inserted :bg
CodeAlign MACRO ct
LOCAL nopct
nopct = ct-($-start) mod ct ; we take start as reference
if ct eq nopct
nopct = 0
endif
tmp$ CATSTR <Line >, %@Line, <: >, %nopct, < nops inserted>
% echo tmp$
if nopct eq 5
db nopct dup(90h) ; succeeds with ml and Jwasm
; db 5 dup(90h) ; fails with ml and Jwasm
; db 2Eh, 8Dh, 44h, 20h, 0h ; lea eax, cs:[eax], fails
else
align ct ; standard
endif
ENDM
Quote from: jj2007 on February 17, 2010, 11:17:04 PM
The only difference is that the successful one has nopct as counter while the failing one has the 5 - and nopct is obviously 5.
thats masm as we love it :bg
i have no idea how you figured that out, Jochen
both you guys did great :U
i find it odd that it was adding 5 extra nops and the nop we are having trouble with is the 5 byte version
probably just a coincidence
the issue seems to be when it expands the macro - that would seem the logical place for the miscount
Bad news: It's not working. I added a proc to test that "insert bytes before proc start" feature, and found one more oddity...
Line 52: 7 nops inserted
Line 72: 3 nops inserted
Line 89: 4 nops inserted
if you look at the code in Olly, you'll find 7, 3 and ...8 nops :red
For testing, I changed the treatment from align ct to db nopct dup. The macro echoes 4 nops, and inserts 8. Long live Masm :toothy
tmp$ CATSTR <Line >, %@Line, <: >, %nopct, < nops inserted>
% echo tmp$
if nopct eq 5
db nopct dup(90h) ; succeeds with ml and Jwasm
; db 5 dup(90h) ; fails with ml and Jwasm
; db 2Eh, 8Dh, 44h, 20h, 0h ; lea eax, cs:[eax], fails
else
db nopct dup(90h) ; succeeds with ml and Jwasm
; align ct ; standard
endif
maybe you could write the macro so that it inserts fixed instruction "strings"
then, decrements the ct value (by the length added) and loops until done...
90 nop 1-byte nop
8BFF mov edi,edi 2-byte nop
8D4900 lea ecx,[ecx+00] 3-byte nop
8D642400 lea esp,[esp+00] 4-byte nop
368D642400 lea esp,ss:[esp+00] 5-byte nop (corrected)
8D9B00000000 lea ebx,[ebx+00000000] 6-byte nop
8DA42400000000 lea esp,[esp+00000000] 7-byte nop
368DA42400000000 lea esp,ss:[esp+00000000] 8-byte nop (added)
it seems that the "variable-dup" directive is not behaving as expected inside a macro
also - the user (programmer) could put whatever instructions in there he likes :bg
of course, we can run comparisons to see which seem to work best
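A rough sketch of that "insert a fixed string, subtract its length, loop" idea, using only byte sequences from the table above. The macro name `PadBytes` and the symbol `left` are hypothetical, and `count` is assumed to be a plain constant; deriving it from `$` inside the macro would run into the same ml offset question discussed in this thread:

```asm
PadBytes MACRO count
    LOCAL left
    left = count                            ;; bytes still to be filled
    WHILE left GT 0
        IF left GE 7
            db 8Dh, 0A4h, 24h, 0, 0, 0, 0   ;; lea esp,[esp+00000000], 7 bytes
            left = left - 7
        ELSEIF left GE 4
            db 8Dh, 64h, 24h, 0             ;; lea esp,[esp+00], 4 bytes
            left = left - 4
        ELSEIF left EQ 3
            db 8Dh, 49h, 0                  ;; lea ecx,[ecx+00], 3 bytes
            left = left - 3
        ELSEIF left EQ 2
            db 8Bh, 0FFh                    ;; mov edi,edi, 2 bytes
            left = left - 2
        ELSE
            db 90h                          ;; nop, 1 byte
            left = left - 1
        ENDIF
    ENDM
ENDM
```

For example, `PadBytes 11` would emit the 7-byte form followed by the 4-byte form. Swapping in other byte strings, or the 5-, 6- and 8-byte entries, only means adding more branches.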
Quote from: dedndave on February 18, 2010, 09:21:16 AM
it seems that the "variable-dup" directive is not behaving as expected inside a macro
also - ...
Sounds somewhat unlikely. With the -EP cmdline option one will get a source version with all macros resolved - then try to assemble this version. I would be surprised if the results differ.
Dave,
I have seen on at least some hardware that the very long instructions used for alignment produce stalls, and at times bad ones that affect the timing of the algorithm. There are instances where filling up an alignment with bare nops or int3 calls works better than a single large instruction, and if the execution of multiple nops is a problem, jumping to the aligned location is still a reasonable option.
Andreas - we have more or less verified that masm has problems inserting the right number of elements
macros aren't expanded as one would expect
if it makes you feel any better, we have also verified that JWasm doesn't exhibit the problem :bg
Hutch - that table is more or less what masm inserts, with one correction and one addition
i am not saying it is the best way to go
but, i would think carefully chosen instructions could prevent dependency stalls
not that alignment code should be executed several times, anyways
at this point, the problem seems to be use of the DUP operator
you can stick whatever bytes you like into the DB's
what would be really nice :P
would be a directive pair like this....
labelX PAD_BYTES_HERE directive
;a bunch of code here
ALIGN_AT labelX AlignmentSize directive
the assembler inserts pad bytes at labelX until the code after ALIGN_AT is aligned to AlignmentSize
labelX can be a place in the code that isn't executed, or is executed only once :bg
Quote from: dedndave on February 18, 2010, 10:03:56 AM
what would be really nice :P
would be a directive pair like this....
Difficult because it implies forward referencing: error A2006:undefined symbol : lblMyLoop
Pad MACRO arg
LOCAL opa, nopct
opa = (opattr(arg)) AND 127
if opa eq 36
tmp$ CATSTR <align ct=>, <arg>
% echo tmp$
nopct = arg-(LastLabel-start) mod arg ; we take start as reference
tmp$ CATSTR <NopCount=>, %nopct
% echo tmp$
else
LastLabel CATSTR <lbl>, <arg>
% echo LastLabel
lbl&arg&:
tmp$ CATSTR <label inserted: lbl>, <arg>
% echo tmp$
endif
ENDM
...
Pad 16
MyTest proc arg1:DWORD, arg2:DWORD
print "Hello from MyTest", 13, 10
mov ecx, 888
Pad MyLoop
.Repeat
dec ecx
.Until Sign?
ret
MyTest endp
more like - "difficult if we can't get regular align to work right" :P
i knew it was a tricky one
but, i think it is do-able if you were writing your own assembler
it might get sticky if there were nested alignments
the assembler would have to test for invalid conditions to prevent the thing from hanging
jj,
for me this simplified version works (also no unnecessary nops):
CodeAlign MACRO ct
db (ct-($-start) mod ct)*(1 AND ((($-start) mod ct) NE 0)) dup(90h)
ENDM
I've tested it in your latest test program.
that one seems to work ok, qWord
i will test it more a little later :bg
i used the forum search tool to find this older thread (2005)
this post by MichaelW is of particular interest...
http://www.masm32.com/board/index.php?topic=1622.msg13124#msg13124