JWASM 2.02 win64 bug ?

Started by rags, January 22, 2010, 10:46:25 PM

rags

Japheth,
On Win7 Home Premium, this simple test piece fails when assembled using the -win64 switch:


option casemap :none      ; case sensitive

_WIN64 EQU 1
  include \jwasm\Win32Inc200\Include\windows.inc     ;Japheth's windows.inc
includelib \jwasm\Win32Inc200\Lib64\kernel32.lib
includelib \jwasm\Win32Inc200\Lib64\user32.lib

.data
    align 8
    szTitle db "64 Bit MessageBox",0
    szMsg   db "Hey this works!!",0

.code
main:
  ;  int 3       ; for the debugger use
   ; sub rsp, 8h
    invoke MessageBox, NULL, addr szMsg, addr szTitle, MB_OK
   ; add rsp, 8h

   ; sub rsp, 8h
    invoke ExitProcess, 0
   ; add rsp, 8h

end main

It seems that JWASM 2.02 when compiling the invoke statement,
should reserve another 8 bytes on the stack to allow room for the push of
the return address.

When the sub/add rsp code is uncommented, making the necessary adjustment
to the stack before and after the call, it works as expected.
God made Man, but the monkey applied the glue -DEVO
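
The effect described in this post can be modelled with a few lines of arithmetic. The sketch below is only an illustration, not JWasm output: it assumes the entry point is itself reached through a CALL made with a 16-byte aligned stack, so RSP is 8 modulo 16 when `main:` starts, and that invoke reserves 20h bytes for a call with four arguments. The function name and its parameters are hypothetical.

```python
SHADOW_SPACE = 0x20  # bytes reserved by invoke for up to four arguments


def rsp_mod16_at_call(entry_mod16, extra_sub=0, reserve=SHADOW_SPACE):
    """Return RSP modulo 16 at the moment the CALL instruction executes."""
    return (entry_mod16 - extra_sub - reserve) % 16


# Assumption: the return address pushed by the caller of the entry point
# leaves RSP at 8 modulo 16 on entry.
ENTRY = 8

print(rsp_mod16_at_call(ENTRY))               # 8: misaligned at the CALL
print(rsp_mod16_at_call(ENTRY, extra_sub=8))  # 0: aligned once "sub rsp, 8h" is added
```

Under these assumptions the extra 8 bytes are needed because the code starts at a bare label with no prologue, not because of the size that invoke reserves.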

japheth

Quote from: rags on January 22, 2010, 10:46:25 PM
It seems that JWASM 2.02 when compiling the invoke statement,
should reserve another 8 bytes on the stack to allow room for the push of
the return address.

the invoke directive assumes that the stack is aligned. It subtracts 20h, 30h, 40h, ... from RSP, fills parameters and emits a CALL. IMO it would be a very bad idea if invoke has to implement some runtime stack alignment tests.

BogdanOntanu

Quote from: japheth on January 23, 2010, 07:53:54 AM
the invoke directive assumes that the stack is aligned.

That is correct.

Quote
It subtracts 20h, 30h, 40h, ... from RSP, fills parameters and emits a CALL.

This is incorrect.

You must also consider the return address pushed by the CALL when you calculate the value to subtract from / add to RSP. Hence you must subtract 28h, 30h, 38h, 40h, 48h ... etc. depending on the even/odd count of invoke parameters.

Think about it... if you do not compensate for the 8 bytes of the return address THEN inside the PROC that you invoke the stack will be unbalanced by 8 and you will have to do it again anyway. This is one of the subtle issues with the 64bit ABI standard.

Quote
IMO it would be a very bad idea if invoke has to implement some runtime stack alignment tests.

Yes, this is also my opinion.

However, in some minor circumstances it might be useful as an option. For example, when mixing with other calling conventions that cannot guarantee stack alignment at run time (like stdcall), or when debugging, to do a fast check of whether your application has stack alignment issues ;)

Ambition is a lame excuse for the ones not brave enough to be lazy.
http://www.oby.ro

japheth

Quote
Quote
It subtracts 20h, 30h, 40h, ... from RSP, fills parameters and emits a CALL.

This is incorrect.

I just verified what JWasm currently does:

for 0 - 4 parameters, 20h is subtracted from RSP
for 5 - 6 parameters, 30h is subtracted from RSP
for 7 - 8 parameters, 40h is subtracted from RSP
...

IMO this behavior complies with the win64 ABI.
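
The three rows listed above fit a simple rule: at least four 8-byte slots, rounded up to an even number of slots so that the total stays a multiple of 16. The sketch below is a guess at that pattern based only on the figures in this post, not a description of JWasm's source; `invoke_reserve` is a hypothetical name.

```python
def invoke_reserve(nparams):
    """Bytes subtracted from RSP for an invoke with nparams arguments."""
    slots = max(nparams, 4)   # never fewer than four 8-byte slots (20h)
    slots += slots % 2        # round an odd slot count up to the next even one
    return slots * 8


for n in range(9):
    print(n, hex(invoke_reserve(n)))
```

Because every result is a multiple of 16, the subtraction never changes the alignment RSP had before the invoke.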

Quote
You must also consider the return address pushed by the CALL when you calculate the value to subtract from / add to RSP. Hence you must subtract 28h, 30h, 38h, 40h, 48h ... etc. depending on the even/odd count of invoke parameters.

From what I understood when reading the win64 ABI, the one thing that matters is that RSP is 16-byte aligned just before the CALL instruction.

Quote
Think about it... if you do not compensate for the 8 bytes of the return address THEN inside the PROC that you invoke the stack will be unbalanced by 8 and you will have to do it again anyway. This is one of the subtle issues with the 64bit ABI standard.

Yes, on entry the stack is - always - unbalanced. AFAIU it's the job of the procedure to balance it again if necessary.
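
A minimal sketch of what "balancing it again" amounts to inside the called procedure, assuming RSP was 16-byte aligned just before the CALL. The function name and its parameters are hypothetical, and the model says nothing about the prologue a particular assembler actually emits.

```python
def prologue_padding(pushed_regs, local_bytes):
    """Extra bytes a prologue must subtract so RSP ends up 16-byte aligned.

    pushed_regs counts every 8-byte register push in the prologue
    (including a frame pointer, if one is saved); local_bytes is the
    space reserved for local variables.
    """
    # Return address + register pushes + locals, all below the aligned RSP.
    used = 8 + 8 * pushed_regs + local_bytes
    return (-used) % 16


print(prologue_padding(0, 0))    # 8: only the return address is on the stack
print(prologue_padding(1, 0))    # 0: one push already restores alignment
print(prologue_padding(4, 16))   # 8: four pushes plus 16 bytes of locals
```

The result is always 0 or 8, since every item on the stack in this model is a multiple of 8 bytes.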

Igor

Quote from: japheth on January 23, 2010, 02:02:11 PM
I just verified what JWasm currently does:

for 0 - 4 parameters, 20h is subtracted from RSP
for 5 - 6 parameters, 30h is subtracted from RSP
for 7 - 8 parameters, 40h is subtracted from RSP

This seems very odd. What happens when you call the MessageBox API?
*MessageBox is one of the functions which crash immediately if the stack is misaligned

drizz

I'm still not convinced that jwasm balances the stack correctly.
What's worse is that Windows (XP 64 in a VMware machine) silently ignores the access violation and resumes execution!

Let's take this modified WinCUI1 example:
option casemap:none

.nolist
.nocref
WIN32_LEAN_AND_MEAN equ 1
include windows.inc
.list
.cref

.DATA
align 16
txt db "Hello World",0
.CODE

main proc c uses rbx rsi rdi
LOCAL var1:qword,var2:qword
sub esp,16
movdqa xmm0,[esp]; create access violation if unaligned read
add esp,16
invoke MessageBox,0,addr txt,0,0
ret
main endp

mainCRTStartup proc
and rsp,-16
call main
invoke ExitProcess, eax
mainCRTStartup endp

END mainCRTStartup


If I run it directly, the message box appears. If I run it through WinDbg, the access violation shows up because rsp is not 16-byte aligned.
Can anyone confirm this behavior on a normal system? Here is the exe file.


The truth cannot be learned ... it can only be recognized.

drizz

And yes, it's the same with FRAME option.

option casemap:none
option frame:auto
include windows.inc

.DATA
align 16
txt db "Hello World",0
.CODE

main proc FRAME uses rbx rsi rdi
LOCAL var1:qword
sub esp,16
movdqa xmm0,[esp]; create access violation if unaligned read
add esp,16
invoke MessageBox,0,addr txt,0,0
ret
main endp

mainCRTStartup proc
and rsp,-16
call main
invoke ExitProcess, eax
mainCRTStartup endp

END mainCRTStartup


Even though the manual states:
Quote
The PROC's FRAME attribute ensures that the stack is correctly aligned after the prologue is done.

BlackVortex

That test.exe has "stopped working" in my real win7 x64.   :eek

Access violation at :
000000014000100E         67660F6F0424         movdqa xmm0,oword [esp] ; [000000000012FF08]=00000000000000000000000000000000
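
The address in that dump can be checked directly: movdqa faults when its memory operand is not on a 16-byte boundary. A quick check of the reported operand address, with `is_aligned` as a hypothetical helper name:

```python
def is_aligned(address, boundary=16):
    """True if address is a multiple of boundary."""
    return address % boundary == 0


fault_address = 0x000000000012FF08  # operand address from the dump above

print(is_aligned(fault_address))    # False
print(fault_address % 16)           # 8
```

The remainder of 8 is what you would expect if the stack pointer is off by exactly one 8-byte slot.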


japheth

Quote from: drizz on March 01, 2010, 06:41:55 PM
And yes, it's the same with FRAME option.
...
Even though the manual states:
Quote
The PROC's FRAME attribute ensures that the stack is correctly aligned after the prologue is done.

You're right, the stack alignment calculations were probably a bit too lazy.

But I tried an improvement. It may work better...  :wink

drizz

Japheth, thanks for looking into it. Works as expected now (2.03). There is just this tiny issue (or non-issue) of "add rsp,0" being generated.

Cheers!