Stack-parameter confusion

Seb · October 15, 2005, 05:51:09 PM

Hello!

I've got a problem understanding how function parameters on the stack work. Let's say I have a C function called "generate" and I want to translate it to Assembly:

typedef struct {
int foo;
unsigned char *bar;
} generate_struct;

int generate(generate_struct *g, int x, int y, long *res);

How would I access the structure members and the function parameters? I know that you access the function parameters through ESP, but after ESP there's a number which I'm not sure of (I've seen different codes and some start at 4, some start at 8 and some start at 16 :eek). Could anyone explain?

MOV EAX, [ESP+4] ; EAX now points to "generate_struct *g"?
; Would I now access the structure members through EAX like below?
MOV foo, [EAX+4] ; an "int" is 4 bytes long
MOV bar, [EAX+8] ; a pointer is 4 bytes long (?)

bozo · October 15, 2005, 10:59:53 PM

Quote: How would I access the structure members and the function parameters? I know that you access the function parameters through ESP, but after ESP there's a number which I'm not sure of (I've seen different codes and some start at 4, some start at 8 and some start at 16). Could anyone explain?

Assembly was the first language i learned because, funnily enough, it was a lot more straightforward to me than HLLs like C/C++.
i got confused by the different names for the same types, the casting, and other cryptic text available to me.

i didn't have a good enough book or access to proper resources, so assembly was what i learned.

C doesn't fix the size of most data types; it depends on the compiler and the target.
With 32-bit Windows compilers, an int is 4 bytes (a DWORD), and a pointer is just another DWORD holding the address of some data.

generate_struct STRUCT
      foo      DWORD   ?
      bar      DWORD   ?
generate_struct ENDS

that's how the structure would look in assembly. a C double is 8 bytes, so it would be a QWORD (or REAL8) in MASM.
(correct me if i'm wrong guys, i only know the main types in MASM)
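
To tie the STRUCT above back to the original question about member offsets, here is a minimal C sketch. It assumes the 32-bit layout described in this thread (4-byte int, 4-byte pointer); generate_struct32 and load_dword are hypothetical names made up for illustration, and the pointer member is modelled as a 32-bit integer so the layout is the same on any host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 32-bit model of generate_struct (not from the thread). */
typedef struct {
    int32_t  foo;  /* int: 4 bytes, at offset 0 */
    uint32_t bar;  /* unsigned char*: 4-byte address, at offset 4 */
} generate_struct32;

/* Rough equivalent of `mov eax, [base + offset]`: read 4 bytes. */
static uint32_t load_dword(const void *base, size_t offset) {
    uint32_t value;
    assert(offset + sizeof value <= sizeof(generate_struct32));
    memcpy(&value, (const uint8_t *)base + offset, sizeof value);
    return value;
}
```

Under these assumptions, offsetof(generate_struct32, foo) is 0 and offsetof(generate_struct32, bar) is 4, so with the structure address in EAX the members would be read from [EAX] and [EAX+4].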

this

   char buffer[32];

could become

buffer   BYTE   32   dup   (?)

an array of characters..

char   string[]="ABCDE";

string   BYTE   "ABCDE",00h

the prototype for

int generate(generate_struct *g, int x, int y, long *res);

could be

generate PROTO :DWORD, :DWORD, :DWORD, :DWORD

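; ebx and esi must be preserved for the caller, so let MASM save/restore them
generate PROC uses ebx esi g:DWORD, x:DWORD, y:DWORD, res:DWORD

      mov   esi, dword ptr [g]                          ; esi = g (pointer to the structure)
      mov   eax, dword ptr [esi][generate_struct.foo]   ; eax = g->foo
      mov   ebx, dword ptr [esi][generate_struct.bar]   ; ebx = g->bar
      ....
      ret
generate ENDP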

just keep in mind that the majority of parameters on the 32-bit stack are 4 bytes, or 1 DWORD, unless you're dealing with floating point numbers
or MMX/SSE/2/3 data types.

if i were to call the generate routine with old-style syntax (plain push/call)

.data

g_ptr   generate_struct   <?>

x_num   DWORD   1
y_num   DWORD   2

lpRes     DWORD   ?

.code
 ..........

    push   offset lpRes
    push   dword ptr [y_num]
    push   dword ptr [x_num]
    push   offset g_ptr
    call   generate

   ........

with INVOKE

   invoke generate, addr g_ptr, x_num, y_num, addr lpRes

Seb · October 16, 2005, 01:04:44 AM

Thanks a lot for your answer, Kernel_Gaddafi! Even though it wasn't exactly what I meant, it still cleared up a few things for me. Let's say I want to access a variable that the caller has pushed onto the stack; how do I find it? I've noticed that some programs use, e.g., "[ESP+4]". What does the 4 mean? Is 4 the size of the variable type (pointer/DWORD)? Does alignment make any difference?

This is a good example to show you what I mean. This small code snippet is from the Monkey's Audio SDK:

;
; void  Adapt ( short* pM, const short* pAdapt, int nDirection, int nOrder )
;
;   [esp+16]    nOrder
;   [esp+12]    nDirection
;   [esp+ 8]    pAdapt
;   [esp+ 4]    pM
;   [esp+ 0]    Return Address

            align 16
proc        Adapt

            mov  eax, [esp +  4]                ; pM
            mov  ecx, [esp +  8]                ; pAdapt
            mov  edx, [esp + 16]                ; nOrder
            shr  edx, 4
            ...

How does the author know that "pM" will be located at "[esp+4]" - is it an Assembler rule or what indicates that "4" is the unique number used to find "pM" on the stack pointer? The only thing I can think of is that the size of "pM" is 4 (pointer = DWORD = 4 bytes in 32-bit systems), but is this really true?

Thanks!

Regards,
Seb

hutch-- · October 16, 2005, 01:39:59 AM

Seb,

Knowing where a stack argument is located in memory depends on how you set up the stack. If you use a stack frame as with a normal procedure (push ebp / mov ebp, esp), the first argument starts at [ebp+8]. It gets a bit more complicated if you don't use a stack frame: the first argument starts at [esp+4] on entry, but you have to count any PUSH instructions you execute and add their size to the argument offset, because push and pop change ESP.

If for example you had a proc with no stack frame that has 3 registers preserved,

push ebx
push esi
push edi

You must ADD 12 bytes to the ESP offset of each argument, so the first argument is now at [esp+16] instead of [esp+4].
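
The offset arithmetic above can be written down as a small C sketch. It assumes every argument and every push is one 4-byte DWORD; arg_offset_ebp and arg_offset_esp are hypothetical helper names, not part of any library.

```c
#include <assert.h>

enum { SLOT = 4 };  /* assumed: each push moves ESP by 4 bytes */

/* Offset of argument `index` (0 = first) from EBP after
   `push ebp / mov ebp, esp`: old EBP at +0, return address at +4. */
static int arg_offset_ebp(int index) {
    assert(index >= 0);
    return 2 * SLOT + index * SLOT;
}

/* Offset from ESP with no stack frame, after `pushed` extra 4-byte
   pushes (e.g. saved registers) inside the procedure. */
static int arg_offset_esp(int index, int pushed) {
    assert(index >= 0 && pushed >= 0);
    return SLOT + pushed * SLOT + index * SLOT;
}
```

For example, arg_offset_esp(0, 0) gives 4, arg_offset_esp(0, 3) gives 16 after the three register pushes, and arg_offset_ebp(0) gives 8.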

The other thing that is CRITICAL is to use the form of RET that has a trailing number after it as this corrects the stack for you.

If you have 2 x DWORD arguments pushed onto the stack and are using STDCALL calling convention, at the exit of the procedure you use,

RET 8

to balance the stack.

If you call the procedure using the C calling convention you exit with a RET but correct the stack directly after the calling code with something like,

ADD ESP, 8
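
As a rough illustration of why both conventions leave the stack balanced, the C sketch below tracks only the value of ESP through a call with two DWORD arguments. It is a toy model under that assumption; sim_cpu and the sim_* helpers are made-up names.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: only ESP is tracked, memory contents are ignored. */
typedef struct { uint32_t esp; } sim_cpu;

static void sim_push(sim_cpu *c) { assert(c->esp >= 4); c->esp -= 4; }
static void sim_call(sim_cpu *c) { sim_push(c); }  /* CALL pushes the return address */
static void sim_ret(sim_cpu *c, uint32_t n) { c->esp += 4 + n; }  /* RET n */

/* STDCALL: the callee removes both arguments with RET 8. */
static uint32_t stdcall_two_args(uint32_t esp0) {
    sim_cpu c = { esp0 };
    sim_push(&c); sim_push(&c); sim_call(&c);
    sim_ret(&c, 8);
    return c.esp;
}

/* C convention: plain RET in the callee, ADD ESP, 8 in the caller. */
static uint32_t cdecl_two_args(uint32_t esp0) {
    sim_cpu c = { esp0 };
    sim_push(&c); sim_push(&c); sim_call(&c);
    sim_ret(&c, 0);
    c.esp += 8;  /* add esp, 8 */
    return c.esp;
}
```

Both functions return the ESP value they started with; dropping either the RET 8 or the ADD ESP, 8 would leave ESP 8 bytes too low.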

bozo · October 16, 2005, 01:59:24 AM

Quote: How does the author know that "pM" will be located at "[esp+4]" - is it an Assembler rule or what indicates that "4" is the unique number used to find "pM" on the stack pointer? The only thing I can think of is that the size of "pM" is 4 (pointer = DWORD = 4 bytes in 32-bit systems), but is this really true?

somebody please correct me if i make a mistake, so i don't explain this incorrectly.

the best thing you can do, Seb, is write some assembly code with push/pop instructions and step through it
in a debugger, watching how esp changes with each instruction and after a call to a STDCALL routine.
that's how i came to understand it.

ESP just points to a block of memory..like that returned with HeapAlloc or LocalAlloc
When you push a DWORD on the stack, ESP is reduced by 4 bytes.

You don't have to use PUSH and POP to manage the stack, but it is bad practice
to do it manually.

   push   0
   push   offset szTitle
   push   offset szMessage
   push   0
   call   MessageBoxA

   ; you could write the above code like this:

   sub   esp, 4*4
   and   dword ptr [esp], 0                     ; first argument (hWnd)
   mov   dword ptr [esp + 4], offset szMessage  ; second (lpText)
   mov   dword ptr [esp + 8], offset szTitle    ; third (lpCaption)
   and   dword ptr [esp + 12], 0                ; fourth (uType)
   call   MessageBoxA

   ; the stores don't have to be done in that order. btw: i didn't test that
   ; code, so apologies if it crashes
   ;
   ; when MessageBoxA is called, the return address (the address of the next
   ; instruction) is pushed on the stack
   ; imagine below to be the MessageBoxA routine..

MessageBoxA:
      push   ebp                         ; save ebp on stack (esp decreases by 4)
      mov   ebp, esp                   ; copy esp into ebp to set up the stack frame
      
      ; at this point esp == ebp, so:
      ; dword ptr [ebp + 0] = old EBP
      ; dword ptr [ebp + 4] = return address
      ; dword ptr [ebp + 8] = 0 (hWnd)
      ; dword ptr [ebp + 12] = offset szMessage
      ; dword ptr [ebp + 16] = offset szTitle
      ; dword ptr [ebp + 20] = 0 (uType)
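
To see the layout after the prologue without a debugger, here is a small C simulation of the stack. It is only a sketch: sim_machine, push32, load32 and enter_callee are hypothetical names, the stack is a 64-byte array, and the argument values, return address and old EBP are placeholders supplied by the caller of the sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { STACK_BYTES = 64 };

typedef struct {
    uint8_t  mem[STACK_BYTES]; /* simulated stack memory */
    uint32_t esp;              /* offset into mem; the stack grows downwards */
    uint32_t ebp;
} sim_machine;

static void push32(sim_machine *m, uint32_t v) {
    assert(m->esp >= 4);
    m->esp -= 4;                       /* PUSH first decrements ESP... */
    memcpy(m->mem + m->esp, &v, 4);    /* ...then stores at [ESP] */
}

static uint32_t load32(const sim_machine *m, uint32_t addr) {
    uint32_t v;
    assert(addr + 4 <= STACK_BYTES);
    memcpy(&v, m->mem + addr, 4);
    return v;
}

/* Caller pushes four arguments right to left, CALL pushes the return
   address, then the callee runs `push ebp / mov ebp, esp`. */
static void enter_callee(sim_machine *m, const uint32_t args[4],
                         uint32_t ret_addr, uint32_t old_ebp) {
    m->esp = STACK_BYTES;
    m->ebp = old_ebp;
    for (int i = 3; i >= 0; --i) push32(m, args[i]);
    push32(m, ret_addr);   /* done implicitly by CALL */
    push32(m, m->ebp);     /* push ebp */
    m->ebp = m->esp;       /* mov ebp, esp */
}
```

After enter_callee, load32(&m, m.ebp + 0) is the old EBP, m.ebp + 4 holds the return address, and the four arguments sit at m.ebp + 8, + 12, + 16 and + 20.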

tenkey · October 16, 2005, 04:44:13 AM

Quote from: Seb on October 16, 2005, 01:04:44 AM

;
; void  Adapt ( short* pM, const short* pAdapt, int nDirection, int nOrder )
;
;   [esp+16]    nOrder
;   [esp+12]    nDirection
;   [esp+ 8]    pAdapt
;   [esp+ 4]    pM
;   [esp+ 0]    Return Address

            align 16
proc        Adapt

            mov  eax, [esp +  4]                ; pM
            mov  ecx, [esp +  8]                ; pAdapt
            mov  edx, [esp + 16]                ; nOrder
            shr  edx, 4
            ...

How does the author know that "pM" will be located at "[esp+4]" - is it an Assembler rule or what indicates that "4" is the unique number used to find "pM" on the stack pointer? The only thing I can think of is that the size of "pM" is 4 (pointer = DWORD = 4 bytes in 32-bit systems), but is this really true?

What you are seeing is the result of calling conventions. If you want to mix assembly with high-level languages, you follow the conventions.

We know from the way CALL works, that when a subroutine starts, the last item on the stack is the return address, and it is a DWORD (occupying 4 bytes).

Everything else is convention. We know that Win32 C compilers, by convention, define int and pointer as 4-byte values in Win32. We know, by compiler convention, that arguments are put on the stack. We know, by compiler convention, that each argument will occupy 4 bytes or some multiple of that. We know, by compiler convention, in what order they will appear in the stack.
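
The "4 bytes or some multiple of that" rule can be sketched in C as follows. This assumes the usual 32-bit C convention where each argument is rounded up to whole 4-byte slots and the first argument sits right above the return address; slot_size and arg_offsets are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>

/* Round an argument's size up to a whole number of 4-byte stack slots. */
static size_t slot_size(size_t bytes) {
    assert(bytes > 0);
    return (bytes + 3) & ~(size_t)3;
}

/* Fill offsets[i] with the [esp+N] offset of argument i on entry to the
   callee (before any pushes); [esp+0] holds the return address. */
static void arg_offsets(const size_t *sizes, size_t count, size_t *offsets) {
    size_t next = 4;  /* skip the 4-byte return address */
    for (size_t i = 0; i < count; ++i) {
        offsets[i] = next;
        next += slot_size(sizes[i]);
    }
}
```

For Adapt, whose four arguments are all 4 bytes, this yields 4, 8, 12 and 16, matching the comment block in the Monkey's Audio snippet. For a hypothetical (char, double, int) argument list it yields 4, 8 and 16.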

In assembly language code, we do not need to follow these conventions. Some of us follow these conventions because it's consistent, well understood, and allows us to combine code modules without demanding to know how the other modules handle argument passing. Those of us who don't either want to be unconventional, or don't want the overhead associated with the conventions. (or don't know there are conventions)
