confused about ebp and esp

Started by joemc, March 04, 2010, 08:40:17 AM

joemc

From what I have read:
ebp = frame pointer
esp = stack pointer

esp is simple enough for me to understand. It decreases as the stack grows.

ebp seems to be a pointer to the location on the stack where the function started (with the normal prologue and epilogue), so that you can refer to local variables from a single fixed position.

Now finally to my questions :) Since all local variables have to be defined at the beginning of a function, they are easy to locate relative to ebp. But shouldn't it be fairly simple to locate them relative to esp as well? The only time I can see that they would not be obvious to find is when values are pushed inside one conditional block and not popped until a different one. Or are there cases that I am just not thinking of? I am asking because I would like to just use prologue and epilogue none, since I imagine I can keep track of local variables relative to ESP fairly easily.

jj2007

What you propose is possible, but complex and rarely worth the effort, both speed- and size-wise. Below is a "simple" example; attached is a more elaborate way to do it ebp-less using macros.

OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE

; usage: invoke MyProc, chr$("Arg1"), chr$("Arg2"), 123
; if you want to use invoke MyProc, ..., you need this line:
; MyProc   PROTO :DWORD, :DWORD, :DWORD

MyProc proc arg1_:DWORD, arg2_:DWORD, arg3_:DWORD
args=   3
savedregs=   4
EspOff equ   esp+4*savedregs
arg1 equ   [EspOff+1*4]
arg2 equ   [EspOff+2*4]
arg3 equ   [EspOff+3*4]
   push edi      ; all registers preserved, except eax ecx edx
   push esi
   push ebx
   push ebp      ; change savedregs if you do not need ebp

   ; int 3      ; you may check with Olly what you get here; do not trust Olly's arg.x
   mov edi, arg1   ; e.g. lpDest
   mov esi, arg2   ; e.g. lpSrc
   mov ebx, arg3   ; e.g. count
;   print edi, 13, 10
;   print esi, 13, 10
;   print str$(ebx), 13, 10
;   mov ebp, 12345h

   pop ebp
   pop ebx
   pop esi         ; all registers preserved, except eax ecx edx
   pop edi
   ret 4*args
MyProc endp
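
To make the offset arithmetic in the equates above concrete, here is a minimal C sketch (not part of the original post) that mirrors EspOff and the argN equates. The helper names arg_offset, ret_bytes and check_myproc_layout are made up for illustration, and the sketch assumes 4-byte stack slots with nothing on the stack beyond the saved registers.

```c
#include <assert.h>

enum { SLOT = 4 };  /* size of one 32-bit stack slot in bytes */

/* Byte offset from ESP to argument n (1-based) once `savedregs`
   registers have been pushed and nothing else is on the stack.
   Layout going up from ESP: saved registers, return address, arguments. */
static unsigned arg_offset(unsigned savedregs, unsigned n)
{
    unsigned esp_off = SLOT * savedregs;  /* mirrors EspOff: offset of the return address */
    return esp_off + SLOT * n;            /* mirrors [EspOff + n*4] */
}

/* Operand for `ret`: bytes of arguments the callee removes from the stack. */
static unsigned ret_bytes(unsigned args)
{
    return SLOT * args;
}

/* Sanity check for the layout in the post: 4 saved registers, 3 arguments. */
static void check_myproc_layout(void)
{
    assert(arg_offset(4, 1) == 20);  /* arg1 = [esp+20] */
    assert(arg_offset(4, 2) == 24);  /* arg2 = [esp+24] */
    assert(arg_offset(4, 3) == 28);  /* arg3 = [esp+28] */
    assert(ret_bytes(3) == 12);      /* ret 12 */
}
```

With savedregs = 4, arg1 lands at [esp+20]. Dropping the push ebp / pop ebp pair means changing savedregs to 3, which moves arg1 to [esp+16]; that is why the equates are derived from a single constant.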

hutch--

Joe,

It is routine in MASM to write normal stack frames, and in places where it matters you can write procedures that have no stack frame. This usually applies to procedures in that twilight zone: too big to inline directly, yet small enough that the stack frame overhead still matters. One of the main reasons for writing a no-stack-frame procedure is to get the extra register EBP, which can be very useful in high-speed algos, but it comes at a price: you must track the stack carefully across pushes, pops and function calls.

For every 32-bit PUSH, ESP drops by 4 bytes, so the ESP-relative offset of every argument and local goes up by 4; for every POP it drops back by 4. This can make for highly UNintuitive code to write. You can justify the extra work for regularly re-used code, but it's often not worth the effort for a single procedure.
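
As a rough illustration of that bookkeeping (a sketch added for clarity, not code from this thread), the following C model simulates only the two pointers. The Cpu struct, the helper names and the starting ESP value are all made up; the point is that the EBP-relative offset of an argument stays fixed while the ESP-relative one moves with every push and pop.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t esp;   /* simulated stack pointer */
    uint32_t ebp;   /* simulated frame pointer */
} Cpu;

static void push32(Cpu *c) { c->esp -= 4; }  /* PUSH: ESP moves down 4 bytes */
static void pop32(Cpu *c)  { c->esp += 4; }  /* POP: ESP moves back up 4 bytes */

/* Displacement you would have to write to reach a fixed stack address. */
static uint32_t esp_offset(const Cpu *c, uint32_t addr) { return addr - c->esp; }
static uint32_t ebp_offset(const Cpu *c, uint32_t addr) { return addr - c->ebp; }

static void demo_tracking(void)
{
    Cpu c = { 0x0012FF00u, 0 };          /* arbitrary, made-up ESP at procedure entry */
    uint32_t arg1 = c.esp + 4;           /* first argument sits just above the return address */

    assert(esp_offset(&c, arg1) == 4);   /* [esp+4] on entry */

    push32(&c);                          /* push ebp */
    c.ebp = c.esp;                       /* mov ebp, esp */
    assert(ebp_offset(&c, arg1) == 8);   /* [ebp+8] from here on */
    assert(esp_offset(&c, arg1) == 8);

    push32(&c);                          /* push esi */
    push32(&c);                          /* push edi */
    assert(esp_offset(&c, arg1) == 16);  /* ESP-relative offset grew by 8 */
    assert(ebp_offset(&c, arg1) == 8);   /* EBP-relative offset did not move */

    pop32(&c);                           /* pop edi */
    pop32(&c);                           /* pop esi */
    assert(esp_offset(&c, arg1) == 8);   /* back where it was */
}
```

In a frameless procedure there is no ebp_offset to fall back on, so every [esp+N] has to be written for the push depth at that exact point in the code.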
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

clive

This technique is referred to by Microsoft as Frame Pointer Omission (FPO). It works well in C because the compiler keeps track of the stack. In some cases it will let pushed parameters build up on the stack and then recover the space later in a single operation. Personally I would not use this method; it requires a lot of unnecessary work to keep track of things that the compiler/assembler would normally do for you. If you need/want to use EBP in a section of code that is not referencing local variables, you can simply push it onto the stack and pop it later. If you use EBP in an FPO routine you *still* need to preserve it, because the calling function presumes EBP, EBX, ESI and EDI are preserved across the call.

Here is an example of what the C compiler does when generating code with and without the frame pointer. Omitting it saves you 2 bytes overall, because the [esp+x] encoding is longer than [ebp+x]. For a routine processing a few hundred bytes, the benefit would be approaching zero.

-Clive

WORD CRC16_C(DWORD Size, BYTE *Buffer)
{
  WORD Crc; // 16-bits

  static const WORD CrcTable[16]= { // Don't need to copy constants to stack
    0x0000,0x1021,0x2042,0x3063,0x4084,0x50A5,0x60C6,0x70E7,
    0x8108,0x9129,0xA14A,0xB16B,0xC18C,0xD1AD,0xE1CE,0xF1EF };

  Crc = 0;

  while(Size--) // For all bytes in the buffer
  {
    Crc = Crc ^ (*Buffer++ << 8); // Apply the data once, all 8-bits, xor's will cascade

    Crc = (Crc << 4) ^ CrcTable[Crc >> 12]; // Presumes 16-bit register, shift providing 4-bit masking

    Crc = (Crc << 4) ^ CrcTable[Crc >> 12]; // Next 4-bits
  }

  return(Crc);
}
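
For anyone who wants to convince themselves that the nibble table is doing what the comments say, here is a small cross-check sketch (my addition, not Clive's code). It restates the routine with fixed-width types so it does not depend on the Windows typedefs, and compares it against a bit-at-a-time version. The polynomial 0x1021 with an initial value of 0 is inferred from the table entries; the function names are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Nibble-table version: same table and same three steps per byte as CRC16_C. */
static uint16_t crc16_nibble(const uint8_t *buf, size_t size)
{
    static const uint16_t table[16] = {
        0x0000,0x1021,0x2042,0x3063,0x4084,0x50A5,0x60C6,0x70E7,
        0x8108,0x9129,0xA14A,0xB16B,0xC18C,0xD1AD,0xE1CE,0xF1EF };
    uint16_t crc = 0;

    while (size--) {
        crc = (uint16_t)(crc ^ ((uint16_t)*buf++ << 8));  /* apply all 8 data bits */
        crc = (uint16_t)((crc << 4) ^ table[crc >> 12]);  /* high nibble */
        crc = (uint16_t)((crc << 4) ^ table[crc >> 12]);  /* low nibble */
    }
    return crc;
}

/* Bit-at-a-time reference: polynomial 0x1021, initial value 0, MSB first. */
static uint16_t crc16_bitwise(const uint8_t *buf, size_t size)
{
    uint16_t crc = 0;

    while (size--) {
        crc = (uint16_t)(crc ^ ((uint16_t)*buf++ << 8));
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000u)
                crc = (uint16_t)((crc << 1) ^ 0x1021u);
            else
                crc = (uint16_t)(crc << 1);
        }
    }
    return crc;
}

/* Both versions should agree on any input; two short samples are checked here. */
static void check_crc16(void)
{
    static const uint8_t digits[] = "123456789";
    static const uint8_t bytes[]  = { 0x00, 0xFF, 0x80, 0x01, 0x7E };

    assert(crc16_nibble(digits, 9) == crc16_bitwise(digits, 9));
    assert(crc16_nibble(bytes, sizeof bytes) == crc16_bitwise(bytes, sizeof bytes));
}
```

The explicit uint16_t casts stand in for the "presumes 16-bit register" comment in the original: C promotes the operands to int before shifting, so the truncation back to 16 bits is what masks off the bits shifted out of the top.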


With -Ox -Oy- Disable Frame Pointer Omission

00406CE0                    _CRC16_C:                   ; Xref 0040100F
00406CE0 55                     push    ebp
00406CE1 8BEC                   mov     ebp,esp
00406CE3 8B4D08                 mov     ecx,[ebp+8]
00406CE6 33C0                   xor     eax,eax
00406CE8 8BD1                   mov     edx,ecx
00406CEA 49                     dec     ecx
00406CEB 85D2                   test    edx,edx
00406CED 744A                   jz      loc_00406D39
00406CEF 56                     push    esi
00406CF0 8D7101                 lea     esi,[ecx+1]
00406CF3 8B4D0C                 mov     ecx,[ebp+0Ch]
00406CF6 57                     push    edi
00406CF7                    loc_00406CF7:               ; Xref 00406D35
00406CF7 33D2                   xor     edx,edx
00406CF9 8A31                   mov     dh,[ecx]
00406CFB 33C2                   xor     eax,edx
00406CFD 41                     inc     ecx
00406CFE 8BD0                   mov     edx,eax
00406D00 8BF8                   mov     edi,eax
00406D02 81E2FFFF0000           and     edx,0FFFFh
00406D08 C1EA0C                 shr     edx,0Ch
00406D0B C1E704                 shl     edi,4
00406D0E 668B04551CA04000       mov     ax,[off_0040A01C+edx*2]
00406D16 6633C7                 xor     ax,di
00406D19 8BD0                   mov     edx,eax
00406D1B 81E2FFFF0000           and     edx,0FFFFh
00406D21 C1EA0C                 shr     edx,0Ch
00406D24 C1E004                 shl     eax,4
00406D27 668B14551CA04000       mov     dx,[off_0040A01C+edx*2]
00406D2F 6633D0                 xor     dx,ax
00406D32 4E                     dec     esi
00406D33 8BC2                   mov     eax,edx
00406D35 75C0                   jnz     loc_00406CF7
00406D37 5F                     pop     edi
00406D38 5E                     pop     esi
00406D39                    loc_00406D39:               ; Xref 00406CED
00406D39 5D                     pop     ebp
00406D3A C3                     ret

0040A01C                    off_0040A01C:               ; Xref 00406D0E 00406D27
0040A01C 00 00 21 10 42 20 63 30 - 84 40 A5 50 C6 60 E7 70  ..!.B c0.@.P.`.p
0040A02C 08 81 29 91 4A A1 6B B1 - 8C C1 AD D1 CE E1 EF F1  ..).J.k.........


With -Ox (-Ogityb1 /Gs) Enable Frame Pointer Omission (FPO)

00401140                    _CRC16_C:                   ; Xref 0040100F
00401140 8B4C2404               mov     ecx,[esp+4]
00401144 33C0                   xor     eax,eax
00401146 8BD1                   mov     edx,ecx
00401148 49                     dec     ecx
00401149 85D2                   test    edx,edx
0040114B 744B                   jz      loc_00401198
0040114D 56                     push    esi
0040114E 8D7101                 lea     esi,[ecx+1]
00401151 8B4C240C               mov     ecx,[esp+0Ch]
00401155 57                     push    edi
00401156                    loc_00401156:               ; Xref 00401194
00401156 33D2                   xor     edx,edx
00401158 8A31                   mov     dh,[ecx]
0040115A 33C2                   xor     eax,edx
0040115C 41                     inc     ecx
0040115D 8BD0                   mov     edx,eax
0040115F 8BF8                   mov     edi,eax
00401161 81E2FFFF0000           and     edx,0FFFFh
00401167 C1EA0C                 shr     edx,0Ch
0040116A C1E704                 shl     edi,4
0040116D 668B04551CA04000       mov     ax,[off_0040A01C+edx*2]
00401175 6633C7                 xor     ax,di
00401178 8BD0                   mov     edx,eax
0040117A 81E2FFFF0000           and     edx,0FFFFh
00401180 C1EA0C                 shr     edx,0Ch
00401183 C1E004                 shl     eax,4
00401186 668B14551CA04000       mov     dx,[off_0040A01C+edx*2]
0040118E 6633D0                 xor     dx,ax
00401191 4E                     dec     esi
00401192 8BC2                   mov     eax,edx
00401194 75C0                   jnz     loc_00401156
00401196 5F                     pop     edi
00401197 5E                     pop     esi
00401198                    loc_00401198:               ; Xref 0040114B
00401198 C3                     ret
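
The 2-byte figure can be read straight off the two listings. The following C sketch (added for illustration, with made-up helper names) just redoes that arithmetic from the addresses and opcode bytes shown above.

```c
#include <assert.h>

/* Code size in bytes, from the address of the first instruction to the
   address of the final one-byte RET, inclusive. */
static unsigned code_size(unsigned first, unsigned last_ret)
{
    return last_ret - first + 1;
}

static void check_sizes(void)
{
    unsigned with_frame = code_size(0x00406CE0u, 0x00406D3Au);
    unsigned with_fpo   = code_size(0x00401140u, 0x00401198u);

    /* Frame setup and teardown: push ebp (1) + mov ebp,esp (2) + pop ebp (1). */
    unsigned frame_cost = 1 + 2 + 1;

    /* The two argument loads grow from 3 bytes (8B4D08, [ebp+8]) to
       4 bytes (8B4C2404, [esp+4]) each. */
    unsigned esp_cost = 2 * (4 - 3);

    assert(with_frame == 91);
    assert(with_fpo == 89);
    assert(with_frame - with_fpo == frame_cost - esp_cost);  /* 4 - 2 = 2 bytes */
}
```

So FPO removes 4 bytes of frame handling but gives 2 of them back in longer [esp+x] encodings, and the loop body itself is byte-for-byte the same in both builds.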

joemc

Thank you for all 3 replies; put together, they cover everything I could ever need to know :) Unfortunately this is not the only reason a C compiler will write better code than me, and probably not the most important one. I will try to catch up to it :)

hutch--

Joe,

It just comes with practice, but be aware that some algorithms hit the speed limit imposed by memory well before they reach their most efficient encoding. Compilers get there a lot of the time because of this factor. With practice you will learn what can and can't be improved, and put your work where it matters.