Generation of push ecx instruction after the prolog of function

Vineel Kumar Reddy Kovvuri · January 09, 2012, 04:48:56 PM

Hi,

Many times I have seen "push ecx" generated right after the prolog of a function,
and the slot this instruction creates on the stack is then accessed as [ebp - 4] for a local variable.
My question: is there a special reason for this instruction, or is it generated
as shorthand for making space for the local variable (instead of using sub esp, 4)?

Thanks in advance.

dedndave · January 09, 2012, 04:58:28 PM

Quote: it is generated as shorthand for making space for the local variable (instead of using sub esp, 4)

:U

which brings up a trick...
initialize ECX, and you have initialized your local :bg

donkey · January 09, 2012, 05:39:27 PM

The PUSH ECX is (usually) the result of a USES ECX directive in the PROC declaration. It preserves the value of ECX across calls to the procedure; a matching POP ECX is added to the RET macro.

dedndave · January 09, 2012, 05:56:05 PM

read the post carefully, Edgar :P

donkey · January 09, 2012, 06:17:59 PM

Ah, missed that. I once did that to pass a result in EAX: I put in a USES EAX, then pointed the RESULT label to that space on the stack. I thought it was pretty cool, but it was pretty much pointless. If you only need 4 bytes of stack space you might save a couple of bytes with the push, but it's hardly worth it.

NoCforMe · January 09, 2012, 06:24:54 PM

Quote: PUSH ECX

Are you sure about this? I've never seen this instruction generated in my code. Under what conditions is this generated?

I can't even see what the purpose of this would be, unless you're using ECX in the subroutine. ECX has nothing to do with the stack frame--that's what EBP is used for, no? Or am I missing something here?

clive · January 09, 2012, 06:49:12 PM

Quote from: NoCforMe
Are you sure about this? I've never seen this instruction generated in my code. Under what conditions is this generated?

No Soup for you, it's something MSVC does, without optimization even, and thus would be present in just about any commercial application or driver you care to look at.

Quote:
#include <stdio.h>

int test(void)
{
    int i;

    return(i);
}

int main(int argc, char **argv)
{
    test();

    return(1);
}

Quote: Disassembly

00000030 _test: ; Xref 0000003E
00000030 55 push ebp
00000031 8BEC mov ebp,esp
00000033 51 push ecx
00000034 8B45FC mov eax,[ebp-4]
00000037 8BE5 mov esp,ebp
00000039 5D pop ebp
0000003A C3 ret

0000003B _main:
0000003B 55 push ebp
0000003C 8BEC mov ebp,esp
0000003E E8EDFFFFFF call _test
00000043 B801000000 mov eax,1
00000048 5D pop ebp
00000049 C3 ret

NoCforMe · January 09, 2012, 07:08:11 PM

Ah, I see. Not generated by any assembler, and just a way of moving the stack pointer.

So is a PUSH cheaper/faster than a SUB of SP? I guess I'll have to look at Hutch's opcode help file to find out.

Can I get my soup now?

Vineel Kumar Reddy Kovvuri · January 09, 2012, 07:12:47 PM

Quote from: donkey on January 09, 2012, 05:39:27 PM
The PUSH ECX is (usually) the result of a USES ECX directive in the PROC declaration. It preserves the value of ECX across calls to the procedure; a matching POP ECX is added to the RET macro.

There is no POP ECX in the generated code in the function

_main	PROC
	push	ebp
	mov	ebp, esp
	push	ecx
	mov	DWORD PTR [ebp-4], 4660		; 00001234H
	mov	eax, 22136					; 00005678H
	mov	esp, ebp
	pop	ebp
	ret	0

for

int main()
{
	int x = 0x1234;
	return 0x5678;
}

compiled as

cl /Zi /GS- test.c /Fasc

dedndave · January 09, 2012, 07:16:58 PM

the MASM epilogue generally uses LEAVE, thus no balancing POP is required
if you guys are depending on this thread for lunch, you'll go hungry :(

clive · January 09, 2012, 07:20:27 PM

Quote from: Vineel Kumar Reddy Kovvuri
There is no POP ECX in the generated code in the function

No, because the stack frame is effectively collapsed with the MOV ESP,EBP and the content of local/automatic variables is lost as the scope disappears.

Now if the code used EBX,ESI,EDI, you might find it does something different.

You'll also note that LEAVE doesn't track what the prologue code, or ENTER, does.
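
A rough way to convince yourself of this is a toy model in C. This is only a sketch, not real machine state: the `ToyCpu` struct and the `toy_run_frame` function are hypothetical names made up for this illustration. It replays the prologue and epilogue from the listing above and checks that ESP ends up where it started even though there is no POP ECX.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of the registers involved; an assumption for illustration only. */
typedef struct {
    uint32_t esp;
    uint32_t ebp;
    uint32_t ecx;
    uint32_t mem[64];            /* fake stack, indexed by address / 4 */
} ToyCpu;

static void push32(ToyCpu *c, uint32_t v)
{
    c->esp -= 4;                 /* the stack grows downwards */
    c->mem[c->esp / 4] = v;
}

static uint32_t pop32(ToyCpu *c)
{
    uint32_t v = c->mem[c->esp / 4];
    c->esp += 4;
    return v;
}

/* Replays: push ebp / mov ebp,esp / push ecx / mov esp,ebp / pop ebp.
   Returns the value of ESP just before RET. */
uint32_t toy_run_frame(uint32_t initial_esp, uint32_t initial_ebp)
{
    /* Keep the fake addresses inside mem[] and 4-byte aligned. */
    assert(initial_esp >= 8 && initial_esp <= 256 && initial_esp % 4 == 0);

    ToyCpu c = { .esp = initial_esp, .ebp = initial_ebp, .ecx = 0xDEADBEEFu };

    push32(&c, c.ebp);           /* push ebp                         */
    c.ebp = c.esp;               /* mov  ebp, esp                    */
    push32(&c, c.ecx);           /* push ecx: reserves [ebp-4]       */
    assert(c.esp == c.ebp - 4);  /* the local now lives at [ebp-4]   */

    c.esp = c.ebp;               /* mov  esp, ebp: drops the local   */
    c.ebp = pop32(&c);           /* pop  ebp                         */
    assert(c.ebp == initial_ebp);

    return c.esp;
}
```

Calling `toy_run_frame(128, 200)` returns 128: the MOV ESP,EBP throws away whatever was pushed after the frame was set up, so the slot created by PUSH ECX needs no balancing POP.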

jj2007 · January 09, 2012, 08:21:44 PM

I took the liberty of timing this MSVC-optimised code.

Intel(R) Celeron(R) M CPU        420  @ 1.60GHz (SSE3)
499     cycles for 100*MSVC
500     cycles for 100*ByHand

499     cycles for 100*MSVC
500     cycles for 100*ByHand

By the way: "ByHand" is exactly one byte shorter. Guess why :bg

clive · January 09, 2012, 09:41:15 PM

Quote from: dedndavewhich brings up a trick... initialize ECX, and you have initialized your local

Or PUSH 0 or PUSH 01234h, which might be even cleverer. Larger tables would have to be pushed in reverse order, of course.

Though typically what C will do is allocate the space, and then copy in the initializers. This can get highly inefficient, with say a large table of constants like CRC tables, where the programmer really should have chosen static const so it would be stored in the code section or in a ROM.
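
To make the difference concrete, here is a minimal sketch. The function names `lookup_auto` and `lookup_static` and the four-entry table are made up for this example, and what a given compiler actually emits for each version is an assumption that depends on the compiler and its optimization settings.

```c
#include <assert.h>
#include <stdint.h>

/* Automatic array: a compiler will typically reserve stack space and
   copy the initializers into it on every call. */
uint32_t lookup_auto(unsigned i)
{
    uint32_t table[4] = { 0x00000000u, 0x77073096u, 0xEE0E612Cu, 0x990951BAu };
    assert(i < 4);               /* caller must pass a valid index */
    return table[i];
}

/* static const: the table exists once, usually in a read-only section,
   and nothing is copied at run time. */
uint32_t lookup_static(unsigned i)
{
    static const uint32_t table[4] = { 0x00000000u, 0x77073096u, 0xEE0E612Cu, 0x990951BAu };
    assert(i < 4);               /* caller must pass a valid index */
    return table[i];
}
```

Both functions return the same values; only the storage of the table differs, which is easy to see by comparing the assembly listings (for example with /Fa as used earlier in the thread).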

ByHand doesn't maintain the same stack pointer. The example was more of a quick hack to demonstrate a case where PUSH ECX existed, vs the ADD ESP,-4 which would normally occur as the frame is created.

AMD Phenom(tm) II X6 1055T Processor (SSE3)
720     cycles for 100*MSVC
499     cycles for 100*ByHand

710     cycles for 100*MSVC
500     cycles for 100*ByHand

jj2007 · January 09, 2012, 09:56:10 PM

In any case the push ecx in Clive's disassembled C example does ... nothing, absolutely nothing useful:

ByHand:
   push ebp
   mov ebp, esp
   ; push ecx          ; (struck out) you don't need this one
   mov eax, [ebp-4]
if 0
   ; leave             ; (struck out) pardon, too easy and too short :wink
else
   mov esp, ebp
   pop ebp
endif
   ret

Unless you use Dave's trick, but 1. that does not appear to be the purpose of the C example and 2. it could easily be achieved with a mov eax, ecx...

Antariy · January 10, 2012, 02:40:34 AM

Quote from: jj2007 on January 09, 2012, 09:56:10 PM
In any case the push ecx in Clive's disassembled C example does ... nothing, absolutely nothing useful:

Actually, yes, but that was a deliberately "strange" example, just to show the point. The code makes no sense, and the compiler will even complain that the function uses an uninitialized local variable (the result is unpredictable, some kind of random number generator).
But if the code is changed a bit, for example:

int test(void)
{
  int i;
  printf("qwe\n"); // <<<

  return(i);
}

int main(int argc, char **argv)
{
  test();

  return(1);
}

then without "push ecx" no space would be allocated for the local "i": with esp = ebp, [ebp-4] is where the next push lands, which here is the address of the string passed to printf.

push ebp
mov ebp,esp     ; [ebp] = [esp]
; push ecx      ; <<< if remove it from the real working code then
push CTXT("qwe"); [ebp-4] = [esp]
call _printf
mov eax,[ebp-4] ; if printf did not overwrite its argument slot (the string pointer),
mov esp,ebp     ; then eax now holds the pointer to the string "qwe"
pop ebp
ret

I.e. without "push ecx" there is a chance that the local "i" gets overwritten. The compiler, following the simple rule "be as straightforward and robust as possible", just emits this allocation without making any assumptions about the usefulness of the code or of the local variable itself. It could use "sub esp,4" instead, but that is 3 times longer than "push reg" (3 bytes versus 1) :green2
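
For anyone who wants to check the size claim, here is a small sketch that just writes down the 32-bit encodings and compares their lengths. The array names and the `alloc_bytes_saved` helper are made up for this illustration; the `51` byte for push ecx matches the disassembly earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* 32-bit encodings of three ways to reserve 4 bytes of stack. */
static const unsigned char PUSH_ECX[]   = { 0x51 };              /* push ecx    */
static const unsigned char SUB_ESP_4[]  = { 0x83, 0xEC, 0x04 };  /* sub esp, 4  */
static const unsigned char ADD_ESP_M4[] = { 0x83, 0xC4, 0xFC };  /* add esp, -4 */

/* Number of code bytes saved by using the PUSH form. */
size_t alloc_bytes_saved(void)
{
    /* The SUB and ADD forms have the same length. */
    assert(sizeof SUB_ESP_4 == sizeof ADD_ESP_M4);
    return sizeof SUB_ESP_4 - sizeof PUSH_ECX;
}
```

`alloc_bytes_saved()` returns 2, i.e. one byte instead of three per function that needs a single DWORD local.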
