Generation of push ecx instruction after the prolog of function

Started by Vineel Kumar Reddy Kovvuri, January 09, 2012, 04:48:56 PM


Antariy

Also, sometimes it is very funny to see certain pieces of code. For example, MSVC++ passes the object pointer (this) to class member functions via ECX, and such a function may start with code like this:


push ebp
mov ebp,esp
push ecx          ; ECX is the ptr to the object; the push also reserves [ebp-4]
mov [ebp-4],ecx   ; very cool... writes the same value into the slot the push just filled


Maybe this is a limitation of optimized code generation due to EH (exception handling), which needs the [ebp-4] slot when it is present.
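A toy model can make the redundancy visible. The Python sketch below is not compiler output; it only simulates the four instructions on a made-up stack (the starting ESP/EBP values and the name `run_prologue` are assumptions for illustration) and records what sits at [ebp-4] before and after the `mov`.

```python
def run_prologue(ecx, esp=0x1000, ebp=0x2000):
    """Simulate the four-instruction prologue on a dict-based stack.

    esp/ebp defaults are arbitrary illustrative values, not real addresses.
    """
    mem = {}

    # push ebp
    esp -= 4
    mem[esp] = ebp

    # mov ebp, esp
    ebp = esp

    # push ecx  (reserves the local slot AND stores ECX in it)
    esp -= 4
    mem[esp] = ecx
    slot_after_push = mem[ebp - 4]

    # mov [ebp-4], ecx  (writes the same value to the same slot)
    mem[ebp - 4] = ecx
    slot_after_mov = mem[ebp - 4]

    return esp, ebp, slot_after_push, slot_after_mov


esp, ebp, after_push, after_mov = run_prologue(0xDEADBEEF)
print(hex(after_push), hex(after_mov), esp == ebp - 4)
```

In this model the slot already holds ECX after the push, so the store changes nothing as long as ECX is not modified between the two instructions.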



Intel(R) Celeron(R) CPU 2.13GHz (SSE3)
1281    cycles for 100*MSVC
1502    cycles for 100*ByHand

1285    cycles for 100*MSVC
1499    cycles for 100*ByHand

clive

Quote from: jj2007
In any case the push ecx in Clive's disassembled C example does ... nothing, absolutely nothing useful

Indeed, it was a minimal, contrived test case that generates the PUSH ECX the OP asked about. In fact, the compiler can also generate a pair of PUSH ECX instructions instead of an ADD ESP,-8.

The OP in fact had an example where the space for the local variable was created by the PUSH ECX and subsequently initialized. As I also noted, that might be achieved more efficiently with an immediate PUSH of the constant itself.

The point was that it is a fairly common code construct, even if others have never seen it before.
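To put rough numbers on the size argument, here is a small Python sketch that counts bytes for the standard 32-bit encodings of the instructions mentioned. The byte values are written out by hand rather than produced by an assembler, and the constant 5 is a hypothetical stand-in, since the OP's actual constant is not shown here.

```python
# Hand-written 32-bit x86 encodings (not assembler output).
PUSH_ECX = bytes([0x51])                    # push ecx
SUB_ESP_4 = bytes([0x83, 0xEC, 0x04])       # sub esp, 4
ADD_ESP_M8 = bytes([0x83, 0xC4, 0xF8])      # add esp, -8
# Hypothetical constant 5; the short 6A form only fits values -128..127.
PUSH_5 = bytes([0x6A, 0x05])                # push 5
MOV_LOCAL_5 = bytes([0xC7, 0x45, 0xFC,      # mov dword ptr [ebp-4], 5
                     0x05, 0x00, 0x00, 0x00])

sizes = {
    "push ecx": len(PUSH_ECX),
    "sub esp, 4": len(SUB_ESP_4),
    "push ecx x2": len(PUSH_ECX * 2),
    "add esp, -8": len(ADD_ESP_M8),
    "push ecx + mov [ebp-4], 5": len(PUSH_ECX + MOV_LOCAL_5),
    "push 5": len(PUSH_5),
}

for name, size in sizes.items():
    print(f"{size} bytes  {name}")
```

Under these assumptions the pushes are the shorter way to reserve stack space, and pushing a small constant directly is shorter still than reserving the slot and then storing into it.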
It could be a random act of randomness. Those happen a lot as well.

jj2007

Clive,
No offense intended, and sorry that I hijacked this to demonstrate that compilers lack intelligence :bg
A propos,
   mov esp, ebp
   pop ebp

equals
   leave
Two bytes shorter, same speed, at least on a P4.
:thumbu
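The "two bytes shorter" figure can be checked by counting the standard encodings. A minimal Python sketch, with the byte values written out by hand (an assembler may pick the alternative 89 EC encoding for the mov, which has the same length):

```python
# Hand-written 32-bit x86 encodings of the two epilogues.
MOV_ESP_EBP = bytes([0x8B, 0xE5])  # mov esp, ebp
POP_EBP = bytes([0x5D])            # pop ebp
LEAVE = bytes([0xC9])              # leave

manual_epilogue = MOV_ESP_EBP + POP_EBP
bytes_saved = len(manual_epilogue) - len(LEAVE)

print(len(manual_epilogue), len(LEAVE), bytes_saved)
```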

clive

OK, I was trying to understand the goal of timing it; you'd probably want to examine a more practical example.

The whole ENTER/LEAVE concept has become quite murky, and I've not looked recently at how it performs across different micro-architectures. I've not even seen the more complex ENTER forms in a very long while.

dedndave

i think ENTER is slow enough that it is better to "manually" create the stack frame
LEAVE, on the other hand makes for a nice little shortcut
if you have no locals and a balanced stack, you can just POP EBP, though   :P
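The POP EBP shortcut depends on ESP already being equal to EBP at the end of the function. The Python sketch below is a toy model of that condition; the register values and the name `call_frame` are assumptions for illustration, and it says nothing about timing.

```python
def call_frame(locals_bytes, epilogue):
    """Return True if the epilogue restores ESP and EBP to their entry values.

    epilogue is "leave" (mov esp, ebp + pop ebp) or "pop" (pop ebp only).
    Entry values are arbitrary illustrative numbers.
    """
    entry_esp, caller_ebp = 0x1000, 0x2000
    mem = {}
    esp, ebp = entry_esp, caller_ebp

    # Prologue: push ebp / mov ebp, esp / reserve locals
    esp -= 4
    mem[esp] = ebp
    ebp = esp
    esp -= locals_bytes

    # Epilogue
    if epilogue == "leave":
        esp = ebp                 # the "mov esp, ebp" half of leave
    ebp = mem.get(esp)            # pop ebp (None models an unwritten local slot)
    esp += 4

    return esp == entry_esp and ebp == caller_ebp


for n in (0, 8):
    print(n, call_frame(n, "leave"), call_frame(n, "pop"))
```

In this model the bare pop is only safe with zero bytes of locals; once locals are reserved, it pops a local slot instead of the saved EBP.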

jj2007

Clive & Dave,
For you:
Intel(R) Celeron(R) M CPU        420  @ 1.60GHz (SSE3)
1203    cycles for 100*TestFrames Enter
600     cycles for 100*TestFrames Leave
1203    cycles for 100*TestFrames Enter & Leave
499     cycles for 100*TestFrames Push & Pop

1203    cycles for 100*TestFrames Enter
600     cycles for 100*TestFrames Leave
1204    cycles for 100*TestFrames Enter & Leave
499     cycles for 100*TestFrames Push & Pop

12      bytes for TestFramesE
9       bytes for TestFramesL
10      bytes for TestFramesEL
11      bytes for TestFramesPP


Which confirms Dave's view: avoiding Enter is a no-brainer, while Leave is quite OK unless you have the strange habit of calling time-critical code in an innermost loop instead of inlining it. On my Celeron, Leave costs about one cycle more per call than Push & Pop (600 vs. 499 cycles per 100 calls); on the P4 the difference is zero cycles.
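For readers who want the per-call numbers, here is a short Python sketch of the arithmetic on the Celeron M figures above. It assumes each figure covers exactly 100 calls and that any fixed overhead is the same for all variants, so only the differences against Push & Pop are compared.

```python
# Cycle counts for 100 calls, copied from the first Celeron M run above.
cycles_per_100 = {
    "Enter": 1203,
    "Leave": 600,
    "Enter & Leave": 1203,
    "Push & Pop": 499,
}

baseline = cycles_per_100["Push & Pop"]

# Extra cycles per single call relative to the manual Push & Pop frame.
extra_per_call = {
    name: (cycles - baseline) / 100
    for name, cycles in cycles_per_100.items()
}

for name, extra in extra_per_call.items():
    print(f"{name:14s} {extra:+.2f} cycles per call")
```

On these figures Enter adds roughly seven cycles per call and Leave roughly one.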