Is there any performance reason to place other instructions between push and pop instructions when possible or, conversely, to keep pushes and pops consecutive when possible?
Oh, and as an aside, does anyone have any recommendations on The Software Optimization Cookbook that Intel publishes? Judging by its table of contents, it seems to push the VTune analyzer, but I'm wondering whether it contains anything that cannot be gleaned from the free Intel docs or other online sources.
Try to pair up unrelated instructions. For example, if on one line you access memory, do a register-only operation on the next line involving different registers.
So, for instance:
xor al,al
pop edi
pop esi
would be better as:
pop edi
xor al,al
pop esi
Posit,
The only real way to find out is to write a test piece. I suggest that it does not make much difference, and that where you can find another way using MOV, it's faster than PUSH/POP.
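For what it's worth, a test piece along those lines might look like the rough sketch below. It assumes 32-bit x86 code and an assembler that accepts Intel syntax; the label name and the iteration count are arbitrary choices for illustration, not anything taken from this thread. RDTSC is not a serializing instruction and it counts time-stamp ticks, so treat the result as a relative figure and compare several runs.
push ebx            ; ebx is callee-saved in the common 32-bit conventions
rdtsc               ; edx:eax = time-stamp counter
mov ebx,eax         ; keep the low 32 bits of the start value
mov ecx,1000000     ; iteration count (arbitrary)
timing_loop:
push esi            ; sequence under test
push edi
pop edi
pop esi
dec ecx
jnz timing_loop
rdtsc
sub eax,ebx         ; eax = elapsed ticks (low 32 bits only)
pop ebx
Swap the four instructions under test for the variant you want to compare, and keep the loop overhead identical between runs so the difference reflects only the sequence being measured.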
Is doing something like this a faster replacement for a push:
mov [esp-4],esi
It seems like the subtraction in the address calculation would outweigh the increased speed of the mov instruction.
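Two things to keep in mind with that form. The -4 is encoded as a displacement inside the instruction and is folded into the address calculation, so it is not executed as a separate subtract. Also, the mov leaves esp unchanged, so the value sits below the stack pointer and anything else that uses the stack afterwards (a later push or call, for example) can overwrite it. A minimal sketch of a MOV-based save and restore that reserves the space first, assuming 32-bit code and using esi and edi purely for illustration:
sub esp,8           ; reserve two dwords in one step
mov [esp+4],esi     ; same slot that "push esi" would have used
mov [esp],edi       ; same slot that a following "push edi" would have used
; ... code that is free to change esi and edi ...
mov edi,[esp]       ; restore in place of "pop edi"
mov esi,[esp+4]     ; restore in place of "pop esi"
add esp,8           ; release the reserved space
Whether this beats plain push/pop depends on the processor, so it is worth timing both versions before committing to either.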