While reading Agner Fog's recent manual updates, I was struck by an idea. I'm sure I am missing something, but why would this be a bad idea for the entry to a loop? The code is in GoAsm syntax.
(some code here ....)
jmp TopOfLoop
align 16
TopOfLoop:
(loop code here ....)
jne < TopOfLoop
Specifically, I am thinking of the jump over the align. This would give "good" alignment for the loop, and remove the need for the processor to decode the nops (or whatever) the assembler puts in to get the right alignment.
Thoughts?
Greg
Greg,
Jumping into a loop is not a bad idea if you gain something from the alignment you need for the loop start label. Usually it does not matter much at the start of a loop, as nothing is in the branch prediction buffers yet and will not be for a few iterations. Make sure you time any aligned labels within an intensive loop, as you can often get stung by the alignment code.
Thanks for the reply hutch.
I am not sure I get your answer - could it be better (talking in general terms) to jump over the "align 16", or just let the processor plow through the nops?
Greg
Sometimes there will only be one or two nops to get the alignment, so it would be faster to just execute them. But I think what Hutch was saying is: time your loop. It may be faster for the loop to be offset one or more bytes from perfect alignment, depending on the instructions that are in the loop.
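To put a number on how many filler bytes "align 16" emits at a given address, here is a minimal sketch of the arithmetic in C. The function names padding_bytes and padding_after_short_jmp and the addresses used with them are made up for illustration, and the 2-byte figure assumes the jmp assembles to its short form (opcode EB plus an 8-bit displacement).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of filler bytes an assembler has to emit
   so that the next instruction starts on an `alignment`-byte boundary.
   `alignment` must be a power of two (16 for "align 16"). */
static unsigned padding_bytes(uintptr_t address, unsigned alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    uintptr_t mask = (uintptr_t)alignment - 1;
    /* Distance to the next boundary; 0 if already aligned. */
    return (unsigned)((alignment - (address & mask)) & mask);
}

/* Hypothetical helper: filler needed when a short jmp (assumed to be
   2 bytes) is placed at `address`, directly in front of the align. */
static unsigned padding_after_short_jmp(uintptr_t address, unsigned alignment)
{
    return padding_bytes(address + 2, alignment);
}
```

For an assumed address of 0x40100E, padding_bytes(0x40100E, 16) gives 2, so only two filler bytes sit in front of the loop. A short jmp placed at that same address takes up those 2 bytes itself, so padding_after_short_jmp(0x40100E, 16) gives 0 and there is nothing left to jump over. The jmp only skips a meaningful amount of filler when the code before it ends well short of the boundary.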
With prefetch instruction queues, alignment nops usually do not cost any processor time.
Boundary alignment is for making memory paging easier to accomplish in the addressing unit.
This would be an interesting question to post on Intel's developer site.
Regards, P1 :8)