jmp over nops?

Started by bushpilot, December 02, 2006, 11:56:37 PM

bushpilot

While reading Agner Fog's recent manual updates, I was struck by an idea.  I'm sure I am missing something, but why would this be a bad idea for the entry to a loop?  Code is in goasm format.

  (some code here ....)
  jmp TopOfLoop
align 16
TopOfLoop:
  (loop code here ....)
  jne < TopOfLoop


Specifically, I am thinking of the jump over the align.  This would give "good" alignment for the loop, and negate the need for the processor to decode the nops (or whatever) the assembler puts in to get the right alignment.

Thoughts?

Greg

hutch--

Greg,

Jumping into a loop is not a bad idea if you can get a gain out of the alignment you need for the loop start label. Usually it does not matter much at the start of a loop, as nothing is in the branch prediction buffers yet and will not be for a few iterations of the loop. Make sure you time any aligned labels within an intensive loop, as you can often get stung by the alignment code.

bushpilot

Thanks for the reply, hutch.

I am not sure I get your answer - could it be better (talking in general terms) to jump over the "align 16", or just let the processor plow through the nops?

Greg

Jimg

Sometimes there will only be one or two nops to get the alignment, so it would be faster to just execute them.  But I think what Hutch was saying is time your loop.  It may be faster for the loop to be offset one or more bytes from perfect alignment due to the instructions that are in the loop.
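
To put rough numbers on "one or two nops", here is a minimal sketch of the padding arithmetic. It is not from the thread: the offsets are made-up values, and it assumes the assembler emits a 2-byte short jmp (opcode EB plus an 8-bit displacement) and pads up to the next 16-byte boundary.

```python
JMP_SHORT_SIZE = 2  # x86 short jmp: opcode EB plus an 8-bit displacement


def pad_bytes(offset, alignment=16):
    """Filler bytes needed so the next label lands on an `alignment` boundary."""
    return (-offset) % alignment


def entry_layouts(offset, alignment=16):
    """Compare the two loop entries for code that ends at `offset`.

    `offset` is a hypothetical address chosen for illustration.
    Returns (filler executed when falling through,
             filler skipped when a short jmp is placed before the align).
    """
    fall_through = pad_bytes(offset, alignment)
    jump_over = pad_bytes(offset + JMP_SHORT_SIZE, alignment)
    return fall_through, jump_over


if __name__ == "__main__":
    for off in (0x1003, 0x100E, 0x1010):
        print(hex(off), entry_layouts(off))
```

With these sample offsets, code ending at 0x1003 needs 13 filler bytes (11 skipped if the jmp is added), code ending at 0x100E needs only 2 (the jmp itself fills the gap), and code already ending at 0x1010 needs none, but adding the jmp would push the loop 14 bytes further on. The arithmetic only says how many bytes are involved; whether skipping them is faster still has to be timed.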

P1

With prefetch instruction queues, alignment nops usually do not cost any uP time.

Boundary alignment is for making memory paging easier to accomplish in the addressing unit.

This would be an interesting question to post on Intel's developer site.

Regards,  P1  :8)