The MASM Forum Archive 2004 to 2012

General Forums => The Campus => Topic started by: bushpilot on December 02, 2006, 11:56:37 PM

Title: jmp over nops?
Post by: bushpilot on December 02, 2006, 11:56:37 PM
While reading Agner Fog's recent manual updates, I was struck by an idea.  I'm sure I am missing something, but why would this be a bad idea for the entry to a loop?  Code is in GoAsm syntax.

  (some code here ....)
  jmp TopOfLoop
align 16
TopOfLoop:
  (loop code here ....)
  jne < TopOfLoop


Specifically, I am thinking of the jump over the align.  This would give "good" alignment for the loop, and negate the need for the processor to decode the nops (or whatever) the assembler puts in to get the right alignment.

Thoughts?

Greg
Title: Re: jmp over nops?
Post by: hutch-- on December 03, 2006, 12:02:26 AM
Greg,

Jumping into a loop is not a bad idea if you can get a gain out of the alignment you need for the loop start label. Usually it does not matter much at the start of a loop, as nothing is in the branch prediction buffers yet and will not be for a few iterations of the loop. Make sure you time any aligned labels within an intensive loop, as you can often get stung by the alignment code.
Title: Re: jmp over nops?
Post by: bushpilot on December 03, 2006, 12:40:50 AM
Thanks for the reply, hutch.

I am not sure I get your answer - could it be better (talking in general terms) to jump over the "align 16", or just let the processor plow through the nops?

Greg
Title: Re: jmp over nops?
Post by: Jimg on December 03, 2006, 12:55:07 AM
Sometimes there will only be one or two nops to get the alignment, so it would be faster to just execute them.  But I think what Hutch was saying is to time your loop.  It may be faster for the loop to be offset one or more bytes from perfect alignment, depending on the instructions that are in the loop.
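
To put rough numbers on the "one or two nops" point, here is a minimal sketch of the address arithmetic only. It assumes the section itself starts on a 16-byte boundary and that the assembler encodes the jmp in its 2-byte short form; the names padding_bytes and compare are made up for this illustration, and what filler the assembler actually emits (single-byte nops or something longer) is not modelled.

```python
# Sketch only: models the address arithmetic, not any particular assembler.
# Assumption: the jmp is encoded as an x86 short jmp (opcode EB + rel8).
SHORT_JMP_LEN = 2


def padding_bytes(offset, alignment=16):
    """Filler bytes needed to move `offset` up to the next multiple of `alignment`."""
    return -offset % alignment


def compare(offset, alignment=16):
    """Return (filler executed with no jmp, filler skipped after a short jmp).

    `offset` is where the code before the loop ends, relative to an
    aligned section start (hypothetical input for this illustration).
    """
    fall_through = padding_bytes(offset, alignment)
    jumped_over = padding_bytes(offset + SHORT_JMP_LEN, alignment)
    return fall_through, jumped_over


if __name__ == "__main__":
    print("offset  filler(no jmp)  filler(after jmp)")
    for offset in range(16):
        no_jmp, with_jmp = compare(offset)
        print(f"{offset:6d}  {no_jmp:14d}  {with_jmp:17d}")
```

Note that the jmp itself takes up two bytes, so it shifts where the filler starts: if the preceding code already ends on a 16-byte boundary, no filler is needed without the jmp, but 14 bytes appear after it. Which variant is actually faster still has to be timed.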
Title: Re: jmp over nops?
Post by: P1 on December 04, 2006, 02:16:49 PM
With prefetch instruction queues, alignment nops usually do not cost any uP time.

Boundary alignment is for making memory paging easier to accomplish in the addressing unit.

This would be an interesting question to post on Intel's developer site.

Regards,  P1  :8)