How does align work?

n00b! · October 31, 2008, 04:49:14 PM

Hello,
here's another question right away, since I don't understand this keyword either.
I found some explanations, but I don't think they were really adequate...

It seems that 'align' is something like a macro, which the assembler replaces with "lea edi, [edi]" or something like that.
I noticed that the code does become a little bit faster, but how do I know when I should use it to make the code faster?

Could someone write a short explanation of how to use it and how to tell exactly when it's time to use it?

Many thanks in advance!

Mirno · October 31, 2008, 05:10:47 PM

Alignment is an issue because of the way the processor fetches memory.
Think of it like a book, and pages within the book - if the information is split over two pages, you keep having to flip between them.
However if you as the author of the book think to yourself "my reader is likely to want to look at all of this at once", you can just leave blank space before your important bit, so it all fits on the one page...
So you'll waste a bit of space, but the important bit of information is all on one "page". The lea edi, [edi] does nothing; it's like a longer version of nop. There are several "do nothing" instructions, and masm will insert the best ones to fill the space in as few clock cycles as possible.

Putting code on an aligned address (which is what align does) is good when you expect to use the same code or data over and over again, as in loops. The code in a loop is executed repeatedly, so it's good to start it on a boundary.

To be honest though, you don't really need to worry about it. Very rarely will things break if the data or code isn't aligned, and when they do it will be very explicitly documented. When you're learning, things like alignment are not important, only when you get a bit further in - and by then it tends to make more sense because you'll understand the processor a bit better.

Mirno

n00b! · October 31, 2008, 05:29:52 PM

Ah, ok.
I think I understand a bit, thanks :U

So 'lea edi, [edi]' is faster than several 'nop' instructions in a row?

And... is there some low-level rule for telling that code sits on an address which is not a multiple of a power of 2 (an odd address, say) and would run faster if it were aligned to one (on a new book page)?

Mirno · October 31, 2008, 05:41:33 PM

Yes to the first point: "lea edi, [edi]" is just a longer "nop" - each instruction must be executed, even if it does nothing. So a "long nop" is better than two or three "nop" instructions.

There aren't really any hard and fast rules; you can only test to find out. The rules tend to change with different processors. It used to be fine to align to 4 or 8 bytes (bottom 2 or 3 bits of the address being zero). Newer processors like certain things to be aligned to 16 bytes, and I wouldn't be surprised if 32-byte alignment benefited some SSE things in the future.

Generally the thing to do is write your code without any alignment, time it, then when you're done add ALIGNs for the critical loops and time it again. Any code or data that you use repeatedly tends to work better if it's aligned to some degree - how much of a benefit can only be determined with testing though.
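The "bottom bits being zero" test mentioned above can be checked at run time. The following is only a minimal sketch, assuming the MASM32 package and its masm32rt.inc (as used elsewhere in this thread); the variable name "value" is made up for illustration.

```masm
    include \masm32\include\masm32rt.inc

    .data
      align 16
      value dd 0                ; hypothetical variable, placed on a 16-byte boundary

    .code
start:
    mov eax, OFFSET value
    test eax, 15                ; low 4 bits all zero => address is a multiple of 16
    .if ZERO?
        print "value is 16-byte aligned",13,10
    .else
        print "value is NOT 16-byte aligned",13,10
    .endif

    inkey "Press any key to exit..."
    exit
end start
```

Use a mask of 3 to check for 4-byte alignment, or 7 for 8-byte alignment.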
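As a rough illustration of adding an ALIGN for a critical loop, here is a small sketch. It assumes the MASM32 package and masm32rt.inc; the procedure name sum_bytes and the test data are invented for the example, and whether the aligned version is actually faster still has to be timed on the target processor.

```masm
    include \masm32\include\masm32rt.inc

    .data
      bytes db 1, 2, 3, 4, 5, 6, 7, 8

    .code
; Hypothetical example procedure: returns in eax the sum of cnt bytes at pSrc.
; Assumes cnt is greater than zero.
sum_bytes proc uses esi pSrc:DWORD, cnt:DWORD
    mov esi, pSrc
    mov ecx, cnt
    xor eax, eax
    align 16                    ; padding goes here, executed once on the way in
  @@:                           ; loop top now starts on a 16-byte boundary
    movzx edx, byte ptr [esi]
    add eax, edx
    inc esi
    dec ecx
    jnz @B
    ret
sum_bytes endp

start:
    invoke sum_bytes, ADDR bytes, 8
    print str$(eax),13,10

    inkey "Press any key to exit..."
    exit
end start
```

The align sits directly in front of the loop label rather than in front of the procedure, so the padding is executed once on entry and not on every iteration.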

Mirno

n00b! · October 31, 2008, 06:46:18 PM

Should it be used inside the code that is executed, or outside it?
For example:

Code Select

align 4
foo proc
mov eax, 5
ret
foo endp

or:

Code Select

foo proc
align 4
mov eax, 5
ret
foo endp

MichaelW · October 31, 2008, 07:49:33 PM

I tried to make this demonstration of the ALIGN directive easy to understand, but I'm not sure how well I succeeded.

Code Select


; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
    include \masm32\include\masm32rt.inc
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
    .data
      pad0 db 88h
      align 16
      db 1
      org 32
      pad1 db 88h
      align 8
      db 1
      org 32*2
      pad2 db 88h
      align 4
      db 1
      org 32*3
      pad3 db 88h
      align 2
      db 1
      org 32*4
      pad4 db 88h
      align 1
      db 1
      db 100 dup(0)
      buffer db 128 dup(0)
    .code
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
start:
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
    print "data section start at "
    print uhex$(OFFSET pad0),"h",13,10,13,10

    print "align 16",13,10
    invoke HexDump, ADDR pad0, 32, ADDR buffer
    print ADDR buffer,13,10

    print "align 8",13,10
    invoke HexDump, ADDR pad1, 32, ADDR buffer
    print ADDR buffer,13,10

    print "align 4",13,10
    invoke HexDump, ADDR pad2, 32, ADDR buffer
    print ADDR buffer,13,10

    print "align 2",13,10
    invoke HexDump, ADDR pad3, 32, ADDR buffer
    print ADDR buffer,13,10

    print "align 1",13,10
    invoke HexDump, ADDR pad4, 32, ADDR buffer
    print ADDR buffer,13,10,13,10

    inkey "Press any key to exit..."
    exit
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
end start

Code Select


data section start at 00403000h

align 16
88 00 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00
01 00 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00

align 8
88 00 00 00 00 00 00 00 - 01 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00

align 4
88 00 00 00 01 00 00 00 - 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00

align 2
88 00 01 00 00 00 00 00 - 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00

align 1
88 01 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 - 00 00 00 00 00 00 00 00

The ALIGN directive aligns the next variable or instruction on a byte address that is a multiple of the specified alignment. The assembler does this by adding zero or more padding bytes ahead of the variable or instruction. I used the data section instead of the code section because in the data section the padding bytes are zeros, where in the code section they can be any of a number of instructions or instruction combinations. I displayed the start address of the data section to make it clear that the address is already aligned to any alignment that can be specified with the ALIGN directive. The pad byte (value 88h) is used to shift the starting alignment so alignment values greater than 1 will have an effect. Put another way, without the pad byte, the aligned byte (value 1) would always be in the first byte position. The ORG directive is used to advance the location counter to the next multiple of 32, so that each demonstration starts in a fresh area of the data section.

Mark Jones · November 01, 2008, 04:50:57 PM

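Here, "foo proc" can be thought of as a plain label, "foo:", meaning that "foo" is the next code offset. So the two placements are not equivalent: an align before "foo proc" pads ahead of the label, so the entry point itself is aligned, while an align after "foo proc" pads behind the label, so "foo" stays where it was and every call executes the padding before reaching the first instruction.

Also, some of the advanced math instructions (SSE, etc.) require their memory operands to be aligned; movaps and movdqa, for example, need 16-byte alignment.

Also, the align directive applies only to the item that immediately follows it, meaning the second item below will NOT be aligned on a 16-byte boundary: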

Code Select


ALIGN 16
    MyString DB "Hello World!!",0
    YourString DB "This starts 14 bytes after the align because MyString is 14 bytes long!!",0
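
If both items should start on a 16-byte boundary, each one needs its own ALIGN. A minimal sketch of that, reusing the names from the example above with a different second string:

```masm
    .data
    ALIGN 16
    MyString   DB "Hello World!!",0     ; 14 bytes, starts on a 16-byte boundary
    ALIGN 16                            ; assembler inserts 2 padding bytes here
    YourString DB "Now this one starts on a 16-byte boundary too",0
```

The cost is the padding between the two strings, which grows with the alignment value.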
