About local variables.

hyperon · December 19, 2006, 01:38:30 PM

Local variables are created by declaring them with LOCAL statements, as follows:

TestProc proc

    LOCAL var1:byte
    LOCAL var2:word
    LOCAL var3:dword
    mov al,var1
    mov ax,var2
    mov eax,var3
    ret

TestProc endp

After assembling, the code above is translated into the following:

Section _text

0040101A: 55 push ebp
0040101B: 8BEC mov ebp, esp
0040101D: 83C4F8 add esp, FFFFFFF8
00401020: 8A45FF mov al, [ebp-01]
00401023: 668B45FC mov ax, [ebp-04]
00401027: 8B45F8 mov eax, [ebp-08]
0040102A: C9 leave
0040102B: C3 ret

Question 1. Why does the assembler use 'add esp, FFFFFFF8' rather than 'sub esp,7'?
Question 2. Why is 'mov ax,var2' translated as 'mov ax, [ebp-04]' rather than 'mov ax, [ebp-3]'?

u · December 19, 2006, 02:24:42 PM

MASM keeps local WORDs 2-byte aligned and DWORDs 4-byte aligned.
ESP should always stay dword-aligned, hence the "add esp,-8". On older CPUs "add" was faster than "sub", IIRC.
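
The offsets in the listing can be reproduced with a small model. This is only a Python sketch: the function name local_offsets and the placement rule (each local goes at the next offset below EBP that is a multiple of its own size, and the frame is rounded up to 4 bytes) are assumptions inferred from the disassembly above, not taken from MASM documentation.

```python
def local_offsets(sizes, frame_align=4):
    """Model the stack frame: place each local below EBP at the next
    offset that is a multiple of its own size (assumed rule)."""
    offsets = []
    used = 0  # bytes consumed below EBP so far
    for size in sizes:
        used += size
        used = -(-used // size) * size  # round up to a multiple of size
        offsets.append(-used)
    # Round the whole frame up so ESP stays dword-aligned.
    frame = -(-used // frame_align) * frame_align
    return offsets, frame

offsets, frame = local_offsets([1, 2, 4])  # byte, word, dword
print(offsets, frame)                      # [-1, -4, -8] 8

# add esp,0FFFFFFF8h and sub esp,8 are the same operation modulo 2**32.
print(hex(-frame & 0xFFFFFFFF))            # 0xfffffff8
```

Under this model the byte lands at [ebp-1], the word at [ebp-4], the dword at [ebp-8], and the frame is 8 bytes, which matches the listing.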

hyperon · December 20, 2006, 02:36:59 PM

Thanks, Ultrano.
About Question 1, I doubt that "add" is faster than "sub" on older CPUs.
Maybe your recollection is off: I looked up the instruction timings for 'add' and 'sub', and on every CPU they are the same, for example:

.8086
add sp,0FFF8h; 4 cycles
sub sp,8 ; 4 cycles

.186
add sp,0FFF8h; 4 cycles
sub sp,8 ; 4 cycles

.286
add sp,0FFF8h; 3 cycles
sub sp,8 ; 3 cycles

.386
add esp,0FFFFFFF8h; 2 cycles
sub esp,8 ; 2 cycles

u · December 20, 2006, 03:00:06 PM

Oh, I don't mean cycles. I mean micro-ops and picoseconds, due to the slightly higher complexity of a subtraction ALU unit :toothy.
Anyway, it doesn't really matter.

Draakie · December 20, 2006, 03:09:14 PM

LOL :bg

seriously [NOT] ... probably because A[DD] is alphabetically superior to S[UB] ...

"who cares" is actually probably "THE" most correct cycle-wise answer ...

but I bet someone will tell you why... eventually...

as to Ultrano's original post in this regard: it's fact...

EduardoS · December 20, 2006, 11:56:54 PM

Fewer uops? But to perform a sub, don't we just change one input of the ALU from 0 to 1?

hyperon · December 21, 2006, 02:46:11 PM

To Draakie:

LOL, seriously, perhaps you are right: 'add' is alphabetically superior to 'sub', so the macro assembler uses 'add'. But other assemblers don't think so. For instance, take the following code written for Fasm:

proc TestProc stdcall
local var1:BYTE,var2:WORD,var3:DWORD
mov al,[var1]
mov ax,[var2]
mov eax,[var3]
ret
endp

It is translated as follows:

Section _text

00402006: 55 push ebp
00402007: 89E5 mov ebp, esp
00402009: 83EC08 sub esp, 00000008;fasm uses 'sub'!
0040200C: 8A45F8 mov al, [ebp-08]
0040200F: 668B45F9 mov ax, [ebp-07]
00402013: 8B45FB mov eax, [ebp-05]
00402016: C9 leave
00402017: C3 ret

Has anyone tried this experiment with other assemblers?

MaynardG_Krebs · December 21, 2006, 03:28:32 PM

I am not sure if it has anything to do with it, but I believe the CPU actually carries out a SUB instruction by first negating the number to be subtracted and then adding it anyway. :dazzled:

u · December 21, 2006, 04:02:46 PM

It's even worse: http://www.ece.rice.edu/Courses/422/1996/bomb/alu.html
X - Y = X + (~Y + 1). But this extra "+1" can be fed directly as the carry-in of the lowest bit's unit. So for SUB you'd just be adding an XOR element right after Y and controlling whether it inverts that input. If ADD and SUB share the unit in this way, their timing really is the same.

CPU manufacturers have probably come up with faster solutions for their ADD/SUB units, though; I haven't gotten my hands on specific schematics yet ^^.
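
The shared add/sub unit described above can be sketched in a few lines of Python. This is a behavioural model only, under the assumption of a 32-bit register; the name add_sub and the sample ESP value are made up for illustration.

```python
MASK = 0xFFFFFFFF  # 32-bit register width

def add_sub(x, y, subtract):
    """One adder for both operations: for SUB, XOR every bit of y
    with 1 and feed a 1 into the lowest carry-in."""
    if subtract:
        y ^= MASK                    # the optional XOR inverter on Y
    carry_in = 1 if subtract else 0  # the "+1" of two's complement
    return (x + y + carry_in) & MASK

esp = 0x0012FFC0  # hypothetical stack pointer value
print(hex(add_sub(esp, 8, subtract=True)))            # 0x12ffb8
print(hex(add_sub(esp, 0xFFFFFFF8, subtract=False)))  # 0x12ffb8
```

Both calls give the same result, which is why 'sub esp,8' and 'add esp,0FFFFFFF8h' are interchangeable as far as the value in ESP is concerned.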

MaynardG_Krebs · December 21, 2006, 04:32:11 PM

Quote: It's even worse: http://www.ece.rice.edu/Courses/422/1996/bomb/alu.html

That sure gives me a new appreciation for two's complement numbers!

u · December 21, 2006, 05:18:34 PM

I accidentally stumbled on this nice set of lectures. Lab 1 describes an add/sub unit with that "input carry on sub" and the optional XOR negation of the input, as well as an optimization of the unit by partitioning it into a "bypass addition unit" schematic.
http://6004.csail.mit.edu/Spring98/

EduardoS · December 21, 2006, 11:35:36 PM

Try http://www.ti.com logic->arithmetic and logic units ;)

hyperon · December 22, 2006, 03:28:45 AM

Thanks, Ultrano.
Although the material on the website (http://6004.csail.mit.edu/Spring98/) is detailed, I don't fully understand it, so I can't judge whether it is right.
Let us leave that argument there and look at the following:

TestClass struct

var1 byte ?
var2 word ?
var3 dword ?

TestClass ends

TestProc proc

LOCAL @foo:TestClass
lea esi,@foo.var1; esi = address of @foo
assume esi:ptr TestClass
mov al,[esi].var1
mov ax,[esi].var2
mov eax,[esi].var3
assume esi:nothing
ret

TestProc endp

After assembling, the code above is translated into the following:

0040101A: 55 push ebp
0040101B: 8BEC mov ebp, esp
0040101D: 83C4F8 add esp, FFFFFFF8
00401020: 8D75F8 lea esi, [ebp-08]
00401023: 8A06 mov al, [esi]
00401025: 668B4601 mov ax, [esi+01]
00401029: 8B4603 mov eax, [esi+03]
0040102C: C9 leave
0040102D: C3 ret

If MASM keeps local WORDs 2-byte aligned and DWORDs 4-byte aligned, then 'mov ax, [esi+01]' should be 'mov ax,[esi+2]', and 'mov eax, [esi+03]' should be 'mov eax, [esi+4]'. Why does the macro assembler emit these unaligned accesses instead?

u · December 22, 2006, 04:36:38 AM

For flexibility. Without it we'd be tearing our hair out in some cases, such as compatibility with C-compiler-generated code.
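
To make the difference concrete, here is a small Python sketch of how field offsets depend on the packing setting. The function name field_offsets and the rule (each field aligned to the smaller of its size and the pack value, as C compilers commonly do) are assumptions for illustration. As far as I know MASM lets you choose the packing with an alignment operand on STRUCT or the /Zp switch, but check the documentation before relying on that.

```python
def field_offsets(sizes, pack):
    """Offsets of struct fields when each field is aligned to
    min(field size, pack) bytes (assumed C-style packing rule)."""
    offsets = []
    pos = 0
    for size in sizes:
        align = min(size, pack)
        pos = -(-pos // align) * align  # round up to the alignment
        offsets.append(pos)
        pos += size
    return offsets

# TestClass: byte, word, dword
print(field_offsets([1, 2, 4], pack=1))  # [0, 1, 3]
print(field_offsets([1, 2, 4], pack=4))  # [0, 2, 4]
```

The byte-packed result (0, 1, 3) is what the listing above shows; the padded result (0, 2, 4) is what the question expected. Both sides of an interface have to agree on the same setting.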

hyperon · December 22, 2006, 11:46:11 AM

Thanks. A follow-up question: does MASM handle similar cases for other compilers in the same way?
