Local variables can be created by declaring them in LOCAL statements, like this:
TestProc proc
LOCAL var1:byte
LOCAL var2:word
LOCAL var3:dword
mov al,var1
mov ax,var2
mov eax,var3
ret
TestProc endp
After assembling, the code above is translated into the following:
Section _text
0040101A: 55 push ebp
0040101B: 8BEC mov ebp, esp
0040101D: 83C4F8 add esp, FFFFFFF8
00401020: 8A45FF mov al, [ebp-01]
00401023: 668B45FC mov ax, [ebp-04]
00401027: 8B45F8 mov eax, [ebp-08]
0040102A: C9 leave
0040102B: C3 ret
Question 1. Why does the assembler use 'add esp, FFFFFFF8', not 'sub esp,7'?
Question 2. Why is 'mov ax,var2' translated as 'mov ax, [ebp-04]', not 'mov ax, [ebp-3]'?
MASM keeps local WORDs 2-byte aligned and DWORDs 4-byte aligned.
ESP should always be dword-aligned, hence the "add esp,-8". On older CPUs, "add" was faster than "sub", iirc.
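The offsets in the listing above are consistent with a simple rule: each local is placed below EBP at the next offset that is a multiple of its own size, and the frame size is rounded up to a multiple of 4. The Python sketch below only models that rule as inferred from this one listing; the function name `layout_locals` is made up for illustration, and nothing here is taken from MASM's documentation or internals. It also shows that FFFFFFF8 is simply -8 as a 32-bit value.

```python
def layout_locals(sizes):
    """Model of the observed layout: returns (offsets from EBP, frame size)."""
    used = 0
    offsets = []
    for size in sizes:
        used += size
        # Round up to a multiple of the variable's own size (assumed rule,
        # inferred from the listing; sizes here are 1, 2 or 4 bytes).
        used = (used + size - 1) // size * size
        offsets.append(-used)
    frame = (used + 3) // 4 * 4  # keep ESP dword-aligned
    return offsets, frame

offsets, frame = layout_locals([1, 2, 4])  # var1:byte, var2:word, var3:dword
print(offsets)                   # [-1, -4, -8]
print(frame)                     # 8
print(hex(-frame & 0xFFFFFFFF))  # 0xfffffff8
```

The three offsets match [ebp-01], [ebp-04] and [ebp-08], and the 7 bytes of locals grow to an 8-byte frame.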
Thanks, Ultrano.
About Question 1, I doubt that "add" is faster than "sub" on older CPUs.
Perhaps you misremember. I looked up the cycle counts of 'add' and 'sub', and on every CPU I checked they are the same, for example:
.8086
add sp,0FFF8h; 4 cycles
sub sp,8 ; 4 cycles
.186
add sp,0FFF8h; 4 cycles
sub sp,8 ; 4 cycles
.286
add sp,0FFF8h; 3 cycles
sub sp,8 ; 3 cycles
.386
add esp,0FFFFFFF8h; 2 cycles
sub esp,8 ; 2 cycles
Oh, I don't mean cycles. I mean micro-ops and picoseconds, due to the slightly higher complexity of a subtraction ALU unit :toothy.
Anyway, it doesn't really matter.
LOL :bg
seriously [NOT] ... probably because A[DD] is alphabetically superior to S[UB] ...
"who cares" is actually probably "THE" most correct cycle-wise answer ...
but I bet some-one will tell you why... eventually...
as to Ultrano's original post in this regard: it's fact...
Fewer uops? But to perform a sub, don't we just change one input of the ALU from 0 to 1?
To Draakie:
LOL, seriously, perhaps you are right: 'add' is alphabetically superior to 'sub', so MASM uses 'add'. But other assemblers don't think so. For instance, take the following code written for FASM:
proc TestProc stdcall
local var1:BYTE,var2:WORD,var3:DWORD
mov al,[var1]
mov ax,[var2]
mov eax,[var3]
ret
endp
It is translated as below:
Section _text
00402006: 55 push ebp
00402007: 89E5 mov ebp, esp
00402009: 83EC08 sub esp, 00000008 ; FASM uses 'sub'!
0040200C: 8A45F8 mov al, [ebp-08]
0040200F: 668B45F9 mov ax, [ebp-07]
00402013: 8B45FB mov eax, [ebp-05]
00402016: C9 leave
00402017: C3 ret
Has anyone tried this experiment with other assemblers?
I am not sure if it has anything to do with it, but I believe the CPU actually carries out a SUB instruction by first negating the number to be subtracted and then adding it anyway. :dazzled:
It's even worse: http://www.ece.rice.edu/Courses/422/1996/bomb/alu.html
X - Y = X + (~Y + 1). But this extra "+1" can be fed directly as the carry-in of the lowest bit's unit. So for SUB you'd just be adding an XOR element right after Y and controlling whether it inverts the input (Y). Thus ADD and SUB will really have the same timing if they share the unit in this way.
Though CPU manufacturers have probably come up with faster designs for their ADD/SUB units; I haven't gotten my hands on specific schematics yet ^^.
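The identity above can be checked with a small Python model of a shared 32-bit add/sub unit. This is only a sketch of the textbook idea (conditional XOR inversion plus a carry-in), not a description of any particular CPU; the name `add_sub` and the example stack pointer value are made up for illustration.

```python
MASK = 0xFFFFFFFF  # 32-bit register width

def add_sub(x, y, subtract):
    """One shared adder: XOR conditionally inverts y, carry-in supplies the +1."""
    invert = MASK if subtract else 0
    carry_in = 1 if subtract else 0
    return (x + (y ^ invert) + carry_in) & MASK

esp = 0x0012FFC0  # arbitrary example stack pointer
print(hex(add_sub(esp, 8, subtract=True)))            # 0x12ffb8
print(hex(add_sub(esp, 0xFFFFFFF8, subtract=False)))  # 0x12ffb8
```

Both calls give the same result, which is why 'sub esp,8' and 'add esp,FFFFFFF8' leave ESP with the same value.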
Quote: "It's even worse: http://www.ece.rice.edu/Courses/422/1996/bomb/alu.html"
That will sure give me a new appreciation for twos complement numbers!
I accidentally stumbled on this nice set of lectures. Lab 1 describes an add/sub unit with that "input carry on sub" and an XOR for optional negation of the input, as well as the optimization of the unit by partitioning it into a "bypass addition unit" schematic.
http://6004.csail.mit.edu/Spring98/
Try http://www.ti.com logic->arithmetic and logic units ;)
Thanks, Ultrano.
Although the material on the website (http://6004.csail.mit.edu/Spring98/) is detailed, I don't understand it fully, so I can't judge whether it is correct.
Let us set that argument aside and look at the following:
TestClass struct
var1 byte ?
var2 word ?
var3 dword ?
TestClass ends
TestProc proc
LOCAL @foo:TestClass
lea esi,@foo.var1; esi = offset @foo
assume esi:ptr TestClass
mov al,[esi].var1
mov ax,[esi].var2
mov eax,[esi].var3
assume esi:nothing
ret
TestProc endp
After assembling, the code above is translated into the following:
0040101A: 55 push ebp
0040101B: 8BEC mov ebp, esp
0040101D: 83C4F8 add esp, FFFFFFF8
00401020: 8D75F8 lea esi, [ebp-08]
00401023: 8A06 mov al, [esi]
00401025: 668B4601 mov ax, [esi+01]
00401029: 8B4603 mov eax, [esi+03]
0040102C: C9 leave
0040102D: C3 ret
If MASM keeps local WORDs 2-byte aligned and DWORDs 4-byte aligned, then 'mov ax, [esi+01]' should be 'mov ax,[esi+2]', and 'mov eax, [esi+03]' should be 'mov eax, [esi+4]'. Why does MASM generate unaligned accesses for the structure fields?
For flexibility reasons. In some cases, particularly for compatibility with C-compiler-generated code, we'd be tearing our hair out if we didn't have that.
Thanks. A follow-up question: does MASM handle similar cases for other compilers in the same way?
Well, MASM certainly provides the flexibility necessary :)
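The two struct layouts discussed above (offsets 0, 1, 3 as generated, versus the 0, 2, 4 the poster expected) correspond to two packing settings in the usual C-style rule, where each field is aligned to the smaller of its own size and the packing value. The Python sketch below models that rule only; `field_offsets` and its `pack` parameter are names invented for this illustration. As far as I know, MASM exposes a comparable choice through an optional alignment operand on the STRUCT directive and the /Zp option, but check the MASM documentation before relying on that.

```python
def field_offsets(sizes, pack):
    """Offset of each field when alignment is capped at `pack` bytes."""
    offset = 0
    result = []
    for size in sizes:
        align = min(size, pack)  # a field is never aligned beyond `pack`
        offset = (offset + align - 1) // align * align
        result.append(offset)
        offset += size
    return result

fields = [1, 2, 4]  # var1 byte, var2 word, var3 dword
print(field_offsets(fields, pack=1))  # [0, 1, 3]
print(field_offsets(fields, pack=4))  # [0, 2, 4]
```

With byte packing the fields sit back to back, which matches the [esi], [esi+01], [esi+03] accesses in the listing.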