Having programmed Motorola assembler, it confuses me a bit that the distinction between immediate mode and absolute (direct, displacement-only) mode depends not on operators but on the context, as far as I understand.
So,
mov ax,10 ;is always immediate
but
mov ax,<symbol> ;depends on how <symbol> is declared
Not so?
Now, I found the following code in masm32\tutorial\console\demo7\complex.asm:
.data
; --------------------------
; initialise 10 DWORD values
; --------------------------
itm0 dd 0
(8 lines removed)
itm9 dd 9
; ---------------------------------
; put their addresses into an array
; ---------------------------------
array dd itm0,itm1,itm2,itm3,itm4
dd itm5,itm6,itm7,itm8,itm9
(several lines removed)
mov ebx, array ; put BASE ADDRESS of array in EBX
My question is: WHAT does ebx contain at this point? Is it the address of "array" (the value of that label), as the comment insinuates, or the contents at that address, which would be the address of "itm0"? Does it depend on how and where the label was declared, or is it possible to distinguish by means of an operator or keyword, such as "offset"? Would the "offset" keyword make any difference in this case?
Ok, I suppose you will have to be patient with me, because I'm a very theoretical person, not the "trial and error" kind of programmer, so I didn't even bother testing the program and trying to determine the answer from its output. :snooty: (Unless I can expect that a program will behave according to my intentions, I see no point in assembling it at all... But hey, that's just me! :toothy)
"mov ebx,array" is treated as "mov ebx,DWORD PTR [array]" in every case (that I can think of at this very moment :P)
"mov ebx,OFFSET array", on the other hand, gives you the address of the symbol 'array'
I would generally choose to be explicit where it's not obvious, but obviously you need to know the difference for reading others' code.
The code
mov ebx, array ; put BASE ADDRESS of array in EBX
actually loads the first member of array, i.e. the address of itm0. It is just a fluke that itm0...itm9 are DWORDs, so ESI*4 works in this case. I think the correct syntax should be
mov ebx, OFFSET array ; put BASE ADDRESS of array
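To make the difference concrete, here is a minimal sketch that puts both forms side by side. It is a cut-down, two-item version of the tutorial's data, not the original complex.asm, and the include paths assume a default MASM32 install on the current drive.

```asm
; Sketch only: shortened data section, assumed default MASM32 paths.
.386
.model flat, stdcall
option casemap:none

include \masm32\include\windows.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\kernel32.lib

.data
itm0  dd 0                      ; two of the DWORD items
itm1  dd 1
array dd itm0, itm1             ; array stores the ADDRESSES of the items

.code
start:
    ; Without OFFSET: memory operand, same as DWORD PTR [array]
    mov ebx, array              ; ebx = first element = address of itm0
    mov eax, [ebx]              ; eax = contents of itm0 = 0

    ; With OFFSET: immediate operand, the address of array itself
    mov ebx, OFFSET array       ; ebx = address of array
    mov edx, [ebx]              ; edx = first element = address of itm0
    mov eax, [edx]              ; eax = contents of itm0 = 0

    invoke ExitProcess, 0
end start
```

With OFFSET, one extra dereference is needed to reach itm0, which is the level of indirection the comment in the tutorial code implies.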
Examples: :lol
array's members   ;       0       1       2       3
members addresses ; 400020h 400024h 400028h 40002Ch
members content   ;       5       8       2       7
A. Retrieving Addresses of the array's members
mov ebx, OFFSET array ; ebx->address of the 1st array's member ;ebx=400020h
or
lea ebx, array ; ebx->address of the 1st array's member ;ebx=400020h
mov eax, offset array + 0*4 ; eax->address of the 1st array's member ;eax=400020h
mov eax, offset array + 1*4 ; eax->address of the 2nd array's member ;eax=400024h
Note:
4 means each array's member is DWORD->4 bytes
If each array's member is WORD->2 bytes we have 2 (not 4):
mov eax, offset array + 0*2 ; eax->address of the 1st array's member ;eax=400020h
mov eax, offset array + 1*2 ; eax->address of the 2nd array's member ;eax=400022h
If each array's member is BYTE we have 1 (not 2 or 4):
mov eax, offset array + 0*1 ; eax->address of the 1st array's member ;eax=400020h
mov eax, offset array + 1*1 ; eax->address of the 2nd array's member ;eax=400021h
B. Retrieving Contents of the array's members
mov ebx, array +0*4 ; ebx->value of the 1st array's member ;ebx=5
mov ebx, array +1*4 ; ebx->value of the 2nd array's member ;ebx=8
or
mov ebx, dword ptr ds:[400020h] ; ebx->value of the 1st array's member ;ebx=5
mov ebx, dword ptr ds:[400024h] ; ebx->value of the 2nd array's member ;ebx=8
or
mov eax, offset array + 0*4 ; eax->address of the 1st array's member ;eax=400020h
mov ecx, 1*4 ; ecx = 2nd member X 4 bytes
mov ebx, dword ptr [ecx+eax] ; ebx->value of the 2nd array's member ;ebx=8
or
mov eax, offset array + 0*4 ; eax->address of the 1st array's member ;eax=400020h
mov ecx, 1 ; ecx = 2nd member
mov ebx, dword ptr [ecx*4+eax] ; ebx->value of the 2nd array's member ;ebx=8
or
mov ecx, 1 ; ecx = 2nd member
lea eax, [array + ecx*4] ; eax->address of the 2nd array's member ;eax=400024h
mov ebx, dword ptr [eax] ; ebx->value of the 2nd array's member ;ebx=8
Regards,
Lingo
Quote from: lingo on March 31, 2007, 01:37:21 PM
Oh! That was exhaustive!
But things got clear now I think: A label which points out at specific data or code statement always represents the address of that statement if there is no "offset" keyword.
(The example code that I quoted was kind of carelessly written, I suppose.)
mercifier,
About using OFFSET, it is good to remember that ADDR also performs this function. It is usually used in the INVOKE statement. So when you run into it in some example at some point, you will know what it is doing.
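As a rough illustration of that point, the sketch below makes the same call twice, once through INVOKE with ADDR and once by hand with OFFSET. The string names szText and szCaption are made up for the example, and the include paths assume a default MASM32 install on the current drive.

```asm
; Sketch only: hypothetical strings, assumed default MASM32 paths.
.386
.model flat, stdcall
option casemap:none

include \masm32\include\windows.inc
include \masm32\include\user32.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\user32.lib
includelib \masm32\lib\kernel32.lib

.data
szCaption db "ADDR demo", 0
szText    db "Same address either way", 0

.code
start:
    ; INVOKE form: ADDR passes the address of each string
    invoke MessageBox, NULL, ADDR szText, ADDR szCaption, MB_OK

    ; Manual form of the same stdcall call: arguments pushed right to left,
    ; OFFSET supplies the same two addresses
    push MB_OK
    push OFFSET szCaption
    push OFFSET szText
    push NULL
    call MessageBox

    invoke ExitProcess, 0
end start
```

For global data like this the two operators give the same result. ADDR is only valid inside INVOKE, but it also accepts LOCAL variables, for which OFFSET cannot be used.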
Lingo's example may have been a bit long but it is very well written. It is probably a good idea for you to make a printout of his post so you will always have it to hand as a reference. Eventually, you will know them by heart but for right now, a cribsheet makes a lot of sense to me.
I have a background in 6502, 6800, 6809 and 68000, also. It is my belief that once you get the hang of the differing conventions, programming the x86 is a lot easier and a lot more fun. Don't get confused by the brackets. MASM allows them by ignoring them.
Paul
Yeah, the syntax takes a bit of getting used to, but I seem to adopt the "destination first, source second" format faster than I ever thought I would... (Actually it makes some sense compared to the 6502 Assembly language: LDA #x -> MOV A,x) I guess I will cope with context-based addressing modes sooner or later too...
"always represents the address of that statement if there is no "offset" keyword."
Depends on instruction: MOV or LEA :lol
My last example:
mov ecx, 1 ; ecx = 2nd member
lea eax, [array + ecx*4] ; eax->address of the 2nd array's member ;eax=400024h
mov ebx, dword ptr [eax] ; ebx->value of the 2nd array's member ;ebx=8
With LEA we just compute the expression inside the brackets (the offset of array plus 1*4 bytes)
and save the result in eax. Hence, [400020h + 1*4] = 400024h
Here (inside the brackets) array is equal to OFFSET array == 400020h
lea eax, [array + ecx*4] ; eax->address of the 2nd array's member ;eax=400024h
and
lea eax, [offset array + ecx*4] ; eax->address of the 2nd array's member ;eax=400024h
are equivalent
But
With MOV we get the contents of the memory cell
whose address is computed inside the brackets.
Here, inside the brackets, we compute the address of the cell:
[array + ecx*4] == [400020h +1*4] == 400024h ; where offset of array=400020h and ecx=1
mov eax, [array + ecx*4] ; eax->contents of the 2nd array's member ;eax=8
or
mov eax, [OFFSET array + ecx*4] ; eax->contents of the 2nd array's member ;eax=8
or
mov eax, ds:[400024h] ; eax->contents of the 2nd array's member ;eax=8
and
mov eax, [array] ; eax->contents of the 1st array's member ;eax=5
is equal to
mov eax, array ; eax->contents of the 1st array's member ;eax=5
Regards,
Lingo
Damn! Why not just use the '#'-operator for immediate mode, like in Motorola assembly?
:bg
mercifier,
It would be too easy, then. :bg By the way, though, is not the # only used for immediate values with the $ to distinguish hex from decimal? It has been years but that is what just bubbled up in this thing I use for a brain.
Paul
mercifier ,
You are running into confusion because INTEL did not understand the Queen's English when they selected their mnemonics. A constant or address into a destination should be called a LOAD. A register contents or memory contents into a destination should be called a COPY. X86 does not really have a MOVE instruction, because that would mean the source gets copied into the destination, and then the source gets deleted. Obviously that does not happen, so MOV is a misnomer as far as its functionality is concerned. Ratch
In Motorola (and 6502) assembly, the '#' operator is used to differentiate absolute addressing (i.e. direct/displacement only) from immediate mode. The '$' operator means the following value is hexadecimal. These can be used independently of each other:
move.l 12345678,d0 ;load d0 with contents at decimal address 12345678
move.l #12345678,d0 ;load d0 with decimal (immediate) value of 12,345,678
move.l $12345678,d0 ;load d0 with contents at 12345678h
move.l #$12345678,d0 ;load d0 with hexadecimal (immediate) value of 12345678h
Ratch,
You got a point there. Data can never move, only be copied. Even the immediate constant remains unchanged, but in Motorola assembly the source of "move" comes first, which makes more sense to the syntax: move d0,d1 means "move (or copy) contents of d0 to d1". If the Intel copy instruction was named load instead, it would make more sense: mov ax,bx = "LOAD ax FROM bx" !!
(Maybe I'll just make a macro synonym of the MOV instruction...)
mercifier,
> (Maybe I'll just make a macro synonym of the MOV instruction...)
Do yourself a favour and DON'T, as the MOV family of mnemonics is easy enough to get the swing of with practice. There is a guy on Usenet who converted x86 to pseudo 68k notation with macros, but he is forever suspended in out-of-date hardware, where x86 is reasonably dynamic in terms of adaptation.
mercifier ,
I include an EQU statement like @ EQU OFFSET in my code. Then I can code like INVOKE EAX,@ LABELX,EBX,etc. You can do a # EQU OFFSET if you prefer. By the way, how do you like the way the Motorola 68000 series can do a direct memory to memory copy instead of having to run the contents through a register or stack first? Ratch
Quote from: hutch-- on April 02, 2007, 09:21:22 PM
Do yourself a favour and DON'T...
OK, I'll follow that advice!
Quote from: Ratch on April 02, 2007, 09:23:48 PM
I include an EQU statement like @ EQU OFFSET in my code.
Sounds reasonable, but as Hutch said, it's probably best to stick with common syntax, even if it's sometimes awkward.
Quote from: Ratch on April 02, 2007, 09:23:48 PM
By the way, how do you like the way the Motorola 68000 series can do a direct memory to memory copy instead of having to run the contents through a register or stack first? Ratch
Well, most of the time you load data into the processor because you want to process it in some way, not just store a copy of it somewhere else. But indeed, there are times when a memory-to-memory move would be more straightforward. I guess that the old repeated string operations were supposed to make up for that. However, I have learned that compact and versatile instructions do not always mean that the program will run faster.