The MASM Forum Archive 2004 to 2012

General Forums => The Campus => Topic started by: mercifier on March 31, 2007, 10:39:52 AM

Title: Constants, addresses and their contents
Post by: mercifier on March 31, 2007, 10:39:52 AM
Having programmed Motorola assembler, I find it a bit confusing that the distinction between immediate mode and absolute (direct, displacement-only) mode depends not on operators but on the context, as far as I understand.

So,
mov ax,10                             ;is always immediate
but
mov ax,<symbol>                   ;depends on how <symbol> is declared
Not so?

Now, I found the following code in masm32\tutorial\console\demo7\complex.asm:

    .data
    ; --------------------------
    ; initialise 10 DWORD values
    ; --------------------------
      itm0  dd 0

(8 lines removed)
      itm9  dd 9
    ; ---------------------------------
    ; put their addresses into an array
    ; ---------------------------------
      array dd itm0,itm1,itm2,itm3,itm4
            dd itm5,itm6,itm7,itm8,itm9

(several lines removed)
    mov ebx, array              ; put BASE ADDRESS of array in EBX

My question is, WHAT does ebx contain at this point - the address of "array" (the value of that label) as the comment insinuates, or the contents at that address, which would be the address of "itm0"? Does it depend on how and where the label was declared, or is it possible to distinguish by means of an operator or keyword, such as "offset"? Would the "offset" keyword make any difference in this case?

Ok, I suppose you will have to be patient with me, because I'm a very theoretical person, not the "trial and error" kind of programmer, so I didn't even bother testing the program and trying to determine the answer from its output.  :snooty: (Unless I can expect that a program will behave according to my intentions, I see no point in assembling it at all... But hey, that's just me!  :toothy)
Title: Re: Constants, addresses and their contents
Post by: Tedd on March 31, 2007, 10:56:12 AM
"mov ebx,array" is treated as "mov ebx,DWORD PTR [array]" in every case (that I can think of at this very moment :P)
"mov ebx,OFFSET array" then it will give you address of the symbol 'array'

I would generally choose to be explicit where it's not obvious, but obviously you need to know the difference for reading others' code.
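
The two forms described above can be sketched side by side. This is a minimal example, assuming a 32-bit flat-model build with the MASM32 SDK installed at \masm32 (the library path and the shortened two-item array are assumptions, not part of the tutorial file):

```masm
    .386
    .model flat, stdcall
    option casemap:none

    ExitProcess PROTO :DWORD
    includelib \masm32\lib\kernel32.lib   ; assumed MASM32 SDK install path

    .data
      itm0  dd 0
      itm1  dd 1
      array dd itm0, itm1       ; each element holds the ADDRESS of an itmN

    .code
start:
    mov ebx, array              ; memory read: ebx = first element = address of itm0
    mov eax, [ebx]              ; eax = contents of itm0 = 0

    mov ebx, OFFSET array       ; immediate: ebx = address of array itself
    mov eax, [ebx]              ; eax = first element = address of itm0
    mov eax, [eax]              ; eax = contents of itm0 = 0

    invoke ExitProcess, 0
end start
```

Stepping through this in a debugger should show ebx holding two different values after the two loads, which is the whole difference between the forms.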
Title: Re: Constants, addresses and their contents
Post by: sinsi on March 31, 2007, 11:07:41 AM
The code
    mov ebx, array              ; put BASE ADDRESS of array in EBX
actually loads the first member of array, i.e. the address of itm0. It is just
a fluke that itm0...itm9 are DWORDS, so ESI*4 works in this case. I think the correct syntax should be
    mov ebx, OFFSET array              ; put BASE ADDRESS of array
Title: Re: Constants, addresses and their contents
Post by: lingo on March 31, 2007, 01:37:21 PM
Examples:  :lol 
array's members         ; 0           1          2         3 
members addresses       ;400020h   400024h    400028h   40002Ch 
members content         ; 5           8          2         7                 

A.  Retrieving Addresses  of the array's members     

    mov ebx, OFFSET array    ; ebx->address of the 1st array's member ;ebx=400020h
or  lea ebx, array            ; ebx->address of the 1st array's member ;ebx=400020h
                               

    mov eax, offset array + 0*4     ; eax->address of the 1st array's member ;eax=400020h
    mov eax, offset array + 1*4     ; eax->address of the 2nd array's member ;eax=400024h

Note:
    4 means each array's member is DWORD->4 bytes
   If each array's member is WORD->2 bytes we have 2 (not 4):

    mov eax, offset array + 0*2     ; eax->address of the 1st array's member ;eax=400020h
    mov eax, offset array + 1*2     ; eax->address of the 2nd array's member ;eax=400022h

   If  each array's member is BYTE we have 1 (not 2 or 4):

    mov eax, offset array + 0*1     ; eax->address of the 1st array's member ;eax=400020h
    mov eax, offset array + 1*1     ; eax->address of the 2nd array's member ;eax=400021h


B.  Retrieving Contents of the array's members     

    mov ebx, array +0*4     ; ebx->value of the 1st array's member ;ebx=5
    mov ebx, array +1*4     ; ebx->value of the 2nd array's member ;ebx=8 

or
    mov ebx, dword ptr ds:[400020h] ; ebx->value of the 1st array's member ;ebx=5
    mov ebx, dword ptr ds:[400024h] ; ebx->value of the 2nd array's member ;ebx=8

or
   mov eax, offset array + 0*4     ; eax->address of the 1st array's member ;eax=400020h
   mov ecx, 1*4                     ; ecx = 2nd member X 4 bytes    
   mov ebx, dword ptr [ecx+eax]    ; ebx->value of the 2nd array's member ;ebx=8     

or
   mov eax, offset array + 0*4     ; eax->address of the 1st array's member ;eax=400020h
   mov ecx, 1                        ; ecx = 2nd member    
   mov ebx, dword ptr [ecx*4+eax]  ; ebx->value of the 2nd array's member ;ebx=8   

or
   mov ecx, 1                       ; ecx = 2nd member  
   lea eax, [array + ecx*4]        ; eax->address of the 2nd array's member ;eax=400024h
   mov ebx, dword ptr [eax]        ; ebx->value of the 2nd array's member   ;ebx=8

 
Regards,
Lingo  

Title: Re: Constants, addresses and their contents
Post by: mercifier on March 31, 2007, 09:09:44 PM
Quote from: lingo on March 31, 2007, 01:37:21 PM

Regards,
Lingo   

Oh! That was exhaustive!

But things got clear now I think: A label which points out at specific data or code statement always represents the address of that statement if there is no "offset" keyword.

(The example code that I quoted was kind of careless written, I suppose.)
Title: Re: Constants, addresses and their contents
Post by: PBrennick on March 31, 2007, 10:04:28 PM
mercifier,

About using OFFSET, it is good to remember that ADDR also performs this function. It is usually used in the INVOKE statement. So when you run into it in some example at some point, you will know what it is doing.
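
A short sketch of ADDR next to OFFSET in an INVOKE statement. It assumes the MASM32 SDK with its masm32rt.inc runtime include and the StdOut console routine from the MASM32 library; the label msg is only illustrative:

```masm
    include \masm32\include\masm32rt.inc   ; assumed MASM32 SDK install path

    .data
      msg db "hello", 13, 10, 0

    .code
start:
    invoke StdOut, ADDR msg      ; ADDR passes the address of msg
    invoke StdOut, OFFSET msg    ; same effect for a static (global) label
    invoke ExitProcess, 0
end start
```

One difference worth knowing: ADDR also accepts a LOCAL variable, for which the assembler computes the address at run time, while OFFSET only works for addresses that are known when the program is assembled and linked.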

Lingo's example may have been a bit long but it is very well written. It is probably a good idea for you to make a printout of his post so you will always have it to hand as a reference. Eventually, you will know them by heart but for right now, a cribsheet makes a lot of sense to me.

I have a background in 6502, 6800, 6809 and 68000, also. It is my belief that once you get the hang of the differing conventions, programming the x86 is a lot easier and a lot more fun. Don't get confused by the brackets. Around a plain variable name, MASM allows them by ignoring them.

Paul
Title: Re: Constants, addresses and their contents
Post by: mercifier on March 31, 2007, 10:16:01 PM
Yeah, the syntax takes a bit of getting used to, but I seem to be adopting the "destination first, source second" format faster than I ever thought I would... (Actually it makes some sense compared to the 6502 Assembly language: LDA #x -> MOV A,x) I guess I will cope with context-based addressing modes sooner or later too...
Title: Re: Constants, addresses and their contents
Post by: lingo on March 31, 2007, 10:28:25 PM
"always represents the address of that statement if there is no "offset" keyword."

Depends on instruction: MOV or LEA   :lol

My last example:
   mov ecx, 1                          ; ecx = 2nd member    
   lea   eax, [array + ecx*4]      ; eax->address of the 2nd array's member ;eax=400024h
   mov ebx, dword ptr [eax]     ; ebx->value of the 2nd array's member   ;ebx=8

With LEA we just compute the contents inside the brackets (offset of the array plus 1*4 bytes)
and save the result in eax. Hence, [400020h + 1*4] = 400024h

Here (in the brackets) array is equal to OFFSET array == 400020h

        lea eax, [array + ecx*4]          ; eax->address of the 2nd array's member ;eax=400024h
and   lea eax, [offset array + ecx*4] ; eax->address of the 2nd array's member ;eax=400024h
are equal

But

With MOV we get the contents of the memory cell
with the address computed inside the brackets.

Here, inside the brackets, we compute the address of the cell:
[array + ecx*4] == [400020h +1*4] == 400024h  ; where offset of array=400020h and ecx=1


  mov eax, [array + ecx*4]        ; eax->contents of the 2nd array's member ;eax=8
or
  mov eax, [OFFSET array + ecx*4] ; eax->contents of the 2nd array's member ;eax=8
or
  mov eax, ds:[400024h]           ; eax->contents of the 2nd array's member ;eax=8

and
  mov eax, [array]        ; eax->contents of the 1st array's member ;eax=5
is equal to
  mov eax, array          ; eax->contents of the 1st array's member ;eax=5

Regards,
Lingo
Title: Re: Constants, addresses and their contents
Post by: mercifier on April 02, 2007, 06:51:43 PM
Damn! Why not just use the '#'-operator for immediate mode, like in Motorola assembly?
:bg
Title: Re: Constants, addresses and their contents
Post by: PBrennick on April 02, 2007, 08:05:44 PM
mercifier,

It would be too easy, then.  :bg By the way, though, is not the # only used for immediate values with the $ to distinguish hex from decimal? It has been years but that is what just bubbled up in this thing I use for a brain.

Paul
Title: Re: Constants, addresses and their contents
Post by: Ratch on April 02, 2007, 08:41:43 PM
mercifier ,

     You are running into confusion because INTEL did not understand the Queen's English when they selected their mnemonics.  A constant or address into a destination should be called a LOAD.  A register contents or memory contents into a destination should be called a COPY.  X86 does not really have a MOVE instruction, because that would mean the source gets copied into the destination, and then the source gets deleted.   Obviously that does not happen, so MOV is a misnomer as far as its functionality is concerned.  Ratch
Title: Re: Constants, addresses and their contents
Post by: mercifier on April 02, 2007, 08:49:56 PM
In Motorola (and 6502) assembly, the '#' operator is used to differentiate absolute addressing (i.e. direct/displacement only) from immediate mode. The '$' operator means the following value is hexadecimal. These can be used independently of each other:


move.l 12345678,d0 ;load d0 with contents at decimal address 12345678
move.l #12345678,d0 ;load d0 with decimal (immediate) value of 12,345,678
move.l $12345678,d0 ;load d0 with contents at 12345678h
move.l #$12345678,d0 ;load d0 with hexadecimal (immediate) value of 12345678h
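
For comparison, a rough MASM counterpart of each of the four lines, assuming 32-bit flat-model code and using eax in place of d0 (an arbitrary choice). The explicit "dword ptr ds:[...]" form follows the style of lingo's earlier examples. The fragment is meant to be assembled and inspected, not run, since the numeric addresses are arbitrary:

```masm
    ; move.l 12345678,d0
    mov eax, dword ptr ds:[12345678]     ; contents at decimal address 12345678

    ; move.l #12345678,d0
    mov eax, 12345678                    ; immediate decimal value

    ; move.l $12345678,d0
    mov eax, dword ptr ds:[12345678h]    ; contents at address 12345678h

    ; move.l #$12345678,d0
    mov eax, 12345678h                   ; immediate hexadecimal value
```

With a bare number the immediate reading is the default, so it is the memory access that needs the extra notation, the reverse of the Motorola convention.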
Title: Re: Constants, addresses and their contents
Post by: mercifier on April 02, 2007, 08:55:33 PM
Ratch,

You got a point there. Data can never move, only be copied. Even the immediate constant remains unchanged, but in Motorola assembly the source of "move" comes first, which makes more sense to the syntax: move d0,d1 means "move (or copy) contents of d0 to d1". If the Intel copy instruction was named load instead, it would make more sense: mov ax,bx = "LOAD ax FROM bx" !!

(Maybe I'll just make a macro synonym of the MOV instruction...)
Title: Re: Constants, addresses and their contents
Post by: hutch-- on April 02, 2007, 09:21:22 PM
mercifier,

> (Maybe I'll just make a macro synonym of the MOV instruction...)

Do yourself a favour and DON'T, as the MOV family of mnemonics is easy enough to get the swing of with practice. There is a guy on Usenet who converted x86 to pseudo 68k notation with macros, but he is forever suspended in out-of-date hardware, whereas x86 is reasonably dynamic in terms of adaptation.
Title: Re: Constants, addresses and their contents
Post by: Ratch on April 02, 2007, 09:23:48 PM
mercifier ,

    I include an EQU statement like @ EQU OFFSET in my code.  Then I can code like INVOKE EAX,@ LABELX,EBX,etc.  You can do a # EQU OFFSET if you prefer.  By the way, how do you like the way the Motorola 68000 series can do a direct memory to memory copy instead of having to run the contents through a register or stack first?  Ratch
Title: Re: Constants, addresses and their contents
Post by: mercifier on April 05, 2007, 07:17:55 PM
Quote from: hutch-- on April 02, 2007, 09:21:22 PM
Do yourself a favour and DON'T...

OK, I'll follow that advice!

Quote from: Ratch on April 02, 2007, 09:23:48 PM
I include an EQU statement like @ EQU OFFSET in my code.

Sounds reasonable, but as Hutch said, it's probably best to stick with common syntax, even if it's sometimes awkward.

Quote from: Ratch on April 02, 2007, 09:23:48 PM
By the way, how do you like the way the Motorola 68000 series can do a direct memory to memory copy instead of having to run the contents through a register or stack first?  Ratch

Well, most of the time you load data into the processor because you want to process it in some way, not just store a copy of it somewhere else. But indeed, there are times when a memory to memory move would be more straightforward. I guess that the old repeated string operations were supposed to make up for that. However, I have learned that compact and versatile instructions do not always mean that the program will run faster.
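
A minimal sketch of the two x86 memory-to-memory routes mentioned above: the repeated string instruction and the stack. It is a fragment for a 32-bit flat-model program, where DS and ES address the same memory; the labels src and dst are made up for illustration:

```masm
    .data
      src dd 1, 2, 3, 4
      dst dd 4 dup (0)

    .code
    ; block copy with a repeated string instruction
    mov esi, OFFSET src     ; source address
    mov edi, OFFSET dst     ; destination address
    mov ecx, LENGTHOF src   ; number of DWORDs to copy (4 here)
    cld                     ; clear direction flag: copy upward
    rep movsd               ; copy ecx DWORDs from [esi] to [edi]

    ; single DWORD through the stack, no general register involved
    push src                ; push the contents of src (first DWORD)
    pop  dst                ; pop it into dst (first DWORD)
```

Neither form is a single general-purpose memory-to-memory MOV of the 68000 kind: the string instruction ties up esi, edi and ecx, and the push/pop pair goes through the stack.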