Memory Addressing

Started by Glenn9999, November 06, 2008, 04:28:54 AM

Glenn9999

I'm trying to learn assembler (I'm primarily on 16-bit DOS stuff right now, but want to branch out to 32-bit stuff for Windows), and I seem to keep getting stuck on memory referencing.  I can do simple stuff (which MOV takes care of), but I seem to keep getting stuck when it comes time to do other things (strings, pointer references and so forth).

Is there a good reference that describes these things well?  I collected a few online references and picked up enough to do some simple things, but they seem to be mostly disorganized things thrown on the page and are lacking in thoroughness in many respects (like this one).

Can anyone suggest a good reference that is thorough on the basics?  For example (besides this post), the specific usages and functions of registers, fundamental differences between similar ops (SUB vs. SBB for example - I know the mechanical difference, but why use one compared to the other, wouldn't you always want to borrow/carry, and so forth?).  More or less, stuff the references I have found don't quite fully explain.

MichaelW

Try the MASM manuals available here. For example, see Using Addresses and Pointers here, and for SUB versus SBB see the Processor chapter here. Note that there is a long-standing error in the description of SBB. The sentence "SBB is used to subtract the least significant portions of numbers that must be processed in multiple registers" should read "SBB is used to subtract the most significant portions of numbers that must be processed in multiple registers". You can verify this, as well as test the operation of any of the 8086/8088 or 8087 instructions, with the DOS Debug program. For example:

-a
0B07:0100 mov ax, 200
0B07:0103 mov dx, 10
0B07:0106 sub al, dl
0B07:0108 sbb ah, dh
0B07:010A nop
0B07:010B
-t

AX=0200  BX=0000  CX=0000  DX=0000  SP=FFEE  BP=0000  SI=0000  DI=0000
DS=0B07  ES=0B07  SS=0B07  CS=0B07  IP=0103   NV UP EI PL NZ NA PO NC
0B07:0103 BA1000        MOV     DX,0010
-t

AX=0200  BX=0000  CX=0000  DX=0010  SP=FFEE  BP=0000  SI=0000  DI=0000
DS=0B07  ES=0B07  SS=0B07  CS=0B07  IP=0106   NV UP EI PL NZ NA PO NC
0B07:0106 28D0          SUB     AL,DL
-t

AX=02F0  BX=0000  CX=0000  DX=0010  SP=FFEE  BP=0000  SI=0000  DI=0000
DS=0B07  ES=0B07  SS=0B07  CS=0B07  IP=0108   NV UP EI NG NZ NA PE CY
0B07:0108 18F4          SBB     AH,DH
-t

AX=01F0  BX=0000  CX=0000  DX=0010  SP=FFEE  BP=0000  SI=0000  DI=0000
DS=0B07  ES=0B07  SS=0B07  CS=0B07  IP=010A   NV UP EI PL NZ NA PO NC
0B07:010A 90            NOP
-


For a more thorough description of the instructions, including the most recent instructions, see the Intel manuals available from the Forum Web Site under Technical Reference.

For information on using Debug, search the forum for "debug tutorial".
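
To make the SUB/SBB split concrete, here is a minimal sketch of a 32-bit subtraction done in two 16-bit halves. The variable names and values are made up for illustration. SUB handles the low words because no borrow is coming in yet (the carry flag could hold anything at that point), and SBB then handles the high words so that a borrow out of the low half is taken into account.

```asm
.model small
.stack
.data
    minuend    dd 00020000h       ; hypothetical 32-bit value
    subtrahend dd 00000001h       ; hypothetical 32-bit value
    result     dd ?
.code
.startup
    mov ax, word ptr minuend      ; low word of the minuend
    mov dx, word ptr minuend+2    ; high word of the minuend
    sub ax, word ptr subtrahend   ; low words: 0000h - 0001h = FFFFh, CF set (borrow)
    sbb dx, word ptr subtrahend+2 ; high words: 0002h - 0000h - CF = 0001h
    mov word ptr result, ax       ; result = 0001FFFFh
    mov word ptr result+2, dx
.exit
end
```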
eschew obfuscation

Glenn9999

Here's a specific example of something I'm trying to get to work.  I don't know if I'm not understanding something or simply getting it wrong somehow.


INT 21 - Windows95 - LONG FILENAME - FIND FIRST MATCHING FILE
AX = 714Eh
CL = allowable-attributes mask (see #1107 at AX=4301h)
      (bits 0 and 5 ignored)
CH = required-attributes mask (see #1107)
SI = date/time format (see #1467)
DS:DX -> ASCIZ filespec (both "*" and "*.*" match any filename)
ES:DI -> FindData record (see #1468)
Return: CF clear if successful
    AX = filefind handle (needed to continue search)
    CX = Unicode conversion flags (see #1469)
CF set on error
    AX = error code
7100h if function not supported
Notes: this function is only available when IFSMgr is running, not under bare
  MS-DOS 7
the application should close the filefind handle with AX=71A1h as soon
  as it has completed its search
for compatibility with DOS versions prior to v7.00, the carry flag
  should be set on call to ensure that it is set on exit
SeeAlso: AH=4Eh,AX=714Fh,AX=71A1h


Now here's the last attempt that I used to try to call this interrupt function.  The part I'm not sure about is loading the Pascal string filespec (a length byte, then the data), and the FindData structure SR.  As far as I understand, LDS will load the proper address for DS:DX, and LES loads for ES:DI.  I also tried using MOVs to set the addresses, and didn't end up with anything working either.


      MOV   AX, 714Eh
      MOV   CX, fileattr
      XOR   SI, SI             { SI = 0 }
      LDS   DX, filespec+2  { try to load DS:DX for filespec }
      LES   DI, SR             { try to load ES:DI for SR }
      INT   21h
      SBB   BX, BX            { pull carry flag into BX }
      MOV   SHandle, AX
      AND   AX, BX           { not sure if I have this part right either. }
      MOV   DosError, AX


The code I have here crashes.  Any thoughts on why and how to correct?

MichaelW

Why a Pascal string? The DOS functions expect plain ASCIIZ strings, where ASCIIZ means a null-terminated string, as used by the C language.

LDS, LES, etc are intended for loading far pointers from memory. The following uses Debug to demonstrate loading a far pointer (A000h:1000h) from offset address 102 into ES:BX:

-a
0B07:0100 jmp 106
0B07:0102 dw 1000
0B07:0104 dw a000
0B07:0106 les bx, [102]
0B07:010A nop
0B07:010B
-t

AX=0000  BX=0000  CX=0000  DX=0000  SP=FFEE  BP=0000  SI=0000  DI=0000
DS=0B07  ES=0B07  SS=0B07  CS=0B07  IP=0106   NV UP EI PL NZ NA PO NC
0B07:0106 C41E0201      LES     BX,[0102]                          DS:0102=1000
-t

AX=0000  BX=1000  CX=0000  DX=0000  SP=FFEE  BP=0000  SI=0000  DI=0000
DS=0B07  ES=A000  SS=0B07  CS=0B07  IP=010A   NV UP EI PL NZ NA PO NC
0B07:010A 90            NOP


In your case, assuming that filespec and the FindData structure are stored in the data segment, you need to load DX with the offset address of filespec, DI with the offset address of the FindData structure, and ES with the segment address of the data segment. To load the offset addresses you could use the OFFSET operator:

mov dx, OFFSET filespec
mov di, OFFSET sr


Depending on whether you are running a .COM or .EXE, either the program loader or the startup code will have set DS to the segment address of the data segment. The easy way to set ES to the segment address of the data segment is to use the stack:

push ds
pop es


Note that your current code is loading DS with an invalid segment address. Most instructions that access data use DS by default, and these instructions will likely fail if DS is not set to the segment address of the data segment. In situations where you must change the value in DS, you must take care to restore the original value when you finish with the new value and/or, for instructions that access data, control which segment register the instructions use.
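
A minimal sketch of that save-and-restore pattern, assuming a far pointer stored in the program's own data segment. The name farptr is hypothetical, and the A000h:1000h value is simply borrowed from the Debug example:

```asm
.model small
.stack
.data
    farptr dd 0A0001000h      ; hypothetical far pointer: offset 1000h, segment A000h
.code
.startup
    push ds                   ; save the data segment set up by the startup code
    lds  si, farptr           ; DS:SI = A000h:1000h, read while DS is still valid
    mov  al, [si]             ; read one byte through the new DS
    pop  ds                   ; restore DS before touching the program's own data
.exit
end
```

Loading the pointer with LES instead and reading through a segment override (mov al, es:[bx]) would avoid changing DS at all.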

For functions where the carry flag is used to indicate an error, you would typically test the carry flag with a conditional jump, something like this:

    jnc no_error
      ; error handler here
  no_error:
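
Putting the pieces together, a call to AX=714Eh might look roughly like the sketch below. It assumes filespec and the FindData buffer are defined in the MASM module's own data segment. The names finddata, shandle and doserror are made up for illustration. The 318-byte buffer size and the use of BX to pass the handle to AX=71A1h are not stated in the listing quoted earlier in this thread, so treat them as assumptions and check them against the interrupt list.

```asm
.model small
.stack
.data
    filespec db "*.*",0           ; ASCIIZ filespec
    finddata db 318 dup (?)       ; FindData record (size is an assumption)
    shandle  dw 0                 ; hypothetical: filefind handle
    doserror dw 0                 ; hypothetical: DOS error code, 0 if none
.code
.startup
    push ds
    pop  es                       ; ES = DS, so ES:DI can point at finddata
    mov  dx, OFFSET filespec      ; DS:DX -> filespec
    mov  di, OFFSET finddata      ; ES:DI -> FindData record
    mov  ax, 714Eh                ; LFN find first matching file
    mov  cx, 0                    ; CL = allowable, CH = required attribute masks
    xor  si, si                   ; SI = 0, date/time format
    stc                           ; set carry on call, as the listing recommends
    int  21h
    jnc  found
    mov  doserror, ax             ; CF set: AX = error code (7100h if unsupported)
    jmp  done
found:
    mov  shandle, ax              ; CF clear: AX = filefind handle
    mov  bx, ax                   ; handle in BX for the close call (assumption)
    mov  ax, 71A1h                ; close the filefind handle
    int  21h
done:
.exit
end
```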

Glenn9999

Quote from: MichaelW on November 07, 2008, 06:45:40 AM
Why a Pascal string? The DOS functions expect plain ASCIIZ strings, where ASCIIZ means a null-terminated string, as used by the C language.

That was one of the many odd things I tried while eliminating other possibilities.  What I'm trying to do with the function is interface it to Turbo Pascal for an old DOS program I'm fixing up to work with LFN for a boot disk (and the reason I wanted to learn ASM).  TP doesn't play well with ASCIIZ without a lot of work.

As for the specific case, I looked at the code really closely in the debugger and noticed that my attempts to do as you posted (MOV  DX, Offset filespec) were blanking out the values or putting weird (obviously wrong) values in their place.  I ended up sticking with LES/LDS and figured out what I was doing wrong.  Again a memory addressing issue.  :dazzled:  I changed filespec to pass a pointer and what I had worked.  So I guess pointers are the order of the day if I want to pass anything other than a byte, word, or dword to an ASM proc?

On to the next problem of many, I suppose.  Question: When you start out with ASM does it usually take hours to do the simplest of things?
At least when I get finished with this project, I can go on to 32-bit stuff and see what is different.

MichaelW

My reply was based on the assumption that the strings were defined in the MASM module. So you are actually doing mixed-language programming, with the Pascal code calling the MASM code? The MASM Programmer's Guide has a chapter on mixed-language programming.

Quote
Question: When you start out with ASM does it usually take hours to do the simplest of things?

It did for me.

Glenn9999

Okay, so let me see if my understanding is correct.

I can address a field by its variable name alone as long as it is a simple unit (DB, DD, DW, I'm sure more).  I can also address a multiple unit variable by name (a string or a struct) if it involves a direct reference for which that pointer would suffice.  For example, loading an address from it, or passing it to a subprogram which accepts the multiple unit variable. 

Now if I need to access a subunit of the multiple unit variable, I need to have the address of it and then refer to it by putting [] around it.  For example, if I process a string to work on its individual characters, I would need to load the address into a register and then refer to the register in brackets.  Like (untested):


MyString DB "Hello World!"
LEA BX, MyString
MOV CX, [BX]


would hold the character equivalent of "H" in CX?  Then assuming if I wanted to process the rest, I could put it into a loop and keep adding 1 to BX?  Now if I remove the brackets, I'm just passing the address back to CX?  Finally, if I have a struct and have an interrupt function load into part of it, I can pass an address equal to the position of the place?  Say I have two words and a string and my interrupt function loads into the string, I can establish DS, and then load the struct address into DX and then add four to it?

Am I getting close to understanding the idea of addressing memory in ASM?

Quote
So you are actually doing mixed-language programming, with the Pascal code calling the MASM code?

Or Delphi... though both have rather nice inline assembler facilities which I can use rather than MASM, but I can link OBJ files as well.  I will likely try that soon.  The project I asked about is more recreational now (I found out the LFN functions do not work in real DOS but only Windows), but still something I can start learning with.  Even so, I'll probably move on to some simple programming project definitions in full ASM that I have here to try to get a good breadth of how ASM works.

Thanks for your help, it is much appreciated.  And I'm pretty sure I will ask more (or start helping when I get comfortable, one...)

MichaelW

You're not too far off. Try experimenting with the following code, assembled and then linked with a 16-bit linker, using a batch file like this:

ML /c filename.asm
pause
LINK16 filename.obj;
pause



.model small
.stack
.data
    mystring db "hello world",0
.code
.startup
    lea bx, mystring
    ; load the indexed character into DL (note byte-sized register)
    mov dl, [bx]
    ; display the character in DL   
    mov ah, 2
    int 21h
    ; wait for a key press before exiting
    mov ah, 0
    int 16h
.exit
end


Quote
and then load the struct address into DX

In 16-bit code indirect memory operands can use a base register (BP or BX) or an index register (SI or DI), or one of each. See Operands in Chapter 3 of the MASM Programmer's Guide.
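
As a sketch of stepping through a string that sits inside a larger record, assume a made-up record named rec with two words followed by a null-terminated string, so the string starts at offset 4. DX can carry an address to a DOS function, but it cannot appear inside brackets, so BX does the indirect access here.

```asm
.model small
.stack
.data
    rec dw 0, 0                   ; hypothetical record: two words at offsets 0..3
        db "hello world",0        ; string field starts at offset 4
.code
.startup
    mov bx, OFFSET rec            ; BX = address of the record
    add bx, 4                     ; BX = address of the string field
next_char:
    mov dl, [bx]                  ; load the current character (byte-sized register)
    or  dl, dl                    ; reached the terminating null?
    jz  finished
    mov ah, 2                     ; DOS: display the character in DL
    int 21h
    inc bx                        ; advance to the next character
    jmp next_char
finished:
.exit
end
```

Without the brackets, mov dx, bx would copy the address itself rather than the byte stored at that address.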