strings and procedures

Started by RuiLoureiro, March 23, 2005, 03:16:20 PM

RuiLoureiro

      Let me ask this question: what are the problems with variables defined like this:

       dd 0               ; - 8 effective number of bytes
       dd X               ; - 4 maximum number of bytes     
_var  db X dup (?)   ; + 0 buffer for X bytes

and procedures like this:

; GetByte is an interface procedure
; input:  ESI = ptr _var
CpyBytVar   proc
            pushad

            ; ini
            mov   ebx, esi
            mov   dword ptr [esi - 8], 0     ; 0 bytes in buffer
            mov   ecx, dword ptr [esi - 4]
            jecxz short _e

            ; get byte from keyb, file, etc. to AL
_i:         call  GetByte                    ; preserve all other registers
            jz    short _e                   ; no more

            ; set byte in var
            mov   byte ptr [esi], al
            inc   dword ptr [ebx - 8]
            inc   esi
            dec   ecx
            jnz   short _i                   ; accept another
            ;
_e:         popad
            ret
CpyBytVar   endp

MichaelW

Assuming you are using MASM:

The count for the DUP operator must be a constant.

Referring to variables by name rather than by an address or index is easier and helps prevent programmer errors. Using named constants (such as MAXBYTES in the code below) also helps prevent programmer errors.

For a statement like:

mov byte ptr [esi], al

MASM knows the size of the memory operand from the size of the register operand, so you don't need the "byte ptr".

I removed the two instructions below because they do not make sense if the maximum number of bytes is a constant. Although this general method of conditionally jumping when the value of a variable is zero would work, it would be very inefficient. Generally, you should just compare the variable to zero and jump based on the result.

mov ecx, dword ptr [esi - 4]
jecxz short _e


.data
    MAXBYTES EQU 8              ; maximum number of bytes
    nBytes  dd 0                ; effective number of bytes
    _var    db MAXBYTES dup (?) ; buffer for MAXBYTES bytes
.code

; GetByte is an interface procedure
; input:  ESI = ptr _var
CpyBytVar proc
    push  eax
    ; ini
    mov   nBytes, 0               ; 0 bytes in buffer

    ; get byte from keyb, file, etc. to AL
  _i:
    call  GetByte                 ; preserve all other registers
    jz    _e                      ; no more

    ; set byte in var
    mov   [esi], al
    inc   nBytes
    inc   esi
    cmp   nBytes, MAXBYTES
    jb    _i                      ; accept another while nBytes < MAXBYTES (jna would store one byte too many)
    ;
  _e:
    pop   eax
    ret
CpyBytVar endp

eschew obfuscation

RuiLoureiro

Hi, MichaelW

   Thank you for your attention.

1. OK, we can assume MASM (for GoAsm or another assembler we can convert it).

2. OK, "byte ptr" comes from my work with TASM!

3. The basic idea behind the structure of _var is having only «_var».
   I have one name, which is _var.
   What I don't want is to put more names, more words, like «nBytes», in my text.

4. «DUP operator must be a constant»:
   When I am writing a program I use a particular character (like $) for
   all names in a particular set (equates). So, OK, it could be
   $MAXBYTES  equ 8 (assuming that we want a buffer for 8 bytes). But
   now I am violating my principle: I want one name only!

5. CpyBytVar is intended to be a general procedure for all variables of that type.
   So, it cannot use $MAXBYTES !
   More: I don't want to pass more than the variable address (by register or stack).

6. «it does not make sense if the maximum number of bytes is a constant»:
   a) The procedure CpyBytVar is one and the same for all variables;
   b) It should not accept more than the address ( one thing ).

7. Now, how to solve this problem ?

For instance, I have 2 variables _var1 and _var2:

.data
$var1   EQU 82
        dd $var1            ; maximum number of bytes
        dd 0                ; effective number of bytes
_var1   db $var1 dup (?)    ; buffer for $var1 bytes
        ;
$var2   EQU 20
        dd $var2            ; maximum number of bytes
        dd 0                ; effective number of bytes
_var2   db $var2 dup (?)    ; buffer for $var2 bytes

.code
; GetByte is an interface procedure
; input: ESI = ptr _var
CpyBytVar   proc
            pushad

            ; ini
            mov   ebx, esi                   ; EBX = ptr _var (header is at negative offsets)
            mov   dword ptr [esi - 8], 0     ; 0 bytes in buffer

            mov   ecx, dword ptr [esi - 4]   ; ECX = maximum number of bytes
            or    ecx, ecx                   ; zero capacity?
            jz    short _e                   ; yes: exit

            ; get byte from keyb, file, etc. to AL
@@:
            call  GetByte                    ; preserve all other registers
            jz    short _e                   ; no more

            mov   [esi], al                  ; set byte in var
            inc   dword ptr [ebx - 8]        ; one more byte
            inc   esi
            dec   ecx
            jnz   @B                         ; accept another

            ; buffer is full, or no more input
_e:
            popad
            ret
CpyBytVar   endp

What is your assessment? What is inefficient?
Best regards
      «we want friends, not more!»

RuiLoureiro

Sorry, .data should be:


.data
$var1   EQU 82
        dd 0                ; effective number of bytes
        dd $var1            ; maximum number of bytes
_var1   db $var1 dup (?)    ; buffer for $var1 bytes
        ;
$var2   EQU 20
        dd 0                ; effective number of bytes
        dd $var2            ; maximum number of bytes
_var2   db $var2 dup (?)    ; buffer for $var2 bytes

regards

MichaelW

You cannot define dynamic data this way. The count for the DUP operator MUST be a constant. If you need to change the size of the data at run time, you probably need to allocate a block of memory from the OS.

RuiLoureiro

Quote from: MichaelW on March 24, 2005, 09:13:08 PM
You cannot define dynamic data this way. The count for the DUP operator MUST be a constant. If you need to change the size of the data at run time, you probably need to allocate a block of memory from the OS.


Hi

1.  It is not dynamic data! So I don't need to allocate it from the OS.
2.  The count for the DUP operator is a CONSTANT (e.g. $var1 EQU 82)!!!
3.  The total length of the memory is a CONSTANT (= 4+4+82). This memory is part of the program itself!!

.data
$var1   EQU 82              ; What is the problem with this definition?
        dd 0                ; effective number of bytes
        dd $var1            ; maximum number of bytes
_var1   db $var1 dup (?)    ; buffer for $var1 bytes

Sorry, I don't see the reasoning in your answer!

Your first hint, where you defined MAXBYTES EQU 8 and a new variable like this

MAXBYTES EQU 8
         dd 0                   ; effective number of bytes
_var1    db MAXBYTES dup (?)    ; buffer for MAXBYTES bytes

MEANS: I must write a new procedure for each new variable. That is not acceptable. Do you agree?

What's the problem with CpyBytVar?

Anyway, thank you
Regards
------------------------------------------------------------------------------------------------------------------------------------------------
I am here to talk about programming and related questions in a friendly way with friendly people, I hope
-------------------------------------------------------------------------------------------------------------------------------------------------

MichaelW

Quote from: RuiLoureiro on March 25, 2005, 03:55:06 PM
1. It is not dynamic data ! So, I don't need to allocate it from OS.
2. The count for the DUP operator is a CONSTANT (e.g. $var1 EQU 82)!!!

Sorry, it was late when I read your post. You left out too many details in your initial post, forcing me to guess at what you are trying to do, and I guessed wrong. Your naming conventions are IMO very odd and hard to follow.  For MASM, I think I would use something similar to this (not tested):

    varHeader STRUCT
        maxBytes  DWORD 0
        effBytes  DWORD 0
    varHeader ENDS

    .data

      MAXBYTES = 82
      var1 varHeader <MAXBYTES>
      db MAXBYTES dup(0)

      MAXBYTES = 20
      var2 varHeader <MAXBYTES>
      db MAXBYTES dup(0)

CopyByteVar proc uses ebx esi lpVar:DWORD
    mov   esi,lpVar   
    mov   ebx,esi
    add   ebx,SIZEOF varHeader

    ASSUME esi:ptr varHeader   

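    mov   [esi].effBytes,0
    mov   ecx,[esi].maxBytes
    test  ecx,ecx             ; zero-length buffer: nothing to copy
    jz    done
  next:
    call  GetByte             ; assumed: byte in AL, or EAX = -1 when no more; preserves EBX, ECX, ESI
    cmp   eax,-1
    je    done
    mov   [ebx],al
    inc   [esi].effBytes
    inc   ebx
    dec   ecx
    jnz   next
  done: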

    ASSUME esi:nothing   

    ret
CopyByteVar endp



RuiLoureiro

Hi

1.   What does IMO mean?
My naming conventions are a help to me. The assembler may know everything, but the one who writes
the text is me (is this right?!). If there is an error, the assembler orders me to do the
work; it does not do it itself ("He" says: Oh, oh, oh, go back and work more!).

2.   I don't have a MASM manual, so I don't know every instruction, directive, etc. exactly.

3.  «hard to follow»: I don't think so!
    When we are deep inside some issue (as in your case) that others have not followed,
     it may be hard for both sides. That's normal. If you put some questions to me, I would have the same problem too!
     That is the case with «uses ebx esi lpVar:DWORD».
     But I think my question is not strange! It is just not the normal, regular one! Try to go to the right
     every time you leave your home (yes, it's a joke!).

4.  The definitions of var1 and var2 were bypassed! The addresses of the maxBytes field
     (whatever you call it) and the effBytes field are now forward addresses!
      Your var header is forward; my var header is backward. It implies that the address of var(1\2)
      doesn't point to the buffer directly. In my case it points directly to both sides, and the structure would be

    varHeader      STRUCT
                   DWORD ?
      £maxBytes    DWORD 0      ; 4
      £effBytes    DWORD 0      ; 8
    varHeader      ENDS

   but it implies backward addresses (I use £ at the beginning to remember this).
   I think we could define var1 and var2 with macros (in MASM), but I like that ... little!
   I have implemented hundreds of structures of that type, but not in MASM.

5.    You use «.», which means «plus», as in «mov   ecx, [esi].maxBytes». I am not a mathematician, but something close to one! That is why I prefer «mov   ecx, [esi + maxBytes]». Yes, I know it is a question of what a converter (assembler) assumes, etc. Nothing wrong with that! I don't know if MASM accepts «mov   ecx, [esi + maxBytes]». This question is related to
ambiguities: «.» is used in .data, in 1.23, in [esi].maxBytes to point to structure members, and ...
If all directives had to begin with, for example, # or % or something else, that would be a good way to
recognise and understand what we are reading. It would not be a constraint for me. The only
restriction would be: don't use this symbol in your names, but otherwise you can "make" any name.

6.  It was assumed that GetByte gives -1 when it has no more bytes. In Windows, can't we use the flags register between procedures? For example, stc meaning «no more», «exit»?

7.   «uses ebx esi lpVar:DWORD»: here, I don't know where lpVar is. I think MASM
      generates (assumes) something like this (I could see it with a debugger):

sCopyByteVar   struct
ret0         dd ?   ; 0
ebp0         dd ?   ; 4
         ;
lpVar         dd ?   ; 8
sCopyByteVar   ends
lCopyByteVar      equ  SIZEOF  sCopyByteVar -  lpVar    ; in MASM it is SIZEOF, not SIZE, yes?
CopyByteVar       proc 
         push   ebp
         mov   ebp, esp
         ;
         push   ebx  esi
             ;mov   esi, lpVar             ; header pointer
         mov    esi, [ebp + lpVar]      ; mov   esi, [ebp].lpVar  ?     
         .........
         .........
         pop   esi ebx ebp
         ret   lCopyByteVar
CopyByteVar       endp

   Is that right in MASM? Is that what it does?

   These are other things I like ... very little. Even though the assembler does it for me, I prefer to spell everything out. In this case my text is longer, but it uses only basic assembler rules and is not forced to be "compiled" by that one particular "compiler". It is more or less ... free!

8. About efficiency: with this type, each variable consumes 4 bytes of data memory for maxBytes,
but when we call the procedures that deal with it (to do some type of processing) we don't need to spend perhaps more than 4 bytes to pass the length (the max). Is that wrong?

Finally: looking at my original question, in your solution there are too many names in the text to
         define var1 and var2 (I use _var1 and _var2 for all data variables). If I have an error and
         I have some simple organization (like this), it can be easier to follow, to see, etc. It helps me.
        If I give you a big text that follows simple rules, I can pass these rules on to you
        simply and you can follow it without problems.
        Is that not right?

Regards

tenkey

If you want to avoid using STRUCT, you can use equates...

fld_maxlen EQU -8  ; offset of maximum length field
fld_efflen EQU -4  ; offset of effective length field

    .code

    mov ecx,[eax + fld_efflen]   ; get current length


I've done this for object allocation where the header information is accessed with a negative offset.

Not much can be done about the number of declaration names without macros.

define_stringbuffer MACRO stringname, bufferlength
    dd bufferlength
    dd 0
stringname db (bufferlength) dup (?)
    ENDM

    .data

    define_stringbuffer _var1,8    ; define _var1, length 8
    define_stringbuffer _var2,16   ; define _var2, length 16
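
Putting the equates and the macro together, a general procedure that needs only the buffer address might look like the sketch below. It is not tested. It assumes the fld_maxlen / fld_efflen equates and the define_stringbuffer layout above (maximum length at -8, effective length at -4), and a GetByte that returns the byte in AL, sets ZF when there is no more input, and preserves all other registers, as described in the first post.

```asm
; Sketch only, not tested. Relies on fld_maxlen / fld_efflen defined above.
; input: ESI = ptr to a buffer created with define_stringbuffer
CpyBytVar proc
    pushad
    mov   ebx, esi                          ; EBX keeps the buffer base for header access
    mov   dword ptr [ebx + fld_efflen], 0   ; 0 bytes in buffer
    mov   ecx, [ebx + fld_maxlen]           ; ECX = capacity of this buffer
    test  ecx, ecx
    jz    done                              ; zero-length buffer: nothing to do
next:
    call  GetByte                           ; assumed: AL = byte, ZF = 1 when no more
    jz    done
    mov   [esi], al                         ; store the byte
    inc   dword ptr [ebx + fld_efflen]      ; one more byte in the buffer
    inc   esi
    dec   ecx
    jnz   next                              ; room left: accept another
done:
    popad
    ret
CpyBytVar endp

; usage: the address is the only thing passed
    mov   esi, offset _var1
    call  CpyBytVar
```

The same procedure then serves _var1 and _var2 without any per-variable constant, because the capacity travels with the buffer.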

A programming language is low level when its programs require attention to the irrelevant.
Alan Perlis, Epigram #8

RuiLoureiro

Hi, tenkey
   Thanks

1.   I remind you that this topic is only a «general purpose question»!

2.   Yes, we could do

   fld_maxlen    EQU -8 
   fld_efflen    EQU -4 
   or
   fld_maxlen    EQU 8 
   fld_efflen    EQU 4 

but:   a)  I lose the visual structure of the variable;
   b)  I mix variable structures with another type of constant defined with EQU.
        (e.g. EQU: used for constants; STRUCT: used for names of ...)

About backward access:
I used "backward access" as an image for a negative offset. It can be seen in a variety of ways
(maybe upward, topward, ... negative, ...??? A buffer beginning at _var with a header stamped at the back, or behind it ...!).

As we know, Pascal string variables have the structure:

_VarPAS     db X
            db X dup (?)

and I put a header behind the buffer:

            dd 0
            dd X
_VarRCL     db X dup (?)

In the same way, I used a table, which has the same structure as _VarRCL:

            dd  2                  ; effective number
            dd  2                  ; number of structures in the buffer \ max
_TblRCL     dd  offset _VarRCL1
            dd  offset _VarRCL2

3.   Using a macro to lay out the variable makes it simple. It looks like the best option.

4.   In Windows, can't we use the flags register between procedures?
      For example:    stc:  meaning «no more», «exit», «error»...
                      clc:  meaning «ok»

What does IMO mean? Do you know?

Stay well
Regards

MichaelW

Rui,

IMO means "In My Opinion". It is a way of stating "This is what I think, but I recognize that others may not agree".

IMO using unnamed values, such as your -4 and -8 offsets, is a bad practice because it increases the probability of programmer error.



tenkey

RET does not change the flags register, so you can return information with the flags register. You only need to remember that any arith or logic instruction (such as CMP, ADD, AND, SHL, etc.) will change it. Also, CALLs (or INVOKEs) to other procedures can change it.
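
A small sketch of that convention, not tested, using the carry flag to signal «no more». The names GetByteCF, ReadOne, source, SRCLEN and srcPos are made up for the example and do not come from this thread.

```asm
; Sketch only, not tested. All names here are hypothetical.
    .data
source  db "abc"
SRCLEN  equ $ - source        ; number of bytes in source
srcPos  dd 0                  ; index of the next byte to return

    .code
; out: CF = 0 and AL = next byte, or CF = 1 when there are no more bytes
; changes EDX
GetByteCF proc
    mov   edx, srcPos
    cmp   edx, SRCLEN
    jae   nomore              ; position has reached the end
    mov   al, source[edx]
    inc   srcPos              ; INC does not touch CF, but the state is set explicitly below
    clc                       ; last flag-changing instruction before RET
    ret
nomore:
    stc
    ret
GetByteCF endp

; hypothetical caller: ESI = destination for one byte
ReadOne proc
    call  GetByteCF
    jc    finished            ; test CF at once, before anything else can change it
    mov   [esi], al
finished:
    ret
ReadOne endp
```

The point of the design is that the flag is set by the last instruction before RET and read by the first instruction after CALL, so nothing in between can disturb it.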

RuiLoureiro

Hi  MichaelW

   IMO OK ! I never used negative constants with equ.

tenkey,

   I know the behaviour of a 386 processor (when it follows what is written). I have the book «386SX Microprocessors». The question is about the behaviour of Windows and the usual good practices. The call itself doesn't change the flags. The same goes for push. If my procedures use that convention (exit with clc\stc) and they don't call system or other procedures, then they don't modify the flags established before exit. I think Windows saves all registers when it switches away from a thread and restores them when it comes back to that thread.

   I have some experience (some years) in assembler for 16-bit DOS with TASM from Borland C++ 3.1 (I don't like C). I have written many structures and procedures.

   I think this topic is near the end.

Stay well

RuiLoureiro

Hi all
Correction: The last structure sCopyByteVar should be

sCopyByteVar   struct
ebp0      dd ?   ; 0    esp + 0  access  ebp
ret0      dd ?   ; 4    esp + 4  access  ret
      ;
lpVar      dd ?   ; 8    esp + 8  access  lpVar
sCopyByteVar   ends
lCopyByteVar   equ  SIZEOF  sCopyByteVar -  lpVar

In any case, we don't use ebp0 or ret0 inside the procedure, so the result is the same.
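
For comparison, under «.model flat, stdcall» a header such as «CopyByteVar proc uses ebx esi lpVar:DWORD» makes MASM generate a prologue and epilogue roughly like the hand-written version below. This is a sketch of MASM's default frame as I understand it, not a disassembly of the code in this thread, and CopyByteVarManual is a made-up name.

```asm
; Sketch only: hand-written equivalent of the frame MASM builds by default.
CopyByteVarManual proc              ; no USES and no parameters, so MASM should add nothing itself
    push  ebp
    mov   ebp, esp                  ; [ebp+0] = old EBP, [ebp+4] = return address
    push  ebx                       ; USES registers are pushed one at a time
    push  esi
    mov   esi, [ebp + 8]            ; [ebp+8] = lpVar
    ; ... body ...
    pop   esi                       ; restored in reverse order
    pop   ebx
    leave                           ; same as: mov esp, ebp / pop ebp
    ret   4                         ; stdcall: the callee removes the 4-byte argument
CopyByteVarManual endp
```

Two MASM details differ from the TASM-style text above: PUSH and POP take one operand each, and from MASM 6 on a structure field is normally written with its structure name (sCopyByteVar.lpVar) unless OPTION OLDSTRUCTS is in effect.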
-----------------------------------------------------------------------------------------------------------------------------------------
   On concepts, let me tell you how I solved a problem in 16-bit DOS: showing MyDB (a file with a set of structured records) on the screen. The next lines give a rough sketch of how it was conceived.

1. Intro
  «Prc» means «procedure»; « -> » means «call»; «EP» means «Entry Point»; «Tbl» means «Table»
  «The procedures don't return to the next instruction and they are in the same segment »
  «GoTo» = «push offset EP_???; ret »; «AltKey is used to open a menu window»
  «The entry points in _TblExcKeyA are defined in MainPrc »
  «The entry points in _TblExcKeyB are defined in RCLEditPrc »

2. Example of the tables of keys and entry points (where the procedures that process the keys are):

            dw 3
_TblKeyA    db PgUp
            db Return
            db Esc

            dw 3
_TblExcKeyA dw offset EP_PgUp
            dw offset EP_Enter
            dw offset EP_Esc

3. The basic procedures
       VrfKeyTbl: Verify if the key is in the given table;
                       if yes, return the corresponding Entry Point EP_InTbl (EP_Esc or
                                   EP_Enter or ...);

   MainPrc      ->     RCLEditPrc       ->   AllKeyPrc
     [ MainPrc is in the outer ring; AllKeyPrc is in the inner ring ]
     [ After all processing inside MainPrc or RCLEditPrc, «ret» is used to return to AllKeyPrc ]
-----------------------------------------------------------------------------------------------------------------------------------------
4.  How each procedure works
AllKeyPrc:   wait for a key.
         0   iEP_AllKeyPrc:
         1   If no key was pressed, then GoTo EP_Wait;
         2   If key = AltKey, then GoTo EP_Menu;
         3   Take _TblKeyA and _TblExcKeyA (VrfKeyTbl)
               and see if the key is in _TblKeyA. Yes? GoTo EP_InTbl
         4   Take _TblKeyB and _TblExcKeyB (VrfKeyTbl)
               and see if the key is in _TblKeyB. Yes? GoTo EP_InTbl
         5   GoTo iEP_AllKeyPrc   [ begin ]
      [ note: before GoTo EP_InTbl, iEP_AllKeyPrc is pushed onto the stack ]
----------------------------------------------------------------------------------------------------------------------------------------
RCLEditPrc: Edit the string _varRCL at line LIN, column COL, color ATTR.
       It is called from MainPrc with { LIN, COL, ATTR, _varRCL, _TblKeyA,
                       _TblExcKeyA, EP_Wait, EP_Menu }
       It calls AllKeyPrc with { _TblKeyA, _TblExcKeyA, EP_Wait, EP_Menu,
                       _TblKeyB, _TblExcKeyB, EP_Chars }
      «this set of parameters is passed on the stack»
EP_Chars:       [ Process A to Z, 0-1, etc., put in _varRCL and print it ]
EP_Del:      [ Delete the current character from _varRCL ]
EP_BkSpace: [ Delete the previous character from _varRCL ]
etc.
   «These EPs are in RCLEditPrc and defined in the table _TblExcKeyB»
-------------------------------------------------------------------------------------------------------------------------------------
MainPrc:   is where RCLEditPrc is called and where all the Entry Points defined in the table
      _TblExcKeyA, which process the keys in the table _TblKeyA, are located.

In these EPs there are procedures to:
EP_Wait:   [ Verify if _varRCL is in MyDB. If yes, call a Prc to refresh the screen ]
EP_Menu:   [ Open the menu window and process the menu ]
...
EP_PgUp:   [ Show the next set of records of MyDB ]
etc.
   «All these EPs are in MainPrc and defined in the table _TblExcKeyA»
-----------------------------------------------------------------------------------------------------------------------------------------
5.   Conclusion

   If  we want to add or change keys, etc. we don't need to change the basic procedures.
   We need to change the tables and\or the procedures defined which are in EP_???.
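
The table lookup at the centre of this scheme can be sketched as follows. This is not the original 16-bit code: it is an untested 32-bit variant that assumes one byte per key with the count in the dword just before the key table, one dword per entry point, and a carry-flag result. The register assignments are choices made for the example.

```asm
; Sketch only, not tested: a 32-bit variant of the lookup described above.
; in:  AL  = key code
;      ESI = ptr key table          (e.g. offset _TblKeyA, count at [ESI - 4])
;      EDI = ptr entry-point table  (e.g. offset _TblExcKeyA, same order as the keys)
; out: CF = 0, EAX = entry point    if the key is in the table
;      CF = 1                       if it is not
; changes ECX, EDX
VrfKeyTbl proc
    mov   ecx, [esi - 4]            ; number of keys in the table
    xor   edx, edx                  ; EDX = index of the key being compared
    test  ecx, ecx
    jz    notfound                  ; empty table
next:
    cmp   al, [esi + edx]
    je    found
    inc   edx
    dec   ecx
    jnz   next
notfound:
    stc
    ret
found:
    mov   eax, [edi + edx*4]        ; entry point stored at the same index
    clc
    ret
VrfKeyTbl endp
```

A caller would load ESI and EDI, call VrfKeyTbl, and on CF = 0 transfer control with «jmp eax» (or «push eax» followed by «ret», as in the «GoTo» above). Adding a key then means adding one entry to each table, with no change to the procedure.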

   Is there anyone who did not understand how this works?

stay well
RuiLoureiro

pbrennick

RuiLoureiro,
Two things:

1)  I think your postings are antagonistic towards people who know more about MASM than you do and are only trying to help.  If you don't want the advice, don't ask. I guess I can be antagonistic, too!  :cheekygreen:

2)  Your 'assumption' that MASM will return a message about 'all' errors is incorrect.  It will only report compilation errors, which is not the same thing at all.  Your code can contain tons of errors (not saying it does) and it will still compile (I prefer 'assemble') without a hiccup.  This means that troubleshooting is a fact of life when coding without the benefit of an IDE that contains runtime debugging modules.  Personally, I wouldn't have it any other way; the penalty is IMO too high.  Still, be warned!

I can't wait to see how you reply to this one, given your track record!  :cheekygreen: :cheekygreen:

Paul