Macro Stack

Started by Rockoon, February 04, 2007, 06:28:58 AM

Rockoon


Are there any good examples that implement an assemble-time stack to aid macros?

I want a stack of strings that I can push to and pop from during assembly. The strings will be single-character symbols (operators) and instruction operands (memory references). I'm really not sure where to start.

Is there more than one technique? Advantages/Disadvantages?
When C++ compilers can be coerced to emit rcl and rcr, I *might* consider using one.

MichaelW

The MASM32 switch macros implement a stack, and there are some generalized text stack macros, some for the switch$ macros, and some that Hutch created near the end of macros.asm. And you might search Zooba's posts.

Also, here is a newer version of my pushtext macro that corrects a problem I found while working on a text tokenizing macro that required a text stack:

  ;----------------------------------------------------------------
  ; This macro is a fixed replacement for the pushtext macro,
  ; that will work in combination with the existing poptext
  ; macro. The original version performed the recursive
  ; $text_stack$ assignment in a single statement, and because
  ; of this it would corrupt the (text) stack if the last equate
  ; pushed was assigned a value after it was pushed.
  ;----------------------------------------------------------------   

    _pushtext MACRO name:req
      local tmp
      tmp CATSTR <name#>, $text_stack$
      $text_stack$ TEXTEQU tmp
    ENDM

eschew obfuscation

zooba

A big disadvantage to keep in mind is that assemble-time strings have a limit of 255 characters (possibly 256, but I think 255).
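
A minimal sketch of guarding a push against that limit, assuming the limit really is 255 characters as stated above. The names demo_stack, checked_push and STACK_LIMIT are hypothetical, and # is assumed never to appear in a pushed item.

```masm
; Hypothetical names: STACK_LIMIT, demo_stack, checked_push.
STACK_LIMIT = 255                 ; assumed text length limit
demo_stack TEXTEQU <>             ; empty stack

checked_push MACRO item:REQ
    LOCAL stacklen, itemlen, tmp
    stacklen SIZESTR demo_stack   ;; current length of the stack text
    itemlen  SIZESTR <item>       ;; length of the new entry
    IF (stacklen + itemlen + 1) GT STACK_LIMIT
        ECHO ** warning (checked_push): text stack full, entry not pushed
    ELSE
        ;; build the new text first, then assign it (two-step pattern)
        tmp CATSTR <item>, <#>, demo_stack
        demo_stack TEXTEQU tmp
    ENDIF
ENDM

checked_push eax
checked_push <[esi+4]>
%ECHO demo_stack
END
```

The length test runs before the concatenation, so an oversized push produces one readable warning instead of whatever error the assembler would raise for an over-long text value.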

What Michael has shown is the simplest way of doing it, and if you can customise the macro to the sort of items you will be pushing it can be very simple. For example, if you are only doing single characters you don't need to define any delimiter between entries. If you know of a character that will never appear you can use that.

A good place to start for doing stuff like this is the string operators CATSTR, SUBSTR and INSTR. In short, CATSTR will concatenate two or more strings (necessary for pushing items), SUBSTR will return part of a string (necessary for popping items) and INSTR will find a specified character or string in another string (also necessary for popping items if you have used a delimiter).
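
A minimal sketch of the three directives working on one delimited string. The symbol names (demo_a, demo_all and so on) and the | delimiter are made up for illustration.

```masm
; Hypothetical symbol names; | is an arbitrary delimiter.
demo_a    TEXTEQU <abc>
demo_all  CATSTR demo_a, <|>, <def>          ; concatenate: abc|def
demo_pos  INSTR 1, demo_all, <|>             ; 1-based position of | : 4
demo_head SUBSTR demo_all, 1, demo_pos - 1   ; text before the delimiter: abc
demo_tail SUBSTR demo_all, demo_pos + 1      ; text after the delimiter: def
%ECHO demo_head demo_tail
END
```

Pushing is the CATSTR step and popping is the INSTR plus SUBSTR pair, which is the same shape the stack macros in this thread use.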

Here is a starting point for you:

; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««

$stack TEXTEQU <||>

$PUSH MACRO item
    $stack CATSTR <item||>, $stack
ENDM

$POP MACRO
    LOCAL item
    item SUBSTR $stack, 1, @InStr(1, %$stack, <||>) - 1
    $stack SUBSTR $stack, @InStr(1, %$stack, <||>) + 2
    EXITM <item>
ENDM

; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««

    $PUSH you?
    $PUSH are
    $PUSH how
    $PUSH <Hello,>
   
    %ECHO $POP() $POP() $POP() $POP()
   
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
.486
.model flat,stdcall

.code
Start:
END Start


It's possible to do much more complicated stuff than this. In fact, MASM's macro capabilities are good enough to implement a linked list or linked array to overcome the length limitation. (This is left as an exercise to the reader :wink )

Good luck

Cheers,

Zooba :U

Rockoon

I think 255 characters will be enough.

I've had some problems with the @InStr() macro and I am not sure why. It seemed like I couldn't coerce it to expand my text symbols. The INSTR directive hasn't failed me yet, though.

I prefer the @InStr() style :(

Maybe I don't have my head around what is really happening under the hood yet (The proper times to use a '%' prefix, when a param needs <>'s, and so forth)

Am I correct in assuming that when a macro initiates, all its parameters get expanded/translated? Same with vararg parameters?

zooba

The rules regarding the % operator are horrible to figure out. Here is as much as I know for sure:

Macro parameters are replaced directly.
Macro variables (created with TEXTEQU) are replaced by the assembler when it uses the value.

And here's what I guess:

If the macro is looking for and expecting a variable, it will use it appropriately. For example, the left operand of TEXTEQU/CATSTR/INSTR/etc. will behave properly for a variable (though not a parameter, since it is replaced directly).
If the macro is expecting text (for example, inside @InStr/@SizeStr/@CatStr) you need to put a % before the parameter to tell the assembler that it is a variable and not just text.

I don't really know where the % is meant to go. In some cases it works with it at the start of the line, sometimes it needs to be next to the variable itself. It is really trial and error.
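
As a small sketch of two uses of % that behave predictably, the snippet below converts a numeric value to text and then echoes it with and without expansion. The names demo_bits and demo_bits$ are hypothetical, and the default decimal radix is assumed.

```masm
; Hypothetical names: demo_bits (numeric), demo_bits$ (text).
demo_bits = 4 * 8

; % in front of an expression evaluates it and yields its text form,
; so demo_bits$ holds the text 32 rather than the name demo_bits.
demo_bits$ TEXTEQU %demo_bits

; Without %, ECHO prints the line as written: demo_bits$
ECHO demo_bits$

; With % at the start of the line, text macros are expanded first: 32
%ECHO demo_bits$
END
```

This covers only the line-start and expression forms. The placement of % inside macro function arguments such as @InStr is the part that still needs experimentation.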

Quote from: Rockoon on February 04, 2007, 11:25:48 AM
Am I correct in assuming that when a macro initiates, all its parameters get expanded/translated? Same with vararg parameters?

It's only slightly harder to test these things than it is to assume it, but you are correct. The parameter values are replaced before executing the macro. VARARG parameters are no different, but you must keep in mind that they probably include commas. VARARG parameters are only really useful in FOR statements (for which they need angle brackets) or passing to other macros with a VARARG parameter (for which they don't need angle brackets).

An easy way to see how these things work is to play with this macro:

EchoList MACRO items:VARARG
    FOR i, <items>
        ECHO i
    ENDM
ENDM


Try passing in a single item, a number of items, a number of items inside angle brackets, etc. You'll get a good idea of how the assembler treats these things. This really is an area without good enough documentation, so experimentation is the best way to pick it up.
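
A few invocations to start from, as a sketch. The macro is repeated so the block assembles on its own, and the argument names are arbitrary. No output is listed, since seeing what each call echoes is the point of the experiment.

```masm
EchoList MACRO items:VARARG
    FOR i, <items>
        ECHO i
    ENDM
ENDM

; a single item
EchoList one

; several items separated by commas
EchoList one, two, three

; two items grouped with angle brackets, followed by a third
EchoList <one, two>, three
END
```

Comparing the second and third calls shows how angle brackets change where the assembler splits the list.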

Enjoy

Zooba :U

Rockoon

I want to thank you guys, and to show off what I have been doing :)


@newstack macro stackname:req
    stackname equ <>
    stackname&items = 0
endm

@push macro stackname:req, value:req
    local tmp
    ;; build the new stack text first, then assign it
    tmp catstr <value>, <#>, stackname
    stackname textequ tmp
    stackname&items = stackname&items + 1
endm

@pop macro stackname:req
    local returnval, position
    if stackname&items
        stackname&items = stackname&items - 1
        position instr 1, stackname, <#>
        returnval substr stackname, 1, position - 1
        if stackname&items
            stackname substr stackname, position + 1
        else
            stackname equ <>
        endif
        exitm returnval
    else
        echo "@pop() stack underflow"
        ;; a macro function should still return something on underflow
        exitm <>
    endif
endm

; -----------------------------------------------

@tokenizer macro tokens:req, equation:req
    local character, token, tokensize
    local blocks

    token equ <>
    blocks = 0

    @newstack tokenizerstack

    forc character, <equation>
        if @InStr(1, < >, <character>)
            if blocks
                token catstr token, <character>
            else
                tokensize sizestr token
                if tokensize
                    @push tokenizerstack, <token>
                    token textequ <>
                endif
            endif
        elseif @InStr(1, <+-*/>, <character>)
            if blocks
                token catstr token, <character>
            else
                tokensize sizestr token
                if tokensize
                    @push tokenizerstack, <token>
                    token textequ <>
                endif
                @push tokenizerstack, <character>
            endif
        else
            if @InStr(1, <[>, <character>)
                blocks = blocks + 1
            elseif @InStr(1, <]>, <character>)
                blocks = blocks - 1
            endif
            token catstr token, <character>
        endif
    endm
    tokensize sizestr token
    if tokensize
        @push tokenizerstack, <token>
    endif

    while tokenizerstackitems
        token textequ @pop(tokenizerstack)
        @push tokens, <token>
    endm
endm

x87rpn macro equation:req
local token, fpustacklevel

@newstack tokens
@tokenizer tokens, <equation>

fpustacklevel = 0

while tokensitems

token textequ @pop(tokens)

ifidn token, <0> ; -------- constants
fpustacklevel = fpustacklevel + 1
fldz
elseifidn token, <1>
fpustacklevel = fpustacklevel + 1
fld1
elseifidni token, <pi>
fpustacklevel = fpustacklevel + 1
fldpi
elseifidni token, <sqrt> ; -------- functions
fsqrt
elseifidni token, <sqr>
fmul st(0), st(0)
elseifidni token, <sincos>
fpustacklevel = fpustacklevel + 1
fsincos
elseifidni token, <sin>
fsin
elseifidni token, <cos>
fcos
elseifidn token, <+> ; -------- binary operators
fpustacklevel = fpustacklevel - 1
faddp st(1), st(0)
elseifidn token, <->
fpustacklevel = fpustacklevel - 1
fsubp st(1), st(0)
elseifidn token, <*>
fpustacklevel = fpustacklevel - 1
fmulp st(1), st(0)
elseifidn token, </>
fpustacklevel = fpustacklevel - 1
fdivp st(1), st(0)
else
  fpustacklevel = fpustacklevel + 1
  fld token
endif

if fpustacklevel GT 8
  echo ** warning (x87rpn): fpu stack overflow (expression too complex)
endif
endm

if fpustacklevel LT 1
  echo ** warning (x87rpn): fpu stack underflow (too many pops)
elseif fpustacklevel GT 1
  echo ** warning (x87rpn): fpu stack contains more than 1 return value
endif
endm



an example (a somewhat common function):

.data

x   real8 1.0
y   real8 1.0
z   real8 1.0

.code

x87rpn x sqr y sqr z sqr + + sqrt



00000000  DD 05 00000000 R  2   fld ??0000
00000006  D8 C8      2 fmul st(0), st(0)
00000008  DD 05 00000008 R  2   fld ??0000
0000000E  D8 C8      2 fmul st(0), st(0)
00000010  DD 05 00000010 R  2   fld ??0000
00000016  D8 C8      2 fmul st(0), st(0)
00000018  DE C1      2 faddp st(1), st(0)
0000001A  DE C1      2 faddp st(1), st(0)
0000001C  D9 FA      2 fsqrt


MichaelW

Interesting.

My tokenizer is modeled after the strtok function, returning one token per call, with delimiters that can be changed between calls etc. I had forgotten that the macro does not use a text stack (I had to fix the pushtext macro for a related project). Your x87rpn macro is just the sort of thing I had in mind when I started on the tokenizer. This is an implementation using my tokenizer:

; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
    include \masm32\include\masm32rt.inc
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
  ;----------------------------------------------------------------
  ; This macro is functionally similar to the CRT strtok function,
  ; with the obvious differences that it runs at compile time, and
  ; returns the tokens as text.
  ;
  ; Arguments that contain embedded commas or leading or trailing
  ; whitespace must be enclosed in text delimiters (<>).
  ;----------------------------------------------------------------

    texttok MACRO text$, delimiter$
      LOCAL toklen, c$, cpos, texttok$
      IFNB <text$>
        __texttok__text$__    TEXTEQU <text$>
        __texttok__textlen__  SIZESTR <text$>
        __texttok__tokstart__ = 1
      ENDIF
      toklen = 0
      __texttok__prev__tokstart__ = __texttok__tokstart__
      WHILE (__texttok__tokstart__ + toklen) LE __texttok__textlen__
        c$ SUBSTR __texttok__text$__, (__texttok__tokstart__ + toklen), 1
        cpos INSTR <delimiter$>, c$
        IF cpos
          IF toklen
            EXITM
          ENDIF
          __texttok__tokstart__ = __texttok__tokstart__ + 1
        ELSE
          toklen = toklen + 1
        ENDIF
      ENDM
      IF toklen EQ 0
        texttok$ TEXTEQU <>     ;; Return null if no more tokens.
      ELSE
        texttok$ SUBSTR __texttok__text$__, __texttok__tokstart__, toklen
        __texttok__tokstart__ = __texttok__tokstart__ + toklen
      ENDIF
      EXITM <texttok$>
    ENDM

  ;----------------------------------------------------------------
  ; This macro effectively puts the last token returned by texttok
  ; back, so it can be returned again by the next call to texttok.
  ;----------------------------------------------------------------

    put_token_back MACRO
      __texttok__tokstart__ = __texttok__prev__tokstart__
    ENDM

; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««

x87rpn MACRO equation:req
    local token, fpustacklevel
    fpustacklevel = 0
    token TEXTEQU texttok(equation, < >)
    %echo token
    WHILE 1
      IFIDN token, <>
        EXITM
      ELSEIFIDN token, <0>
        fpustacklevel = fpustacklevel + 1
        fldz
      ELSEIFIDN token, <1>
        fld1
        fpustacklevel = fpustacklevel + 1
      ELSEIFIDNI token, <pi>
        fpustacklevel = fpustacklevel + 1
        fldpi
      ELSEIFIDNI token, <sqrt>
        fsqrt
      ELSEIFIDNI token, <sqr>
        fmul st(0), st(0)
      ELSEIFIDNI token, <sincos>
        fpustacklevel = fpustacklevel + 1
        fsincos
      ELSEIFIDNI token, <sin>
        fsin
      ELSEIFIDNI token, <cos>
        fcos
      ELSEIFIDNI token, <+>
        fpustacklevel = fpustacklevel - 1
        faddp st(1), st(0)
      ELSEIFIDNI token, <->
        fpustacklevel = fpustacklevel - 1
        fsubp st(1), st(0)
      ELSEIFIDNI token, <*>
        fpustacklevel = fpustacklevel - 1
        fmulp st(1), st(0)
      ELSEIFIDNI token, </>
        fpustacklevel = fpustacklevel - 1
        fdivp st(1), st(0)
      ELSE
        fpustacklevel = fpustacklevel + 1
        fld token
      ENDIF
      token TEXTEQU texttok( , < >)
      %echo token
    ENDM
    IF fpustacklevel LT 1
      echo ** warning (x87rpn): fpu stack underflow (too many pops)
    ELSEIF fpustacklevel GT 1
      echo ** warning (x87rpn): fpu stack contains more than 1 return value
    ENDIF
ENDM

; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
    .data
      x REAL8 1.0
      y REAL8 1.0
      z REAL8 1.0
      r dd 0
    .code
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
start:
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
    x87rpn x sqr y sqr z sqr + + sqrt

    fistp r
    print ustr$(r),13,10

    fld x
    fmul st,st(0)   
    fld y
    fmul st,st(0)
    fld z
    fmul st,st(0)
    faddp st(1),st
    faddp st(1),st
    fsqrt

    fistp r
    print ustr$(r),13,10

    inkey "Press any key to exit..."
    exit
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
end start

Rockoon

I have just discovered that my technique does not play well with ('s and )'s when trying to leverage it for an infix-to-postfix converter (not that I really desire to have an x87infix, which would produce non-obvious instructions). It seems that the preprocessor expects all ('s to be paired with )'s, at least in certain cases.

I'll give your technique a shot, but I'll have to modify it as I don't like forcing a separator between operators (hidden assumptions = bad, especially considering how bad the error messages are with macros), and I eventually want to handle cases where neither a space nor an operator is a token separator within this context, such as:

assume esi:ptr vector3

[esi + ebx].x

My tokenizer seems to behave correctly with the above, although it certainly isn't extensively tested.

Hopefully I'll have an epiphany in the coming days and can rewrite everything and be able to say "ok, that's more than good enough"