Ultrashort procs

Started by jj2007, August 08, 2009, 03:42:52 PM

jj2007

I was trying to size-optimise procedures, and found this for procs with three arguments:

Size no frame A:        7
Size no frame B:        15
Size with frame:        16


Full code below, for joyful playing :bg

.nolist
include \masm32\include\masm32rt.inc
.686
include \masm32\macros\timers.asm

LOOP_COUNT = 5000
RC = 1000

TestNoFrameA PROTO: DWORD, :DWORD, :DWORD
TestNoFrameB PROTO: DWORD, :DWORD, :DWORD
TestWithFrame PROTO: DWORD, :DWORD, :DWORD

.code
start:
  mov esi, esp ; for debugging and testing, we check the stack
  REPEAT 3
    invoke Sleep, 100
    counter_begin LOOP_COUNT, HIGH_PRIORITY_CLASS
      REPEAT RC
        invoke TestNoFrameA, 1, 2, 3
      ENDM
    counter_end
    print str$(eax), 9, "millicycles for TestNoFrameA", 13, 10

    counter_begin LOOP_COUNT, HIGH_PRIORITY_CLASS
      REPEAT RC
        invoke TestNoFrameB, 1, 2, 3
      ENDM
    counter_end
    print str$(eax), 9, "millicycles for TestNoFrameB", 13, 10

    counter_begin LOOP_COUNT, HIGH_PRIORITY_CLASS
      REPEAT RC
        invoke TestWithFrame, 1, 2, 3
      ENDM
    counter_end
    print str$(eax), 9, "millicycles for TestWithFrame", 13, 10, 10
  ENDM

  invoke TestNoFrameA, 1, 2, 3
  print "Size no frame A: ", 9
  mov eax, TestNoFrameA_END
  sub eax, TestNoFrameA
  print str$(eax), 13, 10

  invoke TestNoFrameB, 1, 2, 3
  print "Size no frame B: ", 9
  mov eax, TestNoFrameB_END
  sub eax, TestNoFrameB
  print str$(eax), 13, 10

  invoke TestWithFrame, 1, 2, 3
  print "Size with frame: ", 9
  mov eax, TestWithFrame_END
  sub eax, TestWithFrame
  print str$(eax), 13, 10

  sub esi, esp ; non-zero means one of the procs left the stack unbalanced
  .if !Zero?
    MsgBox 0, str$(esi), "Problem with stack", MB_OK
  .endif
  exit

OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE
TestNoFrameA proc arg1, arg2, arg3
  pop ecx ; pop ret address
  pop eax ; pop arg1
  pop edx ; pop arg2
  xchg ecx, [esp] ; exchange ret address with arg3
;  .if eax!=1 || edx!=2 || ecx!=3
;    MsgBox 0, "Problem NA", "Hi", MB_OK
;  .endif
  ret
TestNoFrameA endp
TestNoFrameA_END:

TestNoFrameB proc arg1, arg2, arg3
  mov eax, [esp+4]
  mov edx, [esp+8]
  mov ecx, [esp+12]
;  .if eax!=1 || edx!=2 || ecx!=3
;    MsgBox 0, "Problem NB", "Hi", MB_OK
;  .endif
  ret 12
TestNoFrameB endp
TestNoFrameB_END:

OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef
TestWithFrame proc arg1, arg2, arg3
  mov ecx, arg3
  mov edx, arg2
  mov eax, arg1
;  .if eax!=1 || edx!=2 || ecx!=3
;    MsgBox 0, "Problem WF", "Hi", MB_OK
;  .endif
  ret
TestWithFrame endp
TestWithFrame_END:
end start
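
The three sizes can be cross-checked by counting instruction bytes. The listing below is a hand-assembled sketch, not output from the test program: the opcode bytes in the comments are the usual 32-bit x86 encodings, and the frame code shown for TestWithFrame is what the default PrologueDef/EpilogueDef is assumed to generate for a three-argument proc without locals. An assembler listing (ml /Fl) would confirm the exact bytes.

```asm
; TestNoFrameA: 7 bytes
  pop ecx             ; 59           1 byte
  pop eax             ; 58           1 byte
  pop edx             ; 5A           1 byte
  xchg ecx, [esp]     ; 87 0C 24     3 bytes
  ret                 ; C3           1 byte

; TestNoFrameB: 15 bytes
  mov eax, [esp+4]    ; 8B 44 24 04  4 bytes
  mov edx, [esp+8]    ; 8B 54 24 08  4 bytes
  mov ecx, [esp+12]   ; 8B 4C 24 0C  4 bytes
  ret 12              ; C2 0C 00     3 bytes

; TestWithFrame: 16 bytes, including the generated frame code (assumed)
  push ebp            ; 55           1 byte  (prologue)
  mov ebp, esp        ; 8B EC        2 bytes (prologue)
  mov ecx, [ebp+16]   ; 8B 4D 10     3 bytes (arg3)
  mov edx, [ebp+12]   ; 8B 55 0C     3 bytes (arg2)
  mov eax, [ebp+8]    ; 8B 45 08     3 bytes (arg1)
  leave               ; C9           1 byte  (epilogue)
  ret 12              ; C2 0C 00     3 bytes (epilogue)
```

Variant A comes out shortest because every pop is a single byte and the plain ret carries no immediate operand.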

dedndave

i like that plan - as long as you only need to grab the parm once
i am working on a couple routines right now that are passed only a single parm - a pointer to a structure
i was using

        push    ebp
        mov     ebp,[esp+8]
.
.
.
        pop     ebp
        ret     4

then, i access the structure with ebp instead of a stack frame

this looks good.....

        pop     ecx
        pop     eax
        xchg    eax,ebp
        push    ecx
        push    eax
.
.
.
        pop     ebp
        ret                  ;no pops

(prologue and epilogue turned off, of course)
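
For illustration, the pop/xchg entry sequence above can be dropped into a complete proc. This is only a sketch: the PAIR structure, its fields, the SumPair name and the MyPair data are hypothetical and not from the thread. It assumes the same masm32rt.inc setup as the test program at the top.

```asm
include \masm32\include\masm32rt.inc

PAIR struct                  ; hypothetical structure, for illustration only
  valA dd ?
  valB dd ?
PAIR ends

SumPair PROTO :DWORD         ; hypothetical proc: returns valA + valB in eax

.data
MyPair PAIR <40, 2>

.code
start:
  invoke SumPair, offset MyPair
  print str$(eax), 13, 10
  exit

OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE
SumPair proc pPair:DWORD
  pop ecx                    ; ecx = return address
  pop eax                    ; eax = pPair
  xchg eax, ebp              ; ebp = pPair, eax = caller's ebp
  push ecx                   ; return address goes back first
  push eax                   ; caller's ebp is saved on top of it
  mov eax, [ebp].PAIR.valA   ; read the structure through ebp
  add eax, [ebp].PAIR.valB
  pop ebp                    ; restore caller's ebp
  ret                        ; plain ret: the argument is already off the stack
SumPair endp
OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef

end start
```

The entry sequence is five one-byte instructions, the same length as push ebp plus mov ebp, [esp+8]; the saving comes from ending with a one-byte ret instead of a three-byte ret 4.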

qWord

jj, try to beat this:

argument macro num
EXITM @CatStr(<[esp+4+>,%(num*4),<]>)
endm

TheOne proto C :DWORD,:DWORD,:DWORD

OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE
TheOne proc C arg1, arg2, arg3
;mov eax,argument(1)
ret
TheOne endp
OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef
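
As a reading aid, here is how the argument macro above expands when used inside TheOne. The three mov lines are an illustrative body, not the one qWord posted. The index is zero-based, and the offsets only hold while esp is unchanged since entry.

```asm
; argument(n) expands to the text [esp+4+n*4]
  mov eax, argument(0)   ; mov eax, [esp+4+0]  -> arg1
  mov edx, argument(1)   ; mov edx, [esp+4+4]  -> arg2
  mov ecx, argument(2)   ; mov ecx, [esp+4+8]  -> arg3
  ret                    ; C convention: plain ret, the caller cleans the stack
```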


:green
FPU in a trice: SmplMath
It's that simple!

jj2007

Quote from: qWord on August 08, 2009, 06:41:07 PM
jj, try to beat this:

Hmmm.... not sure what you mean ::)

Size no frame A:        7 (SEVEN)
Size no frame B:        15
Size TheOne:            13
Size with frame:        16


qWord

Quote from: jj2007 on August 08, 2009, 07:40:30 PM
Hmmm.... not sure what you mean ::)

I was only a bit joking  :P - without mov it gets 1 byte

jj2007

Quote from: qWord on August 08, 2009, 08:40:06 PM
I was only a bit joking  :P - without mov it gets 1 byte

:bg

I would not recommend this for an innermost loop: it costs about 30 cycles. But for tasks that are not speed-critical, this little trick saves quite a number of bytes...

jj2007

@qWord:
Just looked at your proposal in OllyDbg, and finally understood what the C calling convention means: Masm inserts add esp, 12 after each call. So the size for TheOne should read 13+3*n, where n is the number of calls :bg
Pretty expensive; I wonder whether there is any good reason to use this.
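
The per-call cost can be seen by writing out what invoke generates at a call site. This is a hand-written sketch of the expansion, assuming the default stdcall model set up by masm32rt.inc for the procs that are not declared C.

```asm
; TheOne is declared PROTO C: the caller removes the arguments
  push 3
  push 2
  push 1
  call TheOne
  add esp, 12        ; 83 C4 0C = 3 extra bytes at every call site

; TestNoFrameB is stdcall: its own ret 12 removes the arguments
  push 3
  push 2
  push 1
  call TestNoFrameB  ; nothing follows the call
```

With stdcall the clean-up is paid once inside the proc; with the C convention it is paid again at each call.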

jj2007

Here is a test of possible VARARG versions. They all do the same:
  invoke TestC2, 2, offset MyString1, offset MyString2
... where 2 is the variable number of arguments passed. Of course, there is also a macro that counts the args:
  invva TestA, offset MyString1, offset MyString2

Output with 1, 2 and 3 variables:
A string
A string concatenation demo
A string concatenation demo using a variable number of arguments


Sizes of proc skeleton including loop and stack correction:
18      TestA: in contrast to invoke with C calling convention, no extra bytes per call
12      TestB: no return value, very limited but ultrashort
12      TestC1: +3 bytes per call, order of args inverted, clumsy
14      TestC2: +3 bytes per call

My favourite is TestA - short and flexible, called with invva TestA, offset MyString1, offset MyString2:
TestA proc ; varargs on stack, no need for PROTO & OPTION PRO
  push ebx
  mov ebx, [esp+8]
  .Repeat
    dec ebx
    .Break .if Sign?
    push [esp+4*ebx+12] ; or do other stuff
    call StdOut
  .Until 0
  pop ebx
  pop edx ; pop ret address
  pop ecx ; pop counter
  lea esp, [esp+4*ecx] ; stack correction
  jmp dword ptr edx ; returns StdOut eax
TestA endp
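
The invva macro itself is not shown in the thread. The sketch below is a hand-written call sequence that TestA as posted would handle correctly; the push order is an assumption, chosen so that the loop (which starts at the slot furthest from the count) prints MyString1 first.

```asm
; Hand-written stand-in for: invva TestA, offset MyString1, offset MyString2
; (assumed push order; the real invva macro is not shown)
  push offset MyString1   ; ends up at the highest address, visited first
  push offset MyString2
  push 2                  ; argument count, pushed last
  call TestA              ; TestA removes the count and the arguments itself

; Stack inside TestA, right after its push ebx:
;   [esp]      saved ebx
;   [esp+4]    return address
;   [esp+8]    count (2)
;   [esp+12]   offset MyString2
;   [esp+16]   offset MyString1
```

Because TestA pops the return address and the count and then adjusts esp by 4*count, the call site needs no stack correction of its own.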