News:

MASM32 SDK Description, downloads and other helpful links
MASM32.com New Forum Link
masmforum WebSite

macro for division by a constant (magic division)

Started by qWord, October 02, 2009, 02:11:19 AM

Previous topic - Next topic

qWord

while updating my program for determining 'magic' numbers, I've got the idea to transform the algorithm to a macro, that calculate all needed values at compile time (not at runtime). Here is an example how to use these macro named 'cdiv':
include ConstDiv.inc
...
mov eax,9999999
; divide eax by 100000
cdiv 100000
; edx = quotient

for this example the macro generates:

mov eax,9999999
mov edx,0A7C5AC47h
add eax,1
.if !CARRY?
mul edx
.endif
shr edx,16


Like the div-instruction, the macro has one parameter (constant) and trash eax and edx. I've test it with various values, and it seems like it works proper. Depending on the divisor, the macro will produce four different code-solutions (see the macro definition).
The macro has been tested with masm version 6-10 without problems - however, due to the needed of 64bit-arithmetic, the macro may be slow down building process on (very) old PC's. On my Core2Duo the compile time for 100 calls to cdiv (algorithm of type 1, which is the costliest) is 1.2 second.

So, what do you think about?

regards, qWord

-------

10-07-2009: fixed a smal bug and add two new macros: cmod (remainder) and cDivMod (quotient + remainder)
11-07-2009: bug fix
03-12-2012: bug fix, jWasm supported (JWasm v2.07pre)
FPU in a trice: SmplMath
It's that simple!

ecube

I think that's awesome, dividing by multiplying is really neat, nice work  :U

raymond

The main purpose of "magic numbers" is to reduce overall computation time. However, it can only be effective if the divisor is a constant either known at compile time, or one that will become known according to some input and later used numerous times. Computing a "magic number" and its code for each division is a real waste of time.

Macros are also designed to insert "in-line" code. Imagine performing hundreds of divisions with different divisors with all that code to compute magic numbers inserted so many times. This would certainly ressemble typical HLL bloating.

The only macro which could be useful in such context would be one which would compute the parameters for any "constant" expected to be used several times afterwards and store them for later use in a struct. For constants known before compiling, this would only save the task of running some other small app to get the required data and insert it directly in the source code. For constants unknown at compile time, some other macro could always be designed to access the correct code, with the struct name and the dividend as the parameters.

When you assume something, you risk being wrong half the time
http://www.ray.masmcode.com

qWord

The macro determinate the needed code-snippet,multiplier and shift-value at compile time, not at runtime:
e.g.:

mov eax,123456789
cdiv 1000

the macro-generated code:

mov eax,123456789
mov edx,010624DD3h
mul edx
shr edx,06h

FPU in a trice: SmplMath
It's that simple!

raymond

Quotenot at runtime:

My apology. I had not had time to look at your macro and your post was not clear about its intended purpose. It should now make it very clear for other readers. :clap:
When you assume something, you risk being wrong half the time
http://www.ray.masmcode.com

qWord

Quote from: raymond on October 03, 2009, 02:09:27 AM
It should now make it very clear for other readers. :clap:
sorry - the description was a bit inaccurate  :red

regrads, qWord
FPU in a trice: SmplMath
It's that simple!

Jimg


ecube

can this be used with mod? mod is division only the result is the remainer I believe.

dedndave

yes - the mod function has a lot of uses
for multiply to divide, the remainder is not always given, but can be acquired with an additional MUL
here is some code of Drizz's that i modified a bit
it actually extracts the "remainder" in 3 pieces
as i remember, this thing takes less than 30 clock cycles

;eax = dividend (0 to 9999)
;divisor = 1,000

        mov     edx,4294968          ;first digit multiplier
        mul     edx                  ;extend the 1st digit
        mov     ebx,10               ;remaining digit multiplier
        mov     ecx,edx              ;digit 1 in cl
        mul     ebx                  ;extend the 2nd digit
        mov     ch,dl                ;digit 2 in ch
        mul     ebx                  ;extend the 3rd digit
        bswap   ecx                  ;digits 1 & 2 up high
        mov     ch,dl                ;digit 3 in ch
        mul     ebx                  ;extend the 4th digit into dl
        lea     ecx,[edx+ecx+'0000'] ;combine and make ASCII
        bswap   ecx                  ;re-order bytes

;ascii digits in ecx ('0000' to '9999')

another piece of "Drizz code"
this one calculates the product of the quotient and divisor, then subtracts that from the dividend to get the remainder
this one is really fast ~18 clock cycles i think

;eax = dividend (0 to 4294967295)
;divisor = 10,000

        mov     edx,0D1B71759h       ;2^45/10,000
        mov     ecx,eax              ;save the dividend
        mul     edx                  ;multiply to divide
        shr     edx,13               ;adjust the quotient
        imul    eax,edx,10000        ;find the product
        sub     ecx,eax              ;subtract the difference

;edx = quotient
;ecx = remainder

raymond

It can be used for mod with any divisor as follows:


;after multiplying by the "magic number" but before shifting EDX
   mov  ecx,s  ;the shifting requirement
   shrd eax,edx,cl
   inc  eax
   push edx    ;save it if you also need the quotient
   mul  X      ;X would be your original "divisor"
               ;->EDX is the remainder (mod)
   pop  eax    ;retrieve the previous EDX if needed for the quotient
   shr  eax,cl ;->EAX=quotient if you need it
When you assume something, you risk being wrong half the time
http://www.ray.masmcode.com

qWord

there was a little bug inside the macro: when dividing by 2,4,8,16...  the result was stored in eax (it should stored in edx).
I've also add two macros: one for calculating the modulo and one returning both, remainder and quotient.
cmod imm32
; EDX = EAX MOD imm32


cDivMod imm32
; EDX = EAX MOD imm32
; EAX = EAX / imm32


regards, qWord
FPU in a trice: SmplMath
It's that simple!

Biterider

Hi qWord
First of all, thanks for the macro collection.
I successfully use them with ML and now I'm moving to JWasm (206 & 207) where there are making some troubles. JWasm is reporting several errors like

DebugCenter_Main.inc(160) : Error A2235: Constant value too large: 640000000h
div_u64(25)[ConstDiv.inc]: Macro called from
  make_c32(41)[ConstDiv.inc]: Macro called from
   cdiv(11)[ConstDiv.inc]: Macro called from
    DebugCenter_Main.inc(160): Included by
     DebugCenter.asm(44): Main line code


when using e.g. cdiv 100. The values seem not to fit 32 bit. Strange is that ML doesn't complain about it.

The second issue is that JWasm doesn't like the error messages. The macro keyword is making problems there. Removing is solves this issue.

Do I have a chance that you revise the macros?

Regards, Biterider

qWord

hi Biterider,

It seems like that jWasm has a bug, thus it returns 64Bit values instead of (S)DWORDs when expanding constants. I will try to solve the problem using the LOW/HIGH32 operator or some other hack...

regards qWord
FPU in a trice: SmplMath
It's that simple!

qWord

New version attached in first post.
Using this macros with jWasm should be no more problem (hopefully)  :bg
(tested with JWasm v2.07pre)

qWord
FPU in a trice: SmplMath
It's that simple!

Biterider

Thanks qWord!
I tested the macros in 3 different sitations and the result is correct.
What was the problem? Is it really a jWasm bug?

Regards, Biterider