The MASM Forum Archive 2004 to 2012

General Forums => The Campus => Topic started by: delhibelly on June 28, 2011, 12:06:03 AM

Title: Multiplication using shift/add 32-bit
Post by: delhibelly on June 28, 2011, 12:06:03 AM
Hi, I am currently working on writing a prog that does 32-by-32bit multiplication using the shift/add algorithm. I managed to get a 16-by-16bit multiplication working using DX-AX to store the result and storing the multiplicand in BX register.
But can't seem to be able to extend that to 32bit version. If someone can point me out in the right direction, that would be great. I'm using x86. Thanks
Title: Re: Multiplication using shift/add 32-bit
Post by: clive on June 28, 2011, 12:24:54 AM
Post the code you have for the 16x16-bit version and where you've got with the 32x32-bit.

Are you trying to implement it with the 16-bit instruction set, or can you use the 32-bit instructions/registers of the 80386?

You could also probably optimize by having a 32x16 version, or switching the multiplier/multiplicand.

For non-homework assignments it's easier just to use the multiply instructions.
Title: Re: Multiplication using shift/add 32-bit
Post by: dedndave on June 28, 2011, 12:33:32 AM
... and faster   :P

welcome to the forum   :U
Title: Re: Multiplication using shift/add 32-bit
Post by: delhibelly on June 28, 2011, 03:18:10 AM
well the assignment is to use shift/add to multiply two 32-bit numbers. I can only use the 16-bit registers in my emulator.

Well here is the code I have so far, I know it's not the best way to do things but I'm kinda new to assembly and did the best I could.

mov si, offset A
mov di, offset B
    mov bx, offset product         
    mov dx, 0
    mov ax, 0
    mov bx, ds:di
    mov cx, 0
   
    BEGIN:   
        push bx
        and bx, 1h
        cmp bx, 0
        JE SHIFT_IT
           
            add dx, word ptr ds:si
           
        SHIFT_IT:
            pop bx
            push dx
            and dx, 01h
            cmp dx, 0
            JE DX_ZERO
           
            shr ax, 1
            or ax, 8000h
            pop dx
            shr dx, 1
            shr bx, 1
            inc cx
           
            JMP BEGIN
           
            DX_ZERO:
           
            shr ax, 1
            pop dx
            shr dx, 1
            shr bx, 1
            inc cx
           
            cmp bx, 0h
            JE DONE_PROD
           
            JMP BEGIN
           
DONE_PROD:
    mov temp, 10h
    sub temp, cx
    mov cx, temp
           
    LOOP_SET:
        push dx
        and dx, 1h
        cmp dx, 0h
        JE NO_ONE
            shr ax, 1
            or ax, 8000h
            pop dx
            shr dx, 1
            JMP END_LOOP
           
        NO_ONE:
            shr ax, 1
            pop dx
            shr dx, 1
           
        END_LOOP:
        loop LOOP_SET
           
so at the end of all this the product is stored in DX-AX and i store them in the memory allocated for the product. I'm lost as to how I can extend this to 32bits. I'm using up all my registers here. If I do store the multiplicand in DX-AX and multiplier in BX-CX then where do i temporarily store my 64bit product? That is my main concern as to how I can extend this to 32 bits. This code works a charm for 16bit multiplication.
Title: Re: Multiplication using shift/add 32-bit
Post by: dedndave on June 28, 2011, 03:34:24 AM
    mov bx, ds:di
            add dx, word ptr ds:si
try replacing those lines with
    mov bx,[di]
            add dx,[si]
the brackets tell the assembler you want the word at the address that is in DI or SI
the DS: segment override is not required, as DS is the default segment register for DI and SI in this addressing mode
WORD PTR is not required, because DX is a word size register - it can be no other size

that looks like a lot of code for a relatively simple operation   :P
(i have only glanced through it briefly)
although, you are out of registers quick because the product is 64 bits
good effort on your part   :U
Title: Re: Multiplication using shift/add 32-bit
Post by: delhibelly on June 28, 2011, 07:21:31 AM
Quote from: dedndave on June 28, 2011, 03:34:24 AM
    mov bx, ds:di
            add dx, word ptr ds:si
try replacing those lines with
    mov bx,[di]
            add dx,[si]
the brackets tell the assembler you want the word at the address that is in DI or SI
the DS: segment override is not required, as DS is the default segment register for DI and SI in this addressing mode
WORD PTR is not required, because DX is a word size register - it can be no other size

that looks like a lot of code for a relatively simple operation   :P
(i have only glanced through it briefly)
although, you are out of registers quick because the product is 64 bits
good effort on your part   :U

thanks for your help, I knew I had some redundant stuff that could be replaced with something faster or more efficient. I guess I could try making it more efficient. thing is I need to extend this to 32-by-32bit, any suggestions on how that can be achieved. whether it be pushing or storing results in temporary variables?
Title: Re: Multiplication using shift/add 32-bit
Post by: hutch-- on June 28, 2011, 07:44:19 AM
Here is a really crude ADD version.  :bg

Note that if it exceeds DWORD range that it just rolls over EAX and keeps adding.


IF 0  ; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
                      Build this template with "CONSOLE ASSEMBLE AND LINK"
ENDIF ; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    include \masm32\include\masm32rt.inc

    slowmul PROTO :DWORD,:DWORD

    .code

start:
   
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    call main
    inkey
    exit

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

main proc

    LOCAL var   :DWORD

    mov var, 1024*1024

    invoke slowmul,var,1024
    print ustr$(eax),13,10

    invoke slowmul,1024,var
    print ustr$(eax),13,10


    ret

main endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

slowmul proc arg1:DWORD,arg2:DWORD

    xor eax, eax
    mov ecx, arg1

    .if ecx > arg2
      mov ecx, arg1
      mov edx, arg2
    .else
      mov edx, arg1
      mov ecx, arg2
    .endif

  lbl0:
    add eax, ecx
    sub edx, 1
    jnz lbl0

    ret

slowmul endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

end start
Title: Re: Multiplication using shift/add 32-bit
Post by: dedndave on June 28, 2011, 10:34:32 AM
well - the product needs to be 64 bits, as i said
because you are limited to 16-bit registers, i would use the product variable as the accumulator
then, use 4 general registers as the shifter (which is the multiplicand or addend, depending how you look at it)
this requires that you zero out the product before starting
Title: Re: Multiplication using shift/add 32-bit
Post by: dedndave on June 28, 2011, 11:09:39 AM
this isn't tested and may have bugs - lol
also, it may not be the fastest
but, it gives you an idea of where to start
        .MODEL  Small
        .STACK  4096

;-------------------------------------------------------------

        .DATA

A       dd      12345678h
B       dd      23456789h

;------------------------

        .DATA?

product dw      4 dup(?)
count   dw      ?

;-------------------------------------------------------------

        .CODE
        ASSUME  DS:DGROUP

_main   PROC

        mov     ax,@DATA
        mov     ds,ax

        mov     si,offset A
        mov     di,offset B
        mov     bx,offset product
        call    LongMul

        mov     ax,4C00h
        int     21h

_main   ENDP

;-------------------------------------------------------------

LongMul PROC

        push    si
        push    di
        push    bp
        xor     ax,ax               ;zero AX register

;zero out the product

        mov     [bx],ax
        mov     [bx+2],ax
        mov     [bx+4],ax
        mov     [bx+6],ax

;load the multiplicand into DX:AX:DI:CX

        mov     dx,ax
        mov     cx,[di]
        mov     di,[di+2]

;test bit in BP and initialize count

        mov     bp,1
        mov     count,2

top_of_loop:
        test    bp,[si]
        jz      next_bit

        add     [bx],cx
        adc     [bx+2],di
        adc     [bx+4],ax
        adc     [bx+6],dx

next_bit:
        shl     cx,1
        rcl     di,1
        rcl     ax,1
        rcl     dx,1
        rol     bp,1
        jnc     top_of_loop

        add     si,2
        dec     count
        jnz     top_of_loop

        pop     bp
        pop     di
        pop     si
        ret

LongMul ENDP

;-------------------------------------------------------------

        END     _main
Title: Re: Multiplication using shift/add 32-bit
Post by: hutch-- on June 28, 2011, 12:32:05 PM
From the description of the original assignment, the term Russian Peasant Multiply comes to mind as it uses and ADD / div by 2 (shr reg, 1) technique and from memory its no big deal to do on a computer.

Somewhat later: This is a simple example of peasant multiply using SHIFT ADD. Not exhaustively tested but appears to be generating the correct numbers.

This is the URL for the method used. http://mathforum.org/dr.math/faq/faq.peasant.html


IF 0  ; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
                      Build this template with "CONSOLE ASSEMBLE AND LINK"
ENDIF ; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    include \masm32\include\masm32rt.inc

    peasant_multiply PROTO :DWORD,:DWORD

    .code

start:
   
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    call main
    inkey
    exit

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

main proc

    invoke peasant_multiply,65536,16
    print ustr$(eax),13,10

    invoke peasant_multiply,9,9
    print ustr$(eax),13,10

    invoke peasant_multiply,1024,16
    print ustr$(eax),13,10

    invoke peasant_multiply,13,59
    print ustr$(eax),13,10

    ret

main endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

peasant_multiply proc arg1:DWORD,arg2:DWORD

    mov ecx, arg1           ; load 1st arg into ECX
    mov edx, arg2           ; load 2nd arg into EDX
    xor eax, eax            ; zero EAX
    jmp testit              ; jump to test if arg2 is ODD or EVEN

  lbl0:
    add ecx, ecx            ; double arg1
    shr edx, 1              ; div arg2 by 2
    cmp edx, 1              ; compare it to 1
    jbe lbl1                ; exit loop if below or equal

  testit:
    test edx, 00000000000000000000000000000001b
    je lbl0                 ; jump back if its even

    add eax, ecx            ; accumulate ECX in EAX if EDX is odd
    jmp lbl0

  lbl1:
    add eax, ecx            ; add last ECX to EAX and exit

    ret

peasant_multiply endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

end start
Title: Re: Multiplication using shift/add 32-bit
Post by: dedndave on June 28, 2011, 02:46:28 PM
Quote from: delhibelly on June 28, 2011, 03:18:10 AM
well the assignment is to use shift/add to multiply two 32-bit numbers. I can only use the 16-bit registers in my emulator.
Title: Re: Multiplication using shift/add 32-bit
Post by: delhibelly on June 28, 2011, 06:42:10 PM
Wow, this is all great. Thanks a lot for your help. I'll look into what you guys said and try implementing something. My assignment is due tomorrow, I'll let you guys know what happens. Thanks a again.
Title: Re: Multiplication using shift/add 32-bit
Post by: dedndave on June 28, 2011, 07:08:23 PM
i wanted a 64-bit to ASCII decimal routine in 16-bit, so i added it to the program

see attached...
Title: Re: Multiplication using shift/add 32-bit
Post by: MichaelW on June 29, 2011, 04:32:16 AM
Implementing the algorithm described  here (http://courses.cs.vt.edu/~cs1104/BuildingBlocks/multiply.040.html), for a 64-bit result from two 32-bit multiplicands, you can use BSR to find the bit position of the leftmost 1 in the first multiplicand, SHLD to shift a 64-bit working copy of the second multiplicand into position, do a 64-bit addition of the shifted copy with the result using ADD followed by ADC, and use BTR to clear the leftmost 1 in the first multiplicand. Note that the 64-bit working copy is simply the second multiplicand zero-extended into the high-order DWORD, and it must be reloaded from the second multiplicand for each loop, discarding the previously shifted value. In my test the algorithm duplicated the results of MUL over almost the entire 32-bit range (actually 0 to 4294901760), and ran in ~150 cycles on a P3.
Title: Re: Multiplication using shift/add 32-bit
Post by: delhibelly on June 29, 2011, 04:57:16 AM
Quote from: dedndave on June 28, 2011, 07:08:23 PM
i wanted a 64-bit to ASCII decimal routine in 16-bit, so i added it to the program

see attached...

thanks. understood most of the code, works fine for me. Still working on my code though and think I am almost there.
Title: Re: Multiplication using shift/add 32-bit
Post by: delhibelly on June 29, 2011, 04:58:49 AM
Quote from: MichaelW on June 29, 2011, 04:32:16 AM
Implementing the algorithm described  here (http://courses.cs.vt.edu/~cs1104/BuildingBlocks/multiply.040.html), for a 64-bit result from two 32-bit multiplicands, you can use BSR to find the bit position of the leftmost 1 in the first multiplicand, SHLD to shift a 64-bit working copy of the second multiplicand into position, do a 64-bit addition of the shifted copy with the result using ADD followed by ADC, and use BTR to clear the leftmost 1 in the first multiplicand. Note that the 64-bit working copy is simply the second multiplicand zero-extended into the high-order DWORD, and it must be reloaded from the second multiplicand for each loop, discarding the previously shifted value. In my test the algorithm duplicated the results of MUL over almost the entire 32-bit range (actually 0 to 4294901760), and ran in ~150 cycles on a P3.

Actually did come across your website earlier. algorithm is explained very well and I kind of did use it for my 16-bit result but was having trouble extended it to 64 bits as I ran out of registers.
Title: Re: Multiplication using shift/add 32-bit
Post by: delhibelly on June 29, 2011, 05:02:03 AM
Quote from: dedndave on June 28, 2011, 10:34:32 AM
well - the product needs to be 64 bits, as i said
because you are limited to 16-bit registers, i would use the product variable as the accumulator
then, use 4 general registers as the shifter (which is the multiplicand or addend, depending how you look at it)
this requires that you zero out the product before starting

Excellent solution. thanks.
Title: Re: Multiplication using shift/add 32-bit
Post by: MichaelW on June 29, 2011, 08:11:25 AM
You are limited to using 16-bit registers? Using 32-bit registers and the instructions that I listed would make the task much easier. Where your 16/16 code uses 48 instructions, my 32/64 code encapsulated in a procedure uses only 17 instructions.

This is a demo of the support code that I used:

;==============================================================================
    include \masm32\include\masm32rt.inc
;==============================================================================

printf MACRO format:REQ, args:VARARG
    IFNB <args>
        invoke crt_printf, cfm$(format), args
    ELSE
        invoke crt_printf, cfm$(format)
    ENDIF
    EXITM <>
ENDM

;==============================================================================

bin$ MACRO val
    ifndef __bin$_buff__
        .data
        align 4
            __bin$_buff__  db 68 dup(0)
        .code
    endif
    IF TYPE val EQ 8
        mov eax, DWORD PTR val
        mov edx, DWORD PTR val+4
    ELSEIF TYPE val EQ 4
        mov eax, DWORD PTR val
        mov edx, 0
    ELSE
        ECHO -----------------------------------
        ECHO bin$ arg must be DWORD or QWORD
        ECHO -----------------------------------
        .ERR
    ENDIF
    invoke crt__ui64toa, edx::eax, ADDR __bin$_buff__, 2
    EXITM <eax>
ENDM

;==============================================================================
    .data
        p dq 0
        q dq 0
    .code
;==============================================================================
start:
;==============================================================================

    xor ebx, ebx
    .WHILE ebx < 100
        printf( "%s\n", bin$(ebx) )
        inc ebx
    .ENDW

    .WHILE ebx < 200
        mov DWORD PTR p, ebx
        mov DWORD PTR p+4, ebx
        printf( "%s\n", bin$(p) )
        inc ebx
    .ENDW

    printf( "\n%I64Xh\n", p )
    printf( "%I64d\n\n", p )

    inkey "Press any key to exit..."
    exit
;==============================================================================
end start

Title: Re: Multiplication using shift/add 32-bit
Post by: delhibelly on June 29, 2011, 08:54:40 AM
@ MichaelW - I'm using the emu8086 which is emulator for 8086 processor. That is why I cannot access the 32 bit registers, hence running out of registers. If I could my life would be a whole lot easier and I could definitely use the code you have  :bg
Title: Re: Multiplication using shift/add 32-bit
Post by: FORTRANS on September 23, 2011, 09:52:01 PM
Quote from: dedndave on June 28, 2011, 11:09:39 AM
also, it may not be the fastest

Hi,

   I scarfed up your code to check against my implementation
of the shift and add multiply.  Finally got around to timing the
two fixed point multiplies I wrote for my arbitrary precision
set of routines.  One is a shift and add as above and one is
using a multiply terms and accumulate algorithm.

  Test with both multiply routines on the HP200LX.
W/WO display  FixMulM ~7.8/1.0 Seconds  FixMulA ~13.8/7.1 Seconds.
80186 MUL mem 41-43 cycles, RCL mem 5.

Regards.

Steve