How to do matrix-matrix multiplication on the 8086 in MASM 5.00

Started by ahmed_eltalkhawy, May 03, 2005, 08:31:51 PM

ahmed_eltalkhawy

I want to multiply matrix X of dimension m×k by matrix Y of dimension k×n.

I wrote the following code, but I have run out of registers to point to the resulting matrix.

;==================================================
;
; PROGRAM : top.asm
;
; AUTHOR  : ahmed el talkhawy
;
; PURPOSE : matrix matrix multiplication
;
;==================================================

.model small
;-----------------------------------------------
.data
   m   equ   2
   k   equ   2
   n   equ   2
   xmat   dw 1,1,1,1
   ymat   dw 1,1,1,1
   result   dw m*n*2   dup (0)
;-----------------------------------------------
.code
main:   mov   ax,@data
   mov   ds,ax
   call   matrixmul
   mov   ah,4ch
   int   21h
;--------------------------------------------------------------------------
matrixmul   proc   near
   mov   si,0
   mov   ax,0
   mov   cl,m
lp2:   mov   ch,n
   mov   bx,0
lp1:   call   vectormul
   add   bx,2
   dec   ch
   jnz   lp1
   add   si,k*2
   dec   cl
   jnz   lp2
   ret
matrixmul   endp
;--------------------------------------------------------------------------
vectormul   proc   near
   push   cx
   push   si
   push   bx
   mov   bp,0
   mov   di,0
   mov   cl,k
lp:   mov   ax,xmat[si]
   mul   ymat[bx]
   add   bp,ax
   adc   di,dx
   add   si,2
   add   bx,2*n
   dec   cl
   jnz   lp
   pop   bx
   pop   si
   pop   cx
   mov   result[si][bx],bp                 ;wrong try
   mov   result[si][bx+2],di              ;wrong try
   ret
vectormul   endp
;--------------------------------------------------------------------------
scalmul   proc   near
   mov   ax,xmat[si]
   mul   ymat[bx]
   add   bp,ax
   adc   di,dx
   ret
scalmul   endp
;--------------------------------------------------------------------------
   end   main

Can anyone help me here?

AeroASM

If you are short of registers then you can save stuff to memory with push or mov.

pbrennick

AeroASM,
Sorry, I missed your post because I did not read the whole topic. As soon as I saw int 21h, I moved it.

ahmed_eltalkhawy,
Please post questions like this in this area.  You will probably get more responses here, too.

Paul

ahmed_eltalkhawy

The problem is that I don't want to accumulate the results of the vectormul proc in memory, because that will make my program slower.

pbrennick

ahmed,
By the way, welcome to the forum, I hope you find this a useful relationship.  About accumulating results in memory; well, you have to live with what you have and the register set is very small.  Have you tried using the stack?  Be careful!

Paul

AeroASM

I know what matrix-matrix multiplication is, but I am having a little difficulty understanding the code. Please could you post a commented version?

Using memory is not as slow as you might think, because of the prefetch cycle.

ahmed_eltalkhawy

I will post a new version of the code with comments, but could you send me any matrix-matrix multiplication code that does not access memory a lot in the loops?

ahmed_eltalkhawy

Here is the commented code:

;==================================================
;
; PROGRAM : top.asm
;
; AUTHOR  : ahmed el talkhawy
;
; PURPOSE : matrix matrix multiplication
;
;==================================================

.model small
;-----------------------------------------------
.data
   m   equ   2  ;no. of rows in xmat
   k   equ   2  ;no. of columns in xmat and rows in ymat
   n   equ   2  ;no. of columns in ymat
   xmat   dw 1,1,1,1
   ymat   dw 1,1,1,1
   result   dw m*n*2   dup (0)
;-----------------------------------------------
.code
main:   mov   ax,@data
   mov   ds,ax
   call   matrixmul
   mov   ah,4ch
   int   21h
;--------------------------------------------------------------------------
matrixmul   proc   near
   mov   si,0                  ;points to the current row of xmat
   mov   ax,0
   mov   cl,m                  ;no. of rows of xmat still to process
lp2:   mov   ch,n               ;no. of columns of ymat to multiply by
   mov   bx,0                  ;points to the current column of ymat
lp1:   call   vectormul
   add   bx,2                  ;points to next column in ymat (columns start 2 bytes apart)
   dec   ch
   jnz   lp1                   ;loop until the row of xmat is multiplied with every column of ymat
   add   si,k*2                ;points to next row in xmat
   dec   cl                    ;loop until every row of xmat is multiplied with every column of ymat
   jnz   lp2
   ret
matrixmul   endp
;--------------------------------------------------------------------------
vectormul   proc   near
   push   cx
   push   si
   push   bx
   mov   bp,0                ;low word of the accumulated result
   mov   di,0                ;high word of the accumulated result
   mov   cl,k                ;loop once per element in xmat's row (or ymat's column)
lp:   mov   ax,xmat[si]
   mul   ymat[bx]            ;dx:ax = product of the two elements
   add   bp,ax               ;accumulate the low word
   adc   di,dx               ;accumulate the high word plus carry
   add   si,2                ;points to next element in xmat's row
   add   bx,2*n              ;points to next element in ymat's column
   dec   cl
   jnz   lp
   pop   bx
   pop   si
   pop   cx
   mov   result[si][bx],bp   ;wrong try: needs another register to point to the result matrix
   mov   result[si][bx+2],di ;wrong try
   ret
vectormul   endp
;--------------------------------------------------------------------------
   end   main

MichaelW

Aero, instructions that access memory are much slower on the 8088/8086. For example:

mov   reg,reg   ;2 clocks
mov   reg,mem   ;8 clocks + EA clocks

add   reg,reg   ;3 clocks
add   reg,mem   ;9 clocks + EA clocks

EA (effective address) clocks vary from 5 for base or index (indirect memory operands) to 12 for base plus index plus displacement. So for a mov instruction you are looking at 2 clocks versus 13-20 clocks.
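To make the arithmetic concrete, here is a small annotated fragment using the add timings quoted above. It is only an illustration: xmat is the array from the program in this thread, the choice of registers is arbitrary, and the counts assume word operands at even addresses on an 8086.

```asm
; Illustrative fragment, not a complete program.
; Clocks = base count for the instruction + EA clocks for the addressing mode.
        add   ax,bx             ; register, register:          3 clocks
        add   ax,[si]           ; index only:          9 + 5  = 14 clocks
        add   ax,xmat[si]       ; displacement + index: 9 + 9 = 18 clocks
        add   ax,xmat[bx][di]   ; base + index + disp: 9 + 12 = 21 clocks
```

The third form is the one the inner loop of the posted program uses for its operands, so each memory operand there costs the 9-clock EA on top of the instruction's base count.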

Ahmed, you could use CX for your additional register and control the loop by comparing SI to a pre-calculated termination value. Or if your code size is not restricted you could unroll the loop.
eschew obfuscation
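A minimal sketch of one way to read the compare-against-a-termination-value suggestion, not a definitive implementation. It assumes unsigned 16-bit elements stored row by row, as in the posted program. Instead of SI, it compares BX against the assemble-time constant k*n*2 (the size of ymat in bytes), which is a variation on the suggestion chosen so that no register has to hold the end value. That frees CX to hold the high word of the sum, which in turn frees DI to index result. The labels rowlp, collp and dotlp and the .stack line are additions that are not in the original code.

```asm
.model small
.stack 100h
.data
   m      equ   2                 ; rows in xmat
   k      equ   2                 ; columns in xmat, rows in ymat
   n      equ   2                 ; columns in ymat
   xmat   dw    1,1,1,1
   ymat   dw    1,1,1,1
   result dw    m*n*2 dup (0)     ; m*n sums, two words each
.code
main:   mov   ax,@data
        mov   ds,ax
        call  matrixmul
        mov   ah,4ch
        int   21h

matrixmul proc near
        xor   si,si               ; SI = byte offset of current xmat row
        xor   di,di               ; DI = byte offset of next result entry
rowlp:  xor   bx,bx               ; BX = byte offset of current ymat column
collp:  call  vectormul
        add   bx,2                ; next column of ymat
        cmp   bx,n*2
        jb    collp               ; until every column has been used
        add   si,k*2              ; next row of xmat
        cmp   si,m*k*2
        jb    rowlp               ; until every row has been used
        ret
matrixmul endp

vectormul proc near
        push  si
        push  bx
        xor   bp,bp               ; low word of the sum
        xor   cx,cx               ; high word of the sum
dotlp:  mov   ax,xmat[si]
        mul   ymat[bx]            ; DX:AX = unsigned product
        add   bp,ax
        adc   cx,dx
        add   si,2                ; next element in the xmat row
        add   bx,n*2              ; next element in the ymat column
        cmp   bx,k*n*2            ; BX past the end of ymat: column finished
        jb    dotlp
        mov   result[di],bp       ; store low word
        mov   result[di+2],cx     ; store high word
        add   di,4                ; advance to the next result entry
        pop   bx
        pop   si
        ret
vectormul endp
        end   main
```

The dimensions are still ordinary equ constants, so the code stays general in the same sense as the original: changing m, k or n and reassembling is enough. The compares are register-against-immediate, so they add no memory access to the loops.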

ahmed_eltalkhawy

Dear MichaelW,
You are completely right, but the two solutions you suggested won't suit me: my program should be general for any matrix dimensions, so I can't unroll the loop, and the compare also takes a lot of time. So I need to free up another register in the vectormul procedure.

AeroASM

Use a segment register like es (someone will probably tell me off for telling you this, but hey.)

MichaelW

Ahmed,

You could use DX for the additional register and push/pop it around the multiply. Push and pop are slow, but not as slow as most other memory operations. If you in-lined the vectormul code in matrixmul you could eliminate the relatively slow call and ret instructions.
eschew obfuscation

ahmed_eltalkhawy

MichaelW,
You are right, but it is better to use DI instead of DX, because I will need it only to store the elements in the result matrix.
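For anyone finding this thread later, here is a rough sketch of how the last two posts could fit together. It is one possible reading, not the code ahmed ended up with. The assumptions: DI holds the byte offset into result and is pushed once per dot product (rather than around every multiply) so it can double as the high word of the sum, and vectormul is inlined to drop the call and ret. Elements are treated as unsigned 16-bit values, and the labels rowlp, collp and dotlp and the .stack line are not from the original program.

```asm
.model small
.stack 100h
.data
   m      equ   2                 ; rows in xmat
   k      equ   2                 ; columns in xmat, rows in ymat
   n      equ   2                 ; columns in ymat
   xmat   dw    1,1,1,1
   ymat   dw    1,1,1,1
   result dw    m*n*2 dup (0)     ; m*n sums, two words each
.code
main:   mov   ax,@data
        mov   ds,ax
        call  matrixmul
        mov   ah,4ch
        int   21h

matrixmul proc near
        xor   si,si               ; SI = byte offset of current xmat row
        xor   di,di               ; DI = byte offset of next result entry
        mov   cl,m                ; CL = rows left
rowlp:  mov   ch,n                ; CH = columns left for this row
        xor   bx,bx               ; BX = byte offset of current ymat column
collp:  push  cx                  ; save both outer counters
        push  si
        push  bx
        push  di                  ; save result offset; DI is reused below
        xor   bp,bp               ; low word of the sum
        xor   di,di               ; high word of the sum
        mov   cl,k                ; CL = products left in this dot product
dotlp:  mov   ax,xmat[si]
        mul   ymat[bx]            ; DX:AX = unsigned product
        add   bp,ax
        adc   di,dx
        add   si,2                ; next element in the xmat row
        add   bx,n*2              ; next element in the ymat column
        dec   cl
        jnz   dotlp
        mov   ax,di               ; move the high word out of DI
        pop   di                  ; restore the result offset
        mov   result[di],bp       ; store low word
        mov   result[di+2],ax     ; store high word
        add   di,4                ; advance to the next result entry
        pop   bx
        pop   si
        pop   cx
        add   bx,2                ; next column of ymat
        dec   ch
        jnz   collp
        add   si,k*2              ; next row of xmat
        dec   cl
        jnz   rowlp
        ret
matrixmul endp
        end   main
```

The stack traffic happens once per result element, outside the innermost loop, so the loop body itself touches memory only for the two operands it has to read anyway.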