I want to multiply a matrix X of dimension m×k by a matrix Y of dimension k×n.
I wrote the following code, but I have run short of registers to point to the resulting matrix.
;==================================================
;
; PROGRAM : top.asm
;
; AUTHOR : ahmed el talkhawy
;
; PURPOSE : matrix matrix multiplication
;
;==================================================
.model small
;-----------------------------------------------
.data
m equ 2
k equ 2
n equ 2
xmat dw 1,1,1,1
ymat dw 1,1,1,1
result dw m*n*2 dup (0)
;-----------------------------------------------
.code
main: mov ax,@data
mov ds,ax
call matrixmul
mov ah,4ch
int 21h
;--------------------------------------------------------------------------
matrixmul proc near
mov si,0
mov ax,0
mov cl,m
lp2: mov ch,n
mov bx,0
lp1: call vectormul
add bx,2
dec ch
jnz lp1
add si,k*2
dec cl
jnz lp2
ret
matrixmul endp
;--------------------------------------------------------------------------
vectormul proc near
push cx
push si
push bx
mov bp,0
mov di,0
mov cl,k
lp: mov ax,xmat[si]
mul ymat[bx]
add bp,ax
adc di,dx
add si,2
add bx,2*n
dec cl
jnz lp
pop bx
pop si
pop cx
mov result[si][bx],bp ;wrong try
mov result[si][bx+2],di ;wrong try
ret
vectormul endp
;--------------------------------------------------------------------------
scalmul proc near
mov ax,xmat[si]
mul ymat[bx]
add bp,ax
adc di,dx
ret
scalmul endp
;--------------------------------------------------------------------------
end main
Can anyone help me here?
If you are short of registers then you can save stuff to memory with push or mov.
AeroASM,
Sorry, I missed your posting because I did not examine the full topic; as soon as I saw int 21h, I moved it.
ahmed_eltalkhawy,
Please post your questions in this area. You will probably get more responses here, too.
Paul
The problem is that I don't want to accumulate the results of the vectormul procedure in memory, because that will make my program slower.
ahmed,
By the way, welcome to the forum; I hope you find this a useful relationship. About accumulating results in memory: you have to live with what you have, and the register set is very small. Have you tried using the stack? Be careful!
Paul
I know what matrix-matrix multiplication is, but I am having a little difficulty understanding the code. Please could you post a commented version?
Using memory is not as slow as you might think, because of the prefetch cycle.
I will send a new version of the code with comments, but could you send me any matrix-matrix multiplication code that does not access memory a lot in the loops?
Here is the commented code:
;==================================================
;
; PROGRAM : top.asm
;
; AUTHOR : ahmed el talkhawy
;
; PURPOSE : matrix matrix multiplication
;
;==================================================
.model small
;-----------------------------------------------
.data
m equ 2 ;no. of rows in xmat
k equ 2 ;no. of columns in xmat and rows in ymat
n equ 2 ;no. of columns in ymat
xmat dw 1,1,1,1
ymat dw 1,1,1,1
result dw m*n*2 dup (0)
;-----------------------------------------------
.code
main: mov ax,@data
mov ds,ax
call matrixmul
mov ah,4ch
int 21h
;--------------------------------------------------------------------------
matrixmul proc near
mov si,0 ;points to rows in xmat
mov ax,0
mov cl,m
lp2: mov ch,n ;no. of columns to be multiplied
mov bx,0 ;points to columns in ymat
lp1: call vectormul
add bx,2 ;points to next column in ymat (ymat is stored row by row, so the next column is one word further)
dec ch
jnz lp1 ;loop until the row of xmat is multiplied with every column in ymat
add si,k*2 ;points to next row in xmat
dec cl ;loop until every row of xmat is multiplied with every column in ymat
jnz lp2
ret
matrixmul endp
;--------------------------------------------------------------------------
vectormul proc near
push cx
push si
push bx
mov bp,0 ;low word of the accumulated result
mov di,0 ;high word of the accumulated result
mov cl,k ;loop by the number of elements in the xmat's row or ymat's column
lp: mov ax,xmat[si]
mul ymat[bx]
add bp,ax ;accumulate the low word of the product
adc di,dx ;accumulate the high word of the product plus carry
add si,2 ;points to next element in xmat's row
add bx,2*n ;points to next element in ymat's column
dec cl
jnz lp
pop bx
pop si
pop cx
mov result[si][bx],bp ;wrong try: needs another register to point to the result matrix
mov result[si][bx+2],di ;wrong try
ret
vectormul endp
;--------------------------------------------------------------------------
end main
Aero, instructions that access memory are much slower on the 8088/8086. For example:
mov reg,reg ;2 clocks
mov reg,mem ;8 clocks + EA clocks
add reg,reg ;3 clocks
add reg,mem ;9 clocks + EA clocks
EA (effective address) clocks vary from 5 for base or index (indirect memory operands) to 12 for base plus index plus displacement. So for a mov instruction you are looking at 2 clocks versus 13-20 clocks.
Ahmed, you could use CX for your additional register and control the loop by comparing SI to a pre-calculated termination value. Or if your code size is not restricted you could unroll the loop.
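A minimal sketch of the compare-based loop control suggested above, not a definitive implementation. Everything beyond the original listing is an assumption made for illustration: the .stack directive, the 2×3 by 3×2 test data, and the labels rowlp and collp. The outer loops compare SI and BX against assemble-time constants, which frees CX to hold the byte offset into result. The inner loop compares BX rather than SI, because BX's bound (2*n*k) is a constant while SI's bound changes with every row. CX cannot be used as an address register on the 8086, so it is copied into BX just before the store.

```asm
.model small
.stack 100h                 ; assumption: the original listing declares no stack
.data
m       equ 2               ; rows in xmat
k       equ 3               ; columns in xmat, rows in ymat
n       equ 2               ; columns in ymat
xmat    dw 1,2,3, 4,5,6     ; illustrative test data, stored row by row
ymat    dw 7,8, 9,10, 11,12
result  dw m*n*2 dup (0)    ; one 32-bit value (two words) per element
.code
main:   mov ax,@data
        mov ds,ax
        call matrixmul      ; expected products: 58, 64, 139, 154
        mov ax,4c00h
        int 21h

matrixmul proc near
        xor si,si           ; SI = byte offset of current xmat row
        xor cx,cx           ; CX = byte offset of next result element
rowlp:  xor bx,bx           ; BX = byte offset of current ymat column
collp:  call vectormul
        add bx,2            ; next column of ymat
        cmp bx,2*n          ; constant bound replaces the CH counter
        jb collp
        add si,2*k          ; next row of xmat
        cmp si,2*k*m        ; constant bound replaces the CL counter
        jb rowlp
        ret
matrixmul endp

vectormul proc near
        push si
        push bx
        xor bp,bp           ; BP = low word of the running sum
        xor di,di           ; DI = high word of the running sum
lp:     mov ax,xmat[si]
        mul ymat[bx]        ; DX:AX = product
        add bp,ax
        adc di,dx
        add si,2            ; next element in the xmat row
        add bx,2*n          ; next element in the ymat column
        cmp bx,2*n*k        ; past the last row of ymat?
        jb lp
        mov bx,cx           ; CX cannot address memory, so copy it to BX
        mov result[bx],bp   ; store low word
        mov result[bx+2],di ; store high word
        add cx,4            ; advance to the next result element
        pop bx
        pop si
        ret
vectormul endp
end main
```

The sketch assumes m, k and n are known at assembly time, so every loop bound is an immediate operand. If the dimensions were only known at run time, the bounds would have to live in registers or memory, and the register pressure would return.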
Dear MichaelW,
You are completely right, but the two solutions you suggested won't fit me: my program should be general to any matrix dimensions, so I can't unroll the loop, and the compare also takes a lot of time. So I need to free up another register in the vectormul procedure.
Use a segment register like es (someone will probably tell me off for telling you this, but hey.)
Ahmed,
You could use DX for the additional register and push/pop it around the multiply. Push and pop are slow, but not as slow as most other memory operations. If you in-lined the vectormul code in matrixmul you could eliminate the relatively slow call and ret instructions.
MichaelW,
You are right, but it is better to use di instead of dx, because I will need it only to store the elements in the result matrix.
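One way the suggestions in this thread might fit together, offered as a sketch under stated assumptions rather than the finished program. vectormul is in-lined to avoid the call and ret, and DI serves as the result pointer. Because DI is also needed for the high word of the running sum, it is pushed once before each dot product and popped once after it, so no push or pop sits inside the innermost loop. The test data, the labels, the .stack directive and the use of the loop instruction are assumptions that do not appear in the original code.

```asm
.model small
.stack 100h                 ; assumption: the original listing declares no stack
.data
m       equ 2               ; rows in xmat
k       equ 3               ; columns in xmat, rows in ymat
n       equ 2               ; columns in ymat
xmat    dw 1,2,3, 4,5,6     ; illustrative test data, stored row by row
ymat    dw 7,8, 9,10, 11,12
result  dw m*n*2 dup (0)    ; one 32-bit value (two words) per element
.code
main:   mov ax,@data
        mov ds,ax
        call matrixmul      ; expected products: 58, 64, 139, 154
        mov ax,4c00h
        int 21h

matrixmul proc near
        xor si,si           ; SI = byte offset of current xmat row
        xor di,di           ; DI = byte offset of next result element
rowlp:  xor bx,bx           ; BX = byte offset of current ymat column
collp:  push si             ; keep the row start
        push bx             ; keep the column start
        push di             ; keep the result pointer; DI is reused below
        xor bp,bp           ; BP = low word of the running sum
        xor di,di           ; DI = high word of the running sum
        mov cx,k            ; k products per result element
dotlp:  mov ax,xmat[si]
        mul ymat[bx]        ; DX:AX = product
        add bp,ax
        adc di,dx
        add si,2            ; next element in the xmat row
        add bx,2*n          ; next element in the ymat column
        loop dotlp
        mov ax,di           ; move the high word out of DI
        pop di              ; DI = result pointer again
        mov result[di],bp   ; store low word
        mov result[di+2],ax ; store high word
        add di,4            ; advance to the next result element
        pop bx
        pop si
        add bx,2            ; next column of ymat
        cmp bx,2*n
        jb collp
        add si,2*k          ; next row of xmat
        cmp si,2*k*m
        jb rowlp
        ret
matrixmul endp
end main
```

The three pushes and pops run m*n times in total, while the multiply loop runs m*n*k times, so the stack traffic becomes a smaller share of the work as k grows.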