I am building a register-level convolution algorithm (with maximum size 4x4) and I need help building one of my macros.
I have the idea, I just don't know how to express it correctly.
I have the xmm registers renamed like this:
A0 TEXTEQU <xmm1>
A1 TEXTEQU <xmm2>
A2 TEXTEQU <xmm3>
A3 TEXTEQU <xmm4>
B0 TEXTEQU <xmm5>
B1 TEXTEQU <xmm6>
B2 TEXTEQU <xmm7>
B3 TEXTEQU <xmm8>
I have the accumulator registers named like this:
v_3 TEXTEQU <xmm9>
v_2 TEXTEQU <xmm10>
v_1 TEXTEQU <xmm11>
v0 TEXTEQU <xmm12>
vP1 TEXTEQU <xmm13>
vP2 TEXTEQU <xmm14>
vP3 TEXTEQU <xmm15>
they stand for v(-3), v(-2), v(-1), v(0), v(1), v(2), v(3)
The macro I post below is basically what I want to do, but I don't know how to express it in the compile-time language syntax. Can anyone help me out?
conv_axb_Reg a, b
i EQ 0
j EQ 0
; this line is pseudo-code
STR_LIST = {v_3, v_2, v_1, v0, vP1, vP2, vP3}
WHILE i LT a
WHILE j LT b
;
mulAccum A&i, B&j, STR_LIST(i-j), xmm0 ; v(i-j) += a(i)*b(j), using xmm0 as a temporary register
ENDM
ENDM
You may show us an example call and the code that should be produced by the macro.
(Also, I can change the names of the v variables if that makes the macro easier to write.)
conv_axb_Reg 4,4 should generate this code:
mulAccum A0, B3, v_3, xmm0
mulAccum A0, B2, v_2, xmm0
mulAccum A0, B1, v_1, xmm0
mulAccum A0, B0, v0, xmm0
mulAccum A1, B3, v_2, xmm0
mulAccum A1, B2, v_1, xmm0
mulAccum A1, B1, v0, xmm0
mulAccum A1, B0, vP1, xmm0
mulAccum A2, B3, v_1, xmm0
mulAccum A2, B2, v0, xmm0
mulAccum A2, B1, vP1, xmm0
mulAccum A2, B0, vP2, xmm0
mulAccum B3, A3, v0, xmm0
mulAccum B2, A3, vP1, xmm0
mulAccum B1, A3, vP2, xmm0
mulAccum B0, A3, vP3, xmm0
sorry, but what is mulAccum? What datatypes are you using?
mulAccum MACRO A,B,v,tmp
movaps tmp, A
mulps tmp, B
addps v, tmp
ENDM
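As a worked example of the macro above, here is a sketch of what a single call should expand to once the TEXTEQU aliases from the first post are substituted. It assumes the aliases are exactly as listed there (A0 = xmm1, B3 = xmm8, v_3 = xmm9); the instructions are the three from mulAccum itself, which operate on four packed single-precision floats.

```masm
; mulAccum A0, B3, v_3, xmm0  expands, with the aliases above, to:
movaps xmm0, xmm1       ; tmp = A0
mulps  xmm0, xmm8       ; tmp = tmp * B3 (four packed single-precision floats)
addps  xmm9, xmm0       ; v(-3) = v(-3) + tmp
```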
it may not produce the correct output, but it should give you the idea how to solve your problem:
Convolution macro a,b
LOCAL i,j
i=0
WHILE i LT a
j=0
WHILE j LT b
mulAccum @CatStr(<xmm>,%1+i),@CatStr(<xmm>,%5+j),@CatStr(<xmm>,%12+i-j),xmm0
j = j + 1
ENDM
i = i + 1
ENDM
endm
Quote
Convolution 4,4
mulAccum xmm1,xmm5,xmm12,xmm0
mulAccum xmm1,xmm6,xmm11,xmm0
mulAccum xmm1,xmm7,xmm10,xmm0
mulAccum xmm1,xmm8,xmm9,xmm0
mulAccum xmm2,xmm5,xmm13,xmm0
mulAccum xmm2,xmm6,xmm12,xmm0
mulAccum xmm2,xmm7,xmm11,xmm0
mulAccum xmm2,xmm8,xmm10,xmm0
mulAccum xmm3,xmm5,xmm14,xmm0
mulAccum xmm3,xmm6,xmm13,xmm0
mulAccum xmm3,xmm7,xmm12,xmm0
mulAccum xmm3,xmm8,xmm11,xmm0
mulAccum xmm4,xmm5,xmm15,xmm0
mulAccum xmm4,xmm6,xmm14,xmm0
mulAccum xmm4,xmm7,xmm13,xmm0
mulAccum xmm4,xmm8,xmm12,xmm0
For debugging purposes, you can add %echo before the mulAccum, so the produced calls are shown in the build console.
Thanks, I'm starting to get the hang of it and that %echo really helps.
Can you explain what the % sign does and how it relates to the i and j variables (i.e. does the assembler see i and j as integers, and %1 casts the character '1' to an integer?)
I tried replacing the xmm with my A and B variables and I got this weird output (it is replacing my A and B with '4'):
LOCAL i, j
i = 0
WHILE i LT a
j = 0
WHILE j LT b
%echo mulAccum @CatStr(<A>,%0+i), @CatStr(<B>,%0+j), @CatStr(<xmm>,%12+i-j), xmm0
j = j + 1
ENDM
i = i + 1
ENDM
mulAccum 40, 40, xmm12, xmm0
mulAccum 40, 41, xmm11, xmm0
mulAccum 40, 42, xmm10, xmm0
mulAccum 40, 43, xmm9, xmm0
mulAccum 41, 40, xmm13, xmm0
mulAccum 41, 41, xmm12, xmm0
mulAccum 41, 42, xmm11, xmm0
mulAccum 41, 43, xmm10, xmm0
mulAccum 42, 40, xmm14, xmm0
mulAccum 42, 41, xmm13, xmm0
mulAccum 42, 42, xmm12, xmm0
mulAccum 42, 43, xmm11, xmm0
mulAccum 43, 40, xmm15, xmm0
mulAccum 43, 41, xmm14, xmm0
mulAccum 43, 42, xmm13, xmm0
mulAccum 43, 43, xmm12, xmm0
Quote from: qWord on March 16, 2012, 02:31:47 AM
Quote from: ASMManiac on March 16, 2012, 04:41:41 AM
Can you explain what the % sign does and how it relates to the i and j variables (i.e. does the assembler see i and j as integers, and %1 casts the character '1' to an integer?)
% is the expansion operator, which is used to replace text macros and equates with the corresponding text; its behaviour depends on the context.
In the above case, % is used to force evaluation of the given expressions: %1+i, %12+i-j, ...
i and j are integers. Also, %0+i is the same as %i+0, %(0+i) and %i.
For more details about macros, please read chapter 9 in MASM Programmer's Guide (http://www.masm32.com/board/index.php?topic=5433.0)
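A minimal sketch of the difference between forcing evaluation with % and keeping plain text, assuming the default decimal radix. The names t1 and t2 are made up for this illustration and do not appear in the thread.

```masm
i = 2
j = 3

t1 TEXTEQU %12+i-j      ; expression is evaluated first: t1 holds the text "11"
t2 TEXTEQU <12+i-j>     ; no evaluation: t2 holds the literal text "12+i-j"

; A leading % expands the text macros on the line before ECHO runs,
; so the stored text of t1 and t2 is shown instead of their names.
%echo t1
%echo t2
```

Because i and j are numeric equates rather than text macros, they only turn into digits where % forces an expression to be evaluated, as in the t1 line.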
I've tried using the %, &, <>, and ! operators in my code above, but I still can't get it to print out the 'A' instead of '4'.
If A is a text macro, there is no way, because the resulting expression will always be expanded. For debugging, you can add an underscore: <_A>
A is a built-in text macro?
Can it be expanded as desired in the assembly code, just not with the %echo?
This gives me (for i=2)
temp TEXTEQU @CatStr(<mulAccum !A >, %i)
mulAccum A 2
This gives me
temp TEXTEQU @CatStr(<mulAccum !A>, %i)
mulAccum xmm3
is it possibly because the macro I defined uses lower-case a and b?
conv_axb_Reg MACRO a:req, b:req
.....
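One way to sidestep the question of whether the parameters a and b collide with the A and B prefixes is to give the macro parameters names that cannot match them. The sketch below does that and also renames the accumulators so that the index can be computed directly. It is an assumption-laden illustration, not the thread's solution: conv_sketch, na, nb and ACC0..ACC6 are hypothetical names, and it relies on the A0..A3, B0..B3 aliases and the mulAccum macro posted earlier.

```masm
; Hypothetical accumulator aliases: ACC0..ACC6 stand for v(-3)..v(3)
ACC0 TEXTEQU <xmm9>
ACC1 TEXTEQU <xmm10>
ACC2 TEXTEQU <xmm11>
ACC3 TEXTEQU <xmm12>
ACC4 TEXTEQU <xmm13>
ACC5 TEXTEQU <xmm14>
ACC6 TEXTEQU <xmm15>

; na, nb: number of A and B registers to use (at most 4 each)
conv_sketch MACRO na:REQ, nb:REQ
    LOCAL i, j
    i = 0
    WHILE i LT na
        j = 0
        WHILE j LT nb
            ; v(i-j) += a(i)*b(j); the +3 offset maps -3..3 onto ACC0..ACC6
            mulAccum @CatStr(<A>,%i), @CatStr(<B>,%j), @CatStr(<ACC>,%(i-j+3)), xmm0
            j = j + 1
        ENDM
        i = i + 1
    ENDM
ENDM
```

Invoked as conv_sketch 4,4, the first generated call should be the one for A0, B0 and ACC3, which matches the xmm1, xmm5, xmm12 line of the Convolution 4,4 output shown earlier. Prefixing the mulAccum line with %echo is a quick way to check the rest.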
I forgot about this:
_out macro l1,n1,l2,n2,l3,n3
echo mulAccum l1&n1,l2&n2,&l3&(&n3),xmm0
endm
...
_out A,%i,B,%j,V,%i-j
mulAccum A0,B0,V(0),xmm0
mulAccum A0,B1,V(-1),xmm0
mulAccum A0,B2,V(-2),xmm0
mulAccum A0,B3,V(-3),xmm0
...
why do you need this output?
I'm trying to make the macro for convolution that I talked about above.
The specific format of the output in echo is not crucial, but I want to be sure that the code the assembler generates (i.e. by removing the echo line) is correct.
It seems like putting %echo before the line generates a different string than the assembler would generate (i.e. without the %echo before the line). Is this correct?
Quote from: qWord on March 16, 2012, 03:41:13 PM
Quote from: ASMManiac on March 16, 2012, 03:48:58 PM
It seems like putting %echo before the line generates a different string than the assembler would generate (i.e. without the %echo before the line). Is this correct?
no!
If you want to see what the assembler generates, create a listing with the command-line option /Flfile.txt /Sn. Also use .nolist/.listall to show only the interesting parts in the listing.
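A sketch of how the listing directives could be placed around the macro call, assuming the Convolution macro from earlier in the thread is already defined. The source file name in the comment (conv.asm) is a placeholder, and /c (assemble without linking) is added on the assumption that only the listing is wanted.

```masm
; Assemble with, for example:  ml /c /Flfile.txt /Sn conv.asm

.nolist                 ; keep includes, aliases and macro definitions out of the listing
; ... TEXTEQU aliases, mulAccum and Convolution definitions go here ...

.listall                ; list everything from here on, including macro expansions
    Convolution 4,4
.nolist                 ; stop listing again after the interesting part
```

The listing file should then contain the movaps/mulps/addps lines produced by each mulAccum call, with the register names already substituted.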
Okay, I did that and I see now what the assembler is doing.
I have my variables aliased like this:
A0 TEXTEQU <xmm1>
... etc...
When I do this:
mulAccum @CatStr(!A,%i), @CatStr(!B,%j), @CatStr(<v_>,%j-i), xmm0
The assembler evaluates A0 to xmm1 and passes that to mulAccum.
I thought that it would make 2 passes: the first would be to evaluate the string, the second would be to substitute the values for the variables aliased with TEXTEQU. It looks like it does it all in one pass though.
Quote from: qWord on March 16, 2012, 03:53:24 PM
Quote from: ASMManiac on March 16, 2012, 04:03:14 PM
It looks like it does it all in one pass though.
exactly.