The MASM Forum Archive 2004 to 2012

General Forums => The Workshop => Topic started by: a_h on September 07, 2005, 04:18:08 PM

Title: loop problem that I cannot find!
Post by: a_h on September 07, 2005, 04:18:08 PM
Hi folks!

Currently I'm trying to replace some inline assembly with external MASM files, among other things to get rid of unnecessary overhead like this:


for (int comp=0; comp<count; comp++)
{
    Block_Ptr = block[comp];

    __asm
    {
        mov    eax, [Block_Ptr];
        pxor   xmm0, xmm0;
        movdqa [eax+0  ], xmm0;
        movdqa [eax+16 ], xmm0;
        movdqa [eax+32 ], xmm0;
        movdqa [eax+48 ], xmm0;
        movdqa [eax+64 ], xmm0;
        movdqa [eax+80 ], xmm0;
        movdqa [eax+96 ], xmm0;
        movdqa [eax+112], xmm0;
    }
}


My fastcall replacement looks like this (int count is in ecx, block is defined as short *block[8], and its base address &block[0] is in edx):


    sub    ecx, 1
    pxor   xmm0, xmm0;
    .repeat
        mov    eax, [edx+4*ecx]
        movdqa [eax+0  ], xmm0;
        movdqa [eax+16 ], xmm0;
        movdqa [eax+32 ], xmm0;
        movdqa [eax+48 ], xmm0;
        movdqa [eax+64 ], xmm0;
        movdqa [eax+80 ], xmm0;
        movdqa [eax+96 ], xmm0;
        movdqa [eax+112], xmm0;
    .untilcxz

But my replacement does not do the same as the for-loop. Of course it counts down instead of up, but that can't be the problem, can it? The for-loop never reaches count, so I subtract 1 from it (first line) and count down from count-1 to 0 inclusive. The for-loop does the same (just ascending) and yields the correct frame, which mine does not.

Changing 4*ecx into 2*ecx (it's a short pointer) does not change anything.

How can I access the elements of short *block[8]?

Unfortunately I'm not able to debug that dll, so I hope somebody can help me out! Thanks a lot!

Cheers, Hannes
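
For anyone comparing results without a debugger, here is a portable C++ sketch of what the original for-loop does: eight 16-byte stores of zero per block, i.e. 128 bytes (64 shorts) cleared at each block[comp]. The name clear_blocks_reference is hypothetical, and unlike movdqa this version does not require 16-byte alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Reference behaviour of the inline-asm loop from the first post:
// clears the first 128 bytes (8 x 16-byte stores) of every block.
// Hypothetical helper for checking an assembly replacement against.
void clear_blocks_reference(short* const* block, int count) {
    assert(block != nullptr);
    constexpr std::size_t kBytesPerBlock = 8 * 16;  // 64 shorts
    for (int comp = 0; comp < count; ++comp) {
        std::memset(block[comp], 0, kBytesPerBlock);
    }
}
```

Running both this and the assembly routine on copies of the same data and comparing the buffers should show which blocks the assembly version misses.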
Title: Re: loop problem that I cannot find!
Post by: a_h on September 07, 2005, 05:39:54 PM
Hey, that's my assembly present for today... just use WORD PTR [edx+4*ecx] to indicate that a short sits behind the address!

Works with my test procedure, let's go for the real thing!

Cheers, Hannes
Title: Re: loop problem that I cannot find!
Post by: a_h on September 07, 2005, 06:07:54 PM
Err, no. Doesn't work...

Maybe somebody has a clue?

Thanks! Cheers, Hannes
Title: Re: loop problem that I cannot find!
Post by: tenkey on September 08, 2005, 03:31:57 AM
Is that a typo, or did you really put the subtraction in front of the loop instead of within it?
Title: Re: loop problem that I cannot find!
Post by: raymond on September 08, 2005, 04:09:19 AM
If your block contains WORDS which you want to load into EAX using a counter as a displacement into the block, try the following:

movzx eax,word ptr[edx+ecx*2]

Raymond
Title: Re: loop problem that I cannot find!
Post by: a_h on September 08, 2005, 09:16:29 AM
Thanks for your tips!

@tenkey: that's not a typo. The counter that gets passed in is 1 too large, and .repeat/.untilcxz decrements the ecx register by itself.

@raymond: the pointers themselves take up 4 bytes of storage each, right? Only the variable behind each pointer is a short, so I have to use 4*ecx?
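
A small C++ sketch of that point, assuming block is declared as short *block[8] as stated in the first post: the array elements are pointers, so the distance between block[i] and block[0] is i * sizeof(short*), regardless of what the pointers point to. On a 32-bit x86 build sizeof(short*) is 4, which matches the 4*ecx scale. The function name element_offset is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Byte offset of block[i] from &block[0]. The elements are pointers,
// so the stride is sizeof(short*), not sizeof(short).
std::size_t element_offset(short* const* block, std::size_t i) {
    assert(block != nullptr);
    auto base = reinterpret_cast<std::uintptr_t>(&block[0]);
    auto elem = reinterpret_cast<std::uintptr_t>(&block[i]);
    return static_cast<std::size_t>(elem - base);
}
```

Built as 64-bit code the same function would report a stride of 8, so the scale factor follows the pointer size of the target, not the pointee type.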

However, the main problem is this: if I pass block[comp] (do the loop in C++ and call the asm procedure for every array element) and remove the .repeat loop, everything works.

Only when I want to do the loop inside the asm procedure, so that I have to pass the base address &block[0] (in edx), nothing works. I can't even access the 1st element correctly (I set the counter to 0, so I should get the 1st element only):

    mov    ecx, 0
    .repeat
        movzx  eax, WORD PTR [edx+4*ecx]
        movdqa [eax+0  ], xmm0;
        movdqa [eax+16 ], xmm0;
        movdqa [eax+32 ], xmm0;
        movdqa [eax+48 ], xmm0;
        movdqa [eax+64 ], xmm0;
        movdqa [eax+80 ], xmm0;
        movdqa [eax+96 ], xmm0;
        movdqa [eax+112], xmm0;
    .untilcxz


I have to dereference with [] once more, since the procedure is procedure(..., short **block) instead of procedure(..., short *block). Doing the same with DWORD PTR doesn't help either. However, this should be equivalent to


       procedure(...,block[0])

with

    pxor   xmm0, xmm0;
    movdqa [edx+0  ], xmm0;
    movdqa [edx+16 ], xmm0;
    movdqa [edx+32 ], xmm0;
    movdqa [edx+48 ], xmm0;
    movdqa [edx+64 ], xmm0;
    movdqa [edx+80 ], xmm0;
    movdqa [edx+96 ], xmm0;
    movdqa [edx+112], xmm0;


but it isn't.

It's driving me crazy!

Thanks for your help! Hannes
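
A note on the mov ecx, 0 test in this post, sketched in C++. It assumes .untilcxz assembles to the x86 loop instruction, which decrements ecx first and jumps back while the result is nonzero. Under that assumption, starting at 0 does not give a single pass: the body runs once with index 0, then ecx wraps around and the loop keeps going. The names loop_step and demo_zero_start are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Models one execution of `loop` in 32-bit mode:
// ECX is decremented first, then the branch is taken if ECX != 0.
bool loop_step(std::uint32_t& ecx) {
    --ecx;            // unsigned wrap-around: 0 becomes 0xFFFFFFFF
    return ecx != 0;  // true means "jump back to the top of the body"
}

// What happens after the first pass when the counter starts at 0.
void demo_zero_start() {
    std::uint32_t ecx = 0;            // as set by `mov ecx, 0`
    bool jump_back = loop_step(ecx);
    assert(jump_back);                // the loop does not stop here
    assert(ecx == 0xFFFFFFFFu);       // the next pass would use this as index
}
```

To run the body exactly once with this construct, the counter has to start at 1, not 0.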
Title: Re: loop problem that I cannot find!
Post by: a_h on September 08, 2005, 05:53:54 PM
Boah, that one nearly drove me crazy.

The solution is so simple that it's a shame it took so long.


    .repeat
        mov    eax, DWORD PTR [edx+4*ecx-4]
        movdqa [eax+0  ], xmm0;
        movdqa [eax+16 ], xmm0;
        movdqa [eax+32 ], xmm0;
        ...


@tenkey: yeah, the .repeat did have a problem... it never runs the body with ecx = 0, so a -4 has to be added to the address (and the sub ecx, 1 has to be left out).
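
To see why the -4 is needed, here is a small C++ model of the index sequence. It assumes .untilcxz assembles to the x86 loop instruction (decrement ecx, jump back while ecx is nonzero); visited_indices is a hypothetical helper, not part of the original code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Array indices touched by
//     [sub ecx, 1]                       ; only if pre_decrement
//     .repeat
//         mov eax, [edx+4*ecx+bias_bytes]
//     .untilcxz                          ; modelled as: dec ecx / jnz
std::vector<std::uint32_t> visited_indices(std::uint32_t count,
                                           bool pre_decrement,
                                           std::int32_t bias_bytes) {
    std::uint32_t ecx = count;
    if (pre_decrement) {
        ecx -= 1;
    }
    assert(ecx != 0);  // ecx == 0 would wrap around and run about 2^32 times

    std::vector<std::uint32_t> visited;
    do {
        // 4-byte pointers: a byte bias of -4 is one element back.
        visited.push_back(ecx + static_cast<std::uint32_t>(bias_bytes / 4));
    } while (--ecx != 0);
    return visited;
}
```

With count = 3, the first version (sub ecx, 1 and no bias) visits indices 2 and 1 and never reaches 0, while the version with -4 and no subtraction visits 2, 1, 0, which is the same set as the C++ for-loop.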

@raymond: don't ask me why, but the code above works. WORD PTR and/or 2*ecx just gives garbage (maybe because the pointers themselves are 4 bytes, and only what [eax+...] points to is of type short).
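
That guess can be illustrated in C++. The sketch assumes 32-bit pointers stored little-endian, as on x86; the address value is made up for the example, and movzx_word and demo_truncated_pointer are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// What `movzx eax, WORD PTR [mem]` leaves in EAX when the 4 bytes at
// [mem] hold a 32-bit pointer: only the low 16 bits, zero-extended.
std::uint32_t movzx_word(std::uint32_t stored_pointer) {
    return stored_pointer & 0xFFFFu;
}

void demo_truncated_pointer() {
    const std::uint32_t pointer = 0x0012FF40u;  // made-up block address
    const std::uint32_t loaded = movzx_word(pointer);
    assert(loaded == 0x0000FF40u);  // upper half of the address is lost
    assert(loaded != pointer);      // the stores go to the wrong place
}
```

A scale of 2*ecx has a related effect: with 4-byte elements, every odd ecx lands in the middle of a pointer instead of at the start of one.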

Thanks for your help! Cheers and have a nice evening, Hannes