Branch if an SSE register equals zero

Started by HooKooDooKu, December 15, 2011, 06:17:52 PM

HooKooDooKu

Can someone provide me with a better (quicker) way to branch based on an SSE register being zero?

So far the only thing I can come up with on my own is to 'or' the high qword and low qword of the register together, then use 'ucomisd' to compare the result with another register of all zeros to update EFLAGS, then branch based on EFLAGS.

pxor    xmm0, xmm0    ;clear xmm0
movdqa  xmm1, xmm3    ;copy xmm3, then shift its upper qword
psrldq  xmm1, 8       ; down into the lower qword of xmm1
por     xmm1, xmm3    ;or the upper and lower qwords together
ucomisd xmm1, xmm0    ;compare the combined qwords to zero
jne Location02        ;branch based on EFLAGS

qWord

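; check xmm0
pxor xmm1,xmm1        ; xmm1 = 0
pcmpeqd xmm1,xmm0     ; each dword of xmm1 = FFFFFFFFh where xmm0's dword is zero
pmovmskb eax,xmm1     ; eax = top bit of each of the 16 bytes of xmm1
cmp eax,0ffffh        ; all 16 bits set means all 128 bits of xmm0 are zero
je @xmm0_is_zero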
FPU in a trice: SmplMath
It's that simple!

HooKooDooKu

Well that is a solution... unfortunately it's just as many lines of code as I already have (once implemented).

My actual implementation is trying to test xmm3, but I've already got xmm7 permanently set to all zeros (because there are several other places where I'm doing comparisons to zero). Additionally, I have to preserve the non-zero value of xmm3. So my current code looks like this:


   movdqa  xmm1, xmm3      ;if xmm3 == 0
   psrldq  xmm1, 8         ;   ...
   por     xmm1, xmm3      ;   ...
   ucomisd xmm1, xmm7      ;   ...
   je      RLL10         ;   then goto next row


So I'm already at an implementation that takes 5 instructions. 
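
For comparison, here is a shorter variant. It is only a sketch and assumes the target CPU and assembler support SSE4.1, which nothing in this thread confirms. The `ptest` instruction sets ZF when the bitwise AND of its two operands is all zeros, so testing a register against itself checks all 128 bits without modifying it:

```asm
    ; Assumes SSE4.1 is available (an assumption, not stated in the thread).
    ptest   xmm3, xmm3      ; ZF = 1 if (xmm3 AND xmm3) == 0, xmm3 is unchanged
    jz      RLL10           ; all 128 bits of xmm3 are zero: goto next row
```

This needs neither a scratch register nor the zeroed xmm7, but it will fault on processors without SSE4.1.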

qWord

HooKooDooKu,

Fewer instructions do not imply that the code is faster; what matters more is what each instruction does. Also, your implementation can't work correctly, because you may pass invalid floating-point values to ucomisd.
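
To make the ucomisd concern concrete: the instruction interprets the low 64 bits as a double, so some non-zero bit patterns still set ZF. The fragment below is a minimal sketch; the data labels and the `TreatedAsZero` jump targets are hypothetical names used only for illustration, and the two cases are meant to be read independently.

```asm
.data
NegZero   dq 8000000000000000h     ; bit pattern of -0.0 (sign bit set, not all-zero)
QuietNaN  dq 7FF8000000000000h     ; bit pattern of a quiet NaN

.code
    pxor    xmm7, xmm7             ; xmm7 = all zeros, as in the posts above

    ; Case 1: -0.0 compares equal to +0.0
    movq    xmm1, QWORD PTR NegZero
    ucomisd xmm1, xmm7             ; equal: ZF = 1
    je      TreatedAsZero1         ; taken, although xmm1 holds non-zero bits

    ; Case 2: a NaN compares "unordered"
    movq    xmm1, QWORD PTR QuietNaN
    ucomisd xmm1, xmm7             ; unordered: ZF = PF = CF = 1
    je      TreatedAsZero2         ; also taken, because ZF is set
```

An integer comparison such as pcmpeqd does not have this problem, because it compares raw bits rather than floating-point values.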
When you need the zero-test more than once, write a function or a macro.
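
One possible shape for such a macro is sketched below, built on the pcmpeqd test shown earlier. The macro name and its parameters are hypothetical and not taken from any library. It clobbers eax and the scratch register but preserves the register under test.

```asm
; Hypothetical helper: jump to 'target' if all 128 bits of 'src' are zero.
;   src    - XMM register to test (preserved)
;   tmp    - scratch XMM register (clobbered)
;   target - label to jump to when src is zero
; Also clobbers eax and EFLAGS.
JmpIfXmmZero MACRO src:REQ, tmp:REQ, target:REQ
    pxor     tmp, tmp              ; tmp = 0
    pcmpeqd  tmp, src              ; dword = FFFFFFFFh where src's dword is zero
    pmovmskb eax, tmp              ; one bit per byte of tmp
    cmp      eax, 0FFFFh           ; all 16 bits set <=> src is entirely zero
    je       target
ENDM

    ; Usage, following the register choices from the earlier post:
    JmpIfXmmZero xmm3, xmm1, RLL10
```

If a register such as xmm7 is already known to hold zeros, the pxor could be replaced by `movdqa tmp, src` followed by `pcmpeqd tmp, xmm7`; the instruction count stays the same.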