The MASM Forum Archive 2004 to 2012

General Forums => The Campus => Topic started by: msoftprogramming on May 24, 2011, 11:29:57 AM

Title: 128-bit integer arithmetics
Post by: msoftprogramming on May 24, 2011, 11:29:57 AM
Hello everyone,
Just a very simple question: I'd like to use the 128-bit XMM registers to do some VERY simple 128-bit integer math.
In particular, I'd need to:
set an XMM register to some 128-bit integer value
compare two 128-bit integers and jump somewhere if they are equal
multiply a 128-bit XMM register value by 3
divide a 128-bit XMM register value by 2
add 1 to a 128-bit XMM register value

Is that possible?
Thank you very much!

Matteo Monti
Msoft Programming
Title: Re: 128-bit integer arithmetics
Post by: dedndave on May 24, 2011, 12:34:23 PM
might be easier to divide by 2, add that to the original, then increment
Y = 1.5X + 1
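
Assuming the three operations are meant to be chained as (3X + 1) / 2 on an odd X (the thread does not say so explicitly), the formula above gives the same result as X + (X >> 1) + 1 and never has to hold 3X, which could need more than 128 bits. The Python sketch below is only a reference model for checking an assembly routine; the names step_shortcut and step_exact are made up for illustration.

```python
MASK128 = (1 << 128) - 1  # assumed register width for this sketch


def step_shortcut(x):
    """X + X/2 + 1 using only a shift, an add and an increment (128-bit wrap)."""
    return (x + (x >> 1) + 1) & MASK128


def step_exact(x):
    """(3X + 1) / 2 computed with unbounded Python integers."""
    return (3 * x + 1) // 2


# Odd inputs only: for even X the two expressions differ.
for x in (1, 7, 27, (1 << 127) + 1):
    print(hex(x), hex(step_shortcut(x)), step_shortcut(x) == step_exact(x))
```

The last input shows the benefit: 3X + 1 would not fit in 128 bits there, but the shortcut still produces the exact result.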
Title: Re: 128-bit integer arithmetics
Post by: jj2007 on May 24, 2011, 12:55:27 PM
Quote from: msoftprogramming on May 24, 2011, 11:29:57 AM
compare two 128-bit integers and jump somewhere if they are equal

Ciao Matteo,

Here is a snippet that compares two 128-bit memory vars:
include \masm32\include\masm32rt.inc
.686
.xmm

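IsEqual128 MACRO arg1, arg2
movups xmm0, oword ptr arg1    ; unaligned load of the first 128-bit value
movups xmm1, oword ptr arg2    ; unaligned load of the second 128-bit value
psubd xmm0, xmm1               ; dword-wise difference: all zero if equal
xorps xmm1, xmm1               ; xmm1 = 0
pcmpeqb xmm0, xmm1             ; each byte becomes 0FFh where the difference byte is 0
pmovmskb eax, xmm0             ; gather the 16 byte sign bits: 0FFFFh if equal
cwde                           ; sign-extend ax: 0FFFFh becomes -1
inc eax                        ; -1 becomes 0, so the Zero flag is set if equal
EXITM <Zero?>
ENDM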

.data
x128A REAL8 123456789.0, 123456789.1
x128B REAL8 123456789.0, 123456789.1
x128C REAL8 123456789.1, 123456789.1

.code
AppName db "Masm32:", 0

start:
.if IsEqual128(x128A, x128B)
MsgBox 0, "A=B", "Hi", MB_OK
.else
MsgBox 0, "A and B are different", "Hi", MB_OK
.endif
.if IsEqual128(x128A, x128C)
MsgBox 0, "A=C", "Hi", MB_OK
.else
MsgBox 0, "A and C are different", "Hi", MB_OK
.endif
exit

end start


This is for REAL8 vars, but it works for integers, too; try this:

x128A dq 1234567890, 1234567891
x128B dq 1234567890, 1234567891
x128C dq 1234567891, 1234567891
Title: Re: 128-bit integer arithmetics
Post by: qWord on May 24, 2011, 02:00:47 PM
hi,
Quote from: msoftprogramming on May 24, 2011, 11:29:57 AM
I'd like to use XMM 128-bit registers to do some VERY simple 128-bit integer math on them.

You have obviously misunderstood what SSEx is for: it is designed for processing vectorized data (SIMD). There is no nature support for 128Bit integers - 64Bit integers are the maximum.
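
To make the SIMD point concrete, here is a small Python model of PADDQ, the SSE2 packed 64-bit add. It is a sketch for illustration only (the function names are invented here): each 64-bit lane is added on its own, so a carry out of the low lane never reaches the high lane, which is why the register cannot be treated as one 128-bit integer.

```python
MASK64 = (1 << 64) - 1
MASK128 = (1 << 128) - 1


def paddq(a, b):
    """Model of PADDQ: two independent 64-bit lane additions, carries discarded."""
    lo = ((a & MASK64) + (b & MASK64)) & MASK64
    hi = ((a >> 64) + (b >> 64)) & MASK64
    return (hi << 64) | lo


def add128(a, b):
    """True 128-bit addition with wrap-around, for comparison."""
    return (a + b) & MASK128


a = MASK64                  # low lane all ones, high lane zero
print(hex(paddq(a, 1)))     # low lane wraps to 0, high lane stays 0
print(hex(add128(a, 1)))    # the carry reaches bit 64
```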
Title: Re: 128-bit integer arithmetics
Post by: msoftprogramming on May 24, 2011, 02:35:02 PM
... all right! I think I should find something to help me understand the language a bit better. Could you tell me where I can find something like a tutorial explaining everything from the beginning? Something about every register, how each one works, and all the operations I can perform with them? Thank you very much.
Anyway, jj2007, I tried to run your code, but it always says that the numbers are different, even if I change the values...

Thank you again!

Matteo
Title: Re: 128-bit integer arithmetics
Post by: jj2007 on May 24, 2011, 02:40:59 PM
Quote from: msoftprogramming on May 24, 2011, 02:35:02 PM
Anyway, jj2007, I tried to run your code, but it always says that the numbers are different, even if I change the values...

That's odd - here it works, and it should work, as it simply tests two packed quadwords for equality (which is indeed a 128:128-bit comparison, although a very simple one).

Can anybody confirm Matteo's finding?
Title: Re: 128-bit integer arithmetics
Post by: qWord on May 24, 2011, 02:51:10 PM
jj's code works for me, but it could be done a bit easier:
.data
align 16
data1 QWORD -123,-123
data2 QWORD -123,-123
.code

movdqa xmm0,OWORD ptr data1
pcmpeqb xmm0,OWORD ptr data2
pmovmskb eax,xmm0
.if eax == 0ffffh
MsgBox 0,"equal",0,0
.endif
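
As a cross-check of the idea above, the following Python sketch models PCMPEQB and PMOVMSKB on 16-byte values. It is a reference model only; the helper names (qwords, is_equal128) are invented for this example. Every matching byte contributes one set bit, so the mask is 0FFFFh only when all 16 bytes agree.

```python
def pcmpeqb(a, b):
    """Model of PCMPEQB on two 16-byte values: 0xFF where bytes match, else 0x00."""
    return bytes(0xFF if x == y else 0x00 for x, y in zip(a, b))


def pmovmskb(v):
    """Model of PMOVMSKB: bit i of the result is the top bit of byte i."""
    return sum((byte >> 7) << i for i, byte in enumerate(v))


def is_equal128(a, b):
    return pmovmskb(pcmpeqb(a, b)) == 0xFFFF


def qwords(lo, hi):
    """Pack two 64-bit values little-endian, the way 'QWORD lo, hi' lays them out."""
    mask = (1 << 64) - 1
    return (lo & mask).to_bytes(8, "little") + (hi & mask).to_bytes(8, "little")


data1 = qwords(-123, -123)
data2 = qwords(-123, -123)
data3 = qwords(-123, -124)   # differs only in byte 8
print(is_equal128(data1, data2), is_equal128(data1, data3))
print(hex(pmovmskb(pcmpeqb(data1, data3))))
```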
Title: Re: 128-bit integer arithmetics
Post by: jj2007 on May 24, 2011, 03:18:05 PM
Yes, that's right, I forgot that pcmpeqb compares all bytes to their counterparts...
Nonetheless I would stick with movups - assuming 16-byte alignment is kind of bug-prone :wink

Quote:
IsEqual128 MACRO arg1, arg2
   movups xmm0, oword ptr arg1
   movups xmm1, oword ptr arg2
   pcmpeqb xmm0, xmm1
   pmovmskb eax, xmm0
   cwde
   inc eax
   EXITM <Zero?>
ENDM
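
The cwde / inc eax pair at the end of the macro turns "mask equals 0FFFFh" into "Zero flag set". A minimal Python model of just that step (function names are illustrative, not from the thread) shows that 0FFFFh is the only 16-bit mask that ends at zero:

```python
def cwde(eax):
    """Model of CWDE: sign-extend the low 16 bits of EAX into all 32 bits."""
    ax = eax & 0xFFFF
    return ax | 0xFFFF0000 if ax & 0x8000 else ax


def zero_flag_after_inc(mask):
    """True when 'cwde' followed by 'inc eax' leaves the Zero flag set."""
    eax = (cwde(mask) + 1) & 0xFFFFFFFF
    return eax == 0


hits = [m for m in range(0x10000) if zero_flag_after_inc(m)]
print([hex(m) for m in hits])
```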
Title: Re: 128-bit integer arithmetics
Post by: raymond on May 25, 2011, 01:20:16 AM
Quote:
.if IsEqual128(x128A, x128B)

Looking at the IsEqual128 macro, it is obvious that you need memory locations as arguments. As coded, would the x128A and x128B arguments always be interpreted as offsets with ALL assemblers or could some assemblers interpret them as actual values?
Title: Re: 128-bit integer arithmetics
Post by: lingo on May 25, 2011, 01:35:50 AM
"There is no nature support for 128Bit integers - 64Bit integers are the maximum."

Take a look at AVX: it adds new register state through the 256-bit wide YMM register file, so explicit operating system support is required to properly save and restore AVX's new registers between context switches.

I have an Intel Sandy Bridge processor, Windows 7 64-bit SP1, and MASM from VS2010 SP1, and have no problem with the new instructions. See my reply #4 here (http://www.masm32.com/board/index.php?topic=15895.0) :U
Title: Re: 128-bit integer arithmetics
Post by: qWord on May 25, 2011, 02:27:10 AM
I know about AVX,
but AFAIKS there is no nature support for arithmetic on 128Bit integers !(?)
Title: Re: 128-bit integer arithmetics
Post by: jj2007 on May 25, 2011, 08:00:17 AM
Quote from: raymond on May 25, 2011, 01:20:16 AM
Looking at the IsEqual128 macro, it is obvious that you need memory locations as arguments.

Well, not really:
movups xmm3, oword ptr x128B
.if IsEqual128(x128A, xmm3)


Quote:
As coded, would the x128A and x128B arguments always be interpreted as offsets with ALL assemblers or could some assemblers interpret them as actual values?

Most of the code posted in the Forum can be interpreted correctly only by ml.exe and jwasm.exe ...

Quote from: lingo on May 25, 2011, 01:35:50 AM
"There is no nature support for 128Bit integers

Except for the supernatural IsEqual128 macro, of course :bg
Title: Re: 128-bit integer arithmetics
Post by: hutch-- on May 25, 2011, 11:39:46 AM
This seems to run and the results appear to work, but no guarantees.  :bg You know, Eenie, meanie (blue), miney and MOE.


IF 0  ; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
                      Build this template with "CONSOLE ASSEMBLE AND LINK"
ENDIF ; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    include \masm32\include\masm32rt.inc

    cmp128 PROTO :DWORD,:DWORD

    .data
      item1 oword 0FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFh
      item2 oword 0
      item3 oword 1
      item4 oword 0FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEh
      item5 oword 0000000000000000FFFFFFFFFFFFFFFFh
      item6 oword 0FFFFFFFFFFFFFFFF0000000000000000h

    .code

start:
   
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    call main
    inkey
    exit

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

main proc

    invoke cmp128,ADDR item2,ADDR item2
    print str$(eax),13,10

    invoke cmp128,ADDR item4,ADDR item1
    print str$(eax),13,10

    invoke cmp128,ADDR item3,ADDR item2
    print str$(eax),13,10

    invoke cmp128,ADDR item5,ADDR item6
    print str$(eax),13,10

    invoke cmp128,ADDR item6,ADDR item5
    print str$(eax),13,10

    ret

main endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

cmp128 proc num1:DWORD,num2:DWORD

    mov ecx, num1
    mov edx, num2

    mov eax, [ecx+12]
    cmp eax, [edx+12]
    jb lessthan
    ja greater

    mov eax, [ecx+8]
    cmp eax, [edx+8]
    jb lessthan
    ja greater

    mov eax, [ecx+4]
    cmp eax, [edx+4]
    jb lessthan
    ja greater

    mov eax, [ecx]
    cmp eax, [edx]
    jb lessthan
    ja greater

  equal:
    xor eax, eax
    ret

  greater:
    mov eax, 1
    ret

  lessthan:
    or eax, -1
    ret

cmp128 endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

end start
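
For checking the routine above on other inputs, here is a small Python reference model of the same top-down dword comparison. It is a sketch, not part of the posted program; the helper name limbs is invented, and the item values are copied from the .data section.

```python
def limbs(value):
    """Split a 128-bit integer into four 32-bit dwords, least significant first."""
    return [(value >> (32 * i)) & 0xFFFFFFFF for i in range(4)]


def cmp128(num1, num2):
    """Unsigned compare from the top dword down: returns -1, 0 or 1."""
    a, b = limbs(num1), limbs(num2)
    for i in (3, 2, 1, 0):          # offsets +12, +8, +4, +0
        if a[i] < b[i]:
            return -1
        if a[i] > b[i]:
            return 1
    return 0


item1 = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
item2 = 0
item3 = 1
item4 = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFE
item5 = 0x0000000000000000FFFFFFFFFFFFFFFF
item6 = 0xFFFFFFFFFFFFFFFF0000000000000000

results = [cmp128(item2, item2), cmp128(item4, item1), cmp128(item3, item2),
           cmp128(item5, item6), cmp128(item6, item5)]
print(results)   # [0, -1, 1, -1, 1]
```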
Title: Re: 128-bit integer arithmetics
Post by: sinsi on May 25, 2011, 12:02:54 PM
Jumpin' Jiminy!


cmp128 proc num1:DWORD,num2:DWORD

    mov ecx, num1
    mov edx, num2

    mov eax, [ecx+12]
    sub eax, [edx+12]
    jnz above_or_below

    mov eax, [ecx+8]
    sub eax, [edx+8]
    jnz above_or_below

    mov eax, [ecx+4]
    sub eax, [edx+4]
    jnz above_or_below

    mov eax, [ecx]
    sub eax, [edx]
    jz done
   
above_or_below:
    sbb eax,eax
    jnz done
    add eax,1

done:
    ret

cmp128 endp

Unsigned I assume.
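
The tail of this version relies on sbb eax,eax, which leaves 0 or -1 in EAX depending only on the carry (borrow) from the preceding SUB. A minimal Python model of that tail for one dword pair follows; the function names are made up for this sketch.

```python
def sbb_same(cf):
    """Model of 'sbb eax,eax': EAX - EAX - CF, i.e. 0 or 0FFFFFFFFh."""
    return (0 - cf) & 0xFFFFFFFF


def sign_of_difference(a, b):
    """One unsigned dword pair through sub / sbb / add: returns 0, 1 or -1."""
    diff = (a - b) & 0xFFFFFFFF      # sub eax,[edx+n]
    cf = 1 if a < b else 0           # borrow out of the unsigned subtraction
    if diff == 0:
        return 0                     # dwords equal: the routine moves on (or is done)
    eax = sbb_same(cf)               # 0FFFFFFFFh when num1 is below
    if eax == 0:
        eax = 1                      # add eax,1 when num1 is above
    return eax - (1 << 32) if eax & 0x80000000 else eax   # view EAX as signed


print(sign_of_difference(1, 2), sign_of_difference(2, 1), sign_of_difference(5, 5))
```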
Title: Re: 128-bit integer arithmetics
Post by: hutch-- on May 25, 2011, 12:11:34 PM
 :bg
Title: Re: 128-bit integer arithmetics
Post by: dedndave on May 25, 2011, 12:56:04 PM
yikes !!!!!!
you guys don't have your thinking caps on - lol

why not use SUB once, then SBB
get rid of all those JMP's
when you're done, zero and carry are set for you

also, it might be a little faster to push EBX
then, use alternating registers
Title: Re: 128-bit integer arithmetics
Post by: dedndave on May 25, 2011, 01:04:45 PM
something like this
cmp128  proc    lpNum1:DWORD,lpNum2:DWORD

        push    ebx
        mov     edx,lpNum1
        mov     ebx,lpNum2
        mov     eax,[edx]
        mov     ecx,[edx+4]
        sub     eax,[ebx]
        sbb     ecx,[ebx+4]
        mov     eax,[edx+8]
        mov     ecx,[edx+12]
        sbb     eax,[ebx+8]
        sbb     ecx,[ebx+12]
        pop     ebx
        ret

cmp128  endp
Title: Re: 128-bit integer arithmetics
Post by: sinsi on May 25, 2011, 01:35:24 PM

hutch: 0,-1,1,-1,1
sinsi: 0,-1,1,-1,1
dave:  0,-1,0,1,-2

Title: Re: 128-bit integer arithmetics
Post by: dedndave on May 25, 2011, 03:53:08 PM
that's because mine does not return a value in EAX
it returns with the flags set
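
The "dave" row in the table can be reproduced by modelling what is left in EAX: the third-dword difference including the incoming borrow. The Python sketch below is a reference model under that reading, run on the five test pairs from hutch's program; the helper names are invented here.

```python
def limbs(value):
    """Split a 128-bit integer into four 32-bit dwords, least significant first."""
    return [(value >> (32 * i)) & 0xFFFFFFFF for i in range(4)]


def sub_with_borrow(a, b, borrow):
    """32-bit a - b - borrow; returns (result, borrow_out), like SUB/SBB and CF."""
    diff = a - b - borrow
    return diff & 0xFFFFFFFF, 1 if diff < 0 else 0


def eax_after_sub_sbb_chain(num1, num2):
    a, b = limbs(num1), limbs(num2)
    _, cf = sub_with_borrow(a[0], b[0], 0)      # sub eax,[ebx]
    _, cf = sub_with_borrow(a[1], b[1], cf)     # sbb ecx,[ebx+4]
    eax, cf = sub_with_borrow(a[2], b[2], cf)   # sbb eax,[ebx+8]
    _, cf = sub_with_borrow(a[3], b[3], cf)     # sbb ecx,[ebx+12]
    return eax - (1 << 32) if eax & 0x80000000 else eax   # printed as signed


ALL_ONES = (1 << 128) - 1
LOW64 = (1 << 64) - 1
HIGH64 = LOW64 << 64
pairs = [(0, 0), (ALL_ONES - 1, ALL_ONES), (1, 0), (LOW64, HIGH64), (HIGH64, LOW64)]
print([eax_after_sub_sbb_chain(x, y) for x, y in pairs])   # [0, -1, 0, 1, -2]
```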
Title: Re: 128-bit integer arithmetics
Post by: dedndave on May 26, 2011, 03:13:35 AM
gosh - what happened in here ?

no one noticed that the flags returned by my proc are not correct   :P
you guys are asleep at the wheel - lol

well, the original poster wanted to test for equality/inequality only
that greatly simplifies the requirements
however, it would be nice to have a compare function that sets the overflow, sign, and carry flags, as well
that way it could be used for signed/unsigned greater/less comparisons

EDIT
the sign and carry flags should be correct for my method
the zero flag isn't too hard to set
the overflow flag needs a little work   :bg
i don't think i would mess with parity - it isn't very useful - although, it wouldn't be hard
Title: Re: 128-bit integer arithmetics
Post by: FORTRANS on May 26, 2011, 12:02:30 PM
Quote from: dedndave on May 26, 2011, 03:13:35 AM
i don't think i would mess with parity - it isn't very useful - although, it wouldn't be hard

Hi,

   Just AND the low byte with itself and store the parity flag
somewhere.

Regards,

Steve N.


Checking parity flag settings.

  AX  AX AH AL PF
4FB4  O  O  E  E
B783  O  E  O  O
7706  E  E  E  E
62AD  E  O  O  O
E428  E  E  E  E
7967  E  O  O  O
279A  E  E  E  E
5231  E  O  O  O
A5DC  O  E  O  O
078B  O  O  E  E
B76E  O  E  O  O
17F5  E  E  E  E
C8D0  E  O  O  O
05EF  O  E  O  O
7A82  O  O  E  E
F7F9  O  O  E  E
C104  E  O  O  O
5893  O  O  E  E
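
The columns of the table can be recomputed with a few lines of Python. This is only a checking aid with invented function names; it assumes, as the table suggests, that "E" in the PF column means even parity of the low byte.

```python
def parity(value):
    """'E' for an even number of 1 bits, 'O' for an odd number."""
    return "E" if bin(value).count("1") % 2 == 0 else "O"


def parity_row(ax):
    """Parity of AX, AH and AL, plus the x86 PF, which looks at the low byte only."""
    ah, al = ax >> 8, ax & 0xFF
    return parity(ax), parity(ah), parity(al), parity(al)


for ax in (0x4FB4, 0xB783, 0x7706, 0x62AD):
    print(f"{ax:04X}", *parity_row(ax))
```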
Title: Re: 128-bit integer arithmetics
Post by: dedndave on May 26, 2011, 02:46:48 PM
i didn't think it was worth the clock cycles or code bytes, Steve   :P
that low-byte-only thing is for serial communications, of course
if we actually wanted the parity of bignum values, we could XOR all the bytes together
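
A quick Python sketch of that XOR idea (the function name is hypothetical): folding all bytes together with XOR preserves the overall parity, so the parity of the single folded byte equals the parity of the whole value.

```python
from functools import reduce
from operator import xor


def parity_bignum(data):
    """True when the byte string holds an even number of 1 bits."""
    folded = reduce(xor, data, 0)            # XOR all bytes together
    return bin(folded).count("1") % 2 == 0   # parity of the folded byte


value = bytes(range(1, 17))                  # any 16-byte value will do
print(parity_bignum(value))
```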

anyways, here is my code and a test piece...
Pentium 4 Prescott (2005+), MMX, SSE3

NO NS ZF NC
NO SF NZ CF
NO NS NZ NC
NO NS NZ NC
NO SF NZ CF
NO NS NZ CF
NO SF NZ NC
NO NS NZ CF
OF SF NZ CF
NO SF NZ NC
OF NS NZ NC

58 57 57 58 57


i am sure it could be a few cycles faster without using SSE
and there is probably a way to do it with SSE, too
but, you can now use it for signed or unsigned comparisons

as it turned out, the only flag that requires modification after SUB, SBB, SBB, SBB is the zero flag
you can save the other flags using PUSHFD
POPFD is very slow, however
and - LAHF/SAHF do not load/store the overflow flag (i did not realize that until now - lol)
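
The observation that only the zero flag needs fixing after SUB, SBB, SBB, SBB can be checked with a Python model. This is a sketch with invented names, not the attached code: it runs the dword chain, rebuilds ZF from all four results, and compares against flags computed directly from the 128-bit values.

```python
MASK32 = 0xFFFFFFFF
MASK128 = (1 << 128) - 1


def sbb32(a, b, cf):
    """One SUB/SBB step on dwords; returns the result and the CF, SF, OF it sets."""
    full = a - b - cf
    res = full & MASK32
    flags = {
        "CF": int(full < 0),                  # borrow out of bit 31
        "SF": res >> 31,
        "OF": ((a ^ b) & (a ^ res)) >> 31,    # signed overflow of this step
    }
    return res, flags


def flags_by_dwords(num1, num2):
    """SUB, SBB, SBB, SBB over four dwords, then ZF rebuilt from all results."""
    cf, seen, flags = 0, 0, {}
    for i in range(4):
        a = (num1 >> (32 * i)) & MASK32
        b = (num2 >> (32 * i)) & MASK32
        res, flags = sbb32(a, b, cf)
        cf = flags["CF"]
        seen |= res
    flags["ZF"] = int(seen == 0)              # the one flag that needs a fix-up
    return flags


def flags_reference(num1, num2):
    """The same flags computed directly from the 128-bit values."""
    signed = lambda v: v - (1 << 128) if v >> 127 else v
    res = (num1 - num2) & MASK128
    diff = signed(num1) - signed(num2)
    return {
        "CF": int(num1 < num2),
        "SF": res >> 127,
        "OF": int(not -(1 << 127) <= diff < (1 << 127)),
        "ZF": int(res == 0),
    }


samples = [0, 1, MASK128, 1 << 127, (1 << 127) - 1, (1 << 64) - 1, 1 << 64, 1 << 32]
print(all(flags_by_dwords(a, b) == flags_reference(a, b)
          for a in samples for b in samples))
```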

EDIT - see the next post for download...
Title: Re: 128-bit integer arithmetics
Post by: dedndave on May 26, 2011, 03:56:47 PM
slight improvement....
Pentium 4 Prescott (2005+), MMX, SSE3

NO NS ZF NC
NO SF NZ CF
NO NS NZ NC
NO NS NZ NC
NO SF NZ CF
NO NS NZ CF
NO SF NZ NC
NO NS NZ CF
OF SF NZ CF
NO SF NZ NC
OF NS NZ NC

52 52 52 52 53


updated code below
Title: Re: 128-bit integer arithmetics
Post by: dedndave on May 26, 2011, 10:56:00 PM
my apologies
i left an instruction in there from a previous attempt that destroyed the contents of EBX on the stack
i hope it hasn't caused anyone trouble   :red

Pentium 4 Prescott (2005+), MMX, SSE3

NO NS ZF NC
NO SF NZ CF
NO NS NZ NC
NO NS NZ NC
NO SF NZ CF
NO NS NZ CF
NO SF NZ NC
NO NS NZ CF
OF SF NZ CF
NO SF NZ NC
OF NS NZ NC

53 52 52 52 52


updated code attached...
Title: Re: 128-bit integer arithmetics
Post by: FORTRANS on May 27, 2011, 12:41:05 PM
Hi Dave,

Quote:
i didn't think it was worth the clock cycles or code bytes, Steve

   Actually wrote it a few days ago to address a concern from
a while back.  "Educational."  Hm,  may add a tweak or two to it.

Quote:
that low-byte-only thing is for serial communications, of course

   Yeah, a left over from the 8085 I guess.  Makes you (me)
wonder how often it was/is used.

Quote:
updated code attached...

   Nice.  An incentive to update my fixed point routines.

Thanks,

Steve N.


   This behaved a teensie bit odd with the cursor.  Win98
full screen.  SSE0? <g>

P1 (1993+), MMX, SSE0

NO NS ZF NC
NO SF NZ CF
NO NS NZ NC
NO NS NZ NC
NO SF NZ CF
NO NS NZ CF
NO SF NZ NC
NO NS NZ CF
OF SF NZ CF
NO SF NZ NC
OF NS NZ NC

34 34 34 34 34

Press any key to continue ...

P3 (2000+), MMX, SSE1

NO NS ZF NC
NO SF NZ CF
NO NS NZ NC
NO NS NZ NC
NO SF NZ CF
NO NS NZ CF
NO SF NZ NC
NO NS NZ CF
OF SF NZ CF
NO SF NZ NC
OF NS NZ NC

32 32 32 32 32

Press any key to continue ...
Title: Re: 128-bit integer arithmetics
Post by: dedndave on May 27, 2011, 12:51:59 PM
yah - that's an old version of Jochen's ShowCPU   :P
he has newer versions around - i just grabbed what was handy - lol
it looks like you have a Pentium MMX
i have one of those around - it is a 200 MHz CPU, but it seems to run well at 225   :bg
back before the year 2000, that was my main machine with win 95 or win 98 on it

i forgot to explain how the routine might be used in normal operation
it is like CMP so, for example, if you wanted to use JLE...
        INVOKE  cmp128,offset FirstValue,offset SecondValue
        jle     SomeLabel
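
Because the routine returns flags the way CMP does, any conditional jump can follow it. The Python sketch below models two of them on 128-bit operands using the standard x86 conditions (JLE: ZF = 1 or SF <> OF; JBE: CF = 1 or ZF = 1). The function names are made up for illustration.

```python
MASK128 = (1 << 128) - 1


def to_signed(v):
    """Interpret a 128-bit pattern as a two's complement signed integer."""
    return v - (1 << 128) if v >> 127 else v


def flags_cmp128(a, b):
    """Flags a CMP-style 128-bit subtraction a - b would produce."""
    res = (a - b) & MASK128
    diff = to_signed(a) - to_signed(b)
    return {
        "CF": int(a < b),
        "ZF": int(res == 0),
        "SF": res >> 127,
        "OF": int(not -(1 << 127) <= diff < (1 << 127)),
    }


def jle(f):
    """Signed less-or-equal: taken when ZF = 1 or SF differs from OF."""
    return bool(f["ZF"] or f["SF"] != f["OF"])


def jbe(f):
    """Unsigned below-or-equal: taken when CF = 1 or ZF = 1."""
    return bool(f["CF"] or f["ZF"])


first, second = MASK128, 1            # -1 versus 1 when read as signed
f = flags_cmp128(first, second)
print(jle(f), jbe(f))                 # signed: -1 <= 1; unsigned: all ones is not <= 1
```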


i managed to squeeze a couple more clock cycles out of it by changing the order of the zero-test instructions
got rid of a little dependency
        push    ebx
        mov     edx,[esp+8]            ;lpNum1
        mov     ebx,[esp+12]           ;lpNum2
        mov     eax,[edx]
        mov     ecx,[edx+4]
        sub     eax,[ebx]
        sbb     ecx,[ebx+4]
        push    eax
        push    ecx
        mov     eax,[edx+8]
        mov     ecx,[edx+12]
        sbb     eax,[ebx+8]
        sbb     ecx,[ebx+12]
        pop     edx
        pop     ebx
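        pushfd                         ;save CF, SF and OF from the last SBB
        or      edx,ecx        ;little change
        or      edx,eax        ;little change
        pop     ecx                    ;ECX = saved flags
        or      edx,ebx        ;little change
        lahf                           ;AH = flags after the OR (ZF set if all four dwords are zero)
        pop     ebx                    ;restore EBX
        and     ah,40h                 ;keep only the new ZF
        and     cx,8BFh                ;keep OF and the low flag byte, minus the old ZF
        or      ah,cl                  ;AH = saved SF/AF/PF/CF plus the new ZF
        add     ch,78h                 ;08h+78h overflows, 00h+78h does not: re-creates OF
        sahf                           ;load SF, ZF, AF, PF, CF; OF stays as set by the ADD
        ret     8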
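
The flag juggling at the end of the routine can be modelled in Python to see what ends up in the flags register. This is a sketch under the usual EFLAGS bit layout (CF bit 0, PF bit 2, AF bit 4, ZF bit 6, SF bit 7, OF bit 11); merge_flags and its parameters are names invented for this example.

```python
CF, PF, AF, ZF, SF, OF = 0x001, 0x004, 0x010, 0x040, 0x080, 0x800
KEPT = OF | SF | AF | PF | CF        # flags carried over from the last SBB


def merge_flags(saved_eflags, all_zero):
    """saved_eflags: value stored by PUSHFD right after the last SBB.
    all_zero: True when OR-ing the four result dwords gave zero."""
    ah = ZF if all_zero else 0       # lahf, then and ah,40h
    cx = saved_eflags & 0x08BF       # and cx,8BFh
    cl, ch = cx & 0xFF, cx >> 8      # ch is 08h if OF was set, else 00h
    ah |= cl                         # or ah,cl
    of = OF if ch + 0x78 > 0x7F else 0   # add ch,78h: signed overflow only for 08h+78h
    return of | (ah & 0xD5)          # sahf loads SF, ZF, AF, PF, CF from AH


for saved in (0x202, 0xA93, 0x8D7, 0x246):
    print(hex(saved), hex(merge_flags(saved, True)), hex(merge_flags(saved, False)))
```

In this model the result is always the saved OF, SF, AF, PF and CF combined with the freshly computed ZF, which is the behaviour described for the routine.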