SSE

Started by bomz, December 08, 2011, 07:13:49 AM

bomz

#15
Quote
.686
.xmm

.model flat, stdcall
option casemap :none

include \MASM32\INCLUDE\windows.inc
include \MASM32\INCLUDE\user32.inc
include \MASM32\INCLUDE\kernel32.inc
includelib \MASM32\LIB\user32.lib
includelib \MASM32\LIB\kernel32.lib

.data
align 16
var1   db "1111222233334444"
mess   db "MOVAPS         ",9
var5   db "0000000000000000"
   db 13,10,"SHUFPS 0D8h  ",9
var6   db "0000000000000000"
   db 13,10,"SHUFPS 01Eh  ",9
var7   db "0000000000000000"
   db 13,10,"MOVUPS       ",9
var8   db "0000000000000000"
   db 13,10,"UNPCKHPS     ",9
var9   db "0000000000000000"
   db 13,10,"UNPCKLPS     ",9
var10   db "0000000000000000"
   db 13,10,"MOVDQA       ",9
var11   db "0000000000000000"
   db 13,10,"PINSRW       ",9
var12   db "0000000000000000"
   db 13,10,"PEXTRW       ",9
var13   db "0000000000000000",0

.code
start:
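lea esi, var1                       ; esi -> source: dwords "1111","2222","3333","4444"
lea edi, var5                       ; edi -> first output slot; slots are 32 bytes apart
MOVAPS      xmm1, XMMWORD PTR[esi]  ; aligned 16-byte load
MOVAPS      [edi],xmm1              ; "1111222233334444"
SHUFPS      XMM1, XMM1, 0D8h        ; 0D8h = 11 01 10 00b: result dwords (low to high) = 0,2,1,3
MOVAPS      [edi+32],xmm1           ; "1111333322224444"
MOVAPS      xmm1, XMMWORD PTR[esi]
SHUFPS      XMM1, XMM1, 01Eh        ; 01Eh = 00 01 11 10b: result dwords (low to high) = 2,3,1,0
MOVAPS      [edi+64],xmm1           ; "3333444422221111"
MOVUPS      xmm1, [esi]             ; unaligned load; same result here because var1 is aligned
MOVAPS      [edi+96],xmm1           ; "1111222233334444"
MOVAPS      xmm1, XMMWORD PTR[esi]
UNPCKHPS    XMM1, XMM1              ; interleaves the two high dwords with themselves
MOVAPS      [edi+128],xmm1          ; "3333333344444444"
MOVAPS      xmm1, XMMWORD PTR[esi]
UNPCKLPS    XMM1, XMM1              ; interleaves the two low dwords with themselves
MOVAPS      [edi+160],xmm1          ; "1111111122222222"
MOVDQA      XMM0, [esi]             ; SSE2 aligned integer load
MOVDQA      [edi+192], XMM0         ; "1111222233334444"
MOV         eax, '**'
PINSRW      XMM0, eax, 4            ; replace word 4 (bytes 8-9) with "**"
MOVDQA      [edi+224], XMM0         ; "11112222**334444"
MOVDQA      XMM0, [esi]
PEXTRW      eax, XMM0, 7            ; eax = word 7 (bytes 14-15) = "44"
PINSRW      XMM0, eax, 0            ; copy it into word 0 (bytes 0-1)
MOVDQA      [edi+256], XMM0         ; "4411222233334444"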

invoke MessageBox,0,ADDR mess,0,MB_ICONASTERISK
invoke ExitProcess,0
end start

bomz

#16
Timings: mine 360, yours 141

QSort Mu$()

StringsDiffer(esi, Mu$(ecx))

What is this?


As I understand it, you first sort the list in order and then only need to compare neighboring strings.
When I wrote mine, I thought sorting would need more steps than one full pass over the list.

jj2007

Quote from: bomz on December 09, 2011, 12:50:27 AM
Timings: mine 360, yours 141

QSort Mu$()  ; QuickSort of strings

StringsDiffer(esi, Mu$(ecx))  ; what the name says

What is this?

> As I understand it, you first sort the list in order and then only need to compare neighboring strings. ; YES
> When I wrote mine, I thought sorting would need more steps than one full pass over the list.
the logic is interesting, but it might take longer

> warning LNK4078
Thanks, will look into it. It's a harmless warning, though.
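
The sort-then-compare-neighbors idea confirmed above can be sketched in plain MASM32. This is only a minimal illustration, not how QSort or StringsDiffer work internally (their implementation is not shown in this thread). It assumes the pointer table is already sorted; the names txt0..txt4, sorted, SORTED_COUNT and dupes are made up for the example, and lstrcmp from kernel32 stands in for whatever comparison routine you prefer.

```masm
.686
.model flat, stdcall
option casemap :none

include \MASM32\INCLUDE\windows.inc
include \MASM32\INCLUDE\kernel32.inc
includelib \MASM32\LIB\kernel32.lib

.data
txt0   db "alpha",0
txt1   db "alpha",0
txt2   db "beta",0
txt3   db "gamma",0
txt4   db "gamma",0
sorted dd offset txt0, offset txt1, offset txt2, offset txt3, offset txt4
SORTED_COUNT equ ($ - sorted) / 4   ; number of pointers in the table (hypothetical name)

.data?
dupes  dd ?

.code
start:
    xor ebx, ebx                ; ebx = duplicates found so far
    mov esi, 1                  ; index of current string, starting at the second entry
@@:
    ; after sorting, equal strings are adjacent, so one comparison per entry is enough
    invoke lstrcmp, sorted[esi*4-4], sorted[esi*4]
    .if eax == 0                ; 0 means the two neighbors are identical
        inc ebx
    .endif
    inc esi
    cmp esi, SORTED_COUNT
    jb @B
    mov dupes, ebx              ; two duplicate pairs in this sample table
    invoke ExitProcess, ebx     ; duplicate count doubles as the exit code
end start
```

The scan itself is a single pass; the sort before it is where most of the work goes, which is the trade-off discussed above.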

bomz

It's strange, because while sorting you already compare the strings, so you could put them into the duplicate and unique lists right then.

This needs some thought and a fresh head. Yesterday I was thinking about the reason for such a difference. At first I thought yours was faster because it builds the list while reading it from the cache, but then I tried my own list. Mine doesn't need a "clean" list; it can find URLs in garbage. Then I rebuilt mine as a console program. Maybe ...... it is still not clear to me why this algorithm needs half as many steps.

When I first wrote it, I just needed to solve my problem, so it didn't matter whether it took 10 minutes or 20. Then I found a program that does this in about 1 minute. I assumed that program must use the standard textbook algorithm, so I tried another way.

Sadly, it's only in Russian. It lets you work with lists.
http://zalil.ru/32233341

bomz


qWord


bomz

#21
Quote
.686
.MMX
.XMM

.model flat, stdcall
option casemap :none

include \MASM32\INCLUDE\windows.inc
include \MASM32\INCLUDE\user32.inc
include \MASM32\INCLUDE\kernel32.inc
includelib \MASM32\LIB\user32.lib
includelib \MASM32\LIB\kernel32.lib

.data
align 16
var1   db "1234567812345678"
var2   db "0000000000000000",0

.code
start:
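lea esi, var1
lea edi, var2
movdqa  xmm0,[esi]      ; aligned 16-byte load
movntps [edi],xmm0      ; non-temporal store (hints the CPU to bypass the cache); needs a 16-byte aligned address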
invoke MessageBox,0,ADDR var2,0,MB_ICONASTERISK
invoke ExitProcess,0
end start
http://www.mark.masmcode.com/

Under DOS I don't see any difference between SSE and the 386 rep movsd.
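
For comparison, here is a minimal sketch of the rep movsd side of that test, written as a 32-bit flat-model MASM32 program rather than the DOS build used for the timings. The buffer names srcBuf and dstBuf are invented for the example, and the 16-byte size only mirrors the earlier demo.

```masm
.386
.model flat, stdcall
option casemap :none

include \MASM32\INCLUDE\windows.inc
include \MASM32\INCLUDE\user32.inc
include \MASM32\INCLUDE\kernel32.inc
includelib \MASM32\LIB\user32.lib
includelib \MASM32\LIB\kernel32.lib

.data
srcBuf db "1234567812345678"
dstBuf db "0000000000000000",0

.code
start:
    cld                     ; copy upwards (direction flag clear)
    lea esi, srcBuf         ; source pointer
    lea edi, dstBuf         ; destination pointer
    mov ecx, 16/4           ; count in dwords, not bytes
    rep movsd               ; copy ecx dwords from [esi] to [edi]
    invoke MessageBox,0,ADDR dstBuf,0,MB_ICONASTERISK
    invoke ExitProcess,0
end start
```

A block this small says nothing about speed; any timing comparison needs buffers large enough to dominate the loop overhead.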

Quote
@@:
   movaps  xmm0, ds:[esi]
   movaps  xmm1, ds:[esi+16]
   movaps  xmm2, ds:[esi+32]
   movaps  xmm3, ds:[esi+48]
   movaps  xmm4, ds:[esi+64]
   movaps  xmm5, ds:[esi+80]
   movaps  xmm6, ds:[esi+96]
   movaps  xmm7, ds:[esi+112]
   movaps  es:[edi],xmm0
   movaps  es:[edi+16],xmm1
   movaps  es:[edi+32],xmm2
   movaps  es:[edi+48],xmm3
   movaps  es:[edi+64],xmm4
   movaps  es:[edi+80],xmm5
   movaps  es:[edi+96],xmm6
   movaps  es:[edi+112],xmm7
   add si, 128
   add di, 128
   sub cx, 1
   jnz @B

bomz

How do I convert a REAL4 to a string?
Quote
;and ebx, 111111111111111111111111b
;and eax, 00111111000000000000000000000000b
;rol eax, 8


PINSRB and PINSRQ don't work with SSE2. Is it possible to add 64-bit integers, or only REAL4?
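
On the 64-bit question: PINSRB and PINSRQ belong to SSE4.1 (PINSRQ additionally needs 64-bit mode), while PADDQ on XMM registers is part of SSE2, so two 64-bit integers can be added per instruction. A minimal sketch follows, assuming an assembler that knows SSE2 (ml.exe 6.15 or later); valA, valB and valSum are names invented for the example.

```masm
.686
.xmm
.model flat, stdcall
option casemap :none

include \MASM32\INCLUDE\windows.inc
include \MASM32\INCLUDE\kernel32.inc
includelib \MASM32\LIB\kernel32.lib

.data
align 16
valA   dq 0FFFFFFFFh, 1      ; two packed 64-bit integers
valB   dq 1, 2
valSum dq 0, 0

.code
start:
    movdqa xmm0, XMMWORD PTR valA
    paddq  xmm0, XMMWORD PTR valB     ; two independent 64-bit additions
    movdqa XMMWORD PTR valSum, xmm0   ; valSum = 100000000h, 3
    invoke ExitProcess,0
end start
```

The low element is chosen so the carry crosses the 32-bit boundary, which shows that the addition really is 64 bits wide rather than two 32-bit adds.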

jj2007

Quote from: bomz on December 11, 2011, 09:51:55 AM
How do I convert a REAL4 to a string?
Beware of the lousy precision...

Quote
include \masm32\MasmBasic\MasmBasic.inc
.data
MyReal4   REAL4 3.14159265358979324
   Init
   DefNum 9
   fldpi
   fstp MyReal4
   Let esi=Str$(MyReal4)
   DefNum 19
   Inkey "PI=", Tb$, esi, CrLf$, "Exact=", Tb$, Str$(PI)
   Exit
end start

PI=     3.14159274
Exact=  3.141592653589793238

bomz

This is Basic?


jj2007

Quote from: bomz on December 11, 2011, 10:23:50 AM
This is Basic?

No, it's Assembler. To be precise: it assembles with ml.exe versions 6.15 ... 10.0 or JWasm :bg

Hint: if you prefer C, go for crt_sprintf.
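
The crt_sprintf route can be sketched roughly as follows, using the msvcrt bindings shipped with the MASM32 SDK. The names fmtStr, tmp8 and buffer are invented for the example. The REAL4 is first widened to a REAL8, because printf-style %f expects a double.

```masm
.686
.model flat, stdcall
option casemap :none

include \MASM32\INCLUDE\windows.inc
include \MASM32\INCLUDE\user32.inc
include \MASM32\INCLUDE\kernel32.inc
include \MASM32\INCLUDE\msvcrt.inc
includelib \MASM32\LIB\user32.lib
includelib \MASM32\LIB\kernel32.lib
includelib \MASM32\LIB\msvcrt.lib

.data
MyReal4 REAL4 3.1415927
fmtStr  db "%.8f",0

.data?
tmp8    REAL8 ?
buffer  db 64 dup(?)

.code
start:
    fld  MyReal4            ; load the REAL4 onto the FPU stack
    fstp tmp8               ; store it back as REAL8 (double)
    invoke crt_sprintf, ADDR buffer, ADDR fmtStr, tmp8
    invoke MessageBox,0,ADDR buffer,0,MB_ICONASTERISK
    invoke ExitProcess,0
end start
```

With eight decimals requested, the message box should show the same limited REAL4 precision as the Str$ output above: only about seven significant digits are meaningful.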

bomz

I prefer MASM

Does it convert through the FPU?

jj2007

Quote from: bomz on December 11, 2011, 10:28:35 AM
Does it convert through the FPU?

Yes, Str$() uses the FPU and has a REAL10 internal precision. You can set output precision either with DefNum n (n=1...19) or with a sprintf type format string:

   PrintLine "Precision:", Str$("\n%3f", PI), Str$("\n%7f", PI), Str$("\n%Cf", PI), Str$("\n%Gf", PI), Str$("\n%Jf", PI)

Precision:
3.14
3.141593
3.14159265359
3.141592653589793
3.141592653589793238

bomz

I'm trying to do this with MASM; I've read about REAL to BCD conversion.

bomz

Quote
.386
.MMX
.XMM


.model flat, stdcall
option casemap :none

   include \MASM32\INCLUDE\windows.inc
   include \MASM32\INCLUDE\masm32.inc
   include \MASM32\INCLUDE\gdi32.inc
   include \MASM32\INCLUDE\user32.inc
   include \MASM32\INCLUDE\kernel32.inc
   include \MASM32\INCLUDE\fpu.inc
   includelib \MASM32\LIB\masm32.lib
   includelib \MASM32\LIB\gdi32.lib
   includelib \MASM32\LIB\user32.lib
   includelib \MASM32\LIB\kernel32.lib
   includelib \MASM32\LIB\fpu.lib

.data
   mestitle      db "Bomz",0
   VAR1         REAL4 11111.0
   VAR2         dt ?
.data?
   buffer         db 512 dup(?)
.code
start:
   finit
   fld VAR1
   ;fstp VAR2

   invoke FpuFLtoA, 0, 10, ADDR buffer, SRC1_FPU or SRC2_DIMM
   invoke MessageBox,0, ADDR buffer,ADDR mestitle,MB_ICONASTERISK
   invoke ExitProcess,0
end start