SSE

Started by bomz, December 08, 2011, 07:13:49 AM

bomz

Can somebody give really working code for the 128-bit registers? Or better, all the instructions with examples.
http://www.mark.masmcode.com/
Quote
        mov     ecx,16384           ;write 16384 16-byte values, 16384*16 = 256KB.
                                    ; So we are copying a 256KB array
        mov     esi,offset src_arr  ;pointer to the source array which has to be
                                    ; 16-byte aligned or you will get an exception.
        mov     edi,offset dst_arr  ;pointer to the destination array which has to be
                                    ; 16-byte aligned or you will get an exception.
looper:
        movdqa  xmm0,[esi]          ;SSE2: works on P4 and up (movaps is the P3 equivalent)
        movntps [edi],xmm0          ;Works on P3 and up
        add     esi,16
        add     edi,16
        dec     ecx
        jnz     looper

Quote
.686
.xmm

.model flat, stdcall
option casemap :none

include \MASM32\INCLUDE\windows.inc
include \MASM32\INCLUDE\user32.inc
include \MASM32\INCLUDE\kernel32.inc
includelib \MASM32\LIB\user32.lib
includelib \MASM32\LIB\kernel32.lib

.data
align 16
var1   db "1234567812345678",0
var2   db "0000000000000000",0

.code
start:
lea esi, var1
lea edi, var2
;movd  xmm(0),[esi]
;movd [edi],xmm(0)
movq  xmm(0),[esi]
movq [edi],xmm(0)
invoke MessageBox,0,ADDR var2,0,MB_ICONASTERISK
invoke ExitProcess,0
end start

jj2007

Search the forum for SSE2 - 14 pages of results. Many have attachments.
Or try a search for pcmpeqb - 2 pages, most of them on timing fast SSE2 algos (and many found their way into MasmBasic).
If you have a more specific need, be more specific and somebody will help.
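
For readers who have not met pcmpeqb yet, here is a minimal sketch of what it is typically used for: comparing 16 bytes at a time. It is written in C with SSE2 intrinsics rather than MASM so it can be compiled and checked easily; each intrinsic maps to the instruction named in its comment. The helper name equal_bytes_sse2 is hypothetical, and an x86 compiler with SSE2 support is assumed. Unaligned loads are used, so the buffers do not need to be 16-byte aligned.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>

/* Hypothetical helper: returns 1 if the first len bytes of a and b
   are equal, 0 otherwise. Compares 16 bytes per step. */
static int equal_bytes_sse2(const unsigned char *a,
                            const unsigned char *b, size_t len)
{
    size_t i = 0;

    for (; i + 16 <= len; i += 16) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i)); /* movdqu */
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i)); /* movdqu */
        __m128i eq = _mm_cmpeq_epi8(va, vb);                    /* pcmpeqb */

        /* pmovmskb: one bit per byte; 0xFFFF means all 16 bytes matched */
        if (_mm_movemask_epi8(eq) != 0xFFFF)
            return 0;
    }

    /* Fewer than 16 bytes left: finish with a plain byte loop */
    for (; i < len; i++)
        if (a[i] != b[i])
            return 0;

    return 1;
}
```

The same pattern (load, pcmpeqb, pmovmskb, test the mask) is what most of the fast string algos on the forum are built around.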

bomz

Does somebody have an example (any code plus a make batch file) for MASM 6.15?

An application with SSE instructions that is compatible with both AMD and Intel processors?

hutch--

bomz,

Do yourself a favour, get ML 9, 10 or 11, you can have real PHUN with SSE4.2  :P

bomz

Quote
Pentium 4 2.26    SL6RY    C1    2.26 GHz    512 KB    533 MT/s    17×    1.53 V    58 W    Socket 478       RK80532PE051512

Northwood (130 nm)

    * Intel Family 15 Model 2
    * All models support: MMX, SSE, SSE2

I need to delete duplicate URLs from a list of 20,000 HTTP addresses. I wrote a little application which at first took 10 minutes to do this. Then I optimized it, and optimized again; now it takes 0.375 seconds. This is a good task for SSE training.


bomz

Quote
.686
.xmm

.model flat, stdcall
option casemap :none

include \MASM32\INCLUDE\windows.inc
include \MASM32\INCLUDE\user32.inc
include \MASM32\INCLUDE\kernel32.inc
includelib \MASM32\LIB\user32.lib
includelib \MASM32\LIB\kernel32.lib

.data
align 16
var1   db "1234567812345678",0
var2   db "0000000000000000",0
buffer db 512 dup (0)

.code
start:
lea esi, var1
lea edi, var2

movups  xmm1, [esi]; XMMWORD PTR[esi]
movups  [edi],xmm1

invoke MessageBox,0,ADDR var2,0,MB_ICONASTERISK
invoke ExitProcess,0
end start


http://www.microsoft.com/downloads/en/details.aspx?familyid=7A1C9DA0-0510-44A2-B042-7EF370530C64&displaylang=en

Quote from: dedndave on April 16, 2011, 01:10:18 PM
it is far easier to just use 7-zip to extract the file from the masm8 setup
1) right-click on the setup program, 7-zip, Extract files, OK
2) inside the resulting folder, another file - repeat the same thing
3) inside that, there are 2 files, an MSI, and a CAB
4) again, use 7-zip to extract files from the CAB file
5) inside the CAB is a file named FL_ml_exe_____X86.3643236F_FC70_11D3_A536_0090278A1BB8
6) rename it to ML.exe
7) replace ML.EXE in C:\masm32\bin with it

bomz, don't post files that are not yours. It is a licence violation to post a Microsoft owned binary. Just use the normal Microsoft link.

bomz

http://neilkemp.us/src/sse_tutorial/sse_tutorial.html
Intel SSE Tutorial : An Introduction to the SSE Instruction Set

jj2007

Quote from: bomz on December 08, 2011, 09:44:18 AM
I need to delete duplicate URLs from a list of 20,000 HTTP addresses. I wrote a little application which at first took 10 minutes to do this. Then I optimized it, and optimized again; now it takes 0.375 seconds. This is a good task for SSE training.

Zip the list and post it here. We can do it in less than 0.1 seconds.

bomz

They are password URLs for file access, each 92 characters long. I doubt it can be done more quickly. Make your own random list.

bomz

It's hard for me to formulate it in English.

With a tricky algorithm I assign to each URL a logical summary of all its characters, kept in a huge matrix (32 MB for a maximum of 500,000 strings), so to compare two strings it is enough to compare their 32-bit "hashes" (the lengths are equal). If the hashes are equal, a character-by-character compare is needed; if not, no compare is needed at all. So in a list of 20,000 only 5-10 may have the same "hash", or none.
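
The hash-first idea described above could be sketched roughly as follows. This is not the actual algorithm from the post, which is not shown: FNV-1a is swapped in as a stand-in 32-bit hash, the function names are hypothetical, and the lookup is a simple quadratic scan instead of the 32 MB matrix. It only illustrates the point that the full string compare runs only when two hashes collide.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in hash (assumption): 32-bit FNV-1a over a zero-terminated string. */
static uint32_t fnv1a32(const char *s)
{
    uint32_t h = 2166136261u;            /* FNV offset basis */
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 16777619u;                  /* FNV prime */
    }
    return h;
}

/* Hypothetical helper: removes duplicate strings in place, keeping the
   first occurrence of each in its original order. Returns the new
   count, or (size_t)-1 if memory could not be allocated. */
static size_t dedupe_by_hash(const char **urls, size_t n)
{
    uint32_t *hashes;
    size_t kept = 0, i, j;

    if (n == 0)
        return 0;
    hashes = malloc(n * sizeof *hashes);
    if (hashes == NULL)
        return (size_t)-1;

    for (i = 0; i < n; i++) {
        uint32_t h = fnv1a32(urls[i]);
        int dup = 0;

        for (j = 0; j < kept; j++) {
            /* Cheap 32-bit compare first; strcmp only on a hash match */
            if (hashes[j] == h && strcmp(urls[j], urls[i]) == 0) {
                dup = 1;
                break;
            }
        }
        if (!dup) {
            hashes[kept] = h;
            urls[kept] = urls[i];
            kept++;
        }
    }

    free(hashes);
    return kept;
}
```

Indexing a table by the hash, as the matrix in the post presumably does, would remove the inner scan; the sketch keeps the scan for brevity.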

jj2007

Quote from: bomz on December 08, 2011, 11:51:08 AM
They are password URLs for file access, each 92 characters long. I doubt it can be done more quickly. Make your own random list.

Creates a file with 20,000 random URLs, of which roughly half are unique. Writes to a second file only the unique URLs.
30 lines, 32...47 ms on my slow old Celeron, reading the old and writing the new file included.

include \masm32\MasmBasic\MasmBasic.inc   ; download
  Init
  mov ecx, 19999      ; we need a random file with 20,000 URLs
  Dim My$(ecx)
  .Repeat
     Let My$(ecx)="http://go"+Str$(Rand(10000))+"site.htm"
     dec ecx
  .Until Sign?
  Store "MyURLs.txt", My$()
  push Timer            ; ------- timing includes reading and writing of files ------
  Recall "MyURLs.txt", Mu$()    ; file contains multiple URLs, about 50% are unique
  xchg eax, ecx         ; save # of lines
  QSort Mu$()
  Dim URL$(ecx)
  xor edi, edi
  dec ecx
  .Repeat
     mov esi, Mu$(ecx)
     .Repeat
          dec ecx
     .Until Sign? || StringsDiffer(esi, Mu$(ecx))
     Let URL$(edi)=esi
     inc edi
  .Until signed ecx<=0
  Store "MyUniqueURLs.txt", URL$(), edi
  void Timer
  pop edx
  Inkey Str$("The action took %i ms", eax-edx)
  Exit
end start

EDIT: .Until signed ecx<=0 ; signed is a simple equate: sdword ptr - without "signed", the code would continue if ecx was below zero, and trouble was ahead. Not for n=20000, but e.g. for 50000 strings.
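
For anyone who does not use MasmBasic, the same sort-then-skip-duplicates approach can be sketched in plain C. This is only an illustration of the technique, not a translation of the code above: the function names are hypothetical, the file reading and writing is left out, and it walks the sorted array upwards instead of downwards, which avoids the signed counter issue mentioned in the EDIT.

```c
#include <stdlib.h>
#include <string.h>

/* qsort callback: compares two elements of an array of C strings */
static int cmp_str(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Hypothetical helper: sorts the array, then compacts it so that each
   distinct string appears once. Returns the number of unique strings,
   which end up in lines[0..result-1] in sorted order. */
static size_t sort_unique(const char **lines, size_t n)
{
    size_t out = 0, i;

    if (n == 0)
        return 0;

    qsort(lines, n, sizeof *lines, cmp_str);

    /* After sorting, duplicates are adjacent: keep a string only when
       it differs from the last one kept */
    for (i = 1; i < n; i++)
        if (strcmp(lines[out], lines[i]) != 0)
            lines[++out] = lines[i];

    return out + 1;
}
```

Sorting costs O(n log n) compares and the duplicate pass is a single linear scan, which fits with 20,000 short strings finishing in a few tens of milliseconds.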

@dancho: Thanxalot :bg

dancho

little off topic here
@jj2007
didn't notice this before but your MasmBasic is a really top-notch product,
really nice and clean code,
gratz on that...

dedndave

yah - he has spent a lot of time on it
it is pretty fast, too   :U

bomz

Quote
MasmBasic.lib(libtmpAB.obj) : warning LNK4078: multiple ".drectve" sections found with different attributes (00000240)