I can find almost anything if I look long enough. This one is for JJ and anyone else who is interested to play with. Somewhere I also have a working Microsoft BM algo.
[attachment deleted by admin]
Thanxalot, Sir! The ToLower macro is not very fast, I am afraid, even when I use the "long" versions of InstrCI.
----------------------
9473 average crt_strstr res=4233787
7696 average InstrCi 1L res=4233787
7681 average InstrCi 2L res=4233787
7585 * average InstrCi
-236 diff finstr-InstrCi (neg = finstr is faster)
15434 avg BM Collake res=4233787
EDIT: Forgot to add a search for a loooong string. See new attachment.
----------------------
9454 average crt_strstr res=4241979
7378 average finstr res=2619
5547 average InStringL res=4241979
7480 average InstrCi 1s res=4241979
7466 average InstrCi 2s res=4241979
7710 average InstrCi 1L res=4241979
7821 average InstrCi 2L res=4241979
7619 * average InstrCi
-241 diff finstr-InstrCi (neg = finstr is faster)
451 diff BM s - InstrCi (neg = BM is faster)
8070 avg Boyer-Moore s res=2619
7240 avg Boyer-Moore L res=2616
11971 avg Hll istring res=2620
10681 avg BM Collake res=4241979
14776 avg BMiS Collake res=4241979
8696 avg BMiL Collake res=4241979
1943849 average StrStrI res=4241979
[attachment deleted by admin]
I was having a look at suffix trees for substring searches.. haven't gotten around to implementing anything, but they're worth reading up on.. also suffix arrays .. should be a lot faster than BM for large strings (not sure of the threshold).. but scanning say a megabytes worth of text ...
I know it's bad manners to update an old thread, but maybe it's worth to show what SSE2 can do on a P4...
12651 average crt_strstr res=4237883
10784 average finstr res=2619
2938 average InstrJJ ##### res=2620 <---------------- SSE2 -------------
16378 average InStringL res=2620 <--- the Masm32 library version
10556 average InstrCi 1s res=4237883 <---- case insensitive search
11364 average InstrCi 2s res=4237883
11449 average InstrCi 1L res=4237883
11595 average InstrCi 2L res=4237883
11240 * average InstrCi
... and a Celeron M:
8866 average crt_strstr res=4237883
9076 average finstr res=2619
1822 average InstrJJ ##### res=2620 <---------------- SSE2 -------------
9841 average InStringL res=2620 <--- the Masm32 library version
8525 average InstrCi 1s res=4237883
8503 average InstrCi 2s res=4237883
8592 average InstrCi 1L res=4237883
8681 average InstrCi 2L res=4237883
8575 * average InstrCi
1822/2620 = 0.7 cycles per byte of string length :bg
[attachment deleted by admin]
Now make a quad-core version which does one byte in 0.7/4 ~= 0.175 clocks.
Quote from: Mark Jones on April 28, 2009, 03:32:34 PM
Now make a quad-core version which does one byte in 0.7/4 ~= 0.175 clocks.
Let's be modest: a factor four on crt_strstr and the masm32 version should be enough :wink
Sorry but a newbie question here :
Microsoft (R) Macro Assembler Version 6.14.8444
Copyright (C) Microsoft Corp 1981-1997. All rights reserved.
Assembling: C:\masm32\jj string search\jj.asm
C:\masm32\jj string search\jj.asm(682) : error A2008: syntax error : pshufd
C:\masm32\jj string search\jj.asm(703) : error A2008: syntax error : xmm1
C:\masm32\jj string search\jj.asm(704) : error A2008: syntax error : xmm4
C:\masm32\jj string search\jj.asm(707) : error A2008: syntax error : movdqa
C:\masm32\jj string search\jj.asm(711) : error A2008: syntax error : movdqa
C:\masm32\jj string search\jj.asm(712) : error A2008: syntax error : xmm4
I get these kind of errors when I try to assemble SSE code. Is it because I need a newer ML ? I use the one in MASM32.
mitchi - You need to use at least masm 6.15 or jwasm to assemble.
I have MASM 6.14!! I'm only a version away :eek
Why is that? I believe that my MASM32 installation is recent!
I own Visual Studio and it comes with ML version 9, should I use that one instead?
Quote from: mitchi on April 29, 2009, 05:30:33 AM
I own Visual Studio and it comes with ML version 9, should I use that one instead?
Yep, version 9 is fine, but you have to find a number of files and copy them into the \Masm32\bin folder....
Apparently, to run ml version 9.0, copying these files seems to be essential and sufficient:
\masm32\bin\link.exe
\masm32\bin\ml.exe
\masm32\bin\mspdb80.dll
\masm32\bin\msvcr90.dll
\masm32\bin\Microsoft.VC90.CRT.manifest <-that one caused me some headaches, it's absolutely needed
ml.exe 6.15 on my computer.
[attachment deleted by admin]
Quote from: UtillMasm on April 29, 2009, 08:26:01 AM
ml.exe 6.15 on my computer.
Thanks, UtillMasm. Be aware of copyright rules when posting such stuff, though (it's included in the GeneSys package, too).
Mitchi, could you please give me feedback whether it works with the files listed above? Thanks.
I think with ML 9.0 if you have it already installed and working on your machine that just copying it to the masm32 directory will do the job as the DLLs needed are already there.
Dependency Viewer shows that the code base of ML 9.0 is considerably bigger.
ML 9.0 has the Visual 2008 C-Runtime DLL dependency (winsx) + ADVAPI32.DLL (For Hash functions) and it still stands at 349 KB.
ML 6.14 has only the KERNEL32.DLL dependency (The C Runtime is statically linked) and it's only 364 KB.
Just so you know :
The hello world application is 52 KB with /MT (static link) (VS2008)
The hello world is 8 KB with /MD (VS2008)
Gives you an idea..
You are getting the size increase by using the later linker, just use ML 9 by itself and use the linker from masm32 or Pelle's POLINK and you will get the same size small file. I have ML 9 installed on my dev box and it works fine with no change in file size.
No No, I wasn't talking about that. I can use ML 9 and the MASM32 linker just fine. I was just talking about C hello horld (Compiled and linked with VS2008).
Talking about that, the CL compiler is actually smaller than the assembler :dazzled: !!
cl.exe 122 KB
ML.EXE 349 KB
There are a Boyer-Moore algorithm for Unicode? I was looking around the internet and found only in C++ ...
You would have to roll that one yourself as I have not seen a hand written unicode version. It would still work the same way but you would be working with 2 byte characters instead of 1.
Quote from: hutch-- on February 23, 2011, 02:10:25 PM
You would have to roll that one yourself as I have not seen a hand written unicode version. It would still work the same way but you would be working with 2 byte characters instead of 1.
I have not seen too. Thank you
hutch! :U