The MASM Forum Archive 2004 to 2012

General Forums => The Laboratory => Topic started by: hutch-- on July 06, 2008, 08:04:42 AM

Title: Golden oldie BM algos written by Jeremy Collake
Post by: hutch-- on July 06, 2008, 08:04:42 AM
I can find almost anything if I look long enough. This one is for JJ and anyone else who is interested to play with. Somewhere I also have a working Microsoft BM algo.

[attachment deleted by admin]
Title: Re: Golden oldie BM algos written by Jeremy Collake
Post by: jj2007 on July 06, 2008, 09:22:11 AM
Thanxalot, Sir! The ToLower macro is not very fast, I am afraid, even when I use the "long" versions of InstrCI.

----------------------
9473 average crt_strstr res=4233787
7696 average InstrCi 1L res=4233787
7681 average InstrCi 2L res=4233787
7585 * average InstrCi
-236 diff finstr-InstrCi (neg = finstr is faster)
15434 avg BM Collake    res=4233787


EDIT: Forgot to add a search for a loooong string. See new attachment.

----------------------
9454 average crt_strstr res=4241979
7378 average finstr     res=2619
5547 average InStringL          res=4241979
7480 average InstrCi 1s res=4241979
7466 average InstrCi 2s res=4241979
7710 average InstrCi 1L res=4241979
7821 average InstrCi 2L res=4241979
7619 * average InstrCi
-241 diff finstr-InstrCi (neg = finstr is faster)
451 diff BM s - InstrCi (neg = BM is faster)
8070 avg Boyer-Moore s  res=2619
7240 avg Boyer-Moore L  res=2616
11971 avg Hll istring   res=2620
10681 avg BM Collake    res=4241979
14776 avg BMiS Collake  res=4241979
8696 avg BMiL Collake   res=4241979
1943849 average StrStrI res=4241979

[attachment deleted by admin]
Title: Re: Golden oldie BM algos written by Jeremy Collake
Post by: johnsa on July 18, 2008, 10:58:13 AM
I was having a look at suffix trees for substring searches.. haven't gotten around to implementing anything, but they're worth reading up on.. also suffix arrays .. should be a lot faster than BM for large strings (not sure of the threshold).. but scanning say a megabytes worth of text ...
Title: Re: Golden oldie BM algos written by Jeremy Collake
Post by: jj2007 on April 27, 2009, 08:47:09 AM
I know it's bad manners to update an old thread, but maybe it's worth to show what SSE2 can do on a P4...

12651 average crt_strstr        res=4237883
10784 average finstr            res=2619
2938 average InstrJJ #####      res=2620 <---------------- SSE2 -------------
16378 average InStringL         res=2620   <--- the Masm32 library version
10556 average InstrCi 1s        res=4237883  <---- case insensitive search
11364 average InstrCi 2s        res=4237883
11449 average InstrCi 1L        res=4237883
11595 average InstrCi 2L        res=4237883
11240 * average InstrCi


... and a Celeron M:
8866 average crt_strstr res=4237883
9076 average finstr     res=2619
1822 average InstrJJ #####      res=2620 <---------------- SSE2 -------------
9841 average InStringL          res=2620   <--- the Masm32 library version
8525 average InstrCi 1s res=4237883
8503 average InstrCi 2s res=4237883
8592 average InstrCi 1L res=4237883
8681 average InstrCi 2L res=4237883
8575 * average InstrCi


1822/2620 = 0.7 cycles per byte of string length :bg

[attachment deleted by admin]
Title: Re: Golden oldie BM algos written by Jeremy Collake
Post by: Mark Jones on April 28, 2009, 03:32:34 PM
Now make a quad-core version which does one byte in 0.7/4 ~= 0.175 clocks.
Title: Re: Golden oldie BM algos written by Jeremy Collake
Post by: jj2007 on April 28, 2009, 07:41:12 PM
Quote from: Mark Jones on April 28, 2009, 03:32:34 PM
Now make a quad-core version which does one byte in 0.7/4 ~= 0.175 clocks.

Let's be modest: a factor four on crt_strstr and the masm32 version should be enough :wink
Title: Re: Golden oldie BM algos written by Jeremy Collake
Post by: mitchi on April 28, 2009, 09:17:25 PM
Sorry but a newbie question here :
Microsoft (R) Macro Assembler Version 6.14.8444
Copyright (C) Microsoft Corp 1981-1997.  All rights reserved.

Assembling: C:\masm32\jj string search\jj.asm
C:\masm32\jj string search\jj.asm(682) : error A2008: syntax error : pshufd
C:\masm32\jj string search\jj.asm(703) : error A2008: syntax error : xmm1
C:\masm32\jj string search\jj.asm(704) : error A2008: syntax error : xmm4
C:\masm32\jj string search\jj.asm(707) : error A2008: syntax error : movdqa
C:\masm32\jj string search\jj.asm(711) : error A2008: syntax error : movdqa
C:\masm32\jj string search\jj.asm(712) : error A2008: syntax error : xmm4


I get these kind of errors when I try to assemble SSE code. Is it because I need a newer ML ? I use the one in MASM32.

Title: Re: Golden oldie BM algos written by Jeremy Collake
Post by: Jimg on April 28, 2009, 10:29:46 PM
mitchi - You need to use at least masm 6.15 or jwasm to assemble.
Title: Re: Golden oldie BM algos written by Jeremy Collake
Post by: mitchi on April 29, 2009, 05:30:33 AM
I have MASM 6.14!! I'm only a version away  :eek
Why is that? I believe that my MASM32 installation is recent!

I own Visual Studio and it comes with ML version 9, should I use that one instead?
Title: Re: Golden oldie BM algos written by Jeremy Collake
Post by: jj2007 on April 29, 2009, 06:07:35 AM
Quote from: mitchi on April 29, 2009, 05:30:33 AM
I own Visual Studio and it comes with ML version 9, should I use that one instead?

Yep, version 9 is fine, but you have to find a number of files and copy them into the \Masm32\bin folder....

Apparently, to run ml version 9.0, copying these files seems to be essential and sufficient:

\masm32\bin\link.exe
\masm32\bin\ml.exe
\masm32\bin\mspdb80.dll
\masm32\bin\msvcr90.dll
\masm32\bin\Microsoft.VC90.CRT.manifest   <-that one caused me some headaches, it's absolutely needed
Title: Re: Golden oldie BM algos written by Jeremy Collake
Post by: UtillMasm on April 29, 2009, 08:26:01 AM
ml.exe 6.15 on my computer.

[attachment deleted by admin]
Title: Re: Golden oldie BM algos written by Jeremy Collake
Post by: jj2007 on April 29, 2009, 08:35:21 AM
Quote from: UtillMasm on April 29, 2009, 08:26:01 AM
ml.exe 6.15 on my computer.

Thanks, UtillMasm. Be aware of copyright rules when posting such stuff, though (it's included in the GeneSys package, too).

Mitchi, could you please give me feedback whether it works with the files listed above? Thanks.
Title: Re: Golden oldie BM algos written by Jeremy Collake
Post by: hutch-- on April 29, 2009, 08:46:58 AM
I think with ML 9.0 if you have it already installed and working on your machine that just copying it to the masm32 directory will do the job as the DLLs needed are already there.
Title: Re: Golden oldie BM algos written by Jeremy Collake
Post by: mitchi on April 29, 2009, 12:17:10 PM
Dependency Viewer shows that the code base of ML 9.0 is considerably bigger.

ML 9.0 has the Visual 2008 C-Runtime DLL dependency (winsx) + ADVAPI32.DLL (For Hash functions) and it still stands at 349 KB.
ML 6.14 has only the KERNEL32.DLL dependency (The C Runtime is statically linked) and it's only 364 KB.

Just so you know :
The hello world application is 52 KB with /MT (static link) (VS2008)
The hello world is 8 KB with /MD (VS2008)

Gives you an idea..



Title: Re: Golden oldie BM algos written by Jeremy Collake
Post by: hutch-- on April 29, 2009, 01:48:17 PM
You are getting the size increase by using the later linker, just use ML 9 by itself and use the linker from masm32 or Pelle's POLINK and you will get the same size small file. I have ML 9 installed on my dev box and it works fine with no change in file size.
Title: Re: Golden oldie BM algos written by Jeremy Collake
Post by: mitchi on April 29, 2009, 02:10:24 PM
No No, I wasn't talking about that. I can use ML 9 and the MASM32 linker just fine. I was just talking about C hello horld (Compiled and linked with VS2008).
Talking about that, the CL compiler is actually smaller than the assembler   :dazzled: !!

cl.exe     122 KB
ML.EXE   349 KB

Title: Re: Golden oldie BM algos written by Jeremy Collake
Post by: 0x401000 on February 23, 2011, 01:54:18 PM
There are a Boyer-Moore algorithm for Unicode? I was looking around the internet and found only in C++ ...
Title: Re: Golden oldie BM algos written by Jeremy Collake
Post by: hutch-- on February 23, 2011, 02:10:25 PM
You would have to roll that one yourself as I have not seen a hand written unicode version. It would still work the same way but you would be working with 2 byte characters instead of 1.
Title: Re: Golden oldie BM algos written by Jeremy Collake
Post by: 0x401000 on February 23, 2011, 02:35:13 PM
Quote from: hutch-- on February 23, 2011, 02:10:25 PM
You would have to roll that one yourself as I have not seen a hand written unicode version. It would still work the same way but you would be working with 2 byte characters instead of 1.

I have not seen too. Thank you hutch!  :U