The MASM Forum Archive 2004 to 2012

General Forums => The Laboratory => Topic started by: ecube on November 18, 2006, 06:05:17 AM

Title: InString Speed Test!
Post by: ecube on November 18, 2006, 06:05:17 AM
My results

msvcrt strstr          126
windows api strstr  1112(ugh)
Masm InString        220
InStringL               89 (don't know who wrote this one but is fantastic code)
ntdll strstr                   (undocumented? I couldn't get it to work)

[attachment deleted by admin]
Title: Re: InString Speed Test!
Post by: hutch-- on November 18, 2006, 10:06:25 AM
cube,

On my work PIV the MSVCRT strstr is faster than the InstringL proc. The API is very slow and the MASM32 module carries a bit too much overhead. It was never classed as fast, it just generally worked.
Title: Re: InString Speed Test!
Post by: ecube on November 18, 2006, 10:32:50 AM
hutch--,
Really? I tested on many strings InStringL is always the fastest, seems to get better with longer strings IMO anyway.
Title: Re: InString Speed Test!
Post by: hutch-- on November 18, 2006, 11:03:40 AM
Cube,

What box are you running the tests on ? I did the test on a 2.8 gig Northwood core PIV.
Title: Re: InString Speed Test!
Post by: lingo on November 18, 2006, 01:46:12 PM
E^cube,
It is my old Instring proc (at the end with L as Lingo)
http://www.asmcommunity.net/board/index.php?topic=13638.msg134462#msg134462

It is still the fastest proc
at my box with P4 3.6 GHz   :lol

times:
     timing msvcrt strstr - 319
     timing strstr - 1830
     timing InString - 337
     timing InStringL- 148

Regards,
Lingo   
Title: Re: InString Speed Test!
Post by: Seb on November 18, 2006, 02:00:32 PM
Results on my AMD Athlon 64 X2 Dual Core:


msvcrt strstr   124
windows api strstr   1089
Masm InString   217
InStringL   89


Edit:

While I'm at it, would anybody mind explaining why Lingo's code is so much faster than the rests? Does having prologue and epilogue turned off make any difference in terms of speed? This is not the first time I've seen lingo's cryptic and very-hard-to-read code beat other codes, so I must ask: why? :dazzled:
Title: Re: InString Speed Test!
Post by: MichaelW on November 18, 2006, 04:18:59 PM
The NTDLL strstr appears to be slightly slower than the MSVCRT strstr. Also, InString returns a character position instead of an address, so it's probably doing more work.



[attachment deleted by admin]
Title: Re: InString Speed Test!
Post by: EduardoS on November 18, 2006, 04:20:54 PM
Quote from: Seb on November 18, 2006, 02:00:32 PM
While I'm at it, would anybody mind explaining why Lingo's code is so much faster than the rests? Does having prologue and epilogue turned off make any difference in terms of speed? This is not the first time I've seen lingo's cryptic and very-hard-to-read code beat other codes, so I must ask: why? :dazzled:
Maybe he is skilled?  :8)
Title: Re: InString Speed Test!
Post by: Seb on November 18, 2006, 05:36:49 PM
Quote from: EduardoS on November 18, 2006, 04:20:54 PM
Quote from: Seb on November 18, 2006, 02:00:32 PM
While I'm at it, would anybody mind explaining why Lingo's code is so much faster than the rests? Does having prologue and epilogue turned off make any difference in terms of speed? This is not the first time I've seen lingo's cryptic and very-hard-to-read code beat other codes, so I must ask: why? :dazzled:
Maybe he is skilled?  :8)

Yes, obviously he is. ::)
Title: Re: InString Speed Test!
Post by: ecube on November 18, 2006, 09:48:20 PM
Quote from: Seb on November 18, 2006, 02:00:32 PM
Results on my AMD Athlon 64 X2 Dual Core:


msvcrt strstr   124
windows api strstr   1089
Masm InString   217
InStringL   89


Edit:

While I'm at it, would anybody mind explaining why Lingo's code is so much faster than the rests? Does having prologue and epilogue turned off make any difference in terms of speed? This is not the first time I've seen lingo's cryptic and very-hard-to-read code beat other codes, so I must ask: why? :dazzled:
Yes hes very skilled , if you look at his code it makes sense he doesn't use the stack-frame and access the parameters directly(prologue and epilogue is turning off the ebp setup code,)

mov edi, dword ptr [esp+8] ;param 1
mov esi, dword ptr [esp+12] ;param 2
mov ecx, dword ptr [esp+16] ;param 3

instead of
mov edi, param1
mov esi, param 2
mov ecx,param 3


also if you read his code hes checks 2 things at once, uses byte alignment, and reading somethings in dwords. As Donkey has mentioned before the cpu reads everything in dwords, not bytes or words, all dwords, just shifts up and down when needed, so he uses a lot of speed techniques, all in all lingo is a n asm all-star.
Title: Re: InString Speed Test!
Post by: hutch-- on November 18, 2006, 10:19:28 PM
It did have that look of Lingo's code.  :bg

Something that is worth checking is doing a test on a much longer string, from memory the test piece was using a very short string and this may account for the bad result with the API which usually have SEH and other things around them to slow them up. A longer test piece would remove those factors to a large degree.

I was thinking of loading an external file of 16 or 32k just so you get past the minor overhead factors.
Title: Re: InString Speed Test!
Post by: Seb on November 18, 2006, 10:22:34 PM
Quote from: E^cube on November 18, 2006, 09:48:20 PM
all in all lingo is a n asm all-star.

Agreed.

With the risk of going off-topic in the discussion, I have a question for lingo: Would you mind mind telling us curious fokes what optimization manuals or papers you read? I'm very interested in every aspect of Assembly code optimization, so it'd be greatly appreciated if you shared your resources with your fellow comrades at the MASM forum. :U
Title: Re: InString Speed Test!
Post by: hutch-- on November 18, 2006, 10:24:01 PM
Seb,

I think Lingo IS the optimisation manual, he has been producing very fast algos for a long time now.  :U
Title: Re: InString Speed Test!
Post by: ecube on November 18, 2006, 10:30:43 PM
Quote from: Seb on November 18, 2006, 10:22:34 PM
Quote from: E^cube on November 18, 2006, 09:48:20 PM
all in all lingo is a n asm all-star.

Agreed.

With the risk of going off-topic in the discussion, I have a question for lingo: Would you mind mind telling us curious fokes what optimization manuals or papers you read? I'm very interested in every aspect of Assembly code optimization, so it'd be greatly appreciated if you shared your resources with your fellow comrades at the MASM forum. :U

I'm not lingo but http://www.mark.masmcode.com/ and http://www.masm32.com/board/index.php?topic=5464.0 are good optimization pages.
Title: Re: InString Speed Test!
Post by: Seb on November 18, 2006, 10:41:44 PM
Quote from: hutch-- on November 18, 2006, 10:24:01 PM
Seb,

I think Lingo IS the optimisation manual, he has been producing very fast algos for a long time now.  :U

Hear, hear! :cheekygreen:

Quote from: E^cube on November 18, 2006, 10:30:43 PM
I'm not lingo but http://www.mark.masmcode.com/ and http://www.masm32.com/board/index.php?topic=5464.0 are good optimization pages.

Thanks!
Title: Re: InString Speed Test!
Post by: hutch-- on November 19, 2006, 04:11:47 AM
I had a quick play with the library version of InString by onrolling a couple of the loops and it is a bit faster but the algo properly needs a rewrite. An interesting factor is both the msvcrt and lingo's algos get faster with longer patterns where a byte scanner gets slower in its matching loop.

Here are the timings on my 2 PIVs.


Northwood code 2.8 gig PIV

719 timing msvcrt strstr
763 timing InStringx
480 timing InStringL
699 timing msvcrt strstr
768 timing InStringx
474 timing InStringL
696 timing msvcrt strstr
769 timing InStringx
465 timing InStringL
713 timing msvcrt strstr
764 timing InStringx
465 timing InStringL
698 timing msvcrt strstr
767 timing InStringx
470 timing InStringL
704 timing msvcrt strstr
763 timing InStringx
466 timing InStringL
716 timing msvcrt strstr
767 timing InStringx
462 timing InStringL
702 timing msvcrt strstr
767 timing InStringx
464 timing InStringL
----------------------
705 average msvcrt strstr
766 average InStringx
468 average InStringL
----------------------
Press any key to continue ...

Prescott core 3.0 gig PIV.

845 timing msvcrt strstr
848 timing InStringx
434 timing InStringL
846 timing msvcrt strstr
846 timing InStringx
438 timing InStringL
844 timing msvcrt strstr
845 timing InStringx
447 timing InStringL
845 timing msvcrt strstr
845 timing InStringx
464 timing InStringL
842 timing msvcrt strstr
844 timing InStringx
445 timing InStringL
844 timing msvcrt strstr
843 timing InStringx
450 timing InStringL
843 timing msvcrt strstr
847 timing InStringx
458 timing InStringL
847 timing msvcrt strstr
845 timing InStringx
470 timing InStringL
----------------------
844 average msvcrt strstr
845 average InStringx
450 average InStringL
----------------------
Press any key to continue ...


Lingo's algo has clearly got the legs on the other two.


[attachment deleted by admin]
Title: Re: InString Speed Test!
Post by: hutch-- on November 19, 2006, 01:29:57 PM
I have coded a new version of Instring "finstr" which has a lot less overhead than the library version. This version differs from the library version in that it does not use a preset length for either the source string or the search pattern but calculates them as zero terminated strings. The new version is faster on normal to short length strings than either Lingo's or the MSVCRT strstr but as the string gets longer, Lingos starts to become faster. The timings below are on my 2 PIVs but I did run a test on an AMD I own and Lingo's code better suits the AMD than either the MSVCRT or my own does.

These are the results on my 2 PIVs.



2.8 gig Northwood PIV

163 timing msvcrt strstr
70 timing finstr
118 timing InStringL
163 timing msvcrt strstr
71 timing finstr
127 timing InStringL
164 timing msvcrt strstr
71 timing finstr
120 timing InStringL
162 timing msvcrt strstr
70 timing finstr
120 timing InStringL
162 timing msvcrt strstr
71 timing finstr
119 timing InStringL
163 timing msvcrt strstr
71 timing finstr
119 timing InStringL
161 timing msvcrt strstr
72 timing finstr
119 timing InStringL
160 timing msvcrt strstr
71 timing finstr
118 timing InStringL
----------------------
162 average msvcrt strstr
70 average finstr
120 average InStringL
----------------------
Press any key to continue ...

3.0 gig Prescott PIV

257 timing msvcrt strstr
110 timing finstr
141 timing InStringL
259 timing msvcrt strstr
110 timing finstr
141 timing InStringL
258 timing msvcrt strstr
110 timing finstr
142 timing InStringL
253 timing msvcrt strstr
110 timing finstr
142 timing InStringL
257 timing msvcrt strstr
110 timing finstr
142 timing InStringL
257 timing msvcrt strstr
110 timing finstr
142 timing InStringL
257 timing msvcrt strstr
110 timing finstr
142 timing InStringL
257 timing msvcrt strstr
110 timing finstr
141 timing InStringL
----------------------
256 average msvcrt strstr
110 average finstr
141 average InStringL
----------------------
Press any key to continue ...

[attachment deleted by admin]
Title: Re: InString Speed Test!
Post by: EduardoS on November 19, 2006, 04:56:06 PM
My AMD, K-8:
instr2:

813 timing msvcrt strstr
749 timing InStringx
313 timing InStringL
769 timing msvcrt strstr
749 timing InStringx
371 timing InStringL
768 timing msvcrt strstr
771 timing InStringx
318 timing InStringL
782 timing msvcrt strstr
749 timing InStringx
307 timing InStringL
768 timing msvcrt strstr
751 timing InStringx
310 timing InStringL
768 timing msvcrt strstr
751 timing InStringx
319 timing InStringL
766 timing msvcrt strstr
749 timing InStringx
318 timing InStringL
771 timing msvcrt strstr
750 timing InStringx
307 timing InStringL
----------------------
775 average msvcrt strstr
752 average InStringx
320 average InStringL
----------------------
Press any key to continue ...


instr3:

122 timing msvcrt strstr
102 timing finstr
87 timing InStringL
118 timing msvcrt strstr
90 timing finstr
91 timing InStringL
118 timing msvcrt strstr
99 timing finstr
89 timing InStringL
119 timing msvcrt strstr
31 timing finstr
56 timing InStringL
118 timing msvcrt strstr
87 timing finstr
92 timing InStringL
117 timing msvcrt strstr
89 timing finstr
89 timing InStringL
118 timing msvcrt strstr
89 timing finstr
91 timing InStringL
118 timing msvcrt strstr
89 timing finstr
91 timing InStringL
----------------------
118 average msvcrt strstr
84 average finstr
85 average InStringL
----------------------
Press any key to continue ...
Title: Re: InString Speed Test!
Post by: lingo on November 21, 2006, 12:58:06 AM
Sorry Hutch,

I apologize... :(
just forgot about that...

Regards,
Lingo
Title: Re: InString Speed Test!
Post by: hutch-- on November 21, 2006, 01:27:50 AM
The source code for the "finstr" proc has been modified by removing the loop unrolls. I removed the comments and the timings are as below. The algo was designed with the large unroll to compensate for the extra testing for zero in both the scan loop and the match loop.


2.8 gig Northwood core PIV

160 timing msvcrt strstr
72 timing finstr
104 timing InStringL
162 timing msvcrt strstr
72 timing finstr
106 timing InStringL
164 timing msvcrt strstr
75 timing finstr
103 timing InStringL
163 timing msvcrt strstr
72 timing finstr
106 timing InStringL
162 timing msvcrt strstr
72 timing finstr
106 timing InStringL
157 timing msvcrt strstr
73 timing finstr
105 timing InStringL
163 timing msvcrt strstr
75 timing finstr
105 timing InStringL
159 timing msvcrt strstr
73 timing finstr
105 timing InStringL
----------------------
161 average msvcrt strstr
73 average finstr
105 average InStringL
----------------------
Press any key to continue ...

3.0 gig Prescott core PIV

257 timing msvcrt strstr
110 timing finstr
140 timing InStringL
257 timing msvcrt strstr
110 timing finstr
142 timing InStringL
257 timing msvcrt strstr
110 timing finstr
142 timing InStringL
255 timing msvcrt strstr
111 timing finstr
140 timing InStringL
257 timing msvcrt strstr
110 timing finstr
142 timing InStringL
257 timing msvcrt strstr
110 timing finstr
143 timing InStringL
257 timing msvcrt strstr
112 timing finstr
141 timing InStringL
258 timing msvcrt strstr
110 timing finstr
142 timing InStringL
----------------------
256 average msvcrt strstr
110 average finstr
141 average InStringL
----------------------
Press any key to continue ...


Later.

Tested with the longer string commented out in the test piece, Lingo's algo is clearly faster as the string length gets longer.


710 timing msvcrt strstr
526 timing finstr
343 timing InStringL
700 timing msvcrt strstr
525 timing finstr
349 timing InStringL
708 timing msvcrt strstr
526 timing finstr
341 timing InStringL
707 timing msvcrt strstr
524 timing finstr
350 timing InStringL
708 timing msvcrt strstr
530 timing finstr
334 timing InStringL
706 timing msvcrt strstr
525 timing finstr
334 timing InStringL
705 timing msvcrt strstr
522 timing finstr
354 timing InStringL
701 timing msvcrt strstr
524 timing finstr
352 timing InStringL
----------------------
705 average msvcrt strstr
525 average finstr
344 average InStringL
----------------------
Press any key to continue ...

[attachment deleted by admin]
Title: Re: InString Speed Test!
Post by: ecube on November 21, 2006, 04:34:44 AM
114 timing msvcrt strstr
95 timing finstr
108 timing InStringL
114 timing msvcrt strstr
97 timing finstr
106 timing InStringL
115 timing msvcrt strstr
95 timing finstr
107 timing InStringL
115 timing msvcrt strstr
96 timing finstr
112 timing InStringL
113 timing msvcrt strstr
97 timing finstr
106 timing InStringL
115 timing msvcrt strstr
95 timing finstr
107 timing InStringL
113 timing msvcrt strstr
108 timing finstr
107 timing InStringL
113 timing msvcrt strstr
96 timing finstr
107 timing InStringL
----------------------
114 average msvcrt strstr
97 average finstr
107 average InStringL
----------------------
Press any key to continue ...

amd 64 3800+
Title: Re: InString Speed Test!
Post by: Jimg on November 21, 2006, 04:40:26 AM
Hi Hutch-

Looking at your code, I don't understand why you duplicate the code inside the repeat one more time outside the repeat loop.

Isn't this-
scanloop:
  REPEAT 15
    add esi, ebp
    movzx edx, BYTE PTR [esi]
    cmp eax, edx
    je prematch
    test edx, edx                   ; test for source terminator
    jz no_match
  ENDM

    add esi, ebp
    movzx edx, BYTE PTR [esi]
    cmp eax, edx
    je prematch
    test edx, edx                   ; test for source terminator
    jnz scanloop
    jmp no_match

the same as this-
  REPEAT 16
    add esi, ebp
    movzx edx, BYTE PTR [esi]
    cmp eax, edx
    je prematch
    test edx, edx                   ; test for source terminator
    jz no_match
  ENDM
    jmp scanloop
?  At best it saves one cycle every 16 bytes tested for the most likely jump being to scanloop.

And if you were to get four bytes to test at once, you could replace 3 movzx's with one rol edx,16 like-
  REPEAT 4
    add esi, ebp
    mov edx,[esi]
    test dl, dl                   ; test for source terminator
    jz no_match
    cmp al, dl
    je prematch

    add esi,ebp
    test dh, dh                   ; test for source terminator
    jz no_match
    cmp al, dh
    je prematch
   
    add esi,ebp
    rol edx,16
    test dl, dl                   ; test for source terminator
    jz no_match
    cmp al, dl
    je prematch

    add esi,ebp
    test dh, dh                   ; test for source terminator
    jz no_match
    cmp al, dh
    je prematch
   
  ENDM

    jmp scanloop

which run slightly faster on my amd.
Title: Re: InString Speed Test!
Post by: hutch-- on November 21, 2006, 05:54:26 AM
Jim,

> Looking at your code, I don't understand why you duplicate the code inside the repeat one more time outside the repeat loop.

Habit, I usually do an unroll by repeating the first one less than the required unroll count and changing any necessary jumps.

I would be interested in seeing your alternate version with the DWORD read and the ROL. It would probably need to be aligned before the DWORD read which is easy enough to do. The only thing that would worry me is the stall on some older Intel hardware for doing byte reads after a DWORD write.
Title: Re: InString Speed Test!
Post by: MichaelW on November 21, 2006, 07:33:00 AM
P3:

129 timing msvcrt strstr
114 timing finstr
111 timing InStringL
129 timing msvcrt strstr
113 timing finstr
111 timing InStringL
129 timing msvcrt strstr
113 timing finstr
110 timing InStringL
129 timing msvcrt strstr
113 timing finstr
111 timing InStringL
129 timing msvcrt strstr
114 timing finstr
110 timing InStringL
129 timing msvcrt strstr
113 timing finstr
111 timing InStringL
128 timing msvcrt strstr
113 timing finstr
111 timing InStringL
129 timing msvcrt strstr
113 timing finstr
110 timing InStringL
----------------------
128 average msvcrt strstr
113 average finstr
110 average InStringL
----------------------
Press any key to continue ...
Title: Re: InString Speed Test!
Post by: zooba on November 21, 2006, 07:40:32 AM
P3 Mobile (probably 1400MHz, though maybe less, SpeedStep doesn't seem to work properly...):

93 timing msvcrt strstr
108 timing finstr
71 timing InStringL
96 timing msvcrt strstr
111 timing finstr
71 timing InStringL
96 timing msvcrt strstr
131 timing finstr
70 timing InStringL
93 timing msvcrt strstr
108 timing finstr
71 timing InStringL
95 timing msvcrt strstr
108 timing finstr
70 timing InStringL
94 timing msvcrt strstr
118 timing finstr
70 timing InStringL
93 timing msvcrt strstr
119 timing finstr
69 timing InStringL
93 timing msvcrt strstr
108 timing finstr
70 timing InStringL
----------------------
94 average msvcrt strstr
113 average finstr
70 average InStringL
----------------------
Press any key to continue ...


Interesting that finstr is the same for both mine and MichaelW's computers, though the others are approx 30-40 faster on mine (consistently). I guess I am running at 1400MHz :bg

Cheers,

Zooba :U
Title: Re: InString Speed Test!
Post by: six_L on November 21, 2006, 08:45:26 AM
QuoteGenuineIntel  Family 6  Model 13  Stepping 6
Intel Brand String: Intel(R) Pentium(R) M processor
Features: FPU TSC CX8 CMOV CLFSH FXSR MMX SSE SSE2
Quote94 timing msvcrt strstr
107 timing finstr
71 timing InStringL
93 timing msvcrt strstr
122 timing finstr
71 timing InStringL
95 timing msvcrt strstr
129 timing finstr
71 timing InStringL
94 timing msvcrt strstr
127 timing finstr
71 timing InStringL
94 timing msvcrt strstr
122 timing finstr
71 timing InStringL
94 timing msvcrt strstr
107 timing finstr
71 timing InStringL
93 timing msvcrt strstr
123 timing finstr
71 timing InStringL
94 timing msvcrt strstr
107 timing finstr
71 timing InStringL
----------------------
93 average msvcrt strstr
118 average finstr
71 average InStringL
----------------------
Press any key to continue ...
Title: Re: InString Speed Test!
Post by: Jimg on November 21, 2006, 02:33:56 PM
Hutch-

Quote from: hutch-- on November 21, 2006, 05:54:26 AM
>
I would be interested in seeing your alternate version with the DWORD read and the ROL. It would probably need to be aligned before the DWORD read which is easy enough to do. The only thing that would worry me is the stall on some older Intel hardware for doing byte reads after a DWORD write.

Look at the last block of code in my previous post.  All I did is replace your block with that one, and it reduced the run from 115 to 107 cycles on the shorter string.  Still much slower than Lingo on the longer string however.  But look carefully, I could have screwed up the endian thing.  I haven't done any masm since february and am a littly rusty.

edit.   Actually, SHR edx,16 would be better than the ROL, and move it up one line.

edit2.  I just dusted off my old celeron laptop and gave it a try.  No difference in speed so basically, forget everything I said.
Title: Re: InString Speed Test!
Post by: lingo on November 22, 2006, 01:52:04 AM
Shorter substrings and rolling out?
OK, ... new InStringL with new results:  :lol
207 timing msvcrt strstr
111 timing finstr
70 timing InStringL
209 timing msvcrt strstr
117 timing finstr
71 timing InStringL
209 timing msvcrt strstr
111 timing finstr
70 timing InStringL
209 timing msvcrt strstr
112 timing finstr
70 timing InStringL
206 timing msvcrt strstr
113 timing finstr
70 timing InStringL
209 timing msvcrt strstr
113 timing finstr
70 timing InStringL
209 timing msvcrt strstr
111 timing finstr
70 timing InStringL
213 timing msvcrt strstr
110 timing finstr
70 timing InStringL
----------------------
208 average msvcrt strstr
112 average finstr
70 average InStringL
----------------------
Press any key to continue ...


Regards,
Lingo


[attachment deleted by admin]
Title: Re: InString Speed Test!
Post by: ecube on November 22, 2006, 04:39:26 AM
114 timing msvcrt strstr
94 timing finstr
49 timing InStringL
113 timing msvcrt strstr
94 timing finstr
49 timing InStringL
112 timing msvcrt strstr
95 timing finstr
49 timing InStringL
113 timing msvcrt strstr
93 timing finstr
50 timing InStringL
113 timing msvcrt strstr
95 timing finstr
49 timing InStringL
113 timing msvcrt strstr
94 timing finstr
48 timing InStringL
113 timing msvcrt strstr
99 timing finstr
50 timing InStringL
112 timing msvcrt strstr
95 timing finstr
49 timing InStringL
----------------------
112 average msvcrt strstr
94 average finstr
49 average InStringL
----------------------
Press any key to continue ...
Title: Re: InString Speed Test!
Post by: Seb on November 22, 2006, 12:36:24 PM
Very nice timings! :U


107 timing msvcrt strstr
90 timing finstr
47 timing InStringL
107 timing msvcrt strstr
90 timing finstr
47 timing InStringL
107 timing msvcrt strstr
89 timing finstr
46 timing InStringL
107 timing msvcrt strstr
89 timing finstr
47 timing InStringL
107 timing msvcrt strstr
90 timing finstr
47 timing InStringL
107 timing msvcrt strstr
89 timing finstr
47 timing InStringL
106 timing msvcrt strstr
102 timing finstr
47 timing InStringL
106 timing msvcrt strstr
90 timing finstr
46 timing InStringL
----------------------
106 average msvcrt strstr
91 average finstr
46 average InStringL
----------------------
Title: Re: InString Speed Test!
Post by: hutch-- on November 23, 2006, 08:10:35 AM
I have rewritten the scanloop part of finstr using a different technique to determine if the 4 bytes being tested contain a null termniator or not. i have been testing against Lingo's latest which is a very fast algo but it does have some anomolies in terms of what is pointed at it and these make it inconsistent in its speed. The match loop is a lot faster in his latest version but for some reason it has problems with text that starts with a blank space " word " etc ...

I have not added the final part of finstr which is to test up to the last 3 bytes so in its current form it will miss a search pattern up to 3 bytes long if it is at the end of the source being searched. It does not effect any of the examples in the test piece.

These are the results on my dev PIV.


Source length = 185
Mismatch length = 290

280 finstr mismatch recovery
21 finstr short
81 finstr medium
128 finstr long
175 finstr end
----------------------
Short Search
----------------------
74 average finstr
114 average InStringL
----------------------
----------------------
Medium Search
----------------------
195 average finstr
173 average InStringL
----------------------
----------------------
Long Search
----------------------
385 average finstr
467 average InStringL
----------------------
------------------------------------------
Full Length search on high ANSI characters
------------------------------------------
353 average finstr
505 average InStringL
----------------------
----------------------
Mismatch Recovery Test
----------------------
1186 average finstr
1365 average InStringL
----------------------
Press any key to continue ...



[attachment deleted by admin]
Title: Re: InString Speed Test!
Post by: UlliN on November 23, 2006, 02:44:04 PM
Hi,

that's my results on an AMD FX57

Source length = 185
Mismatch length = 290

280 finstr mismatch recovery
21 finstr short
81 finstr medium
128 finstr long
175 finstr end
----------------------
Short Search
----------------------
86 average finstr
86 average InStringL
----------------------
----------------------
Medium Search
----------------------
240 average finstr
111 average InStringL
----------------------
----------------------
Long Search
----------------------
464 average finstr
355 average InStringL
----------------------
------------------------------------------
Full Length search on high ANSI characters
------------------------------------------
426 average finstr
306 average InStringL
----------------------
----------------------
Mismatch Recovery Test
----------------------
1416 average finstr
928 average InStringL
----------------------
Press any key to continue ...

Ulli
Title: Re: InString Speed Test!
Post by: Jimg on November 23, 2006, 05:06:45 PM
Good job Hutch.

Can you explain how this part works?
  ; ----------------------------------------------
  ; test if the next 4 bytes contain an ascii zero
  ; ----------------------------------------------
    add esi, 4
    mov edx, [esi]
    lea ecx, [edx-01010101h]
    not edx
    and ecx, esp        ; modify ECX
    test ecx, edx       ; test with no modify
    jnz no_match



The old method of doing one load of a double into edx and testing dl and dh  for example-

  ; ----------------------------------------
  ; test each byte for a patn 1st char match
  ; ----------------------------------------
    mov ebx,[esi]
    mov edx,[esi+2]
    sub bl,al
    jz si0
    sub bh,al
    jz si1
    sub dl,al
    jz si2
    sub dh,al
    jz si3

is faster on an AMD, but slower on the celeron.

One problem-

Try

    Mainstr db "The time has come the walrus said to speak of many things,"
            db " of birds and bugs and ceiling wax, of cabbages and kings."
            db " And if the sea is boiling hot and whether pigs have wings «¤»¼µþ®§© z",0

    Substr1  db "z",0
   
   
It never finds the "z"



Timing results:

your latest on AMD     latest code with old  your latest on        Old method on
Athlon XP 3000+         method of match as    Celeron 2.0 Ghz        Celeron
at 2.16 Ghz              above on AMD

Source length = 185    Source length = 185   Source length = 185   Source length = 185                                                 
Mismatch length = 290  Mismatch length = 290 Mismatch length = 290 Mismatch length = 290                                               
                                                                                                                                       
280 finstr mismatch r  280 finstr mismatch r 280 finstr mismatch r 280 finstr mismatch r                                               
21 finstr short        21 finstr short       21 finstr short       21 finstr short                                                     
81 finstr medium       81 finstr medium      81 finstr medium      81 finstr medium                                                     
128 finstr long        128 finstr long       128 finstr long       128 finstr long                                                     
175 finstr end         175 finstr end        175 finstr end        175 finstr end                                                       
---------------------  --------------------- --------------------- ---------------------                                               
Short Search           Short Search          Short Search          Short Search                                                         
---------------------  --------------------- --------------------- ---------------------                                               
110 average finstr     105 average finstr    70 average finstr     88 average finstr                                                   
113 average InStringL  113 average InStringL 112 average InStringL 116 average InStringL                                               
---------------------  --------------------- --------------------- ---------------------                                               
---------------------  --------------------- --------------------- ---------------------                                               
Medium Search          Medium Search         Medium Search         Medium Search                                                       
---------------------  --------------------- --------------------- ---------------------                                               
304 average finstr     255 average finstr    190 average finstr    239 average finstr                                                   
142 average InStringL  141 average InStringL 169 average InStringL 173 average InStringL                                               
---------------------  --------------------- --------------------- ---------------------                                               
---------------------  --------------------- --------------------- ---------------------                                               
Long Search            Long Search           Long Search           Long Search                                                         
---------------------  --------------------- --------------------- ---------------------                                               
478 average finstr     421 average finstr    388 average finstr    444 average finstr                                                   
377 average InStringL  377 average InStringL 461 average InStringL 453 average InStringL                                               
---------------------  --------------------- --------------------- ---------------------                                               
---------------------  --------------------- --------------------- ---------------------                                               
Full Length search on  Full Length search on Full Length search on Full Length search on                                               
---------------------  --------------------- --------------------- ---------------------                                               
557 average finstr     419 average finstr    345 average finstr    425 average finstr                                                   
539 average InStringL  539 average InStringL 502 average InStringL 492 average InStringL                                               
---------------------  --------------------- --------------------- ---------------------                                               
---------------------  --------------------- --------------------- ---------------------                                               
Mismatch Recovery Tes  Mismatch Recovery Tes Mismatch Recovery Tes Mismatch Recovery Tes                                               
---------------------  --------------------- --------------------- ---------------------                                               
1676 average finstr    1710 average finstr   1151 average finstr   1245 average finstr                                                 
1166 average InString  1168 average InString 1322 average InString 1309 average InString                                               
---------------------  --------------------- --------------------- ---------------------                                               
Press any key to cont  Press any key to cont Press any key to cont Press any key to cont                                               

So looking at the numbers, I would say your new method is the best overall if you fix the problem with finding the last character.
Title: Re: InString Speed Test!
Post by: hutch-- on November 23, 2006, 10:51:18 PM
Jim,

Thanks for doing all of the tests.

I posted the reason why above.
Quote
I have not added the final part of finstr which is to test up to the last 3 bytes so in its current form it will miss a search pattern up to 3 bytes long if it is at the end of the source being searched. It does not effect any of the examples in the test piece.

The block of code that determines if a null is within the 4 bytes read in is AND masking. Agner Fog used it in his classic StrLen algo. I used the preset value in ESP because its a normal speedup for Agner's algo.
Title: Re: InString Speed Test!
Post by: Jimg on November 24, 2006, 12:05:20 AM
QuoteI posted the reason why above.
opps  :red
Title: Re: InString Speed Test!
Post by: MichaelW on November 24, 2006, 02:02:12 AM
P3:

Source length = 185
Mismatch length = 290

280 finstr mismatch recovery
21 finstr short
81 finstr medium
128 finstr long
175 finstr end
----------------------
Short Search
----------------------
107 average finstr
114 average InStringL
----------------------
----------------------
Medium Search
----------------------
295 average finstr
191 average InStringL
----------------------
----------------------
Long Search
----------------------
503 average finstr
491 average InStringL
----------------------
------------------------------------------
Full Length search on high ANSI characters
------------------------------------------
561 average finstr
493 average InStringL
----------------------
----------------------
Mismatch Recovery Test
----------------------
1647 average finstr
1343 average InStringL
----------------------

Pentium D (family 15, model 4, stepping 7):

Source length = 185
Mismatch length = 290

280 finstr mismatch recovery
21 finstr short
81 finstr medium
128 finstr long
175 finstr end
----------------------
Short Search
----------------------
98 average finstr
96 average InStringL
----------------------
----------------------
Medium Search
----------------------
256 average finstr
180 average InStringL
----------------------
----------------------
Long Search
----------------------
434 average finstr
416 average InStringL
----------------------
------------------------------------------
Full Length search on high ANSI characters
------------------------------------------
518 average finstr
524 average InStringL
----------------------
----------------------
Mismatch Recovery Test
----------------------
1416 average finstr
1358 average InStringL
----------------------