Compare two strings

Started by yvansoftware, March 30, 2010, 07:40:20 PM

dedndave

#30
Quote: "coop" and "co-op" stay together within a sorted list

one is a place for chickens - the other is a group of guys that raise them   :bg

jj2007

#31
Hi Dave,
You tested the old version. Here is a new one, including a case-insensitive algo.

Intel(R) Pentium(R) 4 CPU 3.40GHz (SSE3)
String comparison: short string 10 bytes, long string 5050
3533    cycles for SSE with null check, long string
4507    cycles for SSE with null check, long string, case-insensitive
6926    cycles for Lingo, long string
14194   cycles for crt_strcmp, long string
51312   cycles for crt__stricmp, long string, case-insensitive
33241   cycles for movzx check null, long string
20460   cycles for repe cmpsb, long string
154405  cycles for lstrcmp, long string

40      cycles for SSE with null check, 10 bytes
50      cycles for SSE with null check, 10 bytes, case-insensitive
23      cycles for Lingo, 10 bytes
41      cycles for crt_strcmp, 10 bytes
242     cycles for crt__stricmp, 10 bytes, case-insensitive


Edit: Correct version is StrCompSSE_ci3.zip - removed, it did not always return correct results. MasmBasic StringsDiffer and FilesDiffer are more stable.

dedndave

Intel Pentium 4 Prescott CPU 3.00GHz (SSE3)
String comparison: short string 10 bytes, long string 5050
3564    cycles for SSE with null check, long string
4509    cycles for SSE with null check, long string, case-insensitive
72008   cycles for Lingo, long string
14371   cycles for crt_strcmp, long string
50830   cycles for crt__stricmp, long string, case-insensitive
32967   cycles for movzx check null, long string
-44643  cycles for repe cmpsb, long string
808120  cycles for lstrcmp, long string

40      cycles for SSE with null check, 10 bytes
52      cycles for SSE with null check, 10 bytes, case-insensitive
23      cycles for Lingo, 10 bytes
41      cycles for crt_strcmp, 10 bytes
244     cycles for crt__stricmp, 10 bytes, case-insensitive
100     cycles for movzx check null, 10 bytes
123     cycles for repe cmpsb, 10 bytes
740     cycles for lstrcmp, 10 bytes

3542    cycles for SSE with null check, long string
4568    cycles for SSE with null check, long string, case-insensitive
6953    cycles for Lingo, long string
14017   cycles for crt_strcmp, long string
115829  cycles for crt__stricmp, long string, case-insensitive
98018   cycles for movzx check null, long string
85415   cycles for repe cmpsb, long string
809285  cycles for lstrcmp, long string

43      cycles for SSE with null check, 10 bytes
55      cycles for SSE with null check, 10 bytes, case-insensitive
26      cycles for Lingo, 10 bytes
47      cycles for crt_strcmp, 10 bytes
246     cycles for crt__stricmp, 10 bytes, case-insensitive
101     cycles for movzx check null, 10 bytes
125     cycles for repe cmpsb, 10 bytes
745     cycles for lstrcmp, 10 bytes

dedndave

i get crazy results, Jochen
ok - closed all other applications and ran it 10 times - something is amiss...

dedndave

i am not sure i understand the Dummy1 var
doesn't that misalign the data strings ?

jj2007

Yep, the dummies are for misaligning the strings.

Your results are really crazy. You are not running a P4 by any chance?  :wink
Try changing the invoke Sleep, 100 to invoke Sleep, 200 - on my P4 it helps. And take the updated version with 285 bytes size - you were the only downloader so far, so I exchanged it silently.

[Off topic: Nice comics here. Say bye to your mouse :green]
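
The idea behind the dummy variables, pushing a string off an aligned address so the test does not flatter aligned loads, can be sketched in C. This is only an illustration: the helper name place_misaligned is hypothetical, and the 16-byte boundary is an assumption based on the SSE routines being timed, not something the posted test bed states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy s into storage so that the copy starts
   exactly `skew` bytes past a 16-byte boundary (skew 0 = aligned). */
static char *place_misaligned(char *storage, size_t storage_size,
                              size_t skew, const char *s)
{
    assert(skew < 16);

    uintptr_t addr = (uintptr_t)storage;
    size_t pad = (16 - addr % 16) % 16;      /* bytes up to the next boundary */
    size_t len = strlen(s) + 1;              /* include the terminating zero */

    assert(pad + skew + len <= storage_size);

    char *dst = storage + pad + skew;        /* plays the role of the dummy bytes */
    memcpy(dst, s, len);
    return dst;
}
```

Calling it with a skew of 1 gives a string that starts one byte past a 16-byte boundary, which is roughly what a single dummy byte in front of an aligned string would do in the .data section.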

dedndave

i have played with that Sleep value a little bit after i saw MichaelW use it
it doesn't seem to make much difference until you get up to at least Sleep,500
Michael was using Sleep,1500 if i recall
let me try the updated virgin...

oh - running a P4 .... on purpose   :P

dedndave

here is my result for the new one - lol
30 diff at pos 4999 (zero-based)
looks like you fixed it  :bg  at least i get the same result every time

ohhhh - i get it - you got me   :P
i was gonna take up a collection to buy Jochen a cuppa cappuccino

jj2007

Oops, it seems I posted the one with the test for correctness activated :red
New one attached above, see StrCompSSE_ci3.zip

lingo

"i was gonna take up a collection to buy Jochen a cuppa cappuccino"

It would be better to advise him not to adjust his Chlorpromazine dose (from 50 mg three times daily) unless his doctor specifically instructs him to do so... :wink

dedndave

is that like Librium ?

i still get crazy data, JJ

jj2007

Quote from: dedndave on April 01, 2010, 02:53:09 PM
i still get crazy data, JJ

Typical behaviour for a P4 - many outliers. The lowest values are supposed to be the correct ones; for me they are 3600, 4500, 6900 cycles for the first three. Later I will test it on my Celeron.
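
The "lowest value wins" rule can be written down as a small C sketch. The helper name lowest_valid_cycles is hypothetical and not part of the posted test bed; skipping non-positive readings is an assumption, made because an elapsed cycle count cannot really be negative.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: pick the lowest plausible reading from repeated
   runs of the same test. Returns -1 if no reading is usable. */
static long lowest_valid_cycles(const long *samples, size_t count)
{
    assert(samples != NULL && count > 0);

    long best = -1;                          /* -1 means "no valid sample yet" */
    for (size_t i = 0; i < count; ++i) {
        if (samples[i] <= 0)
            continue;                        /* negative or zero: treat as a glitch */
        if (best < 0 || samples[i] < best)
            best = samples[i];
    }
    return best;
}
```

Fed with dedndave's two Lingo readings for the long string, 72008 and 6953, it returns 6953, which is close to the 6900 quoted above. For his repe cmpsb readings, -44643 and 85415, the negative value is dropped and 85415 remains, so more runs would be needed before trusting that number.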

dedndave

i dunno - i have never seen numbers like "-69000" before

jj2007

Celeron is more stable, as usual:

Intel(R) Celeron(R) M CPU        420  @ 1.60GHz (SSE3)
String comparison: short string 10 bytes, long string 5050
2902    cycles for SSE with null check, long string
3672    cycles for SSE with null check, long string, case-insensitive
5834    cycles for Lingo, long string
13933   cycles for crt_strcmp, long string
16508   cycles for crt__stricmp, long string, case-insensitive
30042   cycles for movzx check null, long string
20108   cycles for repe cmpsb, long string
89881   cycles for lstrcmp, long string

18      cycles for SSE with null check, 10 bytes
25      cycles for SSE with null check, 10 bytes, case-insensitive
16      cycles for Lingo, 10 bytes
32      cycles for crt_strcmp, 10 bytes
108     cycles for crt__stricmp, 10 bytes, case-insensitive
51      cycles for movzx check null, 10 bytes
88      cycles for repe cmpsb, 10 bytes
453     cycles for lstrcmp, 10 bytes

frktons


Intel(R) Core(TM)2 CPU          6600  @ 2.40GHz (SSE4)
1405    cycles for LingoCMP, 1000 bytes
519     cycles for frktonsCMP , 1000 bytes

--- ok ---



I have the strange idea that this method could be quite fast:


;--------------------------------------------------------------------------------------------------------------
; compare_4bytes
;--------------------------------------------------------------------------------------------------------------
; Comparing two 4-byte strings the fast way
;--------------------------------------------------------------------------------------------------------------
;
; Method proposed by frktons
;--------------------------------------------------------------------------------------------------------------
; 16 June 2010 - masmforum
;--------------------------------------------------------------------------------------------------------------

include \masm32\include\masm32rt.inc

.686

.data

Src1 db "This", 0
Src2 db "This", 0

.code
start:

        mov     ecx, offset Src1
        mov     eax, offset Src2

        mov     edx, [ecx]          ; first 4 bytes of Src1
        mov     esi, [eax]          ; first 4 bytes of Src2
        xor     edx, esi            ; result is zero only if all 4 bytes match
        jnz     end_check

        print   "The strings are identical",13,10
        jmp     end_of_game

end_check:

        print   "The strings are different",13,10

end_of_game:

        inkey   chr$(13, 10, "--- ok ---", 13)
        exit

end start


If you put this method inside a loop, it would be nice to see how it
compares with the other methods. For 1000-byte strings the loop has to
run 125 times, since each pass compares two dwords (8 bytes), and the
code should be modified like this:

...............

        mov     ecx, offset Src1
        mov     eax, offset Src2
        mov     ebx, 125            ; loop count: 125 x 8 bytes = 1000 bytes
@@:
        mov     edx, [ecx]          ; next 4 bytes of Src1
        mov     esi, [eax]          ; next 4 bytes of Src2
        xor     edx, esi            ; non-zero if any byte differs
        jnz     @f
        add     ecx, 4
        add     eax, 4
        mov     edx, [ecx]          ; second dword of this pass
        mov     esi, [eax]
        xor     edx, esi
        jnz     @f
        add     ecx, 4
        add     eax, 4
        dec     ebx
        jnz     @b


@@:
.........


If anyone is kind enough to try it and let me know, I'm quite curious.  :P

Being an Assembly beginner, it could be that I've messed up things without
realizing it.  :lol

Frank

Mind is like a parachute. You know what to do in order to use it :-)
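
For anyone who wants to try the dword-XOR idea outside MASM, here is a minimal C sketch of the same technique. It is not frktons' code: the function name equal_dwords is hypothetical, and it assumes the length is known in advance and is a multiple of 4, just as the 125-pass loop assumes exactly 1000 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the first n bytes of a and b are
   equal, 0 otherwise. Compares 4 bytes at a time with XOR. */
static int equal_dwords(const void *a, const void *b, size_t n)
{
    assert(n % 4 == 0);                      /* whole dwords only */

    const unsigned char *pa = a;
    const unsigned char *pb = b;

    for (size_t i = 0; i < n; i += 4) {
        uint32_t x, y;
        memcpy(&x, pa + i, 4);               /* safe unaligned 4-byte load */
        memcpy(&y, pb + i, 4);
        if (x ^ y)                           /* non-zero: at least one byte differs */
            return 0;
    }
    return 1;
}
```

Two points about the design: the routine never looks for a terminating zero, so it is only safe when both buffers are at least n bytes long, and it only answers "same or different", not which string sorts first. That makes it a different job from crt_strcmp or lstrcmp, so the cycle counts are not directly comparable.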