The MASM Forum Archive 2004 to 2012

General Forums => The Workshop => Topic started by: zemtex on December 20, 2010, 05:08:11 PM

Title: What is the most effective way
Post by: zemtex on December 20, 2010, 05:08:11 PM
What is the most effective way to compare a null terminated string in a STRUCT up against a null terminated string that is not based in a STRUCT.

Can anyone give me a code example. Unfortunately its been awhile since i've coded assembly and I have to refresh my mind again.

Let's say the string is located 32 bytes into a struct and is 256 bytes long, the last one is a zero. Can anyone give me an efficient piece of code for this, I'm not interested in the code itself, I just want to see the logic and philosophy behind it.
Title: Re: What is the most effective way
Post by: raymond on December 20, 2010, 06:02:30 PM
The CMPS instruction would use the ESI and EDI registers for pointing to the two strings needing to be compared. The ECX register must also contain the exact number of items to be compared if the REP prefix is also used.

Load one of those two registers with the memory address of one of the strings and the other register with the memory address of the other string. The fact that one string may reside within a struct is totally immaterial for the comparison.

repz cmpsb   ;to compare bytes
jz   identical
...   ;the two strings are NOT identical
      ;the remainder in ECX is the count of items following the first item found different

identical:
...
Title: Re: What is the most effective way
Post by: dedndave on December 20, 2010, 06:15:38 PM
Ray has the right idea - CMPSB is the simple way

there are more complex algo's for this, that are faster - use the forum search tool to find them
some use SSE instructions that may not run on older machines, but are quite fast

a lot comes into play, here - such as address alignment and string length

if you have very long strings and/or are going to compare strings thousands of times in a program, then it may be worth the effort
but for typical use, CMPSB is simple and reasonably fast

as for whether the string is in a structure or not makes little difference
Title: Re: What is the most effective way
Post by: ToutEnMasm on December 20, 2010, 06:21:29 PM

If it is to compare two words,get the adress of the two and use lstrcmp ou lstrcmpi.
Quote
lea edx,mastructure.chain
invoke lstrcmpi,edx,addr anotherchain ;text compare
.if eax == 0
   ;the two chains are the same
.endif
Title: Re: What is the most effective way
Post by: MichaelW on December 20, 2010, 06:29:01 PM
zemtex,

Whether or not one of the strings is in a structure should make no difference – you just need the addresses. Is the comparison supposed to be case-sensitive, or case-insensitive? Do the strings have a known length, or do you need to determine the lengths from the positions of the null terminators?
Title: Re: What is the most effective way
Post by: zemtex on December 20, 2010, 07:53:04 PM
Case insensitive. ATM I am using "szCmpi proc src:DWORD, dst:DWORD, ln:DWORD" from the included Masm32 library

Thanks raymond.
Title: Re: What is the most effective way
Post by: oex on December 20, 2010, 08:23:22 PM
You can use Cmpi instead if you dont already have the string length

ie instead of szLen, Src1 then szCmpi, Src1, Src2, Len use just Cmpi, Src1, Src2
Title: Re: What is the most effective way
Post by: MichaelW on December 20, 2010, 09:54:35 PM
My quick search of the forum did not turn up any case-insensitive string comparison optimization "contests". Running on my P3 Cmpi is somewhat faster than szCmpi, and at longer string lengths the CRT _stricmp function is faster than either. In the time I had to work on this I could not find any way to test RtlCompareString.

;====================================================================
    include \masm32\include\masm32rt.inc
    .686
    include \masm32\macros\timers.asm
;====================================================================
    .data
        str1 db "my other brother darryl",0
        str2 db 4 dup("my other brother darryl"),0
        str3 db 8 dup("my other brother darryl"),0
    .code
;====================================================================
start:
;====================================================================

    invoke Sleep, 3000

    FOR string,<str1,str2,str3>

        print str$(SIZEOF string)," bytes",13,10

        counter_begin 10000, HIGH_PRIORITY_CLASS
            invoke szCmpi, ADDR string, ADDR string, SIZEOF string
        counter_end
        print str$(eax)," cycles, szCmpi",13,10

        counter_begin 10000, HIGH_PRIORITY_CLASS
            invoke Cmpi, ADDR string, ADDR string
        counter_end
        print str$(eax)," cycles, Cmpi",13,10

        counter_begin 10000, HIGH_PRIORITY_CLASS
            invoke crt__stricmp, ADDR string, ADDR string
        counter_end
        print str$(eax)," cycles, crt__stricmp",13,10,13,10

    ENDM

    inkey "Press any key to exit..."
    exit

;====================================================================
end start


24 bytes
150 cycles, szCmpi
128 cycles, Cmpi
164 cycles, crt__stricmp

93 bytes
496 cycles, szCmpi
413 cycles, Cmpi
370 cycles, crt__stricmp

185 bytes
961 cycles, szCmpi
800 cycles, Cmpi
654 cycles, crt__stricmp

Title: Re: What is the most effective way
Post by: hutch-- on December 20, 2010, 10:12:46 PM
This is the timing on my Core2 Quad.


24 bytes
121 cycles, szCmpi
99 cycles, Cmpi
98 cycles, crt__stricmp

93 bytes
488 cycles, szCmpi
394 cycles, Cmpi
252 cycles, crt__stricmp

185 bytes
948 cycles, szCmpi
762 cycles, Cmpi
439 cycles, crt__stricmp

Press any key to exit...
Title: Re: What is the most effective way
Post by: jj2007 on December 21, 2010, 08:44:00 AM
24 bytes
47 cycles, MasmBasic
135 cycles, szCmpi
111 cycles, Cmpi
154 cycles, crt__stricmp

93 bytes
95 cycles, MasmBasic
535 cycles, szCmpi
403 cycles, Cmpi
366 cycles, crt__stricmp

185 bytes
143 cycles, MasmBasic
988 cycles, szCmpi
767 cycles, Cmpi
666 cycles, crt__stricmp

Correctness:
equal for MasmBasic
non-equal for szCmpi
non-equal for Cmpi
non-equal for crt__stricmp


Check the code to find out why the correctness test comes to different results.
Title: Re: What is the most effective way
Post by: FORTRANS on December 21, 2010, 01:44:40 PM
Hi,

   P3, Win2k.

Regards,

Steve N.

G:\WORK\TEMP>stricmp
24 bytes
47 cycles, MasmBasic
152 cycles, szCmpi
131 cycles, Cmpi
172 cycles, crt__stricmp

93 bytes
47 cycles, MasmBasic
503 cycles, szCmpi
423 cycles, Cmpi
384 cycles, crt__stricmp

185 bytes
67 cycles, MasmBasic
967 cycles, szCmpi
805 cycles, Cmpi
664 cycles, crt__stricmp

Correctness:
non-equal for MasmBasic ???
non-equal for szCmpi
non-equal for Cmpi
non-equal for crt__stricmp
Press any key to exit...
Title: Re: What is the most effective way
Post by: jj2007 on December 21, 2010, 01:53:21 PM
Quote from: FORTRANS on December 21, 2010, 01:44:40 PM
Correctness:
non-equal for MasmBasic ???
non-equal for szCmpi
non-equal for Cmpi
non-equal for crt__stricmp
Press any key to exit...

Steve,
afaik the P3 is SSE1 only... sorry.
Title: Re: What is the most effective way
Post by: FORTRANS on December 21, 2010, 02:14:37 PM
Quote from: jj2007 on December 21, 2010, 01:53:21 PM
afaik the P3 is SSE1 only... sorry.

Hi,

   Right, probably should not have used "???".  Just noted
for completeness...  Does MasmBasic use an initialization
routine?

Regards,

Steve
Title: Re: What is the most effective way
Post by: jj2007 on December 21, 2010, 02:35:13 PM
Yes it does, but I didn't use it in this example:
  db 0Fh, 0A2h ; cpuid 1
  bt edx, 26 ; edx bit 26, SSE2
  .if !Carry?
invoke MessageBox, 0, chr$("Sorry, MasmBasic needs SSE2"), 0, MB_OK
invoke ExitProcess, -99
  .endif

Triggered when user starts code with Init (recommended), or uses Let, Print etc.

I am more puzzled with the treatment of Umlauts:

Quoteif 1
       sLow db "thüs is ä töst", 0
       sUpp db "THÜS IS Ä TÖST", 0
    else
       sLow db "this is a test", 0
       sUpp db "THIS IS A TEST", 0
    endif
IMHO a case-insensitive string comparison routine should declare the first two as "equal". That is what MasmBasic does. However, the other three candidates claim they are different... where is Angie when the German language has to be defended...???
Title: Re: What is the most effective way
Post by: MichaelW on December 21, 2010, 03:58:07 PM
For the strings with the umlauts, _stricmp returns 1, but if I first define the locale with:

LC_ALL equ 0
invoke crt_setlocale, LC_ALL, chr$("English")


Then it returns 0, as it should. I didn't test any of the other possible parameters for setlocale, but with these the cost is that _stricmp slows down:

24 bytes
150 cycles, szCmpi
128 cycles, Cmpi
1687 cycles, crt__stricmp

93 bytes
498 cycles, szCmpi
406 cycles, Cmpi
6325 cycles, crt__stricmp

185 bytes
959 cycles, szCmpi
873 cycles, Cmpi
12510 cycles, crt__stricmp


http://msdn.microsoft.com/en-us/library/k59z8dwe(v=VS.71).aspx

Title: Re: What is the most effective way
Post by: jj2007 on December 21, 2010, 08:39:02 PM
Interesting that a CRT algo understands German umlauts when set to "English" :wink

Intel(R) Celeron(R) M CPU        420  @ 1.60GHz (SSE3)
24 bytes
47 cycles, MasmBasic
135 cycles, szCmpi
111 cycles, Cmpi
1063 cycles, crt__stricmp

93 bytes
93 cycles, MasmBasic
535 cycles, szCmpi
405 cycles, Cmpi
4040 cycles, crt__stricmp

185 bytes
144 cycles, MasmBasic
993 cycles, szCmpi
767 cycles, Cmpi
8010 cycles, crt__stricmp

Correctness:

sLow db "thüs is ä töst with èéìàù", 0
sUpp db "THÜS IS Ä TÖST WITH ÈÉÌÀÙ", 0

equal for MasmBasic
non-equal for szCmpi
non-equal for Cmpi
equal for crt__stricmp


szCmpi and Cmpi suffer from the same problem - this table:
      szCmpi_tbl \
      db   0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15
      db  16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31
      db  32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47
      db  48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63
      db  64, 97, 98, 99,100,101,102,103,104,105,106,107,108,109,110,111
      db 112,113,114,115,116,117,118,119,120,121,122, 91, 92, 93, 94, 95
      db  96, 97, 98, 99,100,101,102,103,104,105,106,107,108,109,110,111
      db 112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127
      db 128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143
      db 144,145,146,147,148,149,150,151,152,153,154,155,156,156,158,159
      db 160,161,162,163,164,165,166,167,168,169,170,171,172,173,173,175
      db 176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191
      db 192,193,194,195,196,197,198,199,200,201,202,203,204,205,206,207
      db 208,209,210,211,212,213,214,215,216,217,218,219,220,221,222,223
      db 224,225,226,227,228,229,230,231,232,233,234,235,236,237,238,239
      db 240,241,242,243,244,245,246,247,248,249,250,251,252,253,254,255
Title: Re: What is the most effective way
Post by: hutch-- on December 21, 2010, 09:22:10 PM
> szCmpi and Cmpi suffer from the same problem

They use a user specified replacement table so the user CAN specify the replacement table.  :bg
Title: Re: What is the most effective way
Post by: jj2007 on December 22, 2010, 12:07:21 AM
Well, I must have missed that part of the Cmpi documentation :bg

Anyway, it could be done with a generated table that behaves as it should:

Intel(R) Celeron(R) M CPU        420  @ 1.60GHz (SSE3)
24 bytes
46 cycles, MasmBasic
77 cycles, CmpiJJ
168 cycles, szCmpi
112 cycles, Cmpi
1067 cycles, crt__stricmp

93 bytes
93 cycles, MasmBasic
299 cycles, CmpiJJ
531 cycles, szCmpi
404 cycles, Cmpi
4026 cycles, crt__stricmp

185 bytes
144 cycles, MasmBasic
576 cycles, CmpiJJ
978 cycles, szCmpi
769 cycles, Cmpi
7983 cycles, crt__stricmp

Correctness (expected: equal):
equal for MasmBasic
equal for CmpiJJ
non-equal for szCmpi
non-equal for Cmpi
equal for crt__stricmp

Code size:
310      bytes for Cmpi
152      bytes for CmpiJJ


Half as long, correct results, and a bit faster. And it's not even SSE2...  :wink
Title: Re: What is the most effective way
Post by: jj2007 on December 22, 2010, 09:45:21 PM
That algo of mine was eccessively bloated, apologies. Here is a version that takes 8 paras (8*16=128 bytes).
Timings exactly as before.