What is the most effective way

Started by zemtex, December 20, 2010, 05:08:11 PM


zemtex

What is the most effective way to compare a null-terminated string in a STRUCT against a null-terminated string that is not in a STRUCT?

Can anyone give me a code example? Unfortunately it's been a while since I've coded assembly and I have to refresh my memory.

Let's say the string is located 32 bytes into a struct and is 256 bytes long, with the last byte being a zero. Can anyone give me an efficient piece of code for this? I'm not so much interested in the code itself; I just want to see the logic and philosophy behind it.
I have been puzzling with lego bricks all my life. I know how to do this. When Peter, at age 6 is competing with me, I find it extremely neccessary to show him that I can puzzle bricks better than him, because he is so damn talented that all that is called rational has gone haywire.

raymond

The CMPS instruction would use the ESI and EDI registers for pointing to the two strings needing to be compared. The ECX register must also contain the exact number of items to be compared if the REP prefix is also used.

Load one of those two registers with the memory address of one of the strings and the other register with the memory address of the other string. The fact that one string may reside within a struct is totally immaterial for the comparison.

             ;ESI and EDI point to the two strings, ECX holds the number of bytes to compare
repz cmpsb   ;to compare bytes
jz   identical
...   ;the two strings are NOT identical
      ;the remainder in ECX is the count of items following the first item found to be different

identical:
...
When you assume something, you risk being wrong half the time
http://www.ray.masmcode.com
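
The point that the struct is immaterial can be sketched in C (used here only to show the logic, not as a replacement for the CMPSB code above). The `record` layout is an assumption taken from the question (32 bytes of other data, then a 256-byte string buffer), and the names `record`, `strings_identical` and `record_name_is` are hypothetical. The comparison routine only ever sees two addresses, and the struct member is simply the base address plus a fixed offset.

```c
#include <stddef.h>

/* Hypothetical record: 32 bytes of other data followed by a 256-byte
   buffer holding a null-terminated string (layout assumed from the question). */
struct record {
    unsigned char header[32];
    char          name[256];
};

/* Byte-by-byte comparison of two null-terminated strings.
   Returns 1 if they are identical, 0 otherwise. */
int strings_identical(const char *a, const char *b)
{
    while (*a == *b) {
        if (*a == '\0')
            return 1;   /* both terminators reached at the same position */
        ++a;
        ++b;
    }
    return 0;           /* first differing byte found */
}

/* r->name is just (address of r) + offsetof(struct record, name);
   the comparison itself does not care where the bytes live. */
int record_name_is(const struct record *r, const char *other)
{
    return strings_identical(r->name, other);
}
```

Unlike REPZ CMPSB, this loop needs no count up front because it stops at the terminator; the trade-off is one extra test per byte.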

dedndave

Ray has the right idea - CMPSB is the simple way

there are more complex algos for this that are faster - use the forum search tool to find them
some use SSE instructions that may not run on older machines, but are quite fast

a lot comes into play, here - such as address alignment and string length

if you have very long strings and/or are going to compare strings thousands of times in a program, then it may be worth the effort
but for typical use, CMPSB is simple and reasonably fast

whether the string is in a structure or not makes little difference

ToutEnMasm


If it is to compare two words, get the addresses of the two and use lstrcmp or lstrcmpi.

lea edx,mastructure.chain
invoke lstrcmpi,edx,addr anotherchain ;text compare
.if eax == 0
   ;the two strings are the same
.endif

MichaelW

zemtex,

Whether or not one of the strings is in a structure should make no difference – you just need the addresses. Is the comparison supposed to be case-sensitive, or case-insensitive? Do the strings have a known length, or do you need to determine the lengths from the positions of the null terminators?
eschew obfuscation

zemtex

Case insensitive. ATM I am using "szCmpi proc src:DWORD, dst:DWORD, ln:DWORD" from the included Masm32 library

Thanks raymond.

oex

You can use Cmpi instead if you don't already have the string length.

I.e., instead of calling szLen, Src1 and then szCmpi, Src1, Src2, Len, just call Cmpi, Src1, Src2.
We are all of us insane, just to varying degrees and intelligently balanced through networking

http://www.hereford.tv
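
The difference between a comparison that needs a length and one that stops at the terminator can be sketched in C. This is a minimal illustration of the two approaches, not the MASM32 library source: the names `cmpi_counted` and `cmpi_terminated`, the ASCII-only folding, and the return convention (0 for equal, 1 for different) are all assumptions made for the example.

```c
#include <stddef.h>

/* ASCII-only case folding: 'A'..'Z' become 'a'..'z', all else unchanged. */
static unsigned char fold_ascii(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? (unsigned char)(c + 32) : c;
}

/* Length-driven: the caller supplies how many bytes to compare,
   so a separate pass to measure the string may be needed first.
   Returns 0 if the first len bytes match, 1 otherwise. */
int cmpi_counted(const char *a, const char *b, size_t len)
{
    for (size_t i = 0; i < len; ++i) {
        if (fold_ascii((unsigned char)a[i]) != fold_ascii((unsigned char)b[i]))
            return 1;
    }
    return 0;
}

/* Terminator-driven: a single pass that stops at the first
   difference or at the shared null terminator.
   Returns 0 if the strings match, 1 otherwise. */
int cmpi_terminated(const char *a, const char *b)
{
    for (;; ++a, ++b) {
        unsigned char ca = fold_ascii((unsigned char)*a);
        unsigned char cb = fold_ascii((unsigned char)*b);
        if (ca != cb)
            return 1;
        if (ca == 0)
            return 0;
    }
}
```

One thing to watch with the counted form: if the count excludes the terminator, "abc" and "ABCD" compared over 3 bytes look equal. Including the terminator in the count avoids that.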

MichaelW

My quick search of the forum did not turn up any case-insensitive string comparison optimization "contests". Running on my P3, Cmpi is somewhat faster than szCmpi, and at longer string lengths the CRT _stricmp function is faster than either. In the time I had to work on this I could not find any way to test RtlCompareString.

;====================================================================
    include \masm32\include\masm32rt.inc
    .686
    include \masm32\macros\timers.asm
;====================================================================
    .data
        str1 db "my other brother darryl",0
        str2 db 4 dup("my other brother darryl"),0
        str3 db 8 dup("my other brother darryl"),0
    .code
;====================================================================
start:
;====================================================================

    invoke Sleep, 3000

    FOR string,<str1,str2,str3>

        print str$(SIZEOF string)," bytes",13,10

        counter_begin 10000, HIGH_PRIORITY_CLASS
            invoke szCmpi, ADDR string, ADDR string, SIZEOF string
        counter_end
        print str$(eax)," cycles, szCmpi",13,10

        counter_begin 10000, HIGH_PRIORITY_CLASS
            invoke Cmpi, ADDR string, ADDR string
        counter_end
        print str$(eax)," cycles, Cmpi",13,10

        counter_begin 10000, HIGH_PRIORITY_CLASS
            invoke crt__stricmp, ADDR string, ADDR string
        counter_end
        print str$(eax)," cycles, crt__stricmp",13,10,13,10

    ENDM

    inkey "Press any key to exit..."
    exit

;====================================================================
end start


24 bytes
150 cycles, szCmpi
128 cycles, Cmpi
164 cycles, crt__stricmp

93 bytes
496 cycles, szCmpi
413 cycles, Cmpi
370 cycles, crt__stricmp

185 bytes
961 cycles, szCmpi
800 cycles, Cmpi
654 cycles, crt__stricmp


hutch--

This is the timing on my Core2 Quad.


24 bytes
121 cycles, szCmpi
99 cycles, Cmpi
98 cycles, crt__stricmp

93 bytes
488 cycles, szCmpi
394 cycles, Cmpi
252 cycles, crt__stricmp

185 bytes
948 cycles, szCmpi
762 cycles, Cmpi
439 cycles, crt__stricmp

Press any key to exit...
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

jj2007

24 bytes
47 cycles, MasmBasic
135 cycles, szCmpi
111 cycles, Cmpi
154 cycles, crt__stricmp

93 bytes
95 cycles, MasmBasic
535 cycles, szCmpi
403 cycles, Cmpi
366 cycles, crt__stricmp

185 bytes
143 cycles, MasmBasic
988 cycles, szCmpi
767 cycles, Cmpi
666 cycles, crt__stricmp

Correctness:
equal for MasmBasic
non-equal for szCmpi
non-equal for Cmpi
non-equal for crt__stricmp


Check the code to find out why the correctness test comes to different results.

FORTRANS

Hi,

   P3, Win2k.

Regards,

Steve N.

G:\WORK\TEMP>stricmp
24 bytes
47 cycles, MasmBasic
152 cycles, szCmpi
131 cycles, Cmpi
172 cycles, crt__stricmp

93 bytes
47 cycles, MasmBasic
503 cycles, szCmpi
423 cycles, Cmpi
384 cycles, crt__stricmp

185 bytes
67 cycles, MasmBasic
967 cycles, szCmpi
805 cycles, Cmpi
664 cycles, crt__stricmp

Correctness:
non-equal for MasmBasic ???
non-equal for szCmpi
non-equal for Cmpi
non-equal for crt__stricmp
Press any key to exit...

jj2007

Quote from: FORTRANS on December 21, 2010, 01:44:40 PM
Correctness:
non-equal for MasmBasic ???
non-equal for szCmpi
non-equal for Cmpi
non-equal for crt__stricmp
Press any key to exit...

Steve,
afaik the P3 is SSE1 only... sorry.

FORTRANS

Quote from: jj2007 on December 21, 2010, 01:53:21 PM
afaik the P3 is SSE1 only... sorry.

Hi,

   Right, probably should not have used "???".  Just noted
for completeness...  Does MasmBasic use an initialization
routine?

Regards,

Steve

jj2007

Yes it does, but I didn't use it in this example:
  db 0Fh, 0A2h ; cpuid 1
  bt edx, 26 ; edx bit 26, SSE2
  .if !Carry?
    invoke MessageBox, 0, chr$("Sorry, MasmBasic needs SSE2"), 0, MB_OK
    invoke ExitProcess, -99
  .endif

Triggered when user starts code with Init (recommended), or uses Let, Print etc.

I am more puzzled with the treatment of Umlauts:

    if 1
       sLow db "thüs is ä töst", 0
       sUpp db "THÜS IS Ä TÖST", 0
    else
       sLow db "this is a test", 0
       sUpp db "THIS IS A TEST", 0
    endif
IMHO a case-insensitive string comparison routine should declare the first two as "equal". That is what MasmBasic does. However, the other three candidates claim they are different... where is Angie when the German language has to be defended...???
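
Why the results can differ is easy to show with a small C sketch. It assumes the strings are stored in a single-byte Windows-1252 / Latin-1 encoding (so ü is 0FCh and Ü is 0DCh); the function names are hypothetical, and this is not how any of the tested routines are actually implemented. A routine that only folds A..Z leaves the umlauts untouched, while one that also folds the range 0C0h..0DEh treats them as equal.

```c
/* Folds only 'A'..'Z'; bytes above 7Fh pass through unchanged. */
unsigned char fold_ascii_only(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? (unsigned char)(c + 32) : c;
}

/* Also folds the Latin-1 uppercase range 0xC0..0xDE to 0xE0..0xFE,
   skipping 0xD7 (the multiplication sign, which has no lowercase form). */
unsigned char fold_latin1(unsigned char c)
{
    if (c >= 'A' && c <= 'Z')
        return (unsigned char)(c + 32);
    if (c >= 0xC0 && c <= 0xDE && c != 0xD7)
        return (unsigned char)(c + 32);
    return c;
}

typedef unsigned char (*fold_fn)(unsigned char);

/* Returns 1 if the strings are equal after folding, 0 otherwise. */
int equal_folded(const char *a, const char *b, fold_fn fold)
{
    for (;; ++a, ++b) {
        unsigned char ca = fold((unsigned char)*a);
        unsigned char cb = fold((unsigned char)*b);
        if (ca != cb)
            return 0;
        if (ca == 0)
            return 1;
    }
}
```

With the two umlaut strings above encoded as single bytes, `equal_folded` with `fold_latin1` reports them equal, and with `fold_ascii_only` it reports them different.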

MichaelW

For the strings with the umlauts, _stricmp returns 1, but if I first define the locale with:

LC_ALL equ 0
invoke crt_setlocale, LC_ALL, chr$("English")


Then it returns 0, as it should. I didn't test any of the other possible parameters for setlocale, but with this one the cost is that _stricmp slows down considerably:

24 bytes
150 cycles, szCmpi
128 cycles, Cmpi
1687 cycles, crt__stricmp

93 bytes
498 cycles, szCmpi
406 cycles, Cmpi
6325 cycles, crt__stricmp

185 bytes
959 cycles, szCmpi
873 cycles, Cmpi
12510 cycles, crt__stricmp


http://msdn.microsoft.com/en-us/library/k59z8dwe(v=VS.71).aspx
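
The locale dependence can be seen in isolation with a small C sketch built on the standard `setlocale` and `tolower`. The helper name `fold_in_locale` is hypothetical. Locale names such as "English" are specific to the Microsoft CRT and will not be accepted everywhere, so the helper reports failure instead of assuming the locale exists.

```c
#include <ctype.h>
#include <locale.h>

/* Returns tolower(c) under the named LC_CTYPE locale, or -1 if that
   locale cannot be selected. The previous locale is not restored,
   to keep the sketch short. */
int fold_in_locale(const char *locale_name, unsigned char c)
{
    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;
    return tolower(c);
}
```

In the default "C" locale only A..Z are folded, so `fold_in_locale("C", 0xDC)` hands back 0xDC unchanged, which matches the "different" result above. What a non-"C" locale does with that byte depends on the platform and on the code page the locale selects.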
