What is the most effective way to compare a null terminated string in a STRUCT up against a null terminated string that is not based in a STRUCT.
Can anyone give me a code example. Unfortunately its been awhile since i've coded assembly and I have to refresh my mind again.
Let's say the string is located 32 bytes into a struct and is 256 bytes long, the last one is a zero. Can anyone give me an efficient piece of code for this, I'm not interested in the code itself, I just want to see the logic and philosophy behind it.
The CMPS instruction would use the ESI and EDI registers for pointing to the two strings needing to be compared. The ECX register must also contain the exact number of items to be compared if the REP prefix is also used.
Load one of those two registers with the memory address of one of the strings and the other register with the memory address of the other string. The fact that one string may reside within a struct is totally immaterial for the comparison.
repz cmpsb ;to compare bytes
jz identical
... ;the two strings are NOT identical
;the remainder in ECX is the count of items following the first item found different
identical:
...
Ray has the right idea - CMPSB is the simple way
there are more complex algo's for this, that are faster - use the forum search tool to find them
some use SSE instructions that may not run on older machines, but are quite fast
a lot comes into play, here - such as address alignment and string length
if you have very long strings and/or are going to compare strings thousands of times in a program, then it may be worth the effort
but for typical use, CMPSB is simple and reasonably fast
as for whether the string is in a structure or not makes little difference
If it is to compare two words,get the adress of the two and use lstrcmp ou lstrcmpi.
Quote
lea edx,mastructure.chain
invoke lstrcmpi,edx,addr anotherchain ;text compare
.if eax == 0
;the two chains are the same
.endif
zemtex,
Whether or not one of the strings is in a structure should make no difference – you just need the addresses. Is the comparison supposed to be case-sensitive, or case-insensitive? Do the strings have a known length, or do you need to determine the lengths from the positions of the null terminators?
Case insensitive. ATM I am using "szCmpi proc src:DWORD, dst:DWORD, ln:DWORD" from the included Masm32 library
Thanks raymond.
You can use Cmpi instead if you dont already have the string length
ie instead of szLen, Src1 then szCmpi, Src1, Src2, Len use just Cmpi, Src1, Src2
My quick search of the forum did not turn up any case-insensitive string comparison optimization "contests". Running on my P3 Cmpi is somewhat faster than szCmpi, and at longer string lengths the CRT _stricmp function is faster than either. In the time I had to work on this I could not find any way to test RtlCompareString.
;====================================================================
include \masm32\include\masm32rt.inc
.686
include \masm32\macros\timers.asm
;====================================================================
.data
str1 db "my other brother darryl",0
str2 db 4 dup("my other brother darryl"),0
str3 db 8 dup("my other brother darryl"),0
.code
;====================================================================
start:
;====================================================================
invoke Sleep, 3000
FOR string,<str1,str2,str3>
print str$(SIZEOF string)," bytes",13,10
counter_begin 10000, HIGH_PRIORITY_CLASS
invoke szCmpi, ADDR string, ADDR string, SIZEOF string
counter_end
print str$(eax)," cycles, szCmpi",13,10
counter_begin 10000, HIGH_PRIORITY_CLASS
invoke Cmpi, ADDR string, ADDR string
counter_end
print str$(eax)," cycles, Cmpi",13,10
counter_begin 10000, HIGH_PRIORITY_CLASS
invoke crt__stricmp, ADDR string, ADDR string
counter_end
print str$(eax)," cycles, crt__stricmp",13,10,13,10
ENDM
inkey "Press any key to exit..."
exit
;====================================================================
end start
24 bytes
150 cycles, szCmpi
128 cycles, Cmpi
164 cycles, crt__stricmp
93 bytes
496 cycles, szCmpi
413 cycles, Cmpi
370 cycles, crt__stricmp
185 bytes
961 cycles, szCmpi
800 cycles, Cmpi
654 cycles, crt__stricmp
This is the timing on my Core2 Quad.
24 bytes
121 cycles, szCmpi
99 cycles, Cmpi
98 cycles, crt__stricmp
93 bytes
488 cycles, szCmpi
394 cycles, Cmpi
252 cycles, crt__stricmp
185 bytes
948 cycles, szCmpi
762 cycles, Cmpi
439 cycles, crt__stricmp
Press any key to exit...
24 bytes
47 cycles, MasmBasic
135 cycles, szCmpi
111 cycles, Cmpi
154 cycles, crt__stricmp
93 bytes
95 cycles, MasmBasic
535 cycles, szCmpi
403 cycles, Cmpi
366 cycles, crt__stricmp
185 bytes
143 cycles, MasmBasic
988 cycles, szCmpi
767 cycles, Cmpi
666 cycles, crt__stricmp
Correctness:
equal for MasmBasic
non-equal for szCmpi
non-equal for Cmpi
non-equal for crt__stricmp
Check the code to find out why the correctness test comes to different results.
Hi,
P3, Win2k.
Regards,
Steve N.
G:\WORK\TEMP>stricmp
24 bytes
47 cycles, MasmBasic
152 cycles, szCmpi
131 cycles, Cmpi
172 cycles, crt__stricmp
93 bytes
47 cycles, MasmBasic
503 cycles, szCmpi
423 cycles, Cmpi
384 cycles, crt__stricmp
185 bytes
67 cycles, MasmBasic
967 cycles, szCmpi
805 cycles, Cmpi
664 cycles, crt__stricmp
Correctness:
non-equal for MasmBasic ???
non-equal for szCmpi
non-equal for Cmpi
non-equal for crt__stricmp
Press any key to exit...
Quote from: FORTRANS on December 21, 2010, 01:44:40 PM
Correctness:
non-equal for MasmBasic ???
non-equal for szCmpi
non-equal for Cmpi
non-equal for crt__stricmp
Press any key to exit...
Steve,
afaik the P3 is SSE1 only... sorry.
Quote from: jj2007 on December 21, 2010, 01:53:21 PM
afaik the P3 is SSE1 only... sorry.
Hi,
Right, probably should not have used "???". Just noted
for completeness... Does MasmBasic use an initialization
routine?
Regards,
Steve
Yes it does, but I didn't use it in this example:
db 0Fh, 0A2h ; cpuid 1
bt edx, 26 ; edx bit 26, SSE2
.if !Carry?
invoke MessageBox, 0, chr$("Sorry, MasmBasic needs SSE2"), 0, MB_OK
invoke ExitProcess, -99
.endif
Triggered when user starts code with Init (recommended), or uses Let, Print etc.
I am more puzzled with the treatment of Umlauts:
Quoteif 1
sLow db "thüs is ä töst", 0
sUpp db "THÜS IS Ä TÖST", 0
else
sLow db "this is a test", 0
sUpp db "THIS IS A TEST", 0
endif
IMHO a case-insensitive string comparison routine should declare the first two as "equal". That is what MasmBasic does. However, the other three candidates claim they are different... where is Angie when the German language has to be defended...???
For the strings with the umlauts, _stricmp returns 1, but if I first define the locale with:
LC_ALL equ 0
invoke crt_setlocale, LC_ALL, chr$("English")
Then it returns 0, as it should. I didn't test any of the other possible parameters for setlocale, but with these the cost is that _stricmp slows down:
24 bytes
150 cycles, szCmpi
128 cycles, Cmpi
1687 cycles, crt__stricmp
93 bytes
498 cycles, szCmpi
406 cycles, Cmpi
6325 cycles, crt__stricmp
185 bytes
959 cycles, szCmpi
873 cycles, Cmpi
12510 cycles, crt__stricmp
http://msdn.microsoft.com/en-us/library/k59z8dwe(v=VS.71).aspx
Interesting that a CRT algo understands German umlauts when set to "English" :wink
Intel(R) Celeron(R) M CPU 420 @ 1.60GHz (SSE3)
24 bytes
47 cycles, MasmBasic
135 cycles, szCmpi
111 cycles, Cmpi
1063 cycles, crt__stricmp
93 bytes
93 cycles, MasmBasic
535 cycles, szCmpi
405 cycles, Cmpi
4040 cycles, crt__stricmp
185 bytes
144 cycles, MasmBasic
993 cycles, szCmpi
767 cycles, Cmpi
8010 cycles, crt__stricmp
Correctness:
sLow db "thüs is ä töst with èéìàù", 0
sUpp db "THÜS IS Ä TÖST WITH ÈÉÌÀÙ", 0
equal for MasmBasic
non-equal for szCmpi
non-equal for Cmpi
equal for crt__stricmp
szCmpi and Cmpi suffer from the same problem - this table:
szCmpi_tbl \
db 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
db 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31
db 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47
db 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63
db 64, 97, 98, 99,100,101,102,103,104,105,106,107,108,109,110,111
db 112,113,114,115,116,117,118,119,120,121,122, 91, 92, 93, 94, 95
db 96, 97, 98, 99,100,101,102,103,104,105,106,107,108,109,110,111
db 112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127
db 128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143
db 144,145,146,147,148,149,150,151,152,153,154,155,156,156,158,159
db 160,161,162,163,164,165,166,167,168,169,170,171,172,173,173,175
db 176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191
db 192,193,194,195,196,197,198,199,200,201,202,203,204,205,206,207
db 208,209,210,211,212,213,214,215,216,217,218,219,220,221,222,223
db 224,225,226,227,228,229,230,231,232,233,234,235,236,237,238,239
db 240,241,242,243,244,245,246,247,248,249,250,251,252,253,254,255
> szCmpi and Cmpi suffer from the same problem
They use a user specified replacement table so the user CAN specify the replacement table. :bg
Well, I must have missed that part of the Cmpi documentation :bg
Anyway, it could be done with a generated table that behaves as it should:
Intel(R) Celeron(R) M CPU 420 @ 1.60GHz (SSE3)
24 bytes
46 cycles, MasmBasic
77 cycles, CmpiJJ
168 cycles, szCmpi
112 cycles, Cmpi
1067 cycles, crt__stricmp
93 bytes
93 cycles, MasmBasic
299 cycles, CmpiJJ
531 cycles, szCmpi
404 cycles, Cmpi
4026 cycles, crt__stricmp
185 bytes
144 cycles, MasmBasic
576 cycles, CmpiJJ
978 cycles, szCmpi
769 cycles, Cmpi
7983 cycles, crt__stricmp
Correctness (expected: equal):
equal for MasmBasic
equal for CmpiJJ
non-equal for szCmpi
non-equal for Cmpi
equal for crt__stricmp
Code size:
310 bytes for Cmpi
152 bytes for CmpiJJ
Half as long, correct results, and a bit faster. And it's not even SSE2... :wink
That algo of mine was eccessively bloated, apologies. Here is a version that takes 8 paras (8*16=128 bytes).
Timings exactly as before.