News:

MASM32 SDK Description, downloads and other helpful links
MASM32.com New Forum Link
masmforum WebSite

Compare two strings

Started by yvansoftware, March 30, 2010, 07:40:20 PM

Previous topic - Next topic

jj2007

Quote from: clive on June 17, 2010, 04:33:25 PM
Problem is, with strcmp() is that you usually don't know what the length of the string is, unless you actually check it first (at additional cost) or haul the metadata around with the string. Further the real strcmp() actually provides an equal, greater than, less than answer.

The two SSE2 variants, which have since been included in MasmBasic, do the null check; they return in eax the equal, greater than, less than info, and in edx the position where the first mismatch occurred.

Re algos used for timings:
Quoteinclude \masm32\macros\timers.asm         ; get them from the Masm32 Laboratory

clive

Quote from: jj2007
The two SSE2 variants, which have since been included in MasmBasic, do the null check; they return in eax the equal, greater than, less than info, and in edx the position where the first mismatch occurred.

Indeed, but the routines that Frank wrote do not, and in the context of the quote they further assume the length of the string, and that the strings are of equal length. In the general solution you can't make such assumptions, or you would be using memcmp().
It could be a random act of randomness. Those happen a lot as well.

jj2007

Quote from: clive on June 17, 2010, 05:37:00 PM
Quote from: jj2007
The two SSE2 variants, which have since been included in MasmBasic, do the null check; they return in eax the equal, greater than, less than info, and in edx the position where the first mismatch occurred.

Indeed, but the routines that Frank wrote do not, and in the context of the quote they further assume the length of the string, and that the strings are of equal length. In the general solution you can't make such assumptions, or you would be using memcmp().

For the sake of fairness, I should add that Lingo's algo does the null check, although it's not mentioned in the output.

Re assuming equal length: The shorter string would trigger a mismatch, since the other one is non-zero; the real danger is that in 99% of all cases that will give the false impression that the algo works fine. Until you run into that rare case where you load two equal short files into a VirtualAlloc'ed buffer, and bang, access violation.

frktons

Very well indeed.  :U
When I'll be able to understand all the stuff you are talking about
I'll be probably able to solve these problems as well, I hope at least.  :P

So far I'll be happy just to understand why, what and when to ALIGN data
and how to do it, I mean how and where I declare  the ALIGN directive?

Thanks

Frank
Mind is like a parachute. You know what to do in order to use it :-)

clive

Quote from: jj2007
Re assuming equal length: The shorter string would trigger a mismatch, since the other one is non-zero; the real danger is that in 99% of all cases that will give the false impression that the algo works fine. Until you run into that rare case where you load two equal short files into a VirtualAlloc'ed buffer, and bang, access violation.

Plus the potential for byte(s) beyond the NUL to have any value, which would could potentially mean a pair of 2 character string which are otherwise identical to be miscompared if they were examined four bytes (or 8, whatever) at a time.


str1  db 'it',0,'x'
str2  db 'it',0,'y'

It could be a random act of randomness. Those happen a lot as well.

qWord

Quote from: clive on June 17, 2010, 07:14:54 PMPlus the potential for byte(s) beyond the NUL to have any value, which would could potentially mean a pair of 2 character string which are otherwise identical to be miscompared if they were examined four bytes (or 8, whatever) at a time.


str1  db 'it',0,'x'
str2  db 'it',0,'y'


jj is using bsf for finding the zero byte. Bytes following the termination zero are ignored (AFAIKS)
FPU in a trice: SmplMath
It's that simple!

clive

Quote from: frktons
So far I'll be happy just to understand why, what and when to ALIGN data
and how to do it, I mean how and where I declare  the ALIGN directive?

You use the ALIGN directive immediately before the data/code you want to fall on a specific alignment boundary. MASM will insert padding to achieve this based on the current address ($).

If you use .486, the alignment for sections will be 16 bytes, so you can't use ALIGN 32 for example. You could explicitly define the segments and give them page alignment. The linker and the order of the objects linked, will impact where the code actually ends up, MASM passes down this alignment information, but the first object to be linked is the one that the linker uses.

The alignment relates to cache line boundaries (the width of data fetched into cache in a single operation, and other operations internal to the CPU), depends on CPU, but often 32 bytes wide.

Aligining branch targets on 16-byte boundaries helps with the branch prediction, and how the CPU tracks recent branches taken/not-taken.

Data should be aligned on at least 4-byte boundaries, or the data bus width of the processor. If a whole string fits in a single cache line, all the better. There is a penalty if they cross a cache line, Intel tends to be worse than AMD.

Other gains can be had by performing the bulk of 4-byte accesses on 4-byte boundaries, so library routines will often check for this and handle unaligned bytes first.
It could be a random act of randomness. Those happen a lot as well.

clive

Quote from: qWord
he use bsf for finding the zero byte. Bytes following the termination zero are ignored (AFAIKS)

Not suggesting this is a flaw in all implementations, but should be part of a valid test suite.
It could be a random act of randomness. Those happen a lot as well.

jj2007

Quote from: qWord on June 17, 2010, 07:32:39 PM
jj is using bsf for finding the zero byte. Bytes following the termination zero are ignored (AFAIKS)
Correct. It works as it should.

frktons

I've tried with 2600 bytes strings, and the proportion of performance
on my pc, with my original routine vs Lingo's one is still 2.5:1 more
or less.


Intel(R) Core(TM)2 CPU          6600  @ 2.40GHz (SSE4)
3268    cycles for LingoCMP, 2600 bytes
1331    cycles for frktonsCMP , 2600 bytes

--- ok ---


I wander why with the modification of Jochen the proportion changes
so much.  ::)
New version included,  with the ALIGN suggested.  :P
Mind is like a parachute. You know what to do in order to use it :-)

Ghandi

Frank, can i ask a question mate?

You set EBX to 325 and then decrement by one on each iteration, right? With a multiple of 4, this gives us 1300 bytes scanned, not the 2600 which you've logged, please correct me if i'm wrong?

Also, you are using a fixed length compare, to make it a fair comparison you would need to remove the zero check from Lingo's code and make it so it knew exactly how long the array was to compare against. Keep working on this though, its always good to see people trying new ideas.

HR,
Ghandi

clive

#71
Quote from: frktons
I wander why with the modification of Jochen the proportion changes

I think the word you are looking for is wonder, or perhaps not, but I always imagine someone walking around.<G>

http://dictionary.reference.com/browse/wander
"to  think  or  speak  confusedly  or  incoherently."

http://dictionary.reference.com/browse/wonder
"to  think  or  speculate  curiously"
It could be a random act of randomness. Those happen a lot as well.

clive

Quote from: Ghandi
You set EBX to 325 and then decrement by one on each iteration, right? With a multiple of 4, this gives us 1300 bytes scanned, not the 2600 which you've logged, please correct me if i'm wrong?

8 bytes per iteration
It could be a random act of randomness. Those happen a lot as well.

lingo

Take a look that cmp is faster than xor instruction... :wink
Intel(R) Core(TM)2 Duo CPU     E8500  @ 3.16GHz (SSE4)
3274    cycles for LingoCMP, 2600 bytes
1318    cycles for NewfrktonsWithoutXOR  , 2600 bytes
1334    cycles for frktonsCMP , 2600 bytes

--- ok ---

frktons

Quote from: clive on June 17, 2010, 09:51:11 PM
Quote from: frktons
I wander why with the modification of Jochen the proportion changes

I think the word you are looking for is wonder, or perhaps not, but I always imaging someone walking around.<G>

http://dictionary.reference.com/browse/wander
"to  think  or  speak  confusedly  or  incoherently."

http://dictionary.reference.com/browse/wonder
"to  think  or  speculate  curiously"

Yes Clive, I probably should use wonder; bear with me, this is
my first semester in Assembly, C language and English language as well  :P

wander is a bit more lunatic than I usually am  :lol

I use wander in a figurative way, like my mind going around
without knowing what to do or how to interpret things.
But I really don't know when or where I got that meaning.  ::)
Maybe this translation:
to  ramble  without  a  definite  purpose  or  objective;  roam,  rove,  or  stray:  to  wander  over  the  earth.  is close to the meaning I intend.

Thanks for your corrections in all the fields and the explanations and
examples you give me.  :U
Mind is like a parachute. You know what to do in order to use it :-)