News:

MASM32 SDK Description, downloads and other helpful links
MASM32.com New Forum Link
masmforum WebSite

Checking whether a string contains another string

Started by uosunsetter, April 18, 2012, 11:30:01 PM

Previous topic - Next topic

SteveAsm

@ uosunsetter
I was looking at your code, trying to debug it and I decided it was easier to just rewrite it, since I knew what you were trying to do.
Maybe somebody can insert this as a replacement for MyTestCode and time it.

dedndave

hi Steve

timing code is pretty easy
i know Jochen uses some rather "advanced" conditional assembly techniques   :red

most of us use MichaelW's timers.asm file
it may be found in the first post of the first thread in the Laboratory sub-forum
http://www.masm32.com/board/index.php?topic=770.msg5281#msg5281
i have included the timers.asm file in the attachment, though

we generally place the timers.asm file in the \masm32\macros folder

here is a very simple example of timing some code...

jj2007

Quote from: SteveAsm on April 21, 2012, 02:38:44 AM
@ uosunsetter
I was looking at your code, trying to debug it and I decided it was easier to just rewrite it, since I knew what you were trying to do.
Maybe somebody can insert this as a replacement for MyTestCode and time it.


Doesn't even assemble without errors on my Masm32 installation. Could you rewrite it in the form used by the others? Thanks.

InstrSteve proc pString, pPattern
  .. code ..
  ret   ; return position in eax
InstrSteve endp

uosunsetter

Thanks dedndave!

Quote from: SteveAsm on April 21, 2012, 02:38:44 AM
@ uosunsetter
I was looking at your code, trying to debug it and I decided it was easier to just rewrite it, since I knew what you were trying to do.
Maybe somebody can insert this as a replacement for MyTestCode and time it.


Actually, as I said, on jj2007's benchmark attachments, he has the fixed version of my code. The errors simply caused by using lea instead of using mov and offset.

What I was doing:
lea esi, str1

What should have been:
mov esi, offset str1

After the replacement I got proper benchmark results which were quite disappointing(and normal :P).

jj2007

Quote from: uosunsetter on April 21, 2012, 09:52:29 AM
After the replacement I got proper benchmark results which were quite disappointing(and normal :P).

Your code is already faster than the Windows API StrStr version. Comment out two lines, and it's 100 cycles faster:

PrepareForNextComp:
   dec cl
   ;cmp cl, 0
   je notfound      ; cl=0: NOT FOUND
   mov edi, pPattern
   inc edx
   mov esi, edx
   mov bh, ch

Compare:
   mov al, [esi]
   mov ah, [edi]
   cmp al, ah
   jne PrepareForNextComp
   dec bh
;   cmp bh, 0 ;if done pPattern's len times, than found
   je found
   inc esi
   inc edi
   jmp Compare

uosunsetter

Ewww wouldn't it crash or loops infinetely if I remove them?

dedndave

no - the DEC instruction sets the Zero Flag if the result is zero - no need to compare with 0

also - if the upper 3 bytes of EBX and ECX are 0, then DEC EBX and DEC ECX are probably slightly faster
that's because DWORD operations are generally prefered on 32-bit processors
and - DEC CL/BL are 2 byte instructions - DEC ECX/EBX are 1 byte instructions

uosunsetter

I just needed that kind of informations! Very good tips. Also intresting but makes sense. I learned a lot from this topic  :U  :8)

dedndave

ok - another tip   :P

it seems that DEC ECX is a little slow on some processors
actually, the instruction is fast if you do not have to wait for the flags
        dec     ecx
        jz      notfound

the processor has to wait for the flags to be altered before it can decide which way to branch

        sub     ecx,1
        jz      notfound

may be slightly faster, even though SUB ECX,1 is 3 bytes

another way around it is to put some other instruction in there that does not affect the flags
in this code, you don't have one to stick in there   :P
but, to illustrate...
        dec     ecx
        mov     esi,123456
        jnz     SomeLabel

is probably faster than
        mov     esi,123456
        dec     ecx
        jnz     SomeLabel

the processor can do something else while it waits for the flags to be set

FORTRANS

Hi,

   As Dave says, DEC and INC will set the Zero Flag properly.
But unlike ADD and SUB they do not affect the Carry Flag.
So you cannot use all of the conditioal jumps with INC and DEC,
while you can with ADD and SUB.  Just a heads up.

$0.02,

Steve N.

uosunsetter

Quotethe processor can do something else while it waits for the flags to be set

It makes sense. This is, however, a bit confusing for me. I always thought that only one thing can be done at a time on CPU. Again I thought that maybe there as some exceptions on multi-threading; but as far as I know, even multi-threading only organize and put things in queue without blocking each other.

And 1 other thing that I've been always wondering about is waiting. All these stuff including waiting(sleep functions, or waiting for a flag etc) do they rapidly check for the condition to be satisfied?

dedndave

modern CPU's execute many instructions "out-of-order" to achieve speed
they look ahead to see what instructions may be executed without being affected by those before them
and, they execute code in multiple pipe-lines
a major part of optimizing code is understanding how the CPU executes instructions
with modern CPU's - that is not a simple thing - i am mystified by what is faster quite often - lol

as for the Sleep function - that is unrelated to code execution time
when your process (or program, or thread) is asleep, the CPU and operating system allow other threads to execute
a good example is waiting for a key to be pressed by the user
if i call the operating system function to get a key in a loop, it executes very fast
and - it hogs all the processor power waiting for a key   :P
so - we insert a call to the Sleep function to allow other threads to execute
i typically use something like 40 or 50 milliseconds
40 mS will accommodate a typist going 250 WPM   :bg
even faster, actually...
i write the loop so that, if a keypress is received, it does not Sleep before the next test to see if a key is present
only when no keypress is in the buffer do i call Sleep