News:

MASM32 SDK Description, downloads and other helpful links
MASM32.com New Forum Link
masmforum WebSite

Optimizing string procedures

Started by brethren, November 29, 2010, 03:30:39 PM

Previous topic - Next topic

brethren

I'm looking for a fast algorithm for searching a string for a single character, does anyone have any examples?

for example when searching a string for the null terminator you can search the string in dword increments and use an algorithm like this one.
After running the algo y contains 0 until a null is encountered (nulls are converted to 80h, perfect for using bsf)

    x= DWORD to examine
    y=(x & 0x7F7F7F7F) + 0x7F7F7F7F;
    y=~(y | x | 0x7F7F7F7F);


i'm looking for something similar but for arbitrary characters. Or some source examples for a fast StrChr routine would be great

dedndave

not sure that algo is all that fast  ;)

in any case, you are now looking for a variable, rather than a constant
and, you have to look for null, too - so you don't go past the end of the string

dedndave

looking again - i guess that algo does have some merit   :P

well - use the same algo
if you are looking for the first space, XOR EAX,20202020h - then all spaces are nulls  :bg

brethren


brethren

yes using a 32 bit mask is the way to go

<SNIP! LOGIC ERRORS>
its just a quickly thought out algo are there any errors you can see?

Tight_Coder_Ex

Eventhough SCAS B or D use 15 & 13 cycles respectively combined with REPNZ, would these be a viable alternative as pointer adjustment, character comparision and counter evaluation are all done with one instruction.  This assumes of course the purpose of the algo is to return a pointer.

brethren

i want the algorith to work just like C's strchr (just faster :P) so a pointer is what i need for a return value or NULL pointer if the char is not found. using the string instructions you'd need to scan a byte at a time (i think?) and thats what slows these type of algos down

jj2007

By far the fastest option is SSE2. Search the forum for pcmpeqb. Below the innermost loop of a stringlen algo.

@@: add edx, 16 ; next chunk, please
pcmpeqb xmm0, [edx] ; any nullbytes in there?
pmovmskb eax, xmm0 ; show them in eax
test eax, eax
jz @B


brethren

thanks for the hint jochen but i would rather stick with plain vanilla asm. that sse makes my eyes bleed :eek

jj2007

Quote from: brethren on November 30, 2010, 12:40:58 AM
thanks for the hint jochen but i would rather stick with plain vanilla asm. that sse makes my eyes bleed :eek

I also needed some time to get used to it :bg

dedndave

you practically have it written in your original post   :bg
write that algo to test for null
copy/paste and stick XOR in front of it to test for the search char

hutch--

Somewhere I have seen a variation of Agner Fog's StrLen algo that could be used to find characters other than ascii 0.
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

MichaelW

Quote from: hutch-- on November 30, 2010, 01:43:28 AM
Somewhere I have seen a variation of Agner Fog's StrLen algo that could be used to find characters other than ascii 0.

I didn't think to do a search of the forum before I did this quick modification of Agner Fog's code. I had no time to do a through test (and after I posted I tested a no-match and discovered that the code faults, but no time to fix it).

;==============================================================================
    include \masm32\include\masm32rt.inc
    .686
    include \masm32\macros\timers.asm
;==============================================================================
    .data
      str1  db "my other brother darryl x",0
      char  dd "x"
    .code
;==============================================================================

OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE

align 4

strchr PROC pstring:DWORD,dwchar:DWORD
    push ebx                    ; ebx must be saved

    push esi
    ;--------------------------------------
    ; Replicate search character into esi.
    ;--------------------------------------
    mov eax, 01010101h
    mov ecx, [esp+16]
    mul ecx
    mov esi, eax

    mov ecx, [esp+12]           ; get pointer to string
    mov eax, ecx                ; copy pointer
    and ecx, 3                  ; lower 2 bits, check alignment
    jz L2                       ; s is aligned by 4. Go to loop
    and eax, -4                 ; align pointer by 4
    mov ebx, [eax]              ; read from preceding boundary
    shl ecx, 3                  ; *8 = displacement in bits
    mov edx, -1
    shl edx, cl                 ; make byte mask
    not edx                     ; mask = 0FFH for false bytes
    or ebx, edx                 ; mask out false bytes
    ; check first four bytes for zero

    xor ebx, esi

    lea ecx, [ebx-01010101H]    ; subtract 1 from each byte
    not ebx                     ; invert all bytes
    and ecx, ebx                ; and these two
    and ecx, 80808080H          ; test all sign bits
    jnz L3                      ; zero-byte found
    ; Main loop, read 4 bytes aligned
    L1: add eax, 4              ; increment pointer
    L2: mov ebx, [eax]          ; read 4 bytes of string

    xor ebx, esi

    lea ecx, [ebx-01010101H]    ; subtract 1 from each byte
    not ebx                     ; invert all bytes
    and ecx, ebx                ; and these two
    and ecx, 80808080H          ; test all sign bits
    jz L1                       ; no zero bytes, continue loop
    L3: bsf ecx, ecx            ; find right-most 1-bit
    shr ecx, 3                  ; divide by 8 = byte index

    ;----------------------------------
    ; Return pointer instead of index.
    ;----------------------------------
    ;sub eax, [esp+12]          ; subtract start address

    add eax, ecx                ; add index to byte

    pop esi

    pop ebx                     ; restore ebx
    ret 8                       ; return value in eax
strchr ENDP

OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef

;==============================================================================
start:
;==============================================================================
    print str$(ADDR str1),13,10

    invoke crt_strchr, ADDR str1, char
    ;movzx edx, BYTE PTR [eax]
    ;print str$(edx),13,10
    print str$(eax),13,10

    invoke strchr, ADDR str1, char
    ;movzx edx, BYTE PTR [eax]
    ;print str$(edx),13,10
    print str$(eax),13,10,13,10

    invoke Sleep, 3000

    counter_begin 1000, HIGH_PRIORITY_CLASS
    counter_end
    print str$(eax),13,10

    counter_begin 1000, HIGH_PRIORITY_CLASS
      invoke crt_strchr, ADDR str1, char
    counter_end
    print str$(eax),13,10

    counter_begin 1000, HIGH_PRIORITY_CLASS
      invoke strchr, ADDR str1, char
    counter_end
    print str$(eax),13,10

    inkey "Press any key to exit..."
    exit
;==============================================================================
end start


Running on a P3:

0
87
58

eschew obfuscation

brethren

thanks everybody :U

i'm thinking of use this algo to search for the character. its benefit is that i won't have to test individual character in the search loop
.data
buf BYTE "asm", 0

.code
main PROC
mov ebx, 73737373h ;mask to search for 's'
mov ecx, DWORD PTR [buf] ;move 4 chars into ECX

mov edi, 7EFEFEFFh
xor ecx, ebx
add edi, ecx
xor ecx, 0FFFFFFFFh
xor ecx, edi
and ecx, 81010100h
;ECX=0 if char is not found, not-zero if found (use jz, jnz)

exit
main ENDP
END main


ps hutch, thanks i'll have a look for the agner fog modified strlen algorithm

brethren

Quote from: dedndave on November 30, 2010, 01:22:26 AM
you practically have it written in your original post   :bg
write that algo to test for null
copy/paste and stick XOR in front of it to test for the search char

thank you dave :thumbu

i thought that algo was just for testing for null but after reading your post i realise it is a general purpose algo for finding any character
you just do what you said use xor with a mask for the character you want to find(if you don't use a mask it'll find the null). now i can use that same algo to find the char and the null :8)