How to CMP to more than one value?

Started by joemc, April 03, 2010, 01:51:56 AM

Bill Cravener

This is the Campus forum, correct? You guys surely have poor joemc confused with tables, cycle counts, alignment, branch prediction, benchmarks, etc.

There are numerous methods to locate and/or remove specific characters from a string. Some of the ones I've seen are quite elegant and very clever, but I like to do things the simple way, and the result also needs to be easily understood by others with less experience.

Here is a simple example that strips the junk joemc referred to out of a string and then displays the cleaned string in a message box.
My MASM32 Examples.

"Prejudice does not arise from low intelligence it arises from conservative ideals to which people of low intelligence are drawn." ~ Isaidthat

clive

Actually in his initial reply he had already achieved that, and was looking/hoping for a "better" way. I'll let joemc decide if the suggested methods need more explanation.

To be honest, this sounds more like a parsing issue where you want to skip whitespace, rather than simply extract it from a string, which isn't hugely helpful in the real world unless you are trying to make compound German words.

-Clive
It could be a random act of randomness. Those happen a lot as well.
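
As a rough illustration of the "skip whitespace while parsing" idea, here is a minimal sketch. It assumes the MASM32 runtime include used elsewhere in this thread and treats every character at or below 32 as whitespace; the names txtLine, skip_ws and skip_done are hypothetical.

```asm
include \masm32\include\masm32rt.inc

.data
txtLine db "   ",9,"mov eax, 1",0   ; hypothetical input with leading spaces and a tab

.code
start:
    mov esi, OFFSET txtLine         ; esi = read pointer

  skip_ws:
    movzx eax, BYTE PTR [esi]
    test eax, eax
    jz skip_done                    ; end of string, nothing left to skip
    cmp eax, 32
    ja skip_done                    ; first character above space: stop here
    add esi, 1                      ; still whitespace, advance the pointer
    jmp skip_ws

  skip_done:
    print esi, 13, 10               ; show the text from the first token onward
    exit

end start
```

Nothing is copied or rewritten here; the parser simply moves its pointer past the whitespace and carries on from there.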

jj2007

Quote from: clive on April 03, 2010, 02:45:36 PM
To be honest this sounds more like a parsing issue where you want to skip whitespace

Dave and I had tested some strip whitespace algos here.

Damos

hutch's suggestion is all you need, elegant. :bg
Any intelligent fool can make things bigger, more complex, and more violent. It takes a touch of genius -- and a lot of courage -- to move in the opposite direction. - Albert Einstein

hutch--

Put the algo to a benchmark testing a large memory buffer and it's a snail; it only runs a bit over 500 meg/sec on this quad.

On my system the buffer size is 84 meg and the test is done in one pass without looping or other assumptions that interfere with accurate results.


84978000 bytes sample
156
Press any key to continue ...


84,978,000 bytes in 156 ms works out to about 544 meg/sec.


IF 0  ; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
                      Build this template with "CONSOLE ASSEMBLE AND LINK"
ENDIF ; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    include \masm32\include\masm32rt.inc

    nowhsp PROTO :DWORD

    .code

start:
   
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    call main
    inkey
    exit

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

main proc

    LOCAL hMem  :DWORD
    LOCAL rval  :DWORD
    LOCAL flen  :DWORD
    LOCAL pMem  :DWORD
    LOCAL blen  :DWORD

    push ebx
    push esi
    push edi

    mov hMem, InputFile("\masm32\include\windows.inc")

    mov flen, len(hMem)

    mov eax, flen
    lea eax, [eax+eax*4]
    add eax, eax            ; mul by 10
    lea eax, [eax+eax*4]
    add eax, eax            ; mul by 10
    mov blen, eax

    mov pMem, alloc(blen)    ; 100 times original

    print str$(blen)," bytes sample",13,10

    invoke szMultiCat,25,pMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,
    hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem

    invoke szMultiCat,25,pMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,
    hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem

    invoke szMultiCat,25,pMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,
    hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem

    invoke szMultiCat,25,pMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,
    hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem,hMem

    invoke GetTickCount
    push eax

    invoke nowhsp,pMem

    invoke GetTickCount
    pop ecx
    sub eax, ecx
    print str$(eax),13,10

    free pMem
    free hMem

    pop edi
    pop esi
    pop ebx

    ret

main endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE

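nowhsp proc ptxt:DWORD

    mov ecx, [esp+4]        ; ptxt, read pointer
    mov edx, [esp+4]        ; ptxt, write pointer (string is rewritten in place)
    sub ecx, 1              ; back up one byte, "lead" increments before reading
    jmp lead

  pre:
    mov BYTE PTR [edx], al  ; copy a character above 32
    add edx, 1

  lead:
    add ecx, 1
    movzx eax, BYTE PTR [ecx]
    test eax, eax
    jz quit                 ; terminator reached
    cmp eax, 32
    jg pre                  ; above space, keep it

  subloop:                  ; skip a run of characters <= 32
    add ecx, 1
    movzx eax, BYTE PTR [ecx]
    test eax, eax
    jz quit                 ; a trailing run is dropped entirely
    cmp eax, 32
    jle subloop
    mov BYTE PTR [edx], 32  ; replace the whole run with a single space
    add edx, 1
    jmp pre

  quit:
    mov BYTE PTR [edx], 0   ; terminate the compacted string
    ret 4

nowhsp endp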

OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

end start
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

bozo

@OP

I was thinking you could use some SSE2 for this but unless it were for large blocks of memory, probably overkill.

the following is just rough idea btw  :wink

.data

white_space dd 4 dup (21212121h)

source_data db "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ",0
found db 10,"Found whitespace",10,0

.code

start:
      movdqa xmm0,[white_space]
      movdqu xmm1,[source_data]

      pcmpgtb xmm0,xmm1
      pmovmskb eax,xmm0

      test eax,eax
      jz exit_program

      push offset found
      call printf
      pop ecx

exit_program:
      push eax
      call ExitProcess

      end start

jj2007

Quote from: Kernel_Gaddafi on April 03, 2010, 11:04:09 PM

the following is just rough idea btw  :wink


In the sense that you haven't made the effort to test your code. Here is an example that works. You need JWasm or a recent version of ml.exe (6.15 chokes).

include \masm32\include\masm32rt.inc
.686
.xmm

.data
white_space OWORD 21212121212121212121212121212121h
sources dd source1, source2, source3
source1 db "0123456789ABCDEFG",0
source2 db "0",32,"23456789ABCDEFG",0
source3 db "0123456789ABCDE", 32, "G",0

.code
start:
  mov ebx, 3
  mov esi, offset sources
  .Repeat
    movdqa xmm0, white_space
    lodsd
    movdqu xmm1, [eax]
    pcmpgtb xmm0, xmm1
    pmovmskb edx, xmm0
    test edx, edx
    .if Zero?
      print eax, " has NO whitespace", 13, 10
    .else
      print eax, " has whitespace", 13, 10
    .endif
    dec ebx
  .Until Zero?
  exit

end start
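
The mask that pmovmskb returns has one bit per byte of the 16-byte block, so a bit scan on it can give the position of the first hit rather than just a yes/no answer. Below is a minimal sketch of that step, assuming JWasm or a recent ml.exe as noted above. The bsf step and the names ws_limit and txtBlock are not from the posts above; they are illustrative only.

```asm
include \masm32\include\masm32rt.inc
.686
.xmm

.data
align 16
ws_limit  db 16 dup (21h)             ; every lane holds 21h (space + 1)
txtBlock  db "0123456789 BCDEF",0     ; hypothetical 16-byte sample, space at index 10

.code
start:
    movdqa xmm0, OWORD PTR ws_limit   ; aligned load, relies on the "align 16" above
    movdqu xmm1, OWORD PTR txtBlock   ; unaligned load of the 16 text bytes
    pcmpgtb xmm0, xmm1                ; lane = FFh where 21h > byte (signed compare,
                                      ; so bytes 80h..FFh also count as hits)
    pmovmskb edx, xmm0                ; bit n of edx is set when byte n matched
    bsf ebx, edx                      ; ebx = index of lowest set bit, ZF=1 if edx is 0
    .if Zero?
      print "no whitespace in block", 13, 10
    .else
      print str$(ebx), " is the index of the first whitespace byte", 13, 10
    .endif
    exit

end start
```

With the sample above the first hit should be reported at index 10. Because the compare is signed, text containing characters above 7Fh would need extra handling before this could be trusted.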

joemc

Thank you for all the replies.  I have not read through them all yet but do appreciate them.

Bill Cravener

Quote from: joemc on April 09, 2010, 06:27:26 PM
Thank you for all the replies.  I have not read through them all yet but do appreciate them.

Didn't know quite what you were getting yourself into, did you joemc?

KeepingRealBusy

One way would be to build a lookup table (256 bytes) with a 1 at the offset of each whitespace character and a 0 elsewhere. Then a handful of instructions test each character:

    mov     esi, OFFSET string
    mov     ebx, OFFSET table

    movzx   eax, BYTE PTR [esi]
    test    BYTE PTR [ebx+eax], 1
    jnz     Whitespace



This method can also classify characters as alphabetic, upper case, lower case, numeric, control, etc., with each class mapped to a different bit. If more than 8 classifications are needed, make the table a WORD table instead of a BYTE table (indexed as [ebx+eax*2]) and you have 16 different choices. Of course, more than one bit can be set in the table for a single character, i.e. 'a' would be alphabetic and lower case and printable and hex, etc.

Dave.
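
To make the bit-mapped table concrete, here is a minimal sketch, assuming the MASM32 runtime include used elsewhere in this thread. The class bits (CT_SPACE and friends), the table layout, and the names chrTable and txtSample are hypothetical; which characters count as whitespace (tab through CR, plus space) is an assumption too.

```asm
include \masm32\include\masm32rt.inc

; hypothetical class bits, one bit per classification
CT_SPACE equ 1
CT_DIGIT equ 2
CT_UPPER equ 4
CT_LOWER equ 8

.data
; 256-byte table built at assembly time, indexed by character code
chrTable  db 9 dup (0)                ; 0..8
          db 5 dup (CT_SPACE)         ; 9..13   tab, LF, VT, FF, CR
          db 18 dup (0)               ; 14..31
          db CT_SPACE                 ; 32      space
          db 15 dup (0)               ; 33..47
          db 10 dup (CT_DIGIT)        ; 48..57  '0'..'9'
          db 7 dup (0)                ; 58..64
          db 26 dup (CT_UPPER)        ; 65..90  'A'..'Z'
          db 6 dup (0)                ; 91..96
          db 26 dup (CT_LOWER)        ; 97..122 'a'..'z'
          db 133 dup (0)              ; 123..255

txtSample db "Hello",9,"MASM 32",13,10,0

.code
start:
    mov esi, OFFSET txtSample
    mov ebx, OFFSET chrTable
    xor edi, edi                      ; edi counts whitespace characters

  scan_next:
    movzx eax, BYTE PTR [esi]
    test eax, eax
    jz scan_done                      ; terminator reached
    add esi, 1
    test BYTE PTR [ebx+eax], CT_SPACE ; one lookup classifies the character
    jz scan_next
    add edi, 1
    jmp scan_next

  scan_done:
    print str$(edi), " whitespace characters", 13, 10
    exit

end start
```

Testing for another class only changes the mask, for example CT_DIGIT, or CT_UPPER or CT_LOWER for any letter. For the sample string the count should come out as 4 (tab, space, CR, LF).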

joemc

Dave,

I have not used a look up table often.  Seems so much simpler in masm format. Honestly I think pointers are simpler in general when using masm.  Thanks for the example.

edit: or it may be that pointers just make more sense now that I have tried learning masm.