
Simple algo to strip CRLF combinations from text.

Started by hutch--, January 02, 2007, 01:15:35 PM


hutch--

Nothing exciting but it seems to work OK.


; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE

align 16

removelf proc src:DWORD

  ; -------------------------------------------------------------
  ; replaces any combination of ascii 13 & 10 with a single space.
  ; -------------------------------------------------------------
    mov ecx, [esp+4]
    mov edx, [esp+4]
    sub ecx, 1

  @@:
    add ecx, 1
  backin:
    movzx eax, BYTE PTR [ecx]
    test eax, eax                   ; test for terminator
    jz zero
    cmp eax, 13                     ; test for CR
    je crlfeed
    cmp eax, 10                     ; test for LF
    je crlfeed
    mov [edx], al                   ; write byte to destination
    add edx, 1
    jmp @B

  crlfeed:
    add ecx, 1
    movzx eax, BYTE PTR [ecx]
    cmp eax, 13                     ; test for CR
    je crlfeed
    cmp eax, 10                     ; test for LF
    je crlfeed
    test eax, eax                   ; test for terminator
    jz zero

    mov BYTE PTR [edx], 32          ; write a space to replace
    add edx, 1                      ; the CRLF combination
    jmp backin

  zero:
    mov BYTE PTR [edx], 0
    mov eax, [esp+4]

    ret 4

removelf endp

OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

Jibz

Sorry to be pedantic, but it looks like a trailing run of ascii 13 and 10 is not replaced with a single space; only the zero terminator is written?

This does make sense, since you probably don't want a trailing space, but it means the description 'replaces any combination of ascii 13 & 10 with a single space' is not entirely true :U.
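
To make the trailing case easy to check, here is a rough C model of the routine as posted. It is not from the thread: the name `removelf_model` and the C translation are assumptions for illustration, with `src` and `dst` standing in for `ecx` and `edx`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C model of the posted removelf (not the author's code).
   Each run of CR (13) / LF (10) is collapsed in place to one space,
   except that a run ending at the terminator is dropped entirely.
   Returns the buffer, as the asm returns the src pointer in eax. */
char *removelf_model(char *s)
{
    assert(s != NULL);
    const char *src = s;   /* read pointer  (ecx in the asm) */
    char *dst = s;         /* write pointer (edx in the asm) */

    while (*src != '\0') {
        if (*src == '\r' || *src == '\n') {
            /* "crlfeed": skip the whole run of CR/LF bytes */
            do {
                ++src;
            } while (*src == '\r' || *src == '\n');
            if (*src == '\0')
                break;          /* trailing run: no space is written */
            *dst++ = ' ';       /* interior run becomes a single space */
        } else {
            *dst++ = *src++;    /* ordinary byte: copy it through */
        }
    }
    *dst = '\0';                /* "zero": terminate the result */
    return s;
}
```

With an input such as "a\r\nb\r\n" this model leaves "a b": the interior pair becomes one space and the trailing pair is simply dropped.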

hutch--

Hi Jibz, pleasure to hear from you. Unless it's too early in the day, the algo appears to be delivering the right results.

Result.

line 1

line 2
line 4

line 1 line 2 line 3 line 4
6C 69 6E 65 20 31 20 6C - 69 6E 65 20 32 20 6C 69
6E 65 20 33 20 6C 69 6E - 65 20 34 00
Press any key to continue ...


Test code.

; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
    include \masm32\include\masm32rt.inc
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««

comment * -----------------------------------------------------
                        Build this  template with
                       "CONSOLE ASSEMBLE AND LINK"
        ----------------------------------------------------- *

    removelf PROTO :DWORD

    .data
      item db "line 1",13,10,10,"line 2",10,"line 3",13,"line 4",10,13,0

    .code

start:
   
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««

    call main
    inkey
    exit

; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««

main proc

    LOCAL pbuf  :DWORD
    LOCAL blen  :DWORD

    mov eax, LENGTHOF item
    add eax, eax
    add eax, eax
    mov pbuf, alloc(eax)

    print ADDR item,13,10               ; display before

    invoke removelf,ADDR item

    print ADDR item,13,10               ; display after

    mov blen, len(ADDR item)            ; modified string length
    add blen, 1                         ; to see the zero as well

    invoke bin2hex,ADDR item,blen,pbuf  ; convert to hex

    print pbuf,13,10                    ; display result

    free pbuf

    ret

main endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE

align 16

removelf proc src:DWORD

  ; -------------------------------------------------------------
  ; replaces any combination of ascii 13 & 10 with a single space.
  ; -------------------------------------------------------------
    mov ecx, [esp+4]
    mov edx, [esp+4]
    sub ecx, 1

  @@:
    add ecx, 1
  backin:
    movzx eax, BYTE PTR [ecx]
    test eax, eax                   ; test for terminator
    jz zero
    cmp eax, 13                     ; test for CR
    je crlfeed
    cmp eax, 10                     ; test for LF
    je crlfeed
    mov [edx], al                   ; write byte to destination
    add edx, 1
    jmp @B

  crlfeed:
    add ecx, 1
    movzx eax, BYTE PTR [ecx]
    cmp eax, 13                     ; test for CR
    je crlfeed
    cmp eax, 10                     ; test for LF
    je crlfeed
    test eax, eax                   ; test for terminator
    jz zero

    mov BYTE PTR [edx], 32          ; write a space to replace
    add edx, 1                      ; the CRLF combination
    jmp backin

  zero:
    mov BYTE PTR [edx], 0
    mov eax, [esp+4]

    ret 4

removelf endp

OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

end start

Jibz

Quote from: hutch--
item db "line 1",13,10,10,"line 2",10,"line 3",13,"line 4",10,13,0
...
6C 69 6E 65 20 31 20 6C - 69 6E 65 20 32 20 6C 69
6E 65 20 33 20 6C 69 6E - 65 20 34 00

It may very well be too early in the day for me; at least I am still on my first cup of coffee :green ... but if I am not mistaken, the last 10,13 in the data should be replaced with a space if the function were to replace every run of 13 and 10 with a single space, that is:

6C 69 6E 65 20 31 20 6C - 69 6E 65 20 32 20 6C 69
6E 65 20 33 20 6C 69 6E - 65 20 34 _20_ 00


I know the actual behavior of stripping trailing spaces is preferable, but you know how pedantic I can be about functionality descriptions :U.
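
For comparison, a minimal C sketch of the literal reading of the description, where a trailing run also becomes a space. This is not code from the thread; the function name `removelf_keep_trailing` is hypothetical, and the sketch only models the behavior rather than proposing a change to the MASM routine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical variant (an assumption, not the posted algorithm):
   every run of CR (13) / LF (10) is replaced in place by one space,
   including a run that ends at the zero terminator. */
char *removelf_keep_trailing(char *s)
{
    assert(s != NULL);
    const char *src = s;   /* read pointer */
    char *dst = s;         /* write pointer, never ahead of src */

    while (*src != '\0') {
        if (*src == '\r' || *src == '\n') {
            /* skip the whole run of CR/LF bytes */
            do {
                ++src;
            } while (*src == '\r' || *src == '\n');
            *dst++ = ' ';       /* written even if the run is trailing */
        } else {
            *dst++ = *src++;    /* ordinary byte: copy it through */
        }
    }
    *dst = '\0';
    return s;
}
```

Run on the test data quoted above, this variant ends the buffer with 34 20 00, the dump shown with the extra _20_, whereas the posted routine ends it with 34 00.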