Didn't know what to call this question...

Started by sixleafclover, June 10, 2006, 03:41:43 PM

sixleafclover

Hey, I would say I'm new to MASM, but I don't even have that excuse. I'm just trying to get back into it after a while away. I've tried analysing my old source code that does similar stuff, but I can't even figure it out from my old messy code, so here I am.

Basically I just want to search along a string byte by byte until I find the character I'm after (I'm trying to read a file in and then split it up into lines, so I'm searching for linefeeds).

Here's my attempt, which doesn't work at all:

invoke ReadFile, hFile, addr readbuffer, 512, addr bytesread, 0
mov eax, offset readbuffer
endofline1:
inc eax
cmp eax,13
je exitit
jmp endofline1

This doesn't work. I also tried using old code I had that could write to strings using mov [buffer+edx], eax, but reversing that the other way round doesn't seem to work either. I must be missing something very obvious.

Sorry for the silliness of the question.

Ossa

If you want to do it a bit like yours:

invoke ReadFile, hFile, addr readbuffer, 512, addr bytesread, 0
mov edx, offset readbuffer
endofline1:
mov al, [edx]
cmp al, 0Dh
je exitit
add edx, 1
jmp endofline1


Note that there is a bug:

It will probably crash if there are no linefeeds in the string, as it doesn't check whether it has reached the end of the buffer.

[edit] The following will deal with those issues

invoke ReadFile, hFile, addr readbuffer, 512, addr bytesread, 0
cmp bytesread, 0
je noneread
mov edx, offset readbuffer
mov ecx, bytesread
endofline1:
mov al, [edx]
cmp al, 0Dh
je exitit
add edx, 1
sub ecx, 1
jnz endofline1

; No LFs in string - handle end here
noneread:

; No bytes were read - handle error here
exitit:

; Completed OK - address contained in edx


[/edit]

Ossa
Website (very old): ossa.the-wot.co.uk

asmfan

Set up the registers and...
repne scas[b|w|d]
Watch the logic and you can write a faster equivalent...
Russia is a weird place

sixleafclover

Ahh, figured out one problem where I was going wrong. Silly, silly me :red I won't even say what I did wrong, but thanks for the help. I think I've nearly sorted it. This is really simple, I shouldn't be struggling with it this much.

Ossa

asmfan is right, something like this might be easier:

invoke ReadFile, hFile, addr readbuffer, 512, addr bytesread, 0
mov ecx, bytesread
test ecx, ecx
jz noneread

push edi

mov edi, offset readbuffer
mov al, 0Dh
repne scasb

mov edx, edi
pop edi

test ecx, ecx
jnz exitit

; No LFs in string - handle end here
noneread:

; No bytes were read - handle error here
exitit:

; Completed OK - address contained in edx


Ossa

sixleafclover

Yay, OK, I've got working find-the-end-of-line code. Now while I'm here I might as well ask... how can I split the string up at that address, kind of like Right() or rather Left() in VB? Is there a direct function for doing it, or should I output byte by byte as I'm searching, and save time that way? In which case, how do I get the second line out?

Wow, two dumb questions in one day, I'm going for the high score!!

Ossa

Quote from: sixleafclover on June 10, 2006, 04:53:29 PM
Yay, OK, I've got working find-the-end-of-line code. Now while I'm here I might as well ask... how can I split the string up at that address, kind of like Right() or rather Left() in VB? Is there a direct function for doing it, or should I output byte by byte as I'm searching, and save time that way? In which case, how do I get the second line out?

It depends what you want to do:

1) If you want them to be stored outside the buffer that you are reading them into, then outputting them byte-by-byte would be best.
2) Otherwise, you could just replace the LFs with NULLs and keep an array of pointers to the sub-strings.
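A rough sketch of option 2, in case it helps. It assumes readbuffer was declared with at least 513 bytes (one spare byte for a final terminator), that bytesread was filled in by the ReadFile call used earlier in the thread, and that lines end in CR or CR+LF. MAXLINES, linePtrs, lineCount and the labels are made-up names for illustration only.

```asm
MAXLINES equ 64                     ; hypothetical limit on stored lines

.data?
    linePtrs  dd MAXLINES dup (?)   ; hypothetical table of line start addresses
    lineCount dd ?                  ; hypothetical count of entries in linePtrs

.code
    ; Assumes readbuffer has at least 513 bytes and bytesread <= 512.
    push esi
    mov esi, offset readbuffer      ; ESI = next byte to examine
    mov ecx, bytesread              ; ECX = bytes left to examine
    xor edx, edx                    ; EDX = number of lines recorded so far
    mov byte ptr [esi+ecx], 0       ; terminate the final line
    test ecx, ecx
    jz  splitdone                   ; nothing was read
splitloop:
    cmp edx, MAXLINES
    jae splitdone                   ; pointer table is full
    mov linePtrs[edx*4], esi        ; record where this line starts
    inc edx
findcr:
    mov al, [esi]
    inc esi
    dec ecx
    cmp al, 0Dh
    je  gotcr
    test ecx, ecx
    jnz findcr
    jmp splitdone                   ; buffer ended without a CR
gotcr:
    mov byte ptr [esi-1], 0         ; overwrite the CR with a NULL
    test ecx, ecx
    jz  splitdone                   ; CR was the last byte
    cmp byte ptr [esi], 0Ah
    jne splitloop                   ; lone CR: next line starts here
    inc esi                         ; skip the LF that follows the CR
    dec ecx
    jnz splitloop
splitdone:
    mov lineCount, edx
    pop esi
```

Afterwards each entry in linePtrs points at a null-terminated line inside readbuffer, so the second line would be at linePtrs[4]. Nothing is copied, which is the attraction of this option, but the pointers are only valid until the buffer is reused.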

Ossa

asmfan

Actually, you are searching for CR (0Dh), not LF, and a CR is not always followed by an LF... So I suggest searching for the full CR+LF pair <MOV AX, 0A0Dh> to get the correct result (for Windows strings, of course ;)
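One possible way to look for the full pair is sketched below. It scans for CR with repne scasb and then checks the following byte for LF, rather than using repne scasw with AX = 0A0Dh, because scasw advances two bytes at a time and would skip a pair that starts at an odd offset. readbuffer and bytesread are the names used earlier in the thread; the labels are made up for the example.

```asm
    mov ecx, bytesread
    cmp ecx, 2
    jb  nopair                  ; need at least two bytes for a pair
    dec ecx                     ; the last byte cannot start a pair
    push edi
    cld                         ; scan forwards
    mov edi, offset readbuffer
    mov al, 0Dh                 ; CR
nextcr:
    repne scasb                 ; stops after a CR or when ECX reaches 0
    jne notfound                ; ZF=0: no CR in the range scanned
    cmp byte ptr [edi], 0Ah     ; EDI is one past the CR
    je  foundpair
    test ecx, ecx
    jnz nextcr                  ; lone CR: keep scanning
notfound:
    pop edi
    jmp nopair
foundpair:
    lea edx, [edi-1]            ; EDX = address of the CR
    pop edi
    ; pair found: EDX -> CR, EDX+1 -> LF; handle the line here
    jmp pairdone
nopair:
    ; no CR+LF in the buffer; handle that here
pairdone:
```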

Mark Jones

Hello, here's a procedure to copy chunks of string data from one null-terminated buffer into another buffer. You could use a similar idea but switch where EDI is pointing when CR+LFs are detected, and thus put each "line" into separate buffer(s). Left as a learning exercise. :wink


szCopyMJ3 proc uses esi edi szDest:DWORD,szSource:DWORD
    mov esi,szSource
    mov edi,szDest
    cld                             ; copy forwards
@@:
    cmp byte ptr [esi+00],00        ; look-ahead for null, lowest byte first,
    jz CopyOne                      ; so we never read past the terminator
    cmp byte ptr [esi+01],00
    jz CopyTwo
    cmp byte ptr [esi+02],00
    jz CopyThree
    cmp byte ptr [esi+03],00
    jz CopyFour
    movsd                           ; else move DWORD from ESI --> EDI & increment both by 4
    jmp @B

CopyFour:                           ; finish by copying these remaining number of bytes
    movsd
    ret
CopyTwo:
    movsw
    ret
CopyThree:
    movsw
CopyOne:
    movsb
    ret
szCopyMJ3 endp



.data
    myString1  db  'Hello world!',0
.data?
    myString2  db  256 dup (?)
.code
    invoke szCopyMJ3,addr myString2,addr myString1
    invoke MessageBox,0,addr myString2,0,MB_OK
"To deny our impulses... foolish; to revel in them, chaos." MCJ 2003.08

Casper

sixleafclover,
Since you seem to be familiar with VB, left$, right$, ltrim$, rtrim$ and many others all exist as macros in masm32.  Look in \masm32\macros\macros.asm

Casper

raymond

Quote
   repne scasb

   mov edx, edi
   pop edi
   
   test ecx, ecx
   jnz exitit

   ; No LFs in string - handle end here
noneread:

   ; No bytes were read - handle error here
exitit:

   ; Completed OK - address contained in edx

One problem with this approach: if the byte being searched for is the last one in the buffer, ECX=0 even though the scan succeeded.

The test ecx, ecx/jnz exitit sequence must be replaced simply by jz exitit
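For reference, the fragment with that change applied might look like the sketch below. The lea edx, [edi-1] is an extra adjustment that is not in the posts above: scasb advances EDI after each comparison, so on a match EDI is one byte past the CR. The cld is also an addition and assumes nothing else has guaranteed a forward scan.

```asm
    invoke ReadFile, hFile, addr readbuffer, 512, addr bytesread, 0
    mov ecx, bytesread
    test ecx, ecx
    jz noneread

    push edi
    cld                         ; scan forwards
    mov edi, offset readbuffer
    mov al, 0Dh
    repne scasb                 ; ZF=1 if the byte was found

    lea edx, [edi-1]            ; address of the match; LEA leaves the flags alone
    pop edi                     ; POP leaves the flags alone too

    jz exitit                   ; test the ZF left by the last scasb comparison

    ; byte not found - handle end here
noneread:

    ; No bytes were read - handle error here
exitit:

    ; Completed OK - address contained in edx
```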

Raymond
When you assume something, you risk being wrong half the time
http://www.ray.masmcode.com

Ossa

Quote from: raymond on June 11, 2006, 02:38:24 AM
The test ecx, ecx/jnz exitit sequence must be replaced simply by jz exitit

Thanks for pointing that out, Raymond. I haven't used the string instructions in a while; it seems I need to review them.  ::)

Ossa

Ratch

sixleafclover,

Quote
Basically I just want to search along a string byte by byte until I find the character I'm after

The subprogram below will search a string for as many characters as you PUSH onto the stack.  It will return the address of the first found byte in EAX.  It searches in DWORD chunks instead of byte by byte.  The program can be easily modified to keep the search parameters on the stack for repeated searches, but then the programmer assumes the task of balancing the stack when search operations are completed.  The @ character in the code is EQU'd to OFFSET, and the POPIT (multiple pops) and RPUSHIT (reverse multiple pushes) MACROs are easily written.  Or I can supply you with the MACRO definitions.  Ask if you have any questions.  Ratch


;-------------------------------------------------------------------------------
;*******************************************************************************
; MFNDCHR:   Returns the address of the first character found within a string  *
;            from a list of characters PUSHed onto the stack.                  *
;                                                                              *
; Called by: INVOKIT MFNDCHR,@ STRADR,<char>,<char>,<char>,<char>,...,0        *
;                                                                              *
; Example:   INVOKIT MFNDCHR,@ STRADR,'a','A','e','E','i','I','o','O','u','U',0*
;                                                                              *
; Returns:   EAX=address, NOT the index of the character within the string. To *
;            get the index, subtract the string address from EAX.              *
;                                                                              *
; Notes:     A parameter of 0 MUST be PUSHed onto the stack FIRST!             *
;            This marks the end of the stack and also searches for the string  *
;            terminator.  Failure to do this will result in performance error. *
;                                                                              *
;            This subroutine conforms with WIN32 conventions regarding         *
;            register preservation and stack balancing.                        *
;                                                                              *
; Coder:     Ratch                                                             *
;*******************************************************************************

MFNDCHR$1 STRUCT
  DWORD 4 DUP (?) ;register save area
  RETURN DWORD ?
  STRADR DWORD ?
  FIRSTP DWORD ?  ;first parameter
; ....   DWORD ?  ;any number of parameter chars can be pushed onto stack
MFNDCHR$1 ENDS

S$1 EQU ESP.MFNDCHR$1          ;save some typing

.CODE

MFNDCHR:                       ;it all begins here
  RPUSHIT EBX,EBP,ESI,EDI      ;save those registers
  MOV EBP,01010101H            ;replication constant
  MOV EDI,[S$1.STRADR]         ;address of string

  .REPEAT                      ;searching string ....
    LEA ESI,[S$1.FIRSTP]       ;pointer to char list
    XOR EBX,EBX                ;clear accumulation mask

    .REPEAT                    ;searching DWORD for chars from list
      MOV EAX,[ESI]            ;next char to check
      MUL EBP                  ;replication operation
      MOV ECX,EBP              ;@@@ prepare for propagation operation
      MOV EDX,[EDI]            ;next DWORD
      NEG ECX                  ;@@@ prepare for propagation operation
      XOR EDX,EAX              ;zero each byte of the DWORD that matches the char
      ADD ECX,EDX              ;@@@ propagation operation
;      LEA ECX,[EDX-01010101H] ;this instruction replaces the 3 above marked with @@@
      NOT EDX                  ;prepare for propagation operation
      AND EDX,ECX              ;propagation operation
      ADD ESI,DWORD            ;increment pointer for chars to check
      OR EBX,EDX               ;update accumulation mask
      TEST EAX,EAX             ;check if done with char list
    .UNTIL ZERO?

    ADD EDI,DWORD              ;increment pointer for DWORD of string
    AND EBX,80808080H          ;sieve out zero bytes from accumulation mask
  .UNTIL !ZERO?                ;check for successful search

  .REPEAT                      ;searching current DWORD ...
    INC EDI                    ;increase char count if no carry from shift
    ROR EBX,8                  ;bring leftmost bit of each byte into carry flag
  .UNTIL CARRY?                ;if carry flag set, at least one byte is zero


  LEA EDX,[S$1.STRADR]         ;prepare to compute number of params pushed
  LEA EAX,[EDI-DWORD-BYTE]     ;adjust found char address
  SUB EDX,ESI                  ;EDX=negative value of number of params pushed

  POPIT EBX,EBP,ESI,EDI        ;restore those registers
  POP ECX                      ;ECX=return address
  SUB ESP,EDX                  ;balance stack
  JMP ECX                      ;return to sender
;-------------------------------------------------------------------------------
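A worked trace of the masking steps in the inner loop may make them easier to follow. The values below are an illustration only: the DWORD holds the bytes 'a', 'b', 0Dh, 'c' in memory order, and the character being searched for is 0Dh.

```asm
    mov  eax, 0Dh               ; character to look for
    mov  ecx, 01010101h         ; replication constant
    mul  ecx                    ; EAX = 0D0D0D0Dh (EDX is overwritten with 0)
    mov  edx, 630D6261h         ; 'a','b',0Dh,'c' as loaded from memory
    xor  edx, eax               ; EDX = 6E006F6Ch, matching byte is now 00
    lea  ecx, [edx-01010101h]   ; ECX = 6CFF6E6Bh, the 00 byte borrowed to FF
    not  edx                    ; EDX = 91FF9093h
    and  edx, ecx               ; EDX = 00FF0003h
    and  edx, 80808080h         ; EDX = 00800000h, top bit set only in byte 2
```

The surviving bit sits in byte 2, so the ROR loop in the routine above takes three rotations to bring it into the carry flag, and the final LEA turns that into an offset of 2 from the start of the DWORD.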