Okay, here's the issue. I was having some problems with this program a while back, but we figured out the problem and now there's a new one . . .
The program: This program is supposed to get two strings from a user (at the moment the strings are predetermined for the sake of keeping it simple while I try to figure out the problem), the first string will be the "source string," which will be searched by the "target string," which is the second string the user will enter. A Str_find PROC will then run, which will see if it can find the correct string match in the source string. It's supposed to return with the location of where it found the target in the source, and then pass that along to Str_remove, which will remove the found string in the source.
The problem: I can't seem to get the Str_find to pass the location of the found string properly. Below is what I have so far, I know something is wrong but I've tried a dozen different things and none of them worked. I thought maybe putting some kind of a counter in Str_find to count how many spaces from the beginning it found the target string, but I can't figure out where I would put it as the rep movsb doesn't have any space to put in extra code or anything . . . very confused . . . anyone?
TITLE ch09_03 (ch09_03.asm)
INCLUDE Irvine32.inc
Str_find PROTO,
sourcePtr:PTR BYTE,
targetPtr:PTR BYTE
Str_remove PROTO,
sourcePtr:PTR BYTE,
removeN:DWORD
.data
sourceS BYTE "123ABCDEFG",0
targetS BYTE "BCD",0
pos DWORD ?
sLength DWORD ?
tLength DWORD ?
startpoint DWORD ?
counter DWORD 0
.code
main PROC
INVOKE Str_find,
ADDR sourceS,
ADDR targetS
FOUND:
mov edx,eax
call writestring
call crlf
call crlf
mov eax, OFFSET sourceS
add eax,tLength
call dumpregs
call crlf
call crlf
mov edx,esi
call writestring
call crlf
call crlf
INVOKE Str_remove,
esi,
tLength
mov edx, OFFSET sourceS
call writestring
call crlf
call crlf
jmp quit
quit:
main ENDP
exit
;------------------------------------------------------------------------------------------------------------------
;------------------------------------------------------------------------------------------------------------------
;------------------------------------------------------------------------------------------------------------------
Str_find PROC, sourcePtr:PTR BYTE, targetPtr:PTR BYTE
mov eax, sourcePtr
mov ebx, targetPtr
; SET ecx = Length of [targetPtr] (excluding null terminator) [spans 8 lines]
mov esi, targetPtr ; esi = beginning of the string
xor ecx, ecx ; ecx = 0
L0:
cmp BYTE PTR [esi], 0 ; [esi] == 0
je L1 ; TRUE: jump to L1
inc esi ; esi++
inc ecx ; ecx++
jmp L0 ; FALSE:Jump to L0
L1:
mov edx, ecx ; edx holds ecx so we don't lose it when looping (REPE)
L2:
mov esi, eax ; esi used in cmpsb
mov edi, ebx ; edi used in cmpsb
repe cmpsb ; Effect: while([esi++]==[edi++] && ecx-- > 0) {Zero flag stays set}
jz FOUND ; If ZF is still set then the above line found a match
mov ecx, edx ; ecx was changed 2 lines up. Put its original value back
cmp BYTE PTR [eax + ecx], 0 ; if we end up comparing NULLs we're at the end and we've failed
jz NOT_FOUND ; if the above is true, we've failed to find the string.
inc eax ; increment eax to be the next character in the sourcePtr string
jmp L2 ; Do it all again
NOT_FOUND:
or eax, 1 ; un-set the zero flag
ret ; return nothing
FOUND:
; mov eax, ebx ; mov the address of the found string into eax
mov tLength,edx
ret ; How do I return eax?
Str_find ENDP
;------------------------------------------------------------------------------------------------------------------
;------------------------------------------------------------------------------------------------------------------
;------------------------------------------------------------------------------------------------------------------
;----------------------------------------------------------------------------
; Str_remove - Removes removeN characters after sourcePtr
; If programmer wants to remove more characters than there ARE before
; the null terminator then only the characters up to NULL are removed
Str_remove PROC, sourcePtr:PTR BYTE, removeN:DWORD
;----------------------------------------------------------------------------
mov esi, sourcePtr ; esi = beginning of the string
xor ecx, ecx
L0: ; Get the # of characters until the null terminator
cmp BYTE PTR [esi], 0 ; [esi] == 0
je L1 ; TRUE: jump to L1
inc esi ; esi++
inc ecx ; ecx++
add counter,1
jmp L0 ; FALSE:Jump to L0
L1:
inc ecx ; ecx = # chars EXCEPT the null terminator so add 1 more
cmp ecx, removeN ; user wants to remove more characters than there ARE left?
jb L3 ; TRUE: jump to L3
mov edi, sourcePtr ; edi points to the first byte we want to remove
mov esi, eax ; esi points to the 1st byte after what
sub esi, removeN
cld ; movsb auto-INCREMENTS after each operation
rep movsb ; Effect: while(ecx > 0) { mov [esi++], [edi++] }
ret ; return
L3:
mov esi, sourcePtr
mov BYTE PTR [esi], 0
ret
Str_remove ENDP
END main
Nevermind, I got it, though I would like to know if there is a more efficient way of doing this...
Here's what I did. This is the finished product.
TITLE ch09_03 (ch09_03.asm)
INCLUDE Irvine32.inc
Str_find PROTO,
sourcePtr:PTR BYTE,
targetPtr:PTR BYTE
Str_remove PROTO,
sourcePtr:PTR BYTE,
removeN:DWORD
.data
sourceS BYTE 100 DUP (?),0
targetS BYTE 100 DUP (?),0
pos DWORD ?
sLength DWORD ?
tLength DWORD ?
startpoint DWORD ?
counter DWORD 0
prompt1 BYTE "Please enter a string: ",0
prompt2 BYTE "Please enter a search: ",0
response_found BYTE "The string was found. Removing . . .",0
response_removed BYTE "The string was removed: ",0
.code
main PROC
mov edx, OFFSET prompt1
call writestring
mov ecx, (SIZEOF sourceS)
mov edx, OFFSET sourceS
call Readstring
call crlf
call crlf
mov edx, OFFSET prompt2
call writestring
mov ecx, (SIZEOF targetS)
mov edx, OFFSET targetS
call readstring
call crlf
call crlf
INVOKE Str_find,
ADDR sourceS,
ADDR targetS
jnz FOUND
jz NOT_FOUND
FOUND:
mov edx, offset response_found
call writestring
call crlf
call crlf
INVOKE Str_remove,
esi,
tLength
mov edx, OFFSET response_removed
call writestring
mov edx, OFFSET sourceS
call writestring
call crlf
call crlf
jmp quit
NOT_FOUND:
jmp quit
quit:
main ENDP
exit
;------------------------------------------------------------------------------------------------------------------
;------------------------------------------------------------------------------------------------------------------
;------------------------------------------------------------------------------------------------------------------
Str_find PROC, sourcePtr:PTR BYTE, targetPtr:PTR BYTE
mov eax, sourcePtr
mov ebx, targetPtr
; SET ecx = Length of [targetPtr] (excluding null terminator) [spans 8 lines]
mov esi, targetPtr ; esi = beginning of the string
xor ecx, ecx ; ecx = 0
L0:
cmp BYTE PTR [esi], 0 ; [esi] == 0
je L1 ; TRUE: jump to L1
inc esi ; esi++
inc ecx ; ecx++
jmp L0 ; FALSE:Jump to L0
L1:
mov edx, ecx ; edx holds ecx so we don't lose it when looping (REPE)
L2:
mov esi, eax ; esi used in cmpsb
mov edi, ebx ; edi used in cmpsb
repe cmpsb ; Effect: while([esi++]==[edi++] && ecx-- > 0) {Zero flag stays set}
jz FOUND ; If ZF is still set then the above line found a match
mov ecx, edx ; ecx was changed 2 lines up. Put its original value back
cmp BYTE PTR [eax + ecx], 0 ; if we end up comparing NULLs we're at the end and we've failed
jz NOT_FOUND ; if the above is true, we've failed to find the string.
inc eax ; increment eax to be the next character in the sourcePtr string
jmp L2 ; Do it all again
NOT_FOUND:
or eax, 1 ; un-set the zero flag
ret ; return nothing
FOUND:
; mov eax, ebx ; mov the address of the found string into eax
mov tLength,edx
sub esi,tLength
ret ; How do I return eax?
Str_find ENDP
;------------------------------------------------------------------------------------------------------------------
;------------------------------------------------------------------------------------------------------------------
;------------------------------------------------------------------------------------------------------------------
;----------------------------------------------------------------------------
; Str_remove - Removes removeN characters after sourcePtr
; If programmer wants to remove more characters than there ARE before
; the null terminator then only the characters up to NULL are removed
Str_remove PROC, sourcePtr:PTR BYTE, removeN:DWORD
;----------------------------------------------------------------------------
mov esi, sourcePtr ; esi = beginning of the string
xor ecx, ecx
L0: ; Get the # of characters until the null terminator
cmp BYTE PTR [esi], 0 ; [esi] == 0
je L1 ; TRUE: jump to L1
inc esi ; esi++
inc ecx ; ecx++
add counter,1
jmp L0 ; FALSE:Jump to L0
L1:
inc ecx ; ecx = # chars EXCEPT the null terminator so add 1 more
cmp ecx, removeN ; user wants to remove more characters than there ARE left?
jb L3 ; TRUE: jump to L3
mov edi, sourcePtr ; edi points to the first byte we want to remove
mov esi, sourcePtr ; esi points to the 1st byte after what
add esi, removeN
cld ; movsb auto-INCREMENTS after each operation
rep movsb ; Effect: while(ecx > 0) { mov [esi++], [edi++] }
ret ; return
L3:
mov esi, sourcePtr
mov BYTE PTR [esi], 0
ret
Str_remove ENDP
END main
hello,
it seems that we both are using assembly for intel by irvine, so i could see where you're coming from. :)
my comments are all referring to the search procedure:
one thing i can say is to write your "string length" portion of code as a seperate procedure. that could be used in many more places and can save time for you in the long run.
you might want to free up some registers instead of "reserving" them. meaning, instead of moving the pointers of the strings to eax and ebx at the beginning, rather you may want to move them into the registers when they're needed. this way, you wont have to be forced to work with less registers than available and eliminate.
it looks like you're using the brute force searching algorithm. if you are concerned with the speed of your program, you might want to look into other algorithms or at least make a few modifications to your code. what you have runs on the order of O(n^2). rather than using the string comparing mnemonics, compare each character and break the search the moment you find a mismatch.
its like searching for 100000000 in 000000000000000111110000000000. it will be very inefficient to compare the characters [0,8], then [1,9], [2,10] and so on especially for this case. it would be better to compare source[0] with target[0]... fail... source[0] with target[1]... fail... and so on. sure it would require more coding and results in slightly larger program size, it would be a lot more efficient.
i had just finished an assignment i had that was based on string searching. to give you a better idea on what im trying to say:
BruteForce PROC
PUSHAD
MOV ecx,0
MOV edx,iTextSize
SUB edx,iPatternSize
.REPEAT
INVOKE Find, ecx
.IF eax == 1
MOV Position,ecx
JMP @F
.ENDIF
INC ecx
.UNTIL ecx > edx
@@:
POPAD
RET
BruteForce ENDP
Find PROC Location:DWORD
LOCAL Found:DWORD
MOV Found,1
PUSHAD
MOV esi,0
MOV ebx,bPattern
MOV edx,bText
ADD edx,Location
.REPEAT
MOV al,BYTE PTR [ebx+esi]
.IF al != BYTE PTR [edx+esi]
MOV Found,0
JMP @F
.ENDIF
INC esi
.UNTIL esi >= iPatternSize
@@:
POPAD
MOV eax,Found
RET
Find ENDP
bPattern and bText are pointers to the buffers containing the strings, iPatternSize is the size of bPattern and likewise for iTextSize.
my assignment was to read in files (pattern and text) and find if pattern was contained in text using different search algorithms. i dynamically allocated the memory for the buffers which complicated a few things. :)
i hope i've been some help to you.