A little trouble with string primitives

Started by omdown, May 13, 2005, 06:34:56 AM

omdown

So, I'm having a little trouble here. I'm working on a procedure that will concatenate two strings together. My problem isn't so much in the procedure as in how I can get it to add the second string without overwriting the first string.

Here's what I've got:

TITLE Concact                    (Concact.asm)

INCLUDE Irvine32.inc

Str_concat PROTO,
source1:PTR BYTE, ; source string
source2:PTR BYTE, ; target string
target:PTR BYTE,
maxChars:DWORD

Str_length PROTO,
pString:PTR BYTE ; pointer to string

.data
VAL1 BYTE "ABCDE",0
VAL2 BYTE "FGHIJ",0
CONCAT BYTE ?

.code
main PROC

call Clrscr

INVOKE Str_concat,
ADDR val1,
ADDR val2,
ADDR concat,
sizeof val1 + sizeof val2



mov edx, OFFSET concat
call writestring

exit
main ENDP

Str_concat PROC USES eax ecx esi edi,
source1:PTR BYTE,
source2:PTR BYTE,
target:PTR BYTE,
maxChars:DWORD

mov ecx,maxChars
mov esi,source1
mov edi,target
cld
rep movsb

mov ecx,maxChars
mov esi,source2
mov edi,target + (lengthof VAL1)
cld
rep movsb

ret
Str_concat ENDP

END main


When I run the program, I just get a Windows error. I know it has something to do with trying to reference "target + (lengthof VAL1)", but I can't think of any other way to write the second string without overwriting the first, which is what happens if I leave out the plus. Anyone have any idea how I could fix that?

MichaelW

You are not allocating enough space for your target buffer. You need at least 10 bytes for the concatenated strings and 1 additional byte for the null terminator.

The point of using null terminated strings is so you can recognize the end of the string. String concatenation procedures are generally coded to move the bytes in a loop, and stop when they find the null terminator.

The maximum length argument should be the length of the target buffer. It is included as a safeguard so the procedure can avoid writing past the end of the buffer when the passed strings are too big to fit in the buffer.
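A minimal sketch of that approach, assuming the Irvine32 setup from the original post. The name Str_concat2 and its labels are made up for illustration, and maxChars is taken to be the size of the target buffer in bytes, terminator included.

```asm
INCLUDE Irvine32.inc

; Hypothetical procedure, not part of Irvine32.
Str_concat2 PROTO,
    source1:PTR BYTE,
    source2:PTR BYTE,
    target:PTR BYTE,
    maxChars:DWORD

.data
val1   BYTE "ABCDE",0
val2   BYTE "FGHIJ",0
concat BYTE 11 DUP(?)           ; 10 characters + 1 null terminator

.code
main PROC
    INVOKE Str_concat2, ADDR val1, ADDR val2, ADDR concat, SIZEOF concat
    mov  edx,OFFSET concat
    call WriteString
    call Crlf
    exit
main ENDP

; Copies source1, then source2, into target. Writes at most maxChars
; bytes in total, and the last byte written is always a null terminator.
Str_concat2 PROC USES eax ecx esi edi,
    source1:PTR BYTE,
    source2:PTR BYTE,
    target:PTR BYTE,
    maxChars:DWORD

    mov  edi,target
    mov  ecx,maxChars
    cmp  ecx,0
    je   cat_done               ; no room at all, write nothing
    dec  ecx                    ; reserve one byte for the terminator

    mov  esi,source1
cat_first:
    cmp  ecx,0
    je   cat_finish             ; buffer is full
    mov  al,[esi]
    cmp  al,0
    je   cat_second             ; reached the end of the first string
    mov  [edi],al
    inc  esi
    inc  edi
    dec  ecx
    jmp  cat_first

cat_second:
    mov  esi,source2
cat_next:
    cmp  ecx,0
    je   cat_finish             ; buffer is full
    mov  al,[esi]
    cmp  al,0
    je   cat_finish             ; reached the end of the second string
    mov  [edi],al
    inc  esi
    inc  edi
    dec  ecx
    jmp  cat_next

cat_finish:
    mov  BYTE PTR [edi],0       ; terminate the result
cat_done:
    ret
Str_concat2 ENDP

END main
```

With the data above this should print ABCDEFGHIJ. If concat were declared smaller, the result would be cut short instead of running past the end of the buffer.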

Vortex

Hi omdown,

Also, have a look at the string concatenation functions provided by masm32.lib.

AeroASM

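The error with mov edi,target + (lengthof VAL1) is that no instruction loads a value from memory and adds a constant to it in one step. MASM will most likely fold the constant into the address of the target parameter instead, so the load comes from the wrong place on the stack. Consider: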

mov reg, mem + imm    ; doesn't exist

You will have to do:

mov reg, mem
add reg, imm

Also, you never need to preserve eax or ecx (or edx), since callers under the usual Windows calling convention do not expect them to survive a call. If you are a bit sneaky you can modify esi and edi too, as long as only you use that function and you know what you are doing.
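To make the mov/add pair concrete, here is a small sketch under the same Irvine32 assumptions as the original post. ShowFrom and pText are hypothetical names used only for this example.

```asm
INCLUDE Irvine32.inc

; Hypothetical helper, not part of Irvine32.
ShowFrom PROTO, pText:PTR BYTE

.data
msg BYTE "ABCDEFGHIJ",0

.code
main PROC
    INVOKE ShowFrom, ADDR msg
    call Crlf
    exit
main ENDP

; Prints the string starting 5 bytes past the address passed in pText.
ShowFrom PROC USES edx, pText:PTR BYTE
    mov  edx,pText          ; load the pointer value stored in the parameter
    add  edx,5              ; then advance that pointer by an immediate
    call WriteString        ; WriteString takes the string address in EDX
    ret
ShowFrom ENDP

END main
```

This should print FGHIJ. Writing mov edx,pText + 5 instead would move the stack address being read by 5 bytes, not the pointer that is stored there.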

omdown

New problem . . . I used AeroASM's idea of adding to the position, and it seemed to work fine, until I discovered it ONLY worked if the first string was five characters or more. I have no idea why, but if the first string is less than four characters, it only prints the first string and doesn't print the second. Then, if the first one is more than four characters, it only prints the first four, and the rest gets overwritten by the second string, which used to not show up. I'm wayyyy confused about this. Why would this happen?

Here's the updated code. I don't think it has anything to do with the fact that it's now prompting the user for the strings, because I tried just putting them in as variables and got the same problems.

TITLE Concact                    (Concact.asm)

INCLUDE Irvine32.inc

Str_concat PROTO,
source1:PTR BYTE, ; source string
source2:PTR BYTE, ; target string
target:PTR BYTE,
maxChars:DWORD

Str_length PROTO,
pString:PTR BYTE ; pointer to string

.data
VAL1 BYTE 100 DUP (?)
VAL2 BYTE 100 DUP (?)
CONCAT BYTE 202 DUP (?)
prompt1 BYTE "Please enter a string: ",0
prompt2 BYTE "Please enter a second string: ",0
result BYTE "The concated strings are: ",0

.code
main PROC

call Clrscr

mov edx, OFFSET prompt1
call writestring

mov edx, OFFSET VAL1
mov ecx, (sizeof VAL1)
call readstring

call crlf

mov edx, OFFSET prompt2
call writestring

mov edx, OFFSET VAL2
mov ecx, (sizeof VAL2)
call readstring



INVOKE Str_concat,
ADDR val1,
ADDR val2,
ADDR concat,
100

mov edx, OFFSET result
call writestring

mov edx, OFFSET concat
call writestring

call crlf

exit
main ENDP

Str_concat PROC USES eax ecx esi edi,
source1:PTR BYTE,
source2:PTR BYTE,
target:PTR BYTE,
maxChars:DWORD

mov ecx,maxChars
mov esi,source1
mov edi,target
cld
rep movsb

mov ecx,maxChars
mov esi,source2
mov edi,target
add edi,(sizeof source1)
cld
rep movsb

ret
Str_concat ENDP

END main

AeroASM

When you put "sizeof source1", it always comes out as four, because source1 is a pointer parameter (PTR BYTE) and a pointer is 4 bytes in 32-bit code. sizeof knows nothing about the string it points to. You need to work out the length of string 1 dynamically, at run time.

Also, if you have worked out the length of string 1, you may as well work out the length of string 2 and cut out the dependence on maxChars. This is much neater.

tenkey

Or to put it another way, SIZEOF gives you a compile-time size, which is fixed, and not the runtime size, which varies.
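A small sketch of the difference, assuming Irvine32.inc declares Str_length (returning the length in EAX) as the PROTO in the original post suggests. The buffer contents are made up for the example.

```asm
INCLUDE Irvine32.inc

.data
buffer BYTE "ABC", 97 DUP(0)    ; 100 bytes reserved, holding a 3-character string

.code
main PROC
    mov  eax,SIZEOF buffer      ; a constant the assembler fills in: 100
    call WriteDec
    call Crlf

    INVOKE Str_length, ADDR buffer  ; counted while the program runs: 3
    call WriteDec
    call Crlf
    exit
main ENDP

END main
```

This should print 100 and then 3. The first number never changes no matter what is typed into the buffer, while the second follows the actual string.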

Infro_X

Sizeof puts a number into your program when you MAKE the program, and not when the user runs the program.
Examine the following code.

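StringCopier proc uses esi edi ecx, SourceA:PTR BYTE, SourceB:PTR BYTE, Destination:PTR BYTE, MaxChars:DWORD
mov esi,SourceA    ;esi = pointer to the string (lea would give the address of the parameter itself)
mov edi,Destination    ;edi = where the copy goes
mov ecx,-1    ;the loop adds 1 before it tests, so start one below 0
;While SourceA's char at ecx is not 0
;or
;While(SourceA[x]!=0x0)
compare4zero:
add ecx,1
cmp BYTE PTR [esi+ecx],0
jnz compare4zero
add ecx,1    ;count the null terminator as well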

This first piece is quite simple: get SourceA into esi, get the length of SourceA (counting its null terminator) into ecx, and get Destination into edi.
Now if we clear the direction flag and repeat movsb, we copy only the string and its null terminator to EDI (Destination).

cld
rep movsb

Destination now holds all the chars of SourceA.
Now we know where SourceA ends, and we can copy SourceB into Destination directly after it. But note, we have to overwrite the old null terminator, or anything reading the result will think the string ended between SourceA and SourceB.

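sub edi,1 ;Go back and write over the NULL terminator
mov esi,SourceB ;esi = pointer to the second string
mov ecx,-1 ;restart the count one below 0, the loop adds 1 before it tests
compare4zeroversion2:
add ecx,1
cmp BYTE PTR [esi+ecx],0
jnz compare4zeroversion2
add ecx,1 ;copy SourceB's null terminator too, so the result is terminated
rep movsb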


Also note, it'd probably be more logical to create a function to get the length of the string.


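StringLength proto :PTR BYTE ;needed because invoke is used before StringLength is defined

StringCopier proc uses esi ecx edi, SourceA:PTR BYTE, SourceB:PTR BYTE, Destination:PTR BYTE, MaxChars:DWORD
invoke StringLength,SourceA
mov ecx,eax ;the length excludes the null terminator, so it is not copied
mov esi,SourceA
mov edi,Destination
cld
rep movsb
invoke StringLength,SourceB
mov ecx,eax
add ecx,1 ;this time copy the null terminator as well
mov esi,SourceB
rep movsb
ret
StringCopier endp

;Returns in eax the number of chars before the null terminator
StringLength proc Source:PTR BYTE
mov eax,Source
sub eax,1 ;the loop adds 1 before it tests, so start one byte early
compare4zero:
add eax,1
cmp BYTE PTR [eax],0
jnz compare4zero
sub eax,Source ;address of the terminator minus start = length
ret
StringLength endp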