How would I go about removing commas from a string? I can add commas but can't figure out how to remove them.
Start scanning the string at the beginning. There is no need for a second buffer, since the resulting string will never be longer than the original. When you encounter a comma, shift the remainder of the string one place to the left (overwriting the comma) and continue scanning until you reach the end of the string.
This should work...
// Strip commas
mov edi,offset String
dec edi
:
inc edi
mov al,[edi]
test al,al
jz >.EOS
cmp al,","
jne <
mov esi,edi
.SHIFT
// shift the string
inc esi
mov al,[esi]
mov [esi-1],al
test al,al
jnz <.SHIFT
dec edi
jmp <
.EOS
I guess I should post MASM syntax too...
; Strip commas
mov edi,offset String
dec edi
HUNT:
inc edi
mov al,[edi]
test al,al
jz SHORT EOS
cmp al,","
jne SHORT HUNT
mov esi,edi
SHIFT:
; shift the string
inc esi
mov al,[esi]
mov [esi-1],al
test al,al
jnz SHORT SHIFT
dec edi
jmp SHORT HUNT
EOS:
Awww, I was somewhat on the right track. Thanks for the help, it works like a charm :)
Actually, there is no need to shift all of the remaining string every time you encounter a comma. You could also do it this way, assuming a null-terminated string:
mov edi,offset String
mov esi,edi
toploop:
lodsb ;get character
cmp al,"," ;is it a comma
jz @F ;disregard if it is
stosb ;store character
@@:
test al,al ;is it end of string
jnz toploop ;continue until end of string
;the string is now free of commas and null-terminated
Raymond
What about removing multiple items from a string? For example, say you have the string: "Hello! I'm really hungry!" How would you go about removing anything that wasn't either an uppercase character or number (commas, spaces, and any other punctuation or special character)?
well - to make it faster....
i would not modify the original string
instead, make a new string in an empty buffer
copy only the characters you want and skip the ones you do not want
Ahhh, thanks! I'll try that instead
i guess it isn't going to be any faster :P
but, it may be desirable to retain the original string, anyways
write the routine so that it accepts an input buffer address and an output buffer address...
then, make it so they can be the same
that way, the caller can choose to overwrite the original string, if desired
Another style, which replaces characters in place instead of removing them:
Lea edx,string
@@:
.if byte ptr [edx] == 0
jmp @F
.elseif byte ptr [edx] == ","
mov byte ptr [edx]," " ;or other char
.elseif byte ptr [edx] == ";" ;example only: add one .elseif per character to replace
mov byte ptr [edx],"." ;example replacement character
.endif
inc edx
jmp @B
@@:
I need to keep the original string intact so I can use it as part of the output. I think writing a procedure to keep the characters I want will be much easier than writing one that deletes the ones I don't want. I'm basically writing a program to evaluate whether a string is a palindrome. So, I need to keep only letters and numbers, convert all letters to either uppercase or lowercase, and compare the new modified string to itself... one being front to back and the other being back to front. At least those are my thoughts on it.
Lea edx,string
lea ecx,output
@@:
.if byte ptr [edx] == 0
jmp @F
.elseif byte ptr [edx] >= "a" && byte ptr [edx] <= "z"
mov al,byte ptr [edx]
mov byte ptr [ecx],al ;copy the character we want to keep
inc ecx
;...
.endif
inc edx
jmp @B
@@:
mov byte ptr [ecx],0
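Once the filtered copy exists, the front-to-back versus back-to-front comparison described above can be done with two pointers walking toward each other. This is only a rough sketch: the proc name IsPalin, the label names and the sample string Cleaned are made up for illustration, and it assumes the string has already been stripped to letters and digits and converted to one case.

```asm
include \masm32\include\masm32rt.inc

.data
Cleaned db "TACOCAT",0        ; hypothetical input, already filtered and uppercased

.code

; IsPalin (hypothetical name)
; Call with: EDX = address of a null-terminated string
; Returns:   EAX = 1 if it reads the same in both directions, else 0
IsPalin proc
    push esi
    push edi
    mov esi, edx              ; ESI walks forward from the first character
    mov edi, edx
  @@:
    cmp byte ptr [edi], 0     ; look for the terminator
    je @F
    inc edi
    jmp @B
  @@:
    dec edi                   ; EDI = last character (one before the start if empty)
  cmploop:
    cmp esi, edi
    jae ispal                 ; pointers met or crossed: every pair matched
    mov al, [esi]
    cmp al, [edi]
    jne notpal
    inc esi
    dec edi
    jmp cmploop
  ispal:
    mov eax, 1
    jmp finish
  notpal:
    xor eax, eax
  finish:
    pop edi
    pop esi
    ret
IsPalin endp

start:
    mov edx, offset Cleaned
    call IsPalin
    .if eax != 0
        print "palindrome",13,10
    .else
        print "not a palindrome",13,10
    .endif
    exit
end start
```

Comparing from both ends avoids building a reversed copy of the string, and an empty string falls through as a palindrome because the pointers have already crossed.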
Do both: use 2 buffers if you need the original, overwrite it if you don't. If you need to handle multiple characters, use a BYTE table.
IF 0 ; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
Build this template with "CONSOLE ASSEMBLE AND LINK"
ENDIF ; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
include \masm32\include\masm32rt.inc
proc_no_commas PROTO src:DWORD,dst:DWORD
nocommas MACRO arg1
invoke proc_no_commas,arg1,arg1
EXITM <eax>
ENDM
.code
start:
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
call main
inkey
exit
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
main proc
LOCAL str1 :DWORD
mov str1, chr$("This, is, a, test,,.,")
mov str1, nocommas(str1)
print str1,13,10
ret
main endp
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE
proc_no_commas proc src:DWORD,dst:DWORD
mov ecx, [esp+4] ; src
mov edx, [esp+8] ; dst
@@:
movzx eax, BYTE PTR [ecx]
add ecx, 1
cmp eax, ","
je @B
mov [edx], al
add edx, 1
test eax, eax
jnz @B
mov eax, [esp+8] ; return the dst address for the macro.
ret 8
proc_no_commas endp
OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
end start
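The BYTE table idea mentioned above might look something like the following. It is a sketch under assumptions, not a tested library routine: the names KeepTable, FilterStr, Source and Output are invented here, the table keeps only digits and letters, and lowercase letters are mapped to uppercase on the way through. A table entry of 0 means "drop this character", anything else is the byte to store.

```asm
include \masm32\include\masm32rt.inc

.data
; 256-entry lookup table (hypothetical): 0 = drop, otherwise the byte to write
KeepTable db 48 dup(0)                         ; codes 0..47
          db "0123456789"                      ; codes 48..57
          db 7 dup(0)                          ; codes 58..64
          db "ABCDEFGHIJKLMNOPQRSTUVWXYZ"      ; codes 65..90
          db 6 dup(0)                          ; codes 91..96
          db "ABCDEFGHIJKLMNOPQRSTUVWXYZ"      ; codes 97..122, lowercase -> uppercase
          db 133 dup(0)                        ; codes 123..255

Source    db "Hello! I'm really hungry!",0

.data?
Output    db 64 dup(?)

.code

; FilterStr (hypothetical name)
; src and dst may be the same buffer, the output is never longer than the input
; Returns: EAX = dst
FilterStr proc uses esi edi src:DWORD, dst:DWORD
    mov esi, src
    mov edi, dst
  nextchar:
    movzx eax, byte ptr [esi]     ; zero-extend so EAX can index the table
    inc esi
    test eax, eax
    jz finished                   ; end of the source string
    mov al, KeepTable[eax]        ; look up the replacement byte
    test al, al
    jz nextchar                   ; 0 in the table means skip it
    mov [edi], al
    inc edi
    jmp nextchar
  finished:
    mov byte ptr [edi], 0         ; terminate the output
    mov eax, dst
    ret
FilterStr endp

start:
    invoke FilterStr, addr Source, addr Output
    print addr Output,13,10
    exit
end start
```

With the sample string this should leave HELLOIMREALLYHUNGRY in Output. Changing which characters survive only means editing the table, the loop stays the same.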
Palindromes, eh? Interesting.
Not an answer to your question, but the attached file has a little program that lets you play around with palindromes with an edit box. Might give you some ideas. It's something I've been thinking about for a long time, a palindrome-making program.
"Straw? No, too stupid a fad: I put soot on warts!"
on a side note, anyone know how to concatenate strings together? I'm trying to put three strings together to form one long string for my final output.
For example:
String1 byte 'The string ',0
String2 byte ' is a palindrome.',0
String3 byte ' is not a palindrome.',0
I need to concatenate string1 with the original string the user entered, then concatenate that string with either string2 or string3
so the final string should look something like this:
The string "taco cat" is a palindrome.
Quote from: usafman2006 on October 21, 2011, 03:48:03 PM
on a side note, anyone know how to concatenate strings together?
I normally use Let esi="First string "+offset String2+" third string" ;-)
If you prefer it hand-made:
- put edi to the end of the first string
- put esi to the start of the second one
- mov ecx, len(second string)
- rep movsb
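Spelled out, the hand-made steps above could look roughly like this. The labels First and Second and the 32 spare bytes are assumptions for the example, and the count comes from SIZEOF rather than a length routine, so the terminating zero of the second string is copied as well.

```asm
include \masm32\include\masm32rt.inc

.data
First  db "The string ",0
       db 32 dup(0)             ; spare room after First for the appended text
Second db "is appended",0

.code
start:
    cld                         ; string instructions count upward
    mov edi, offset First
    xor eax, eax                ; AL = 0, the byte to search for
    mov ecx, -1                 ; effectively no limit on the scan
    repne scasb                 ; EDI stops one byte past the terminator of First
    dec edi                     ; back up onto the terminator
    mov esi, offset Second
    mov ecx, sizeof Second      ; includes the 0, so the result stays terminated
    rep movsb
    print addr First,13,10
    exit
end start
```

The destination must have enough free space after its terminator, which is why the example reserves extra bytes right behind First.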
there is no need to concatenate the strings in this case
change the strings to
String1 byte 'The string "',0
String2 byte '" is',0
String3 byte ' not',0
String4 byte ' a palindrome.',13,10,0
1) display String1
2) display the variable palindrome string
3) display String2
4) if it is not a palindrome, display String3 - otherwise, skip this step
5) display String4
invoke lstrcat,addr chaine,addr chaine1 ;zero terminate
invoke lstrcat,addr chaine,addr chaine2 ;and so on
For this program, I have to concatenate the three strings together and call the printstring procedure only once to print out the entire sentence. I'm getting closer to figuring it out though; thanks for all the help!
well -
Jochen is showing you a method using MasmBasic
Yves is showing you a method using Masm32
:P
the instructions you want to learn about are referred to as the "string" instructions
MOVS - move string - moves a value from [ESI] to [EDI]
SCAS - scan string - compares values in AL/AX/EAX with the value at [EDI]
CMPS - compare string - compares the value at [ESI] with the value at [EDI]
STOS - store string - stores the value in AL/AX/EAX to [EDI]
LODS - load string - loads the value at [ESI] into AL/AX/EAX
they each come in different sizes
for example, MOVS can be used in different forms
movsb ;byte
movsw ;word
movsd ;dword
they can also be used with a label that implies the size
the assembler will choose the correct opcode, based on how the label is defined
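if String1 and Buffer are defined as byte arrays...
movs Buffer,String1
the assembler uses MOVSB (MASM wants both operands here, but only to pick the size - the addresses still come from ESI and EDI)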
SCAS, STOS, and LODS use the AL, AX, or EAX register
the instructions use ESI and EDI as source and destination indexes (addresses)
MOVS and CMPS use both ESI and EDI
SCAS and STOS use EDI only
LODS uses ESI only
any of the string instructions may be combined with a REP prefix to cause repeat operations
the number of iterations is controlled by the count in ECX
the REP mnemonic is used with MOVS and STOS
REPZ or REPNZ are used with CMPS and SCAS
REP is not typically used with the LODS instruction, as it makes no sense to repeatedly load AL with a byte :P
REPZ (or REPE) repeats while the result of the compare or scan is zero, or until the count in ECX expires
REPNZ (or REPNE) repeats while the result of the compare or scan is not zero, or until the count in ECX expires
for all the string instructions, the addresses in ESI and/or EDI are incremented or decremented according to size
for example, if MOVSD is used, both ESI and EDI are incremented or decremented by 4 as each dword is moved
the direction flag is used to control direction
CLD - clears the direction flag (up)
STD - sets the direction flag (down)
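As a small illustration of the REP prefixes and the direction flag described above, here is a sketch that compares two byte strings with REPE CMPSB. The labels StrA and StrB are invented for the example, and both strings are assumed to be the same length so one count covers both.

```asm
include \masm32\include\masm32rt.inc

.data
StrA db "palindrome",0
StrB db "palindrone",0          ; differs from StrA in one letter

.code
start:
    cld                         ; make ESI and EDI count upward
    mov esi, offset StrA
    mov edi, offset StrB
    mov ecx, sizeof StrA        ; compare every byte, terminator included
    repe cmpsb                  ; stops at the first mismatch or when ECX hits 0
    .if ZERO?                   ; ZF still set means the last pair compared equal
        print "equal",13,10
    .else
        print "different",13,10
    .endif
    exit
end start
```

With these two strings the scan stops at the mismatched letter with ZF clear, so the second branch is taken. After the instruction ESI and EDI point one byte past the pair that ended the loop.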
now - for your specific case
you can create an empty buffer and copy each string into it by using MOVSB
to copy the constant strings, you can use SIZEOF String1-1 as the count (ECX)
we use -1 because we do not want to copy the 0 terminator
for the palindrome string, you can use the StrLen proc to get the size
.DATA
String1 byte 'The string "',0
String2 byte '" is',0
String3 byte ' not',0
String4 byte ' a palindrome.',13,10,0
.DATA?
Buffer db 256 dup(?)
.CODE
cld
mov edi,offset Buffer
mov esi,offset String1
mov ecx,sizeof String1-1
rep movsb
mov edx,offset Target
call StrLen
mov esi,edx
rep movsb
mov esi,offset String2
mov ecx,sizeof String2-1
rep movsb
;test for palindrome=true here - if so, jump to SkipNot
mov esi,offset String3
mov ecx,sizeof String3-1
rep movsb
SkipNot:
mov esi,offset String4
mov ecx,sizeof String4-1
rep movsb
mov byte ptr [edi],0 ;terminate the string
mov edx,offset Buffer
call WriteString
of course, there is almost always a simpler way :8)
it just may not be as fast...
;---------------------------------
PutStr PROC
push eax
jmp short PutSt1
PutSt0: inc ecx
inc edx
PutSt1: mov al,[ecx]
mov [edx],al
or al,al
jnz PutSt0
;EDX already points at the terminator just stored, so it is not adjusted here
pop eax
ret
PutStr ENDP
;---------------------------------
;
;
;
mov edx,offset Buffer
mov ecx,offset String1
call PutStr
mov ecx,offset Target
call PutStr
mov ecx,offset String2
call PutStr
;test for palindrome=true here - if so, jump to SkipNot
mov ecx,offset String3
call PutStr
SkipNot:
mov ecx,offset String4
call PutStr
mov edx,offset Buffer
call WriteString
Curious, where do you have your StrLen storing its value? I have mine being stored in a variable I created in the data segment. When I try to copy that value into the ecx register, the program crashes. Any ideas?
sorry about that :red
i was referring to the routines i posted in this thread...
http://www.masm32.com/board/index.php?topic=17557.msg148015#msg148015
the length is returned in the ECX register, where it is convenient for other operations
;-----------------------------------------------------------
StrLen PROC
;Call With: EDX = address of string
;
; Returns: ECX = length of string
PUSH EAX
PUSH EDI
MOV AL, 0
MOV EDI, EDX
MOV ECX, 0FFFFFFFFh
REPNZ SCASB
NEG ECX
SUB ECX,2
POP EDI
POP EAX
RET
StrLen ENDP
;-----------------------------------------------------------
i didn't put any such comments in the PutStr routine
but, it does not preserve ECX (not needed) and it returns EDX pointing to the new null terminator
this is handy for the next PutStr operation
also, notice that the routines preserve EAX
this is not normal win32 behaviour, but it is compatible with Kip Irvine's libraries
you could use EAX to hold a value indicating whether or not the string is a palindrome
i wrote a similar routine (posted somewhere) that returns the value in EAX
that is more or less how things are normally done in win32 - the result is in EAX
however, i modified this approach to "act more like what Kip would do" :P
Thanks for the help!