Putting a null at the end of a string

Started by allynm, July 29, 2009, 01:37:52 AM

allynm

Hello everyone -

When we initialize a string using the conventional notation var  db "    ", 0 the assembler kindly places a null at the end of the string in memory.  Let's change the scenario slightly.  Suppose I read a string of ASCII characters into memory using, for example, ReadFile.  When I look at memory I discover that 0D, 0A have been appended to the string, because I hit the carriage return at the end of the input, a newline was generated, and ReadFile counts these characters as part and parcel of the string.  I would like to know how you folks think I could get rid of the newline characters and replace them with a NULL so that the string looks like it "should" if it had been initialized as such.

A small, probably silly, question. 

Thanks as always,
Mark Allyn

dedndave

for line input from the console, they are always there, i think (for real mode DOS, it was only a 0Dh)
probably the fastest way is to use the NumberOfBytesRead value as an index and replace the 0Dh with a 0

mov ecx,NumberOfBytesRead
mov byte ptr InpReadBuffer[ecx-2],0

i think that will work - lol
well, you get the idea, anyway

hutch--

Mark,

Just scan the string and when you find the ASCII 13, write an ASCII zero in its place.
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

dedndave

i was gonna say that, too, Hutch
but it is slower, and requires use of edi
and, you still wind up putting the count in ecx - may as well just go there and terminate it
i suppose you could use std and start at the end to speed it up a bit
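
For comparison, here is a rough sketch of the scan approach done with scasb. It reuses the InpReadBuffer and NumberOfBytesRead names from the earlier post and assumes they are declared elsewhere (a byte buffer and a DWORD count); the proc name TermAtCR is made up for illustration. It scans forward from the start rather than backward with std.

```asm
; Sketch only: TermAtCR is a hypothetical name, and InpReadBuffer /
; NumberOfBytesRead are assumed to be defined in the data section.
TermAtCR proc uses edi
    mov edi, offset InpReadBuffer   ; scasb always scans through EDI
    mov ecx, NumberOfBytesRead      ; upper bound on the bytes to scan
    jecxz done                      ; nothing was read, nothing to do
    mov al, 13                      ; carriage return (0Dh)
    cld                             ; scan forward
    repne scasb                     ; stop after the first CR or when ECX = 0
    jne done                        ; ZF clear: no CR in the buffer
    mov byte ptr [edi-1], 0         ; EDI is one past the CR, so back up by 1
done:
    ret
TermAtCR endp
```

The jecxz guard matters because repne scasb with ECX = 0 does not touch the flags, so the jne that follows would otherwise test stale flags.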

allynm

Hi dedndave and hutch -

I will try both of your suggestions.  Thanks for your help.  As dedndave points out it does require edi, but I am still experimenting with the Intel instructions so it will be fun to see what happens.  This discussion does remind me of something that I think JJ (jochen?) mentioned (might have been MichaelW) in a recent posting concerning the number of cycles consumed by various instructions.  I think he reported results for a Celeron processor...but, more generally, is there some code around that can compute cycles for a 386?

Mark

hutch--

The "input" macro already removes the CRLF.


; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
    include \masm32\include\masm32rt.inc
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

comment * -----------------------------------------------------
                        Build this  template with
                       "CONSOLE ASSEMBLE AND LINK"
        ----------------------------------------------------- *

    .data?
      value dd ?

    .data
      item dd 0

    .code

start:
   
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    call main
    inkey
    exit

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

main proc

    LOCAL inp$  :DWORD

    mov inp$, input("Hmmmm, type something")

    print inp$,13,10

    print str$(len(inp$))," characters long",13,10

    ret

main endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

end start


It uses this library module.


; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    .486
    .model flat, stdcall
    option casemap :none   ; case sensitive

    .code

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE

StripLF proc src:DWORD

    mov eax, [esp+4]
    sub eax, 1
  @@:
    add eax, 1
    cmp BYTE PTR [eax], 0
    je tlfout
    cmp BYTE PTR [eax], 13
    jne @B

    mov BYTE PTR [eax], 0
    ret 4

  tlfout:
    ret 4

StripLF endp

OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    end

allynm

Hi Hutch -

Thanks for clarifying what INPUT will do.  I really did not know that this macro would accomplish this.  I am reading the code to mean that I don't actually need to use edi.  Is this correct?
Mark

hutch--

Mark,

The only time you are required to use either EDI or ESI specifically is when an instruction requires it, mainly the older string instruction like movsb, scasb etc .... and their WORD and DWORD counterparts. In that context ESI EDI are respectively the source and destination indexes. Apart from these usages they can be used as general purpose registers like any of the others. Just remember if you need to use either in a proc that you must preserve and restore their content at the beginning and end of the proc.
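
One common way to do the preserve-and-restore is the USES clause on the proc line, which makes the assembler emit the matching push and pop instructions. A small sketch, assuming the usual .model flat, stdcall setup; CopyBytes is a hypothetical name, not a MASM32 library routine.

```asm
; Sketch only: CopyBytes is a made-up example proc.
CopyBytes proc uses esi edi src:DWORD, dst:DWORD, count:DWORD
    mov esi, src        ; movsb reads from [ESI]
    mov edi, dst        ; and writes to [EDI]
    mov ecx, count      ; number of bytes to copy
    cld                 ; copy forward
    rep movsb
    ret                 ; the assembler pops EDI and ESI before returning
CopyBytes endp
```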

allynm

Hi Hutch -

This question may deserve a new thread....but, I'll take the plunge anyway.

I knew the ESI/EDI requirement on the string instructions.  I'm curious to know what you meant when you characterized movsb, scasb, etc. as "older"...if you have a moment, what were you thinking of as "newer"?

Regards,
Mark


Slugsnack

as time went on the x86 instruction set was expanded to add functionality. PUSHAD/POPAD and MOVZX are a few examples of 'newer' instructions off the top of my head

here you go though :
http://en.wikipedia.org/wiki/X86_instruction_listings

ToutEnMasm


another solution (if you know the number of bytes written):
get the address of the buffer
add the number of bytes written to this address
subtract 2 from this address and put a zero there; this replaces the 0D with zero.
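
The steps above might look roughly like this as a complete console program. This is only a sketch: BUFSIZE, hStdIn, bytesRead and buffer are names invented for the example, and it assumes the MASM32 runtime include is installed at its usual path. It also checks that the last two bytes really are 0Dh, 0Ah before stepping back, which the steps above take for granted.

```asm
    include \masm32\include\masm32rt.inc

    BUFSIZE equ 128                 ; hypothetical buffer size

    .data?
      hStdIn    dd ?
      bytesRead dd ?
      buffer    db BUFSIZE dup(?)

    .code

start:
    invoke GetStdHandle, STD_INPUT_HANDLE
    mov hStdIn, eax

    ; read at most BUFSIZE-1 bytes so there is always room for the zero
    invoke ReadFile, hStdIn, addr buffer, BUFSIZE-1, addr bytesRead, NULL

    mov ecx, bytesRead
    lea edx, buffer[ecx]            ; buffer address + number of bytes written
    cmp ecx, 2
    jb  terminate                   ; fewer than two bytes: no CR/LF pair
    cmp word ptr [edx-2], 0A0Dh     ; 0Dh,0Ah in memory reads as the word 0A0Dh
    jne terminate
    sub edx, 2                      ; step back onto the 0Dh
terminate:
    mov byte ptr [edx], 0           ; write the terminating zero

    print offset buffer, 13, 10
    exit

end start
```

Build it as a console application. If the input does not end in CR/LF (redirected input, for example), the zero is simply appended after the last byte read.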


hutch--

Mark,

The string instructions (MOVSB, SCASB etc ....) were designed in the 8086 days and required specific registers to function. Since the beginning of 32 bit x86 there have usually been faster ways to do this type of function, usually by loading the address into a register and performing the operation on each data size (BYTE WORD DWORD) then incrementing the pointer to the next data item. There are a couple of exceptions with MOVSD etc ... but only with the prefix REP and only over a certain data length.

Apart from the stack pointer ESP and, most of the time, the base pointer EBP, you can use any register for anything. You can use EBP if you write a "no stack frame" proc, and if you are really desperate and know what you are doing you can occasionally even use ESP. The general drift is that with freestyle code, which uses the registers available without being limited by instruction choice, you can write anything you like.

Another factor is chip silicon space, many of the slower instructions live in microcode where the fast simpler instructions live on direct silicon pathways and this gives you a pseudo RISC range if you need to keep your code fast. In most other things it does not matter if you use the older slower opcodes as long as your code is reliable and properly preserves registers.

allynm

Hi dedndave, Yves (tout en Masm), and Hutch--

I actually LIKED dedndave's solution quite a bit.  I coded up dedndave's solution and also the search via scasb, and dedndave's solution was quite elegant in comparison.  ON THE OTHER HAND, for those of us (Me!) still coming to grips with how to use the string instructions that Hutch has described, it is worth going through the scasb thing.  I wrote the code both ways and profited each time I did it.

Thanks,

Mark Allyn

dedndave

i am a bit behind the times, Mark
i have years of experience with all the microprocessors that you could think of that are obsolete - lol
so, these guys have a definite edge on me when it comes to knowing when to use which instructions
but, i thought the [ecx-2] thing was a sure winner ! - lol