Null terminated strings?

Started by unktehi, February 24, 2009, 04:03:20 PM


unktehi

I just have a simple question.  If a null terminated string ends a string with a 0 byte - what is its purpose? Is it always needed or just best practice?

ecube

The null terminator is the designated termination character; that's the standard for C-style strings. You can designate whatever character you want to end a string, it doesn't have to be a 0, but you'll have to write your own string handling functions that understand your changes, as the normal strcmp, strlen, etc. functions all expect a null terminator. This is a common problem when dealing with encryption and similar transformations of data: people forget that encryption is usually indifferent to the data it's dealing with and can turn a byte anywhere in the string into a 0, which throws off the normal string handling functions, as they'll only reach part of the total string. A good way to help overcome this is to use something like base64 to en/decode it.
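To make the encryption point concrete, here is a minimal C sketch. The XOR "cipher" is only a toy stand-in for real encryption, and the names xor_bytes, visible_length_after_xor and base64_encode are made up for this illustration. It shows strlen stopping early once a byte becomes 0, and Base64 turning the same bytes back into text that contains no zero bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy XOR "cipher" (illustration only, not real encryption).
   It takes an explicit length, so zero bytes in the data do not stop it. */
static void xor_bytes(unsigned char *buf, size_t len, unsigned char key)
{
    assert(buf != NULL);
    for (size_t i = 0; i < len; i++)
        buf[i] ^= key;
}

/* Returns how many bytes strlen() can still see after text is XORed with key. */
static size_t visible_length_after_xor(const char *text, unsigned char key)
{
    unsigned char buf[64] = {0};
    size_t len = strlen(text);
    assert(len < sizeof buf);
    memcpy(buf, text, len);
    xor_bytes(buf, len, key);
    return strlen((const char *)buf);   /* stops at the first 0 byte */
}

static const char B64[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Encodes len bytes of src as null-terminated Base64 text in out.
   out must hold at least 4 * ((len + 2) / 3) + 1 bytes. */
static void base64_encode(const unsigned char *src, size_t len, char *out)
{
    size_t i = 0, o = 0;
    while (i + 2 < len) {               /* full 3-byte groups */
        out[o++] = B64[src[i] >> 2];
        out[o++] = B64[((src[i] & 0x03) << 4) | (src[i + 1] >> 4)];
        out[o++] = B64[((src[i + 1] & 0x0F) << 2) | (src[i + 2] >> 6)];
        out[o++] = B64[src[i + 2] & 0x3F];
        i += 3;
    }
    if (i < len) {                      /* 1 or 2 leftover bytes, padded with '=' */
        out[o++] = B64[src[i] >> 2];
        if (i + 1 < len) {
            out[o++] = B64[((src[i] & 0x03) << 4) | (src[i + 1] >> 4)];
            out[o++] = B64[(src[i + 1] & 0x0F) << 2];
        } else {
            out[o++] = B64[(src[i] & 0x03) << 4];
            out[o++] = '=';
        }
        out[o++] = '=';
    }
    out[o] = '\0';
}
```

XORing "hello" with the key 'l' turns both 'l' characters into 0, so strlen only reports 2 of the 5 bytes. Passing the 5 encrypted bytes and their length to base64_encode gives ordinary text that strlen and strcmp handle normally.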

unktehi

I don't understand everything you wrote about, but basically, a 0 should be used as a default string terminator. If I use another character to end the string I'd have to write additional functions so that the program would understand it. Is that correct?

I don't understand what you mean about the encryption and using base 64 to en/decode?

BogdanOntanu

When you store a string in memory then you must have a way or a method to determine the length of the string.

Basically there are two such methods:
1) Have a special character at the end of the string
2) Store the length of the string somewhere

The first method with a ZERO (aka null character) at the end of the string is mainly used in C/C++
The second method with the length of the string stored at the start of the string (a byte, a word or a dword) is mainly used by Pascal and Basic.
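As a rough illustration of the two methods, here is a small C sketch. The PString type and the function names are hypothetical, chosen only for this example; the one-byte length prefix is just one of the possible sizes mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Method 1: scan for the terminating zero, the way strlen() does.
   The cost grows with the length of the string. */
static size_t zstr_length(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Method 2: hypothetical Pascal-style layout, one length byte followed
   by the characters. No terminator is stored. */
typedef struct {
    unsigned char len;    /* 0..255 */
    char data[255];       /* may contain any byte value, including 0 */
} PString;

/* Builds a PString from a null-terminated string (truncates at 255 chars). */
static PString pstr_from(const char *s)
{
    PString p = {0};
    size_t n = strlen(s);
    if (n > 255)
        n = 255;
    p.len = (unsigned char)n;
    memcpy(p.data, s, n);
    return p;
}

/* The length is read directly from the prefix, in constant time. */
static size_t pstr_length(const PString *p)
{
    assert(p != NULL);
    return p->len;
}
```

The trade-off shows up directly: the terminated form has no length limit but must be scanned, while the prefixed form knows its length immediately but is capped by the size of the prefix.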

If I remember correctly in DOS the "$" character was used by some routines as a string terminator.

In ASM you can use whichever method you like, but the first one, with the zero / null terminator, is more common, and many OS routines expect or assume strings in this format.

Of course you can use your own variations, but then you must have your own routines to handle these custom strings, and you will still have to use null-terminated strings whenever you interface with the OS or another code library.
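A minimal C sketch of what "your own routines" could look like, using the "$" terminator as the custom convention. The names term_length and to_zstr are invented for this example. The second function does the conversion you would need before handing such a string to code that expects a null terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length of a string that ends at the first occurrence of term.
   The caller must guarantee that term really is present. */
static size_t term_length(const char *s, char term)
{
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != term)
        n++;
    return n;
}

/* Copies a term-terminated string into dst as a null-terminated string.
   Returns 0 on success, -1 if dst is too small. */
static int to_zstr(const char *s, char term, char *dst, size_t dst_size)
{
    size_t n = term_length(s, term);
    if (n + 1 > dst_size)
        return -1;
    memcpy(dst, s, n);
    dst[n] = '\0';        /* add the terminator the OS routines expect */
    return 0;
}
```

For example, to_zstr("Hello, DOS$", '$', buf, sizeof buf) leaves "Hello, DOS" in buf, ready for strlen, strcmp or an API call.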

Ask yourself what are the advantages and the disadvantages of each method... this is the first step towards understanding.

unktehi

That makes sense! Thanks for the explanation!

Vortex

Hi unktehi,

Try examining null-terminated strings under a debugger like OllyDbg. This will help you understand the internals of the functions that process null-terminated strings.

unktehi

Thanks for the suggestion, I'll download that and take a look when I have a chance!

ecube

Quote from: unktehi on February 24, 2009, 08:37:53 PM
That makes sense! Thanks for the explanation!
He said what I said... just a bit extra. Guess I'm speaking a different language :p

mitchi

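Zero is a good terminator for a string because the OS gives you zeroed pages of memory, so in a freshly allocated buffer you often don't need to add that zero yourself. That only holds for memory nothing has written to yet; in reused heap or stack memory you still have to write the terminator.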
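A small C sketch of the same idea one level up from the OS: calloc is specified to return zero-filled memory, so a string copied into such a buffer is already terminated as long as at least one byte after it is left untouched. The function name copy_into_zeroed is made up for this example. Memory from malloc or a local array on the stack gives no such guarantee.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copies the first n characters of src into a fresh zero-filled buffer
   of n + 1 bytes. The terminator is never written explicitly; the last
   byte is still 0 from calloc. The caller must free() the result. */
static char *copy_into_zeroed(const char *src, size_t n)
{
    assert(src != NULL);
    char *buf = calloc(n + 1, 1);
    if (buf == NULL)
        return NULL;
    memcpy(buf, src, n);
    return buf;
}
```

Calling copy_into_zeroed("hello world", 5) yields a buffer that strcmp sees as "hello", even though no code ever stored the final 0.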

unktehi

Thanks guys - this has helped me understand better!