Programmatic "LengthOf"

00100b · January 18, 2005, 03:42:01 AM

With the linear read of bytes, I have the concept down for iterating through the elements of a string, counting until the null-terminator is encountered.

I also know that MASM provides the LengthOf, SizeOf, and Type directives.

My question is... When iterating through the bytes (of either an array or a string, which I know is just an array) that doesn't have a null-terminator or an apparent indication that the upper bound of the array has been achieved, how do I determine the length of/size of said array?

Thank you

hutch-- · January 18, 2005, 05:43:53 AM

4b,

There is no obvious way at runtime to test this value, but if the memory is dynamically allocated there may be a trick from the OS that manages the memory. At the crudest you would set up an exception handler and just keep reading the array members until it generated an exception. There is an API named something like IsBadReadPtr() that can be used to do spot address checks, but in the end, from a bare address of the beginning of an array that does not store the size data somewhere, it's no joy to do what you are after.

tenkey · January 18, 2005, 10:00:31 AM

If an array or string does not use terminators, you will need to keep and manage bounds information. It can be as simple as a count, or you can keep both upper and lower bounds information.

If the array or string does not change in size, the size can be a constant used everywhere in your program. However, if you use nonterminated arrays and strings of varying sizes as arguments to a function, you will need to provide size information somehow, as there will not be any intrinsic property to test for.
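A minimal sketch of the "keep a count" approach, assuming 32-bit MASM linked against kernel32.lib. SumBytes, buf, and BUF_LEN are hypothetical names made up for illustration; the point is only that the length travels with the pointer as a second argument.

```masm
.386
.model flat, stdcall
option casemap:none

ExitProcess PROTO :DWORD
includelib kernel32.lib

.data
buf      db 10, 20, 30, 40         ; four bytes, no terminator
BUF_LEN  equ $ - buf               ; byte count computed by the assembler (4)

.code
; Hypothetical helper: adds up cbBuf bytes starting at pBuf, total in eax.
SumBytes proc uses esi pBuf:DWORD, cbBuf:DWORD
    mov   esi, pBuf
    mov   ecx, cbBuf
    xor   eax, eax                 ; running total
    test  ecx, ecx
    jz    sum_done                 ; empty buffer: nothing to read
sum_loop:
    movzx edx, byte ptr [esi]      ; read one element
    add   eax, edx
    inc   esi
    dec   ecx                      ; the count, not the data, ends the loop
    jnz   sum_loop
sum_done:
    ret
SumBytes endp

start:
    invoke SumBytes, addr buf, BUF_LEN   ; eax = 100
    invoke ExitProcess, eax
end start
```

The callee never looks for a sentinel value, so the buffer may contain any bytes at all, including zeros.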

00100b · January 18, 2005, 05:27:05 PM

Thanks for the responses.

Checking for the length of a parameter for a procedure is ultimately what I'm trying to accomplish and I envision that this procedure could eventually be used for strings passed from sources external to an application.

My guess is, that the LengthOf (and SizeOf) directive is using the approach that hutch mentioned in that it just keeps reading elements until an exception is generated.

A follow up question, if I may. Does it hurt anything to just append a null-terminator to any string, even if it already has a null-terminator? If not, then I would be able to iterate through the elements, counting until the first null-terminator. Ack, I just realized that it would be difficult to know where to put the terminator without knowing where the string ends in the first place.

Another question, if I may. Where would be a good reference/resource on exception handling that breaks it down a little more to my level? The closest documentation I have for this topic is in the book, "The Unabridged Pentium 4: IA32 Processor Genealogy" (Good book by the way. Covers each processor from the 386 on up.), or in the AoA, a little section on its implementation of a "Try...Exception...EndTry" construct that HLA provides.

I have tried to take a look at the disassembled code for a test program in which I used the LengthOf directive, but all I could see is that it placed the value in the MOV statement where I used it. I was hoping to see all of the instructions that would have been used by the assembler to discern the value.
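That observation is consistent with LENGTHOF, SIZEOF, and TYPE being evaluated by the assembler from the data declaration itself, so no counting instructions ever reach the executable. A small sketch, assuming 32-bit MASM and kernel32.lib; wArray is a made-up name:

```masm
.386
.model flat, stdcall
option casemap:none

ExitProcess PROTO :DWORD
includelib kernel32.lib

.data
wArray  dw 1, 2, 3, 4, 5           ; five WORD elements

.code
start:
    mov eax, LENGTHOF wArray       ; assembles as mov eax, 5  (element count)
    mov edx, SIZEOF wArray         ; assembles as mov edx, 10 (size in bytes)
    mov ecx, TYPE wArray           ; assembles as mov ecx, 2  (bytes per element)
    invoke ExitProcess, 0
end start
```

Because the operators work on the symbol's declaration, they cannot be applied to a bare pointer received at runtime, which is why they do not help with a string passed in from outside.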

Thanks again.

Robert Collins · January 18, 2005, 09:22:02 PM

Quote from: 00100b on January 18, 2005, 03:42:01 AM
My question is... When iterating through the bytes (of either an array or a string, which I know is just an array) that doesn't have a null-terminator or an apparent indication that the upper bound of the array has been achieved, how do I determine the length of/size of said array?

OK, I can't tell you for sure because I am not yet far enough into assembly coding to know one way or another, but it appears to me that if you do not know the real length of a contiguous stretch of memory (which is your case in point), and you know that it does not end in a zero or you are not sure whether it does, then you just aren't going to be able to get the 'true' length. Your iterating will just go on until it finds a zero somewhere up there or you cause some sort of memory violation (and even then there is no absolute assurance that it is the 'real' end of your string).

00100b · January 19, 2005, 05:07:05 AM

Thanks for the response Robert Collins.

I think that was what hutch was getting at concerning an exception.

I've been looking at the BOUND instruction (which checks an index value to see if it falls within the memory bounds of an array), but it deals with WORD and not BYTE. I've also been looking at the ALIGN directive since it will pad the unused bytes with nulls.

I'm not entirely sure on how to trap for the Array Bound Check Exception (or any other exceptions for that matter, yet). And admittedly, ALIGN might just be grasping at straws on my part.

Thank you.

hutch-- · January 19, 2005, 05:17:34 AM

4b,

If it's code you are writing yourself there are a number of tricks to fix the problem, but they involve storing the data you need when you create the array. If you store the data size and the member count you are in business, and you can do that as two DWORDs at the beginning of the array. It means two address variables, one for the original array address and another that is 8 bytes higher where the data starts. It's simple enough to do.
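A rough sketch of that layout, assuming 32-bit MASM and kernel32.lib. The names arrBase, arrData, and GetCount are hypothetical, and the header values are written as literals here only to keep the example short.

```masm
.386
.model flat, stdcall
option casemap:none

ExitProcess PROTO :DWORD
includelib kernel32.lib

.data
arrBase  dd 12                     ; header DWORD 0: data size in bytes
         dd 3                      ; header DWORD 1: member count
arrData  dd 11, 22, 33             ; data starts 8 bytes above arrBase

.code
; Hypothetical helper: given the data address, returns the member count in eax.
GetCount proc pData:DWORD
    mov eax, pData
    mov eax, [eax-4]               ; count sits just below the data
    ret
GetCount endp

start:
    mov    edx, offset arrData
    mov    ecx, [edx-8]            ; ecx = 12 (data size in bytes)
    invoke GetCount, addr arrData  ; eax = 3  (member count)
    invoke ExitProcess, eax
end start
```

Callers only ever pass around the data address; the header is found by stepping back a fixed 8 or 4 bytes, so this only works for arrays that were created with the header in place.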

00100b · January 19, 2005, 05:52:04 AM

Thanks for the response hutch.

Something like a BSTR. I'll tuck that one away. Good idea, thanks.

This whole exercise is just that, a learning exercise. The LengthOf directive works well enough; I was just trying to figure out how it actually worked and whether I could duplicate it. I got off on this tangent when writing the step-logic for a procedure in which one of the steps is to determine the length of a supplied string. That led me to the LengthOf and SizeOf directives, which got me, well, to where I am now.

Hmm, maybe my studies would go smoother, if not faster, if I didn't find myself going off on all of these tangents.

Thanks again.

farrier · January 19, 2005, 07:49:55 AM

00100b,

This may be too obvious but, I use it all the time.

Thing   struct
           height   db   7  dup (?)     ; 7-byte field
           weight   db   5  dup (?)     ; 5-byte field
           color    db   10 dup (?)     ; 10-byte field
Thing   ends

.data?
before   dd  ?
Inst     Thing   <>
so_Inst  equ  $ - Inst                  ; size of Inst in bytes (22)

Inst is an instance of the Thing structure, and the size of that instance is calculated as so_Inst. Since so_Inst is an assemble-time constant, Inst + so_Inst is the upper bound of Inst. You can use either the size or the upper bound to restrict an algorithm.

Using the so_Inst in your code instead of a hard coded value, you can make changes to the structure and not have to modify your code because you changed the size of the structure.

hth,

farrier

Ghirai · January 19, 2005, 10:48:30 AM

Like hutch said, you can use this API: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/memory/base/isbadreadptr.asp

Robert Collins · January 19, 2005, 05:22:58 PM

Correct me if I am wrong, but I got the impression that the issue was this: you have a contiguous stretch of memory whose size you do not know, somewhere in it is some data (either an all-ASCII string or perhaps even binary), and you want to find the 'true' end of the data only. If you know that the data is pure ASCII then why couldn't you just bump through the bytes until a non-ASCII byte is found? If the data is binary then I don't see how you can find the end of the data.
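For what it's worth, the "bump through until a non-ASCII byte" idea might be sketched like this, assuming 32-bit MASM and kernel32.lib. AsciiRun, blob, and BLOB_MAX are made-up names, "ASCII" is taken here to mean the printable range 20h..7Eh, and the cbMax cap is an added safety limit that is not part of the idea itself.

```masm
.386
.model flat, stdcall
option casemap:none

ExitProcess PROTO :DWORD
includelib kernel32.lib

.data
blob      db "HELLO", 0FFh, "junk" ; the 0FFh byte ends the printable run
BLOB_MAX  equ $ - blob             ; hard cap so the scan cannot run away

.code
; Hypothetical helper: counts bytes from pBuf until the first byte outside
; 20h..7Eh, or until cbMax bytes have been examined. Count returned in eax.
AsciiRun proc uses esi pBuf:DWORD, cbMax:DWORD
    mov  esi, pBuf
    xor  eax, eax                  ; eax = bytes accepted so far
scan_next:
    cmp  eax, cbMax
    jae  scan_done                 ; reached the cap
    mov  dl, [esi+eax]
    cmp  dl, 20h
    jb   scan_done                 ; control character or null
    cmp  dl, 7Eh
    ja   scan_done                 ; DEL or a byte with the high bit set
    inc  eax
    jmp  scan_next
scan_done:
    ret
AsciiRun endp

start:
    invoke AsciiRun, addr blob, BLOB_MAX   ; eax = 5
    invoke ExitProcess, eax
end start
```

The weakness is that whatever happens to follow the data in memory may itself be printable, in which case the count overshoots, and without some cap the scan can still walk into memory it has no business reading.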

I don't see how that API (IsBadReadPtr) will do the job. One of the parameters is the ucb - Size of the memory block, in bytes. But here you must know ahead of time what the total size of the memory block is. My assumption is that you do not know the memory size. Again, correct me if I am wrong.

tenkey · January 20, 2005, 03:41:26 AM

The major problem with IsBadReadPtr is that the boundary between valid and invalid memory (what is actually being tested) does not correspond with the data boundary.

Robert Collins · January 20, 2005, 03:48:08 AM

Quote from: tenkey on January 20, 2005, 03:41:26 AM
The major problem with IsBadReadPtr is that the boundary between valid and invalid memory (what is actually being tested) does not correspond with the data boundary.

Which makes the usage of the API undesirable for this situation.
