
wtok fails on short words (single-character words)

Started by TNick, January 05, 2009, 10:17:16 AM


TNick

Hello! It's been a long time since I've posted here... Hope you are all ok!

Today I found a problem with the wtok function in the MASM32 package. When it comes to allocating a memory area for the pointer array, this code determines how much memory is requested from the system:

    invoke StrLen,pText             ; get the buffer length
    add eax, eax                    ; set pointer array size

This works well in most circumstances, but for short strings made of one-character words it fails to request enough memory, because the array needs one 4-byte pointer per word.
Examples:

- the string that you send is: "a",0
- => length is 1
- => memory requested from system: 2 bytes
- => memory needed: 4 bytes

- the string that you send is: "a b",0
- => length is 3
- => memory requested from system: 6 bytes
- => memory needed: 8 bytes

- the string that you send is: "a b c",0
- => length is 5
- => memory requested from system: 10 bytes
- => memory needed: 12 bytes

Apart from these examples (and similar ones) the allocation mechanism works just fine.
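
The arithmetic in the examples can be modelled in a few lines of C. This is only a sketch, not the MASM32 source: the helper names are hypothetical, it assumes a space is the only delimiter, and it assumes a 32-bit target where each array entry is a 4-byte pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count space-separated words in text.
   Assumes ' ' is the only delimiter (an illustration, not wtok itself). */
size_t count_words(const char *text)
{
    size_t words = 0;
    int in_word = 0;

    assert(text != NULL);
    for (; *text != '\0'; ++text) {
        if (*text == ' ') {
            in_word = 0;            /* delimiter ends the current word */
        } else if (!in_word) {
            in_word = 1;            /* first character of a new word */
            ++words;
        }
    }
    return words;
}

/* Bytes requested by the original code: StrLen result doubled. */
size_t bytes_requested(const char *text)
{
    return 2 * strlen(text);
}

/* Bytes needed: one 4-byte pointer per word (32-bit assumption). */
size_t bytes_needed(const char *text)
{
    return 4 * count_words(text);
}
```

Calling bytes_requested and bytes_needed on "a", "a b" and "a b c" reproduces the three shortfalls listed above, while a string of longer words such as "hello world" requests more than it needs.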

As a side note, this post is intended to be just a "heads-up!", since strings of this kind are most probably rarely tokenised.
If you do tokenise them, the simplest solution is to copy the entire function and replace the code above with:

    invoke StrLen,pText             ; get the buffer length
    add eax, eax                    ; set pointer array size
    add eax, 2                      ; 2 extra bytes cover the cases presented above

or

    invoke StrLen,pText             ; get the buffer length
    shl eax, 2                      ; 4 bytes per character; allocates far too much on large blocks of text
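
To see why two extra bytes are enough, here is a small C check of the worst case. It is a sketch under assumptions, not part of MASM32: the function names are hypothetical, pointers are taken to be 4 bytes, and the worst case for a string of length n is taken to be one-character words separated by single delimiters, which gives (n + 1) / 2 words.

```c
#include <assert.h>
#include <stddef.h>

/* Worst case: one-character words split by single delimiters,
   so a string of length len holds (len + 1) / 2 words of 4 bytes each. */
size_t worst_case_bytes(size_t len)
{
    return 4 * ((len + 1) / 2);
}

/* Original request: twice the string length. */
size_t original_request(size_t len)
{
    return 2 * len;
}

/* Modified request: twice the string length plus 2. */
size_t fixed_request(size_t len)
{
    return 2 * len + 2;
}

/* Count the lengths in [1, max_len] for which request(len)
   is smaller than the worst-case requirement. */
size_t count_shortfalls(size_t (*request)(size_t), size_t max_len)
{
    size_t shortfalls = 0;
    size_t len;

    assert(request != NULL);
    for (len = 1; len <= max_len; ++len) {
        if (request(len) < worst_case_bytes(len))
            ++shortfalls;
    }
    return shortfalls;
}
```

Under these assumptions the original request falls short only for odd lengths, where the worst case needs 2n + 2 bytes, and the modified request never falls short.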



Best Regards,
Nick

raymond

The system will normally allocate memory in multiples of a default minimum size (such as 16 bytes). Your problem may not be related to "short strings" but instead to strings with a length exactly equal to a multiple of that default minimum size.

There is a function to determine the actual size of the allocated memory.
When you assume something, you risk being wrong half the time
http://www.ray.masmcode.com

TNick

Hello, raymond!

Quote from: raymond on January 06, 2009, 02:00:49 AM
The system will normally allocate memory in multiples of a default minimum size (such as 16 bytes).

Yes, there's no doubt about it.

Quote from: raymond on January 06, 2009, 02:00:49 AM
Your problem may not be related to "short strings" but instead to strings with a length exactly equal to a multiple of that default minimum size.

I solved my problem by modifying the function: I increased the amount of memory requested. As I said, the thread is intended as a warning/help for those who run into the same problem in the future.

The fact is that, in its original form, the function allocates less memory than needed. It is true that the system rounds up the requested size, but at the end of the function there is a call to GlobalReAlloc. That call will fail because it will not find the marker for "end of chunk", and it will return 0.


  ; ----------------------
  ; truncate unused memory
  ; ----------------------
    invoke GlobalReAlloc,edx,eax,0  ; EDX is the allocated memory handle being reallocated

  ; -----------------------------------
  ; copy the local array handle back
  ; to the address of the passed handle
  ; -----------------------------------
    mov esi, pArray                 ; load passed handle address into ESI
    mov DWORD PTR [esi], eax        ; store local array handle at address of passed handle

    mov eax, ebx                    ; return array count in EAX

    pop edi
    pop esi
    pop ebx

    ret


This is the last part of the wtok function. As you can see, the result of GlobalReAlloc is stored directly at the address of the passed handle without being checked. In case of a failure this also leads to a (small) memory leak, since the initially allocated chunk is still valid and present in the heap.
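
The usual way to avoid losing the original block is to keep the old handle until the reallocation is known to have succeeded. The sketch below shows that pattern in C, using the standard realloc as a stand-in for GlobalReAlloc (both return NULL on failure and leave the original block allocated). The function name shrink_block is hypothetical and this is not the MASM32 code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: shrink *block to new_size bytes.
   Returns 1 on success and updates *block.
   Returns 0 on failure and leaves *block untouched, so the caller
   still owns the original block and can use or free it. */
int shrink_block(void **block, size_t new_size)
{
    void *resized;

    assert(block != NULL && *block != NULL && new_size > 0);

    resized = realloc(*block, new_size);
    if (resized == NULL)
        return 0;                   /* original block is still valid */

    *block = resized;               /* overwrite only after success */
    return 1;
}
```

In wtok the equivalent would be to test EAX after GlobalReAlloc and, on zero, either keep the original handle or free it before returning, instead of storing the zero in the caller's variable.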

Once again, this is an extremely rare case, since most of the time you will pass a string consisting of variable-length words, with more than a few words in the string. Still, if there is a chance that the string you pass to this function takes this form: "a b c" ... " n m "..., then the function WILL fail and WILL cause a small memory leak each time, so you should copy the original function and modify the first part that I described in the first message.

Best regards,
Nick