stack vs memory

Started by loki_dre, April 21, 2008, 07:13:17 PM


hutch--

Unless Microsoft changes the timing of GlobalAlloc() in a later version, it still seems to have the legs against other memory strategies when you use the GMEM_FIXED flag. Apart from that there are a couple of legacy applications for it with other flags, but it's usually off the pace.

It has many characteristics that I find useful: fine granularity, it will allocate up to the physical limit of the OS, and with large allocations it's fast. Recently, while I was working on the dynamic array code, I started with HeapAlloc() for the main pointer array but it crashed after about 300000 stored DWORD pointers. I changed it to GlobalAlloc() and it is both fast and reliable. I used OLE string memory for the array members as it has the "garbage collection" characteristic that reduces memory fragmentation, as well as storing the length 4 bytes below the start address.
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

ic2


donkey

Quote from: hutch-- on April 27, 2008, 12:06:59 AM
Unless Microsoft changes the timing of GlobalAlloc() in a later version, it still seems to have the legs against other memory strategies when you use the GMEM_FIXED flag. Apart from that there are a couple of legacy applications for it with other flags, but it's usually off the pace.

It has many characteristics that I find useful: fine granularity, it will allocate up to the physical limit of the OS, and with large allocations it's fast. Recently, while I was working on the dynamic array code, I started with HeapAlloc() for the main pointer array but it crashed after about 300000 stored DWORD pointers. I changed it to GlobalAlloc() and it is both fast and reliable. I used OLE string memory for the array members as it has the "garbage collection" characteristic that reduces memory fragmentation, as well as storing the length 4 bytes below the start address.

Well, HeapAlloc and GlobalAlloc both use the heap to allocate memory, so I don't know why you had problems with HeapAlloc. I've never experienced any stability issues with either of them. The speed difference is in the calls to ntdll.dll: the Heapxxx functions are simple forwarding wrappers for NTDLL, while the Globalxxx functions must perform various setups each time a memory management function is called in order to maintain compatibility with Windows 3.x. For the actual moving of data in and out there is no difference in speed as far as I can tell. The extra overhead involved in keeping the Globalxxx functions compatible with 16-bit legacy features such as the clipboard and DDE makes calls to those functions slower by comparison. However, when you look at the disassembly of Kernel32.dll, the actual allocation schemes are identical, so I can't see where one would fail and the other wouldn't.

As for granularity, both use the same heap and are subject to the same granularity, so there cannot be any difference there. But both are subject to severe fragmentation when the block of memory gets too large, so in both cases allocating over 4 MB, though possible, is not recommended or you may experience slowdowns.

Donkey
"Ahhh, what an awful dream. Ones and zeroes everywhere...[shudder] and I thought I saw a two." -- Bender
"It was just a dream, Bender. There's no such thing as two". -- Fry
-- Futurama

Donkey's Stable

zooba

The process heap is limited in size (see the /HEAP linker option).

If you create a growable heap using HeapCreate() then you won't hit an upper limit until the system runs out of memory.

Cheers,

Zooba :U

hutch--

Where I found the problem with HeapAlloc() was the size of the allocation. I think Zooba is right that you can set it up using HeapCreate(), but I then wonder whether it's less overhead than GlobalAlloc(), which easily allocates over a gigabyte with no hiccups at all. In the array system I needed fixed memory for the pointer array but movable memory for the members.

Where you don't use GlobalAlloc() is with the movable flag set, as it is purely legacy code for DDE and the clipboard, but with the fixed memory flag nothing else is faster that I have found. I have also picked up that with later versions of XP and probably Vista the low fragmentation heap option for HeapAlloc() is particularly slow, so I have yet to find good reason to use HeapAlloc(), which has always sounded like a leftover from the old C code mentality of the DOS days.

donkey

Quote from: hutch-- on April 27, 2008, 05:08:10 AM
Where I found the problem with HeapAlloc() was the size of the allocation. I think Zooba is right that you can set it up using HeapCreate(), but I then wonder whether it's less overhead than GlobalAlloc(), which easily allocates over a gigabyte with no hiccups at all. In the array system I needed fixed memory for the pointer array but movable memory for the members.

Where you don't use GlobalAlloc() is with the movable flag set, as it is purely legacy code for DDE and the clipboard, but with the fixed memory flag nothing else is faster that I have found. I have also picked up that with later versions of XP and probably Vista the low fragmentation heap option for HeapAlloc() is particularly slow, so I have yet to find good reason to use HeapAlloc(), which has always sounded like a leftover from the old C code mentality of the DOS days.

Why would you need moveable memory for the members? Unless you plan to copy them to the clipboard, moveable memory is useless in Win32. Even Raymond Chen admits that there is no practical use for moveable memory outside of clipboard and DDE functions in the Old New Thing and here. And even then, the only reason clipboard functions may fail without GlobalAlloc is the rampant use of undocumented and unsupported behaviour by third party applications that may interact with yours, and those are becoming rarer as the users of co-operative multitasking disappear (thankfully).

GMEM_MOVEABLE was only ever useful in co-operative multitasking. Since there was no virtual memory, the memory manager had to be able to move blocks of memory from time to time and applications could not rely on pointers in real memory being stable, so they used a handle instead and had to ask the OS where the memory actually ended up each time they wanted to access it. The advent of virtual memory made this scheme pointless and for the most part (outside of a few legacy functions) it is never used in modern coding, since it is always more efficient to use a fixed pointer. However, since in Windows 1.x applications were discouraged from using fixed pointers, when 32-bit Windows was introduced most applications used moveable memory and had to be accommodated, so the kludge was added to the API and we're stuck with it. That doesn't mean we have to use it: internally there are around 15 to 20 extra API calls just to allocate memory using the Global functions, and the end result is exactly the same as the Heap functions with the exception of the return value type.
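
For readers who never worked with the handle scheme described above, here is a minimal sketch of the idea in portable C. It is a toy model only, not how GlobalAlloc or GlobalLock are actually implemented, and every name in it (toy_heap, toy_alloc, toy_lock, toy_free, toy_compact) is hypothetical. The point it illustrates is that the caller keeps a handle and has to ask for the current address before every access, because compaction may have moved the block.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy model of handle-based moveable memory.
   Not the real GlobalAlloc/GlobalLock implementation. */
enum { ARENA_SIZE = 64, MAX_HANDLES = 8 };

typedef struct {
    unsigned char arena[ARENA_SIZE]; /* the "real" memory                */
    size_t offset[MAX_HANDLES];      /* where each block currently lives */
    size_t size[MAX_HANDLES];        /* 0 means the slot is unused       */
    size_t used;                     /* bump pointer: next free byte     */
} toy_heap;

/* Allocate n bytes; returns a handle (slot index) or -1 on failure. */
static int toy_alloc(toy_heap *h, size_t n)
{
    if (n == 0 || h->used + n > ARENA_SIZE)
        return -1;
    for (int i = 0; i < MAX_HANDLES; i++) {
        if (h->size[i] == 0) {
            h->offset[i] = h->used;
            h->size[i] = n;
            h->used += n;
            return i;
        }
    }
    return -1;
}

/* Release a block; the hole it leaves stays until toy_compact() runs. */
static void toy_free(toy_heap *h, int handle)
{
    assert(handle >= 0 && handle < MAX_HANDLES);
    h->size[handle] = 0;
}

/* Translate a handle into the block's current address. */
static void *toy_lock(toy_heap *h, int handle)
{
    assert(handle >= 0 && handle < MAX_HANDLES && h->size[handle] != 0);
    return h->arena + h->offset[handle];
}

/* Slide live blocks towards the start of the arena to close the holes.
   Blocks are visited in ascending address order so nothing is overwritten. */
static void toy_compact(toy_heap *h)
{
    int done[MAX_HANDLES] = {0};
    size_t dest = 0;
    for (;;) {
        int next = -1;
        for (int i = 0; i < MAX_HANDLES; i++) {
            if (h->size[i] != 0 && !done[i] &&
                (next < 0 || h->offset[i] < h->offset[next]))
                next = i;
        }
        if (next < 0)
            break;
        memmove(h->arena + dest, h->arena + h->offset[next], h->size[next]);
        h->offset[next] = dest;   /* the handle stays valid, the address changes */
        dest += h->size[next];
        done[next] = 1;
    }
    h->used = dest;
}
```

In this model a pointer obtained from toy_lock() before toy_compact() may be stale afterwards, while the handle remains good. With virtual memory and a flat address space none of this bookkeeping buys anything, which is the argument made above for using a fixed pointer.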

Donkey

hutch--

Edgar,

> Why would you need moveable memory for the members ?

Simple: after thousands to millions of allocations and deallocations you get severe memory fragmentation. Use a method that the OS can move around in memory and the memory manager solves that problem; it compacts the memory splattered all over the place and keeps more memory available in usable sized blocks. The price is speed: fixed memory is faster but makes a much bigger mess when large numbers of allocations and deallocations occur.

zooba

Quote from: hutch-- on April 27, 2008, 08:24:47 AM
Use a method that the OS can move around in memory and the memory manager solves that problem, it compacts the memory splattered all over the place and keeps more memory available in usable sized blocks.

That's assuming that Windows actually does that anymore. The LFH was slower when I last tested it (XP SP2) but it is supposed to have an advantage for lots of allocations and deallocations of varying (smallish) sizes. The theory behind how it works seems sound, as does movable memory. Both have performance hits over fixed memory and I am quite skeptical that GlobalAlloc() still behaves in the way it used to.

Cheers,

Zooba :U

hutch--

I should have made the distinction: GlobalAlloc() with the GMEM_FIXED flag versus SysAllocStringByteLen(), which sit at opposite ends in their characteristics. Unless you were writing legacy code with DDE or the clipboard you would not bother to use GlobalAlloc() with the GMEM_MOVEABLE flag.

donkey

Quote from: hutch-- on April 27, 2008, 08:24:47 AM
Use a method that the OS can move around in memory and the memory manager solves that problem, it compacts the memory splattered all over the place and keeps more memory available in usable sized blocks. The price is speed, fixed memory is faster but makes a much bigger mess when large numbers of allocations and deallocations occur.

Windows has not done memory compaction since Windows 3.51; that ended with Windows 95/NT4.0. Read Raymond's blog that I linked...

Quote from: Raymond Chen
Memory blocks still had a lock count, even though it didn't really accomplish anything since Win32 never compacted memory

jj2007

Question: Does SysAllocStringByteLen do some kind of memory compaction or garbage collection? I.e. is it a good idea to use SysAllocStringByteLen for many small reallocations?

donkey

Quote from: jj2007 on April 27, 2008, 06:10:48 PM
Question: Does SysAllocStringByteLen do some kind of memory compaction or garbage collection? I.e. is it a good idea to use SysAllocStringByteLen for many small reallocations?

No, Win32 does not do any memory compaction at all. This simply allocates a BSTR (OLE) string of a given length. The memory allocated is not really moveable in the Windows 1.x sense, since that has no meaning in a flat memory model. Beyond that, it allocates one DWORD before the pointer to hold the string length. These functions have nothing at all to do with garbage collection or compaction; a disassembly of the functions and the documentation at MSDN confirm this. If you need compaction and garbage collection, use a database and ODBC.
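
To make the layout concrete, here is a minimal sketch in portable C of a string with its length stored in the 4 bytes below the pointer the caller sees. It only mimics the layout described above and is not the oleaut32 code; the names lp_alloc, lp_len and lp_free are hypothetical, and plain malloc() stands in for whatever allocator the real functions use.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helpers that mimic the BSTR layout:
   [4-byte length][len data bytes][terminating zero]
                  ^ pointer handed back to the caller */

/* Allocate a length-prefixed copy of len bytes from src (src may be NULL,
   in which case the data bytes are zero-filled). Returns NULL on failure. */
static char *lp_alloc(const char *src, uint32_t len)
{
    unsigned char *raw = malloc(sizeof(uint32_t) + (size_t)len + 1);
    if (raw == NULL)
        return NULL;
    memcpy(raw, &len, sizeof len);            /* length goes in front */
    char *data = (char *)(raw + sizeof(uint32_t));
    if (src != NULL)
        memcpy(data, src, len);
    else
        memset(data, 0, len);
    data[len] = '\0';                         /* convenience terminator */
    return data;
}

/* Read the length back from the 4 bytes below the string pointer. */
static uint32_t lp_len(const char *s)
{
    uint32_t len;
    assert(s != NULL);
    memcpy(&len, s - sizeof(uint32_t), sizeof len);
    return len;
}

/* Free the whole block, starting at the hidden length prefix. */
static void lp_free(char *s)
{
    if (s != NULL)
        free(s - sizeof(uint32_t));
}
```

Nothing here moves or compacts memory. The only extra compared with a plain buffer is that the length can be read back without scanning for the terminator.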

Alternatively you can use the DSA_xxx dynamic structure array functions, they provide a callback for garbage collection and are quite efficient when handling large arrays of small allocations.

donkey

Just to reiterate...

Win32 uses a flat memory model; the concepts of global and local memory in a flat model are meaningless, as is the concept of moveable memory. Due to legacy functions Windows simulates the outdated 640K model, at the cost of speed due to the overhead involved in the simulation. The GlobalAlloc function is by far the worst allocation scheme available to Windows; even LocalAlloc is more efficient at allocating memory (though not by much). You should always use the heap functions for small allocations and the Virtual memory functions for large ones, unless you are stuck with a legacy function that requires the simulated moveable memory.

You should avoid using COM functions to allocate memory that you will only be using internally. COM performs marshalling on its buffers to allow for interoperability between protected mode processes. This is very slow and creates a thread you have no control over, further slowing your application. Loading COM for things like BSTRs will also add 100-200K to your footprint, as well as the setup time for a COM application. In other words, avoid using COM functions unless you actually will be using COM.

jj2007

Quote from: donkey on April 27, 2008, 06:49:44 PM
Just to re-iterate... You should avoid using COM functions

Edgar, I don't quite understand the context of your warning - does SysAllocStringByteLen have anything to do with COM??
I am inspired by Hutch's dynamic array project but a bit confused. SysAllocStringByteLen had never crossed my way before, but I see it's pretty old, see here an article from 1997. The Sys* functions seem to be handy and fast, so is there any reason not to use them for string management in a major application? Is the lack of mem compaction an issue? Another question, maybe a stupid one: Is there a need to de-allocate all strings with SysFreeString on ExitProcess, or does Windows handle that? Sorry for so many newbie questions...

EDIT: A sample app for SysAllocStringByteLen, assembling at 5632 bytes (it uses the str$ and chr$ macros as well as lstrcat). SysFreeString returns 48, no error. So my question again: Is there any argument NOT to use this fantastic API?

.nolist
include \masm32\include\masm32rt.inc

ShowWinErr PROTO :DWORD

crlf EQU 13, 10

.data?
My$        dd ?                 ; receives the BSTR pointer
ShowErrBuf db 200 dup(?)        ; 200 chars for the Windows error text

.data
MyText  db "This is a little test string", 0
ShowErr db "Windows error:", crlf, 0

.code

start:
    ; allocate a BSTR and copy MyText into it (sizeof includes the zero terminator)
    invoke SysAllocStringByteLen, addr MyText, sizeof MyText
    mov My$, eax
    invoke MessageBox, 0, str$(eax), chr$("String pointer; show the string now?"), MB_YESNO
    .if eax==IDYES
        invoke MessageBox, 0, My$, chr$("The String:"), MB_OK
    .endif
    ; SysFreeString has no return value, so the number shown next is
    ; simply whatever the call left in eax
    invoke SysFreeString, My$
    invoke MessageBox, 0, str$(eax), chr$("SysFreeString:"), MB_OK
    invoke ShowWinErr, 1
    invoke ExitProcess, 0

; ShowFormatted = 0: show the raw error number
; ShowFormatted = 1: show the system message text
; any other value:   treated as a pointer to text appended to the message
ShowWinErr proc ShowFormatted:DWORD
    pushad
    invoke GetLastError
    .if ShowFormatted
        invoke FormatMessage, FORMAT_MESSAGE_FROM_SYSTEM,
            0,                  ; GetItFromSystem
            eax,                ; ErrNum
            0,                  ; Default language
            addr ShowErrBuf,    ; where to send the string from system
            200, 0              ; size 200, no arguments
        .if ShowFormatted!=1
            invoke lstrcat, addr ShowErrBuf, ShowFormatted
        .endif
        invoke MessageBox, 0, addr ShowErrBuf, addr ShowErr, MB_OK
    .else
        invoke MessageBox, 0, str$(eax), addr ShowErr, MB_OK
    .endif
    popad
    ret
ShowWinErr endp

end start

donkey

Hi jj2007,

All of the Sys string functions are used to allocate strings for use with COM; that's why the functions are in the oleaut32 DLL. But no, I don't believe that particular one uses IMalloc directly. As far as I know it only calls several other functions in OLE32 and OLEAUT32, but because of the nature of COM it is difficult to track what those functions are. I am always wary of any function that could call a COM interface and therefore load COM without you being aware. It is simpler just to create a buffer using HeapAlloc and go from there.

No, Windows does not handle deallocating the string; you must use SysFreeString to deallocate the memory. The main reason not to use it is that HeapAlloc is faster by quite a large margin, and in reality all you are doing with it is allocating memory and copying some bytes. I mean, why do you need system strings for that?

Donkey