
LocalAlloc memory begin

Started by bomz, June 09, 2011, 02:47:17 PM


hutch--

#15
From memory, all of the Windows memory allocation functions return addresses aligned to at least 4 bytes. GlobalAlloc() aligns to 8, and from memory there are options to go higher for SSE and similar applications. It's easy enough to do your own: allocate as much as you need PLUS the number of bytes to align by, then compute an aligned start address from the allocation address. Read and write at the aligned address, and deallocate using the original allocated address.

dedndave

Quote:
1) request (N-1) more bytes than you need
2) save the original address for the handle to Free the allocated block
3) use the N-aligned address for data
   D = (A+N-1) AND (-N)

A = allocated block address
D = aligned data address
N = alignment (must be a power of 2 for the AND mask to work)

here is an example using HeapAlloc

        INCLUDE \masm32\include\masm32rt.inc

DesiredSize  EQU 4096
DesiredAlign EQU 16

        .DATA?

hHeap     dd ?
hBlock    dd ?
BlockUsed dd ?

        .CODE

_main   PROC

        INVOKE  GetProcessHeap
        mov     hHeap,eax

;if zeroed memory is not required, replace HEAP_ZERO_MEMORY with NULL

        INVOKE  HeapAlloc,eax,HEAP_ZERO_MEMORY,DesiredSize + DesiredAlign - 1
        mov     hBlock,eax
        add     eax,DesiredAlign - 1
        and     eax,-DesiredAlign
        mov     BlockUsed,eax

;BlockUsed is the aligned address used for data

;
;

;when done, use the original address like a handle to free the allocated block

        INVOKE  HeapFree,hHeap,NULL,hBlock

        INVOKE  ExitProcess,0

_main   ENDP

        END     _main
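
For readers who would rather see the arithmetic outside of assembler, here is a minimal C sketch of the same three steps. It is not part of the original post: the names align_up, aligned_block, aligned_alloc_sketch and aligned_free_sketch are made up for illustration, and plain malloc/free stand in for HeapAlloc/HeapFree so the sketch is not tied to Windows.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* D = (A + N - 1) AND (-N); n must be a power of two.
   ~(n - 1) is the same bit pattern as -n in two's complement. */
static uintptr_t align_up(uintptr_t addr, uintptr_t n)
{
    assert(n != 0 && (n & (n - 1)) == 0);
    return (addr + n - 1) & ~(n - 1);
}

/* Hypothetical helper type: keeps both addresses together. */
typedef struct {
    void *raw;   /* A: address returned by the allocator, used to free */
    void *data;  /* D: aligned address, used for reads and writes      */
} aligned_block;

/* Step 1: request n-1 extra bytes. Steps 2 and 3: keep A, compute D.
   malloc is an assumption here, standing in for HeapAlloc. */
static aligned_block aligned_alloc_sketch(size_t size, size_t n)
{
    aligned_block b = { NULL, NULL };
    b.raw = malloc(size + n - 1);
    if (b.raw != NULL)
        b.data = (void *)align_up((uintptr_t)b.raw, n);
    return b;
}

/* Always free the original address, never the aligned one. */
static void aligned_free_sketch(aligned_block *b)
{
    free(b->raw);
    b->raw = NULL;
    b->data = NULL;
}
```

With N = 16, an address that is already aligned (0x1000) is left unchanged, while 0x1001 through 0x100F all round up to 0x1010, so at most N-1 bytes at the front of the block go unused.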

sinsi

Quote from: Microsoft
The Windows heap managers (all versions) have always guaranteed that the heap allocations have a start address that is 8-byte aligned (on 64-bit platforms the alignment is 16-bytes).
http://support.microsoft.com/kb/286470

So use HeapAlloc, since Local/Global functions are only wrappers for heap functions.
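
If you want to see what alignment a given allocator actually hands back, a rough check is to look at the low bits of the returned address. The C sketch below is only an illustration, not something from the KB article: observed_alignment, heap_block and heap_release are hypothetical names, and on non-Windows builds malloc is used as a stand-in, so only the _WIN32 branch says anything about HeapAlloc itself.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Largest power of two (capped at 4096) that divides the address,
   i.e. the alignment this particular pointer happens to have. */
static unsigned observed_alignment(const void *p)
{
    uintptr_t a = (uintptr_t)p;
    unsigned n = 1;
    assert(p != NULL);
    while (n < 4096 && (a & n) == 0)
        n <<= 1;
    return n;
}

/* Allocate from the process heap on Windows; malloc elsewhere (assumption). */
static void *heap_block(size_t size)
{
#ifdef _WIN32
    return HeapAlloc(GetProcessHeap(), 0, size);
#else
    return malloc(size);
#endif
}

static void heap_release(void *p)
{
#ifdef _WIN32
    HeapFree(GetProcessHeap(), 0, p);
#else
    free(p);
#endif
}
```

A single pointer can be more aligned than the guarantee by luck, so a result of 16 or 32 from one call does not prove anything; only the minimum over many allocations is meaningful, and even then the documented guarantee is what you should rely on.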

dedndave

nice find sinsi
they should add that to the allocation functions dox   :P

jj2007

Here is a simple example of how to get a 16-byte aligned LOCAL buffer for use with SSE2 instructions. TheBuffer must be the first LOCAL.


include \masm32\include\masm32rt.inc
.686
.xmm

.code
Pad16 proc arg1, arg2
LOCAL TheBuffer:OWORD   ; the SSE2 buffer
LOCAL locdw:DWORD   ; any number of other locals
LOCAL padding:OWORD   ; do not use this one
   push ebp
   and ebp, -16
   ; ... whatever code you need here
   m2m eax, "xxxx"
   movd xmm2, eax
   pshufd xmm2, xmm2, 0   ; copy the low dword to all 4 lanes
   movaps TheBuffer, xmm2   ; will choke if not
   movaps xmm3, TheBuffer   ; aligned to 16 bytes
   ; ... code finished
   pop ebp
   ret
Pad16 endp
start:
            REPEAT 9
               invoke Pad16, chr$("123456789"), 123
               print hex$(esp), 13, 10
               push eax
            ENDM
            add esp, 9*4
            inkey "OK"
            exit
end start
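
The "and ebp, -16" step rounds the frame pointer down, not up, which is why the extra padding LOCAL is needed. The small C sketch below only models that arithmetic; it is not part of the original post, align_down and align_down_shift are hypothetical names, and the address 0x12FF7C is an invented example value, not a real ebp.

```c
#include <assert.h>
#include <stdint.h>

/* Same operation as "and ebp, -16": clear the low bits.
   n must be a power of two. */
static uintptr_t align_down(uintptr_t addr, uintptr_t n)
{
    assert(n != 0 && (n & (n - 1)) == 0);
    return addr & ~(n - 1);
}

/* How many bytes the address moved toward lower memory. */
static uintptr_t align_down_shift(uintptr_t addr, uintptr_t n)
{
    return addr - align_down(addr, n);
}
```

For example, align_down(0x12FF7C, 16) gives 0x12FF70, a shift of 12 bytes. Assuming the stack is already dword aligned on entry, the shift can only be 0, 4, 8 or 12 bytes, so the 16 bytes reserved by the padding LOCAL should be enough to keep TheBuffer and the other locals inside the space the prologue set aside.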