
huge table access

Started by porphyry5, December 08, 2009, 04:20:43 PM


jj2007

Quote from: porphyry5 on December 10, 2009, 01:12:15 AM
A question for Jochen:  Thank you, that did the trick.  But I take it you meant GetCurrentProcessId and OpenProcess, GetCurrentProcess produces an error in OpenProcess?

I knew this would put you on the right track, and you would eventually find the GetCurrentProcessId call. I had also hoped you would volunteer to find out what GetCurrentProcess is good for. It is the most optimised API call that Microsoft has ever produced
:wink

Mirno

A long time ago I had to write a searchable dictionary for a uni project (scrabble game).

Even then, it was easy to fit the dictionary in a tree - each node has 26 branches, and a boolean to denote a complete word. As it was scrabble, the words in the list were at most 15 characters long, and the list took 40 meg or so. Your memory footprint would go up (I think it was 30 or so meg for me), and you may have to modify it if you want to include punctuation, but it was blindingly fast.
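For anyone who wants to see the shape of such a tree, here is a minimal sketch in C. It is not Mirno's code: the names TrieNode, trie_new, trie_insert, trie_contains and trie_free are made up for illustration, and it assumes lowercase a-z words only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* One node: 26 branches (a-z) plus a flag that marks a complete word. */
typedef struct TrieNode {
    struct TrieNode *child[26];
    bool is_word;
} TrieNode;

/* Allocate an empty node; calloc zeroes the branches and the flag. */
TrieNode *trie_new(void) {
    return calloc(1, sizeof(TrieNode));
}

/* Insert a lowercase a-z word. Returns false on any other character
   or when memory runs out. */
bool trie_insert(TrieNode *root, const char *word) {
    assert(root != NULL);
    TrieNode *n = root;
    for (; *word; ++word) {
        if (*word < 'a' || *word > 'z') return false;
        int i = *word - 'a';
        if (n->child[i] == NULL) {
            n->child[i] = trie_new();
            if (n->child[i] == NULL) return false;
        }
        n = n->child[i];
    }
    n->is_word = true;
    return true;
}

/* Follow one branch per letter; the word exists only if the walk ends
   on a node whose flag is set. */
bool trie_contains(const TrieNode *root, const char *word) {
    const TrieNode *n = root;
    for (; *word && n != NULL; ++word) {
        if (*word < 'a' || *word > 'z') return false;
        n = n->child[*word - 'a'];
    }
    return n != NULL && n->is_word;
}

/* Release a node and everything below it. */
void trie_free(TrieNode *n) {
    if (n == NULL) return;
    for (int i = 0; i < 26; ++i) trie_free(n->child[i]);
    free(n);
}
```

A lookup costs one pointer hop per letter, which is why this kind of structure is fast. The price is memory, since every node carries 26 pointers whether or not they are used.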

The original project was in C, but I rewrote it in asm - it was fairly simple. Unfortunately I am working abroad, otherwise I could have possibly found the code.

Mirno

z941998

Mirno, I would be interested in seeing your coding / app.

For that matter I would also like to see what Larry H. has also.

I have attached a file that has my feeble first attempt at multiway trees.  Included is a tool for calculating level and node sizes for trees, just disregard the information relating to powerball.

porphyry5

whoa, this is getting off the point.  asm itself I am finding to be quite straightforward, and very explicit, a welcome change.  My problem is with winapi functions, and their unspoken limitations; and their documentation has to be the most turgid prose ever written.  The code that builds the table and index is already written and tested, and works fine; but then the system overwrites my table, hence my first appeal to this forum.

The binary search techniques several of you refer to are already in hand, but thank you.

I did begin with a single huge file when I wrote the script version, but that created problems when separating concatenated words; eg the formation "abranch" would separate into "ab ranch" instead of "a branch" because "ab ranch" was the first valid combination found. A largely, but not entirely, successful solution was to use multiple lists, beginning with only the most common words, and progressing out to ever larger lists when a match was not found.  Sometimes, of course, you need the less common combination, but I don't think the solution to that problem is programmable.
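A rough sketch of that tiered lookup, in C, may make the idea concrete. Everything here is an assumption for illustration: the tiny word lists, the names in_tiers, split_at_tier and split_two_words, and the choice to try the longest first word before shorter ones (which is what makes a single big list find "ab ranch" first).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical tiers: tier 0 holds only the most common words,
   each later tier adds rarer ones. */
static const char *const tier0[] = { "a", "branch", "ranch", NULL };
static const char *const tier1[] = { "ab", NULL };
static const char *const *const tiers[] = { tier0, tier1 };
enum { TIER_COUNT = 2 };

/* True if the first len characters of s are a word in tiers 0..max_tier. */
static bool in_tiers(const char *s, size_t len, int max_tier) {
    for (int t = 0; t <= max_tier; ++t)
        for (const char *const *w = tiers[t]; *w != NULL; ++w)
            if (strlen(*w) == len && memcmp(*w, s, len) == 0)
                return true;
    return false;
}

/* Try every two-word split using tiers 0..max_tier, longest first word
   first. Returns the length of the first word, or 0 if nothing fits. */
size_t split_at_tier(const char *text, int max_tier) {
    size_t n = strlen(text);
    if (n < 2) return 0;
    for (size_t cut = n - 1; cut >= 1; --cut)
        if (in_tiers(text, cut, max_tier) &&
            in_tiers(text + cut, n - cut, max_tier))
            return cut;
    return 0;
}

/* Start with the common words only and widen the vocabulary one tier
   at a time, stopping at the first tier that yields a split. */
size_t split_two_words(const char *text) {
    for (int max_tier = 0; max_tier < TIER_COUNT; ++max_tier) {
        size_t cut = split_at_tier(text, max_tier);
        if (cut != 0) return cut;
    }
    return 0;
}
```

With these lists, searching everything at once splits "abranch" after two letters ("ab ranch"), while the tiered search splits it after one ("a branch") because "ab" is not among the common words. It still cannot know when the rarer reading is the intended one.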

A question for Hutch: don't the Global functions simply call the Heap functions to do the job for them, which in turn call the Virtual functions when the size requested exceeds about 0.5 meg?  Winapi32.hlp says so, but is that yet another of its generalizations that don't cover every situation?  I have progressed now as far as VirtualLock, which fails with the system message "Insufficient quota to complete the requested service" which I take it means it does not have enough available to lock the amount of space requested.

hutch--

Graham,

Most of the memory allocation API calls call the same function in NTDLL.DLL with minor variations. HeapAlloc() is basically designed for C programming but there are others. If it's fixed memory that you need, GlobalAlloc() set with the GMEM_FIXED flag performs well and can easily handle over 1 gigabyte if you have the memory available. VirtualAlloc() tends to be slow, and while it can handle other situations like fragmented memory and other problems the OS may have, for large-block, high-performance code GlobalAlloc() is usually a lot faster.

Have a play with them, there are various strategies available depending on the performance you need.
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

Ghandi


Quote
The code that builds the table and index is already written and tested, and works fine; but then the system overwrites my table, hence my first appeal to this forum.


Are you certain it is the system overwriting your table and if so, where and why? Can you give an example of the code where it is occurring please?

HR,
Ghandi

redskull

From MSDN:

"The global and local functions are supported for porting from 16-bit code, or for maintaining source code compatibility with 16-bit Windows. Starting with 32-bit Windows, the global and local functions are implemented as wrapper functions that call the corresponding heap functions using a handle to the process's default heap. Therefore, the global and local functions have greater overhead than other memory management functions."

Since all MMU functions essentially do the same thing, the speed probably depends on whether you commit or just reserve the memory, but that only trades speed now for slowness later (the pages will be committed as you use them, instead of all up front).  But as always, whatever you measure as fastest is just that.  Also, re the page locking: you probably need to increase the size of your working set (SetProcessWorkingSetSize) before trying to lock the pages (you can't lock 10 pages in memory if you're only allowed to have 8 in there at one time).  But I renew my objection: Windows does an alright job of maintaining the working set sizes for you, and there's no guarantee that locking a page into memory keeps it there; Windows can still override your lock if you're in a wait state, which means every time your thread starts to run again it must reload all the pages, even if they are unused.


-ac
Strange women, lying in ponds, distributing swords, is no basis for a system of government

hutch--

 :bg

> A couple of things, use GlobalAlloc() with the GMEM_FIXED flag, someone will have a bleed and regurgitate that it's not politically correct but it will routinely allocate over a gigabyte without batting an eyelid.


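invoke GlobalAlloc,GMEM_FIXED,1024*1024*1024  ; ask for 1 gigabyte of fixed memory
test eax, eax  ; GlobalAlloc returns NULL on failure
jz alloc_failed  ; alloc_failed is a hypothetical error label
mov gMem, eax  ; direct pointer to allocated memory.
........
invoke GlobalFree,gMem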


It's allocated from NTDLL.DLL, so why lose sleep over it?

jj2007

Quote from: redskull on December 11, 2009, 12:36:13 PM
Starting with 32-bit Windows, the global and local functions are implemented as wrapper functions that call the corresponding heap functions using a handle to the process's default heap. Therefore, the global and local functions have greater overhead than other memory management functions.

Have a walk with Olly, it's hilarious. Must be organically grown code (TM).

invoke GlobalAlloc, GMEM_FIXED, 8000000

7C80FE69       FF15 0C10807C      call near [<&ntdll.RtlAllocate>; ntdll.RtlAllocateHeap
...
7C96D636       68 C0D8967C        push 7C96D8C0                  ; ASCII "RtlAllocateHeap"
...
7C94A2B1       E8 2832FCFF        call ZwAllocateVirtualMemory
...
7C94A2EB       E8 3B61FCFF        call RtlGetNtGlobalFlags
...

porphyry5

Sorry, but most of you are going completely over my head, I don't understand your terminology.

I gather that there are 3 competing possibilities, Heap Global and Virtual.  Virtual gives a problem so I will try Heap and Global instead.

jj2007

Quote from: porphyry5 on December 11, 2009, 04:18:44 PM
Sorry, but most of you are going completely over my head, I don't understand your terminology.
Apologies - we have a tendency to hijack threads for rants and philosophical reflections on the virtues of VirtualAlloc... ::)

Quote
I gather that there are 3 competing possibilities, Heap Global and Virtual.  Virtual gives a problem so I will try Heap and Global instead.

Just try what Hutch suggests. It will work fine.

invoke GlobalAlloc,GMEM_FIXED,1024*1024*1024
mov gMem, eax  ; direct pointer to allocated memory.
........
invoke GlobalFree,gMem

dedndave

Quote
we have a tendency to hijack threads for rants and philosophical reflections on the virtues of VirtualAlloc
this place would be boring if we didn't have "discussions" (aka "rants") - lol

Larry Hammick

If it's just a question of getting a big table into memory, well, LocalAlloc is fine unless maybe the loader has not given the program a big enough heap. But this works:


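filemax = 4000000h  ;64 megs

.data
hHeap dd ?                  ;handle of the private heap
bigbuffer dd ?              ;pointer to the block allocated from that heap
amtread dd ?
hFile dd ?
filespec db "bigfile",0     ;for illustration
...

.code
invoke HeapCreate,0,0,0             ;maximum size 0 = growable; a fixed-size heap cannot hand out one block this large
test eax,eax
jz heapproblem
mov hHeap,eax
invoke HeapAlloc,eax,0,filemax      ;HeapCreate returns only a heap handle, the buffer itself comes from HeapAlloc
test eax,eax
jz heapproblem
mov bigbuffer,eax
invoke CreateFileA,addr filespec,GENERIC_READ,0,0,OPEN_EXISTING,FILE_ATTRIBUTE_NORMAL,0   ;OPEN_EXISTING: fail instead of creating an empty file
inc eax                             ;INVALID_HANDLE_VALUE (-1) becomes 0
jz fileproblem
dec eax                             ;restore the handle
jz fileproblem
mov hFile,eax
invoke ReadFile,eax,bigbuffer,filemax,addr amtread,0
test eax,eax
jz short readproblem
invoke CloseHandle,hFile
...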


z941998, I'll put some code together to illustrate how to make and use indexes, and put it in the Laboratory in a day or two.

porphyry5

Sure can't argue with success, Global works and Heap works, Virtual does not.  It's interesting how some of you rooted for Global, some for Heap, but absolutely no one rooted for Virtual.  Should tell me something, eh!

Let's see, already I have: avoid Virtual, what's the point of GetCurrentProcess, and FormatMessage spurns eax.  Winapi is shaping up just like Notetab's script, riddled with undocumented quirks.

Thank you all again for your help, I really appreciate it.

dedndave

http://msdn.microsoft.com/en-us/library/aa366574(VS.85).aspx
Quote
Note: The global functions have greater overhead and provide fewer features than other memory management functions. New applications should use the heap functions unless documentation states that a global function should be used. For more information, see Global and Local Functions.

now, you know why i like the heap functions
although, VirtualAlloc should work - i prefer not to use it unless, perhaps, i want memory initialized to 0 (with HeapAlloc that is optional)

http://msdn.microsoft.com/en-us/library/aa366887(VS.85).aspx

as for GetCurrentProcess - it always returns -1
but - it would be bad coding practice to "hard-code" -1 into everything
what if it changes for some reason - or - what if i want to use the ensuing code in a thread
best to call the function - it is fast