Mem-Map File up to 18 exabytes (MASM)

Started by ecube, May 02, 2010, 06:44:34 AM


ecube

This is a translation of http://www.masm32.com/board/index.php?topic=3486.0

Evenbit did a great job with this. This code is the proper and arguably the best way (I'm going to speed test it in a moment) of opening
and interacting with very, very large files. It uses the system's AllocationGranularity instead of just assuming a 64 KB boundary.

The small changes in this translation: I have it call a callback function that is passed the memory pointer and the length, and I took out some of the error checking on the API functions, but you can add that back trivially.


include masm32rt.inc

MyCallBack proto :DWORD,:DWORD
MapLargeFile proto :DWORD,:DWORD

.data
thefile db "yourfile.txt",0

.code
start:
invoke MapLargeFile,addr thefile,addr MyCallBack
invoke ExitProcess,0

MapLargeFile proc iFile:DWORD,iFunc:DWORD
LOCAL bAlign:SYSTEM_INFO
LOCAL hFile:DWORD
LOCAL pbFile:DWORD
LOCAL iMapFile:DWORD
LOCAL qwFileSize:LARGE_INTEGER
LOCAL qwFileOffset:LARGE_INTEGER
LOCAL dwBytesInBlock:DWORD
invoke GetSystemInfo,addr bAlign

mov qwFileSize.LowPart,0
mov qwFileSize.HighPart,0

mov qwFileOffset.LowPart,0
mov qwFileOffset.HighPart,0


invoke CreateFile,iFile,GENERIC_READ or GENERIC_WRITE,FILE_SHARE_READ or FILE_SHARE_WRITE,NULL,OPEN_EXISTING,FILE_ATTRIBUTE_ARCHIVE,0
cmp eax,INVALID_HANDLE_VALUE
je @Error
mov hFile,eax

invoke CreateFileMapping,hFile,NULL,PAGE_READONLY,0,0,NULL
mov iMapFile,eax


invoke GetFileSize,hFile, addr qwFileSize.HighPart
mov qwFileSize.LowPart,eax

.while TRUE
    ;if the remaining size is larger than the AllocationGranularity (usually 64 KB),
    ;map one granularity-sized block; otherwise map just the remaining bytes
    mov eax,qwFileSize.HighPart
    .if eax == 0
        mov eax,qwFileSize.LowPart
        mov edx,bAlign.dwAllocationGranularity
        .if eax < edx
            mov dwBytesInBlock,eax
        .else
            mov dwBytesInBlock,edx
        .endif
    .else
        mov edx,bAlign.dwAllocationGranularity
        mov dwBytesInBlock,edx
    .endif

    invoke MapViewOfFile,iMapFile,FILE_MAP_READ,qwFileOffset.HighPart,qwFileOffset.LowPart,dwBytesInBlock
    mov pbFile,eax

    ;stdcall: arguments are pushed right to left, so the callback receives (pbFile, dwBytesInBlock)
    push dwBytesInBlock
    push pbFile
    call iFunc

    ;unmap the current view; the next pass maps the next AllocationGranularity (usually 64 KB)
    invoke UnmapViewOfFile,pbFile

    ;increment the offset quadword by the number of bytes just mapped
    mov ecx,qwFileOffset.LowPart
    mov edx,qwFileOffset.HighPart
    mov eax,dwBytesInBlock
    add ecx,eax
    adc edx,0
    mov qwFileOffset.LowPart,ecx
    mov qwFileOffset.HighPart,edx

    ;decrement the size quadword by the same amount
    mov ecx,qwFileSize.LowPart
    mov edx,qwFileSize.HighPart
    mov eax,dwBytesInBlock
    sub ecx,eax
    sbb edx,0
    mov qwFileSize.LowPart,ecx
    mov qwFileSize.HighPart,edx

    ;done once the remaining size reaches zero
    .if edx == 0
        .if ecx == 0
            .break
        .endif
    .endif
.endw
invoke CloseHandle,iMapFile
invoke CloseHandle,hFile

@Error:
ret
MapLargeFile endp

MyCallBack proc iMem:DWORD,iLen:DWORD
;do whatever with mem
ret
MyCallBack endp
end start

Farabi

Those who had universe knowledges can control the world by a micro processor.
http://www.wix.com/farabio/firstpage

"Etos siperi elegi"

ecube

Yes, because it can handle 64-bit sizes, and 18 exabytes is the max there. I just tested it on a 7 GB file on my 32-bit XP and it works great :)


It appears to only take a little over 1 second to process a 7 GB file :eek that's incredible if it's accurate. It did jump the CPU usage up a bit (30%) here.

clive

Unless you actually touch the memory pages, the OS/FSD will not be pinning any file data to memory.

At a minimum, the function you call to look at the memory should touch 1 byte (or DWORD) in every 4K page, to actually get the data mapped into memory.
It could be a random act of randomness. Those happen a lot as well.
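
For reference, a minimal sketch of what that touching could look like in the MyCallBack procedure from the translation above (the 4096 step is an assumption of a 4K page size; the general value is dwPageSize from GetSystemInfo):

MyCallBack proc uses esi iMem:DWORD,iLen:DWORD
    mov esi,iMem
    xor ecx,ecx
    .while ecx < iLen
        mov eax,[esi+ecx]   ;read one DWORD per page so the page is actually faulted in
        add ecx,4096        ;assumed 4K page size
    .endw
    ret
MyCallBack endp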

KeepingRealBusy

E^cube,

Shouldn't you do some error checking on:
Quote
invoke GetFileSize,hFile, addr qwFileSize.HighPart
mov qwFileSize.LowPart,eax

eax could be 0FFFFFFFFh either because that was the low DWORD of the size, or because an error is being returned. You need to call GetLastError and check for NO_ERROR to ensure that a return of 0FFFFFFFFh was good.
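
In MapLargeFile above, that check could look something like this (a sketch only, reusing the @Error label from the translation; real code would also close hFile and iMapFile on failure):

    invoke GetFileSize,hFile,addr qwFileSize.HighPart
    mov qwFileSize.LowPart,eax
    .if eax == 0FFFFFFFFh           ;could be a valid low DWORD or an error
        invoke GetLastError
        .if eax != NO_ERROR
            jmp @Error              ;GetFileSize really failed
        .endif
    .endif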

I would be interested in timing if you touch all pages. That is what I was trying to do as reported in my prior post "Timing Large File Reads" below.

Dave

ecube

KeepingRealBusy, yeah, as I said I removed most of the error checking, out of laziness and to test quickly, so it should be added back.

Also, clive, you made a good point about accessing the memory. I did that on a 358 MB file and it took 30 seconds to complete on my system the first time... the second time it took 47 milliseconds. Can someone explain this to me? I know it appears to be caching it, but for how long does it do that? The process terminated prior to the second test.

KeepingRealBusy

E^cube

If you read my prior post, you can see the times I got doing something similar. I had some questions about some of the timings I got, and  I suspect paging.

Is there any way to force the OS not to cache a read? Using FILE_ATTRIBUTE_TEMPORARY is only a "suggestion" (MSDN documentation) for the system not to cache. It appears that it may only honor the flag for writes, to prevent writing the unnecessary data back out to the file.

Another question. If I open the file as GENERIC_READ only (not GENERIC_READ+GENERIC_WRITE) would this make a difference in my timing, i.e. might this keep the system from caching the file?

Dave.

Farabi

So if I had a 64-bit machine and only had 4 GB of RAM, and I allocated memory using VirtualAlloc, it would look like I had 1 EB of memory, wouldn't it?

KeepingRealBusy

Farabi,

I do not know much about 64-bit operations, I am still running Windows XP. I think that you can have more virtual memory than 4 GB, and I also think you can have more physical memory than 4 GB, but physical memory is still the limitation.

If you allocate huge blocks of virtual memory and start filling them, you will be looking through a window at all of your memory. When you try to touch some location, if that page is paged out, the system will have to page something else out (probably using an LRU list) and then page your data in, and that something else may be one of your pages. When the system returns to you, the touched page is available, but something else is unavailable to someone, maybe you - at least until you touch that other page as well, and the cycle continues.

That is why an indirect sort of a huge set of really randomly ordered data is bad. A direct Quicksort moves the data into smaller and smaller groups of pages as it goes on; eventually the data being looked at will all be on one page, or at least on a small number of pages, so that page thrashing will diminish.

Again, I may be totally wrong about 64 bit operations.

Dave.

ecube

http://blogs.msdn.com/khen1234/archive/2006/01/30/519483.aspx says

"Something to be careful of, though, is depleting virtual memory.  When mapping large files into virtual memory to perform I/O on them, be cognizant of the fact that every address you burn in virtual memory is another than can't be used by your app.  It's usually more efficient to use regular file I/O routines to perform read/write operations on large files."

Why's there so much mixed info everywhere? ffs

ecube

OK, here are some things I've learned:

"Memory mapped files are often faster for random access, but never for sequential access"

"Setting FILE_FLAG_NO_BUFFERING & FILE_FLAG_OVERLAPPED when creating the
mapping file give maximum performance for MapViewOfFile. "

I tested this and it removed the caching issue for the mapped files, and gave a consistent 30 seconds for the 300 MB-ish file instead of the 10 milliseconds after the initial run like before.

invoke CreateFile,iFile,GENERIC_READ,FILE_SHARE_READ,NULL,OPEN_EXISTING,FILE_ATTRIBUTE_ARCHIVE or FILE_FLAG_NO_BUFFERING or FILE_FLAG_OVERLAPPED,0


So it appears memory-mapped files aren't the fastest after all in certain situations, like reading a large file top to bottom in chunks; however, for random accesses throughout, they're better. Here are some mixed comments though.

Although I read you could mimic mapped files via ReadFile by doing:

"To get the desired 'all in RAM' effect, open
the file with FILE_FLAG_RANDOM_ACCESS and
read it from start to end. If there was enough
free memory then it will completely be hold in
RAM.'

"Windows actually uses file mapping for ReadFile/WriteFile due to how Cache Manager is designed. So they are not much different and you should use whichever fits your purpose best."

"A big advantage of file mapping is that it doesn't influence system cache. If your application does excessive I/O by means of ReadFile, your system cache will grow, consuming more and more physical memory."

clive

If you're going to be touching every byte/page then the data has to come off the disc. It might come directly from the disc, or from a cached copy, but the fact that it is read is unavoidable. With mapping, it is only going to pin the 4K pages you actually touch. Therefore if your access is random/sparse (i.e. navigating internal structures) it can be very useful.
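
For reference, a sketch of that kind of sparse access against the file mapping from the translation above (ReadDwordAt is an assumed name; it presumes the granularity-sized window it maps lies entirely within the file, and error checks are omitted as elsewhere in the thread):

ReadDwordAt proc iMapFile:DWORD,offsLow:DWORD,offsHigh:DWORD
LOCAL sysinfo:SYSTEM_INFO
LOCAL pView:DWORD
LOCAL dwDelta:DWORD
    invoke GetSystemInfo,addr sysinfo
    ;round the low half of the 64-bit offset down to the allocation granularity
    mov ecx,sysinfo.dwAllocationGranularity
    dec ecx                         ;granularity is a power of two, so this is a mask
    mov eax,offsLow
    mov edx,eax
    and edx,ecx                     ;edx = offset mod granularity
    mov dwDelta,edx
    sub eax,edx                     ;eax = low offset aligned down to the granularity
    invoke MapViewOfFile,iMapFile,FILE_MAP_READ,offsHigh,eax,sysinfo.dwAllocationGranularity
    mov pView,eax
    mov ecx,dwDelta
    add ecx,eax                     ;address of the requested byte inside the view
    mov eax,[ecx]                   ;the DWORD at the original 64-bit offset
    push eax
    invoke UnmapViewOfFile,pView
    pop eax
    ret
ReadDwordAt endp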

ReadFile/WriteFile will *not* use mapping/paging for small and unaligned reads (buffer and/or file offset); such accesses have to be double buffered.

If you are sorting a file with multiple passes, you would be well advised to split it into multiple chunks by sieving the data into some order during the initial pass, then sort those and finally combine them. For example, start a file for each letter of the alphabet that the word starts with.