Moving Pointers - MapViewOfFile

Started by paulfaz, November 16, 2006, 12:13:26 PM

paulfaz

Hi all,

I'm having issues, again...

I'm trying to map a file so I can open a large file, but if the file is really big, WriteConsole (or any other API, like SendMessage to an edit box) crashes when it references the memory-mapped file. When I run it through a debugger I usually get a "not enough memory" error.

So...

I read a doc on using a memory-mapped file, but it was in C++. I have some code, and I want to move through the mapped file 128K at a time and print it to the screen. I'm really stuck, mainly with moving through the mapped file, most likely because of my lack of understanding of address space etc.

I basically have a loop that re-maps the view of the file, and each time it should move the start on by x bytes. I'm having major issues calculating and moving the pointers, and at present it doesn't work at all.

Can you help? Or give me any pointers? (lol, excuse the pun)





.386
.model flat, stdcall
option casemap:none

include c:\masm32\include\windows.inc
include c:\masm32\include\kernel32.inc
include c:\masm32\include\user32.inc
include c:\asmdev\win32\msvcrt.inc
include c:\asmdev\win32\handleer.asm

include c:\masm32\include\comdlg32.inc
includelib c:\masm32\lib\kernel32.lib
includelib c:\masm32\lib\user32.lib
includelib c:\masm32\lib\msvcrt.lib

.const
BUFFER_SIZE equ 100
;MEMBLOCK equ 0x20000
.data
fn_BytesRead dd 0
buf_FileName db "c:\disp.txt",0
buf_FileContent db BUFFER_SIZE dup(0)
;MEMBLOCK DWORD 0x10000*2
MEMBLOCK DWORD 20000
filestart DWORD 0
fileend DWORD 0
pMF_Start DWORD 0
pMF_Mov DWORD 0

.data?
hnd_ConsoleOut HANDLE ?
charsWritten DWORD ?

hnd_file HANDLE ?
pmem DWORD ?
fsize DWORD ?
hnd_filemap HANDLE ?

.code

start:
invoke GetStdHandle, STD_OUTPUT_HANDLE
    mov hnd_ConsoleOut,eax

invoke CreateFile,ADDR buf_FileName,GENERIC_READ,0,NULL,OPEN_EXISTING,FILE_ATTRIBUTE_ARCHIVE,NULL
mov hnd_file, eax

invoke GetFileSize, hnd_file,NULL
mov fsize, eax

invoke CreateFileMapping, hnd_file, NULL, PAGE_READONLY, 0, 0, NULL
mov hnd_filemap, eax

invoke MapViewOfFile, hnd_filemap, FILE_MAP_READ,0,0,0
mov pMF_Start, EAX

invoke WriteConsole, hnd_ConsoleOut,pMF_Start,MEMBLOCK,ADDR charsWritten,0
invoke UnmapViewOfFile, pmem

MapFileLoop:
ADD pMF_Start, [MEMBLOCK]
                ;ADD pMF_Start, MEMBLOCK
invoke MapViewOfFile, hnd_filemap, FILE_MAP_READ,0,pMF_Start,MEMBLOCK
mov pMF_Start, EAX

invoke WriteConsole, hnd_ConsoleOut,pMF_Start,MEMBLOCK,ADDR charsWritten,0
invoke UnmapViewOfFile, pmem


MapFileLoopEnd:

invoke CloseHandle, hnd_file
invoke ExitProcess, 0

end start


paulfaz

Sorry, by the way, the bit I'm having an issue with is "ADD pMF_Start, [MEMBLOCK]"

I want to increment the pointer pMF_Start by 128K, so that I can read the memory-mapped file a bit at a time. I was originally using the ReadFile API, but it's terrible at opening big files, even in chunks... or at least it was for me.

zooba

Memory mapping files is really useful when you figure it out. There are so many annoying little things to deal with...

I've attached some of the source from ASM Runtime, which as of the latest version (0.300) has a set of file mapping functions. You can use these if you simply want it to work and not try and figure it out. Some of the code might be helpful, it may not.

If, however, you're keen to learn how it all works (which I strongly recommend :bg ), go for it. I think your problem at the moment is that MEMBLOCK has the value 20000 in decimal, while all the other commented definitions appear to be 20000h (or 0x20000). Since 4E20h (20000 decimal) isn't aligned to the right boundaries, it won't allocate properly. For maximum compatibility, use the GetSystemInfo function to get the page size and allocation granularity and read the documentation on MapViewOfFile to see what to do with them.
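To make the decimal/hex point concrete, here is a minimal sketch that tests both candidate values against the allocation granularity. The names MEMBLOCK_DEC and MEMBLOCK_HEX are made up for the illustration, the include paths assume a default MASM32 install, and the mask trick assumes the granularity is a power of two.

```masm
.386
.model flat, stdcall
option casemap:none

include \masm32\include\windows.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\kernel32.lib

MEMBLOCK_DEC equ 20000      ; hypothetical name: decimal 20000 = 4E20h
MEMBLOCK_HEX equ 20000h     ; hypothetical name: 131072 = 128K

.data
sysinf SYSTEM_INFO <>

.code
start:
    invoke GetSystemInfo, ADDR sysinf
    mov ecx, sysinf.dwAllocationGranularity
    dec ecx                 ; low-bit mask (assumes a power-of-two granularity)

    mov eax, MEMBLOCK_DEC
    and eax, ecx            ; non-zero with a 10000h granularity: not a usable file offset

    mov eax, MEMBLOCK_HEX
    and eax, ecx            ; zero with a 10000h granularity: a whole multiple, usable

    invoke ExitProcess, 0
end start
```

Stepping through this in a debugger shows the remainder in EAX after each AND. Only the file offset passed to MapViewOfFile has to be a multiple of the granularity; the number of bytes to map does not.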

Cheers,

Zooba :U

[attachment deleted by admin]

paulfaz

Thanks Zooba, that actually helped a great deal. I've almost got it doing what I want at the moment...

I can now get MapViewOfFile to map different sections of the file. I'm running it through OLLYDBG rather than printing it to the screen, and I have called MapViewOfFile 3 times with different offsets and it works. BUT my offset calculations are wrong, because the sections don't lead on from each other. E.g. if I have the text "this is my mother and you are my brother", and on the first run I read say 5 chars, giving "this ", then the next time I map the file I want to start at position 5 and map another 5, giving me ...

run 1, "this "
run 2, "is my"
run 3, " moth" etc

I thought what I had was right but it's clearly not; I keep getting portions of the file that are way off from each other...

My offset starts at 0, then I read x bytes and add that to my offset, and then in run two I start at the new offset and read x bytes again....

Can you help?? This is what i have now...




.386
.model flat, stdcall
option casemap:none

include c:\masm32\include\windows.inc
include c:\masm32\include\kernel32.inc
include c:\masm32\include\user32.inc
include c:\asmdev\win32\msvcrt.inc
include c:\asmdev\win32\handleer.asm

include c:\masm32\include\comdlg32.inc
includelib c:\masm32\lib\kernel32.lib
includelib c:\masm32\lib\user32.lib
includelib c:\masm32\lib\msvcrt.lib

.const

.data
sysinf SYSTEM_INFO <>
fileName db "c:\disp1.txt",0
pMapOffset DWORD 0
.data?
hFile DWORD ?
hMap DWORD ?
dwFileSizeLo DWORD ?
dwFileSizeHi DWORD ?
dwPageSize DWORD ?
dwAllocAlign DWORD ?
pView DWORD ?
pStartOfMap DWORD ?


.code

start:

invoke CreateFile, ADDR fileName, GENERIC_READ, 0, 0, OPEN_EXISTING, 0, 0
MOV hFile, EAX

invoke GetSystemInfo, ADDR sysinf

PUSH sysinf.dwAllocationGranularity
POP dwAllocAlign
PUSH sysinf.dwPageSize
POP dwPageSize

invoke GetFileSize, hFile, ADDR dwFileSizeHi

.IF ([dwFileSizeHi]==0 && EAX==0)
PUSH dwPageSize
POP dwFileSizeLo
.ENDIF


invoke CreateFileMapping, hFile, 0, PAGE_READONLY, dwFileSizeHi, dwFileSizeLo, 0
MOV hMap, EAX

invoke MapViewOfFile, hMap, FILE_MAP_READ, 0, pMapOffset, dwAllocAlign

MOV pStartOfMap, EAX
PUSH dwAllocAlign
POP EAX
ADD pMapOffset, EAX

invoke MapViewOfFile, hMap, FILE_MAP_READ, 0, pMapOffset, dwAllocAlign

MOV pStartOfMap, EAX
PUSH dwAllocAlign
POP EAX
ADD pMapOffset, EAX

invoke MapViewOfFile, hMap, FILE_MAP_READ, 0, pMapOffset, dwAllocAlign

MOV pStartOfMap, EAX
PUSH dwAllocAlign
POP EAX
ADD pMapOffset, EAX

invoke CloseHandle, hFile
invoke CloseHandle, hMap

PUSH 0
CALL ExitProcess

end start



zooba

The problem is that you can only map on certain boundaries, multiples of dwAllocationGranularity, which is typically 64K (10000h). So to access, say, byte 5, you need to round down to that boundary (0) and then add the difference onto the pointer you get back. To access byte 10005h, map a view from 10000h and add 5 to the pointer you get back (remembering that you need the original pointer later to close the view).

Here's a snippet from FileMapLock in the zip file above. (Note that dwAllocAlign is equal to dwAllocationGranularity. I stored it in the filemap object for no particular reason... I may put it somewhere else now I think about it :wink ):

; dwOffsetHi:dwOffsetLo is where we want to start
; oRealStartHi:oRealStartLo is that offset rounded down to the allocation boundary
; oStart is the number of bytes we rounded down
    mov     eax, dwOffsetLo
    mov     edx, dwOffsetHi
    mov     ecx, [edi].FILEMAP.dwAllocAlign
    div     ecx                     ; edx = offset mod dwAllocAlign (needs dwOffsetHi < dwAllocAlign)
    mov     eax, dwOffsetLo
    sub     eax, edx                ; round the low DWORD down
    mov     oStart, edx
    mov     edx, dwOffsetHi         ; high DWORD is passed through unchanged
    mov     [esi].FILEMAPVIEW.oRealStartLo, eax
    mov     [esi].FILEMAPVIEW.oRealStartHi, edx
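
For files under 4GB the same rounding can be done with a mask instead of DIV. This is only a sketch under assumptions: MapAtOffset and pViewBase are hypothetical names, the granularity is assumed to be a power of two, and windows.inc and kernel32.inc are assumed to be included as in the programs above.

```masm
; MapAtOffset (hypothetical helper): map a view containing file offset dwWanted.
; Returns EAX = pointer to the wanted byte, or 0 if MapViewOfFile failed.
; The base pointer that UnmapViewOfFile needs is kept in pViewBase.
.data?
pViewBase DWORD ?

.code
MapAtOffset PROC hMap:DWORD, dwWanted:DWORD, dwAllocAlign:DWORD
    LOCAL oStart:DWORD

    mov eax, dwWanted
    mov ecx, dwAllocAlign
    dec ecx                 ; e.g. 0FFFFh for a 10000h granularity
    mov edx, eax
    and edx, ecx            ; edx = number of bytes we round down by
    mov oStart, edx
    sub eax, edx            ; eax = aligned file offset

    ; a byte count of 0 maps from the offset to the end of the mapping
    invoke MapViewOfFile, hMap, FILE_MAP_READ, 0, eax, 0
    mov pViewBase, eax      ; keep the original pointer for UnmapViewOfFile
    .IF eax != 0
        add eax, oStart     ; step forward to the byte that was asked for
    .ENDIF
    ret
MapAtOffset ENDP
```

Calling it with dwWanted = 10005h and a 10000h granularity maps from 10000h and returns the view pointer plus 5. Mapping to the end of the file is a shortcut that suits small files; for a big file, pass a byte count instead of 0.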

paulfaz

Reyt, I think I might be getting ahead of myself here for a two-week asm newbie... but hey, gotta start somewhere.

I think my main issue really is that I'm having trouble understanding low words and high words and this allocation boundary thing, which pretty much puts me in a boat without any paddles...

MapViewOfFile specifies that it takes a low-order offset and a high-order offset, yet each of these is a DWORD?? That confuses me a bit, to be honest. To my knowledge the returned pointer is a DWORD, so if I get the low and high parts from it, say offsetlo and offsethi, each would contain just a WORD, yet the function wants a DWORD, if that makes any sense....

I ended up doing this,,,



invoke MapViewOfFile, hMap, FILE_MAP_READ, dwStartHi,dwStartLo , dwAllocAlign
MOV dwOffsetLo, EAX
ROL EAX, 16
MOV dwOffsetHi, EAX


I couldn't move AH, which would apparently have contained the low-order byte, into dwOffsetLo as it is a DWORD, so I ended up with the above. As you can probably tell, I'm seriously confused...

Anyway, that appeared to kind of work. I ended up with this.....


invoke MapViewOfFile, hMap, FILE_MAP_READ, dwStartHi,dwStartLo , dwAllocAlign

MOV pStartOfMap, EAX
PUSH dwAllocAlign
POP EAX
ADD pMapOffset, EAX
;--new code
;Obtain the Low & High Offset from the pointer
PUSH pMapOffset
POP EAX

MOV dwOffsetLo, EAX
ROL EAX, 16
MOV dwOffsetHi, EAX

MOV EAX, dwOffsetLo
MOV EDX, dwOffsetHi
MOV ECX, dwAllocAlign
DIV ECX
MOV EAX, dwOffsetLo
SUB EAX, EDX
MOV EDX, dwOffsetHi
MOV dwStartLo, EAX
MOV dwStartHi, EDX

invoke MapViewOfFile, hMap, FILE_MAP_READ, dwStartHi, dwStartLo, dwAllocAlign

;-new code



The output from OLLYDBG reported the first invoke of MapViewOfFile as ...

MapSize = 10000 (65536) = dwAllocAlign(Allocation Granularity)
OffsetLow = 0
OffsetHi = 0

This works; it gets the beginning of the file, presumably up to 65536 bytes. Then the next run, after the calculations above, reports..

MapSize = 10000 (65536) = dwAllocAlign(Allocation Granularity)
OffsetLow = 10000
OffsetHi = 1

Strangely enough, this results in an "ERROR_ACCESS_DENIED" error.

When I take the calculations out and do it how I was doing it before, which works but isn't giving me the right bit of the file, the code looks like this...


invoke MapViewOfFile, hMap, FILE_MAP_READ, 0,pMapOffset , dwAllocAlign


MOV pStartOfMap, EAX
PUSH dwAllocAlign
POP EAX
ADD pMapOffset, EAX


;--new code
;Obtain the Low & High Offset from the pointer
;PUSH pMapOffset
;POP EAX

; MOV dwOffsetLo, EAX
; ROL EAX, 16
; MOV dwOffsetHi, EAX

; MOV EAX, dwOffsetLo
; MOV EDX, dwOffsetHi
; MOV ECX, dwAllocAlign
; DIV ECX
; MOV EAX, dwOffsetLo
; SUB EAX, EDX
; MOV EDX, dwOffsetHi
; MOV dwStartLo, EAX
; MOV dwStartHi, EDX

;-new code

invoke MapViewOfFile, hMap, FILE_MAP_READ, 0, pMapOffset, dwAllocAlign


On the second invoke OLLYDBG reports the following...

MapSize = 10000 (65536) = dwAllocAlign(Allocation Granularity)
OffsetLow = 10000
OffsetHi = 0

The only difference is OffsetHi, which is now 0. Likewise, if I use the calculations above but set OffsetHi to 0, I get the same result...

Help, what's going on? I didn't think I would have to do any special stuff, because I'm only reading in batches of the allocation granularity. I'm not reading random parts or bytes: I read the first dwAllocAlign bytes and then, on the second run, I want to start again where I finished. In the registers this looks right to me. OLLYDBG reports the first invoke starting at 0 and reading 10000 (65536), and the second invoke starting at 10000 (65536) and reading another 10000 (65536), but the text doesn't lead on; it jumps about 40 lines...

Sorry for being a pain in the ass, mate... thanks for your help so far, and I appreciate any further help...

zooba

Okay. It seems you've got the idea of the allocation boundary right.

The trick with the offset is that Windows uses 64-bit unsigned integers for file access. So dwOffsetHi and dwOffsetLo are each DWORDs, since combined they represent a single QWORD (64-bit). In general (i.e. for files smaller than 4GB), dwOffsetHi should always be zero. It's important to always double check what the high and low refer to. These days it usually means DWORDs (with a huge exception being GUI stuff, which uses WORDs for compatibility), since we often need 64-bit numbers. Numbers smaller than 32 bits are generally extended to fit 32 bits, so you basically never pass anything besides DWORDs to API functions.

The return value from MapViewOfFile is a pointer to a memory location, rather than a file offset. Memory addresses/pointers on a 32-bit system are 32-bits. To map a file, a block of memory is allocated somewhere in memory and then any accesses are redirected to the file automatically. Since this pointer doesn't point directly into the file it could be anywhere. And when you unmap and remap the same area of the file, the pointer could point somewhere completely different! The OS keeps up with it and makes it work.

The code causing dwOffsetHi to be 1 is the ROL. If you ROL 00010000h by sixteen bits, you end up with 00000001h. At the end of the calculation, this number is put directly into dwStartHi without being changed (the assumption being that dwAllocAlign is a power of 2 and rounding down won't require a borrow from the high DWORD). My use of Offset as part of the name implies (to me at least) that it's not a pointer to memory (plus I always prefix pointers with a p).

Main points:

The offset passed to MapViewOfFile is a 64-bit number split into a high and low DWORD.

The return value from MapViewOfFile is a 32-bit memory address, it's not a file offset.
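
As a small illustration of the first point, this sketch steps a 64-bit file offset forward by one allocation unit and maps the next view. It borrows the names dwOffsetLo, dwOffsetHi, dwAllocAlign and hMap from the code earlier in the thread and assumes they are declared and filled in as they are there.

```masm
.data
dwOffsetLo DWORD 0          ; low 32 bits of the file offset
dwOffsetHi DWORD 0          ; high 32 bits, stays 0 for files under 4GB

.code
    ; 64-bit add: offset = offset + dwAllocAlign
    mov eax, dwAllocAlign
    add dwOffsetLo, eax     ; sets the carry flag if the low DWORD wraps
    adc dwOffsetHi, 0       ; the carry moves into the high DWORD

    invoke MapViewOfFile, hMap, FILE_MAP_READ, dwOffsetHi, dwOffsetLo, dwAllocAlign
    ; EAX is now a memory pointer (or 0 on failure), not a file offset
```

The high DWORD only ever changes through the carry; it is never derived from the pointer that MapViewOfFile returns.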

You're doing pretty well for two weeks in. The best way to learn is to do more than you know how to :wink

Cheers,

Zooba :U

paulfaz

Cheers mate, you're not going to believe this... it was right all along...

Because I was running it in a debugger, the text string that OLLYDBG was showing me was actually cut down; it wasn't the full string for what was mapped. When I ran it using WriteConsole I discovered there was more, and managed to line up the first and second run.....

My problem now is detecting the end of the file... I'll be back if I can't suss it out.

cheers mate.
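
One possible way to find the end of the file, sketched under assumptions: compare the current offset with the file size and shrink the last view to whatever is left. This assumes dwFileSizeLo holds the value GetFileSize returned, that the file is under 4GB, and that hMap, dwOffsetLo and dwAllocAlign are set up as in the code above. dwChunk and pChunkView are made-up names. As far as I know, a view that would run past the end of the mapping is refused rather than truncated, which is why the last chunk is sized explicitly.

```masm
.data?
dwChunk    DWORD ?          ; hypothetical: bytes mapped in this pass
pChunkView DWORD ?          ; hypothetical: base pointer of the current view

.code
ChunkLoop:
    mov eax, dwFileSizeLo
    sub eax, dwOffsetLo     ; eax = bytes left in the file
    jbe ChunkDone           ; nothing left (offset >= size): end of file

    cmp eax, dwAllocAlign
    jbe @F
    mov eax, dwAllocAlign   ; never map more than one allocation unit
@@:
    mov dwChunk, eax

    invoke MapViewOfFile, hMap, FILE_MAP_READ, 0, dwOffsetLo, dwChunk
    test eax, eax
    jz ChunkDone            ; mapping failed: stop
    mov pChunkView, eax

    ; ... use dwChunk bytes starting at pChunkView here ...

    invoke UnmapViewOfFile, pChunkView   ; release the view before mapping the next

    mov eax, dwChunk
    add dwOffsetLo, eax
    jmp ChunkLoop
ChunkDone:
```

Unmapping each view before mapping the next keeps the address space from filling up on a big file.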

zooba

Excellent! Just yell if you want anything else :U

paulfaz

LOL, HELP...

Reyt mate, I've got a further issue now when updating the edit control with EM_REPLACESEL.

It loops through the process, reading in 64K (dwAllocAlign) each time, 3 times before it crashes. The process is....


FileReadLoop:

;Map File
invoke MapViewOfFile, hMap, FILE_MAP_READ, dwOffsetHi, dwOffsetLo, dwAllocAlign

;Check we got a pointer back, if not end the loop
MOV pStartOfMap, EAX
CMP EAX,0
JE EndLoop

;Get the length of text already in the edit box
PUSH hWndCTL_FC
CALL GetWindowTextLength
MOV text_len, EAX

;Increment newsize by newsize=textlen + dwAllocAlign(64K) giving a new size for the editbox
PUSH text_len
POP newsize
PUSH dwAllocAlign
POP EAX
ADD newsize, EAX

;Set the new size to allow for next update
invoke SendMessage, hWndCTL_FC, EM_LIMITTEXT, newsize, 0

;Select the end of the current length of text
PUSH text_len
PUSH text_len
PUSH EM_SETSEL
PUSH hWndCTL_FC
CALL SendMessage

;Send a replacement to Selected text for whats in pStartOfMap = MemmoryMapFile Pointer
PUSH pStartOfMap
PUSH 0
PUSH EM_REPLACESEL
PUSH hWndCTL_FC
CALL SendMessage

;Increment the FileOffset dwOffsetLo by 64K for next Memory Map
PUSH dwAllocAlign
POP EAX
ADD dwOffsetLo, EAX

JMP FileReadLoop

EndLoop:



From what I can tell, this fails after around 3 loops, on the SendMessage for EM_REPLACESEL, with an access violation.

OLLYDBG ERROR : Access Violation When Reading [016E0000]
DS:[016E0000]=???
CL=2E('.')   << Is this possibly the current char being read from the filemap...

Any ideas mate??

Thanks

Paul
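
One thing worth checking here: EM_REPLACESEL expects a pointer to a null-terminated string, and a mapped view of raw file data has no terminating zero, so the control may keep reading past the end of the view. The faulting address 016E0000 sits on a 64K boundary, which would fit that. A minimal sketch of a workaround, assuming chunks of at most 64K, that windows.inc and user32.inc are included, and with AppendChunk and chunkBuf as made-up names:

```masm
.data?
chunkBuf db 10001h dup(?)   ; one 64K chunk plus one byte for the terminating zero

.code
; AppendChunk (hypothetical helper): copy cbLen bytes from pSrc into chunkBuf,
; terminate the copy with a zero and insert it at the current selection in hEdit.
AppendChunk PROC USES esi edi hEdit:DWORD, pSrc:DWORD, cbLen:DWORD
    mov ecx, cbLen
    .IF ecx > 10000h
        mov ecx, 10000h     ; never copy more than the buffer can hold
    .ENDIF
    mov esi, pSrc
    mov edi, OFFSET chunkBuf
    cld
    rep movsb               ; copy the chunk out of the mapped view
    mov BYTE PTR [edi], 0   ; the terminator EM_REPLACESEL relies on

    invoke SendMessage, hEdit, EM_REPLACESEL, FALSE, ADDR chunkBuf
    ret
AppendChunk ENDP
```

In the loop above this would be called as `invoke AppendChunk, hWndCTL_FC, pStartOfMap, dwAllocAlign` after the EM_SETSEL, followed by `invoke UnmapViewOfFile, pStartOfMap` so the views do not pile up.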

zooba

Quite possibly you have reached the maximum limit for a textbox. Try using a rich edit box instead, they can have bigger files in them.

Also, this could probably go into a new post, since everyone else appears to have abandoned this one and GUI isn't really my thing  :P

Cheers,

Zooba :U

MichaelW

The default limits are 32767 characters for an edit control and 64000 for a rich edit control. You can use the EM_SETLIMITTEXT message to change this.
eschew obfuscation

paulfaz

Cheers zooba and MichaelW,

I'm already using an EM_SETSEL, which looks like it does the same thing. I'll try recoding it with a rich edit box.

Basically I'm trying to code my own ASM web log viewer, for opening logs that are going to be around 180MB...

The file I'm testing with is only 4MB. What's the control that is used by Notepad.exe? Does anyone know? Notepad can open the 4MB file....

Each time round the loop, I'm retrieving how much is already in the edit box and then adding the allocation granularity, which is the amount I'm going to add (64K), so I end up with newsize = mytextboxlen + dwAllocAlign. Then I send an EM_SETSEL to change the maximum length to "newsize" to accommodate the new length. It all seems right in principle, it just doesn't work...

I need to tidy up the program anyway, so I'll start again. At least I now have the code to do what I want, so I'll redo it with a rich edit control...

PBrennick

If you are going to use a richedit control you can set EM_LIMITTEXT to -1 which will tell the control to use all of the available memory. With my editor, I use ReadFile with no problems. I can open a 4 meg file in one read.

Paul

MichaelW

paulfaz,

AFAIK Notepad is based on an edit control. If you can open a 4MB file then you must be running Windows NT/2000/XP. According to the page I linked the defaults apply until EM_SETLIMITTEXT is called. There is no mention of EM_SETSEL affecting the limit here.