Puzzles about memory-mapped files

Started by mathewzhao, September 04, 2008, 07:40:25 AM

mathewzhao

    A MASM book says memory-mapped files are faster than direct read and write operations when used on small files (it doesn't explain why), but this surprised me, since both ways load some sections of a file into physical memory.
    Why are direct read and write operations said to be slower than memory-mapped files?

   One of my classmates told me direct read and write operations are supported by SOFTWARE (the OS), but memory-mapped files are supported by HARDWARE (the CPU), so the latter are faster than the former. Is that view right or not?

  Another classmate said the mapped memory region is actually Windows's file cache, so no copies need to be created in user space, but I heard applications can access and update data in the file directly and in place, as opposed to seeking from the start of the file or rewriting the entire edited contents to a temporary location.
  So I am even more surprised: if no data is copied to user space, how can the data in the file be updated in place?

Tedd

Quote from: mathewzhao on September 04, 2008, 07:40:25 AM
    A MASM book says memory-mapped files are faster than direct read and write operations when used on small files (it doesn't explain why), but this surprised me, since both ways load some sections of a file into physical memory.
    Why are direct read and write operations said to be slower than memory-mapped files?
It depends on exactly how the OS (Windows) handles the read buffers - memory mapping 'can' be faster because the OS handles the reading and can read a whole memory page and map it directly into the process's address space without needing to copy; with a small direct read (less than a memory page), the data has to be copied into the destination buffer.
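
To make the two paths concrete, here is a minimal sketch in Python, chosen only because its mmap module runs on both Windows and Unix (on Windows it wraps CreateFileMapping/MapViewOfFile). The file name and contents are made up for illustration, and the sketch shows only that both paths see the same bytes; it does not measure speed.

```python
import mmap
import os
import tempfile

# Hypothetical small test file, created just for this sketch.
path = os.path.join(tempfile.mkdtemp(), "small.bin")
with open(path, "wb") as f:
    f.write(b"hello, mapped world")

# Direct read: the OS copies the bytes into the buffer we get back.
with open(path, "rb") as f:
    via_read = f.read()

# Memory mapping: the file's pages are mapped into our address space
# and the OS loads them when they are first touched.
with open(path, "rb") as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
        via_map = view[:]  # slicing copies the bytes out so we can compare

print(via_read == via_map)  # True
```

The slice at the end makes a copy purely for the comparison; code that works on the view directly would skip it.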

Quote
   One of my classmates told me direct read and write operations are supported by SOFTWARE (the OS), but memory-mapped files are supported by HARDWARE (the CPU), so the latter are faster than the former. Is that view right or not?
No. Paging is supported by the hardware, and paging can be used so that the data read is mapped into the process address space, but the hardware itself doesn't support memory-mapped files (the OS must tell it what to do).

Quote
  Another classmate said the mapped memory region is actually Windows's file cache, so no copies need to be created in user space, but I heard applications can access and update data in the file directly and in place, as opposed to seeking from the start of the file or rewriting the entire edited contents to a temporary location.
  So I am even more surprised: if no data is copied to user space, how can the data in the file be updated in place?
Almost right. File data is read into physical memory, and that memory can then be mapped into the virtual address space of a process. The process can then read and modify those pages as if it had allocated the memory itself; the changes exist only in memory until the modified pages are written back to the file. It starts to get a bit more complicated if two processes memory-map the same file: they share the same physical pages, and for a copy-on-write mapping (FILE_MAP_COPY) a write is trapped by the hardware's page protection so that the writing process gets its own private copy of that page. On Windows 9x the mapping appears in the 'shared' region in the upper 2GB of the address space, which gives the system somewhere free to map DLLs, memory-mapped files, and shared memory objects; on NT-based Windows the upper 2GB belongs to the kernel, and the view is mapped into the process's own user-mode range.
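
A small sketch of the difference between a private (copy-on-write) view and a shared writable view, again in Python as a portable stand-in. I am assuming ACCESS_COPY corresponds to a copy-on-write mapping (FILE_MAP_COPY on Windows) and ACCESS_WRITE to a shared writable one; the file name is hypothetical.

```python
import mmap
import os
import tempfile

# Hypothetical test file, created just for this sketch.
path = os.path.join(tempfile.mkdtemp(), "shared.bin")
with open(path, "wb") as f:
    f.write(b"AAAA")

with open(path, "r+b") as f:
    # Copy-on-write view: the write lands in a private copy of the page.
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY) as private_view:
        private_view[0:1] = b"X"
        private_sees = private_view[:]

with open(path, "rb") as f:
    after_private = f.read()  # the file itself is untouched

with open(path, "r+b") as f:
    # Shared writable view: the write modifies the pages backing the file.
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_WRITE) as shared_view:
        shared_view[0:1] = b"Y"
        shared_view.flush()

with open(path, "rb") as f:
    after_shared = f.read()

print(private_sees, after_private, after_shared)  # b'XAAA' b'AAAA' b'YAAA'
```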
No snowflake in an avalanche feels responsible.

gwapo

Quote
It depends on exactly how the OS (Windows) handles the read buffers - memory mapping 'can' be faster because the OS handles the reading and can read a whole memory page and map it directly into the process's address space without needing to copy; with a small direct read (less than a memory page), the data has to be copied into the destination buffer.

If a file being memory-mapped is fragmented in many locations, then memory-mapping *still* needs to copy it somewhere to make it appear like you are "viewing" a continuous file space, hence the need to flush it back for any changes made.

Cheers,

-chris

Tedd

Fragmented in what way?

Whether the file is fragmented on disk is irrelevant as once it's read into memory it becomes linear (and you only see the memory, not the disk sectors.)
If you mean in physical memory, that makes no difference as physical memory is only allocated in whole pages, and the pages will be mapped into the address space linearly.
If you mean in virtual memory, the size of the mapping is specified in advance, so the range can be allocated in one go.
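
As a rough illustration of the last point, this Python sketch maps a fixed-size window from the middle of a file. The window's offset and length are chosen when the view is created, and the view then looks like one linear buffer. The file name and layout are made up; the only real constraint used is that the offset must be a multiple of mmap.ALLOCATIONGRANULARITY, whose value differs between systems.

```python
import mmap
import os
import tempfile

gran = mmap.ALLOCATIONGRANULARITY  # view offsets must be a multiple of this

# Hypothetical file made of three blocks of 'a', 'b' and 'c' bytes.
path = os.path.join(tempfile.mkdtemp(), "blocks.bin")
with open(path, "wb") as f:
    f.write(b"a" * gran + b"b" * gran + b"c" * gran)

with open(path, "rb") as f:
    # Map only the second block; the size is fixed up front.
    with mmap.mmap(f.fileno(), gran, access=mmap.ACCESS_READ, offset=gran) as view:
        window_len = len(view)
        first, last = view[0:1], view[-1:]

print(window_len == gran, first, last)  # True b'b' b'b'
```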

It needs to be flushed back after modification because the changes are only in memory, so the data needs to be rewritten to the file.
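
A minimal sketch of an in-place edit followed by a flush, in Python as a portable stand-in (its flush() is what I am assuming plays the role of FlushViewOfFile on Windows). The file name and contents are hypothetical.

```python
import mmap
import os
import tempfile

# Hypothetical test file, created just for this sketch.
path = os.path.join(tempfile.mkdtemp(), "edit.bin")
with open(path, "wb") as f:
    f.write(b"0123456789")

with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), 0) as view:  # default: shared, read/write
        view[3:6] = b"abc"  # change bytes in the middle, no seek or rewrite
        view.flush()        # ask the OS to write the modified pages back

with open(path, "rb") as f:
    result = f.read()

print(result)  # b'012abc6789'
```

Only the three bytes were touched in memory; the OS decides which pages are dirty and writes those back.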

mathewzhao

After some days, I still can't understand this:
Quote
  Windows transparently loads parts of the file into physical memory, when each page of the file is accessed and copied into memory so that the CPU can access it.

  I want to know whether a memory-mapped file also needs direct file I/O calls at a lower level to copy the data into memory.
  Does a memory-mapped file just reduce the number of times the hard disk's read/write head moves? Is that why we say the primary benefit of memory-mapping a file is increased I/O performance?
  Is my view right?
  Thanks!

Tedd

Quote from: mathewzhao on September 08, 2008, 02:28:38 AM
  I want to know whether a memory-mapped file also needs direct file I/O calls at a lower level to copy the data into memory.
  Does a memory-mapped file just reduce the number of times the hard disk's read/write head moves? Is that why we say the primary benefit of memory-mapping a file is increased I/O performance?
  Is my view right?
Any access to files will need some file I/O calls. The only difference with memory-mapped files is that the OS takes care of this for you, so you don't explicitly need to do it (but it does still need to be done at some point.)
It may reduce the number of separate disk accesses, because the OS can read large chunks at once - but you can also do the same thing (read the whole file in one go), so it's not the primary benefit.
Whether memory-mapped or not, the file still needs to be read from the disk (using file access functions).
I've already explained why memory-mapping can be faster.