Reading large file byte per byte

Started by jdoe, January 10, 2009, 05:29:38 AM


jdoe


Hi,

If I want to read a large file byte per byte, is it better to decrement the file size until it reaches zero, or can I increment the file pointer until it reaches the end pointer?

The reason I ask is that it occurred to me that if the pointer reaches FFFFFFFFh and I check whether the start pointer is below the end pointer, the check would fail past that limit.

What should I know about addressing small versus large files?

Thanks


donkey

When lpDistanceToMoveHigh is NULL, lDistanceToMove is a 32-bit signed value; when lpDistanceToMoveHigh is not NULL, the two make up a 64-bit signed value. For files over 4 GB you use a 64-bit signed pointer. You would compare lpDistanceToMoveHigh and lDistanceToMove together as one signed value.
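
To make the two-part pointer concrete, here is a minimal C sketch of how a high and a low 32-bit half could be combined into one signed 64-bit position and then compared directly. The names `make_pos64` and `is_before` are made up for illustration; only the roles of the two halves come from the post above.

```c
#include <stdint.h>

/* Combine the high and low 32-bit halves into one signed 64-bit position.
   The low half is treated as unsigned once a high half is present. */
static int64_t make_pos64(int32_t high, uint32_t low)
{
    uint64_t bits = ((uint64_t)(uint32_t)high << 32) | low;
    return (int64_t)bits;
}

/* Nonzero when position a lies before position b. */
static int is_before(int32_t a_high, uint32_t a_low,
                     int32_t b_high, uint32_t b_low)
{
    return make_pos64(a_high, a_low) < make_pos64(b_high, b_low);
}
```

With this, a position of 0:FFFFFFFFh compares as smaller than 1:00000000h, which is the case a plain 32-bit compare gets wrong.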
"Ahhh, what an awful dream. Ones and zeroes everywhere...[shudder] and I thought I saw a two." -- Bender
"It was just a dream, Bender. There's no such thing as two". -- Fry
-- Futurama

Donkey's Stable

hutch--

JD,

Edgar is right: if the file is over 4 gig you need to use the extra value to get its size. But there is another approach: work out a convenient buffer size, read that much data in a block, then read the next one, and so on. It's probably faster than reading one byte at a time through the file system, even though the file system uses a memory cache to store blocks of data.

The basics: if you choose a 1 meg buffer, for example, load the first meg into memory, scan it, read the next meg, and so on. This approach works if you are reading the file sequentially. If you need random access or reverse reads, you will need to use the file system.
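
As a rough illustration of the block approach, here is a minimal C sketch that reads a file 1 MB at a time and scans each block in memory. It uses the C library's `fread` as a stand-in for ReadFile, and the function name `count_byte_in_file` is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

#define CHUNK_SIZE (1024 * 1024)  /* 1 MB, as in the example above */

/* Count how many bytes in the file equal `target`, reading CHUNK_SIZE
   bytes at a time. Returns -1 if the file cannot be opened or the
   buffer cannot be allocated. */
static long long count_byte_in_file(const char *path, unsigned char target)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    unsigned char *buf = malloc(CHUNK_SIZE);
    if (buf == NULL) {
        fclose(fp);
        return -1;
    }

    long long count = 0;
    size_t got;
    /* fread returns 0 once nothing more can be read, which ends the loop. */
    while ((got = fread(buf, 1, CHUNK_SIZE, fp)) > 0) {
        for (size_t i = 0; i < got; i++) {
            if (buf[i] == target)
                count++;
        }
    }

    free(buf);
    fclose(fp);
    return count;
}
```

The inner loop only ever indexes within one block, so no pointer or counter comes near the 32-bit limit regardless of the file size.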

donkey

Hutch is right, you should be using buffers of a predetermined size to read the file in chunks. However, to test whether a file pointer is less than a given value you only need a few extra steps...

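// 64 bit pointer value in EDX:EAX
test edx,edx
jnz >
cdq // If EDX is zero, sign extend EAX into EDX (treats EAX as a signed 32 bit value)
:
cmp edx,[PointerHigh] // High dwords: signed compare
jl >.ISSMALLER
jg >.ISLARGEROREQUAL // High dword is larger, so the low dwords cannot change the result
cmp eax,[PointerLow] // High dwords are equal: unsigned compare of the low dwords
jb >.ISSMALLER

.ISLARGEROREQUAL
...


.ISSMALLER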
...
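
For readers who prefer to see the two-step compare spelled out, here is a minimal C sketch of the same idea working on the halves directly: the high halves are compared as signed values, and the low halves are compared as unsigned values only when the high halves are equal. The function name `is_smaller` is hypothetical.

```c
#include <stdint.h>

/* Nonzero when the 64-bit value a_high:a_low is less than b_high:b_low. */
static int is_smaller(int32_t a_high, uint32_t a_low,
                      int32_t b_high, uint32_t b_low)
{
    if (a_high != b_high)
        return a_high < b_high;   /* high halves differ: signed compare decides */
    return a_low < b_low;         /* high halves equal: unsigned compare of low halves */
}
```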

jdoe


Thanks guys for this information, and I'm sorry for my late reply.

I realize that my question wasn't really precise. I know that getting a file pointer through MapViewOfFile for a large file could be slow, but to explain what I'm trying to understand, let's say I do it that way. So I get a file pointer in EAX from MapViewOfFile and I want to read the file byte per byte (BYTE PTR [eax]). If the file is large, I'll reach the FFFFFFFFh pointer, and what happens after that point? It seems unreliable to get the end pointer and stop reading when the start pointer reaches it. Am I better off using the file size and decrementing it until it is zero? Sorry if I wasn't clear.


sinsi

If you're using MapViewOfFile, you won't have to worry about the pointer overflowing since you can only map less than 4 GiB anyway - it's got to fit into your program's address space.
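
To tie this back to the original question, here is a minimal C sketch of both loop styles over a mapped view. It assumes `view` stands for the address returned by MapViewOfFile and `size` for the number of bytes mapped; the function names are made up. Because the whole view has to fit inside the address space, the end pointer `view + size` is an ordinary address and the two loops visit exactly the same bytes.

```c
#include <stddef.h>

/* Sum the bytes by walking a pointer from the start of the view to its end. */
static unsigned long sum_by_pointer(const unsigned char *view, size_t size)
{
    const unsigned char *p = view;
    const unsigned char *end = view + size;  /* one past the last byte */
    unsigned long sum = 0;
    while (p != end)
        sum += *p++;
    return sum;
}

/* Same walk, counting the remaining size down to zero. */
static unsigned long sum_by_count(const unsigned char *view, size_t size)
{
    unsigned long sum = 0;
    while (size != 0) {
        sum += *view++;
        size--;
    }
    return sum;
}
```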

If you use ReadFile to read byte-by-byte, just look at lpNumberOfBytesRead: at EOF, ReadFile returns no error but lpNumberOfBytesRead=0.
Using ReadFile to read single bytes is very slow; you're better off filling a buffer, then having a proc of your own read from it a byte at a time.
Once the buffer is empty, fill it up again until you get to EOF. Of course, if you're not reading sequentially you don't have a lot of choice.
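
The "proc of your own" idea could look something like the following minimal C sketch. It uses `fread` in place of ReadFile; the `ByteReader` type, the function names, and the 4096-byte buffer size are all assumptions made for illustration.

```c
#include <stdio.h>

#define READER_BUF_SIZE 4096  /* arbitrary size chosen for this sketch */

typedef struct {
    FILE *fp;
    unsigned char buf[READER_BUF_SIZE];
    size_t pos;   /* index of the next unread byte in buf */
    size_t len;   /* number of valid bytes currently in buf */
} ByteReader;

static void reader_init(ByteReader *r, FILE *fp)
{
    r->fp = fp;
    r->pos = 0;
    r->len = 0;
}

/* Return the next byte (0..255), or -1 once nothing more can be read. */
static int reader_next(ByteReader *r)
{
    if (r->pos == r->len) {
        /* Buffer is empty: refill it. Zero bytes read means EOF
           (or a read error, which this sketch does not distinguish). */
        r->len = fread(r->buf, 1, READER_BUF_SIZE, r->fp);
        r->pos = 0;
        if (r->len == 0)
            return -1;
    }
    return r->buf[r->pos++];
}
```

The caller still sees one byte at a time, but the file system is only hit once per buffer, and the zero-length read plays the same role as lpNumberOfBytesRead=0.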

jdoe


Thanks sinsi. You just gave me a few clues about stuff I have to learn.

With basic knowledge, I can do a lot of stuff, and with only one more piece of the puzzle in place, I'm gonna do things differently but more accurately. I love assembly for that... you never know when "that little something" will take place in your head.

If I had more time to read instead of learning by coding...
