Title: Reading a file - Dynamic size
Post by: Ic3D4ne on January 26, 2005, 06:06:07 PM

If I were to read a file into a buffer, without knowing its file size before coding/assembling the application, how would I go about dynamically allocating space for the buffer?

And also, can someone explain the argument lpNumberOfBytesRead in the ReadFile function for me?

Code Select

BOOL ReadFile(

    HANDLE hFile,	// handle of file to read 
    LPVOID lpBuffer,	// address of buffer that receives data  
    DWORD nNumberOfBytesToRead,	// number of bytes to read 
    LPDWORD lpNumberOfBytesRead,	// address of number of bytes read 
    LPOVERLAPPED lpOverlapped 	// address of structure for data 
   );

Help greatly appreciated.

-Ic3D4ne

Title: Re: Reading a file - Dynamic size
Post by: Relvinian on January 26, 2005, 06:56:35 PM

Quote from: Ic3D4ne on January 26, 2005, 06:06:07 PM
If I were to read a file into a buffer, without knowing its file size before coding/assembling the application, how would I go about dynamically allocating space for the buffer?

And also, can someone explain the argument lpNumberOfBytesRead in the ReadFile function for me?

Code Select

BOOL ReadFile(
    HANDLE hFile,                 // handle of file to read
    LPVOID lpBuffer,              // address of buffer that receives data
    DWORD nNumberOfBytesToRead,   // number of bytes to read
    LPDWORD lpNumberOfBytesRead,  // address of number of bytes read
    LPOVERLAPPED lpOverlapped     // address of structure for data
   );

Help greatly appreciated.

-Ic3D4ne

Here is a very simple function to open an existing file on your hard drive, find out its size, dynamically allocate a buffer, read the file contents into the buffer, and close the file handle.

I won't provide a lot of error checking for this sample but will try to comment the code well. Also, I haven't listed the includes that supply the defines used by CreateFile, etc., so you'll need to add those to use this function in your own application.

NOTE:
I always wrap [] around my variable names. Some people do and some don't. It is personal preference with MASM but some assemblers do care if you put [] around your variables.

Function returns one of two different possibilities:
EAX = NULL if there was some sort of error
EAX = address of buffer otherwise

Code Select


LoadFile PROC PUBLIC FileSpec:DWORD
   LOCAL fileHandle  : HANDLE   ; handle to opened file
   LOCAL fileSizeLow : DWORD   ; size of file (low dword value)
   LOCAL fileSizeHigh : DWORD  ; size of file (high dword value)
   LOCAL bytesRead : DWORD   ; need a variable to hold the amount of bytes read from file
   LOCAL fileBuffer   : DWORD   ; a pointer to a dynamically allocated buffer (memory location)
  
   ; try and open the file specified by FileSpec parameter
   invoke CreateFile, [FileSpec], GENERIC_READ, FILE_SHARE_READ or FILE_SHARE_WRITE, NULL, \
        OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL or FILE_FLAG_SEQUENTIAL_SCAN,  NULL
   
   ; make sure we at least opened the file
   cmp eax, INVALID_HANDLE_VALUE
   mov [fileHandle], eax
   je FailedOpen
   
   ; now that the file has been successfully opened, figure out the size of the file.
   ; GetFileSize takes two arguments: the file handle and a pointer that receives the high dword.
   invoke GetFileSize, [fileHandle], addr fileSizeHigh
   mov [fileSizeLow], eax

   ; NOTE: this example assumes the file size will be less than two gigabytes so we are only going to use
   ; the fileSizeLow value and ignore the fileSizeHigh value

   ; allocate a buffer to hold the file contents. There are *many* different allocation routines to use
   ; but two of the most common for dynamic memory and files are HeapAlloc and VirtualAlloc. In this
   ; example, we are using VirtualAlloc to create our buffer
   invoke VirtualAlloc, NULL, [fileSizeLow], MEM_COMMIT, PAGE_READWRITE
   mov [fileBuffer], eax

   ; at this point, we are ready to read the contents of the file into the newly allocated buffer
   invoke ReadFile, [fileHandle], addr [fileBuffer], [fileSizeLow], addr bytesRead, NULL

   ; EAX will be either zero (failure) or nonzero (success).
   ; you can see how many bytes were actually read from the file by checking the 'bytesRead' variable.
   ; unless there was some type of file corruption, 'fileSizeLow' should equal 'bytesRead'. This
   ; is a way to check and make sure you read as much as you were expecting.

   ; another use for 'bytesRead' is reading a file in "chunks". Say you allocate a 64k
   ; memory block to read the file into and process that segment before reading another. This
   ; way, you would check bytesRead to see how many bytes were read (in case there were fewer
   ; than the 64k block size you specified).

   ; close the file handle (the Win32 call is CloseHandle; there is no CloseFile)
   invoke CloseHandle, [fileHandle]
    
   ; at this point, you have successfully opened the file, read its contents and closed it.
   ; return the pointer to the buffer.
   mov eax, [fileBuffer]
   jmp Done

FailedOpen:
   ; return a NULL value in EAX
   xor eax, eax

Done:
   ret
LoadFile ENDP
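
For readers who think more easily in C, the chunked-read idea from the comments above can be sketched with the C standard library. This is only an illustration under stated assumptions, not the MASM routine: fopen/fread are swapped in for CreateFile/ReadFile, the return value of fread plays the role of 'bytesRead', and the name count_bytes_chunked and the 64k CHUNK_SIZE are made up for the example.

```c
#include <stdio.h>
#include <stddef.h>

#define CHUNK_SIZE (64 * 1024)   /* assumed block size, as in the 64k example */

/* Hypothetical helper: reads a file in fixed-size chunks and returns the
   total number of bytes seen, or -1 on error. */
long count_bytes_chunked(const char *path)
{
    static unsigned char chunk[CHUNK_SIZE];  /* reused for every read */
    FILE *fp = fopen(path, "rb");
    long total = 0;
    size_t got;

    if (fp == NULL)
        return -1;                           /* could not open the file */

    /* The last chunk is usually shorter than CHUNK_SIZE, so always work
       with the count actually read, never the size that was requested. */
    while ((got = fread(chunk, 1, sizeof chunk, fp)) > 0) {
        /* process chunk[0 .. got-1] here */
        total += (long)got;
    }

    if (ferror(fp)) {                        /* read error, not end of file */
        fclose(fp);
        return -1;
    }
    fclose(fp);
    return total;
}
```

The point carried over from the MASM comments is the same: the requested size is only an upper bound, and the count reported back by the read call is what tells you how much of the buffer is valid.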

Hope this helps you.

Relvinian

Title: Re: Reading a file - Dynamic size
Post by: Ic3D4ne on January 26, 2005, 09:03:55 PM

Wow, thanks a lot.

Heh, that really, really, really helped me.

Just a 3 line answer would have sufficed, but that's good too. ;)

Just one more thing, could you explain what the difference is between using [] wraps, and not using them?

-Ic3D4ne

Title: Re: Reading a file - Dynamic size
Post by: Relvinian on January 26, 2005, 09:32:01 PM

Quote from: Ic3D4ne on January 26, 2005, 09:03:55 PM
Wow, thanks a lot.

Heh, that really, really, really helped me.

Just a 3 line answer would have sufficed, but that's good too. ;)

Just one more thing, could you explain what the difference is between using [] wraps, and not using them?

-Ic3D4ne

Wrapping [] around a variable is a way to tell the assembler whether you want the contents of what the variable refers to or the address of the variable.

Example:

Code Select


mov eax, [fileHandle]
mov eax, fileHandle

In MASM, the two statements listed above are the same thing. MASM doesn't use [] to distinguish between the ADDRESS and the CONTENTS of a variable. It always uses the CONTENTS unless you specify the ADDR keyword first (in an invoke statement). So the above two lines of code are saying:

Move the contents of variable 'fileHandle' into the EAX register. fileHandle has some value such as '0x00000fdc'. The address of fileHandle might be something like 0x0f23561f.

In some other assemblers, the above code could produce different results based on the assembler's definition of what [] means.

Relvinian

Title: Re: Reading a file - Dynamic size
Post by: Titan on January 26, 2005, 09:51:58 PM

Yes, thanks for posting that Relvinian. So if I wanted to use "LoadFile" to check if a file exists... I simply run LoadFile, and if eax is NULL then the file doesn't exist?

Title: Re: Reading a file - Dynamic size
Post by: Relvinian on January 26, 2005, 10:10:05 PM

Quote from: Titan on January 26, 2005, 09:51:58 PM
Yes, thanks for posting that Relvinian. So if I wanted to use "LoadFile" to check if a file exists... I simply run LoadFile, and if eax is NULL then the file doesn't exist?

You could do that but it would be *VERY* inefficient because if the file does exist, you are loading it into memory. You would also have to worry about freeing the buffer if it did load it.

A better solution to check and see if a file exists on the hard drive would be:

Code Select


   ; EAX = 0  ( file doesn't exist )
   ; EAX = 1  ( file does exist )

DoesFileExist PROC PUBLIC FileSpec: dword
    LOCAL  findInfo : WIN32_FIND_DATA
    
    ; check to see if the file exists on the hard drive by simply using a single API call.
    invoke FindFirstFile, [FileSpec], addr [findInfo]
   
    ; check the return value of FindFirstFile.  If the value is INVALID_HANDLE_VALUE,
    ; the file didn't exist, otherwise it did.
    cmp eax, INVALID_HANDLE_VALUE
    je NotFound

    ; since the file was found, we need to close the handle
    invoke FindClose, eax
    mov eax, 1
    jmp Done

NotFound:
    xor eax, eax

Done:
    ret
DoesFileExist ENDP

This is, in my opinion, the best way to check whether a file exists.

Relvinian

Title: Re: Reading a file - Dynamic size
Post by: Titan on January 27, 2005, 03:44:30 AM

Thank you Relvinian. :U I had to do some tweaking for my assembler to like that code, but it works... and it was just what I've been looking for. ;) Thanks again for sharing your work. :)

Title: Re: Reading a file - Dynamic size
Post by: Relvinian on January 27, 2005, 03:38:23 PM

Titan,

No problem there. Glad the example function helped you out.

Relvinian

Title: Re: Reading a file - Dynamic size
Post by: Nilrem on January 27, 2005, 07:16:32 PM

I adapted my code (ReadFile thread, recently posted), however instead of displaying what was there, it now displays what was in the text file plus some weird characters that look like webdings. Here is my code, might have missed something:

Code Select


.data

OpenFileError DB " Could not be found.",0
CurrentFile DB ".\Resource\Read.txt",0
FileRead DB "File was read.",0
;dwNumRead DW ?
;lpstring DW 128
        
.data?

Numb DD ?
FileSize DD ?
hFile HANDLE ?

		LOCAL lpstring[128]:DWORD;Used for input()
		LOCAL buffer[128]:BYTE;	Used for putting strings together
		LOCAL buffer2:DWORD; A pointer to a dynamically allocated buffer (memory location)
		LOCAL fileSizeLow:DWORD; size of file (low dword value)
   		LOCAL fileSizeHigh:DWORD; size of file (high dword value)
   		
		invoke CreateFile, ADDR CurrentFile, GENERIC_READ, FILE_SHARE_READ,
        NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL
        
        mov [hFile], eax;	Move handle value to handle variable
        
        invoke GetLastError;	Get the error value
        xor ebx, ebx;	Set ebx to 0
        mov ebx, eax;	Move the value of GetLastError to ebx
        .if ebx != 0;	If the file does not exist
        	strcat ADDR [buffer], ADDR [CurrentFile], ADDR [OpenFileError];	Put the strings together
        	invoke StdOut, addr [buffer];	Print the new combined string to the screen
        	mov lpstring, input();	Wait for user input
        	ret
    	.endif
        
   		;Now that the file has been successfully opened, figure out the size of the file.
   		invoke GetFileSize, [hFile], addr fileSizeHigh
   		mov [fileSizeLow], eax
   		; allocate a buffer to hold the file contents into. There are *many* different allocation routines to use
   		; but two of the most common for dynamic memory and files are HeapAlloc and VirtualAlloc. In this
   		; example, we are using VirtualAlloc to create our buffer
   		invoke VirtualAlloc, NULL, [fileSizeLow], MEM_COMMIT, PAGE_READWRITE
   		mov [buffer2], eax
    	
    	invoke ReadFile, [hFile], ADDR [buffer2], [fileSizeLow], ADDR [Numb], NULL;	Read the file
    	invoke StdOut, ADDR [buffer2]
    	invoke CloseHandle, ADDR [hFile];	Close the file
    	mov lpstring, input ();	Wait for user input
        ret

Title: Re: Reading a file - Dynamic size
Post by: Relvinian on January 27, 2005, 07:33:02 PM

Nilrem,

After your ReadFile line and before your StdOut line, you need to make sure your buffer is NULL-terminated before sending it to StdOut.

An easy way to do this in your code is to allocate at least ONE byte more than you read. Since you are using VirtualAlloc to allocate your buffer, the buffer gets zeroed out when created, so you won't have to worry about adding the NULL at the end as long as you allocate more than you read.

Also, since you are declaring a pointer to a buffer on your stack and allocating that buffer with VirtualAlloc, you need to make sure you free the buffer before your function ends or you'll have major memory leaks. Just before your RET statement, invoke VirtualFree, [buffer2], 0, MEM_RELEASE.
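
The "allocate one byte more than you read" advice can be sketched in C as follows. This is a hedged illustration rather than the Win32 code from this thread: fseek/ftell stand in for GetFileSize, calloc stands in for VirtualAlloc (both hand back zero-filled memory), free stands in for VirtualFree, and the name load_text_file is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: loads a whole file into a heap buffer that is one
   byte larger than the file, so the result is always NUL-terminated.
   Returns NULL on any error; the caller must free() the result. */
char *load_text_file(const char *path, size_t *out_size)
{
    FILE *fp = fopen(path, "rb");
    long size;
    char *buf;
    size_t got;

    if (fp == NULL)
        return NULL;

    /* Find the size: seek to the end, ask for the position, rewind. */
    if (fseek(fp, 0, SEEK_END) != 0 || (size = ftell(fp)) < 0 ||
        fseek(fp, 0, SEEK_SET) != 0) {
        fclose(fp);
        return NULL;
    }

    /* calloc zero-fills the block, so the extra byte is already 0. */
    buf = calloc((size_t)size + 1, 1);
    if (buf == NULL) {
        fclose(fp);
        return NULL;
    }

    got = fread(buf, 1, (size_t)size, fp);
    fclose(fp);
    if (got != (size_t)size) {       /* read less than expected */
        free(buf);
        return NULL;
    }

    if (out_size != NULL)
        *out_size = got;
    return buf;                      /* safe to pass to string routines */
}
```

A caller would print or parse the returned string and then call free() on it, which mirrors freeing the VirtualAlloc buffer before RET.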

Relvinian

PS - In my example of reading a file contents, I had a bug in the ReadFile statement. The correct syntax should be:

Code Select


invoke ReadFile, [fileHandle], [fileBuffer], [fileSizeLow], addr bytesRead, NULL

Note that I removed the 'addr' in front of the [fileBuffer] because we dynamically allocated it.

Title: Re: Reading a file - Dynamic size
Post by: Nilrem on January 30, 2005, 10:26:50 PM

Are you sure? Once I do that, I get a blank output screen. Also, when allocating +1 byte like you said, could I do this (asking because I still get a blank screen):

Code Select

invoke ReadFile, [hFile], addr [buffer2], [fileSizeLow+1], ADDR Numb, NULL; Read the file
Thanks once again 8-)

Actually I thought that looked untidy so I did this:

Code Select


invoke GetFileSize, [hFile], addr fileSizeHigh
   		xor edx,edx
   		mov edx,1
   		add eax,edx
   		mov [fileSizeLow], eax
....
invoke ReadFile, [hFile], [buffer2], [fileSizeLow], ADDR Numb, NULL;	Read the file
    	invoke StdOut, ADDR [buffer2]
    	invoke CloseHandle, ADDR [hFile];	Close the file
    	mov lpstring, input ();	Wait for user input
    	invoke VirtualFree, [buffer2], 0, MEM_RELEASE
        ret

but still a black screen, if I do addr [buffer2] it works but with extra characters.

Title: Re: Reading a file - Dynamic size
Post by: Relvinian on January 31, 2005, 08:41:12 PM

Nilrem,

When you dynamically allocate a buffer and have a DWORD ptr to it, you don't use the ADDR to reference it. You only need the ADDR when you need a ptr to some data that is on the stack.

Example:

Code Select


MyFunc proc public String:dword
   local   CopyOfString1[256] : byte
   local   CopyOfString2 : dword

   ; copy the string passed in into our local string buffer.
   invoke lstrcpy, addr [CopyOfString1], [String]

   ; allocate a buffer to hold the string and copy the string passed in
   invoke GetProcessHeap
   invoke HeapAlloc, eax, 0, 260
   mov [CopyOfString2], eax
   invoke lstrcpy, eax, [String]

   ; now, both CopyOfString1 and CopyOfString2 have identical copies of our string passed in.

   ; free the allocated string so we don't have a memory leak
   invoke GetProcessHeap
   invoke HeapFree, eax, 0, [CopyOfString2]

   ; return back to caller
   ret
MyFunc endp

In your example you had:

Code Select


   invoke ReadFile, [hFile], addr [buffer2], [fileSizeLow+1], addr Numb, NULL

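This reads the wrong amount because [fileSizeLow+1] adds one to the address of 'fileSizeLow', not to its value, and then uses whatever is stored at that new address. It also places the data in the wrong place (at the address of the 'buffer2' variable itself rather than in the allocated buffer), corrupting your stack.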

Code Select


invoke ReadFile, [hFile], [buffer2], [fileSizeLow], ADDR Numb, NULL; Read the file
    invoke StdOut, ADDR [buffer2]
    invoke CloseHandle, ADDR [hFile]; Close the file
    mov lpstring, input (); Wait for user input
    invoke VirtualFree, [buffer2], 0, MEM_RELEASE
    ret

In this section of code above, your ReadFile line is correct but you can't put the ADDR in front of 'buffer2' in your StdOut invoke statement. Same goes for the CloseHandle call. Don't put the ADDR or you are trying to close the address of 'hFile' and not the value.
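
The difference between passing the pointer's value and passing the address of the pointer variable may be easier to see in C. The following is a minimal sketch under stated assumptions: fake_read is a made-up stand-in for ReadFile that just copies bytes, and read_into_heap is a hypothetical wrapper; neither is part of any real API.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for ReadFile: copies up to n bytes of src to dst
   and returns how many bytes were copied. */
static size_t fake_read(void *dst, const char *src, size_t n)
{
    size_t len = strlen(src);
    if (len > n)
        len = n;
    memcpy(dst, src, len);
    return len;
}

/* Returns a NUL-terminated heap copy of text; the caller frees it. */
char *read_into_heap(const char *text)
{
    size_t size = strlen(text);

    /* buffer2 is a pointer variable on the stack; the block it points
       to lives on the heap and is one byte larger than the data. */
    char *buffer2 = calloc(size + 1, 1);
    if (buffer2 == NULL)
        return NULL;

    /* Correct: pass the VALUE of buffer2, i.e. the heap address.
       This matches "invoke ReadFile, [hFile], [buffer2], ..." */
    fake_read(buffer2, text, size);

    /* Wrong (left as a comment on purpose):
           fake_read(&buffer2, text, size);
       That passes the address of the pointer variable, like
       "addr [buffer2]", and would overwrite the pointer and the
       stack memory next to it. */

    return buffer2;
}
```

The same reasoning applies to StdOut and CloseHandle: they want the value held in the variable, not the location of the variable.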

Relvinian

Title: Re: Reading a file - Dynamic size
Post by: Nilrem on January 31, 2005, 10:10:46 PM

Thank you, that worked perfectly, and I understood it. 8-) Now how would I manipulate these? What I want to do is read the lines of the file in. For example, it will output
file.txt
file1.txt
Now I want to know how to check if these exist. I know how to do it with CreateFile, but how do I manipulate the data read in to do this? I don't want the code written for me (not at first anyway), just some hints and clues to get me started; that's the best way to learn. Thanks a lot, by the way, guys. Thank you. 8-)

Title: Re: Reading a file - Dynamic size
Post by: Relvinian on January 31, 2005, 10:58:53 PM

Nilrem,

Once you have read in some data from the file into your buffer, you just pass the buffer pointer to some routine which will parse the information you are looking for. The MASM32.lib has some parsing routines which may be of use to you. Just browse through there and see if anything suits your fancy.

Let me know what kind of checking/parsing you are looking for and I'll give you more guidance and more details on what type of functions/API calls to use for your routines.

Relvinian

Title: Re: Reading a file - Dynamic size
Post by: Nilrem on February 01, 2005, 08:58:18 AM

Thank you. Before I have a look (currently at school) at the masm32.lib, I will tell you what parsing I need. OK, I want to check if a file exists; I have already done that in my code using 'createfile' with 'getlasterror'. However, when I read in the text file it reads in names of files (which is what I want it to do). Now I want to use those names to check if the files exist (the filenames read in from my initial txt file). I am really stumped on this. But trying to use some logic: could I use findfirstfile and findnextfile, and then, when it has found all the files, compare the results against a text file that lists the filenames to see if they match? A bit of a long shot, but I want to prove I am trying myself, not just asking endless questions without first trying. Thanks again.

Title: Re: Reading a file - Dynamic size
Post by: skywalker on February 08, 2005, 01:33:24 PM

A better solution to check and see if a file exists on the hard drive would be:

Code Select


   ; EAX = 0  ( file doesn't exist )
   ; EAX = 1  ( file does exist )

DoesFileExist PROC PUBLIC FileSpec: dword
    LOCAL  findInfo : WIN32_FIND_DATA
    
    ; check to see if the file exists on the hard drive by simply using a single API call.
    invoke FindFirstFile, [FileSpec], addr [findInfo]
   
    ; check the return value of FindFirstFile.  If the value is INVALID_HANDLE_VALUE,
    ; the file didn't exist, otherwise it did.
    cmp eax, INVALID_HANDLE_VALUE
    je NotFound

My Win32.hlp isn't showing INVALID_HANDLE_VALUE in dwFileAttributes.
Is that something new that's been added?

Could you give me an example of this.

Thanks.

Title: Re: Reading a file - Dynamic size
Post by: hutch-- on February 08, 2005, 02:13:49 PM

The technique works fine without having to open the file but you have a notation mistake of using square brackets around named variables that is incorrect in MASM.

invoke FindFirstFile, [FileSpec], addr [findInfo]

Should be,

invoke FindFirstFile,FileSpec, addr findInfo

MASM simply ignores the mistake but it can lead to other mistakes by not using the named variable addressing mode correctly.

In MASM a named variable in this context is something like,

varname equ <[ebp+8]>

Adding the extra brackets effectively gives you [[ebp+8]] which MASM ignores.

Title: Re: Reading a file - Dynamic size
Post by: Relvinian on February 08, 2005, 04:13:36 PM

Skywalker,

That's because the WIN32.HLP file isn't always updated like the platform SDK (or online help from Microsoft).

Here's what the platform SDK says about FindFirstFile:

Code Select


Return Values
If the function succeeds, the return value is a search handle used in a subsequent call to FindNextFile or FindClose.

If the function fails, the return value is INVALID_HANDLE_VALUE. To get extended error information, call GetLastError.

Relvinian

Title: Re: Reading a file - Dynamic size
Post by: Relvinian on February 08, 2005, 04:17:21 PM

Quote from: hutch-- on February 08, 2005, 02:13:49 PM
The technique works fine without having to open the file but you have a notation mistake of using square brackets around named variables that is incorrect in MASM.

invoke FindFirstFile, [FileSpec], addr [findInfo]

Should be,

invoke FindFirstFile,FileSpec, addr findInfo

MASM simply ignores the mistake but it can lead to other mistakes by not using the named variable addressing mode correctly.

In MASM a named variable in this context is something like,

varname equ <[ebp+8]>

Adding the extra brackets effectively gives you [[ebp+8]] which MASM ignores.

MASM may not care about the brackets and can ignore them, but when you look at the op-code output that MASM generates, putting brackets around variable names becomes a personal preference (and in some assemblers, a necessity). I already mentioned this near the beginning of this thread when I first wrote that example.

Relvinian

Title: Re: Reading a file - Dynamic size
Post by: Relvinian on February 09, 2005, 10:12:20 PM

Quote from: Nilrem on February 01, 2005, 08:58:18 AM
Thank you. Before I have a look (currently at school) at the masm32.lib, I will tell you what parsing I need. OK, I want to check if a file exists; I have already done that in my code using 'createfile' with 'getlasterror'. However, when I read in the text file it reads in names of files (which is what I want it to do). Now I want to use those names to check if the files exist (the filenames read in from my initial txt file). I am really stumped on this. But trying to use some logic: could I use findfirstfile and findnextfile, and then, when it has found all the files, compare the results against a text file that lists the filenames to see if they match? A bit of a long shot, but I want to prove I am trying myself, not just asking endless questions without first trying. Thanks again.

Nilrem,

Here's some pseudo logic to help you with parsing and accomplishing what you wanted to do.

open the text file that lists the filenames to check
read the text file into memory
close the text file
while not end of buffer
    read a line of text
    check to see if that file exists with FindFirstFile/FindClose
    do something with the check result
    loop back to the while
free the memory (if dynamically allocated)

Now, with something like this, it is easier to grasp what the goal is. Always take a task and break it down as much as possible. Even with what I provided, you can take each step and break it down a little further to get more details. Once you have it broken down into the smallest tasks, it will be easy to write the code behind it to accomplish the main task.
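
The loop in the middle of that pseudo logic can be sketched in C. This is a minimal sketch with assumptions labeled: the buffer is assumed to be already loaded and NUL-terminated, with one filename per line; file_exists uses fopen as a portable stand-in for the FindFirstFile/FindClose check; and both function names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical existence check: tries to open the file for reading.
   A Win32 version could use FindFirstFile/FindClose instead. */
static int file_exists(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return 0;
    fclose(fp);
    return 1;
}

/* Walks a NUL-terminated buffer of filenames, one per line, and returns
   how many of them exist. The buffer is modified in place: the end of
   each line is overwritten with a NUL so the line becomes a string. */
int count_existing(char *buffer)
{
    int found = 0;
    char *line = buffer;

    while (*line != '\0') {                      /* while not end of buffer */
        size_t len = strcspn(line, "\r\n");      /* length of this line */
        char *next = line + len;

        if (*next != '\0')
            *next++ = '\0';                      /* terminate the line */
        while (*next == '\r' || *next == '\n')
            next++;                              /* skip CR/LF and blank lines */

        if (len > 0 && file_exists(line))        /* check, then act on result */
            found++;

        line = next;                             /* loop back to the while */
    }
    return found;
}
```

Here "do something with the check result" is reduced to counting; a real program would print the name or record it instead.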

Relvinian

Title: Re: Reading a file - Dynamic size
Post by: Nilrem on February 09, 2005, 10:54:46 PM

Thank you for the reply. If I have problems I'll post again, but it might not be soon because I am fairly busy.

The MASM Forum Archive 2004 to 2012

General Forums => The Campus => Topic started by: Ic3D4ne on January 26, 2005, 06:06:07 PM