Over the Christmas/New Year break I looked at some alternative assemblers to MASM. GoAsm was immediately appealing to me with my limited experience because of the lack of clutter in the necessary code. Thanks Jeremy for a great product. I don't understand why your assembler appears less popular than others.
My question is not go-specific however: I am trying to quicken a file read line procedure by reducing the number of invocations to Win APIs. What I have come up with is this:
FileReadLine3 FRAME handle,ptrString ; return EAX = 0 is EOF
LOCAL dBytesRead
push ecx
push edx
push edi
mov edi,[ptrString]
.L1:
invoke ReadFile,[handle],edi,2,ADDR dBytesRead,0
mov eax,[dBytesRead]
cmp eax,2 ; check if not 2 bytes read
jne > .exit ; skip because EOF
cmp B[edi],13 ; check for end of even length lines
je > .exit
cmp B[edi+1],13 ; check for end of odd length lines
je > .exit1
add edi,2
jmp < .L1
.exit1:
add edi,1
invoke ReadFile,[handle],edi,1,ADDR dBytesRead,0 ; dummy read to advance read byte
mov eax,[dBytesRead]
.exit:
mov W[edi],0 ; stamp zero termination to the string
pop edi
pop edx
pop ecx
ret
ENDF
Does anybody have some thoughts on how to improve the performance of this?
Regards
Chris
If you want to speed things up you might consider a couple of alternatives:
1 - File mapping which allows you to address the file without API calls.
2 - Read the file in large chunks and process the chunks in memory.
File mapping is much faster than reading a file byte by byte however since it has to be marshalled, it is slower than reading the file into memory in chunks and processing it. The processing algorithm would be the same for both.
The following code segment will read file lines in memory, either allocated or mapped, and invoke a callback (pCallback) function passing pLineText (Pointer to a buffer containing the text), cbLine (Count of characters in pLineText), nLine (Line number in file) and pLine (Offset of the first character in the file) to it.
mov esi,[pData]
mov edi,esi
mov ebx,[fSizeLo]
mov ecx,ebx
mov eax,0Ah
L1:
repne scasb
push ecx,eax
mov eax,edi
sub eax,esi
push eax
add eax,0Fh
and eax,0FFFFFFF0h
invoke HeapAlloc,[hHeap],HEAP_ZERO_MEMORY,eax
mov [pBuffer],eax
pop ecx
sub ecx,2
jns >
mov ecx,0
:
inc D[LineCount]
invoke copymemory,eax,esi,ecx
sub esi,[pData]
invoke [pCallback],[pBuffer],ecx,[LineCount],esi
or eax,eax
jz >.EARLYEXIT
invoke HeapFree,[hHeap],0,[pBuffer]
mov esi,edi
pop eax,ecx
or ecx,ecx
jnz <L1
Donkey
Thank Donkey for your advice. Reading into large chunks of memory might be a nuisance when inevitably the end of the chunk does not coincide with the end of a line. File buffering to allocated memory and processing the map sounds more practical and should allow random access to file locations as well. I will work on this. Perhaps an "OpenBufferedFile" procedure followed by looping on a "BufferReadLine" procedure may provide a workable structure.
Regards
Chris
Unless you work with really large files (e.g. over 4MB), the easiest and fastest way is reading
the whole file into a buffer, and then look for CrLf's. Below is a snippet.
What I do in my own routines is to allocated a second chunk of memory that keeps addresses
of strings in memory. This allows a very fast access to each single line after the scan is complete.
invoke GetFileSize, hFile, ADDR dwHighSize
.IF dwHighSize
invoke CloseHandle, hFile ; we won't deal with 4MB files
xor eax,eax
ret
.ENDIF
mov dwFileSize, eax
invoke ReadFile, hFile, BufStart, dwFileSize, ADDR dwBytesRead, 0
mov bSuccess,eax
invoke CloseHandle, hFile
mov edi,BufStart ;start of memory
mov ecx,FileSize ;ecx counts down to zero
; *** edi will point to a CRLF-terminated string. We scan for the CRLF.
dec edi ;edi=BufStart-1, for compensation of first inc
Get_ST: mov edx,ecx
mov eax,2573 ;CrLF
Get_CR: inc edi ;position in buffer
dec ecx ;global byte counter
jle CrFound ; end of file encountered
cmp word ptr[edi],ax
jne Get_CR
CrFound: sub edx,ecx ;ecx file size downto 0, edx string len counter
inc StrCount
... etc. - move address somewhere, loop back to Get_CR
Quote from: ChrisLeslie on January 24, 2008, 08:02:07 AM
Thank Donkey for your advice. Reading into large chunks of memory might be a nuisance when inevitably the end of the chunk does not coincide with the end of a line. File buffering to allocated memory and processing the map sounds more practical and should allow random access to file locations as well. I will work on this. Perhaps an "OpenBufferedFile" procedure followed by looping on a "BufferReadLine" procedure may provide a workable structure.
Regards
Chris
Hi Chris,
The snippet I posted is from the ReadFileLines function in my Files library, it uses a memory mapped file, you can download it with GoAsm source from my website as an example if you like or use it directly.
Donkey
Hi Chris!
How about:-
DATA
hFile DD 0 ;to hold file handle
hFileMap DD 0 ;to hold file map handle
FileInMemory DD 0 ;to hold starting address of map view
FileSize DD 0 ;to hold size of the file
;
CODE
;
LOADFOR_FILELINEREAD:
PUSH 0,80h,3 ;3=open existing
PUSH 0,1h,80000000h ;8=read access
PUSH ADDR BUFFER ;filename in Unicode (use CreateFileA if filename in ANSI)
CALL CreateFileW ;try to open file
CMP EAX,-1 ;see if INVALID_HANDLE_VALUE
JZ >L2 ;yes, fail
MOV [hFile],EAX ;keep file handle
PUSH 0,EAX ;eax=file handle
CALL GetFileSize
CMP EAX,-1 ;see if failure
JZ >L1 ;yes
MOV [FileSize],EAX ;keep size of file
;
;*************** file map the whole file
PUSH 0,0,0,2h,0,[hFile] ;2h=PAGE_READONLY
CALL CreateFileMappingW ;filemap the file
OR EAX,EAX ;see if failed
JZ >L1 ;yes, failure to establish filemap
MOV [hFileMap],EAX ;keep file map handle
PUSH 0,0,0,4h,EAX ;eax=file map handle 4h=FILE_MAP_READ
CALL MapViewOfFile
MOV [FileInMemory],EAX ;keep starting address of map view
;***************
;next function can analyse the contents of the file directly,
;starting at FileInMemory, for FileSize number of bytes
CALL FILELINEREAD ;look at lines in the file directly in memory
;
PUSH [FileInMemory]
CALL UnmapViewOfFile ;close file mapping for this file
PUSH [hFileMap]
CALL CloseHandle ;close file map handle for this file
L1:
PUSH [hFile]
CALL CloseHandle ;close the file
L2:
RET
;
Instead of keeping the various handles in data labels, you could keep them in registers if you were careful to distinguish between the volatile and non-volatile registers and to save and restore them in FILELINEREAD.
Here is a version which reads the whole file into memory instead of file mapping it.
DATA
hFile DD 0 ;to hold file handle
FileInMemory DD 0 ;to hold starting address of file in memory
FileSize DD 0 ;to hold the file size
TEMP_DWORD DD 0 ;to receive the number of bytes read
;
CODE
;
LOADFOR_FILELINEREAD:
PUSH 0,80h,3 ;3=open existing
PUSH 0,1h,80000000h ;open file with read access only
PUSH ADDR BUFFER ;filename in Unicode (use CreateFileA if in ANSI)
CALL CreateFileW ;try to open file
CMP EAX,-1 ;see if INVALID_HANDLE_VALUE
JZ >L2 ;yes
MOV [hFile],EAX ;keep file handle
PUSH 0,EAX ;eax=file handle
CALL GetFileSize
CMP EAX,-1 ;see if failure
JZ >L1 ;yes
MOV [FileSize],EAX ;keep size of file
;
;**************** load the whole file into memory and read it there
PUSH 4h,3000h,EAX,0 ;reserve and commit eax bytes of read/write memory
CALL VirtualAlloc
OR EAX,EAX ;see if failed
JZ >L1 ;yes
MOV [FileInMemory],EAX ;keep memory start
PUSH 0,ADDR TEMP_DWORD,[FileSize],EAX,[hFile]
CALL ReadFile ;read FileSize bytes into eax
OR EAX,EAX ;see if failed
JZ >L1 ;yes
;***************
;next function can analyse the contents of the file directly,
;starting at FileInMemory, for FileSize number of bytes
CALL FILELINEREAD ;look at lines in the file directly in memory
;
PUSH 8000h,0,[FileInMemory]
CALL VirtualFree ;free memory used
L1:
PUSH [hFile]
CALL CloseHandle
L2:
RET
Instead of keeping the various handles in data labels, you could keep them in registers if you were careful to distinguish between the volatile and non-volatile registers and to save and restore them in FILELINEREAD.
I've used CALL instead of INVOKE in these two examples because I find it easier to add comments when using CALL.
Here is my version so far, using a memory buffer, as a self-contained program. I have used an offset in the buffer to keep track of where reading is up to. I also zero terminated the buffer as a means to detect the end:
DATA SECTION
str1 db 40 dup ? ; string to store each file line
ptrBuff dd ? ; pointer to buffer
ptrBuffOffset dd ? ; pointer to offset in the buffer
CODE SECTION
START:
invoke BufferedFileOpenReadx,"junk.txt"
cmp eax,-1
je > exit ; can't open file
mov [ptrBuff],eax
mov [ptrBuffOffset],eax
L1:
invoke BufferedFileReadLinex,[ptrBuffOffset],ADDR str1
mov [ptrBuffOffset],ebx ; update buffer offset
cmp eax,0ah ; check for end of line
je < L1 ; jump back if end if line detected
invoke GlobalFree,[ptrBuff]
exit:
xor eax,eax
call ExitProcess
;######################################################################
BufferedFileOpenReadx FRAME ptrFilename
;buffer pointer goes to eax
;buffer size goes to ebx
;note - limited to ~ 4 gig file size
LOCAL handle,dSize,dBytesRead,ptrBuffer
push ecx
push edx
mov D[ptrBuffer],0
mov D[dSize],0
invoke CreateFileA,[ptrFilename],80000000h,1h,0,3,80h,0
mov [handle],eax
cmp eax,-1
je > .exit2 ; can't open file
invoke GetFileSize,[handle],0
mov [dSize],eax
add D[dSize],2 ;add extra bytes to the buffer
invoke GlobalAlloc,40h,[dSize]
mov [ptrBuffer],eax
mov eax,[dSize]
sub eax,2 ;calculate size of buffer to read
invoke ReadFile,[handle],[ptrBuffer],eax,ADDR dBytesRead,0
invoke CloseHandle,[handle]
mov eax,[ptrBuffer]
add eax,[dSize]
mov W[eax-2],0 ;stamp two zero terminations at the end of buffer
.exit:
mov eax,[ptrBuffer]
mov ebx,[dSize]
.exit2:
pop edx
pop ecx
ret
ENDF
;######################################################################
BufferedFileReadLinex FRAME dPosition,ptrString
;last byte written goes to al - eax=0 is end of buffer
;new position for next line read goes to ebx
push edx
push esi
push edi
mov esi,[dPosition]
mov edi,[ptrString]
mov edx,-1
xor eax,eax
.L1:
inc edx
mov al,B[esi+edx]
cmp al,0
je > .exitp
cmp al,13
je < .L1
cmp al,10
je > .exitp
mov B[edi+edx],al ; store character in string
jmp < .L1
.exitp:
mov B[edi+edx],0 ; zero terminate string
add esi,edx
inc esi
mov ebx,esi ; return new buffer position in ebx
pop edi
pop esi
pop edx
ret
ENDF
Comparing the speed between different versions of "read line" using a 10000 line text file on my old XP P2/256MB I get:
Open file then read byte-by-byte = about 771 ms
Open file then read byte in twos (like my original example) = about 401 ms
Buffered read as above = about 10 ms
Using Java BufferedReader = about 140 ms
Chris
Nice work. Possibly silly question, how is buffer data handled internally by VirtualAlloc (and the other memory allocation schemas) -- is ANSI text which is read into a VirtualAlloc buffer converted to unicode internally, the same way that ANSI strings are all handled as unicode internally? (And thus slower than using native unicode functions?)
In other words, would a unicode source file be faster to load into a VirtualAlloc buffer than an ANSI file since it wouldn't need conversion?
Mark -
Not a silly question. Unfortunately though I'll have to wait until next Christmas break before I could get a chance to experiment with it again, the way things are at present!
Chris
.L1:
inc edx
mov al,B[esi+edx]
cmp al,0
je > .exitp
cmp al,13
je < .L1
cmp al,10
je > .exitp
mov B[edi+edx],al ; store character in string
jmp < .L1
This is your speed critical part. From experience with InstrCI thread (http://www.masm32.com/board/index.php?topic=9370.msg68107#msg68107), it is faster to read a word from esi (byte and dword reads are slower; but dword is not an option here); I would also compare speed between
mov ax, [esi]
add esi, 1
and
mov ax, [esi+edx]
add ecx, 1