reading a text file

Started by caprisun, May 06, 2007, 10:14:59 PM


MichaelW

Quote
Assign FFFF to cx, and Int21 3F will put the actual number of bytes read into ax, a number which you can then manipulate/use in your routine.
A file can be larger than FFFFh bytes, in which case this method will not work. Function 42h returns the length as a 32-bit value, so it will work for files up to 4GB.
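
A minimal sketch of the function 42h approach, assuming the file is already open and its handle is stored in HandleIn. FileSizeLo, FileSizeHi and the SeekError label are hypothetical names that do not appear elsewhere in this thread. Seeking to offset 0 from the end of the file returns the new file position, which equals the file length, in DX:AX:

```asm
    ; Fragment only: assumes HandleIn holds a handle returned by function 3Dh.
    ; FileSizeLo / FileSizeHi are hypothetical word variables (dw ?).
    mov bx, HandleIn      ; BX = file handle
    xor cx, cx            ; CX:DX = 32-bit offset of 0
    xor dx, dx
    mov ax, 4202h         ; AH = 42h (move file pointer), AL = 2 (relative to end of file)
    int 21h
    jc  SeekError         ; carry set: AX holds an error code (SeekError is hypothetical)
    mov FileSizeLo, ax    ; DX:AX = new position = file length
    mov FileSizeHi, dx

    xor cx, cx            ; move the pointer back to the start before reading
    xor dx, dx
    mov ax, 4200h         ; AL = 0 (relative to start of file)
    int 21h
```

The second call matters because the first one leaves the file pointer at the end of the file, so a read issued right after it would return 0 bytes.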
eschew obfuscation

caprisun

Michael, what you did for me is similar to what I've done, but instead of int 21h I used int 10h.

What I can't figure out is this: when I load the data into memory, I can see it there.

How do I access the memory and get each individual character in its hexadecimal form (e.g. A = 41h)?

In C there is a function, something similar to explode, that accomplishes this, but in assembly no such thing exists?

MichaelW

Conversion procedures in assembly do exist. Most of them copy the converted values to a buffer as a null-terminated string. If you have the MASM32 package installed, you can find an example of such a conversion procedure in dw2ah.asm in the m32lib directory, but note that the procedure is 32-bit code and it would need to be modified to work in a 16-bit application. One easy to understand method is to convert each nibble of the value to the equivalent hex character by using the nibble value as an index into a table of hex characters. For example, if your table were defined like this:

hexchars db  "0123456789ABCDEF"

Then:

"A" = 41h = 01000001b
upper nibble = 0100b = 4
the character at offset 4 in the table is "4"
lower nibble = 0001b = 1
the character at offset 1 in the table is "1"
 
"z" = 7Ah = 01111010b
upper nibble = 0111b = 7
the character at offset 7 in the table is "7"
lower nibble = 1010b = 10
the character at offset 10 in the table is "A"

You can extract the lower nibble by ANDing the byte value with 0Fh, and the upper nibble by shifting the byte value right by 4. You can get the indexed character into a register by loading the nibble value into a base or index register and using that register to index the table, for example:

mov dl, hexchars[bx]


caprisun

So if the value of BX is already known, then it won't be hard to pick out the exact character.

I'm just assuming that in:
mov dl, hexchars[bx]
the use of the DL register has no significance?


FileNameIn DB "test.txt",0
.
.
.
.code
mov ax,@data      
mov ds,ax

mov dx,OFFSET FileNameIn
mov al,2
mov ah,3Dh
int 21h
.
.
.

mov dx,offset Buffer
mov bx,HandleIn
mov cx,StringLength
mov ah,3Fh
int 21h

.
.
.

mov cx, StringLength      ; length of string
mov si,OFFSET Buffer      ; DS:SI - address of string
xor bh,bh   ; video page - 0
mov ah,0Eh ; function 0Eh - write character

NextChar:     ; loop target (label was missing)
lodsb         ;  AL = next character in string
int 10h ; call BIOS service
loop NextChar


With something like this, I can't just
mov bx, [si+n]       ; where n>=0
mov dl, hexchars[bx]

to get the lower or upper nibble that I need for the conversion, could I?
The trouble is that I have lodsb to move each byte at SI into AL, but I assume the value that ends up in AL is not the exact hex value I need for the conversion?

MichaelW

To access the nibbles you must isolate each of them in a byte.

I used DL in the example because I assumed you were looking for a simpler method than putting the converted characters into a string. By putting the character into DL you can use Interrupt 21h function 2 to display it with just 2 additional instructions. It might help you to know that the DOS display functions call the BIOS Write Teletype function (Interrupt 10h, function 0Eh), so Interrupt 21h function 2 updates the cursor just as the Teletype function does, and it's easier to call.
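
For illustration, the two additional instructions would look roughly like this. This is a sketch that assumes BX already holds a nibble value from 0 to 15 and that hexchars is the table defined earlier in the thread:

```asm
    mov dl, hexchars[bx]  ; DL = hex character for the nibble in BX
    mov ah, 2             ; DOS function 2: display the character in DL
    int 21h               ; cursor advances, as with the BIOS teletype function
```

Ralf Brown's Interrupt List notes that function 2 can return the last character output in AL, so it is safer to save AL first if it still holds data you need.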

caprisun

Now we are on the right track.
How do I isolate each byte of this particular array?

Because whenever I load the offset of Buffer, SI always contains the value 66h as the first value, so I can't use SI as my source to extract each individual byte?
And that is where lodsb gets the character it moves into AL right before int 10h displays it.

MichaelW

This example is coded for the default processor (8086/8088), and because it must access the lower byte it must use BX (hexchars is defined as hexchars db "0123456789ABCDEF"):

    xor bx, bx            ; must zero upper byte
    mov bl, "A"           ; load byte (could be from memory or register)
    push bx               ; preserve it
    mov cl, 4             ; on the 8086 a shift count > 1 must be in CL
    shr bx, cl            ; isolate upper nibble
    mov dl, hexchars[bx]  ; get indexed character from table
    ; Do something with the character
    pop bx                ; recover byte
    and bx,0fh            ; isolate lower nibble
    mov dl, hexchars[bx]  ; get indexed character from table
    ; Do something with the character

This simpler coding requires a .386 or later processor directive (and a .386 or later processor), and it can use any base or index register (practically speaking that means BX, SI or DI).

    movzx si, byte_val    ; must be memory or register
    push si
    shr si, 4
    mov dl, hexchars[si]
    ; Do something with the character
    pop si
    and si, 0fh
    mov dl, hexchars[si]
    ; Do something with the character

And not to confuse the issue, but by using a table where each element contains both of the characters for a byte value, you can avoid having to isolate the nibbles:

.data
    hextable \
    db "000102030405060708090A0B0C0D0E0F"
    db "101112131415161718191A1B1C1D1E1F"
    db "202122232425262728292A2B2C2D2E2F"
    db "303132333435363738393A3B3C3D3E3F"
    db "404142434445464748494A4B4C4D4E4F"
    db "505152535455565758595A5B5C5D5E5F"
    db "606162636465666768696A6B6C6D6E6F"
    db "707172737475767778797A7B7C7D7E7F"
    db "808182838485868788898A8B8C8D8E8F"
    db "909192939495969798999A9B9C9D9E9F"
    db "A0A1A2A3A4A5A6A7A8A9AAABACADAEAF"
    db "B0B1B2B3B4B5B6B7B8B9BABBBCBDBEBF"
    db "C0C1C2C3C4C5C6C7C8C9CACBCCCDCECF"
    db "D0D1D2D3D4D5D6D7D8D9DADBDCDDDEDF"
    db "E0E1E2E3E4E5E6E7E8E9EAEBECEDEEEF"
    db "F0F1F2F3F4F5F6F7F8F9FAFBFCFDFEFF"
.code

And by combining this table with a .386 or later processor directive and a 32-bit register (that does not need to be a base or index register) you can reduce the code to this (the scaling factor *2 compensates for each table element being 2 bytes in length, and the displacement +1 adjusts the address to the second hex character of the indexed element):

    movzx ebx, byte_val    ; must be memory or register
    mov dl, hextable[ebx*2]
    ; Do something with the character
    mov dl, hextable[ebx*2+1]
    ; Do something with the character



caprisun

I think I wasn't clear enough, and you ended up typing a lot of code that is very useful to me but not for the current situation.

This is what eek posted a day ago, and it was onto what I was having trouble with:
Quote
DS:DX points to the start so move dx to si (or di)

mov ah,3F ;<read
mov al,00 ;<bl has file handle
mov cx,[BYTES]
;<read 1 page
int 21
mov si,dx   ;  <--------------------------------move dx to si so you can manipulate it
mov cx,ax  ;bytes read


What eek posted here answers my question:
1) What happens to the data that is read from a .txt file?
Ans: it is stored in SI

Next situation/question/problem:

2) The value (hexadecimal number) in SI doesn't correspond to the hex value of the data loaded from the text file.
For example, the text file contains the letter "A" = 41h,
but SI doesn't contain 41h; instead it holds some arbitrary number.

So the question that still remains: where do I get the hex value of 'A' that was read from the .txt file?

So basically, the information you gave me is what I would implement after I get a solution for question 2?

MichaelW

Typing code is no problem. The difficult part is trying to guess what you are asking, what you are trying to do, and what you know and don't know.
Quote
What happens to the data that is read from a .txt file?
Ans: it is stored in SI
No, the data is not stored in SI or in any other register. The data read from the file is stored in an application-defined buffer. You pass the address of this buffer in DX when you call the function that reads the file. If you do not have a reference for this type of information, you should get Ralf Brown's Interrupt list. An HTML version is here:

http://www.ctyme.com/rbrown.htm

And the download version here:

http://www-2.cs.cmu.edu/~ralf/files.html

I posted code that reads the contents of a small file into a buffer, and then used DEBUG to dump the contents of the buffer, here:

http://www.masm32.com/board/index.php?topic=7245.msg53846#msg53846

To display the contents of the buffer in hex you would need to read the contents of the buffer a byte at a time, convert the byte value to hex, and display the hex.
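
Putting those three steps together, a minimal sketch for the default 8086 target might look like the following. It assumes a small-model DOS .EXE and a file named test.txt that fits in the buffer. BUFSIZE, the 256-byte buffer size and the label names are illustrative choices, not taken from the posts above. Error handling is reduced to skipping the work when the carry flag is set.

```asm
.model small
.stack 100h

.data
BUFSIZE    equ 256                  ; illustrative buffer size
FileNameIn db "test.txt",0
HandleIn   dw ?
hexchars   db "0123456789ABCDEF"
Buffer     db BUFSIZE dup(?)

.code
start:
    mov ax, @data
    mov ds, ax

    mov dx, OFFSET FileNameIn
    mov ax, 3D00h            ; function 3Dh: open file, AL = 0 (read-only)
    int 21h
    jc  done                 ; open failed
    mov HandleIn, ax         ; the handle is returned in AX

    mov bx, ax               ; BX = handle
    mov cx, BUFSIZE          ; maximum number of bytes to read
    mov dx, OFFSET Buffer    ; DS:DX = address of the buffer
    mov ah, 3Fh              ; function 3Fh: read from file
    int 21h
    jc  closefile
    mov cx, ax               ; AX = number of bytes actually read
    jcxz closefile           ; nothing to display for an empty file
    mov si, OFFSET Buffer    ; SI points at the data, it does not contain it
    cld                      ; make lodsb step forward

nextbyte:
    lodsb                    ; AL = byte at DS:SI, then SI = SI + 1
    xor bx, bx
    mov bl, al               ; BX = byte value (0..255)
    push bx                  ; keep a copy for the lower nibble
    push cx                  ; CL is needed for the shift count
    mov cl, 4
    shr bx, cl               ; BX = upper nibble
    pop cx
    mov dl, hexchars[bx]
    mov ah, 2
    int 21h                  ; display first hex digit
    pop bx
    and bx, 0Fh              ; BX = lower nibble
    mov dl, hexchars[bx]
    mov ah, 2
    int 21h                  ; display second hex digit
    mov dl, " "
    mov ah, 2
    int 21h                  ; separator between bytes
    loop nextbyte

closefile:
    mov bx, HandleIn
    mov ah, 3Eh              ; function 3Eh: close file
    int 21h
done:
    mov ax, 4C00h            ; function 4Ch: exit to DOS
    int 21h
end start
```

For a file larger than the buffer, the read and display steps would need to repeat until function 3Fh returns 0 in AX.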

eek

"Ans: it is store in SI"

I've been drinking tonite but:
NO.
si only points at it. I MADE si point at it.

You can point di at it if you want, or make your own window in ketman.


The buffer is at the bottom of the code

That's the problem with this ASM stuff.
It takes a while before you get the bigger picture.

In about 12 months I'll be asking you wtf is going on...

eek

Quote
2) The value (hexadecimal number) in SI doesn't correspond to the hex value of the data loaded from the text file.
For example, the text file contains the letter "A" = 41h,
but SI doesn't contain 41h; instead it holds some arbitrary number.

So the question that still remains: where do I get the hex value of 'A' that was read from the .txt file?

have you tried

mov si,C1    ;yet ??

It's all there, that's the thing. We don't need to explain it; the machine is laid out for you.

You can see the file text in the data strip, and experiment with it.


caprisun

Quote from: eek on May 14, 2007, 02:37:09 AM
I've been drinking tonite but:
Lol, I need a drink too.

Anyhow, thank you, Michael and eek, for all your help. I know it's frustrating teaching a newbie new tricks in a short amount of time,
but you guys kept your cool.


The problem is not assembly, it's my professor.
He wants us to program with the knowledge in his "head" and not from whatever we gain from our text book.
I must concentrate on my other subjects, so this is going to be the end, for now.

ATM, I wish I was "Sylar" (from Heroes).

skywalker

Some things that'll help you.

GRDB debugger: a debugger for 16- and 32-bit apps (command line)

Take frequent breaks.

Get Ralf Brown's Interrupt List

Search, download and study any code you can find

Take frequent breaks.

Get IDA for Windows (disassembles and can make an asm source code file as well)