a strange problem for me !!, a reg and a var, have the same value, but...

Started by Dasar, July 15, 2006, 10:10:45 AM

Previous topic - Next topic

Dasar

hi

i made this little program to encrypt a file with a simple xor method...

this is the steps i did:

1- open the file to read.
2- get file size.
3- read it in memory, and get the pointer of the first byte.
4- encrypt all the charactars in the file.
5- close the file.


the problem that i can't understand its cause, is:

when i get the pointer of the first byte, i put it in a var called "p" , and assign the value of "p"  in "ebx" ...
if i used "p" in the loop of "encryption" (not sure if this word is true :]  ) ,  the code doesn't work, but if i use ebx (which has the
same value ) the code works fine , why ???
they both have the same value, and the "ebx" gets its value from "p", so why "p" doesn't work, while ebx does ????


the code:

.386
.model flat, stdcall
option casemap:none

include \masm32\include\windows.inc
include \masm32\include\user32.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\user32.lib
includelib \masm32\lib\kernel32.lib


.data
  a_file   db  "Notice.txt",0
 

.data?
    han HANDLE  ?
    memHan HANDLE ?
    ReadBytes   dword   ?
    filesize    dword   ?
    p   dword   ?


.code

  start:
  push ebx
 
  invoke CreateFile, Addr a_file, GENERIC_READ OR GENERIC_WRITE, FILE_SHARE_WRITE OR FILE_SHARE_READ, NULL,

OPEN_EXISTING,\
            FILE_ATTRIBUTE_NORMAL, 0

  mov han, eax

invoke GetFileSize, han, NULL
mov filesize, eax
invoke GlobalAlloc, GMEM_FIXED, filesize
mov memHan, eax
invoke GlobalLock, memHan

mov p, eax    ; "p"  is the pointer of the first byte of file in memory...
mov ebx, p   ; "ebx" get its value from "p"...

invoke ReadFile, han, p, filesize, Addr ReadBytes, NULL

xor ecx, ecx
label1:
mov dl, BYTE PTR [ebx + ecx]   ; here if i use "p"  rather than "ebx", the code doesn't work !!!!!
xor dl, BYTE PTR 0FEDAh   ; i know that i have do it: "xor dl, 0DAh", but both work fine :)
mov BYTE PTR [ebx + ecx], dl    ; here if i use "p"  rather than "ebx", the code doesn't work !!!!
     ; i know, that there is no need to use, BYTE PTR in this loop, because the target is a Byte...
inc ecx
cmp ecx, ReadBytes   ; test if we are done all the read bytes or not...
jne label1      ; if not, then go back again...

invoke SetFilePointer, han, 0, NULL, FILE_BEGIN
invoke WriteFile, han, p, filesize, Addr ReadBytes,  NULL   ; this works fine, so i'm sure that "p" has the correct address !!!
invoke CloseHandle, han
invoke GlobalFree, memHan
pop ebx
invoke ExitProcess, 0

end start


================================


thank you alot for any help :)




Ian_B

It could be that MASM is mistaking your variable "p" for a reserved word. It's best to use longer, more descriptive variable names to avoid this possibility, one-character names are asking for trouble...  :bg But why use it anyway when your working solution is SO much more efficient? Referencing a memory variable p rather than using a register value directly is MUCH slower, and you are doing it for EVERY byte in your file.

This line is also very strange:

xor dl, BYTE PTR 0FEDAh

Are you XORing DL with the VALUE 0DAh? Because the BYTE PTR implies that you are XORing with the value of the byte at memory location 0FEDAh, which even if valid may be subject to change between runs of your code, so you might not get back the original file when you decrypt. If it's 0DAh you want, simplify by keeping this value in a register. Also, you can speed up the code by more than a factor of 4 by XORing full DWORDs at a time rather than single bytes, especially if they are aligned in memory (your memory block will be at least DWORD aligned).

You are doing a memory pull, an XOR in a register and then a memory write - this can all be reduced to a single XOR in memory, much more efficient. Finally, it's generally MUCH better practise to have a decrementing counter that tests for 0 or sign as a result of the decrement rather than an incrementing one that you have to explicitly compare with the target value requiring another op.

Consider this:

push edi    ; an extra register often comes in handy for values used often
; keep/use values in regs wherever possible to avoid memory accesses

invoke GetFileSize, han, NULL
mov edi, eax    ; EDI preserved through API calls

invoke GlobalAlloc, GMEM_FIXED, eax    ; EAX is still filesize value
mov memHan, eax
invoke GlobalLock, memHan
mov ebx, eax    ; forget keeping "p", you don't need it

invoke ReadFile, han, ebx, edi, ADDR ReadBytes, NULL

; if you want to XOR EVERY byte with value 0DAh, then put in a reg:
mov edx, 0DADADADAh    ; DL is also 0DAh for XORing single bytes

; now we need to find how many extra bytes over DWORDs in the file:
mov eax, 3
mov ecx, edi    ; EBX+ECX points to byte after EOF
and eax, edi    ; remainder bytes to handle singly in AL, AH free
jz dwordxorloop    ; skip if not needed

bytexorloop:
sub ecx, 1
xor [ebx+ecx], dl
sub al, 1
jnz bytexorloop    ; loops max three times

; EBX+ECX now points to byte after last whole DWORD in the file

dwordxorloop:
sub ecx, 4    ; get next DWORD if there are any left
jb xordone    ; last one is at filestart, ECX = 0
xor [ebx+ecx], edx    ; much simpler!
jmp dwordxorloop

xordone:
xor eax, eax    ; NULL = 0 = FILE_BEGIN so save a few bytes
invoke SetFilePointer, han, eax, eax, eax
invoke WriteFile, han, ebx, edi, ADDR ReadBytes, NULL    ; EBX/EDI still valid
pop edi    ; can release them now
pop ebx


You also need to make sure you add appropriate error-handling in case the readfile/writefile/getfilesize/setfilepointer don't return correctly - you can either simply check the return value as shown in the API reference, or to be even more bullet-proof you can confirm the number of bytes written/read (returned in the ReadBytes variable) by comparing it with the filesize, easy here as it's stored throughout in EDI.

Hope this is useful.
Ian_B

Casper

Dasar
The following line is problematic.

xor dl, BYTE PTR 0FEDAh

A byte ptr should not masquerade as a value, it will definitely confuse the assembler.  If it is a pointer, get rid of the h (and probably the zero), if it is a value, then get rid of the byte ptr syntax but in no way should you ever use the two together.

Also, the reason a register (ebx) works better than a variable (P) is because a register implicitly sets the size to be used, a variable does not.  So using a register goes a long way towards helping the assembler make correct assumptions.

Paul

Dasar

hi Ian_B,

i have littel questions

i know "and" instruction, i know it sets the bit if both the destination, and source bits are set, and clear otherwise.

i also know "jz", it jumps if the zero flag is set... but i don't know why do you use "mov eax, 3" and then used "and eax, edi", what does "and eax, edi" do here ?
is it for getting the "remainder of division",  like "mod" in delphi ? ( if you know about delphi :]  )
is the zero flag will set if the result of "and" (which will stored in eax) is zero ??

also, want to ask about "sub al, 1"   &  "jnz bytexorloop"
does the "sub" will set the zero flag if the destination (which is "al"  here)  got zero ?? so, we can use "jnz"  ?

please correct me if there is somthing i understood wrong here :D


Ian_B, thank you a lot for your help..

QuoteHope this is useful.

its very very useful, and perfect for me ^_^



@Casper

thank you alot for your help and comments, but i don't know why it works :)

i tested "xor dl, BYTE PTR 0FEDAh", (yes i know its not good "as you told me"),  but it works fine. the same as "xor dl, 0DAh"

Ian_B

Quote from: Dasar on July 15, 2006, 03:09:21 PMi also know "jz", it jumps if the zero flag is set... but i don't know why do you use "mov eax, 3" and then used "and eax, edi", what does "and eax, edi" do here ?
is it for getting the "remainder of division",  like "mod" in delphi ? ( if you know about delphi :]  )
is the zero flag will set if the result of "and" (which will stored in eax) is zero ??
Sort of...  :P

Using AND in this way works for all powers of 2 that you want to "div/mod" by, the AND value just has to be (mod value - 1) - in this case we need the remainder left over 4 bytes, so we use 3. It uses the value 3 as a bitmask to remove the bits representing exact multiples of 4, which we don't want, leaving only the remainder.

Suppose your filelength is 32 bytes. That's divisible by 4 exactly. The binary representation of 32 is 100000 - you see there are no bits set in the lowest 2 digits. If you AND with 3 (11 in binary) then the result is zero, there are no extra single bytes to process so we want to skip that code.

If your filelength was 33 bytes, then the binary representation of the filelength is now 100001 - when you AND with 3 the result is 1, the number of bytes remainder when you "div" by 4, or the "mod" value, yes. So now we can use that as a loop counter to process the number of remaining bytes singly, the rest of the register is empty.

Quotealso, want to ask about "sub al, 1"   &  "jnz bytexorloop"
does the "sub" will set the zero flag if the destination (which is "al"  here)  got zero ?? so, we can use "jnz"  ?

Yes, the loop continues until AL = 0, so we process every remainder byte. Exactly.

Quotei tested "xor dl, BYTE PTR 0FEDAh", (yes i know its not good "as you told me"),  but it works fine. the same as "xor dl, 0DAh"

It shouldn't.  :eek  The problem is the BYTE PTR in that line. As Casper said, it mustn't be there unless you mean the 0FEDAh value to mean the MEMORY LOCATION 0FEDAh, which isn't what you want at all. If you remove it, then you'd get an error because you can't XOR a byte-length register (DL) with a WORD-length value.

Only use BYTE PTR or DWORD PTR when you need to let the assembler know that you are referring to a byte-length (or DWORD length) memory location when it might not otherwise be clear. For example, when putting immediate data into it:
MOV DWORD PTR MemLocation, 0FFh    ; puts 4 bytes 000000FFh
MOV BYTE PTR MemLocation, 0FFh    ; puts single byte 0FFh


or when you have defined a variable as one size and need to access it as a different-length object:
.data
Buffer db 20 DUP 0
LoValue WORD 0
HiValue WORD 0

.code

...

xor eax, eax
; test if buffer is still empty
cmp DWORD PTR Buffer, eax
; Buffer was defined as a series of bytes, so comparing to a DWORD-length register
; will cause an error - you must use DWORD PTR here to "cast" the memory
; location correctly to match the size of EAX

mov cx, LoValue
mov dx, HiValue
; these are fine, WORD-sized variables, WORD-length registers
mov DWORD PTR LoValue, eax
; we can clear both in a single go with EAX = 0 rather than with 2 memory writes,
; but again EAX is DWORD in length and LoValue was defined as a WORD, so
; "cast" the location as a DWORD to avoid an error


Ian_B

Dasar

Ian_B , thank you alot for your explanation, its very clear, i understood very well :)

thank you for your time ^_^