
Registers Can Combine?

Started by 2-Bit Chip, August 18, 2009, 03:17:26 AM


2-Bit Chip


MUL

If "src" is a word value, then AX is
multiplied by "src" and DX:AX receives the result.  If "src" is
a double word value, then EAX is multiplied by "src" and EDX:EAX
receives the result.
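As a rough illustration of what that description means, the register pair can be modelled in a few lines of Python. This is only a sketch of the arithmetic, not MASM; the helper names `mul16` and `mul32` are hypothetical and exist only for this example.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF

def mul16(ax, src):
    """Model of MUL r/m16: AX * src, result split across DX (high) and AX (low)."""
    product = (ax & MASK16) * (src & MASK16)
    return product >> 16, product & MASK16      # (DX, AX)

def mul32(eax, src):
    """Model of MUL r/m32: EAX * src, result split across EDX (high) and EAX (low)."""
    product = (eax & MASK32) * (src & MASK32)
    return product >> 32, product & MASK32      # (EDX, EAX)

dx, ax = mul16(0x1234, 0x0100)          # product is 0x00123400
edx, eax = mul32(0xFFFFFFFF, 2)         # product is 0x1FFFFFFFE
print(hex(dx), hex(ax), hex(edx), hex(eax))
```

The high half of each product lands in DX or EDX, which is all the "DX:AX" and "EDX:EAX" notation is saying.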


DX:AX? EDX:EAX? I have also seen:
invoke crt_printf,chr$("%.2f MHz%c"),edx::eax,10

dedndave

DX:AX means those two 16-bit (word) registers are paired to hold a 32-bit (dword) value
EDX:EAX means those two 32-bit (dword) registers are paired to hold a 64-bit (qword) value
the same is more or less true for the invoke
except the 2 dword registers are pushed onto the stack separately, but still form a 64-bit value
i think the high order register is pushed first
so, that would probably work the same as...

invoke crt_printf,chr$("%.2f MHz%c"),eax,edx,10

although, the PROTO for that invoke may only allow a qword - not sure, as i don't use it - lol

MichaelW

The single colon form is the normal notation, with the high-order WORD or DWORD first and the low-order WORD or DWORD last, just as we express numbers with the highest-order digit first and the lowest-order digit last. For whatever reason, for INVOKE Microsoft chose to use the same notation but with a double colon.

When placing a 64-bit value on a 32-bit stack (or a 32-bit value on a 16-bit stack), it's necessary to consider the storage order in memory, which for the x86 processors places the highest order BYTE, WORD, or DWORD at the highest address, and the lowest order BYTE, WORD, or DWORD at the lowest address. Since the stack "grows" down in memory, with PUSH decreasing the stack pointer and POP increasing it, to place a 64-bit value on a 32-bit stack you would need to push the high-order DWORD first and the low-order DWORD last.

So either of these statements:

invoke crt_printf,chr$("%.2f MHz%c"),edx::eax,10
invoke crt_printf,chr$("%.2f MHz%c"),eax,edx,10

Would assemble to:

00401000 6A0A                   push    0Ah
00401002 52                     push    edx
00401003 50                     push    eax
00401004 6800304000             push    403000h
00401009 FF151C204000           call    dword ptr [printf]

Edit:

I see that I left out a key point. The order in which the arguments are pushed is determined by the calling convention in use. For the C calling convention used for the INVOKE statements above, as well as the STDCALL calling convention that is normal for the API functions, the arguments are pushed in right to left order as they appear in the invoke statement, prototype, or procedure/function definition or declaration.
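A rough way to see why the high-order DWORD has to be pushed first is to model the stack in Python. This is an illustrative sketch only: the `Stack` class and the `push_pair` helper are made-up names, and the model ignores everything except a stack pointer that moves down and little-endian stores.

```python
import struct

class Stack:
    """Toy 32-bit stack that grows toward lower addresses."""
    def __init__(self, size=64):
        self.mem = bytearray(size)
        self.esp = size                 # empty stack: ESP sits just past the buffer

    def push(self, dword):
        self.esp -= 4                   # PUSH decreases the stack pointer first
        struct.pack_into("<I", self.mem, self.esp, dword & 0xFFFFFFFF)

    def read_qword(self):
        """Read the 8 bytes at ESP as one little-endian 64-bit value."""
        return struct.unpack_from("<Q", self.mem, self.esp)[0]

def push_pair(edx, eax):
    """Push EDX then EAX, as in the disassembly, and read the pair back as a qword."""
    stack = Stack()
    stack.push(edx)                     # high-order DWORD ends up at the higher address
    stack.push(eax)                     # low-order DWORD ends up at the lower address
    return stack.read_qword()

print(hex(push_pair(0x11223344, 0x55667788)))
```

Pushing in the opposite order would leave the two halves swapped in memory, so the callee would read a different 64-bit value.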


eschew obfuscation

2-Bit Chip

Thank you very much you two.

I appreciate how in-depth you went to help me understand, Michael. I thank you. :dance:

Rockoon

You will also find that the division instructions can take edx:eax (or dx:ax) as input .. that is, they divide a 64-bit value (or, for dx:ax, a 32-bit value)..

If you want to divide a signed 32-bit value in eax, you must first sign extend it into edx.. and if it's an unsigned value, you must zero out edx first.
When C++ compilers can be coerced to emit rcl and rcr, I *might* consider using one.

Mirno

It is not strictly true that you must zero out edx - you just need to know that the result of the division will be 32 bits or less...
In practice however most of the time cdq (sign extend eax to edx) or xor edx,edx will be all you need!

Rockoon

Quote from: Mirno on August 18, 2009, 11:11:24 AM
It is not strictly true that you must zero out edx - you just need to know that the result of the division will be 32 bits or less...
In practice however most of the time cdq (sign extend eax to edx) or xor edx,edx will be all you need!

Not true.

; calculate 16 divided by 8
mov eax, 16
mov ebx, 8
idiv ebx

Is the result correct? It's supposed to be 2 (a 2-bit number is less than 32 bits)

If EDX contained *any* value other than 0, then the result will not be correct.
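To make the point concrete, here is a small Python model of the signed divide. It is a sketch under stated assumptions, not a replacement for the Intel manual: `idiv32`, `cdq`, and `to_signed` are hypothetical helper names, and a Python `OverflowError` stands in for the CPU's divide-error exception.

```python
MASK32 = 0xFFFFFFFF

def to_signed(value, bits):
    """Reinterpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def cdq(eax):
    """EDX value produced by CDQ: the sign bit of EAX copied into every bit."""
    return MASK32 if eax & 0x80000000 else 0

def idiv32(edx, eax, divisor):
    """Model of IDIV r/m32: signed EDX:EAX / divisor -> (EAX quotient, EDX remainder)."""
    d = to_signed(divisor, 32)
    if d == 0:
        raise ZeroDivisionError("#DE: divide by zero")
    dividend = to_signed(((edx & MASK32) << 32) | (eax & MASK32), 64)
    q = abs(dividend) // abs(d)         # IDIV truncates toward zero
    if (dividend < 0) != (d < 0):
        q = -q
    r = dividend - q * d                # remainder takes the sign of the dividend
    if not -(1 << 31) <= q <= (1 << 31) - 1:
        raise OverflowError("#DE: quotient does not fit in EAX")
    return q & MASK32, r & MASK32

clean = idiv32(cdq(16), 16, 8)          # EDX sign-extended before the divide
stale = idiv32(1, 16, 8)                # EDX left holding a stray 1
print([hex(x) for x in clean], [hex(x) for x in stale])
```

With EDX prepared by `cdq` the quotient is 2; with a stray 1 left in EDX the dividend is really 100000010h and the quotient comes out as 20000002h.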

jj2007

Quote from: Rockoon on August 18, 2009, 10:48:28 AM
If you want to divide a signed 32-bit value in eax, you must first sign extend it into edx.. and if it's an unsigned value, you must zero out edx first.


Correct. More specifically, if the value in eax is 7FFFFFFFh or below, cdq will always do the job; if the value in eax is 80000000h or above, then cdq plus idiv still works fine, but cdq plus div will raise an exception. The latter case is rare, however, so you might as well go for the exception, at least in the debug phase :bg
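The cdq plus div case can be checked with a short Python model of the unsigned divide. This is only a sketch: `div32`, `cdq`, and `cdq_then_div` are hypothetical names, and returning `None` stands in for the divide-error exception the CPU would raise.

```python
MASK32 = 0xFFFFFFFF

def cdq(eax):
    """EDX value produced by CDQ: the sign bit of EAX copied into every bit."""
    return MASK32 if eax & 0x80000000 else 0

def div32(edx, eax, divisor):
    """Model of DIV r/m32: unsigned EDX:EAX / divisor -> (EAX quotient, EDX remainder)."""
    divisor &= MASK32
    if divisor == 0:
        raise ZeroDivisionError("#DE: divide by zero")
    dividend = ((edx & MASK32) << 32) | (eax & MASK32)
    quotient, remainder = divmod(dividend, divisor)
    if quotient > MASK32:
        raise OverflowError("#DE: quotient does not fit in EAX")
    return quotient, remainder

def cdq_then_div(eax, divisor):
    """Run CDQ followed by DIV; None means the divide would have faulted."""
    try:
        return div32(cdq(eax), eax, divisor)
    except OverflowError:
        return None

print(cdq_then_div(0x7FFFFFFF, 2))      # sign bit clear: CDQ zeroes EDX
print(cdq_then_div(0x80000000, 2))      # sign bit set: EDX becomes FFFFFFFFh
```

Once CDQ has filled EDX with FFFFFFFFh, EDX is at least as large as any 32-bit divisor, so the unsigned quotient can never fit in EAX.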

dedndave

if overflow is a problem, you can cascade DIV instructions to handle larger quotients or smaller divisors
if DividendA is a 128-bit input value:

DividendA  dd 0FFFFFFFFh,0FFFFFFFFh,0FFFFFFFFh,0FFFFFFFFh
DivisorA   dd 8123456h
QuotientA  dd 4 dup(?)
RemainderA dd ?

        mov     ecx,DivisorA
        xor     edx,edx

        mov     eax,DividendA+12
        div     ecx
        mov     QuotientA+12,eax

        mov     eax,DividendA+8
        div     ecx
        mov     QuotientA+8,eax

        mov     eax,DividendA+4
        div     ecx
        mov     QuotientA+4,eax

        mov     eax,DividendA
        div     ecx
        mov     QuotientA,eax

        mov     RemainderA,edx

Notice that, for each DIV instruction, the remainder
becomes the high-order portion of the dividend for the
next DIV instruction. The last remainder is the remainder
of the entire divide operation.

Also, it is important to note that the divisor is
still limited to a 32-bit value. Accommodating larger
divisors is a much more complicated issue, as the
remainder register needs to be larger, as well.
I think you are stuck with long division, then.
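The same cascade can be sketched in Python to check the idea on paper. This is a model of the technique, not the MASM itself: `cascade_div` is a hypothetical name, and the dividend is passed as a list of 32-bit dwords in the same low-to-high order as the `dd` line above.

```python
MASK32 = 0xFFFFFFFF

def cascade_div(dwords, divisor):
    """Divide a multi-dword value by a 32-bit divisor, one DIV-sized step per dword.

    dwords is little-endian: dwords[0] is the low-order DWORD.
    Returns (quotient dwords in the same order, final remainder).
    """
    if not 0 < divisor <= MASK32:
        raise ValueError("divisor must be a non-zero 32-bit value")
    remainder = 0                                   # xor edx,edx
    quotient = [0] * len(dwords)
    for i in reversed(range(len(dwords))):          # highest DWORD first
        dividend = (remainder << 32) | dwords[i]    # EDX:EAX
        quotient[i], remainder = divmod(dividend, divisor)  # div ecx
    return quotient, remainder

quotient, remainder = cascade_div([0xFFFFFFFF] * 4, 0x8123456)
print([hex(d) for d in quotient], hex(remainder))
```

Each step is safe because the remainder carried into EDX is always smaller than the divisor, which is exactly the condition for the 64-by-32 quotient to fit in 32 bits.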

Mirno

Hi Rockoon,
It is true, the division is 64 bit but as I said the result must fit in 32 bits.

If you load 16 into edx and zero out eax, then divide by 32 (i.e. you are dividing 0x10 0000 0000 by 0x20), you will get 0x8000 0000 in eax, which is the correct result.

As I (and Intel's documents) said, the division is considered 64 bit, so if you don't clear/sign extend to edx and want to do 32 bit division then you will obviously get nonsense. However this does not mean that edx must be 0 or -1.
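The arithmetic in that example is easy to confirm with plain Python integers. This is just a worked check of the numbers quoted above, with variable names chosen for the illustration; it assumes the unsigned div is the instruction in question.

```python
edx, eax, divisor = 16, 0, 32

dividend = (edx << 32) | eax                    # 0x10 0000 0000
quotient, remainder = divmod(dividend, divisor)

fits_unsigned = quotient <= 0xFFFFFFFF          # DIV needs the quotient to fit in EAX
fits_signed = quotient <= 0x7FFFFFFF            # IDIV needs it to fit in a signed 32 bits

print(hex(dividend), hex(quotient), remainder, fits_unsigned, fits_signed)
```

The quotient fits in EAX for div even though edx was neither 0 nor -1. The same operands given to idiv would overflow, because 0x8000 0000 is one past the largest positive signed 32-bit value.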

Mirno


Rockoon

Quote from: Mirno on August 19, 2009, 08:31:03 AM
Hi Rockoon,
It is true, the division is 64 bit but as I said the result must fit in 32 bits.

I explicitly talked about division *OF* 32-bit values, by mentioning that fact clearly.

If you wanted to get into the division of 64-bit values, you should have said so. I think that the thread starter would like the people who respond to be informative instead of cryptic.

Mirno

If we're going to be all pedantic about things, then the starter of the thread didn't mention that they wanted 32 bit or 64 bit, or even division.
If you want to start picking fights feel free to continue - but the fact is you were providing incomplete information on a topic 2-bit chip had not asked about.

You may have mentioned that your case was specific to 32 bits. However, the general case is that the division is 64-bit and the result of that division must fit in 32 bits. Most of the time people will simply want to either sign extend (cdq) or zero (xor edx,edx), but the opportunity exists, should it be needed, to divide a full 64-bit number.

Mirno

Rockoon

Quote from: Mirno on August 19, 2009, 03:43:34 PM
If you want to start picking fights feel free to continue - but the fact is you were providing incomplete information on a topic 2-bit chip had not asked about.

I provided incomplete, but accurate information, on the topic he did ask about (the usage of edx:eax)

I'm not here to pick fights, but I do respond in kind.

Quote from: Mirno on August 19, 2009, 03:43:34 PM
You may have mentioned that your case was specific to 32 bits. However, the general case is that the division is 64-bit and the result of that division must fit in 32 bits. Most of the time people will simply want to either sign extend (cdq) or zero (xor edx,edx), but the opportunity exists, should it be needed, to divide a full 64-bit number.


Let me quote your remark:

Quote
It is not strictly true that you must zero out edx - you just need to know that the result of the division will be 32 bits or less... In practice however most of the time cdq (sign extend eax to edx) or xor edx,edx will be all you need!

What help is this exactly?

It leads the reader to believe that if the result is going to be wider than 32 bits, the solution to this problem is to zero out or sign extend into edx! That's not true. That leads to incorrect results. If the result is going to be wider than 32 bits, no change to edx is going to solve it.

In practice, cdq or xor edx, edx *only* applies to 32-bit division. It never applies to 64-bit values.