Olly and test

Started by jj2007, May 07, 2009, 02:14:47 PM

redskull

Here's my interpretation which is, of course, always subject to error.  I'm using the June 2005 version of the Intel manual.  From Volume 2A, Figure 2-1, the first byte is the opcode, and the second byte, labeled ModR/M, breaks down like this:

Mod field: bits 7-6
Reg/Opcode field: 5-3
R/M field: 2-0
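As a rough sketch of that bit layout (the function name `split_modrm` is made up for illustration), the three fields can be pulled out of a ModR/M byte with shifts and masks. The two sample bytes are the C1 and C8 values that come up in this thread:

```python
def split_modrm(byte):
    """Split a ModR/M byte into its (mod, reg, rm) fields."""
    mod = (byte >> 6) & 0b11   # Mod field: bits 7-6
    reg = (byte >> 3) & 0b111  # Reg/Opcode field: bits 5-3
    rm = byte & 0b111          # R/M field: bits 2-0
    return mod, reg, rm


for value in (0xC1, 0xC8):
    mod, reg, rm = split_modrm(value)
    print(f"{value:02X}: mod={mod:02b} reg={reg:03b} rm={rm:03b}")
# C1: mod=11 reg=000 rm=001
# C8: mod=11 reg=001 rm=000
```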

From Volume 2B, the "TEST r/m32, r32" instruction is encoded as "85 /r".  In the case of "TEST ECX, EAX", that means eax would be the 'r32', and ecx would be the 'r/m32'.

Going back to Volume 2A, Table 2-2, section 2.1.5: I found the ECX row in the "Effective Address" column (7th from the bottom) and followed it over to the 'EAX' column.  They meet at the 'C1' value, which is what NASM encoded it as, but not MASM.
Just to make sure I matched up the columns right, in the second paragraph it states:

...the Effective Address column lists 32 effective addresses that can be assigned to the first operand of an instruction..

and

... this row (REG=) specifies the use of the 3-bit Reg/Opcode field when the field is used to give the location of a second operand..

I'm pretty certain that means the first operand (ECX) is down the left, and the second (EAX) is across the top.  Comments/Criticisms?  I have been known to mix things up before...

-r


Strange women, lying in ponds, distributing swords, is no basis for a system of government

dedndave

somewhere, i remember seeing some code where the source register was on the left
i don't think it was intel asm86 or masm, though

many of these instructions have 2 (or possibly more) valid op-codes
i.e. there is more than one way to code the same instruction

using TEST (or XCHG, for that matter) leaves it up in the air, as the instruction performs the same task either way

the ModR/M byte, in these cases, has three fields, Mod, reg, R/M
this could be a little confusing, as it is referred to as the "Mod R/M" byte, but the "reg" field is sandwiched between the two others

Mod field - bits 7,6
reg field - bits 5,4,3
R/M field - bits 2,1,0

reg or R/M for EAX=000 binary
reg or R/M for ECX=001 binary
if ECX is specified in the reg field, and EAX is specified in the R/M field, the lower nibble will be 8
if EAX is specified in the reg field, and ECX is specified in the R/M field, the lower nibble will be 1

for the TEST opcode 85, the destination register is specified in the R/M field

85 C1  TEST ECX,EAX
85 C8  TEST EAX,ECX
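A minimal sketch of that encoding rule, assuming the first operand goes in the R/M field and the second in the reg field, as described for opcode 85 above. The helper name `encode_test` and the `REG32` table are illustrative, not taken from any assembler:

```python
# 32-bit register numbers as used in the reg and R/M fields
REG32 = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]


def encode_test(first, second):
    """Encode TEST r/m32, r32 (opcode 85 /r) for two 32-bit registers.

    first  -> R/M field (bits 2-0)
    second -> reg field (bits 5-3)
    """
    rm = REG32.index(first.upper())
    reg = REG32.index(second.upper())
    modrm = (0b11 << 6) | (reg << 3) | rm  # Mod=11 means register-direct
    return bytes([0x85, modrm])


print(encode_test("ecx", "eax").hex(" ").upper())  # 85 C1
print(encode_test("eax", "ecx").hex(" ").upper())  # 85 C8
```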

as a note of interest, the AND instruction has one form where the destination register is specified
in the R/M field and another form where the destination register is specified in the reg field
this may be how MASM coded it wrong - they grabbed the wrong table

21 AND r/m,reg
23 AND reg,r/m

Olly is right - MASM is wrong
I have attached pix from the Intel manual


[attachment deleted by admin]

Jimg

I look at it this way-

if I code

  test ecx, eax
  and ecx, eax
  add ecx, eax
  xor ecx, eax

I get

000001AD  85 C8   test ecx, eax
000001AF  23 C8   and ecx, eax
000001B1  03 C8   add ecx, eax
000001B3  33 C8   xor ecx, eax

in the listing.

If I look at olly, it says-

004011AD   85C8    TEST EAX,ECX
004011AF   23C8    AND ECX,EAX
004011B1   03C8    ADD ECX,EAX
004011B3   33C8    XOR ECX,EAX

The intel manual says:

03 /r ADD  r32,r/m32   Add r/m32 to r32
23 /r AND  r32,r/m32   r32 AND r/m32
85 /r TEST r/m32,r32   AND r32 with r/m32; set SF, ZF, PF according to result

It seems pretty clear to me, but I could be wrong.
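
For anyone who wants to check the operand order mechanically, here is a small sketch of a register-register disassembler built only from the manual's operand order for each opcode. The table and function names are invented for this example, and the XOR entries (31 and 33) are added from the same manual pages rather than from the quote above:

```python
REG32 = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]

# opcode -> (mnemonic, True if the FIRST operand is the r/m field)
OPCODES = {
    0x01: ("ADD", True), 0x03: ("ADD", False),
    0x21: ("AND", True), 0x23: ("AND", False),
    0x31: ("XOR", True), 0x33: ("XOR", False),
    0x85: ("TEST", True),
}


def disasm(opcode, modrm):
    """Disassemble a two-byte register-register instruction."""
    mnemonic, rm_first = OPCODES[opcode]
    if modrm >> 6 != 0b11:
        raise ValueError("only register-direct forms (Mod=11) are handled")
    reg = REG32[(modrm >> 3) & 0b111]
    rm = REG32[modrm & 0b111]
    first, second = (rm, reg) if rm_first else (reg, rm)
    return f"{mnemonic} {first},{second}"


for op in (0x85, 0x23, 0x03, 0x33):
    print(f"{op:02X}C8  {disasm(op, 0xC8)}")
# 85C8  TEST EAX,ECX
# 23C8  AND ECX,EAX
# 03C8  ADD ECX,EAX
# 33C8  XOR ECX,EAX
```

Under those table entries the output agrees with the Olly column, not with the listing.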

redskull

Exactly; your quote from the Intel manual shows that ADD and AND (at least the 03 and 23 versions of them) take the r32 first and the r/m32 second; the TEST opcode is the reverse.  So, if they all have the same second byte in the binary, TEST should disassemble to the exact opposite of ADD and AND, which is what Olly shows.  Had MASM encoded the ADD as '01' instead of '03', the order of the ADD would match the order of TEST.

-r

***EDIT*** Just to keep on ranting:  AND and ADD both have two different opcodes for register/register operation

01 - ADD r/m32, r32
03 - ADD r32, r/m32

21 - AND r/m32, r32
23 - AND r32, r/m32

whereas TEST only has one:

85 - TEST r/m32, r32

So, if MASM chose to encode ADD and AND as 01 and 21, respectively (where the order of the operands matches that of TEST), then the same parameter order should give the same disassembly.  If MASM chose the 03 and 23 versions, in which the order is the opposite of TEST, then the same parameters should produce different second bytes, which they didn't.
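
To make the two-opcodes-versus-one point concrete, here is a hedged sketch that lists every encoding of a register-register instruction from the opcode pairs above. `FORMS` and `encodings` are hypothetical names for this example, not anything MASM uses internally:

```python
REG32 = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]

# mnemonic -> list of (opcode, True if the first operand goes in the r/m field)
FORMS = {
    "ADD": [(0x01, True), (0x03, False)],
    "AND": [(0x21, True), (0x23, False)],
    "TEST": [(0x85, True)],  # no reversed form exists
}


def encodings(mnemonic, first, second):
    """Return every two-byte encoding of 'mnemonic first, second'."""
    a = REG32.index(first.upper())
    b = REG32.index(second.upper())
    result = []
    for opcode, rm_first in FORMS[mnemonic.upper()]:
        rm, reg = (a, b) if rm_first else (b, a)
        result.append(bytes([opcode, 0xC0 | (reg << 3) | rm]))
    return result


for name in ("ADD", "AND", "TEST"):
    forms = ", ".join(e.hex(" ").upper() for e in encodings(name, "ECX", "EAX"))
    print(f"{name} ECX,EAX -> {forms}")
# ADD ECX,EAX -> 01 C1, 03 C8
# AND ECX,EAX -> 21 C1, 23 C8
# TEST ECX,EAX -> 85 C1
```

The second byte is C8 only when the reversed (03/23) form is picked, and TEST has no such form.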

Jimg

In the interest of sanity, I'm going to concede at this point.  We need an Intel cpu engineer to really know how it actually works.

dedndave

lol Jim
reading the PDF's gives me a headache

it would be great to have one of the Intel guys in here
one of the core architects
i bet he could write some killer code, too
he could solve all our issues   :lol

btw - d/l my zip file
it is pretty easy to follow
i have red-lined the pics

MichaelW

I don't know which way it's supposed to be, but in my limited test MASM is the odd one.

test ecx, eax
test ecx, ebx
test ecx, edx
test eax, ecx
test ebx, ecx
test edx, ecx


GoAsm disassembled with DumpPE (based on Microsoft tools AFAIK) or DrWatson:

85c1  test  ecx,eax
85d9  test  ecx,ebx
85d1  test  ecx,edx
85c8  test  eax,ecx
85cb  test  ebx,ecx
85ca  test  edx,ecx


Gas disassembled with DumpPE (based on Microsoft tools AFAIK) or DrWatson:

85c1  test ecx,eax
85d9  test ecx,ebx
85d1  test ecx,edx
85c8  test eax,ecx
85cb  test ebx,ecx
85ca  test edx,ecx


DEBUG (32-bit clone, originally by Paul Vojta, now developed by japheth, version 1.14):

6685C1 TEST ECX,EAX
6685D9 TEST ECX,EBX
6685D1 TEST ECX,EDX
6685C8 TEST EAX,ECX
6685CB TEST EBX,ECX
6685CA TEST EDX,ECX


Masm disassembled with DumpPE (based on Microsoft tools AFAIK) or DrWatson:

85c8  test  eax,ecx
85cb  test  ebx,ecx
85ca  test  edx,ecx
85c1  test  ecx,eax
85d9  test  ecx,ebx
85d1  test  ecx,edx


Edit:

Tasm 3.1 disassembled with Turbo Debugger 3.1:

6685C8  test  eax,ecx
6685CB  test  ebx,ecx
6685CA  test  edx,ecx
6685C1  test  ecx,eax
6685D9  test  ecx,ebx
6685D1  test  ecx,edx


At least MASM isn't all alone.
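
The comparison above can also be run as a quick script. This is only a sketch: it assumes each dump is listed in the same order as the six source lines, decodes per the manual's "TEST r/m32, r32" order, and skips a leading 66 operand-size prefix (presumably present because DEBUG and Tasm were working in 16-bit code). All names in it are made up for the example:

```python
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_test(hex_bytes):
    """Decode a register-register TEST (85 /r), skipping a 66 prefix."""
    data = bytes.fromhex(hex_bytes)
    if data[0] == 0x66:  # operand-size prefix
        data = data[1:]
    opcode, modrm = data[0], data[1]
    if opcode != 0x85 or modrm >> 6 != 0b11:
        raise ValueError("not a register-register TEST r/m32, r32")
    reg = REG32[(modrm >> 3) & 0b111]
    rm = REG32[modrm & 0b111]
    return f"test {rm},{reg}"  # r/m operand first, per the manual


SOURCE = ["test ecx,eax", "test ecx,ebx", "test ecx,edx",
          "test eax,ecx", "test ebx,ecx", "test edx,ecx"]

DUMPS = {  # bytes copied from the listings above, assumed to be in source order
    "GoAsm": ["85c1", "85d9", "85d1", "85c8", "85cb", "85ca"],
    "DEBUG": ["6685C1", "6685D9", "6685D1", "6685C8", "6685CB", "6685CA"],
    "Masm":  ["85c8", "85cb", "85ca", "85c1", "85d9", "85d1"],
    "Tasm":  ["6685C8", "6685CB", "6685CA", "6685C1", "6685D9", "6685D1"],
}

for name, dump in DUMPS.items():
    same = [decode_test(h) for h in dump] == SOURCE
    print(f"{name}: {'matches the source' if same else 'differs from the source'}")
# GoAsm: matches the source
# DEBUG: matches the source
# Masm: differs from the source
# Tasm: differs from the source
```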
eschew obfuscation

dedndave

MASM and some of the MS debuggers are debuggy as well

Tedd

In the encoding of some instructions, bit 1 of the first byte (the 'd' bit) indicates which way around operand 1 and operand 2 should be interpreted - in the case of two register operands this effectively results in src,dest or dest,src.
If you look at the encoding for SUB, for example, there are actually two and they differ in just this d bit. However, for TEST there is only one encoding, because the other is taken by XCHG.
Anyway, the end result is that it's probably down to a nuance in the way ml produces the encodings for instructions (probably lookup tables, etc.). And since they're both functionally equivalent in the case of TEST, it's not a problem.
That said, it would be comforting to know that you actually get out the code for the instructions you put in.



SUB
0010 100w : 11 reg1 reg2
0010 101w : 11 reg1 reg2

TEST
1000 010w : 11 reg1 reg2

XCHG
1000 011w : 11 reg1 reg2

{w=0: 8-bit operands, w=1: 16/32-bit operands (16 in 16-bit code, 32 in 32-bit code)}

It seems ml defaults to d=1, so the encoding appears backwards for TEST because it requires d=0.
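
A small sketch of the bit positions described above, assuming d is bit 1 and w is bit 0 of the opcode byte (the function names are illustrative). Flipping d on SUB gives the other SUB form, while flipping it on TEST lands on XCHG:

```python
def direction_and_width(opcode):
    """Return the (d, w) bits of an opcode byte: d is bit 1, w is bit 0."""
    d = (opcode >> 1) & 1
    w = opcode & 1
    return d, w


def flip_direction(opcode):
    """Toggle the d bit, leaving the rest of the opcode untouched."""
    return opcode ^ 0b10


for name, op in (("SUB", 0x29), ("SUB", 0x2B), ("TEST", 0x85), ("XCHG", 0x87)):
    d, w = direction_and_width(op)
    print(f"{op:02X} {name}: d={d} w={w}")
# 29 SUB: d=0 w=1
# 2B SUB: d=1 w=1
# 85 TEST: d=0 w=1
# 87 XCHG: d=1 w=1

print(f"{flip_direction(0x29):02X}")  # 2B, still SUB
print(f"{flip_direction(0x85):02X}")  # 87, which is XCHG rather than a reversed TEST
```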
No snowflake in an avalanche feels responsible.

dedndave

cool Tedd
i saw that on "AND" (21/23h)
i did not think to see if that bit was the direction bit for other op-codes
it makes sense, though
the guys at intel that lay these out are pretty sharp, i guess
who knows - they may use some kind of program to help them
they made some good choices early on with the 8086/8088
i suppose they made some bad ones, too
hard to have a crystal ball and predict future requirements
processor architecture has come a long way in 30 years
come to think of it, if i were to make a list of areas
where development has accelerated the most over that period,
microprocessor architecture would have to be near the top