Still learning.... after all these years!

Started by glegman, February 05, 2008, 03:23:43 PM

glegman

Firstly, as a newbie, I'd like to say hi to everyone. :bg

I recently noticed a difference in how MASM and NASM generate code. As an example, take 'xor ax,ax'. Simple enough!

When MASM assembles it, it generates 0x33,0xC0.

When NASM is used, it generates 0x31,0xC0.

So I checked with the Intel site (GOD ALMIGHTY): 'The Software Developer's Manual, Appendix A' shows that there are two suitable instructions.

XOR - Logical Exclusive OR
Opcode      Instruction          Description
34 ib       XOR AL, imm8         AL XOR imm8
35 iw       XOR AX, imm16        AX XOR imm16
35 id       XOR EAX, imm32       EAX XOR imm32
80 /6 ib    XOR r/m8, imm8       r/m8 XOR imm8
81 /6 iw    XOR r/m16, imm16     r/m16 XOR imm16
81 /6 id    XOR r/m32, imm32     r/m32 XOR imm32
83 /6 ib    XOR r/m16, imm8      r/m16 XOR imm8 (sign-extended)
83 /6 ib    XOR r/m32, imm8      r/m32 XOR imm8 (sign-extended)
30 /r       XOR r/m8, r8         r/m8 XOR r8
31 /r       XOR r/m16, r16       r/m16 XOR r16    <--- NASM
31 /r       XOR r/m32, r32       r/m32 XOR r32
32 /r       XOR r8, r/m8         r8 XOR r/m8
33 /r       XOR r16, r/m16       r8 XOR r/m8      <--- MASM
33 /r       XOR r32, r/m32       r8 XOR r/m8

I suppose the question I'm asking is: is it possible to get MASM to use the alternate opcode? If so, how?

If not, then I suppose it's something I'll have to live with; after all, they both do the same thing. This also appears to apply to 8-bit and 32-bit instructions, as well as to other instructions (mov, add, etc.): anything that uses two registers.

Any comments?
Glegman :U

ToutEnMasm

Hello,
The job of MASM is to translate text files into hexadecimal files. So, if you want to use hexadecimal code, you don't need MASM. You can enter hexadecimal code with a debugger.
If you want to see how MASM works, ml can produce a listing and other files that show the final result.

jj2007

Quote from: glegman on February 05, 2008, 03:23:43 PM
Firstly, as a newbie, I'd like to say hi to everyone. :bg
Welcome aboard. "Newbie" does not seem entirely correct  :wink

Quote
after all they both do the same thing.
Can you give us a "real" code example where this would produce different results?

Jimg

Interesting.  I never noticed those typos in the Intel manuals before.
And the typos are carried over into every other help file and manual I have.
Finally found a corrected description, vol 2b, page 4-411. Looks like they fixed it in the 64 bit manuals.
And we're all curious why you would want one op code over the other.

glegman

Here are two examples to show what's going on. The first uses MASM, the second NASM. The examples are not identical and have been butchered to save space.

Please don't get distracted by what I'm trying to do; look at what the assemblers do when I xor eax with eax.

MASM generates 33C0 hex at offset 00000016 in the listing.


Microsoft (R) Macro Assembler Version 6.14.8444          02/06/08 03:42:53
Test.asm                       Page 1 - 1


            .586
            .model flat,stdcall
            option casemap:none
            .nolist
               
            ;   includelib user32.lib
               includelib kernel32.lib

            .listall
            WinMain proto :DWORD,:DWORD,:DWORD,:DWORD


00000000         .data
00000000 4D 61 69 6E 57      ClassName db "MainWinClass",0
      69 6E 43 6C 61
      73 73 00
0000000D 4D 61 69 6E 20      AppName  db "Main Window",0
      57 69 6E 64 6F
      77 00

00000000         .data?
00000000 00000000         hInstance HINSTANCE ?
00000004 00000000         CommandLine LPSTR ?

00000000         .code


00000000         start:
               invoke GetModuleHandle, NULL
00000000  6A 00      *       push   +000000000h
00000002  E8 00000000 E   *       call   GetModuleHandleA
00000007  A3 00000000 R      mov    hInstance,eax
               
               invoke GetCommandLine
0000000C  E8 00000000 E   *       call   GetCommandLineA
00000011  A3 00000004 R      mov    CommandLine,eax
               
            ;   invoke WinMain, hInstance,NULL,CommandLine, SW_SHOWDEFAULT
00000016  33 C0         xor eax,eax
               invoke ExitProcess,eax
00000018  50         *       push   eax
00000019  E8 00000000 E   *       call   ExitProcess


            end start


The NASM listing (-l option) shows the code 31C0 hex at offset 00000009.


12081                                  [SECTION CODE USE32 CLASS=CODE]
12082                                 
12083                                   ..start:
12084                                   
12085 00000000 C8000000                   enter   0,0
12086                                  ;   call   GetModuleHandleA,NULL
12087 00000004 A3[1F040000]               mov      [hInst],eax
12088                                  ;   call   DialogBoxParamA,[hInst],1000,0,@Dialog1Proc,0   ;defwinproc is the boss...
12089 00000009 31C0                       xor   eax,eax
12090                                  ;   call   ExitProcess,[hInst]

If you're not convinced, try using debug.

C:\nasm>debug
-e 100
1395:0100  00.33
-e 101
1395:0101  00.c0
-u 100
1395:0100 33C0          XOR     AX,AX


-e 100
1395:0100  00.31
-e 101
1395:0101  00.c0
-u 100
1395:0100 31C0          XOR     AX,AX

Regarding my choice of opcode...
This all started as an interest in OS kernels and how they work. I have tried several from the net, but most seem to use NASM rather than MASM. When I used MASM, it generated different code. I've been a fan of MASM ever since version 1.25, but I never looked closely at the generated code as long as it worked!

I suppose the point I'm making is that not all assemblers are the same.

MichaelW

I would guess that there are many instructions with more than one encoding. There are, for example, at least three different encodings for lea esp, [esp]:

00401000 8D2424          lea esp,[esp]
00401003 8D642400        lea esp,[esp]
00401007 8DA42400000000  lea esp,[esp]


MASM generates the first encoding normally, and the longer encodings to serve as alignment nops.
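The three encodings differ only in how the address is expressed after the 8D opcode. A sketch of the byte breakdown, written as db lines so the assembler emits exactly these bytes; the field annotations follow the standard ModRM/SIB layout and are not taken from the disassembly above. It assumes a 32-bit flat MASM module and is meant to be read in a listing or disassembler, not run:

```masm
; ModRM = mod(2) reg(3) r/m(3); r/m = 100b means a SIB byte follows.
; SIB 24h = 00 100 100: scale 1, no index, base = esp.
.586
.model flat, stdcall
.code
    db 8Dh, 24h, 24h                        ; ModRM 24h: mod=00, no displacement       (3 bytes)
    db 8Dh, 64h, 24h, 00h                   ; ModRM 64h: mod=01, 8-bit displacement 0  (4 bytes)
    db 8Dh, 0A4h, 24h, 00h, 00h, 00h, 00h   ; ModRM A4h: mod=10, 32-bit displacement 0 (7 bytes)
end
```

In every case the reg field is 100b (esp) and the effective address is esp plus zero, so only the instruction length changes.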
eschew obfuscation

zooba

The difference (ignoring the typo as Jimg noted) is the order of the operands:

31 /r   XOR r/m32, r32      r/m32 XOR r32
33 /r   XOR r32, r/m32      r32 XOR r/m32


The first supports using a memory location as the destination but requires a register as the source, for example:

.data
    value DWORD 100
.code
    XOR value, eax


The second requires a register as the destination but supports a memory location or a register as the source:

.data
    value DWORD 100
.code
    XOR eax, value


With two registers they would appear to be interchangeable.
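The overlap shows up in the second byte (ModRM). With mod = 11b, both the reg field and the r/m field name registers, so either opcode can describe the same operation; only the roles of the two fields swap. A small sketch, assuming a 32-bit flat MASM module, with the bytes written as db lines so they can be checked in a listing or disassembler (the ecx/edx pair is just an illustrative choice):

```masm
; ModRM = mod(2) reg(3) r/m(3); mod = 11b means r/m is a register.
; Register numbers: eax = 000b, ecx = 001b, edx = 010b.
.586
.model flat, stdcall
.code
    db 31h, 0C0h   ; C0 = 11 000 000: r/m = eax (dest), reg = eax (src) -> xor eax, eax
    db 33h, 0C0h   ; C0 = 11 000 000: reg = eax (dest), r/m = eax (src) -> xor eax, eax
    db 31h, 0D1h   ; D1 = 11 010 001: r/m = ecx (dest), reg = edx (src) -> xor ecx, edx
    db 33h, 0CAh   ; CA = 11 001 010: reg = ecx (dest), r/m = edx (src) -> xor ecx, edx
end
```

When both operands are the same register the ModRM byte is identical for the two opcodes, which is why only the first byte differs between the MASM and NASM output for xor eax,eax.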

MichaelW,

If you count memory location encodings (as you have) as well as instruction encodings there are millions upon millions of combinations. :wink

Cheers,

Zooba :U

MichaelW

I probably should have stated that as "more than one encoding that will produce the same end result".

zooba

Quote from: MichaelW on February 06, 2008, 12:01:03 AM
I probably should have stated that as "more than one encoding that will produce the same end result".

Which is the whole point of this thread. :bg

Tight_Coder_Ex

Whenever you need a specific encoding that overrides what the assembler would choose, emit the bytes yourself with db:

       db     31H, 0C0H              ; xor eax, eax (31 /r form)
       invoke   ExitProcess, eax


I've used this method with MASM where I want an alternate return path within a procedure but still want MASM to generate the appropriate epilogue.


      ........  My code
      db   0C3H                      ; raw near RET, emitted as-is (no MASM epilogue)
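For anyone who wants to try the db approach end to end, here is a minimal sketch of a complete program. It assumes the MASM32 SDK's kernel32.lib is on the linker's library path, and it declares ExitProcess by hand rather than pulling in an include file:

```masm
; Minimal sketch: force the 31 /r form of "xor eax, eax" with db.
.586
.model flat, stdcall
option casemap:none

includelib kernel32.lib
ExitProcess PROTO :DWORD        ; declared by hand so no include file is needed

.code
start:
    db 31h, 0C0h                ; same effect as "xor eax, eax", using opcode 31
    invoke ExitProcess, eax     ; exit code 0
end start
```

Assembling with ml /c /coff /Fl should show 31 C0 at offset 00000000 in the listing, instead of the 33 C0 that MASM picks when given the mnemonic.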


glegman

db 31h,0c0h

That's exactly what I ended up doing.

Thanks for everyone's input.

Glegman  :U