mov vs. movzx

Started by Astro, April 26, 2010, 01:14:49 AM


Slugsnack

Quote from: MichaelW on April 26, 2010, 01:57:22 PM
Quote from: Slugsnack on April 26, 2010, 10:37:55 AM
this is a subject that has been discussed in the past, how :
mov al, byte ptr 8

actually assembles to :
mov al, 8

personally i believe this can be regarded as a 'bug' on the part of the assembler. the first instruction is actually valid.

It is valid, but it is not interpreted the way:

mov al, byte ptr [8]

would be in, for example, Debug or CodeView.

Judging from multiple statements to this effect in the MASM documentation, the PTR operator is intended for specifying operand size. It is not intended for specifying direct memory operands.

yes, however :
mov al, [8]

is also incorrectly interpreted as :
mov al, 8

when surely it should be :
mov al, byte ptr ds:[8]

Astro

The MASM docs are clear in that:

mov eax,Buffer

is the same as:

mov eax,[Buffer]

The [ ] are not explicitly required (this caused me much confusion when I first started!).




The only time [ ] are explicitly required is when dealing with registers:

mov eax,ecx

is NOT the same as:

mov eax,[ecx]




Immediates are not affected, so:

mov eax,8

is the same as:

mov eax,[8]



Quote
when surely it should be :
mov al, byte ptr ds:[8]

I just tried this - you're correct. [8] should become ds:[8] when built, but it does not.

Best regards,
Robin.

dedndave

welllll - that isn't quite true
try assembling this and see what code is generated...

        mov     al,8
        mov     al,[8]     ;this one may generate an error message - if so, remove it
        mov     al,ds:[8]

Astro

Quote
mov     al,[8]     ;this one may generate an error message - if so, remove it
It does not generate an error. [8] becomes an immediate value, not a memory reference.

Best regards,
Robin.

dedndave

here is what i get
00401014 B008                    mov al,08
00401016 B008                    mov al,08
00401018 A008000000              mov al,[00000008]
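
The three encodings above differ only in opcode and operand width. As a rough sketch (Python used purely as a byte calculator; both helper names are made up for illustration, and a flat 32-bit segment is assumed), opcode B0 is followed by a one-byte immediate while opcode A0 is followed by a 32-bit absolute address:

```python
import struct

def encode_mov_al_imm8(value):
    """Hypothetical helper: bytes for `mov al, imm8` (opcode B0 + 1-byte immediate)."""
    if not 0 <= value <= 0xFF:
        raise ValueError("immediate does not fit in 8 bits")
    return bytes([0xB0, value])

def encode_mov_al_moffs32(address):
    """Hypothetical helper: bytes for `mov al, ds:[address]` (opcode A0 + 4-byte little-endian address)."""
    return bytes([0xA0]) + struct.pack("<I", address)

print(encode_mov_al_imm8(8).hex().upper())     # B008
print(encode_mov_al_moffs32(8).hex().upper())  # A008000000
```

So `mov al,[8]` and `mov al,8` really do end up as the same two bytes here, and only the `ds:` form gets the memory-load opcode.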

clive

I love what it did with "mov al,byte ptr [512]"

Microsoft (R) Macro Assembler Version 6.15.8803     04/26/10 14:20:26
test4.asm      Page 1 - 1


        .386
        .MODEL Flat

00000000         .DATA

00000000 01 foo     db      1

00000000         .CODE

00000000 start:

00000000  B0 08         mov al,8
00000002  B0 08         mov al,[8]
00000004  B0 08         mov al,byte ptr [8]
00000006  A0 00000008         mov al,ds:[8]
0000000B  B0 08         mov al,0[8]
0000000D  B0 7F         mov al,[127]
0000000F  B0 80         mov al,[128]
00000011  B0 FF         mov al,[255]
;        mov al,[512] ; chokes
00000013  B0 00         mov al,byte ptr [512]
00000015  A0 00000000 R         mov al,foo

        END start


At least MASM 5.1 reports an error for both 512 references

Microsoft (R) Macro Assembler Version 5.10                  4/26/10 14:32:34
                                                             Page     1-1


        .386

0000 _data   SEGMENT PARA PUBLIC 'DATA'

0000  01 foo     db      1

0001 _data   ENDS

0000 _text   SEGMENT PARA USE32 PUBLIC 'CODE'
        ASSUME CS:_text, DS:_data

0000 start:

0000  B0 08         mov al,8
0002  B0 08         mov al,[8]
0004  B0 08         mov al,byte ptr [8]
0006  A0 00000008         mov al,ds:[8]
000B  B0 08         mov al,0[8]
000D  B0 7F         mov al,[127]
000F  B0 80         mov al,[128]
0011  B0 FF         mov al,[255]
0013  B0 00         mov al,[512] ; chokes
test5.asm(22): error A2050: Value out of range
0015  B0 00         mov al,byte ptr [512]
test5.asm(23): error A2050: Value out of range
0017  A0 00000000 R         mov al,foo
001C  B0 08         mov al,offset 8

001E _text   ENDS

        END start
It could be a random act of randomness. Those happen a lot as well.

dedndave

that is interesting
it might be nice to know
but, i would hate to depend on it being a feature instead of a bug
otherwise, it could be handy with an equate
i suppose you could trust:

mov al,byte ptr SomeEquate and 255

the older assemblers would have spit out a syntax error, expecting the equate to find a 16-bit destination
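
For what the `and 255` mask buys you, here is a minimal sketch in Python (used only as a calculator; `low_byte` is a made-up name). It assumes the B0 00 that MASM 6.15 emitted for `byte ptr [512]` is plain truncation to the low byte, which the listing suggests but nothing quoted in this thread confirms:

```python
def low_byte(value):
    """Keep only bits 0-7, the same effect as `and 255` on an equate."""
    return value & 255

# The constants used in the listings above, shown as the B0 ib bytes they would produce.
for value in (8, 127, 128, 255, 512):
    print(f"{value:>3} -> B0 {low_byte(value):02X}")
```

With the mask written out, 512 gives 00 on purpose rather than by accident, so it does not matter whether a given assembler version truncates or errors.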

MichaelW

Quote from: Slugsnack on April 26, 2010, 05:22:46 PM
yes, however :
mov al, [8]
is also incorrectly interpreted as :
mov al, 8

Coming to MASM from Debug, this particular detail threw me too. The section Direct Memory Operands here specifies how the index operator is used, and under Segment Override:
Quote
A segment name override or the segment override operator identifies the operand as an address expression.
. . .
As the example shows, a constant expression cannot be an address expression unless it has a segment override.

So you can get the behavior that you expect with:

mov al, ds:[8]


eschew obfuscation

clive

MSVC 12.00 is at least consistent. Though I've never really used the opcode in this form; I'm generally indexing from some base register, or referencing a data structure that MASM knows about.

#include <windows.h>

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
  __asm
  {
        mov al,8
        mov al,[8]
        mov al,byte ptr [8]
        mov al,ds:[8]
        mov al,0[8]
        mov al,[127]
        mov al,[128]
        mov al,[255]
        mov al,[512]
        mov al,byte ptr [512]
  }

  return(0);
}


Disassembly

00000050                    _main:
00000050 55                     push    ebp
00000051 8BEC                   mov     ebp,esp
00000053 53                     push    ebx
00000054 56                     push    esi
00000055 57                     push    edi
00000056 B008                   mov     al,8
00000058 B008                   mov     al,8
0000005A B008                   mov     al,8
0000005C 3EA008000000           mov     al,ds:[8]
00000062 B008                   mov     al,8
00000064 B07F                   mov     al,7Fh
00000066 B080                   mov     al,80h
00000068 B0FF                   mov     al,0FFh
0000006A B000                   mov     al,0
0000006C B000                   mov     al,0
0000006E 33C0                   xor     eax,eax
00000070 5F                     pop     edi
00000071 5E                     pop     esi
00000072 5B                     pop     ebx
00000073 5D                     pop     ebp
00000074 C3                     ret

hutch--

The general drift with historical Intel notation that MASM more or less preserves is that it is a fully specified language, which means you can still write it in much the same way as a CL.EXE asm dump does, where it specifies the data size with every instruction.

While other tools use different notation, MASM uses named variables which have corresponding addresses. If you have a stack variable [ebp+16] that is named "var", placing square brackets around it is the same as writing [[ebp+16]], which is ambiguous because x86 hardware has no mechanism for multiple levels of indirection. MASM will allow mov eax, [ecx+edx*4][128], where the contents of the second pair of square brackets are ADDED to the address like any normal displacement, but with [named_variable] it just ignores the brackets and you get "named_variable".
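
A quick sketch of that address arithmetic, in Python as a calculator only. The function name and the register contents are made up for illustration; the point is just that the [128] is summed in as a displacement and is not a second level of indirection:

```python
def effective_address(base, index, scale, displacement):
    """base + index*scale + displacement, wrapped to 32 bits like an x86 address."""
    return (base + index * scale + displacement) & 0xFFFFFFFF

ecx, edx = 0x1000, 3  # made-up register contents
# Address that mov eax, [ecx+edx*4][128] would read from.
print(hex(effective_address(ecx, edx, 4, 128)))  # 0x108c
```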

Over time a shorthand has developed where you can omit the size specifier "BYTE PTR" and similar if the size can be determined from either of the operands by the assembler, but where the size cannot be determined the full specification is required.

movzx eax, [ebp+16]  ; goes bang because there is no way to determine the size of the data to be zero extended.

movzx eax, WORD PTR [ebp+16]   ; removes the ambiguity.
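
To see why the size matters, here is a small Python model of the two possible reads (not MASM; the memory contents and function names are made up for illustration). The same address gives a different zero-extended result depending on whether one byte or two are loaded:

```python
import struct

memory = bytes([0x34, 0x12, 0xFF, 0xFF])  # made-up bytes sitting at [ebp+16]

def movzx_byte(mem, offset=0):
    """Model of movzx eax, BYTE PTR [...]: one byte, upper 24 bits zero."""
    return mem[offset]

def movzx_word(mem, offset=0):
    """Model of movzx eax, WORD PTR [...]: two little-endian bytes, upper 16 bits zero."""
    return struct.unpack_from("<H", mem, offset)[0]

print(hex(movzx_byte(memory)))  # 0x34
print(hex(movzx_word(memory)))  # 0x1234
```

Neither answer is more "right" than the other, which is why the assembler refuses to guess.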

joemc

oops... i had never zero extended an immediate value (since there really is no point), so i just accidentally typed it that way. I did test it before i posted and it worked :) but i agree it is a confusing way when disassembly and debugging tend to look at it differently.