Hi,
Just want to check I understand this correctly:
mov doesn't always overwrite the upper part of a register?
If that is correct, is:
movzx ecx,eax
equivalent to:
xor ecx,ecx
mov ecx,eax
Best regards,
Robin.
I think you have the right idea, but mov ecx,eax is going to move ALL of eax into ecx, so the xor isn't going to do much for you. movzx is good for when you want to move a smaller object into a larger one.
a better example would be
movzx ecx, byte ptr 9
is equivalent to
xor ecx, ecx
mov cl, 9
the main time I use it is for string pointers,
to get one character from a string pointed to by eax
movzx ecx, byte ptr [eax]
::)
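A rough sketch of the difference, with plain C standing in for the registers. The function names are made up for illustration and are not anything MASM provides; the only point is that a byte-sized mov leaves the upper 24 bits alone, while movzx (or xor followed by a byte mov) clears them.

```c
#include <stdint.h>

/* Models "mov cl, value": only the low 8 bits of the register change. */
uint32_t mov_low8(uint32_t reg, uint8_t value)
{
    return (reg & 0xFFFFFF00u) | value;
}

/* Models "movzx ecx, byte ptr [mem]": the byte is zero-extended to 32 bits. */
uint32_t movzx_byte(uint8_t value)
{
    return (uint32_t)value;
}

/* Models "xor ecx, ecx" followed by "mov cl, value". */
uint32_t xor_then_mov_low8(uint8_t value)
{
    uint32_t ecx = 0;             /* xor ecx, ecx  */
    return mov_low8(ecx, value);  /* mov cl, value */
}
```

With a register that starts as 12345678h, mov_low8 leaves the upper three bytes in place, whereas the other two always return just the byte value.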
I keep forgetting that part - SIZE DIFFERENCE.
Thanks.
Best regards,
Robin.
Robin,
movzx is useful for a couple of reasons. You can widen a smaller piece of data to the size you want, and because it writes the whole register it has the added advantage of avoiding the partial register stall that some earlier Intel hardware suffered when a full register was read after a partial write.
Typically you have a BYTE of DATA at an address, and you use MOVZX to copy it into a full 32 bit register so it can be used with full 32 bit comparisons.
movzx eax, WORD PTR [esp+22]
This gets a 16 bit WORD directly off the stack and writes it to EAX so you can test against it with full 32 bit instructions which is generally faster than using 16 bit compares.
Nice!! Thanks for the info. I was wondering if:
cmp al,1
was faster or the same as
cmp eax,1
You answered that one! :bg
Best regards,
Robin.
Quote from: joemc on April 26, 2010, 01:39:40 AM
I think you have the right idea, but mov ecx,eax is going to move ALL of eax into ecx, so the xor isn't going to do much for you. movzx is good for when you want to move a smaller object into a larger one.
a better example would be
movzx ecx, byte ptr 9
is equivalent to
xor ecx, ecx
mov cl, 9
the main time I use it is for string pointers,
to get one character from a string pointed to by eax
movzx ecx, byte ptr [eax]
actually those 2 are not equivalent. byte ptr 9 is not the same as the value 9. byte ptr 9 is the byte that is pointed to by memory address 9. most likely that would cause an access violation since no memory is allocated there.
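@ Astro : generally, working with full registers rather than partial registers is quicker, so cmp eax, 1 would be faster in this case. keep in mind the two only test the same thing when the upper 24 bits of eax are already zero (e.g. after a movzx)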
Quote from: Astro on April 26, 2010, 01:14:49 AM
If that is correct, is:
movzx ecx,eax
equivalent to:
xor ecx,ecx
mov ecx,eax
Robin,
movzx eax, ecx will first of all generate error A2070:invalid instruction operands
What works is
movzx eax, ax
movsx eax, ax (sign extension)
movzx eax, cl
movsx eax, cl (sign extension)
movzx eax, word ptr [mem]
movsx eax, word ptr [mem]
movzx eax, byte ptr [mem]
movsx eax, byte ptr [mem]
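For what it's worth, here is a small C model of what the zero and sign extending forms do to the upper bits. The helper names are hypothetical, chosen only for this sketch; the bit tests are written out by hand rather than relying on C's signed conversions.

```c
#include <stdint.h>

/* Models "movzx eax, cl": the upper 24 bits become zero. */
uint32_t zero_extend8(uint8_t v)
{
    return (uint32_t)v;
}

/* Models "movsx eax, cl": the upper 24 bits are copies of bit 7. */
uint32_t sign_extend8(uint8_t v)
{
    return (v & 0x80u) ? (0xFFFFFF00u | (uint32_t)v) : (uint32_t)v;
}

/* Models "movzx eax, ax": the upper 16 bits become zero. */
uint32_t zero_extend16(uint16_t v)
{
    return (uint32_t)v;
}

/* Models "movsx eax, ax": the upper 16 bits are copies of bit 15. */
uint32_t sign_extend16(uint16_t v)
{
    return (v & 0x8000u) ? (0xFFFF0000u | (uint32_t)v) : (uint32_t)v;
}
```

The two forms only differ when the top bit of the source is set: 80h zero-extends to 00000080h but sign-extends to FFFFFF80h.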
Hi jj,
Yes - I already discovered that! :bg Thanks for pointing it out though.
@joemc:
Quote
actually those 2 are not equivalent. byte ptr 9 is not the same as the value 9.
I've seen it written both ways, and it appears to work identically?
e.g.
mov al, byte ptr 8
mov byte ptr al,8
O/T slightly: I've found a weird error I get in one app but not another.
SomeProc proc Buffer:DWORD
mov eax,Buffer
mov eax,[eax] ; this line fails build in one app but works in another????? Build options are the same except processor type.
; working processor is 386, failing processor is 486.
Best regards,
Robin.
Astro : those instructions do very different things. the first treats 8 as an address and moves the byte at memory address 8 into al. this would cause an access violation on most machines, since that is below the minimum application address. the second is not even a valid instruction: you are telling the computer to move the value 8 into the byte pointed to by al. the machine cannot treat al as a pointer, however, since pointers are all 32 bits. even if we were to do the following :
movzx eax, al
mov byte ptr ds:[eax], 8
that is very very different to the 'opposite' of the first instruction which i assume was your intention
For both of these statements:
mov al, byte ptr 8
mov byte ptr al, 8
MASM ignores the unnecessary BYTE PTR operators (unnecessary because the size can be determined from the destination operand) and assembles a MOV reg, immed:
MOV AL, 8
this is a subject that has been discussed in the past, how :
mov al, byte ptr 8
actually assembles to :
mov al, 8
personally i believe this can be regarded as a 'bug' on the part of the assembler. the first instruction is actually valid. despite the fact that in 90% of cases, the user intended the second instruction, i do not believe it should be just changed.
above is a clear example of where there has been confusion about this already. the equivalence joemc claims is actually true if that is what you input into masm32, since it is 'translated' like above. however, if you assembled that code within ollydbg, the semantics would be very different
So MASM behavior is:
mov al, byte ptr 8
to:
mov al, 8
which is actually INCORRECT?
Best regards,
Robin.
MOV AL,8 is good enough
no need to use "byte ptr", as AL is a byte register, so the operand can only be a byte
Quote from: Astro on April 26, 2010, 01:14:49 AM
Hi,
Just want to check I understand this correctly:
mov doesn't always overwrite the upper part of a register?
If that is correct, is:
movzx ecx,eax
equivalent to:
xor ecx,ecx
mov ecx,eax
Best regards,
Robin.
movzx ecx, eax is not a valid instruction
the syntax for movzx (move with zero extend)
movzx r32, r/m16
movzx r32, r/m8
movzx r16, r/m8
i'm pretty sure you just made a typo and you meant movzx ecx, ax. if that's what you meant then you're right
xor ecx, ecx
mov cx, ax
would accomplish the same thing as
movzx ecx, ax
Quote from: Slugsnack on April 26, 2010, 10:37:55 AM
this is a subject that has been discussed in the past, how :
mov al, byte ptr 8
actually assembles to :
mov al, 8
personally i believe this can be regarded as a 'bug' on the part of the assembler. the first instruction is actually valid.
It is valid, but it is not interpreted the way:
mov al, byte ptr [8]
would be in, for example, Debug or CodeView.
Judging from multiple statements to this effect in the MASM documentation, the PTR operator is intended for specifying operand size. It is not intended for specifying direct memory operands.
Quote from: MichaelW on April 26, 2010, 01:57:22 PM
Quote from: Slugsnack on April 26, 2010, 10:37:55 AM
this is a subject that has been discussed in the past, how :
mov al, byte ptr 8
actually assembles to :
mov al, 8
personally i believe this can be regarded as a 'bug' on the part of the assembler. the first instruction is actually valid.
It is valid, but it is not interpreted the way:
mov al, byte ptr [8]
would be in, for example, Debug or CodeView.
Judging from multiple statements to this effect in the MASM documentation, the PTR operator is intended for specifying operand size. It is not intended for specifying direct memory operands.
yes, however :
mov al, [8]
is also incorrectly interpreted as :
mov al, 8
when surely it should be :
mov al, byte ptr ds:[8]
The MASM docs are clear in that:
mov eax,Buffer
is the same as:
mov eax,[Buffer]
The [ ] are not explicitly required (this caused me much confusion when I first started!).
The only time [ ] are explicitly required is when dealing with registers:
mov eax,ecx
is NOT the same as:
mov eax,[ecx]
Immediates are not affected, so:
mov eax,8
is the same as:
mov eax,[8]
Quote
when surely it should be :
mov al, byte ptr ds:[8]
I just tried this - you're correct.
[8] should become
ds:[8] when built, but it does not.
Best regards,
Robin.
welllll - that isn't quite true
try assembling this and see what code is generated...
mov al,8
mov al,[8] ;this one may generate an error message - if so, remove it
mov al,ds:[8]
Quote
mov al,[8] ;this one may generate an error message - if so, remove it
It does not generate an error. [8] becomes an immediate value, not a memory reference.
Best regards,
Robin.
here is what i get
00401014 B008 mov al,08
00401016 B008 mov al,08
00401018 A008000000 mov al,[00000008]
I love what it did with "mov al,byte ptr [512]"
Microsoft (R) Macro Assembler Version 6.15.8803 04/26/10 14:20:26
test4.asm Page 1 - 1
.386
.MODEL Flat
00000000 .DATA
00000000 01 foo db 1
00000000 .CODE
00000000 start:
00000000 B0 08 mov al,8
00000002 B0 08 mov al,[8]
00000004 B0 08 mov al,byte ptr [8]
00000006 A0 00000008 mov al,ds:[8]
0000000B B0 08 mov al,0[8]
0000000D B0 7F mov al,[127]
0000000F B0 80 mov al,[128]
00000011 B0 FF mov al,[255]
; mov al,[512] ; chokes
00000013 B0 00 mov al,byte ptr [512]
00000015 A0 00000000 R mov al,foo
END start
At least MASM 5.1 reports an error for both 512 references
Microsoft (R) Macro Assembler Version 5.10 4/26/10 14:32:34
Page 1-1
.386
0000 _data SEGMENT PARA PUBLIC 'DATA'
0000 01 foo db 1
0001 _data ENDS
0000 _text SEGMENT PARA USE32 PUBLIC 'CODE'
ASSUME CS:_text, DS:_data
0000 start:
0000 B0 08 mov al,8
0002 B0 08 mov al,[8]
0004 B0 08 mov al,byte ptr [8]
0006 A0 00000008 mov al,ds:[8]
000B B0 08 mov al,0[8]
000D B0 7F mov al,[127]
000F B0 80 mov al,[128]
0011 B0 FF mov al,[255]
0013 B0 00 mov al,[512] ; chokes
test5.asm(22): error A2050: Value out of range
0015 B0 00 mov al,byte ptr [512]
test5.asm(23): error A2050: Value out of range
0017 A0 00000000 R mov al,foo
001C B0 08 mov al,offset 8
001E _text ENDS
END start
that is interesting
it might be nice to know
but, i would hate to depend on it being a feature instead of a bug
otherwise, it could be handy with an equate
i suppose you could trust:
mov al,byte ptr SomeEquate and 255
the older assemblers would have spit out a syntax error, expecting the equate to find a 16-bit destination
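The B0 00 that 6.15 emits for byte ptr [512] looks consistent with the constant simply being cut down to its low byte (512 is 200h, so the low byte is 00h). That is a guess from the listing, not documented behaviour. A minimal C sketch of that truncation, and of the "and 255" masking, using made-up helper names:

```c
#include <stdint.h>

/* Low byte of a constant, i.e. the effect of "constant AND 255". */
uint8_t low_byte(uint32_t constant)
{
    return (uint8_t)(constant & 0xFFu);
}

/* Returns 1 when the constant fits in an 8-bit immediate without losing bits. */
int fits_in_byte(uint32_t constant)
{
    return constant <= 0xFFu;
}
```

Masking explicitly in the source, as in the "and 255" line above, makes the truncation visible instead of depending on how a given assembler version treats an out-of-range value.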
Quote from: Slugsnack on April 26, 2010, 05:22:46 PM
yes, however :
mov al, [8]
is also incorrectly interpreted as :
mov al, 8
Coming to MASM from Debug, this particular detail threw me too. The Direct Memory Operands section here (http://webster.cs.ucr.edu/Page_TechDocs/MASMDoc/ProgrammersGuide/Chap_03.htm) specifies how the index operator is used, and under Segment Override:
Quote
A segment name override or the segment override operator identifies the operand as an address expression.
. . .
As the example shows, a constant expression cannot be an address expression unless it has a segment override.
So you can get the behavior that you expect with:
mov al, ds:[8]
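To make the difference between the two encodings in the listings concrete, here is a small C sketch that builds the same byte sequences: B0 followed by the immediate for mov al,8, and A0 followed by a 32-bit little-endian address for mov al,ds:[8] in 32-bit code. The function names are invented for this example, and it covers only these two forms of MOV.

```c
#include <stddef.h>
#include <stdint.h>

/* Encodes "mov al, imm8": opcode B0 followed by the immediate byte. */
size_t encode_mov_al_imm8(uint8_t *out, uint8_t imm)
{
    out[0] = 0xB0;
    out[1] = imm;
    return 2;
}

/* Encodes "mov al, [addr]" for 32-bit code: opcode A0 followed by
   the 32-bit address, least significant byte first. */
size_t encode_mov_al_moffs32(uint8_t *out, uint32_t addr)
{
    out[0] = 0xA0;
    for (int i = 0; i < 4; i++)
        out[1 + i] = (uint8_t)(addr >> (8 * i));
    return 5;
}
```

So the immediate form is two bytes and never touches memory, while the direct memory form is five bytes and reads the byte at that address when it executes.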
MSVC 12.00 is at least consistent, though I've never really used the opcode in this form. I'm generally indexing from some base register, or referencing a data structure that MASM knows about.
#include <windows.h>
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char **argv)
{
    __asm
    {
        mov al,8
        mov al,[8]
        mov al,byte ptr [8]
        mov al,ds:[8]
        mov al,0[8]
        mov al,[127]
        mov al,[128]
        mov al,[255]
        mov al,[512]
        mov al,byte ptr [512]
    }
    return(0);
}
Disassembly
00000050 _main:
00000050 55 push ebp
00000051 8BEC mov ebp,esp
00000053 53 push ebx
00000054 56 push esi
00000055 57 push edi
00000056 B008 mov al,8
00000058 B008 mov al,8
0000005A B008 mov al,8
0000005C 3EA008000000 mov al,ds:[8]
00000062 B008 mov al,8
00000064 B07F mov al,7Fh
00000066 B080 mov al,80h
00000068 B0FF mov al,0FFh
0000006A B000 mov al,0
0000006C B000 mov al,0
0000006E 33C0 xor eax,eax
00000070 5F pop edi
00000071 5E pop esi
00000072 5B pop ebx
00000073 5D pop ebp
00000074 C3 ret
The general drift with historical Intel notation, which MASM more or less preserves, is that it is a fully specified language. This means you can still write it in much the same way as a CL.EXE asm dump does, where the data size is specified with every instruction.
While other tools use different notation, masm uses named variables which have corresponding addresses, so if you have a stack variable [ebp+16] that is named "var", placing square brackets around it is the same as writing [[ebp+16]], which is ambiguous as the x86 hardware has no mechanism for multiple levels of indirection. masm will allow mov eax, [ecx+edx*4][128], where the contents of the second pair of square brackets are ADDED to the address like any normal displacement, but with [named_variable] it just ignores the brackets and you get "named_variable".
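As a sketch of what "ADDED to the address" means for something like [ecx+edx*4][128], the address calculation can be modelled in C as below. The function name and parameters are made up for illustration; the arithmetic wraps at 32 bits the way a 32-bit address calculation does.

```c
#include <stdint.h>

/* Effective address of [base + index*scale + disp], wrapping at 32 bits.
   scale is 1, 2, 4 or 8 in a real x86 addressing mode. */
uint32_t effective_address(uint32_t base, uint32_t index,
                           uint32_t scale, uint32_t disp)
{
    return base + index * scale + disp;
}
```

With ecx = 1000h and edx = 3, the operand [ecx+edx*4][128] refers to address 1000h + 0Ch + 80h = 108Ch.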
Over time a shorthand has developed where you can omit the size specifier "BYTE PTR" and similar if the size can be determined from either of the operands by the assembler but where the size cannot be determined the full specification is required.
movzx eax, [ebp+16] ; goes bang because there is no way to determine the size of the data to be zero extended.
movzx eax, WORD PTR [ebp+16] ; removes the ambiguity.
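The reason the size has to be spelled out is that the same address gives different results depending on how many bytes are read. A minimal C model of the byte and word cases, with hypothetical helper names and the little-endian byte order written out explicitly:

```c
#include <stdint.h>

/* Models "movzx eax, BYTE PTR [mem]": reads one byte. */
uint32_t load_zx_byte(const uint8_t *mem)
{
    return (uint32_t)mem[0];
}

/* Models "movzx eax, WORD PTR [mem]": reads two bytes, low byte first. */
uint32_t load_zx_word(const uint8_t *mem)
{
    return (uint32_t)mem[0] | ((uint32_t)mem[1] << 8);
}
```

If memory holds the bytes 34h 12h, the byte form gives 00000034h and the word form gives 00001234h, so the assembler cannot pick one for you.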
oops... i had never zero extended an immediate value (since there really is no point), so i just accidentally typed it that way. I did test it before i posted and it worked :) but i agree it is a confusing way to write it when disassemblers and debuggers tend to read it differently.