Segment registers in Win32

mike2 · August 02, 2011, 12:00:25 PM

I have some source code like this:

proc1 proc
local arg1:dword

mov eax, [arg1]
mov eax, [arg1+4]
mov eax, [4+arg1]
mov eax, [arg1+edx]
mov eax, [arg1+edx+4]
mov eax, [arg1+4+edx]
mov eax, [edx+arg1+4]
mov eax, [edx+4+arg1]
mov eax, [4+edx+arg1]
mov eax, [4+arg1+edx]
mov eax, arg1[edx]
mov eax, arg1[edx+4]
mov eax, arg1[4+edx]
ret

proc1 endp

When I assemble the code with TASM or JWASM, I get

opcodes       disassembly
8B45FC        mov   eax,[ebp][-4]
8B4500        mov   eax,[ebp][0]
8B4500        mov   eax,[ebp][0]
8B4415FC      mov   eax,[ebp][edx][-4]
8B441500      mov   eax,[ebp][edx][0]
8B441500      mov   eax,[ebp][edx][0]
8B042A        mov   eax,[edx][ebp]
8B042A        mov   eax,[edx][ebp]
8B042A        mov   eax,[edx][ebp]
8B441500      mov   eax,[ebp][edx][0]
8B4415FC      mov   eax,[ebp][edx][-4]
8B441500      mov   eax,[ebp][edx][0]
8B441500      mov   eax,[ebp][edx][0]

But MASM will produce

opcodes       disassembly
8B45FC        mov   eax,[ebp][-4]
8B4500        mov   eax,[ebp][0]
8B4500        mov   eax,[ebp][0]
368B442AFC    mov   eax,ss:[edx][ebp][-4]
368B042A      mov   eax,ss:[edx][ebp]
368B042A      mov   eax,ss:[edx][ebp]
8B441500      mov   eax,[ebp][edx][0]
8B441500      mov   eax,[ebp][edx][0]
8B441500      mov   eax,[ebp][edx][0]
368B042A      mov   eax,ss:[edx][ebp]
368B442AFC    mov   eax,ss:[edx][ebp][-4]
368B042A      mov   eax,ss:[edx][ebp]
368B042A      mov   eax,ss:[edx][ebp]

I understand that "mov ax, [si+bp]" will use "ds:" as the default segment register while "mov ax, [bp+si]" will use "ss:" instead. But that's old 16-bit stuff and it normally shouldn't be a problem inside Win32 code. MASM clearly produces the larger code here. Is there a command-line switch or some other trick that makes MASM generate the smaller code?

clive · August 02, 2011, 02:04:15 PM

The 32-bit segment usage patterns are exactly the same as the 16-bit ones; the differences are masked in FLAT mode, where CS, DS, ES, and SS share similar selectors pointing to the same memory region. The 386+ doesn't have to operate in such a flat landscape, and system code might not.
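To make the rule concrete: in 32-bit addressing the default segment follows the base register of the encoding (SS when the base is EBP or ESP, DS otherwise), and the index register does not count. A minimal sketch, with the byte sequences taken from the listings earlier in this thread; the procedure name `demo` is made up for illustration, and the explicit `*1` is only there so the assembler should treat that register as the index:

```asm
.386
.model flat, stdcall
.code

demo proc                       ; hypothetical name, not from the original post
    ; base = EBP, index = EDX: the default segment is SS.
    ; The listings above show this form as 8B 44 15 00.
    mov eax, [ebp+edx*1]

    ; base = EDX, index = EBP: the default segment is DS.
    ; The listings above show this form as 8B 04 2A.
    mov eax, [edx+ebp*1]

    ; Same operand with the stack segment forced: this is where the
    ; 36h (SS:) prefix in the MASM output (36 8B 04 2A) comes from.
    mov eax, ss:[edx+ebp*1]
    ret
demo endp

end
```

In a flat Win32 process all three reach the same memory; only the encoding size differs.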

If you want it to drop the SS:, you can add an explicit DS:, because MASM is assuming that arg1 is on the stack frame. The DS: won't be encoded if DS is the natural default for the register order used. The assembler is already assuming SS=_DATA and DS=_DATA so you can't override it there.
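A sketch of that override, reusing proc1 and arg1 from the first post. Whether the prefix really disappears is an assumption based on the description above, so check it in a listing (for example with ml /c /Fl) before relying on it:

```asm
.386
.model flat, stdcall
.code

proc1 proc
    local arg1:dword

    ; As posted: the MASM listing in this thread shows 36 8B 44 2A FC,
    ; i.e. base = EDX, index = EBP, with an SS: prefix added because
    ; arg1 is a stack local.
    mov eax, [arg1+edx]

    ; With an explicit DS: the requested segment matches the default
    ; for base = EDX, so no prefix byte should be needed (assumption,
    ; not verified here).
    mov eax, ds:[arg1+edx]
    ret
proc1 endp

end
```

This is only safe where DS and SS map the same memory, which is the case in a flat Win32 process.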

We should probably also ignore the pretty bogus addressing scheme you are trying to implement here. If you are referencing based on the content of arg1, you should load it into a register.

mike2 · August 04, 2011, 04:41:32 PM

If the assembler is assuming DS=_DATA and SS=_DATA, why does it put the ss: prefix in front of the instruction?

What's bogus with the addressing scheme? Ok, using it the way I posted doesn't make much sense, but I can also write "local arg1[100h]:dword". Then accessing arg1+X does make sense again and MASM will still assemble the ss: prefix nearly everywhere.

dedndave · August 04, 2011, 06:11:28 PM

we cannot answer that without seeing your code....
