

Which is faster?

Started by theunknownguy, November 03, 2010, 12:36:41 AM



Excuse me, I have a question. I have two methods for doing the same thing; let me show them:

        .If (OpSize == 1)
            Add [Eax], Edx                 ; 32-bit operand
        .ElseIf (Esi == 1)
            Add Word Ptr [Eax], Dx         ; 66 prefix: 16-bit operand
        .Else
            Add Byte Ptr [Eax], Dl         ; 8-bit operand
        .EndIf


OpSize == 1 means the operand is 32-bit.
If ESI == 1, the operand is a word (66 prefix).
If ESI == 0, the operand is a byte.

Now, I could use this other method instead:

        Mov Ebx, OpSize
        Imul Ebx, Ebx, 0FFFFFFh
        Imul Esi, Esi, 0FFh
        Rol Esi, 8
        Rol Ebx, 8
        Add Esi, 0FFh
        Add Ebx, Esi
        Add Dword Ptr [Eax], Edx
        And Dword Ptr [Eax], Ebx


OpSize == 1 (0FFFFFFFFh)
ESI == 1 (0FFFFh)
ESI == 0 (0FFh)

(Yes, there will never be OpSize == 1 and ESI == 1. Prefix 66 is ignored in this case)
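
A side note on the last two instructions of the second method: ANDing the destination dword with the mask also clears the bytes of [EAX] that lie outside the operand, which an emulated byte or word ADD would normally leave alone. A minimal sketch of a masked merge that keeps those bytes, assuming EBX already holds the mask computed above and that ECX is free as a scratch register (ECX is picked only for this example):

```asm
; Sketch only: replaces the final ADD/AND pair of the second method.
; In:  EAX = destination address, EDX = source value,
;      EBX = size mask (0FFh, 0FFFFh or 0FFFFFFFFh) as computed above.
; ECX is a scratch register picked for this example (an assumption).
        mov     ecx, [eax]      ; ecx = old destination dword
        add     edx, ecx        ; edx = old + source (full 32-bit sum)
        xor     edx, ecx        ; edx = bits where the sum differs from old
        and     edx, ebx        ; keep only the differences inside the operand
        xor     [eax], edx      ; operand bytes = sum, other bytes = old
```

The low bytes of the 32-bit sum match the narrow sum because carries only travel upward. Two caveats: the flags left behind are not those of the narrow ADD, and the code always touches four bytes at [EAX], which may matter when the operand sits at the very end of a readable region.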

So which is faster?

I know the IMUL and ROL operations cost some clocks, but avoiding the branches, CMPs, and partial-register operations would speed things up. *PROBABLY*.

Thanks  :U


as you have them written, the if-then-else is probably faster
that doesn't mean it's the fastest way   :bg
do EAX, EDX or ESI have to be preserved ?
what are the other possible values of ESI ? (if it is not=1)
in fact, tell us what all the values might be - lol


Quote from: dedndave on November 03, 2010, 12:57:23 AM
as you have them written, the if-then-else is probably faster
that doesn't mean it's the fastest way   :bg
do EAX, EDX or ESI have to be preserved ?
what are the other possible values of ESI ? (if it is not=1)

EAX == Holds the destination of the emulated operation
EDX == Holds the source of the emulated operation
ESI  == Holds the prefix flag of the emulated operation

Of course I can use other registers, but why do you ask?

ESI can only be 1 or 0 (66 prefix found or not).


Add Dword ptr ds:[Eax], 10  (To be emulated)

EAX == Addr of [EAX]
EDX == 10
ESI == 0


give us some values
what is the range of values of [EAX] and EDX


for ESI, you could use

        dec     esi
now, it is either all 0's or all 1's   :U
DEC ESI is a single byte opcode and is quite fast


Quote from: dedndave on November 03, 2010, 01:03:41 AM
give us some values
what is the range of values of [EAX] and EDX

That depends, because I am writing an opcode emulator (Intel architecture).

So I analyze another process and get the opcode; usually I check whether the address is valid. Another example:

Add Word ptr ds:[1000000h], 1000

EAX == [1000000h]
EDX == 1000
ESI  == 1

So I don't actually know what is inside 1000000h; I just perform the operation.


IMO, the conditional moves are your best way to optimize things; off the top of my head, something like this (I'm doing this quick and I'm full of gin, so it's a suggestion only):

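movzx ebx, dl       ; byte-sized source, zero-extended
movzx ecx, dx       ; word-sized source, zero-extended
cmp esi,1
cmovnz ecx, ebx     ; no 66 prefix: fall back to the byte
cmp OpSize,1
cmovz ecx, edx      ; 32-bit operand: use the full dword
add [eax], ecx      ; still a 32-bit add, so a carry can spill past the operand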

On the large scale, the unnecessary moves are worth not having the branch mispredictions. Take it as you will, I'm sure there are mistakes.
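
One hedged variation on the same idea: let the conditional moves pick a size mask instead of the source value, so that a full 32-bit result can be trimmed to the operand width afterwards. This is only a sketch; it assumes OpSize is a DWORD variable holding 0 or 1, ESI is 0 or 1 as described in the thread, and EBX is free as a scratch register.

```asm
; Sketch only; CMOVcc needs a ".686" (or later) processor directive in MASM.
; In:  ESI = 66h prefix flag (0 or 1), OpSize = DWORD variable (0 or 1).
; Out: ECX = size mask. EBX is used as scratch (an assumption).
        mov     ecx, 0FFh           ; default: byte operand
        mov     ebx, 0FFFFh
        cmp     esi, 1
        cmove   ecx, ebx            ; 66h prefix present -> word mask
        mov     ebx, 0FFFFFFFFh
        cmp     OpSize, 1
        cmove   ecx, ebx            ; 32-bit operand -> dword mask
```

CMOVcc has no immediate form, which is why each candidate mask is loaded into a register first; MOV leaves the flags alone, so the loads could equally sit between the CMP and the CMOVE.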



Quote from: dedndave on November 03, 2010, 01:05:28 AM
for ESI, you could use

        dec     esi
now, it is either all 0's or all 1's   :U
DEC ESI is a single byte opcode and is quite fast

If ESI == 0, yes, SUB ESI, 1 (instead of DEC ESI; I think SUB is faster) is a great idea, but:

When ESI == 1 I need it to become 0FFFFh, because it represents the WORD prefix (66).

Also, I need this calculation to work with the EBX register, since:

EBX == 1: I need it to be 0FFFFFFFFh (it represents a 32-bit register operation).
EBX == 0: then I take the ESI value into consideration.

All of this is because I want to avoid branches, CMPs, and partial-register emulation, so I can treat everything as a DWORD and finally AND it with the actual operation size.

Hope I haven't confused you guys >.<

PS: Here is a more detailed explanation, since I think I am confusing everyone:

Opcodes have an operation-size bit, for example:


OpcodeNumber: 00


OpcodeNumber: 01

OK, so the WORD form, at least in the Win32 environment I am trying to emulate, needs to have the 66 prefix:


PrefixOpcode: 66
OpcodeNumber: 01

You guys know all about this, so I'll get to the main point:

ESI == Prefix flag (1 if the 66 prefix is present, 0 if not)
EBX == Operand size (1 if 32-bit, 0 if 8-bit)
EAX == Destination of the emulated operation
EDX == Source of the emulated operation

So instead of using CMPs to execute the ADD instruction (or many others) on the matching partial register, I wanted to avoid the branches and CMPs, do everything as a 32-bit operation, and finally AND the result with the real size of the operation.
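
Since the two flags above only have a handful of combinations, the mask could also come from a small lookup table rather than from arithmetic. The sketch below is an alternative, not something proposed in the thread; the table name SizeMask and the use of ECX are assumptions made for this example.

```asm
.data
; Hypothetical table for this example, indexed by EBX*2 + ESI.
; Entry 3 (both flags set) should not occur; it is filled in as dword.
SizeMask    dd 0FFh, 0FFFFh, 0FFFFFFFFh, 0FFFFFFFFh

.code
; In:  EBX = 1 if 32-bit else 0, ESI = 1 if 66h prefix else 0.
; Out: ECX = mask covering the operand's bytes.
        lea     ecx, [esi + ebx*2]      ; index 0..3
        mov     ecx, SizeMask[ecx*4]    ; fetch the mask
```

This trades the IMUL/ROL arithmetic for one memory load; whether that is actually quicker would have to be measured on the target CPU.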


Quote from: redskull on November 03, 2010, 01:08:46 AM
IMO, the conditional moves are your best way to optimize things; off the top of my head, something like this (I'm doing this quick and I'm full of gin, so it's a suggestion only):

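movzx ebx, dl       ; byte-sized source, zero-extended
movzx ecx, dx       ; word-sized source, zero-extended
cmp esi,1
cmovnz ecx, ebx     ; no 66 prefix: fall back to the byte
cmp OpSize,1
cmovz ecx, edx      ; 32-bit operand: use the full dword
add [eax], ecx      ; still a 32-bit add, so a carry can spill past the operand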

On the large scale, the unnecessary moves are worth not having the branch mispredictions. Take it as you will, I'm sure there are mistakes.


Thanks redskull, I think conditional moves will work for this, but I thought they were slower than arithmetic operations like IMUL or ROL... I'll give it a try   :wink


       dec     esi
       not     esi
       or      esi,0FFh
       and     edx,esi
       add     [eax],dx


Quote from: dedndave on November 03, 2010, 01:33:21 AM
       dec     esi
       not     esi
       or      esi,0FFh
       and     edx,esi
       add     [eax],dx

ESI == 0 (0FFh)

But it needs to be:

ESI == 1 (0FFFFh) (WORD)


the upper word of EDX isn't used
however, there might be a problem if [EAX] + DL generates a carry into the second byte


Quote from: theunknownguy on November 03, 2010, 01:41:40 AM
ESI == 0 (0FFh)

But it needs to be:

ESI == 1 (0FFFFh) (WORD)

Strange rules, anyway  :green2


Quote from: dedndave on November 03, 2010, 01:44:53 AM
the upper word of EDX isn't used
however, there might be a problem if [EAX] + DL generates a carry into the second byte

What happens, then, when EDX holds a 32-bit source?


Add Dword Ptr ds:[Eax], 10000000 (Trying to emulate)

EDX == 10000000

Also, the idea was to avoid partial-register emulation, since using 32-bit operations is faster than using 16-bit or 8-bit ones...


Quote from: Antariy on November 03, 2010, 01:45:59 AM
Quote from: theunknownguy on November 03, 2010, 01:41:40 AM
ESI == 0 (0FFh)

But it needs to be:

ESI == 1 (0FFFFh) (WORD)

Strange rules, anyway  :green2

Oh, never mind; the example that you deleted gives me:

ESI == 1  (0FFFFh) (OK, cool)
ESI == 0 (0) (wrong, I need it to be 0FFh)
