Which is faster?

Started by theunknownguy, November 03, 2010, 12:36:41 AM

theunknownguy

Excuse me, I have a question. I have two methods for doing the same thing; let me show them:

        .If (OpSize == 1)
            Add [Eax], Edx
        .Else
            .If (Esi == 1)
                Add Word Ptr [Eax], Dx
            .Else
                Add Byte Ptr [Eax], Dl
            .EndIf
        .EndIf


Where:

OpSize == 1 means the opcode operates on 32 bits (dword).
If ESI == 1, the opcode is word-sized (66 prefix).
If ESI == 0, the opcode is byte-sized.
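
For reference, the branching version can be modelled in C. This is only a sketch: `emu_add_branchy`, `op_size` and `prefix66` are made-up names standing in for the emulator routine, OpSize and ESI, `dest` plays the role of [EAX] and `src` of EDX. Flags are not modelled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the .If/.Else version.
   op_size  ~ OpSize (1 = 32-bit operation)
   prefix66 ~ ESI    (1 = 66 prefix seen, i.e. word operation) */
void emu_add_branchy(uint8_t *dest, uint32_t src, int op_size, int prefix66)
{
    assert(prefix66 == 0 || prefix66 == 1);
    if (op_size == 1) {                 /* Add [Eax], Edx */
        uint32_t v;
        memcpy(&v, dest, sizeof v);
        v += src;
        memcpy(dest, &v, sizeof v);
    } else if (prefix66 == 1) {         /* Add Word Ptr [Eax], Dx */
        uint16_t v;
        memcpy(&v, dest, sizeof v);
        v = (uint16_t)(v + (uint16_t)src);
        memcpy(dest, &v, sizeof v);
    } else {                            /* Add Byte Ptr [Eax], Dl */
        dest[0] = (uint8_t)(dest[0] + (uint8_t)src);
    }
}
```

Each path touches only as many bytes as the operand size, which is the behaviour any branch-free replacement has to reproduce.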


Now, I can use this other method instead:


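        Mov  Ebx, OpSize
        Imul Ebx, Ebx, 0FFFFFFh     ; OpSize == 1 -> 00FFFFFFh, else 0
        Imul Esi, Esi, 0FFh         ; ESI == 1 -> 0FFh, else 0
        Rol  Esi, 8                 ; 0FF00h or 0
        Rol  Ebx, 8                 ; 0FFFFFF00h or 0
        Add  Esi, 0FFh              ; 0FFFFh (word) or 0FFh (byte)
        Add  Ebx, Esi               ; EBX = size mask
        Add  Dword Ptr [Eax], Edx   ; always add as a dword
        And  Dword Ptr [Eax], Ebx   ; then mask to the operand size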


Where the resulting mask in EBX is:

OpSize == 1 (0FFFFFFFFh)
ESI == 1 (0FFFFh)
ESI == 0 (0FFh)


(Yes, OpSize == 1 and ESI == 1 never occur together; the 66 prefix is ignored in that case.)
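
The mask arithmetic can be checked with a small C model. This is a sketch under assumptions: `size_mask` and `masked_add` are hypothetical names, `op_size` stands for OpSize and `prefix66` for ESI. One thing the model makes visible: an AND applied directly to the memory dword zeroes the bytes outside the mask, whereas a real byte or word ADD leaves them untouched, so `masked_add` merges the untouched bytes back in.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the IMUL/ROL/ADD sequence. ROL by 8 equals a left shift here
   because the top byte of each product is zero. */
uint32_t size_mask(uint32_t op_size, uint32_t prefix66)
{
    assert(op_size <= 1 && prefix66 <= 1 && !(op_size && prefix66));
    uint32_t ebx = (op_size * 0x00FFFFFFu) << 8;       /* 0FFFFFF00h or 0 */
    uint32_t esi = ((prefix66 * 0xFFu) << 8) + 0xFFu;  /* 0FFFFh or 0FFh  */
    return ebx + esi;
}

/* Adds at the masked width and keeps the bytes outside the mask.
   The low bits of a sum depend only on the low bits of the operands,
   so (old + src) & mask is the width-limited result. */
uint32_t masked_add(uint32_t old, uint32_t src, uint32_t mask)
{
    return (old & ~mask) | ((old + src) & mask);
}
```

For example, with a destination dword of 123456FFh, a source of 1 and the byte mask, `masked_add` yields 12345600h, while ADD followed by AND would leave 00000000h in memory. Flags are not modelled here.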

So which is faster?

I know the IMUL and ROL operations cost some clocks, but avoiding the branches, CMPs, and partial-register operations would speed things up. *PROBABLY*.

Thanks  :U





dedndave

as you have them written, the if-then-else is probably faster
that doesn't mean it's the fastest way   :bg
do EAX, EDX or ESI have to be preserved ?
what are the other possible values of ESI ? (if it is not=1)
in fact, tell us what all the values might be - lol

theunknownguy

Quote from: dedndave on November 03, 2010, 12:57:23 AM
as you have them written, the if-then-else is probably faster
that doesn't mean it's the fastest way   :bg
do EAX, EDX or ESI have to be preserved ?
what are the other possible values of ESI ? (if it is not=1)

EAX == holds the destination of the emulated operation
EDX == holds the source of the emulated operation
ESI  == holds the prefix flag of the emulated operation

Of course I can use other registers, but why do you ask?

ESI can only be 1 or 0 (66 prefix found or not).

Example:

Add Dword ptr ds:[Eax], 10  (To be emulated)

EAX == address referenced by [EAX]
EDX == 10
ESI == 0



dedndave

give us some values
what is the range of values of [EAX] and EDX

dedndave

for ESI, you could use

        dec     esi
now, it is either all 0's or all 1's   :U
DEC ESI is a single byte opcode and is quite fast
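
A tiny C model of the DEC trick, assuming ESI only ever holds 0 or 1 (`dec_flag` is a made-up name). Note the polarity: a flag of 0 becomes all ones and a flag of 1 becomes zero.

```c
#include <assert.h>
#include <stdint.h>

/* Models DEC ESI on a 0/1 flag using unsigned wrap-around. */
uint32_t dec_flag(uint32_t esi)
{
    assert(esi <= 1);
    return esi - 1u;    /* 1 -> 00000000h, 0 -> 0FFFFFFFFh */
}
```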

theunknownguy

Quote from: dedndave on November 03, 2010, 01:03:41 AM
give us some values
what is the range of values of [EAX] and EDX

That depends, because I am writing an opcode emulator (Intel architecture).

So I analyze another process and fetch the opcode; usually I check whether the address is valid. Another example:

Add Word ptr ds:[1000000h], 1000

EAX == [1000000h]
EDX == 1000
ESI  == 1

So I don't actually know what's inside 1000000h; I just perform the operation.

redskull

IMO, the conditional moves are your best way to optimize things; off the top of my head, something like this (I'm doing this quick and I'm full of gin, so it's a suggestion only):

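movzx ebx, dl       ; byte operand, zero-extended
movzx ecx, dx       ; word operand, zero-extended
cmp esi,1
cmovnz ecx, ebx     ; no 66 prefix: use the byte operand
cmp OpSize,1
cmovz ecx, edx      ; 32-bit operation: use the full dword
add [eax], ecx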


On the large scale, the unnecessary moves are worth not having the branch mispredictions. Take it as you will, i'm sure there are mistakes.
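
The selection idea can be modelled in C as a sketch. `select_operand`, `op_size` and `prefix66` are hypothetical names for the routine, OpSize and ESI; in this model the byte operand is chosen when the 66 flag is not set. A compiler may or may not turn the ternaries into CMOV, so this illustrates the logic only, not the timing.

```c
#include <assert.h>
#include <stdint.h>

/* Picks the source operand at the right width without an if/else chain. */
uint32_t select_operand(uint32_t edx, int op_size, int prefix66)
{
    assert(prefix66 == 0 || prefix66 == 1);
    uint32_t byte_op = edx & 0xFFu;                  /* DL zero-extended */
    uint32_t ecx     = edx & 0xFFFFu;                /* DX zero-extended */
    ecx = (prefix66 != 1) ? byte_op : ecx;           /* byte unless 66 prefix */
    ecx = (op_size == 1)  ? edx     : ecx;           /* dword wins if OpSize */
    return ecx;
}
```

Narrowing the source alone does not narrow the destination: a dword add of the selected value can still carry into the upper bytes of the destination.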

-r
Strange women, lying in ponds, distributing swords, is no basis for a system of government

theunknownguy

Quote from: dedndave on November 03, 2010, 01:05:28 AM
for ESI, you could use

        dec     esi
now, it is either all 0's or all 1's   :U
DEC ESI is a single byte opcode and is quite fast

If ESI == 0, yes, sub esi, 1 (instead of dec esi; I think sub is faster) is a great idea, but:

If ESI == 1, I need it to become 0FFFFh, because it represents the WORD prefix (66).

I also need this calculation to work with the EBX register, since:

If EBX == 1, I need it to become 0FFFFFFFFh (it represents a 32-bit register operation).
If EBX == 0, then I take the ESI value into consideration.

All of this is because I want to avoid branches, CMPs and partial-register emulation, so I can treat everything as a DWORD and finally AND it with the actual operation size.

Hope I haven't confused you guys >.<

PS: Here is a more detailed explanation, since I think I am confusing everyone:

Opcodes have an operand-size bit, for example:


ADD BYTE PTR DS:[EAX],DL

OpcodeNumber: 00

ADD DWORD PTR DS:[EAX],EDX

OpcodeNumber: 01


OK, so the WORD form, at least in the Win32 environment I am trying to emulate, needs the 66 prefix:


ADD WORD PTR DS:[EAX],DX

PrefixOpcode: 66
OpcodeNumber: 01


You guys know all about this, so I'll get to the main point:

ESI == PrefixOpcode (1 if 66 is present, 0 if not)
EBX == operand size (1 if 32-bit, 0 if 8-bit)
EAX == destination of the emulated operation
EDX == source of the emulated operation


So instead of doing CMPs to execute the ADD instruction (or many others) on the partial registers, I wanted to avoid the branches and CMPs, do everything as a 32-bit operation, and finally AND the result with the real size of the operation.
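
The encoding described above can be sketched in C. For opcodes such as 00/01, bit 0 of the opcode selects byte versus full-size operands, and in 32-bit code a 66 prefix switches the full size from 32 to 16 bits. `operand_bytes` and `size_to_mask` are made-up helper names, not part of the emulator in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Operand size in bytes, from the opcode's low bit and the 66 prefix. */
unsigned operand_bytes(uint8_t opcode, int has_prefix66)
{
    if ((opcode & 1u) == 0)
        return 1;                   /* e.g. 00: ADD BYTE PTR [EAX], DL */
    return has_prefix66 ? 2 : 4;    /* 66 01: word, 01: dword */
}

/* Converts a size of 1, 2 or 4 bytes into the matching AND mask. */
uint32_t size_to_mask(unsigned bytes)
{
    assert(bytes == 1 || bytes == 2 || bytes == 4);
    return 0xFFFFFFFFu >> (32u - 8u * bytes);
}
```

With these, `size_to_mask(operand_bytes(0x01, 1))` gives 0FFFFh, the word mask wanted for the 66-prefixed form.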






theunknownguy

Quote from: redskull on November 03, 2010, 01:08:46 AM
IMO, the conditional moves are your best way to optimize things; off the top of my head, something like this (I'm doing this quick and i'm full of gin, so its a suggestion only):

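movzx ebx, dl       ; byte operand, zero-extended
movzx ecx, dx       ; word operand, zero-extended
cmp esi,1
cmovnz ecx, ebx     ; no 66 prefix: use the byte operand
cmp OpSize,1
cmovz ecx, edx      ; 32-bit operation: use the full dword
add [eax], ecx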


On the large scale, the unnecessary moves are worth not having the branch mispredictions. Take it as you will, i'm sure there are mistakes.

-r

Thanks redskull, I think conditional moves will work for this, but I thought they were slower than arithmetic operations like IMUL or ROL... I'll give it a try   :wink

dedndave

       dec     esi
       not     esi
       or      esi,0FFh
       and     edx,esi
       add     [eax],dx
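
A C model of the five instructions above, assuming ESI is 0 or 1. `operand_mask` and `add_word` are hypothetical names, and `mem` stands for the word at [EAX].

```c
#include <assert.h>
#include <stdint.h>

/* dec / not / or: turns the 0/1 flag into a source-operand mask. */
uint32_t operand_mask(uint32_t esi)
{
    assert(esi <= 1);
    esi -= 1u;          /* dec esi      : 1 -> 0, 0 -> 0FFFFFFFFh */
    esi = ~esi;         /* not esi      : 1 -> 0FFFFFFFFh, 0 -> 0 */
    esi |= 0xFFu;       /* or  esi,0FFh : 1 -> 0FFFFFFFFh, 0 -> 0FFh */
    return esi;
}

/* and edx,esi followed by a word-sized add to memory. */
uint16_t add_word(uint16_t mem, uint32_t edx, uint32_t esi)
{
    edx &= operand_mask(esi);                   /* and edx,esi   */
    return (uint16_t)(mem + (uint16_t)edx);     /* add [eax],dx  */
}
```

The mask narrows the source to DL when the flag is 0, but the add itself is always word-sized.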

theunknownguy

Quote from: dedndave on November 03, 2010, 01:33:21 AM
       dec     esi
       not     esi
       or      esi,0FFh
       and     edx,esi
       add     [eax],dx


ESI == 0 (0FFh)
ESI == 1 (0FFFFFFFFh)

But it needs to be:

ESI == 1 (0FFFFh) (WORD)



dedndave

nahhhh
the upper word of EDX isn't used
however, there might be a problem if [EAX] + DL generates a carry into the second byte
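
The carry problem can be shown with two small C functions (names are made up for illustration). A true byte add changes only the low byte; a word add of the zero-extended byte lets a carry out of bit 7 reach the second byte.

```c
#include <assert.h>
#include <stdint.h>

/* Byte add as ADD BYTE PTR performs it: the high byte is untouched. */
uint16_t true_byte_add(uint16_t mem, uint8_t dl)
{
    uint8_t low = (uint8_t)((mem & 0xFFu) + dl);
    return (uint16_t)((mem & 0xFF00u) | low);
}

/* Word add of a zero-extended byte: the carry propagates into byte 1. */
uint16_t word_add_of_byte(uint16_t mem, uint8_t dl)
{
    return (uint16_t)(mem + dl);
}
```

With 12FFh in memory and DL = 1, the first gives 1200h and the second 1300h; when no carry leaves the low byte, both agree.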

Antariy

Quote from: theunknownguy on November 03, 2010, 01:41:40 AM
ESI == 0 (0FFh)
ESI == 1 (0FFFFFFFFh)

But it needs to be:

ESI == 1 (0FFFFh) (WORD)

Strange rules, anyway  :green2

theunknownguy

Quote from: dedndave on November 03, 2010, 01:44:53 AM
nahhhh
the upper word of EDX isn't used
however, there might be a problem if [EAX] + DL generates a carry into the second byte

What happens, then, when EDX holds a 32-bit source?

Example:

Add Dword Ptr ds:[Eax], 10000000 (Trying to emulate)

EDX == 10000000


Also, the idea was to avoid partial-register emulation, since using 32 bits is faster than using 16 or 8...





theunknownguy

Quote from: Antariy on November 03, 2010, 01:45:59 AM
Quote from: theunknownguy on November 03, 2010, 01:41:40 AM
ESI == 0 (0FFh)
ESI == 1 (0FFFFFFFFh)

But it needs to be:

ESI == 1 (0FFFFh) (WORD)

Strange rules, anyway  :green2

Oh, never mind; the example you deleted gives me:

ESI == 1  (0FFFFh) (ok cool)
ESI == 0 (0) (wrong, I need it to be 0FFh)

LoL...
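
One possible way to get all three masks without branches or multiplies is sketched below in C. It is not taken from the thread and makes no claim about speed; `mask_from_flags`, `op_size` and `prefix66` are hypothetical names for the routine, OpSize and ESI. Negating a 0/1 flag gives zero or all ones (what NEG does), and AND/OR pick out the byte lanes.

```c
#include <assert.h>
#include <stdint.h>

/* 0FFh for byte, 0FFFFh for word (66 prefix), 0FFFFFFFFh for dword. */
uint32_t mask_from_flags(uint32_t op_size, uint32_t prefix66)
{
    assert(op_size <= 1 && prefix66 <= 1);
    uint32_t word_bits  = (0u - prefix66) & 0x0000FF00u;  /* neg, and */
    uint32_t dword_bits = (0u - op_size)  & 0xFFFFFF00u;  /* neg, and */
    return 0xFFu | word_bits | dword_bits;                /* or, or   */
}
```

Because the lanes are combined with OR rather than ADD, the result stays 0FFFFFFFFh even if both flags were set.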