The MASM Forum Archive 2004 to 2012

General Forums => The Workshop => Topic started by: lonewolff on March 07, 2009, 03:09:09 AM

Title: What is more efficient? [Solved]
Post by: lonewolff on March 07, 2009, 03:09:09 AM
With 'if' situations what is more efficient?

This


cmp eax,0
je somewhere


or this type of thing?


.IF eax==NULL
do something...
.ENDIF


Or do they both get translated to the same thing (in the same amount of bytes)?
Title: Re: What is more efficient?
Post by: Mark Jones on March 07, 2009, 03:25:23 AM
Generally in this case, they both do the same thing. For general coding, use whichever feels right. However, if this is a speed- or size-critical compare, then other manual code should be used. (E.g., CMP eax,0 has to carry the literal 0 as an immediate operand in the code, whereas OR eax,eax or TEST eax,eax are smaller and at least as fast, but imply other caveats. One could even check the zero flag directly after an INC or DEC, so many possibilities exist.) See \masm32\help\opcodes.chm.
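
To make the size point concrete, here is a minimal sketch of the three zero checks side by side. The label is_zero is a name made up for this illustration, and the byte encodings in the comments are the ones MASM typically emits for these forms; check them in a listing file or a debugger before relying on them.

```asm
.386
.model flat,stdcall
option casemap:none

.code
start:
    cmp  eax, 0         ; typically 83 F8 00 (3 bytes, the 0 is an immediate byte)
    je   is_zero
    test eax, eax       ; typically 85 C0 (2 bytes, no immediate, no register write)
    jz   is_zero
    or   eax, eax       ; typically 0B C0 (2 bytes, but writes eax back to itself)
    jz   is_zero
is_zero:                ; hypothetical label, only here so the jumps have a target
    ret
end start
```

All three leave ZF set exactly when eax is zero, so the conditional jump that follows is the same in each case.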
Title: Re: What is more efficient?
Post by: donkey on March 07, 2009, 03:48:00 AM
test eax,eax
jnz somewhere

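There is no write operation as with OR, so latency and stalls can be avoided. Better still, if you have just performed arithmetic on a register, you can use jnz/jz directly, because the flags are already set. For example (someoffset and somecount are placeholders):

mov edi,someoffset
mov ecx,somecount   ; number of bytes to fill, must not be zero
@@:
mov [edi+ecx-1],al  ; MOV does not change the flags
dec ecx             ; DEC sets ZF when ecx reaches zero
jnz @B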

No need for a test at all.

If you are looking for speed, by all means avoid the high-level constructs. They are not optimized and will occasionally choose the worst way of comparing data, and they are also rife with unnecessary jumps.
Title: Re: What is more efficient?
Post by: lonewolff on March 07, 2009, 04:15:37 AM
Thanks for the replies guys.

I am a bit unclear how 'test eax,eax' works.

Wouldn't eax always be equal to eax? Or is something else going on here?
Title: Re: What is more efficient?
Post by: NightWare on March 07, 2009, 04:21:13 AM
 :P Here, with test reg,reg, you just set the flags according to the register's current value, and one of those flags is the zero flag... so why not use it?
Title: Re: What is more efficient?
Post by: donkey on March 07, 2009, 04:33:04 AM
Quote from: lonewolff on March 07, 2009, 04:15:37 AM
Thanks for the replies guys.

I am a bit unclear how 'test eax,eax' works.

Wouldn't eax always be equal to eax? Or is something else going on here?

Hi,

test eax, eax does a logical AND of the register against itself (without the write) and sets the flags. Since ZF will not be set if the register holds any value other than zero, it is a very fast check for FALSE.
Title: Re: What is more efficient?
Post by: lonewolff on March 07, 2009, 04:37:40 AM
If you logical AND something against itself, won't you always get a result of true? I must be missing something  :red
Title: Re: What is more efficient?
Post by: MichaelW on March 07, 2009, 04:51:04 AM
It's actually a bitwise AND.

0 and 0 = 0
Title: Re: What is more efficient?
Post by: BogdanOntanu on March 07, 2009, 04:57:21 AM
Quote from: lonewolff on March 07, 2009, 04:37:40 AM
If you logical AND something against itself, won't you always get a result of true? I must be missing something  :red

It is not a logical AND. Instead it is a bitwise AND.

IF all bits of the register are zero THEN  the ZERO flag will be set to 1.
IF any bit of the register is 1 THEN the ZERO flag will be set to 0.

In ASM one usually works with bitwise operations. The logical AND/OR/etc operations are the task of the programmer and a matter of convention.
Title: Re: What is more efficient?
Post by: lonewolff on March 07, 2009, 04:59:03 AM
Quote from: MichaelW on March 07, 2009, 04:51:04 AM
It's actually a bitwise AND.

0 and 0 = 0

Ah, yes. Got ya  :wink
Title: Re: What is more efficient?
Post by: lonewolff on March 07, 2009, 05:33:59 AM
Just trying a little test, but I am unsure how to use labels to jump from one part of the program to another. This is what I have, but I can't get it to compile.


.386
.model flat,stdcall
option casemap:none
include \masm32\include\windows.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\kernel32.lib
include \masm32\include\user32.inc
includelib \masm32\lib\user32.lib

.data
MsgBoxTextTrue db "true",0
MsgBoxTextFalse db "false",0
MsgBoxTitle db "Information",0

.code
start:
mov eax,1
test eax,eax
jnz testprog

invoke ExitProcess,NULL
end start

testprog:
invoke MessageBox,NULL,addr MsgBoxTextTrue,addr MsgBoxTitle,MB_ICONINFORMATION
end testprog

What I am trying to do is experiment with mov eax,1 and mov eax,0 to see how the program behaves. But I am not sure of the correct asm usage to jump to 'testprog'.

Title: Re: What is more efficient?
Post by: MichaelW on March 07, 2009, 05:50:55 AM
The END directive marks the end of the source and optionally sets the program entry point. So the first END directive is effectively ending the source, and everything below that point is being ignored, so:

jnz testprog

Is referencing a symbol that has not been defined.
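
As a rough sketch of what that means for the program above: keep a single END at the very bottom and let the labels sit between start: and that END. The label done is a name introduced here for illustration, and the "false" branch is only a guess at what the unused MsgBoxTextFalse string was meant for.

```asm
.386
.model flat,stdcall
option casemap:none
include \masm32\include\windows.inc
include \masm32\include\kernel32.inc
includelib \masm32\lib\kernel32.lib
include \masm32\include\user32.inc
includelib \masm32\lib\user32.lib

.data
MsgBoxTextTrue  db "true",0
MsgBoxTextFalse db "false",0
MsgBoxTitle     db "Information",0

.code
start:
    mov eax,1               ; change to 0 to take the other path
    test eax,eax            ; ZF = 1 only when eax is zero
    jnz testprog            ; nonzero: jump to the "true" message

    ; falls through to here when eax is zero
    invoke MessageBox,NULL,addr MsgBoxTextFalse,addr MsgBoxTitle,MB_ICONINFORMATION
    jmp done                ; skip over the "true" branch

testprog:
    invoke MessageBox,NULL,addr MsgBoxTextTrue,addr MsgBoxTitle,MB_ICONINFORMATION

done:                       ; hypothetical label: common exit point
    invoke ExitProcess,NULL
end start                   ; the only END: ends the source and names the entry point
```

With only one END at the bottom, testprog is defined inside the assembled source, so the jnz has something to resolve against.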

Title: Re: What is more efficient?
Post by: lonewolff on March 07, 2009, 06:01:12 AM
No problems :)

I have used 'ret' instead of end and all is good.

Thanks for your help  :U
Title: Re: What is more efficient?
Post by: MichaelW on March 07, 2009, 06:17:00 AM
For simple tests it's somewhat easier to use the print macro.

; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
    include \masm32\include\masm32rt.inc
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
    .data
    .code
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««

; ----------------------------------------------------------
; This proc displays the value of the Overflow, Direction,
; Interrupt Enable, Sign, Zero, Auxiliary Carry, Parity, and
; Carry flags, in a DEBUG-style format.
;
;   Flag          position  flag set flag clear
;   ----          --------  -------- ----------
; Overflow        (bit 11)     OV        NV
; Direction       (bit 10)     DN        UP
; Interrupt       (bit 9)      EI        DI
; Sign            (bit 7)      NG        PL
; Zero            (bit 6)      ZR        NZ
; Auxiliary Carry (bit 4)      AC        NA
; Parity          (bit 2)      PE        PO
; Carry           (bit 0)      CY        NC
; ----------------------------------------------------------

dumpflags proc
    pushfd
    pushad
    pushf
    pop   bx
    .IF bx & 1 SHL 11         ; Overflow (bit 11)
      print "OV "
    .ELSE
      print "NV "
    .ENDIF
    .IF bx & 1 SHL 10         ; Direction (bit 10)
      print "DN "
    .ELSE
      print "UP "
    .ENDIF
    .IF bx & 1 SHL 9          ; Interrupt (bit 9)
      print "EI "
    .ELSE
      print "DI "
    .ENDIF
    .IF bx & 1 SHL 7          ; Sign (bit 7)
      print "NG "
    .ELSE
      print "PL "
    .ENDIF
    .IF bx & 1 SHL 6          ; Zero (bit 6)
      print "ZR "
    .ELSE
      print "NZ "
    .ENDIF
    .IF bx & 1 SHL 4          ; Auxiliary Carry (bit 4)
      print "AC "
    .ELSE
      print "NA "
    .ENDIF
    .IF bx & 1 SHL 2          ; Parity (bit 2)
      print "PE "
    .ELSE
      print "PO "
    .ENDIF
    .IF bx & 1 SHL 0          ; Carry (bit 0)
      print "CY ",13,10
    .ELSE
      print "NC ",13,10
    .ENDIF
    popad
    popfd
    ret
dumpflags endp

; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
start:
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««

    ;-------------------------------------------------------------
    ; This code uses EBX instead of EAX because EBX, like ESI and
    ; EDI, will retain its value across the print statements.
    ;-------------------------------------------------------------

    mov ebx, 0
    print str$(ebx),13,10
    test ebx, ebx
    call dumpflags

    mov ebx, 1
    print str$(ebx),13,10
    test ebx, ebx
    call dumpflags

    inkey "Press any key to exit..."
    exit
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
end start


0
NV UP EI PL ZR NA PE NC
1
NV UP EI PL NZ NA PO NC


Title: Re: What is more efficient?
Post by: jj2007 on March 07, 2009, 08:19:53 AM
Here is a simple test case, together with the code you see in OllyDbg (http://www.ollydbg.de/version2.html). I have chosen a high level construct (".if Zero?") because I am lazy, but I am sure somebody will find a shorter and much faster solution :bg

There is a problem in the documentation: Opcodes.hlp and Opcodes.chm say test "performs a logical AND of the two operands". Well, that's wrong, as several others have stated already.

include \masm32\include\masm32rt.inc

.code
start:
mov eax, 4
mov ecx, 2
test eax, ecx
.if Zero?
nop
print "Is zero"
nop
.else
nop
print "Is not zero"
nop
.endif
nop
exit ; short form of invoke ExitProcess, 0
end start


CPU Disasm
Address    Hex dump          Command                             Comments
0040100D   │. B8 04000000    mov eax, 4
00401012   │. B9 02000000    mov ecx, 2
00401017   │. 85C1           test ecx, eax
00401019   │. 75 0E          jne short 00401029
0040101B   │. 90             nop
0040101C   │. 68 00204000    push offset TESTTEST.00402000       ; ┌Arg1 = ASCII "Is zero"
00401021   │. E8 1A000000    call 00401040                       ; └TESTTEST.00401040
00401026   │. 90             nop
00401027   │. EB 0C          jmp short 00401035
00401029   │> 90             nop
0040102A   │. 68 08204000    push offset TESTTEST.00402008       ; ┌Arg1 = ASCII "Is not zero"
0040102F   │. E8 0C000000    call 00401040                       ; └TESTTEST.00401040
00401034   │. 90             nop
00401035   │> 90             nop
00401036   │. 6A 00          push 0                              ; ┌ExitCode = 0
00401038   │. E8 B5000000    call <jmp.&kernel32.ExitProcess>    ; └KERNEL32.ExitProcess


EDIT: TESTTEST is the name I chose for this proggie. The nops have no specific meaning, but they help Olly to interpret this tiny snippet correctly.
Title: Re: What is more efficient?
Post by: jj2007 on March 07, 2009, 06:24:29 PM
Quote from: Mark Jones on March 06, 2009, 04:04:20 PM
Also, in a speed-critical application, you can give the conditional jumps a "branch hint prefix." Take a look at the macro being used here (not that that code itself was designed for speed):
http://www.masm32.com/board/index.php?topic=10919.msg80277#msg80277

I tested that, it makes absolutely no difference; the reason being that the default predictor is most of the time correct anyway. GCC and ICC have switched it off completely:

Branch hint prefixes (http://gcc.gnu.org/ml/gcc/2008-02/msg00634.html)
GCC has support for this feature, but it has turned out to not gain
anything and was disabled by default, since branch reordering streamlines
code well enough to match the default predictor behaviour.
Same conclusion was done by other compiler teams too, ICC is not
generating the hints either.
Title: Re: What is more efficient?
Post by: RuiLoureiro on March 07, 2009, 07:02:04 PM
Quote from: lonewolff on March 07, 2009, 04:37:40 AM
If you logical AND something against itself, won't you always get a result of true? I must be missing something  :red
Hi
        The logical AND of 2 bits: 0 AND 0 = 0; 0 AND 1 = 0; 1 AND 0 = 0; 1 AND 1 = 1

        If X is a variable with 2 bits (00, 01, 10 or 11), then ZF is set according to the result:

                      X    |    X    |  X AND X  |  ZF (Zero Flag)
                      -----+---------+-----------+----------------------
                      00   |   00    |    00     |  1  (X is zero)
                      01   |   01    |    01     |  0  (X is not zero)
                      10   |   10    |    10     |  0  (X is not zero)
                      11   |   11    |    11     |  0  (X is not zero)

        Note that the result is X: X AND X = X

        So if we «test X, X», then ZF=1 if X=0; otherwise ZF=0.

        In this way, «AND eax, eax» doesn't change eax because eax AND eax = eax,
        and «TEST ..., ...» doesn't change its operands because the result is discarded.
       
Rui
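
A small sketch of the AND versus TEST difference described above, for the case where the two operands differ. It assumes the MASM32 runtime (masm32rt.inc with the print and str$ macros used earlier in this thread); the label skip is made up for the example, and ebx is used because it survives the print calls.

```asm
include \masm32\include\masm32rt.inc

.code
start:
    mov ebx, 6              ; 0110b
    test ebx, 4             ; 0110b AND 0100b = 0100b, result discarded, ZF = 0
    jz  skip                ; not taken here, because bit 2 of ebx is set
    print "bit 2 is set, ebx = "
    print str$(ebx),13,10   ; TEST left ebx unchanged
skip:                       ; hypothetical label name
    and ebx, 4              ; same bitwise AND, but the result is written to ebx
    print "after AND, ebx = "
    print str$(ebx),13,10   ; ebx now holds only the masked bit
    exit
end start
```

TEST is the one to reach for when only the flags matter, and AND when the masked value itself is needed afterwards.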
Title: Re: What is more efficient?
Post by: lonewolff on March 11, 2009, 01:53:26 AM
Some great information.

This has been very helpful.

I am going to go with the asm style instead of using IF statements. So, over time I will be able to look at some pure assembly calls and understand what is going on.

Thanks again guys!  :U
Title: Re: What is more efficient?
Post by: NightWare on March 11, 2009, 02:12:15 AM
Once you are a bit more experienced, read Agner Fog's optimization PDF. You will see that it is sometimes possible to not jump at all (because jumps have a price...).
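
One common way to avoid the jump, shown here only as a sketch and not as an example taken from that PDF: let SETcc turn the flag into a 0/1 value. It assumes masm32rt.inc as in the earlier posts, and again uses ebx so the value survives the print call.

```asm
include \masm32\include\masm32rt.inc

.code
start:
    mov eax, 5              ; value to examine (try 0 as well)
    xor ebx, ebx            ; clear ebx first, because XOR itself changes the flags
    test eax, eax           ; ZF = 1 only if eax is zero
    setnz bl                ; bl = 1 if eax was nonzero, else 0, with no jump
    print str$(ebx),13,10   ; show the 0/1 result
    exit
end start
```

Whether this beats a well-predicted jump depends on the surrounding code, so it is worth timing both versions.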
Title: Re: What is more efficient?
Post by: lonewolff on March 11, 2009, 03:43:56 AM
Thanks for the reference. I have just downloaded it.

Looks pretty full on. Some light bedtime reading  :bg