Odd or Even

Started by herge, October 13, 2008, 11:35:59 AM


herge


Hi jj2007:


There is no need to move the value into eax. Test value, 1 is sufficient.


Thank you jj2007 one less op-code.
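For anyone following along, here is a minimal sketch of the direct form, assuming the MASM32 SDK and its masm32rt.inc runtime include. The variable name and labels are made up for illustration. TEST only ANDs the operands and sets the flags, so the value never has to pass through a register.

```asm
    include \masm32\include\masm32rt.inc

    .data
    value dd 12345678          ; hypothetical test variable

    .code
start:
    test value, 1              ; AND the memory operand with 1, set flags, discard the result
    jz is_even                 ; ZF set means bit 0 is clear, so the value is even
    print "value is odd",13,10
    jmp done
is_even:
    print "value is even",13,10
done:
    inkey
    exit
end start
```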

Regards herge
// Herge born  Brussels, Belgium May 22, 1907
// Died March 3, 1983
// Cartoonist of Tintin and Snowy

jj2007

Quote from: hutch-- on October 18, 2008, 08:40:21 AM
It fails on "TEST IMMEDIATE, 00000000000000000000000000000001b"

You are perfectly right, Hutch. In these rare cases, the slow solution proposed here by zooba must be applied.

hutch--

 :bg

Apart from the use of the register working in all cases, the averages for a MOV/TEST are much better than TEST MEM, IMMEDIATE and the very slow alternative.
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

jj2007

Quote from: hutch-- on October 18, 2008, 11:23:51 AM
:bg

Apart from the use of the register working in all cases, the averages for a MOV/TEST are much better than TEST MEM, IMMEDIATE and the very slow alternative.

Much better?? I love hard facts, as you know (code attached):

0 cycles, eax odd
-2 cycles, VarOdd
0 cycles, VarEven

-1 cycles, mov eax, odd
0 cycles, mov eax, VarOdd
0 cycles, mov eax, VarEven
0 cycles, mov eax, immediate even
-1 cycles, mov eax, immediate odd

3 cycles, call IsOdd, odd
3 cycles, call IsOdd, VarOdd
3 cycles, call IsOdd, VarEven
3 cycles, call IsOdd, immediate even
3 cycles, call IsOdd, immediate odd

With regard to immediates, yes it's true that your jevn macro is the most flexible option:

jevn MACRO value, lbl
mov eax, value
test eax, 1
jz lbl
ENDM

.code
  mov eax, 123
  jevn eax, L1
  nop
L1:

  jevn 123, L2
  nop
L2:


However, in pseudocode the jevn 123, L2 can be expressed as:

.if 123==odd
  nop
.endif

An experienced coder (20 years of experience should be enough, I guess) might notice that the immediate value 123 is, in the great majority of cases, odd; therefore the snippet could be shortened as follows:

; .if 123==odd
  nop
; .endif

[attachment deleted by admin]

herge


Hi jj2007:

The results of your program on my computer.


0 cycles, eax odd
0 cycles, VarOdd
0 cycles, VarEven

0 cycles, mov eax, odd
0 cycles, mov eax, VarOdd
0 cycles, mov eax, VarEven
0 cycles, mov eax, immediate even
0 cycles, mov eax, immediate odd

2 cycles, call IsOdd, odd
2 cycles, call IsOdd, VarOdd
3 cycles, call IsOdd, VarEven
3 cycles, call IsOdd, immediate even
2 cycles, call IsOdd, immediate odd
Press any key to continue ...


Regards herge

hutch--

 :bg

It's surprising what you learn from manually optimising C compiler output. There is an old P1/PII optimisation still done by many C compilers where a value is transferred into a register first before performing the operation. Spend the time to remove the register transfer on the complete algo and you pick up a ZERO speed increase. The only reason why you bother is to free up registers.

Now the obvious is that the macro line,


mov eax, value



can be used in all 3 forms,


mov eax, REG
mov eax, MEM
mov eax, IMMEDIATE


which is why you bother to write the macro in that form. Unless the code is in the middle of a truly speed critical algorithm, it will do the job fine and easily as fast as a direct memory operation. The exceptions occur when you "may" get a better encoding by changing the entire instruction sequence.
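One possible middle ground, sketched here under assumptions rather than taken from the MASM32 macros, is to let the macro look at its operand with MASM's OPATTR operator and only special-case the immediate. The name jevn2, the variable var1 and the labels are hypothetical. Bit 2 of the OPATTR result is set when the operand is an immediate, in which case the parity can be decided while assembling.

```asm
    include \masm32\include\masm32rt.inc

jevn2 MACRO value, lbl
    IF (OPATTR(value)) AND 00000100b    ;; bit 2 set: operand is an immediate
      IF ((value) AND 1) EQ 0
        jmp lbl                         ;; even constant: always jump
      ENDIF                             ;; odd constant: emit nothing
    ELSE
      test value, 1                     ;; register or memory: test directly
      jz lbl
    ENDIF
ENDM

    .data
    var1 dd 124                ; hypothetical even variable

    .code
start:
    mov ebx, 123
    jevn2 ebx, L1              ; register operand
    print "ebx is odd",13,10
L1:
    jevn2 var1, L2             ; memory operand
    print "var1 is odd",13,10
L2:
    jevn2 122, L3              ; immediate operand, resolved at assembly time
    print "122 is odd",13,10
L3:
    inkey
    exit
end start
```

This keeps eax free in every case, at the cost of a slightly more involved macro. It does not cover relocatable immediates such as OFFSET expressions.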

herge,

Look at the timing values to see why the test is meaningless with that duration. The test needs to be run in an appropriate context for about 500 ms or longer to get stable results.

jj,

Quote
An experienced coder (20 years of experience should be enough, I guess) might notice that the immediate value 123 is, in the great majority of cases, odd; therefore the snippet could be shortened as follows:

Spot the diference.


    mov eax, 1111011b
    mov eax, 7Bh
    mov eax, 123

jj2007

Quote from: hutch-- on October 18, 2008, 02:53:03 PM
Look at the timing values to see why the test is meaningless with that duration. The test needs to be run in an appropriate context for about 500 ms or longer to get stable results.

I fleshed it out a little bit:
420 cycles, test xxx, 1
551 cycles, mov eax, xxx, test eax, 1

Quote
Spot the diference.


    mov eax, 1111011b
    mov eax, 7Bh
    mov eax, 123

Hmmm....? First, you don't use the spell-checker any more; second, they are all odd. Otherwise, no idea what you mean.

By the way, there is one situation where the mov eax, xxx is indeed needed:
  mov eax, [esp]
  test eax, 1
In all other cases, the mov eax just wastes a register and costs some space.

[attachment deleted by admin]

hutch--

 :bg

Here are results from 3 tests. The mov/test version is never slower and handles all cases; the optimised register-only version is more than twice as fast; and the test on a memory operand brings up the tail, being neither as flexible nor as fast as the pure register/immediate version.

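1109   ; 1st test: mov eax, immediate then test eax, 1 (ms)
454    ; 2nd test: test eax, 1 only (ms)
1109   ; 3rd test: test value, 1 on a memory operand (ms)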
Press any key to continue ...


; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
    include \masm32\include\masm32rt.inc
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

comment * -----------------------------------------------------
                        Build this  template with
                       "CONSOLE ASSEMBLE AND LINK"
        ----------------------------------------------------- *

    .code

start:
   
; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    call main
    inkey
    exit

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

main proc

    LOCAL value :DWORD

    invoke SetPriorityClass,rv(GetCurrentProcess),HIGH_PRIORITY_CLASS

    push esi

  ; ---------------------------------

    invoke GetTickCount
    push eax
    mov esi, 500000000

  @@:
    mov eax, 12345678
    test eax, 00000000000000000000000000000001b     ; test the end bit
    mov eax, 12345678
    test eax, 00000000000000000000000000000001b     ; test the end bit
    mov eax, 12345678
    test eax, 00000000000000000000000000000001b     ; test the end bit
    mov eax, 12345678
    test eax, 00000000000000000000000000000001b     ; test the end bit
    sub esi, 1
    jnz @B

    invoke GetTickCount
    pop ecx
    sub eax, ecx
    print str$(eax),13,10

  ; ---------------------------------
    invoke SleepEx,200,0
  ; ---------------------------------

    invoke GetTickCount
    push eax
    mov esi, 500000000

    mov eax, 12345678
  @@:
    test eax, 00000000000000000000000000000001b     ; test the end bit
    test eax, 00000000000000000000000000000001b     ; test the end bit
    test eax, 00000000000000000000000000000001b     ; test the end bit
    test eax, 00000000000000000000000000000001b     ; test the end bit
    sub esi, 1
    jnz @B

    invoke GetTickCount
    pop ecx
    sub eax, ecx
    print str$(eax),13,10

  ; ---------------------------------
    invoke SleepEx,200,0
  ; ---------------------------------

    invoke GetTickCount
    push eax
    mov esi, 500000000

    mov value, 12345678

  @@:
    test value, 00000000000000000000000000000001b     ; test the end bit
    test value, 00000000000000000000000000000001b     ; test the end bit
    test value, 00000000000000000000000000000001b     ; test the end bit
    test value, 00000000000000000000000000000001b     ; test the end bit
    sub esi, 1
    jnz @B

    invoke GetTickCount
    pop ecx
    sub eax, ecx
    print str$(eax),13,10

  ; ---------------------------------

    invoke SetPriorityClass,rv(GetCurrentProcess),NORMAL_PRIORITY_CLASS


    pop esi

    ret

main endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

end start

jj2007

1593
938
1563
1562    4)

4)
REPEAT 4
    mov eax, value
    test eax, 00000000000000000000000000000001b     ; test the end bit
ENDM


Difference on my Celeron M is not as dramatic, but it still confirms:
- test exx, 1 is a lot faster than the indirect version
- mov eax, value plus test eax, 1 is never faster than the direct test value, 1

So we can safely use the shortest version :green

herge

 Hi hutch:


704
437
860
Press any key to continue ...


Results of your program on my computer.

Regards herge

hutch--

 :bg

> So we can safely use the shortest version

If you don't mind it failing on the immediate. The idea of the macro was to be general purpose, which it is, and with no speed penalty. Manual coding options allow you to do different things in different contexts, but for general purpose, the reg load then TEST is more flexible and as fast.

hutch--

herge,

What processor are you using? Mine was done on my old PIV, my other two more or less agree, and I know JJ is using a Celeron.

jj2007

Quote from: hutch-- on October 18, 2008, 11:13:31 PM
:bg

> So we can safely use the shortest version

If you don't mind it failing on the immediate.

In general, an .if 123==odd doesn't make sense. An immediate is either odd or even, and you know it already at assembly time, not at runtime. The only exception would be self-modifying code :wink
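Since the parity of an immediate is fixed when the source is assembled, the check can be handed to the assembler. A minimal sketch using MASM conditional assembly, assuming the MASM32 runtime include; MYCONST is a hypothetical name, and only one of the two print lines ends up in the executable:

```asm
    include \masm32\include\masm32rt.inc

    MYCONST equ 123            ; hypothetical assembly-time constant

    .code
start:
  IF MYCONST AND 1             ; evaluated by the assembler, no runtime TEST
    print "MYCONST is odd",13,10
  ELSE
    print "MYCONST is even",13,10
  ENDIF
    inkey
    exit
end start
```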

Quote
for general purpose, the reg load then TEST is more flexible and as fast.

as fast, except if value is a register - a frequent case...

As you know, I am a great fan of macros, but in this case I vote for the (almost) plain assembler:

test RegOrMem, 1
.if zero?
   print "is even"
.else
  print "is odd"
.endif

herge

 Hi hutch:

I have a duo core.


 Processor: Intel(R) Core(TM)2 Duo CPU     E4600  @ 2.40GHz (2 CPUs)


Regards herge

Roger

Hi Herge,

Quote from: herge on October 18, 2008, 11:24:59 PM
  I have a duo core.

  Processor: Intel(R) Core(TM)2 Duo CPU     E4600  @ 2.40GHz (2 CPUs)


Does this mean that

           ebx is odd
           ebx is even


                          can both be true at the same time? ::)

Regards Roger