The MASM Forum Archive 2004 to 2012

General Forums => The Campus => Topic started by: frktons on August 27, 2010, 09:40:15 AM

Title: Testing an odd number
Post by: frktons on August 27, 2010, 09:40:15 AM
I've to test if an integer number is odd or even.
I've thought about a couple of solutions:

1. shift right by 1 position the number MOVed into a register and test the carry flag [I think at least]
2. Testing the 0 bit of the register into which I moved the number to test, and  acting accordingly.

I'd like to ask:

1. What is the fastest method, and if there is a better one than the 2 methods I'm thinking about?
2. What is the syntax of both these methods if I want to print "The number is ODD" or "The number is EVEN"?

Thanks for your help
Title: Re: Testing an odd number
Post by: bomz on August 27, 2010, 09:53:20 AM
and eax, 1
cmp eax, 1   
jne NotEven
Title: Re: Testing an odd number
Post by: frktons on August 27, 2010, 09:59:58 AM
Quote from: bomz on August 27, 2010, 09:53:20 AM
and eax, 1
cmp eax, 1   
jne NotEven

Thanks bomz.

What about this solution:

;-------------------------------------------------------------------------------------------------------------------------
; Test if a number is ODD or EVEN
; Date: 27 august 2010
;-------------------------------------------------------------------------------------------------------------------------


INCLUDE \masm32\include\masm32rt.inc

;------------------------------------------------------------------------------------------------------------------------------

.DATA

number    db    12


;------------------------------------------------------------------------------------------------------------------------------


.CODE


start:

    xor eax,eax

    mov al, number

    shr eax, 1

    jc  isodd

    print "The number is EVEN",13,10

    inkey

    jmp end_test

isodd:

    print "The number is ODD",13,10

    inkey


end_test:


    ret

end start


is it slower? or what else?
Title: Re: Testing an odd number
Post by: bomz on August 27, 2010, 10:00:28 AM
and eax, 1
jz ODD
Title: Re: Testing an odd number
Post by: frktons on August 27, 2010, 10:07:52 AM
Quote from: bomz on August 27, 2010, 10:00:28 AM
and eax, 1
jz ODD

This is probably faster than the previous one.  :U

I'm looking at the BT mnemonic to see how it can be used.


Maybe something like:


BT number, 0
jc  ODD

Title: Re: Testing an odd number
Post by: bomz on August 27, 2010, 10:15:03 AM
thats all I know about SHR - I think GOOGLE knows how many each operation processor tact(?) needs
(http://s42.radikal.ru/i096/1008/c1/6c6bdc07c292.gif)
Title: Re: Testing an odd number
Post by: frktons on August 27, 2010, 10:19:44 AM
Thanks bomz. These examples should be enough for now  :U

Nice gif  :P
Title: Re: Testing an odd number
Post by: Neil on August 27, 2010, 10:31:55 AM
Just a thought,

and eax,1

This zero's all bits except bit 0, so if you want to preserve your original number, and is not the answer.
Title: Re: Testing an odd number
Post by: Neil on August 27, 2010, 10:37:42 AM
You might like to look at this thread :-


http://www.masm32.com/board/index.php?topic=10109.msg74010#msg74010
Title: Re: Testing an odd number
Post by: jj2007 on August 27, 2010, 10:39:45 AM
Intel(R) Pentium(R) 4 CPU 3.40GHz (SSE3)
4552    cycles for 1000*test eax, 1, result=500
10634   cycles for 1000*bt eax, 0, result=500

4595    cycles for 1000*test eax, 1, result=500
10640   cycles for 1000*bt eax, 0, result=500


Testbed attached.
Title: Re: Testing an odd number
Post by: dedndave on August 27, 2010, 11:33:59 AM
;value in EAX:
        test    al,1
        jnz     value_is_odd
Title: Re: Testing an odd number
Post by: jj2007 on August 27, 2010, 12:06:59 PM
Quote from: dedndave on August 27, 2010, 11:33:59 AM
;value in EAX:
        test    al,1
        jnz     value_is_odd


You won - three bytes shorter than my version  :bg

However, it's a lot slower:
Intel(R) Celeron(R) M CPU        420  @ 1.60GHz (SSE3)
2885    cycles for 1000*test eax, 1, result=500
4023    cycles for 1000*test al, 1, result=500
3001    cycles for 1000*bt eax, 0, result=500
Title: Re: Testing an odd number
Post by: dedndave on August 27, 2010, 12:10:43 PM
hiyas Jochen - good morning (here) - good afternoon (there)

i can't imagine that much difference in speed  :P
must be something wrong with the test   ::)   :lol
Title: Re: Testing an odd number
Post by: lingo on August 27, 2010, 12:39:41 PM
"The thieves are always liars"[/U]
As you saw his test program is proven crap, hence his results too...  :lol
Title: Re: Testing an odd number
Post by: frktons on August 27, 2010, 12:42:08 PM
Morning/Evening/Night Masters. Oh Mr Lingo is here. How are you Sir?

So from the tests Jochen has posted the faster instruction is:

test eax,1


Probably because on 32/64 bit machines dealing with 32/64 bits at a time
is the faster solution for the CPU.
I run the test on my pc and post here afterwards.

Edit: Well on my CPU BT is faster than TEST:
Quote
Intel(R) Core(TM)2 CPU          6600  @ 2.40GHz (SSE4)
2043    cycles for 1000*test eax, 1, result=500
2037    cycles for 1000*bt eax, 0, result=500

2019    cycles for 1000*test eax, 1, result=500
2035    cycles for 1000*bt eax, 0, result=500


--- ok ---


They are quite the same.
Title: Re: Testing an odd number
Post by: dedndave on August 27, 2010, 12:44:21 PM
actually lingo - i was making the remark and rolling my eyes because of all your nonsense - lol
Title: Re: Testing an odd number
Post by: dedndave on August 27, 2010, 12:46:55 PM
Frank - if the contents of the register are "disposable", you can use SHR
        shr     eax,1
        jc      value_was_odd

the original value may be recovered, if desired with RCL EAX,1
as long as the carry flag has not been altered
Title: Re: Testing an odd number
Post by: frktons on August 27, 2010, 12:53:12 PM
Quote from: dedndave on August 27, 2010, 12:46:55 PM
Frank - if the contents of the register are "disposable", you can use SHR
        shr     eax,1
        jc      value_was_odd

the original value may be recovered, if desired with RCL EAX,1
as long as the carry flag has not been altered

Thanks Dave. Good to know, it's one of the solutions I tested.
What about avoiding the eax MOV and testing the variable directly?

;-------------------------------------------------------------------------------------------------------------------------
; Test if a number is ODD or EVEN
; Date: 27 august 2010
;-------------------------------------------------------------------------------------------------------------------------


INCLUDE \masm32\include\masm32rt.inc

;------------------------------------------------------------------------------------------------------------------------------

.DATA

number    dd    12


;------------------------------------------------------------------------------------------------------------------------------


.CODE


start:

    bt number, 0

    jc  isodd

    print "The number is EVEN",13,10

    inkey

    jmp end_test

isodd:

    print "The number is ODD",13,10

    inkey


end_test:


    ret

end start


Title: Re: Testing an odd number
Post by: frktons on August 27, 2010, 12:59:00 PM
Well testing the number directly has some small advantages:

Intel(R) Core(TM)2 CPU          6600  @ 2.40GHz (SSE4)
2072    cycles for 1000*test eax, 1, result=500
2028    cycles for 1000*bt eax, 0, result=500

2016    cycles for 1000*bt number, 0, result=1000

2041    cycles for 1000*test eax, 1, result=500
2036    cycles for 1000*bt eax, 0, result=500

2025    cycles for 1000*bt number, 0, result=1000


--- ok ---


I tested with number = 13.
Title: Re: Testing an odd number
Post by: lingo on August 27, 2010, 01:02:23 PM
Hi,Frank
"What about avoiding the eax MOV and testing the variable directly?"
Will be faster coz you will skip mov eax, number.

dedndave,
actually dedndave - have you some news  about your wife coz it is not healthy for a man to be alone? :snooty:
Title: Re: Testing an odd number
Post by: frktons on August 27, 2010, 01:05:46 PM
Quote from: lingo on August 27, 2010, 01:02:23 PM
Hi,Frank
"What about avoiding the eax MOV and testing the variable directly?"
Will be faster coz you will skip mov eax, number.

Well, there is infact a quite small advantage as I realized with my test.
Tell me Lingo, do you really think these tests are useless, or are you
just joking?
Title: Re: Testing an odd number
Post by: lingo on August 27, 2010, 01:13:19 PM
Will be better for you to test it. Just make  two algos equal and test them. The times must be the same but I think they will be different... :U
Title: Re: Testing an odd number
Post by: frktons on August 27, 2010, 01:42:37 PM
Quote from: lingo on August 27, 2010, 01:13:19 PM
Will be better for you to test it. Just make  two algos equal and test them. The times must be the same but I think they will be different... :U

On modern OSes, with dozens of tasks around all the time, is almost impossible
to have two equal results. I think we can only have a broad idea of the speed, and
this should be the target of the tests. Thanks anyway for your comments.  :U
Title: Re: Testing an odd number
Post by: jj2007 on August 27, 2010, 02:46:38 PM
Quote from: dedndave on August 27, 2010, 12:46:55 PM
the original value may be recovered, if desired with RCL EAX,1
as long as the carry flag has not been altered

Cute, Dave. Didn't know that, probably since OpCodes.chm tells a different story:

QuoteRotates the bits in the destination to the left "count" times with all data pushed out the left side re-entering on the right. The Carry Flag holds the last bit rotated out.

Nothing on carry flag rotating in etc...

But I checked it with Olly, and you are right, as usual :U
Title: Re: Testing an odd number
Post by: dedndave on August 27, 2010, 02:51:19 PM
i am a little surprised, Jochen, that you don't have a solid handle on SH/RO/RC instructions
that's some of the first asm stuff i learned   :P
in the beginning, i didn't have many tools - had to figure out how things worked manually
when i first got a PC, i thought DEBUG was the bees knees - lol

on an 8008, we considered CF to be the "9th bit"   :lol
Title: Re: Testing an odd number
Post by: jj2007 on August 27, 2010, 02:53:10 PM
Quote from: dedndave on August 27, 2010, 02:51:19 PM
i am a little surprised, Jochen, that you don't have a solid handle on SH/RO/RC instructions
that's some of the first asm stuff i learned   :P

My solid handles are \masm32\help and Google, and I thought \masm32\help\opcodes.chm was enough. Usually it is correct.
Title: Re: Testing an odd number
Post by: mineiro on August 27, 2010, 02:57:05 PM
Hello Sr frktons;
I'm getting float results here; while at every firsts test "test" wins, but after some 10 tests the result invert and "bt" wins.
If I put some huge program to be load, while doing the test program, this results preserve here. At night I can do more tests to you.
regards.
Title: Re: Testing an odd number
Post by: dedndave on August 27, 2010, 03:02:02 PM
at my age, my solid handles seem to be around my waist - lol
all my life, i have worn slacks and jeans with 34 to 36 waist
had to get some new stuff - 38" now   :lol
Title: Re: Testing an odd number
Post by: jj2007 on August 27, 2010, 03:10:19 PM
Quote from: dedndave on August 27, 2010, 12:10:43 PM
hiyas Jochen - good morning (here) - good afternoon (there)

i can't imagine that much difference in speed  :P
must be something wrong with the test   ::)   :lol

Yes, sure! On a P4, test al is three millicycles slower:
Intel(R) Pentium(R) 4 CPU 3.40GHz (SSE3)
4034    1000*test eax, 1, result=500
4037    1000*test al, 1, result=500
10676   1000*bt eax, 0, result=500


By the way: Lingo is so quiet. I think he is cooking up an SSE5 version that runs in -50 cycles - on his CPU only :green2

P.S.:
> had to get some new stuff - 38" now   :lol
My condolencies - I know the problem :thumbu

P.P.S.: For those who draw their skills from OpCode.chm, here a "second opinion" (MasmReference):
QuoteRCL/RCR/ROL/ROR Rotate
Rotates the bits in the destination operand the number of times specified in the
source operand. RCL and ROL rotate the bits left; RCR and ROR rotate right.
ROL and ROR rotate the number of bits in the operand. For each rotation, the
leftmost or rightmost bit is copied to the carry flag as well as rotated. RCL and
RCR rotate through the carry flag. The carry flag becomes an extension of the
operand so that a 9-bit rotation is done for 8-bit operands, or a 17-bit rotation
for 16-bit operands.

On the 8088 and 8086, the source operand can be either CL or 1. On the
80186–80486, the source operand can be CL or an 8-bit constant. On the
80186–80486, rotate counts larger than 31 are masked off, but on the 8088 and
8086, larger rotate counts are performed despite the inefficiency involved. The
overflow flag is modified only by single-bit variations of the instruction; for
multiple-bit variations, the overflow flag is undefined.
Title: Re: Testing an odd number
Post by: bomz on August 27, 2010, 03:41:54 PM
masm32 applications limits by windows procedure, and serious code optimization needs only for huge computation.

(http://s47.radikal.ru/i116/1008/62/5ab791728ac3.png)

whether there Patch to remove this header ?
Title: Re: Testing an odd number
Post by: dedndave on August 27, 2010, 04:27:58 PM
when looking at the shift instructions, a picture is worth a thousand words (or dwords)

http://www.arl.wustl.edu/~lockwood/class/cs306/books/artofasm/Chapter_6/CH06-3.html#HEADING3-42
Title: Re: Testing an odd number
Post by: frktons on August 27, 2010, 05:21:24 PM
Quote from: mineiro on August 27, 2010, 02:57:05 PM
Hello Sr frktons;
I'm getting float results here; while at every firsts test "test" wins, but after some 10 tests the result invert and "bt" wins.
If I put some huge program to be load, while doing the test program, this results preserve here. At night I can do more tests to you.
regards.


Thanks miniero. I suppose test and bt performs quite the same.
If you run some more tests, let me know.

Quote from: dedndave on August 27, 2010, 04:27:58 PM
when looking at the shift instructions, a picture is worth a thousand words (or dwords)

http://www.arl.wustl.edu/~lockwood/class/cs306/books/artofasm/Chapter_6/CH06-3.html#HEADING3-42

Yes Master, I use them essentially when I've to divide/multiply by 2-4-8-16 in a fast way,
taking care of the carry-boy  :P

Title: Re: Testing an odd number
Post by: jj2007 on August 27, 2010, 06:15:21 PM
Quote from: bomz on August 27, 2010, 03:41:54 PM
masm32 applications limits by windows procedure, and serious code optimization needs only for huge computation.

Wow, that sounds like old Chinese wisdom. How old are you? And what do you mean concretely, in plain and correct English?
Title: Re: Testing an odd number
Post by: Rockoon on August 27, 2010, 07:29:22 PM
I recall BT performing fairly badly on old CPU's (386/486/P1 era) and at that time it was really only performance-useful for its ability to treat memory itself (rather than registers) as a huge array of bits 

In the case of:

bt [esi], ecx

ecx can be any integer value (not just in the range 0..31) so effectively the BT instruction calculated the proper word displacement in memory..

esi + (ecx >> 5)

...and bit mask...

1 << (ecx & 31)

..for you...

In the case of in-register (bt eax, ecx) usage it was easily beaten at the time in any number of ways.

These days it is very efficient, but I suspect that it will compete with nearby non-dependent lea instructions (which has similar shift-then-add capabilities) for the same silicon
Title: Re: Testing an odd number
Post by: mineiro on August 27, 2010, 09:57:15 PM

Intel(R) Pentium(R) Dual  CPU  E2160  @ 1.80GHz (SSE4)
2136    cycles for 1000*test eax, 1, result=500
2063    cycles for 1000*bt eax, 0, result=500

2344    cycles for 1000*test eax, 1, result=500
2028    cycles for 1000*bt eax, 0, result=500
--- ok ---

I really don't understand these results, this is a first try of the long nigh and "kabum", chaos teory again.
Without look at assembly level (opcode table,...) I tried to figure how both logics are done.
The "test" is a logical "and", and "and" is sequential, so, need run in all the path to reach the end.
In "bt", I imagine some "and"(sequential) with "or"(organizational/option)(multiplexer,demultiplexer), this can have more velocity, but, the grandfather 8086 cannot deal with one bit, his characteristics are from byte. You cannot deal with only one bit, if you like to set only one bit, you need work at least with one byte, then change the bit and send the byte again.
I humild cannot vote to a better based in my tests, but if I need vote in one, I will vote in one that get better results in all other cpu's.
regards. (sorry about my language).
Title: Re: Testing an odd number
Post by: bomz on August 28, 2010, 12:01:25 PM
Quote from: jj2007 on August 27, 2010, 06:15:21 PM
Quote from: bomz on August 27, 2010, 03:41:54 PM
masm32 applications limits by windows procedure, and serious code optimization needs only for huge computation.

Wow, that sounds like old Chinese wisdom. How old are you? And what do you mean concretely, in plain and correct English?

windows is not assembler code, this is code which need last modern processors or two, big hdd, modern video cards ...... MASM32 widely , accept own macros, use API functions. system  distribute time according it own consideration. deep optimization of concrete code have no big sence, as optimization own style of programming

http://www.kolibrios.org/ - system need one floppy disk. poor IBM
Title: Re: Testing an odd number
Post by: jj2007 on August 28, 2010, 03:27:23 PM
Quote from: bomz on August 28, 2010, 12:01:25 PM
windows is not assembler code, this is code which need last modern processors or two, big hdd, modern video cards ...... MASM32 widely , accept own macros, use API functions. system  distribute time according it own consideration. deep optimization of concrete code have no big sence, as optimization own style of programming

К сожалению, мой русский немного ржавый. Можете ли вы объяснить это на английском языке, пожалуйста? И можете ли вы привести конкретный пример? Спасибо.
Title: Re: Testing an odd number
Post by: bomz on August 28, 2010, 04:49:17 PM
ну какой тут можно привести пример. ты приведи пример как ты умудрился деоптимизировать код на ассемблере, что это сказалось на скорости выполнения программы

даже если пустой цикл вставишь. в любом случае есть разумное соотношение оптимизации программы к затраченному на это времени, при аксиме что любой код может быть оптимизирован.
Title: Re: Testing an odd number
Post by: jj2007 on August 28, 2010, 05:11:47 PM
Google believes you said:
QuoteWell what is there to give you an example. you give an example how you managed to deoptimizirovat code in assembler, it affected the speed of execution of the program even if the empty cycle to insert. In any case, there is a reasonable ratio to optimize the program spent for this time, Auxemite that any code can be optimized

bomz, if you believe you can do better, download the testbed, add your algo and show us that you are the champion. We respect good coders. Otherwise I would suggest that you spend some time learning English, because that is the language of this forum.
Title: Re: Testing an odd number
Post by: bomz on August 28, 2010, 05:16:28 PM
http://translate.google.com/
Title: Re: Testing an odd number
Post by: bomz on June 05, 2011, 12:09:13 PM
This problem with TICKS still alive and I remember about it. Absolutely accidentally I find this FASM code. It's complicate and I already not very good understand how it works, but I do seems working for MASM32. see below

;============================================================================
format     pe gui
include     '%fasminc%\win32a.inc'
;============================================================================
  ; выровнено на 4096                         ; В А Ш И   Д А Н Н Ы Е
;++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++



;++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
iter     =     10   ;  кол-во проходов testing:
;============================================================================
section     '.test' code readable writeable executable
;++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
tics     dq     0   ;
overhead    dd     0   ;
counter     dd     0   ;
resultlist  rq     iter   ;
templist    rb     iter*10+2   ;      "технические данные"
message     rb     iter*26+1   ;
caption     rb     64   ;
lpfmtm     db     '%.8X%.8X%.8X',13,10,0   ;
lpfmtc     db     '%0.8X / %u bytes / %u passes',0
;============================================================================
align     1024   ;
;============================================================================
entry     $
    invoke  GetCurrentProcess
    invoke  SetPriorityClass,eax,REALTIME_PRIORITY_CLASS
    invoke  GetCurrentThread
    invoke  SetThreadPriority,eax,THREAD_PRIORITY_TIME_CRITICAL
;============================================================================
    mov     ebp,5   ;
align     16   ;
@@:     mov     eax,0   ;
    cpuid   ;
    rdtsc   ;
    mov     dword [tics],eax   ;          подсчет overhead
    mov     dword [tics+4],edx   ;  (используется для вычета
    xor     eax,eax   ;        тактов, ушедших на
    cpuid   ;     "технические" детали)
    xor     eax,eax   ;
    cpuid   ;
    rdtsc   ;
    sub     eax,dword [tics]   ;
    mov     [overhead],eax   ;
    dec     ebp   ;
    jnz     @B   ;
;============================================================================
  ; используйте esi edi ebp                 И Н И Ц И А Л И З А Ц И Я
;++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++



;++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
;                                                    цикл проходов (итераций)
;============================================================================
align     16
testloop:   times   8 : nop   ; для выравнивания testing:
    mov     eax,0
    cpuid
    rdtsc
    mov     dword [tics],eax
    mov     dword [tics+4],edx
    xor     eax,eax
    cpuid     ; eax ecx edx ebx  не сохраняется
;============================================================================
testing:  ; выровнено на 16     ; Т Е С Т И Р У Е М Ы Е   И Н С Т Р У К Ц И И
;++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++



;++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
testsize    =     $-testing     ; К О Н Е Ц   И Н С Т Р У К Ц И Й
;============================================================================
    xor     eax,eax     ; eax ecx edx ebx  не сохраняется
    cpuid
    rdtsc
    mov     ebx,[counter]
    mov     dword [resultlist+ebx],eax
    mov     dword [resultlist+ebx+4],edx
    mov     eax,dword [tics]
    mov     edx,dword [tics+4]
    add     eax,[overhead]
    adc     edx,0
    sub     dword [resultlist+ebx],eax
    sbb     dword [resultlist+ebx+4],edx
    add     ebx,8
    mov     [counter],ebx
    cmp     ebx,iter * 8
    jb     testloop
;============================================================================
;                                                           вывод результатов
;============================================================================
    invoke  GetCurrentThread
    invoke  SetThreadPriority,eax,THREAD_PRIORITY_NORMAL
    invoke  GetCurrentProcess
    invoke  SetPriorityClass,eax,NORMAL_PRIORITY_CLASS
;============================================================================
    finit
    mov     esi,resultlist
    mov     ebp,templist
    mov     edi,message
align     4
@@:     fild    qword [esi]
    fabs
    fbstp   [ebp]
    invoke  wsprintf,edi,lpfmtm,[ebp+8],[ebp+4],[ebp]
    add     esp,20
    add     esi,8
    add     ebp,10
    add     edi,eax
    sub     [counter],8
    jnz     @B
    invoke  wsprintf,caption,lpfmtc,testing,testsize,iter
    add     esp,20
    invoke  MessageBox,0,message,caption,MB_ICONINFORMATION
    finit
    invoke  ExitProcess,0
;============================================================================
data     import
library     kernel32,'kernel32.dll',user32,'user32.dll'
include     '%fasminc%\apia\kernel32.inc'
include     '%fasminc%\apia\user32.inc'
end     data
;============================================================================


.586

.model flat, stdcall
option casemap :none

include \MASM32\INCLUDE\windows.inc
include \MASM32\INCLUDE\user32.inc
include \MASM32\INCLUDE\kernel32.inc
include \masm32\include\masm32.inc
includelib \MASM32\LIB\user32.lib
includelib \MASM32\LIB\kernel32.lib
includelib \masm32\lib\masm32.lib

.data?
Ticks LONG64 ?
Ticks1 LONG64 ?
CReg dt ?
FString db 21 dup(?)
.code
start:
invoke  GetCurrentProcess
invoke  SetPriorityClass,eax,REALTIME_PRIORITY_CLASS
invoke  GetCurrentThread
invoke  SetThreadPriority,eax,THREAD_PRIORITY_TIME_CRITICAL
mov ecx, 2
@@:
rdtsc
mov dword ptr [Ticks], eax
mov dword ptr [Ticks+4], edx
push ecx
;===============================================================
;================== Here insert testing code ===================
;===============================================================
pop ecx
rdtsc
mov dword ptr [Ticks1],eax
mov dword ptr [Ticks1+4],edx
loop @B

invoke  GetCurrentThread
invoke  SetThreadPriority,eax,THREAD_PRIORITY_NORMAL
invoke  GetCurrentProcess
invoke  SetPriorityClass,eax,NORMAL_PRIORITY_CLASS

fild Ticks
fild Ticks1
fsubr
fbstp CReg

lea edi,FString+18
lea esi,CReg
mov ecx, 10
@@:
xor eax, eax
lodsb
ror ax,4
shr ah, 4
add ax, 3030h
std
stosw
cld
loop @B

invoke MessageBox,0, addr FString,0,0

invoke ExitProcess,0

end start


At first run it empty and see how ticks use it, at my computer 88 ticks.

I find very interesting article about asm code optimization (sad Russian) there author talking about processors history from very beginning. SHR is not the slow command



hi hi in russian. Ray Duncan ??? Optimization asm program. can't find it in English
Title: Re: Testing an odd number
Post by: dedndave on June 05, 2011, 01:02:38 PM
the code looks quite similar to MichaelW's code that we use...

http://www.masm32.com/board/index.php?topic=770.msg5281#msg5281

i think there is room for improvement - which is one of the projects i am currently working on
the problem lies with the use of CPUID for serializing instructions
on some CPU's, CPUID is, let's call it "erratic", as it does not always take the same number of ticks to execute
on most newer CPU's, it is more stable
still, it takes something like 80 clock cycles, which is kinda long   :P

i think i have found a better way, but i have to wade through some other things to get there - lol
Title: Re: Testing an odd number
Post by: bomz on June 05, 2011, 01:43:14 PM
any code may be simplified
Title: Re: Testing an odd number
Post by: bomz on June 05, 2011, 07:57:43 PM
this 1.5 time quickly
mov ebx, eax
shr eax, 2
add eax, ebx
shr eax, 1

than
mov ebx, 10
mul ebx


strange.  IntelGenuine (http://smiles.kolobok.us/big_standart/mocking.gif)
Title: Re: Testing an odd number
Post by: bomz on June 05, 2011, 07:59:51 PM
xor or and - very quickly.
Title: Re: Testing an odd number
Post by: dedndave on June 05, 2011, 08:13:02 PM
MUL is pretty fast on newer processors
and - this code has a lot of dependancies
        mov     ebx,eax
        shl     eax,2
        add     eax,ebx
        shl     eax,1

not to mention code bytes
in a loop, MUL might be the better choice

also interesting....
on a P4, MUL wants the register multiplier, EAX, and EDX "settled"
it's a good idea to perform some unrelated instruction just before the MUL
IMUL is not so sensitive - it even seems to like it   :P
also probably not true on newer cores
Title: Re: Testing an odd number
Post by: bomz on June 05, 2011, 08:17:16 PM
mul more universal. it multiply not only eax but put result to edx. also it mul not only 10. but in many cases we don't need mul edx and it possible simplify code
Title: Re: Testing an odd number
Post by: bomz on June 06, 2011, 05:21:13 AM
my code is not very correct. command CLD and STD get 48 ticks. PUSHF POPF too more than 50. may be thats true . of course CLD don't need so many ticks, but call some process.
Title: Re: Testing an odd number
Post by: dedndave on June 06, 2011, 09:31:11 AM
yah - those instructions are slow
i think the OS has to check for priviledge level changes or something for POPFD   :P
for the direction flag, it must keep track of the current state because API's expect it to be cleared
PUSHFD isn't too bad, because it does not alter flags
but CLD, STD, and POPFD are slow - about 100 cycles on my machine
CLC and STC are fast, at least
Title: Re: Testing an odd number
Post by: bomz on June 06, 2011, 10:29:03 AM

;std
;stosw
;cld

;mov word ptr[edi], ax
;sub edi, 2

stosw
sub edi, 4
Title: Re: Testing an odd number
Post by: dedndave on June 06, 2011, 10:33:14 AM
 :bg
        mov     [edi],ax

i suspect which is fastest depends on which generation of CPU it's running on
Title: Re: Testing an odd number
Post by: bomz on June 06, 2011, 10:34:23 AM
need mov     [edi], ax
Title: Re: Testing an odd number
Post by: bomz on June 06, 2011, 10:40:22 AM
I think STOS use special line
Title: Re: Testing an odd number
Post by: bomz on June 06, 2011, 10:43:39 AM
word ptr - seems no differences Pentium 4

STOS slowly
Title: Re: Testing an odd number
Post by: bomz on June 11, 2011, 07:29:08 PM
(http://smiles.kolobok.us/light_skin/wacko.gif)

strange that all "combine" directives slowly. because than cpu execute code it never now the next command....may be my cpu very old


QuoteNEXT:

loop NEXT

slowly than 25%

QuoteNEXT:
sub ecx, 1
cmp ecx, 0
jnz NEXT


Quoteinc eax

slowly than

Quoteadd eax, 1


in my code was interesting situation lea quickly than sub

scasb lodsb movsb stosb cmpsb - slow
Title: Re: Testing an odd number
Post by: dedndave on June 12, 2011, 12:45:12 AM
NEXT:
sub ecx, 1
cmp ecx, 0
jnz NEXT


no need to CMP
SUB will set or clear the zero flag   :P
NEXT:
sub ecx, 1
jnz NEXT


DEC works, too - and is a single-byte instruction for dword general registers
NEXT:
dec ecx
jnz NEXT
Title: Re: Testing an odd number
Post by: bomz on June 12, 2011, 07:08:29 AM
with cmp do the same loop