
bin2dword

Started by MichaelW, May 03, 2006, 11:34:43 PM


MichaelW

Timings for P3:

164 cycles, bin2dword (MichaelW) 32-bit input
153 cycles, b2dw (lingo) 32-bit input
144 cycles, bin2dw (EduardoS) 32-bit input
107 cycles, BinToDw (drizz) 32-bit input

61 cycles, bin2dword (MichaelW) 8-bit input
34 cycles, b2dw (lingo) 8-bit input
47 cycles, bin2dw (EduardoS) 8-bit input
39 cycles, BinToDw (drizz) 8-bit input
23 cycles, bin2byte_ex (Hutch) 8-bit input

14 cycles, bin2dword (MichaelW) 1-bit input
10 cycles, b2dw (lingo) 1-bit input
8 cycles, bin2dw (EduardoS) 1-bit input
7 cycles, BinToDw (drizz) 1-bit input


I would like to see the results for a good compiler-optimized version (written by a coder who knows enough to produce optimal C source; I don't).





EduardoS

Quote from: MichaelW on May 07, 2006, 09:56:27 PM
I would like to see the results for a good compiler-optimized version (written by a coder who knows enough to produce optimal C source; I don't).

Maybe drizz's code is easy to convert:


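/* Convert a NUL-terminated string of ASCII '0'/'1' characters to an integer. */
unsigned int b2dw(char* ptr)
{
    unsigned int ret = 0;
    char tmp;
    /* '0' is 0x30 and '1' is 0x31, so bit 0 of each character is its digit value. */
    while(tmp = *(ptr++))   /* assignment is intended: stop at the terminating NUL */
        ret = ret * 2 + (tmp & 1);
    return ret;
}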
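
A quick way to sanity-check the routine above before timing it is sketched below. This is only an illustration: the name b2dw_checked and the NULL assert are assumptions added here, not part of the tested code, and the loop is the same shift-and-add as above written out long-hand.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical re-statement of the b2dw loop, for checking results only.
   Takes a NUL-terminated string of ASCII '0'/'1' characters. */
unsigned int b2dw_checked(const char *ptr)
{
    unsigned int ret = 0;

    assert(ptr != NULL);              /* assumption: callers never pass NULL */
    while (*ptr != '\0') {
        /* bit 0 of '0' (0x30) is 0 and bit 0 of '1' (0x31) is 1 */
        ret = ret * 2 + (unsigned int)(*ptr & 1);
        ptr++;
    }
    return ret;
}
```

Calling it with "1011" should give 11, and an empty string should give 0. Like the original, it does not validate its input: any character with bit 0 set counts as a 1.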


Timings for Athlon 64:

137 cycles, bin2dword (MichaelW) 32-bit input
115 cycles, b2dw (lingo) 32-bit input
113 cycles, bin2dw (EduardoS) 32-bit input
100 cycles, BinToDw (drizz) 32-bit input

56 cycles, bin2dword (MichaelW) 8-bit input
28 cycles, b2dw (lingo) 8-bit input
26 cycles, bin2dw (EduardoS) 8-bit input
21 cycles, BinToDw (drizz) 8-bit input
15 cycles, bin2byte_ex (Hutch) 8-bit input

7 cycles, bin2dword (MichaelW) 1-bit input
5 cycles, b2dw (lingo) 1-bit input
5 cycles, bin2dw (EduardoS) 1-bit input
5 cycles, BinToDw (drizz) 1-bit input


EDIT: C++ Code fixed.

EduardoS

Using Visual C++ 2003 the generated assembly was:

?b2dw@@YAIPAD@Z PROC NEAR                                         ; b2dw, COMDAT

; 14   : unsigned int ret = 0;
; 15   : char tmp;
; 16   : while(tmp = *(ptr++))

            mov      edx, DWORD PTR _ptr$[esp-4]

; 17   :   ret = ret * 2 + (tmp & 1);

            xor       ecx, ecx
            mov      cl, BYTE PTR [edx]
            xor       eax, eax
            test      cl, cl
            je         SHORT $L9651
            npad     2
$L9632:
            and       ecx, 1
            inc        edx
            lea        eax, DWORD PTR [ecx+eax*2]
            mov      cl, BYTE PTR [edx]
            test      cl, cl
            jne        SHORT $L9632
$L9651:

; 18   : return ret;
; 19   : }


The timings (I included an unrolled version of bin2dw so that the C++ version would not be faster than mine :bdg):

136 cycles, bin2dword (MichaelW) 32-bit input
114 cycles, b2dw (lingo) 32-bit input
112 cycles, bin2dw (EduardoS) 32-bit input
113 cycles, bin2dwc (Visual C++) 32-bit input
35 cycles, bin2dwu (C++ Killer) 32-bit input
99 cycles, BinToDw (drizz) 32-bit input

55 cycles, bin2dword (MichaelW) 8-bit input
28 cycles, b2dw (lingo) 8-bit input
26 cycles, bin2dw (EduardoS) 8-bit input
24 cycles, bin2dwc (Visual C++) 8-bit input
13 cycles, bin2dwu (C++ Killer) 8-bit input
21 cycles, BinToDw (drizz) 8-bit input
15 cycles, bin2byte_ex (Hutch) 8-bit input

7 cycles, bin2dword (MichaelW) 1-bit input
5 cycles, b2dw (lingo) 1-bit input
5 cycles, bin2dw (EduardoS) 1-bit input
2 cycles, bin2dwc (Visual C++) 1-bit input
5 cycles, bin2dwu (C++ Killer) 1-bit input
5 cycles, BinToDw (drizz) 1-bit input




lingo

“lingo, I'm disappointed!” :lol

OK, bigger and faster again ...

P4 Prescott 3.6GHz – XP pro SP2

242 cycles, bin2dword (MichaelW) 32-bit input
149 cycles, b2dw (lingo) 32-bit input
203 cycles, bin2dw (EduardoS) 32-bit input
214 cycles, bin2dwc (Visual C++) 32-bit input
140 cycles, bin2dwu (C++ Killer) 32-bit input
157 cycles, BinToDw (drizz) 32-bit input
109 cycles, b2dw1 (lingo-fast) 32-bit input

67 cycles, bin2dword (MichaelW) 8-bit input
43 cycles, b2dw (lingo) 8-bit input
39 cycles, bin2dw (EduardoS) 8-bit input
47 cycles, bin2dwc (Visual C++) 8-bit input
31 cycles, bin2dwu (C++ Killer) 8-bit input
37 cycles, BinToDw (drizz) 8-bit input
31 cycles, bin2byte_ex (Hutch) 8-bit input
36 cycles, b2dw1 (lingo-fast) 8-bit input

17 cycles, bin2dword (MichaelW) 1-bit input
12 cycles, b2dw (lingo) 1-bit input
10 cycles, bin2dw (EduardoS) 1-bit input
13 cycles, bin2dwc (Visual C++) 1-bit input
8 cycles, bin2dwu (C++ Killer) 1-bit input
12 cycles, BinToDw (drizz) 1-bit input
8 cycles, b2dw1 (lingo-fast) 1-bit input

Press any key to exit...



AMD Turion 64 ML-30 processor (1 MB L2 cache, 1.6 GHz), XP Pro SP2

136 cycles, bin2dword (MichaelW) 32-bit input
114 cycles, b2dw (lingo) 32-bit input
112 cycles, bin2dw (EduardoS) 32-bit input
113 cycles, bin2dwc (Visual C++) 32-bit input
35 cycles, bin2dwu (C++ Killer) 32-bit input
99 cycles, BinToDw (drizz) 32-bit input
42 cycles, b2dw1 (lingo-fast) 32-bit input

56 cycles, bin2dword (MichaelW) 8-bit input
28 cycles, b2dw (lingo) 8-bit input
26 cycles, bin2dw (EduardoS) 8-bit input
24 cycles, bin2dwc (Visual C++) 8-bit input
13 cycles, bin2dwu (C++ Killer) 8-bit input
21 cycles, BinToDw (drizz) 8-bit input
15 cycles, bin2byte_ex (Hutch) 8-bit input
12 cycles, b2dw1 (lingo-fast) 8-bit input

7 cycles, bin2dword (MichaelW) 1-bit input
5 cycles, b2dw (lingo) 1-bit input
5 cycles, bin2dw (EduardoS) 1-bit input
3 cycles, bin2dwc (Visual C++) 1-bit input
4 cycles, bin2dwu (C++ Killer) 1-bit input
5 cycles, BinToDw (drizz) 1-bit input
4 cycles, b2dw1 (lingo-fast) 1-bit input

Press any key to exit...


Regards,
Lingo



six_L

Quote
127 cycles, bin2dword (MichaelW) 32-bit input
209 cycles, b2dw (lingo) 32-bit input
130 cycles, bin2dw (EduardoS) 32-bit input
158 cycles, bin2dwc (Visual C++) 32-bit input
77 cycles, bin2dwu (C++ Killer) 32-bit input
97 cycles, BinToDw (drizz) 32-bit input
90 cycles, b2dw1 (lingo-fast) 32-bit input

35 cycles, bin2dword (MichaelW) 8-bit input
28 cycles, b2dw (lingo) 8-bit input
45 cycles, bin2dw (EduardoS) 8-bit input
38 cycles, bin2dwc (Visual C++) 8-bit input
22 cycles, bin2dwu (C++ Killer) 8-bit input
23 cycles, BinToDw (drizz) 8-bit input
34 cycles, bin2byte_ex (Hutch) 8-bit input
20 cycles, b2dw1 (lingo-fast) 8-bit input

10 cycles, bin2dword (MichaelW) 1-bit input
8 cycles, b2dw (lingo) 1-bit input
18 cycles, bin2dw (EduardoS) 1-bit input
4 cycles, bin2dwc (Visual C++) 1-bit input
5 cycles, bin2dwu (C++ Killer) 1-bit input
6 cycles, BinToDw (drizz) 1-bit input
4 cycles, b2dw1 (lingo-fast) 1-bit input

Press any key to exit...
regards

mnemonic

AMD Turion 64/XP Home SP2

bin2dword2:

136 cycles, bin2dword (MichaelW) 32-bit input
114 cycles, b2dw (lingo) 32-bit input
112 cycles, bin2dw (EduardoS) 32-bit input
113 cycles, bin2dwc (Visual C++) 32-bit input
35 cycles, bin2dwu (C++ Killer) 32-bit input
99 cycles, BinToDw (drizz) 32-bit input
42 cycles, b2dw1 (lingo-fast) 32-bit input

55 cycles, bin2dword (MichaelW) 8-bit input
28 cycles, b2dw (lingo) 8-bit input
26 cycles, bin2dw (EduardoS) 8-bit input
24 cycles, bin2dwc (Visual C++) 8-bit input
13 cycles, bin2dwu (C++ Killer) 8-bit input
21 cycles, BinToDw (drizz) 8-bit input
15 cycles, bin2byte_ex (Hutch) 8-bit input
12 cycles, b2dw1 (lingo-fast) 8-bit input

7 cycles, bin2dword (MichaelW) 1-bit input
5 cycles, b2dw (lingo) 1-bit input
5 cycles, bin2dw (EduardoS) 1-bit input
2 cycles, bin2dwc (Visual C++) 1-bit input
5 cycles, bin2dwu (C++ Killer) 1-bit input
5 cycles, BinToDw (drizz) 1-bit input
3 cycles, b2dw1 (lingo-fast) 1-bit input

paranoidx

bin2dword2  P2 2.8 HT


Result1
198 cycles, bin2dword (MichaelW) 32-bit input
136 cycles, b2dw (lingo) 32-bit input
135 cycles, bin2dw (EduardoS) 32-bit input
150 cycles, bin2dwc (Visual C++) 32-bit input
80 cycles, bin2dwu (C++ Killer) 32-bit input
132 cycles, BinToDw (drizz) 32-bit input
65 cycles, b2dw1 (lingo-fast) 32-bit input

46 cycles, bin2dword (MichaelW) 8-bit input
33 cycles, b2dw (lingo) 8-bit input
23 cycles, bin2dw (EduardoS) 8-bit input
34 cycles, bin2dwc (Visual C++) 8-bit input
12 cycles, bin2dwu (C++ Killer) 8-bit input
24 cycles, BinToDw (drizz) 8-bit input
20 cycles, bin2byte_ex (Hutch) 8-bit input
10 cycles, b2dw1 (lingo-fast) 8-bit input

19 cycles, bin2dword (MichaelW) 1-bit input
9 cycles, b2dw (lingo) 1-bit input
0 cycles, bin2dw (EduardoS) 1-bit input <-- ??
1 cycles, bin2dwc (Visual C++) 1-bit input
4294967294 cycles, bin2dwu (C++ Killer) 1-bit input <-- ??
4 cycles, BinToDw (drizz) 1-bit input
4294967294 cycles, b2dw1 (lingo-fast) 1-bit input <---??

Result2:
188 cycles, bin2dword (MichaelW) 32-bit input
138 cycles, b2dw (lingo) 32-bit input
126 cycles, bin2dw (EduardoS) 32-bit input
158 cycles, bin2dwc (Visual C++) 32-bit input
79 cycles, bin2dwu (C++ Killer) 32-bit input
130 cycles, BinToDw (drizz) 32-bit input
70 cycles, b2dw1 (lingo-fast) 32-bit input

41 cycles, bin2dword (MichaelW) 8-bit input
34 cycles, b2dw (lingo) 8-bit input
28 cycles, bin2dw (EduardoS) 8-bit input
36 cycles, bin2dwc (Visual C++) 8-bit input
9 cycles, bin2dwu (C++ Killer) 8-bit input
34 cycles, BinToDw (drizz) 8-bit input
26 cycles, bin2byte_ex (Hutch) 8-bit input
13 cycles, b2dw1 (lingo-fast) 8-bit input

10 cycles, bin2dword (MichaelW) 1-bit input
2 cycles, b2dw (lingo) 1-bit input
10 cycles, bin2dw (EduardoS) 1-bit input
10 cycles, bin2dwc (Visual C++) 1-bit input
10 cycles, bin2dwu (C++ Killer) 1-bit input
3 cycles, BinToDw (drizz) 1-bit input
2 cycles, b2dw1 (lingo-fast) 1-bit input


I get Result1 on a normal run.
I get Result2 after a couple of runs, and then it goes back to Result1.
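
A note on the 4294967294 figures in Result1: that value is 2^32 - 2, which is what -2 looks like when a 32-bit result is printed as unsigned. One plausible explanation, assuming the timing code subtracts a measured empty-loop overhead from the raw count, is that the routine came in slightly under the overhead on that run. The sketch below only illustrates the arithmetic; net_cycles and as_signed are hypothetical helpers, not part of the test program.

```c
#include <stdint.h>

/* Hypothetical helper: cycle count after subtracting the loop overhead,
   done in 32-bit unsigned arithmetic, so a negative result wraps around. */
uint32_t net_cycles(uint32_t measured, uint32_t overhead)
{
    return measured - overhead;
}

/* Hypothetical helper: read a wrapped unsigned count back as a signed
   value without relying on implementation-defined conversions. */
int32_t as_signed(uint32_t cycles)
{
    if (cycles > (uint32_t)INT32_MAX)
        return -(int32_t)(UINT32_MAX - cycles) - 1;
    return (int32_t)cycles;
}
```

With a raw count of 98 and an overhead of 100, net_cycles returns 4294967294, and as_signed turns that back into -2. For 1-bit inputs the real cost is close to the measurement noise, so readings of 0, 1 or "-2" cycles are best treated as "too small to measure".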

NightWare

Guys,

if you want to run tests, fine, but do it correctly. The same test mixes procs aligned with Align4, Align8 and Align16. All of them must be 16-byte aligned, or else you have to test both possible Align8 placements, or all four possible Align4 placements, for every proc, because there are differences. Otherwise the results are completely useless.