Timings for AMD, P4, Core Duo

Started by jj2007, February 15, 2009, 10:19:07 AM

jj2007

I am cooking up some algos and would love to see timings for other processors, especially AMDs, Core Duo and P4.
Any volunteers? Just double-click the exe and post the output. PM me if you want the (rather confused) source :red
Thanxalot, jj

[attachment deleted by admin]

MichaelW

How about a P3:

 (SSE1)

Source len=4096

5194     clocks, mode s1, DestA
5200     clocks, mode s1, DestB

5193     clocks, mode d1, DestA
5200     clocks, mode d1, DestB

6198     clocks, mode c1, DestA
6205     clocks, mode c1, DestB

12366    clocks, mode m1, DestA
12368    clocks, mode m1, DestB

Source len=128

213      clocks, mode s1, DestA
211      clocks, mode s1, DestB

213      clocks, mode d1, DestA
213      clocks, mode d1, DestB

222      clocks, mode c1, DestA
222      clocks, mode c1, DestB

407      clocks, mode m1, DestA
407      clocks, mode m1, DestB

Source len=16

71       clocks, mode s1, DestA
71       clocks, mode s1, DestB

74       clocks, mode d1, DestA
74       clocks, mode d1, DestB

38       clocks, mode c1, DestA
38       clocks, mode c1, DestB
         --- OK ---

eschew obfuscation

BlackVortex

My results (returned to stock CPU/memory speeds for this test)
AMD Athlon(tm) 64 X2 Dual Core Processor 4200+ (SSE3)

Source len=4096

4622 clocks, mode s1, DestA
4443 clocks, mode s1, DestB

4957 clocks, mode d1, DestA
4872 clocks, mode d1, DestB

8688 clocks, mode c1, DestA
8728 clocks, mode c1, DestB

20870 clocks, mode m1, DestA
27694 clocks, mode m1, DestB

Source len=128

107 clocks, mode s1, DestA
100 clocks, mode s1, DestB

103 clocks, mode d1, DestA
94 clocks, mode d1, DestB

323 clocks, mode c1, DestA
323 clocks, mode c1, DestB

926 clocks, mode m1, DestA
933 clocks, mode m1, DestB

Source len=16

104 clocks, mode s1, DestA
103 clocks, mode s1, DestB

105 clocks, mode d1, DestA
106 clocks, mode d1, DestB

53 clocks, mode c1, DestA
53 clocks, mode c1, DestB
--- OK ---


EDIT: What kind of computations does this do/test?
And are lower numbers better or worse?

Neil

Intel(R) Core(TM)2 Quad  CPU   Q9550  @ 2.83GHz (SSE4)

Source len=4096

3062     clocks, mode s1, DestA
3571     clocks, mode s1, DestB

2752     clocks, mode d1, DestA
2627     clocks, mode d1, DestB

4130     clocks, mode c1, DestA
4130     clocks, mode c1, DestB

8223     clocks, mode m1, DestA
8236     clocks, mode m1, DestB

Source len=128

123      clocks, mode s1, DestA
133      clocks, mode s1, DestB

132      clocks, mode d1, DestA
129      clocks, mode d1, DestB

134      clocks, mode c1, DestA
134      clocks, mode c1, DestB

278      clocks, mode m1, DestA
281      clocks, mode m1, DestB

Source len=16

55       clocks, mode s1, DestA
55       clocks, mode s1, DestB

60       clocks, mode d1, DestA
77       clocks, mode d1, DestB

22       clocks, mode c1, DestA
22       clocks, mode c1, DestB
         --- OK ---

jj2007

Quote from: BlackVortex on February 15, 2009, 10:35:46 AM
EDIT: What kind of computations does this do/test?
And are lower numbers better or worse?

It's an lstrcpy-type algo (a null-terminated string copy), and lower numbers are better. The purpose is to see how source/destination data alignment affects the timings.
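
In case "alignment" is unclear: an address is aligned to a boundary when it is an exact multiple of that boundary. A minimal sketch in Python, not taken from the testbed (which is assembly); the helper name `misalignment` and the sample addresses are made up for illustration only:

```python
def misalignment(address: int, boundary: int = 16) -> int:
    """Return how many bytes `address` lies past the previous boundary.

    `boundary` must be a power of two; a result of 0 means "aligned".
    """
    if boundary <= 0 or boundary & (boundary - 1):
        raise ValueError("boundary must be a positive power of two")
    return address & (boundary - 1)


# Hypothetical buffer addresses, chosen only to illustrate the idea.
for addr in (0x00403000, 0x00403001, 0x00403004, 0x00403008):
    print(f"{addr:#010x}: {misalignment(addr)} byte(s) past a 16-byte boundary")
```

The same mask trick (`address AND boundary-1`) is what one would typically use in assembly to test a pointer before choosing an aligned or unaligned copy loop.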
Michael: Thanks for the P3 test. It's good to know that the branch to the non-SSE2 algo worked ;-)
Neil: Thanks :thumbu
Here are my own results.

Intel(R) Celeron(R) M CPU        420  @ 1.60GHz (SSE3)

Source len=4096

2754     clocks, mode s1, DestA
3095     clocks, mode s1, DestB

2788     clocks, mode d1, DestA
2543     clocks, mode d1, DestB

5153     clocks, mode c1, DestA
5152     clocks, mode c1, DestB

8235     clocks, mode m1, DestA
8236     clocks, mode m1, DestB

Source len=128

126      clocks, mode s1, DestA
133      clocks, mode s1, DestB

139      clocks, mode d1, DestA
133      clocks, mode d1, DestB

167      clocks, mode c1, DestA
167      clocks, mode c1, DestB

284      clocks, mode m1, DestA
287      clocks, mode m1, DestB

Source len=16

57       clocks, mode s1, DestA
57       clocks, mode s1, DestB

68       clocks, mode d1, DestA
76       clocks, mode d1, DestB

27       clocks, mode c1, DestA
27       clocks, mode c1, DestB

rags

jj, here are my results:


             Intel(R) Pentium(R) 4 CPU 2.53GHz (SSE2)

Source len=4096

9569     clocks, mode s1, DestA
9123     clocks, mode s1, DestB

11921    clocks, mode d1, DestA
8383     clocks, mode d1, DestB

4562     clocks, mode c1, DestA
4387     clocks, mode c1, DestB

8497     clocks, mode m1, DestA
8616     clocks, mode m1, DestB

Source len=128

409      clocks, mode s1, DestA
373      clocks, mode s1, DestB

482      clocks, mode d1, DestA
371      clocks, mode d1, DestB

164      clocks, mode c1, DestA
162      clocks, mode c1, DestB

288      clocks, mode m1, DestA
285      clocks, mode m1, DestB

Source len=16

142      clocks, mode s1, DestA
142      clocks, mode s1, DestB

177      clocks, mode d1, DestA
187      clocks, mode d1, DestB

29       clocks, mode c1, DestA
25       clocks, mode c1, DestB
        --- OK ---
God made Man, but the monkey applied the glue -DEVO

jj2007

Quote from: rags on February 15, 2009, 11:39:29 AM
jj, here are my results:


Thanks, I am very disappointed.  :dazzled:
Could you please give it another try? I want to see if at least the 16-byte aligned version brings some improvement on the P4...


[attachment deleted by admin]

rags

Ok JJ, I ran each one twice for you.
testbed1, run 1:

              Intel(R) Pentium(R) 4 CPU 2.53GHz (SSE2)

Source len=4096

9551     clocks, mode s1, DestA
9101     clocks, mode s1, DestB

11894    clocks, mode d1, DestA
8358     clocks, mode d1, DestB

4440     clocks, mode c1, DestA
4316     clocks, mode c1, DestB

8688     clocks, mode m1, DestA
8579     clocks, mode m1, DestB

Source len=128

409      clocks, mode s1, DestA
397      clocks, mode s1, DestB

484      clocks, mode d1, DestA
380      clocks, mode d1, DestB

164      clocks, mode c1, DestA
161      clocks, mode c1, DestB

286      clocks, mode m1, DestA
298      clocks, mode m1, DestB

Source len=16

142      clocks, mode s1, DestA
116      clocks, mode s1, DestB

171      clocks, mode d1, DestA
179      clocks, mode d1, DestB

29       clocks, mode c1, DestA
27       clocks, mode c1, DestB
;---------------------------------------------------------------------
testbed1, run 2:
              Intel(R) Pentium(R) 4 CPU 2.53GHz (SSE2)

Source len=4096

9589     clocks, mode s1, DestA
9209     clocks, mode s1, DestB

11889    clocks, mode d1, DestA
8444     clocks, mode d1, DestB

4310     clocks, mode c1, DestA
4546     clocks, mode c1, DestB

8920     clocks, mode m1, DestA
8522     clocks, mode m1, DestB

Source len=128

413      clocks, mode s1, DestA
375      clocks, mode s1, DestB

486      clocks, mode d1, DestA
355      clocks, mode d1, DestB

195      clocks, mode c1, DestA
176      clocks, mode c1, DestB

284      clocks, mode m1, DestA
287      clocks, mode m1, DestB

Source len=16

142      clocks, mode s1, DestA
116      clocks, mode s1, DestB

163      clocks, mode d1, DestA
165      clocks, mode d1, DestB

29       clocks, mode c1, DestA
29       clocks, mode c1, DestB

;---------------------------------------------------------------------
testbed2, run 1:
              Intel(R) Pentium(R) 4 CPU 2.53GHz (SSE2)

Source len=4096

9562     clocks, mode s1, DestA
9159     clocks, mode s1, DestB
1611     clocks, mode s1, DestC

11937    clocks, mode d1, DestA
8450     clocks, mode d1, DestB
1630     clocks, mode d1, DestC

4456     clocks, mode c1, DestA
4189     clocks, mode c1, DestB
4612     clocks, mode c1, DestC

8942     clocks, mode m1, DestA
8879     clocks, mode m1, DestB
8764     clocks, mode m1, DestC

Source len=128

451      clocks, mode s1, DestA
380      clocks, mode s1, DestB
187      clocks, mode s1, DestC

484      clocks, mode d1, DestA
376      clocks, mode d1, DestB
234      clocks, mode d1, DestC

166      clocks, mode c1, DestA
162      clocks, mode c1, DestB
162      clocks, mode c1, DestC

289      clocks, mode m1, DestA
300      clocks, mode m1, DestB
343      clocks, mode m1, DestC
;---------------------------------------------------------------------
testbed2, run 2:
              Intel(R) Pentium(R) 4 CPU 2.53GHz (SSE2)

Source len=4096

9685     clocks, mode s1, DestA
9141     clocks, mode s1, DestB
1626     clocks, mode s1, DestC

12009    clocks, mode d1, DestA
8356     clocks, mode d1, DestB
1628     clocks, mode d1, DestC

4397     clocks, mode c1, DestA
4264     clocks, mode c1, DestB
4261     clocks, mode c1, DestC

8532     clocks, mode m1, DestA
8588     clocks, mode m1, DestB
8601     clocks, mode m1, DestC

Source len=128

450      clocks, mode s1, DestA
378      clocks, mode s1, DestB
191      clocks, mode s1, DestC

484      clocks, mode d1, DestA
374      clocks, mode d1, DestB
212      clocks, mode d1, DestC

166      clocks, mode c1, DestA
165      clocks, mode c1, DestB
162      clocks, mode c1, DestC

297      clocks, mode m1, DestA
346      clocks, mode m1, DestB
298      clocks, mode m1, DestC
;---------------------------------------------------------------------

jj2007

Quote from: rags on February 15, 2009, 01:03:49 PM
Ok JJ, I ran each one twice for you.

Intel(R) Pentium(R) 4 CPU 2.53GHz (SSE2)

Source len=4096

9562     clocks, mode s1, DestA
9159     clocks, mode s1, DestB
1611     clocks, mode s1, DestC


Thanxalot, that cheered me up again :bg
(DestC means source and destination are on 16-byte boundaries)
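
To put those numbers in perspective, here is a quick back-of-the-envelope sketch in Python. The function name is mine, the figures are the ones quoted above (rags' P4, mode s1, Source len=4096), and I am assuming the source length is in bytes:

```python
def clocks_per_byte(clocks: int, length: int) -> float:
    """Average clock cycles spent per byte copied."""
    return clocks / length


# Figures from rags' P4 output (testbed2, run 1, mode s1, Source len=4096).
results = {"DestA": 9562, "DestB": 9159, "DestC": 1611}

for dest, clocks in results.items():
    print(f"{dest}: {clocks_per_byte(clocks, 4096):.2f} clocks/byte")

# Ratio between the slowest case and the 16-byte aligned case.
speedup = results["DestA"] / results["DestC"]
print(f"DestC is about {speedup:.1f}x faster than DestA")
```

That works out to roughly 2.3 clocks per byte for DestA against about 0.4 for DestC, i.e. close to a sixfold difference on this P4.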

PBrennick

JJ,
Here are my results. A question: the results from a previous test in another posting listed my CPU as Itanium;
this one says Celeron. How come?

Quote
Intel(R) Celeron(R) CPU 1.70GHz (SSE2)

Source len=4096

9505     clocks, mode s1, DestA
9131     clocks, mode s1, DestB

12294    clocks, mode d1, DestA
8123     clocks, mode d1, DestB

4521     clocks, mode c1, DestA
4260     clocks, mode c1, DestB

8515     clocks, mode m1, DestA
8667     clocks, mode m1, DestB

Source len=128

435      clocks, mode s1, DestA
394      clocks, mode s1, DestB

494      clocks, mode d1, DestA
411      clocks, mode d1, DestB

162      clocks, mode c1, DestA
159      clocks, mode c1, DestB

286      clocks, mode m1, DestA
482      clocks, mode m1, DestB

Source len=16

146      clocks, mode s1, DestA
121      clocks, mode s1, DestB

174      clocks, mode d1, DestA
189      clocks, mode d1, DestB

29       clocks, mode c1, DestA
25       clocks, mode c1, DestB
         --- OK ---

Paul
The GeneSys Project is available from:
The Repository or My crappy website

jj2007

Quote from: PBrennick on February 15, 2009, 01:40:55 PM
JJ,
Here are my results. A question: the results from a previous test in another posting listed my CPU as Itanium;
this one says Celeron. How come?


Paul,
Thanks. The first version of the ShowCPU algo tried to identify processors by family and model. The newer one uses the brand string, which is more precise. Your Celeron behaves like a P4: see rags' post above, after which I posted a second version as TestBed2.zip. The latter also gives timings for the ideal case where both source and destination are paragraph-aligned (on 16-byte boundaries).
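
For the curious, a rough sketch in Python of the two identification approaches. This is not the ShowCPU source; the function names are mine, and the register values have to come from executing CPUID yourself, which Python cannot do directly. Family, model and stepping are bit fields of EAX as returned by CPUID leaf 1 (the extended-model rule below follows Intel's documentation); the brand string is the 48 bytes returned in EAX, EBX, ECX and EDX by leaves 80000002h to 80000004h.

```python
import struct


def decode_signature(eax: int) -> tuple:
    """Split the CPUID leaf 1 EAX value into (family, model, stepping)."""
    stepping = eax & 0xF
    base_model = (eax >> 4) & 0xF
    base_family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    family = base_family
    if base_family == 0xF:           # extended family only counts for family 15
        family += ext_family

    model = base_model
    if base_family in (0x6, 0xF):    # extended model counts for families 6 and 15
        model |= ext_model << 4
    return family, model, stepping


def brand_string(registers) -> str:
    """Join 12 register values (EAX, EBX, ECX, EDX of three leaves) into text."""
    raw = struct.pack("<12I", *registers)        # little-endian ASCII bytes
    return raw.split(b"\0", 1)[0].decode("ascii", errors="replace").strip()


print(decode_signature(0x683))   # family 6, model 8, stepping 3
```

Family and model alone say little about the marketing name, which is why a Celeron and a P4 built on the same core can be indistinguishable that way; the brand string spells it out. Processors that do not implement the extended leaves return nothing useful there, so a fallback to family/model is still needed.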

FORTRANS

Hi,

   Not sure if you want some older CPUs, but here goes.  Pentium MMX
with Windows 98, and PIII with Windows 2000.  The PIII/P3 output
had some control characters that I replaced with the equivalent
text in an editor.

Regards,

Steve N.


This is your CPU:
Model          4
Family         5
Step           3
Manufacturer   GenuineIntel
Description    Intel P1 (1993+), MMX
Brand name


Source len=4096

13082    clocks, mode s1, DestA
12991    clocks, mode s1, DestB
12963    clocks, mode s1, DestC

12952    clocks, mode d1, DestA
12958    clocks, mode d1, DestB
12954    clocks, mode d1, DestC

12920    clocks, mode c1, DestA
12915    clocks, mode c1, DestB
12911    clocks, mode c1, DestC

51220    clocks, mode m1, DestA
51229    clocks, mode m1, DestB
51230    clocks, mode m1, DestC

Source len=128

468    clocks, mode s1, DestA
468    clocks, mode s1, DestB
467    clocks, mode s1, DestC

462    clocks, mode d1, DestA
458    clocks, mode d1, DestB
461    clocks, mode d1, DestC

424    clocks, mode c1, DestA
421    clocks, mode c1, DestB
422    clocks, mode c1, DestC

1655    clocks, mode m1, DestA
1678    clocks, mode m1, DestB
1673    clocks, mode m1, DestC
    --- OK ---

This is your CPU:
Model          8
Family         6
Step           3
Manufacturer   GenuineIntel
Description    Intel P3 (2000+), SSE1
Brand name     ^A^A^B^C

^A^A^B^C (SSE1)

Source len=4096

5219    clocks, mode s1, DestA
5216    clocks, mode s1, DestB
5236    clocks, mode s1, DestC

5218    clocks, mode d1, DestA
5228    clocks, mode d1, DestB
5217    clocks, mode d1, DestC

6245    clocks, mode c1, DestA
6233    clocks, mode c1, DestB
6249    clocks, mode c1, DestC

12441    clocks, mode m1, DestA
12430    clocks, mode m1, DestB
12429    clocks, mode m1, DestC

Source len=128

212    clocks, mode s1, DestA
212    clocks, mode s1, DestB
212    clocks, mode s1, DestC

214    clocks, mode d1, DestA
213    clocks, mode d1, DestB
214    clocks, mode d1, DestC

223    clocks, mode c1, DestA
223    clocks, mode c1, DestB
223    clocks, mode c1, DestC

408    clocks, mode m1, DestA
409    clocks, mode m1, DestB
410    clocks, mode m1, DestC
    --- OK ---


PBrennick

JJ,
Thanx for the explanation. My CPU is, indeed, a Celeron, 1.70GHz. It actually clocks at 1.69GHz, though. The difference between the spec and the actual speed is so slight that I doubt it has any significant impact on any testing I may choose to do. Do my results look okay to you?

Paul

UlliN

Hi,

Two more results:

AMD Athlon(tm) 64 FX-57 Processor (SSE3)

Source len=4096

1934     clocks, mode s1, DestA
1941     clocks, mode s1, DestB
1358     clocks, mode s1, DestC

2103     clocks, mode d1, DestA
2191     clocks, mode d1, DestB
1357     clocks, mode d1, DestC

3807     clocks, mode c1, DestA
3806     clocks, mode c1, DestB
3801     clocks, mode c1, DestC

12648    clocks, mode m1, DestA
12631    clocks, mode m1, DestB
12433    clocks, mode m1, DestC

Source len=128

96       clocks, mode s1, DestA
91       clocks, mode s1, DestB
80       clocks, mode s1, DestC

104      clocks, mode d1, DestA
95       clocks, mode d1, DestB
81       clocks, mode d1, DestC

143      clocks, mode c1, DestA
143      clocks, mode c1, DestB
143      clocks, mode c1, DestC

407      clocks, mode m1, DestA
413      clocks, mode m1, DestB
407      clocks, mode m1, DestC
         --- OK ---


Intel(R) Core(TM)2 Duo CPU     E8500  @ 3.16GHz (SSE4)

Source len=4096

3059     clocks, mode s1, DestA
3577     clocks, mode s1, DestB
1541     clocks, mode s1, DestC

2776     clocks, mode d1, DestA
2628     clocks, mode d1, DestB
1550     clocks, mode d1, DestC

3282     clocks, mode c1, DestA
3281     clocks, mode c1, DestB
3279     clocks, mode c1, DestC

8270     clocks, mode m1, DestA
8238     clocks, mode m1, DestB
8416     clocks, mode m1, DestC

Source len=128

123      clocks, mode s1, DestA
136      clocks, mode s1, DestB
92       clocks, mode s1, DestC

133      clocks, mode d1, DestA
131      clocks, mode d1, DestB
95       clocks, mode d1, DestC

113      clocks, mode c1, DestA
114      clocks, mode c1, DestB
113      clocks, mode c1, DestC

290      clocks, mode m1, DestA
293      clocks, mode m1, DestB
292      clocks, mode m1, DestC
         --- OK ---


Regards
Ulli