The MASM Forum Archive 2004 to 2012

General Forums => The Laboratory => Topic started by: jj2007 on February 15, 2009, 10:19:07 AM

Title: Timings for AMD, P4, Core Duo
Post by: jj2007 on February 15, 2009, 10:19:07 AM
I am cooking some algos and would love to see timings for other processors, especially AMDs, Core Duo, P4.
Any volunteers? Just double-click the exe and post the output. PM me if you want the rather confused source :red
Thanxalot, jj

Title: Re: Timings for AMD, P4, Core Duo
Post by: MichaelW on February 15, 2009, 10:32:38 AM
How about a P3:

 (SSE1)

Source len=4096

5194     clocks, mode s1, DestA
5200     clocks, mode s1, DestB

5193     clocks, mode d1, DestA
5200     clocks, mode d1, DestB

6198     clocks, mode c1, DestA
6205     clocks, mode c1, DestB

12366    clocks, mode m1, DestA
12368    clocks, mode m1, DestB

Source len=128

213      clocks, mode s1, DestA
211      clocks, mode s1, DestB

213      clocks, mode d1, DestA
213      clocks, mode d1, DestB

222      clocks, mode c1, DestA
222      clocks, mode c1, DestB

407      clocks, mode m1, DestA
407      clocks, mode m1, DestB

Source len=16

71       clocks, mode s1, DestA
71       clocks, mode s1, DestB

74       clocks, mode d1, DestA
74       clocks, mode d1, DestB

38       clocks, mode c1, DestA
38       clocks, mode c1, DestB
         --- OK ---

Title: Re: Timings for AMD, P4, Core Duo
Post by: BlackVortex on February 15, 2009, 10:35:46 AM
My results (returned to stock CPU/memory speeds for this test)
AMD Athlon(tm) 64 X2 Dual Core Processor 4200+ (SSE3)

Source len=4096

4622 clocks, mode s1, DestA
4443 clocks, mode s1, DestB

4957 clocks, mode d1, DestA
4872 clocks, mode d1, DestB

8688 clocks, mode c1, DestA
8728 clocks, mode c1, DestB

20870 clocks, mode m1, DestA
27694 clocks, mode m1, DestB

Source len=128

107 clocks, mode s1, DestA
100 clocks, mode s1, DestB

103 clocks, mode d1, DestA
94 clocks, mode d1, DestB

323 clocks, mode c1, DestA
323 clocks, mode c1, DestB

926 clocks, mode m1, DestA
933 clocks, mode m1, DestB

Source len=16

104 clocks, mode s1, DestA
103 clocks, mode s1, DestB

105 clocks, mode d1, DestA
106 clocks, mode d1, DestB

53 clocks, mode c1, DestA
53 clocks, mode c1, DestB
--- OK ---


EDIT: What kind of computations does this do/test?
And are lower numbers better or worse?
Title: Re: Timings for AMD, P4, Core Duo
Post by: Neil on February 15, 2009, 10:53:40 AM
Intel(R) Core(TM)2 Quad  CPU   Q9550  @ 2.83GHz (SSE4)

Source len=4096

3062     clocks, mode s1, DestA
3571     clocks, mode s1, DestB

2752     clocks, mode d1, DestA
2627     clocks, mode d1, DestB

4130     clocks, mode c1, DestA
4130     clocks, mode c1, DestB

8223     clocks, mode m1, DestA
8236     clocks, mode m1, DestB

Source len=128

123      clocks, mode s1, DestA
133      clocks, mode s1, DestB

132      clocks, mode d1, DestA
129      clocks, mode d1, DestB

134      clocks, mode c1, DestA
134      clocks, mode c1, DestB

278      clocks, mode m1, DestA
281      clocks, mode m1, DestB

Source len=16

55       clocks, mode s1, DestA
55       clocks, mode s1, DestB

60       clocks, mode d1, DestA
77       clocks, mode d1, DestB

22       clocks, mode c1, DestA
22       clocks, mode c1, DestB
         --- OK ---
Title: Re: Timings for AMD, P4, Core Duo
Post by: jj2007 on February 15, 2009, 10:58:06 AM
Quote from: BlackVortex on February 15, 2009, 10:35:46 AM
EDIT: What kind of computations does this do/test?
And are lower numbers better or worse?

It's an lstrcpy-type algo, and low numbers are better. The purpose is to see the difference between source/destination data alignment.
Michael: Thanks for the P3 test. It's good to know that the branch to the non-SSE2 algo worked ;-)
Neil: Thanks :thumbu
Here are my own results.

Intel(R) Celeron(R) M CPU        420  @ 1.60GHz (SSE3)

Source len=4096

2754     clocks, mode s1, DestA
3095     clocks, mode s1, DestB

2788     clocks, mode d1, DestA
2543     clocks, mode d1, DestB

5153     clocks, mode c1, DestA
5152     clocks, mode c1, DestB

8235     clocks, mode m1, DestA
8236     clocks, mode m1, DestB

Source len=128

126      clocks, mode s1, DestA
133      clocks, mode s1, DestB

139      clocks, mode d1, DestA
133      clocks, mode d1, DestB

167      clocks, mode c1, DestA
167      clocks, mode c1, DestB

284      clocks, mode m1, DestA
287      clocks, mode m1, DestB

Source len=16

57       clocks, mode s1, DestA
57       clocks, mode s1, DestB

68       clocks, mode d1, DestA
76       clocks, mode d1, DestB

27       clocks, mode c1, DestA
27       clocks, mode c1, DestB
Title: Re: Timings for AMD, P4, Core Duo
Post by: rags on February 15, 2009, 11:39:29 AM
jj, here are my results:


             Intel(R) Pentium(R) 4 CPU 2.53GHz (SSE2)

Source len=4096

9569     clocks, mode s1, DestA
9123     clocks, mode s1, DestB

11921    clocks, mode d1, DestA
8383     clocks, mode d1, DestB

4562     clocks, mode c1, DestA
4387     clocks, mode c1, DestB

8497     clocks, mode m1, DestA
8616     clocks, mode m1, DestB

Source len=128

409      clocks, mode s1, DestA
373      clocks, mode s1, DestB

482      clocks, mode d1, DestA
371      clocks, mode d1, DestB

164      clocks, mode c1, DestA
162      clocks, mode c1, DestB

288      clocks, mode m1, DestA
285      clocks, mode m1, DestB

Source len=16

142      clocks, mode s1, DestA
142      clocks, mode s1, DestB

177      clocks, mode d1, DestA
187      clocks, mode d1, DestB

29       clocks, mode c1, DestA
25       clocks, mode c1, DestB
        --- OK ---
Title: Re: Timings for AMD, P4, Core Duo
Post by: jj2007 on February 15, 2009, 12:46:03 PM
Quote from: rags on February 15, 2009, 11:39:29 AM
jj, here are my results:


Thanks, I am very disappointed.  :dazzled:
Could you please give it another try? I want to see if at least the 16-byte aligned version brings some improvement on the P4...


Title: Re: Timings for AMD, P4, Core Duo
Post by: rags on February 15, 2009, 01:03:49 PM
Ok JJ, I ran each one twice for you.
testbed1, run 1:

              Intel(R) Pentium(R) 4 CPU 2.53GHz (SSE2)

Source len=4096

9551     clocks, mode s1, DestA
9101     clocks, mode s1, DestB

11894    clocks, mode d1, DestA
8358     clocks, mode d1, DestB

4440     clocks, mode c1, DestA
4316     clocks, mode c1, DestB

8688     clocks, mode m1, DestA
8579     clocks, mode m1, DestB

Source len=128

409      clocks, mode s1, DestA
397      clocks, mode s1, DestB

484      clocks, mode d1, DestA
380      clocks, mode d1, DestB

164      clocks, mode c1, DestA
161      clocks, mode c1, DestB

286      clocks, mode m1, DestA
298      clocks, mode m1, DestB

Source len=16

142      clocks, mode s1, DestA
116      clocks, mode s1, DestB

171      clocks, mode d1, DestA
179      clocks, mode d1, DestB

29       clocks, mode c1, DestA
27       clocks, mode c1, DestB
;---------------------------------------------------------------------
testbed1, run 2:
              Intel(R) Pentium(R) 4 CPU 2.53GHz (SSE2)

Source len=4096

9589     clocks, mode s1, DestA
9209     clocks, mode s1, DestB

11889    clocks, mode d1, DestA
8444     clocks, mode d1, DestB

4310     clocks, mode c1, DestA
4546     clocks, mode c1, DestB

8920     clocks, mode m1, DestA
8522     clocks, mode m1, DestB

Source len=128

413      clocks, mode s1, DestA
375      clocks, mode s1, DestB

486      clocks, mode d1, DestA
355      clocks, mode d1, DestB

195      clocks, mode c1, DestA
176      clocks, mode c1, DestB

284      clocks, mode m1, DestA
287      clocks, mode m1, DestB

Source len=16

142      clocks, mode s1, DestA
116      clocks, mode s1, DestB

163      clocks, mode d1, DestA
165      clocks, mode d1, DestB

29       clocks, mode c1, DestA
29       clocks, mode c1, DestB

;---------------------------------------------------------------------
testbed2, run 1:
              Intel(R) Pentium(R) 4 CPU 2.53GHz (SSE2)

Source len=4096

9562     clocks, mode s1, DestA
9159     clocks, mode s1, DestB
1611     clocks, mode s1, DestC

11937    clocks, mode d1, DestA
8450     clocks, mode d1, DestB
1630     clocks, mode d1, DestC

4456     clocks, mode c1, DestA
4189     clocks, mode c1, DestB
4612     clocks, mode c1, DestC

8942     clocks, mode m1, DestA
8879     clocks, mode m1, DestB
8764     clocks, mode m1, DestC

Source len=128

451      clocks, mode s1, DestA
380      clocks, mode s1, DestB
187      clocks, mode s1, DestC

484      clocks, mode d1, DestA
376      clocks, mode d1, DestB
234      clocks, mode d1, DestC

166      clocks, mode c1, DestA
162      clocks, mode c1, DestB
162      clocks, mode c1, DestC

289      clocks, mode m1, DestA
300      clocks, mode m1, DestB
343      clocks, mode m1, DestC
;---------------------------------------------------------------------
testbed2, run 2:
              Intel(R) Pentium(R) 4 CPU 2.53GHz (SSE2)

Source len=4096

9685     clocks, mode s1, DestA
9141     clocks, mode s1, DestB
1626     clocks, mode s1, DestC

12009    clocks, mode d1, DestA
8356     clocks, mode d1, DestB
1628     clocks, mode d1, DestC

4397     clocks, mode c1, DestA
4264     clocks, mode c1, DestB
4261     clocks, mode c1, DestC

8532     clocks, mode m1, DestA
8588     clocks, mode m1, DestB
8601     clocks, mode m1, DestC

Source len=128

450      clocks, mode s1, DestA
378      clocks, mode s1, DestB
191      clocks, mode s1, DestC

484      clocks, mode d1, DestA
374      clocks, mode d1, DestB
212      clocks, mode d1, DestC

166      clocks, mode c1, DestA
165      clocks, mode c1, DestB
162      clocks, mode c1, DestC

297      clocks, mode m1, DestA
346      clocks, mode m1, DestB
298      clocks, mode m1, DestC
;---------------------------------------------------------------------

Title: Re: Timings for AMD, P4, Core Duo
Post by: jj2007 on February 15, 2009, 01:20:57 PM
Quote from: rags on February 15, 2009, 01:03:49 PM
Ok JJ, I ran each one twice for you.

Intel(R) Pentium(R) 4 CPU 2.53GHz (SSE2)

Source len=4096

9562     clocks, mode s1, DestA
9159     clocks, mode s1, DestB
1611     clocks, mode s1, DestC


Thanxalot, that cheered me up again :bg
(DestC means source and destination are on 16-byte boundaries)
Title: Re: Timings for AMD, P4, Core Duo
Post by: PBrennick on February 15, 2009, 01:40:55 PM
JJ,
Here are my results. A question: the results from a previous test in another posting listed my CPU as Itanium;
this one says Celeron. How come?

Quote
Intel(R) Celeron(R) CPU 1.70GHz (SSE2)

Source len=4096

9505     clocks, mode s1, DestA
9131     clocks, mode s1, DestB

12294    clocks, mode d1, DestA
8123     clocks, mode d1, DestB

4521     clocks, mode c1, DestA
4260     clocks, mode c1, DestB

8515     clocks, mode m1, DestA
8667     clocks, mode m1, DestB

Source len=128

435      clocks, mode s1, DestA
394      clocks, mode s1, DestB

494      clocks, mode d1, DestA
411      clocks, mode d1, DestB

162      clocks, mode c1, DestA
159      clocks, mode c1, DestB

286      clocks, mode m1, DestA
482      clocks, mode m1, DestB

Source len=16

146      clocks, mode s1, DestA
121      clocks, mode s1, DestB

174      clocks, mode d1, DestA
189      clocks, mode d1, DestB

29       clocks, mode c1, DestA
25       clocks, mode c1, DestB
         --- OK ---

Paul
Title: Re: Timings for AMD, P4, Core Duo
Post by: jj2007 on February 15, 2009, 01:51:15 PM
Quote from: PBrennick on February 15, 2009, 01:40:55 PM
JJ,
Here are my results. A question: the results from a previous test in another posting listed my CPU as Itanium;
this one says Celeron. How come?


Paul,
Thanks. The first version of the ShowCPU algo tried to identify processors by family and model. The newer one uses the brand string, which is more precise. Your Celeron behaves like a P4 - see rags' post above, after which I posted a second version as TestBed2.zip. The latter also gives timings for the ideal case where both source and destination are para-aligned (i.e. on 16-byte boundaries).
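
For reference, this is roughly what "identify by family and model" means. CPUID leaf 1 returns a signature in EAX, and the fields are extracted with shifts and masks; the brand string comes from leaves 80000002h to 80000004h instead. The Python sketch below only decodes a value that has already been read (it does not execute CPUID), the function name is made up, and it is not the ShowCPU source. The extended-model rule shown is the Intel one; AMD applies it only when the base family is 15.

```python
def decode_signature(eax):
    """Split a CPUID leaf 1 EAX value into (family, model, stepping)."""
    stepping = eax & 0xF
    base_model = (eax >> 4) & 0xF
    base_family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    family = base_family
    if base_family == 0xF:
        # Extended family is only added when the base family is saturated.
        family += ext_family

    model = base_model
    if base_family in (0x6, 0xF):
        # Intel rule: extended model forms the high nibble of the model.
        model |= ext_model << 4

    return family, model, stepping


# 0x543 and 0x683 are built from the Model/Family/Step values that
# FORTRANS posted in this thread; 0x1067A is an arbitrary value that
# exercises the extended-model bits.
for value in (0x543, 0x683, 0x1067A):
    print(hex(value), decode_signature(value))
```

Two different cores can share the same family and still behave very differently, which is why family/model alone gives a coarse description.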
Title: Re: Timings for AMD, P4, Core Duo
Post by: rags on February 15, 2009, 02:09:37 PM
You're welcome, JJ :U
Title: Re: Timings for AMD, P4, Core Duo
Post by: FORTRANS on February 15, 2009, 02:33:42 PM
Hi,

   Not sure if you want some older CPUs, but here goes.  PIII
with Windows 2000, and Pentium MMX with Windows 98.  The
PIII/P3 output had some control characters that I replaced with
the equivalent text in an editor.

Regards,

Steve N.


This is your CPU:
Model          4
Family         5
Step           3
Manufacturer   GenuineIntel
Description    Intel P1 (1993+), MMX
Brand name


Source len=4096

13082    clocks, mode s1, DestA
12991    clocks, mode s1, DestB
12963    clocks, mode s1, DestC

12952    clocks, mode d1, DestA
12958    clocks, mode d1, DestB
12954    clocks, mode d1, DestC

12920    clocks, mode c1, DestA
12915    clocks, mode c1, DestB
12911    clocks, mode c1, DestC

51220    clocks, mode m1, DestA
51229    clocks, mode m1, DestB
51230    clocks, mode m1, DestC

Source len=128

468    clocks, mode s1, DestA
468    clocks, mode s1, DestB
467    clocks, mode s1, DestC

462    clocks, mode d1, DestA
458    clocks, mode d1, DestB
461    clocks, mode d1, DestC

424    clocks, mode c1, DestA
421    clocks, mode c1, DestB
422    clocks, mode c1, DestC

1655    clocks, mode m1, DestA
1678    clocks, mode m1, DestB
1673    clocks, mode m1, DestC
    --- OK ---

This is your CPU:
Model          8
Family         6
Step           3
Manufacturer   GenuineIntel
Description    Intel P3 (2000+), SSE1
Brand name     ^A^A^B^C

^A^A^B^C (SSE1)

Source len=4096

5219    clocks, mode s1, DestA
5216    clocks, mode s1, DestB
5236    clocks, mode s1, DestC

5218    clocks, mode d1, DestA
5228    clocks, mode d1, DestB
5217    clocks, mode d1, DestC

6245    clocks, mode c1, DestA
6233    clocks, mode c1, DestB
6249    clocks, mode c1, DestC

12441    clocks, mode m1, DestA
12430    clocks, mode m1, DestB
12429    clocks, mode m1, DestC

Source len=128

212    clocks, mode s1, DestA
212    clocks, mode s1, DestB
212    clocks, mode s1, DestC

214    clocks, mode d1, DestA
213    clocks, mode d1, DestB
214    clocks, mode d1, DestC

223    clocks, mode c1, DestA
223    clocks, mode c1, DestB
223    clocks, mode c1, DestC

408    clocks, mode m1, DestA
409    clocks, mode m1, DestB
410    clocks, mode m1, DestC
    --- OK ---

Title: Re: Timings for AMD, P4, Core Duo
Post by: PBrennick on February 15, 2009, 03:42:44 PM
JJ,
Thanx for the explanation, my CPU is, indeed, a Celeron, 1.70 GHz. It actually clocks at 1.69 GHz, though. The difference between the spec and the actual speed is so slight that I doubt it has any significant impact on any testing I may choose to do. Do my results look okay to you?

Paul
Title: Re: Timings for AMD, P4, Core Duo
Post by: UlliN on February 15, 2009, 05:36:58 PM
Hi,

two more results :

AMD Athlon(tm) 64 FX-57 Processor (SSE3)

Source len=4096

1934     clocks, mode s1, DestA
1941     clocks, mode s1, DestB
1358     clocks, mode s1, DestC

2103     clocks, mode d1, DestA
2191     clocks, mode d1, DestB
1357     clocks, mode d1, DestC

3807     clocks, mode c1, DestA
3806     clocks, mode c1, DestB
3801     clocks, mode c1, DestC

12648    clocks, mode m1, DestA
12631    clocks, mode m1, DestB
12433    clocks, mode m1, DestC

Source len=128

96       clocks, mode s1, DestA
91       clocks, mode s1, DestB
80       clocks, mode s1, DestC

104      clocks, mode d1, DestA
95       clocks, mode d1, DestB
81       clocks, mode d1, DestC

143      clocks, mode c1, DestA
143      clocks, mode c1, DestB
143      clocks, mode c1, DestC

407      clocks, mode m1, DestA
413      clocks, mode m1, DestB
407      clocks, mode m1, DestC
         --- OK ---


Intel(R) Core(TM)2 Duo CPU     E8500  @ 3.16GHz (SSE4)

Source len=4096

3059     clocks, mode s1, DestA
3577     clocks, mode s1, DestB
1541     clocks, mode s1, DestC

2776     clocks, mode d1, DestA
2628     clocks, mode d1, DestB
1550     clocks, mode d1, DestC

3282     clocks, mode c1, DestA
3281     clocks, mode c1, DestB
3279     clocks, mode c1, DestC

8270     clocks, mode m1, DestA
8238     clocks, mode m1, DestB
8416     clocks, mode m1, DestC

Source len=128

123      clocks, mode s1, DestA
136      clocks, mode s1, DestB
92       clocks, mode s1, DestC

133      clocks, mode d1, DestA
131      clocks, mode d1, DestB
95       clocks, mode d1, DestC

113      clocks, mode c1, DestA
114      clocks, mode c1, DestB
113      clocks, mode c1, DestC

290      clocks, mode m1, DestA
293      clocks, mode m1, DestB
292      clocks, mode m1, DestC
         --- OK ---


Regards
Ulli

Title: Re: Timings for AMD, P4, Core Duo
Post by: jj2007 on February 15, 2009, 06:24:25 PM
Quote from: PBrennick on February 15, 2009, 03:42:44 PM
JJ,
Thanx for the explanation, my CPU is, indeed, a Celeron, 1.70 GHz. It actually clocks at 1.69 GHz, though. The difference between the spec and the actual speed is so slight that I doubt it has any significant impact on any testing I may choose to do.
Probably not. Cycles shouldn't change anyway.

Quote
Do my results look okay to you?

Paul


They look almost identical to rags' P4. I suspect you would get the same dramatic factor 5 improvement for the DestC case (where source and destination are aligned 16).

For the curious: s1 and d1 are SSE2 algos, c1 stands for crt_strcpy, and m1 means Masm32 library szCopy ;-)
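
The factor quoted above can be checked against rags' testbed2 run 1 figures for len=4096. This is just a quick Python calculation on the posted numbers, not part of the test program, and the helper name is made up:

```python
def speedup(unaligned_clocks, aligned_clocks):
    """How many times faster the aligned (DestC) case is."""
    return unaligned_clocks / aligned_clocks


# rags' P4, testbed2 run 1, Source len=4096: (DestA, DestC) clocks per mode
timings = {
    "s1": (9562, 1611),
    "d1": (11937, 1630),
    "c1": (4456, 4612),
}

for mode, (dest_a, dest_c) in timings.items():
    print(f"{mode}: DestA/DestC = {speedup(dest_a, dest_c):.1f}")
```

The two SSE2 modes come out at roughly 6x and 7x, while crt_strcpy stays near 1x, i.e. it gains nothing from the aligned case in that run.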
Title: Re: Timings for AMD, P4, Core Duo
Post by: jj2007 on February 15, 2009, 06:28:01 PM
Quote from: FORTRANS on February 15, 2009, 02:33:42 PM
Hi,

   Not sure if you want some older CPUs, but here goes.

Thanks, Steve. Looks fine. By the way: How did you convince the exe to display the long version of the CPU description? I thought I had coded the short version only ;-)

Title: Re: Timings for AMD, P4, Core Duo
Post by: Mark Jones on February 16, 2009, 04:29:16 AM
From the latest assemble,

AMD Athlon(tm) 64 X2 Dual Core Processor 4000+ (SSE3)

Source len=4096

1677     clocks, mode s1, DestA
1693     clocks, mode s1, DestB
1371     clocks, mode s1, DestC

2246     clocks, mode d1, DestA
1902     clocks, mode d1, DestB
1367     clocks, mode d1, DestC

3863     clocks, mode c1, DestA
3841     clocks, mode c1, DestB
3850     clocks, mode c1, DestC

12520    clocks, mode m1, DestA
12623    clocks, mode m1, DestB
12503    clocks, mode m1, DestC

Source len=128

89       clocks, mode s1, DestA
90       clocks, mode s1, DestB
85       clocks, mode s1, DestC

104      clocks, mode d1, DestA
101      clocks, mode d1, DestB
86       clocks, mode d1, DestC

149      clocks, mode c1, DestA
150      clocks, mode c1, DestB
149      clocks, mode c1, DestC

409      clocks, mode m1, DestA
416      clocks, mode m1, DestB
421      clocks, mode m1, DestC
         --- OK ---
Title: Re: Timings for AMD, P4, Core Duo
Post by: sinsi on February 16, 2009, 04:52:39 AM

Intel(R) Core(TM)2 Quad CPU    Q6600  @ 2.40GHz (SSE4)

Source len=4096

2942     clocks, mode s1, DestA
3687     clocks, mode s1, DestB
1447     clocks, mode s1, DestC

2801     clocks, mode d1, DestA
2705     clocks, mode d1, DestB
1442     clocks, mode d1, DestC

3115     clocks, mode c1, DestA
3113     clocks, mode c1, DestB
3114     clocks, mode c1, DestC

4155     clocks, mode m1, DestA
4157     clocks, mode m1, DestB
4159     clocks, mode m1, DestC

Source len=128

122      clocks, mode s1, DestA
132      clocks, mode s1, DestB
85       clocks, mode s1, DestC

136      clocks, mode d1, DestA
132      clocks, mode d1, DestB
94       clocks, mode d1, DestC

134      clocks, mode c1, DestA
135      clocks, mode c1, DestB
134      clocks, mode c1, DestC

169      clocks, mode m1, DestA
175      clocks, mode m1, DestB
168      clocks, mode m1, DestC
         --- OK ---

Mode m1 doesn't seem to like AMD, does it?

You seem to like numbers jj...later this week I'll be building a 'new' dev box (p3 1000) - even more numbers for you  :bg
Title: Re: Timings for AMD, P4, Core Duo
Post by: jj2007 on February 16, 2009, 07:24:49 AM
Quote from: sinsi on February 16, 2009, 04:52:39 AM


Mode m1 doesn't seem to like AMD, does it?

Indeed, Mark's figures look incredibly slow for the Masm32lib szCopy algo. Among the standard ones, crt_strcpy (c1) is clearly the best - I threw lstrcpy out because it was too bad in all tests.
Title: Re: Timings for AMD, P4, Core Duo
Post by: sinsi on February 16, 2009, 07:37:23 AM
For the more curious, what are DestA etc.? There is a fair bit of difference in the numbers.
Title: Re: Timings for AMD, P4, Core Duo
Post by: jj2007 on February 16, 2009, 08:14:45 AM
Quote from: sinsi on February 16, 2009, 07:37:23 AM
For the more curious, what are DestA etc.? There is a fair bit of difference in the numbers.

Different degrees of misalignment against a 16-byte boundary. SSE2 can work with non-aligned data, but it gets slow - so the algo checks whether aligning both is possible; if yes, it goes for movaps etc.; if no, it has to decide whether to use movups for the source and movaps for the destination, or vice versa. The problem is that some processors are faster with source alignment, others with destination alignment...
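
For anyone who wants the decision spelled out, here is a minimal Python sketch of the logic described above. It is plain integer arithmetic on addresses, not the actual assembly; the function names, the `prefer` switch and the addresses are all made up for illustration.

```python
ALIGN = 16  # movaps requires addresses on a 16-byte boundary


def relative_misalignment(src, dst):
    """Offset between source and destination modulo 16 (the k in n*16+k)."""
    return (dst - src) % ALIGN


def choose_moves(src, dst, prefer="dest"):
    """Return (load, store) instruction names for the main 16-byte loop.

    'prefer' is a hypothetical switch: which side gets the aligned access
    when both cannot be aligned at the same time.
    """
    if relative_misalignment(src, dst) == 0:
        # Same offset from a boundary: after a short byte-wise lead-in
        # both pointers are aligned, so both accesses can use movaps.
        return ("movaps", "movaps")
    if prefer == "source":
        return ("movaps", "movups")  # aligned loads, unaligned stores
    return ("movups", "movaps")      # unaligned loads, aligned stores


src = 0x403000  # made-up addresses chosen to give offsets 4, 12 and 0
for name, dst in (("DestA", 0x405004), ("DestB", 0x40600C), ("DestC", 0x407000)):
    print(name, relative_misalignment(src, dst), choose_moves(src, dst))
```

Which value of `prefer` wins depends on the processor, which is exactly what the timings in this thread are meant to show.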

Intel(R) Celeron(R) M CPU        420  @ 1.60GHz (SSE3)
Data (mis-)alignment:
diff src-DestA:        n*16+4
diff src-DestB:        n*16+12
diff src-DestC:        n*16+0

Source len=4096

2762     clocks, mode s1, DestA
3096     clocks, mode s1, DestB
1488     clocks, mode s1, DestC

2776     clocks, mode d1, DestA
2534     clocks, mode d1, DestB
1501     clocks, mode d1, DestC

5160     clocks, mode c1, DestA
5186     clocks, mode c1, DestB
5634     clocks, mode c1, DestC

8278     clocks, mode m1, DestA
8308     clocks, mode m1, DestB
8299     clocks, mode m1, DestC
Title: Re: Timings for AMD, P4, Core Duo
Post by: jj2007 on February 16, 2009, 09:19:59 AM
And one more for the really curious. A P4 is a P4...  :dazzled: ??
Quote from: rags on February 15, 2009, 01:03:49 PM

              Intel(R) Pentium(R) 4 CPU 2.53GHz (SSE2)

Source len=4096

9562     clocks, mode s1, DestA
9159     clocks, mode s1, DestB
1611     clocks, mode s1, DestC

11937    clocks, mode d1, DestA
8450     clocks, mode d1, DestB
1630     clocks, mode d1, DestC

4456     clocks, mode c1, DestA  = crt_strcpy
4189     clocks, mode c1, DestB
4612     clocks, mode c1, DestC

              Intel(R) Pentium(R) 4 CPU 3.40GHz (SSE3)

Source len=4096

8587     clocks, mode s1, DestA
9119     clocks, mode s1, DestB
3505     clocks, mode s1, DestC

3692     clocks, mode d1, DestA
4752     clocks, mode d1, DestB
2096     clocks, mode d1, DestC

9255     clocks, mode c1, DestA  = crt_strcpy
8544     clocks, mode c1, DestB
5532     clocks, mode c1, DestC
Title: Re: Timings for AMD, P4, Core Duo
Post by: rags on February 16, 2009, 11:59:45 AM
JJ, I ran testbed2 again:

              Intel(R) Pentium(R) 4 CPU 2.53GHz (SSE2)

Source len=4096

9628     clocks, mode s1, DestA
9091     clocks, mode s1, DestB
1617     clocks, mode s1, DestC

11898    clocks, mode d1, DestA
8331     clocks, mode d1, DestB
1619     clocks, mode d1, DestC

4535     clocks, mode c1, DestA
4485     clocks, mode c1, DestB
4156     clocks, mode c1, DestC

Could different amounts of onboard cache or system RAM account for the differences?
I'm not sure how much cache I have; I bought this P4 used from a friend.
I have 2 GB of system RAM.
Title: Re: Timings for AMD, P4, Core Duo
Post by: jj2007 on February 16, 2009, 12:42:55 PM
Quote from: rags on February 16, 2009, 11:59:45 AM
JJ, I ran testbed2 again:
...
Could different amounts of onboard cache or system RAM account for the differences?
I'm not sure how much cache I have; I bought this P4 used from a friend.
I have 2 GB of system RAM.

Don't know what the exact reason is. It's interesting though that the "brand strings" for our processors are identical apart from the clock speed, while yours is an SSE2 and mine an SSE3 processor. Note also that crt_strcpy runs a lot faster on your (older) processor - as if Microsoft had optimised this algo for early P4s ...

EDIT: It seems you have a Northwood, while I have a Prescott P4 (Wiki (http://en.wikipedia.org/wiki/Pentium_4)):

Northwood
... A 2.4 GHz P4 was released in April 2002, and the bus speed increased from 400 MT/s to 533 MT/s for a 2.26 GHz, 2.4 GHz, and 2.53 GHz part in May, 2.66 GHz and 2.8 GHz parts in August

Prescott
On February 1, 2004, Intel introduced a new core codenamed "Prescott". ...  Some programs benefitted from Prescott's doubled cache and SSE3 instructions, whereas others were more crippled by its long, inefficient pipeline.

So the lesson is: Don't rely on the CPUID brand string...
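
A more dependable way to tell the two apart is the feature flags that CPUID leaf 1 returns in ECX and EDX. The Python sketch below only decodes register values that have already been read; the bit positions are the documented ones, but the function name and the sample values are made up (they are constructed, not dumps from real processors).

```python
# (name, register, bit) in CPUID leaf 1, from lowest to highest level
SSE_BITS = [
    ("SSE1", "edx", 25),
    ("SSE2", "edx", 26),
    ("SSE3", "ecx", 0),
    ("SSSE3", "ecx", 9),
    ("SSE4.1", "ecx", 19),
    ("SSE4.2", "ecx", 20),
]


def highest_sse(ecx, edx):
    """Return the name of the highest SSE level flagged, or None."""
    regs = {"ecx": ecx, "edx": edx}
    best = None
    for name, reg, bit in SSE_BITS:
        if (regs[reg] >> bit) & 1:
            best = name
    return best


# Constructed examples: SSE and SSE2 set in EDX, with and without
# the SSE3 bit in ECX.
edx = (1 << 25) | (1 << 26)
print(highest_sse(0, edx))       # SSE2-only part
print(highest_sse(1 << 0, edx))  # part that also reports SSE3
```

Two processors with the same brand string would still differ in the ECX value here.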
Title: Re: Timings for AMD, P4, Core Duo
Post by: rags on February 16, 2009, 01:29:52 PM
Good digging JJ. :)
Title: Re: Timings for AMD, P4, Core Duo
Post by: dsouza123 on February 16, 2009, 10:33:21 PM

Athlon Thunderbird 1170 MHz

---TestBed1---
AMD Athlon(tm) Processor

Source len=4096

5190     clocks, mode s1, DestA
5215     clocks, mode s1, DestB

5191     clocks, mode d1, DestA
5214     clocks, mode d1, DestB

5174     clocks, mode c1, DestA
5195     clocks, mode c1, DestB

16521    clocks, mode m1, DestA
16558    clocks, mode m1, DestB

Source len=128

200      clocks, mode s1, DestA
200      clocks, mode s1, DestB

201      clocks, mode d1, DestA
201      clocks, mode d1, DestB

184      clocks, mode c1, DestA
184      clocks, mode c1, DestB

535      clocks, mode m1, DestA
536      clocks, mode m1, DestB

Source len=16

43       clocks, mode s1, DestA
47       clocks, mode s1, DestB

45       clocks, mode d1, DestA
45       clocks, mode d1, DestB

30       clocks, mode c1, DestA
30       clocks, mode c1, DestB
         --- OK ---

---TestBed2---
AMD Athlon(tm) Processor

Source len=4096

5193     clocks, mode s1, DestA
5210     clocks, mode s1, DestB
5201     clocks, mode s1, DestC

5209     clocks, mode d1, DestA
5190     clocks, mode d1, DestB
5213     clocks, mode d1, DestC

5175     clocks, mode c1, DestA
5197     clocks, mode c1, DestB
5175     clocks, mode c1, DestC

16557    clocks, mode m1, DestA
16543    clocks, mode m1, DestB
17011    clocks, mode m1, DestC

Source len=128

198      clocks, mode s1, DestA
198      clocks, mode s1, DestB
198      clocks, mode s1, DestC

201      clocks, mode d1, DestA
200      clocks, mode d1, DestB
200      clocks, mode d1, DestC

184      clocks, mode c1, DestA
184      clocks, mode c1, DestB
184      clocks, mode c1, DestC

541      clocks, mode m1, DestA
536      clocks, mode m1, DestB
542      clocks, mode m1, DestC
         --- OK ---