The MASM Forum Archive 2004 to 2012

General Forums => The Campus => Topic started by: untio on February 14, 2010, 11:38:37 AM

Title: What is faster
Post by: untio on February 14, 2010, 11:38:37 AM
Hi,
I need to know which of these two instruction sequences is faster:
Set 1:
mov ecx, 0100h
again:
dec ecx
jnz again
Or set 2:
mov ecx, 0100h
again:
loop again
I ask because I need speed in a CRC32 calculation.

Thank you in advance.
Title: Re: What is faster
Post by: dedndave on February 14, 2010, 11:42:31 AM
set 1 is faster on most pentiums (if not all)
but you can measure it yourself
the first post of the first thread in the Laboratory sub-forum has timing macros

give me a few minutes and i will write a little program
Title: Re: What is faster
Post by: dedndave on February 14, 2010, 11:54:23 AM
Pentium 4 Prescott:
Loop
592     clock cycles
581     clock cycles
596     clock cycles
Dec ECX
436     clock cycles
434     clock cycles
436     clock cycles
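
For anyone who wants to reproduce numbers like these: the test program itself is not shown in the thread, so what follows is only a rough sketch of how such a comparison might be timed. It assumes the MASM32 SDK (masm32rt.inc with its print, ustr$, inkey and exit macros) and reads the time-stamp counter directly with rdtsc rather than using the Laboratory timing macros. The label names are made up for illustration. Each result is a single pass that includes the cpuid/rdtsc overhead, so treat the output as a ballpark figure only.

```asm
include \masm32\include\masm32rt.inc
.586                            ; rdtsc and cpuid need at least .586

.code
start:
    push  ebx                   ; cpuid overwrites ebx, so preserve it

    ; ---- version using LOOP ----
    xor   eax, eax
    cpuid                       ; serialize before reading the counter
    rdtsc
    mov   esi, eax              ; keep the low dword of the start count
    mov   ecx, 0100h
loop_top:
    loop  loop_top
    xor   eax, eax
    cpuid                       ; serialize again before the second read
    rdtsc
    sub   eax, esi              ; elapsed cycles (low dword only)
    print ustr$(eax), " clock cycles (loop)", 13, 10

    ; ---- version using DEC ECX / JNZ ----
    xor   eax, eax
    cpuid
    rdtsc
    mov   esi, eax
    mov   ecx, 0100h
dec_top:
    dec   ecx
    jnz   dec_top
    xor   eax, eax
    cpuid
    rdtsc
    sub   eax, esi
    print ustr$(eax), " clock cycles (dec ecx)", 13, 10

    pop   ebx
    inkey                       ; wait for a key press before exiting
    exit
end start
```

Repeating each measurement many times and keeping the lowest value, or running at raised priority, would give steadier numbers than this single pass.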
Title: Re: What is faster
Post by: FORTRANS on February 14, 2010, 01:42:56 PM
Hi,

   Gackelfish!

Regards,

Steve N


P-III

G:\WORK>loopdec
Loop
1483    clock cycles
1483    clock cycles
1483    clock cycles
Dec ECX
522     clock cycles
523     clock cycles
523     clock cycles
Press any key to continue ...
Title: Re: What is faster
Post by: donkey on February 14, 2010, 01:53:56 PM
AMD Athlon X2 QL-62

Loop
788     clock cycles
778     clock cycles
778     clock cycles
Dec ECX
521     clock cycles
521     clock cycles
522     clock cycles
Title: Re: What is faster
Post by: Gunner on February 14, 2010, 02:16:32 PM
Loop
529     clock cycles
528     clock cycles
536     clock cycles
Dec ECX
399     clock cycles
402     clock cycles
403     clock cycles
Press any key to continue ...
Title: Re: What is faster
Post by: Neil on February 14, 2010, 02:51:59 PM
Intel Quad core 9550

Loop
1301      clock cycles
1301      clock cycles
1302      clock cycles
Dec ECX
280        clock cycles
281        clock cycles
281        clock cycles
Title: Re: What is faster
Post by: jj2007 on February 14, 2010, 03:58:21 PM
Celeron M

Loop
1615    clock cycles
1613    clock cycles
1613    clock cycles
Dec ECX
290     clock cycles
290     clock cycles
290     clock cycles
Title: Re: What is faster
Post by: raymond on February 15, 2010, 05:20:30 AM
CoreDuo 2, 1.89 GHz

Loop
1325   clock cycles
1325   clock cycles
1324   clock cycles
Dec ECX
286    clock cycles
286    clock cycles
286    clock cycles

It's still surprising to see such significant variation between processors for both code sequences, especially for the "Dec ECX" case, which I would have expected to take nearly the same number of clock cycles everywhere.

BTW, I also added a test replacing the "Dec ECX" instruction with "Sub ECX,1". The timings were identical.
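
For reference, the substituted loop might look like the sketch below. It assumes the MASM32 SDK (masm32rt.inc, print, ustr$, exit); the label name and the iteration counter in edx are only there for illustration. Both dec and sub set the zero flag that jnz tests, so the loop runs the same number of times. The one architectural difference is that sub also updates the carry flag, while dec leaves it unchanged.

```asm
include \masm32\include\masm32rt.inc

.code
start:
    xor   edx, edx              ; edx counts iterations as a sanity check
    mov   ecx, 0100h
again:
    inc   edx
    sub   ecx, 1                ; sets ZF like dec ecx, and also updates CF
    jnz   again
    print ustr$(edx), " iterations", 13, 10
    exit
end start
```

With ecx starting at 0100h, the counter should come out as 256.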
Title: Re: What is faster
Post by: untio on February 16, 2010, 04:08:14 PM
Hi,
The answers are clear.
I'll use dec ecx.

Many thanks.
Title: Re: What is faster
Post by: Astro on February 19, 2010, 08:53:17 PM
Hi,

As a rough guide, if you look in the MASM32 help file opcodes.chm you will see that on a 486 loop takes 6 cycles and dec only 1. As the above results show though, "your mileage may vary".

Best regards,
Robin.
Title: Re: What is faster
Post by: dacid on February 20, 2010, 08:07:01 PM
AMD Athlon(tm) 64 X2 Dual Core Processor 6000+

Loop
781     clock cycles
781     clock cycles
781     clock cycles
Dec ECX
524     clock cycles
523     clock cycles
524     clock cycles
Title: Re: What is faster
Post by: BlackVortex on February 21, 2010, 05:05:00 AM
Loop
1294    clock cycles
1294    clock cycles
1294    clock cycles
Dec ECX
279     clock cycles
279     clock cycles
279     clock cycles

on my Core2Duo e8400. Seems pentium4+AMD failed. Anyone with an i5 or i7?
Title: Re: What is faster
Post by: Ghandi on February 24, 2010, 10:46:32 AM

Loop
1300    clock cycles
1300    clock cycles
1299    clock cycles
Dec ECX
280     clock cycles
280     clock cycles
281     clock cycles
Press any key to continue ...


E5200 P4 DualCore 2.7ghz
Title: Re: What is faster
Post by: clive on February 24, 2010, 01:58:24 PM
Quote from: untio on February 14, 2010, 11:38:37 AM
Because I need speed in a crc32 calculation.

What specific CRC32 polynomial are you using?

This might be of interest http://www.masm32.com/board/index.php?topic=13420.0

Also, if you are iterating the loop a lot, the top of the loop should be aligned on a 16-byte boundary; a cache line boundary may be even better.

-Clive
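
A minimal sketch of the 16-byte alignment suggestion, assuming the MASM32 SDK (masm32rt.inc, print, ustr$, exit) and an illustrative label name. The align 16 directive pads the code so that the following label starts on a 16-byte boundary; the program then prints the label's address modulo 16 as a check.

```asm
include \masm32\include\masm32rt.inc

.code
start:
    mov   ecx, 0100h
    align 16                    ; pad so the next label is 16-byte aligned
again:
    dec   ecx
    jnz   again

    mov   eax, OFFSET again
    and   eax, 15               ; 0 when the loop top is 16-byte aligned
    print ustr$(eax), " = loop top address mod 16", 13, 10
    exit
end start
```

The padding is executed once on the way into the loop, so it costs little compared with the iterations that follow. Aligning to a full cache line would generally need a code segment declared with a larger alignment, which this sketch does not attempt.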