The MASM Forum Archive 2004 to 2012

General Forums => The Campus => Topic started by: Farabi on October 29, 2011, 09:05:13 AM

Title: Mul VS 32x Add
Post by: Farabi on October 29, 2011, 09:05:13 AM
Which is faster: MUL or 32 additions? If MUL is slower, I will reimplement the mul instruction myself.
Title: Re: Mul VS 32x Add
Post by: Vortex on October 29, 2011, 09:28:38 AM
Hi Onan,

If I understand correctly, you want to multiply a number by 32. Is that correct? If this is the case, why not use the shl instruction?

shl eax,5 ; multiply eax by 2^5 = 32
Title: Re: Mul VS 32x Add
Post by: dedndave on October 29, 2011, 03:06:21 PM
MUL is surprisingly fast ~8 clock cycles
but i am sure Erol is right - SHL is probably ~3 clock cycles or less
Title: Re: Mul VS 32x Add
Post by: Farabi on October 29, 2011, 04:03:36 PM
No, not that, Erol. I mean something like this:

http://en.wikipedia.org/wiki/Binary_multiplier#Example
       1011   (this is 11 in binary)
     x 1110   (this is 14 in binary)
     ======
       0000   (this is 1011 x 0)
      1011    (this is 1011 x 1, shifted one position to the left)
     1011     (this is 1011 x 1, shifted two positions to the left)
  + 1011      (this is 1011 x 1, shifted three positions to the left)
  =========
   10011010   (this is 154 in binary)
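
The worked example above can be sketched in C as a loop that adds one shifted copy of the multiplicand per set bit of the multiplier. This is only an illustration of the shift-and-add idea, not code from the thread; the function name shift_add_mul is hypothetical, and the result wraps modulo 2^32, as the low half of a 32-bit MUL would.

```c
#include <stdint.h>

/* Shift-and-add multiplication (illustrative sketch, hypothetical name).
   For every set bit i of b, the value (a << i) is added to the product.
   Arithmetic is unsigned, so overflow wraps modulo 2^32. */
uint32_t shift_add_mul(uint32_t a, uint32_t b)
{
    uint32_t product = 0;

    while (b != 0) {
        if (b & 1u)         /* lowest remaining multiplier bit is set */
            product += a;   /* add the current shifted multiplicand */
        a <<= 1;            /* next partial product sits one place further left */
        b >>= 1;            /* move on to the next multiplier bit */
    }
    return product;
}
```

Calling shift_add_mul(11, 14) walks through the same four partial products as the example (one of them zero) and yields 154. The loop runs once per bit position up to the multiplier's highest set bit, which is why a hardware multiplier tends to beat it when the multiplier has many set bits.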
Title: Re: Mul VS 32x Add
Post by: Farabi on October 29, 2011, 04:05:01 PM
Well, I guess the Intel Corp guys are far more sophisticated than me on this stuff.
Title: Re: Mul VS 32x Add
Post by: dedndave on October 29, 2011, 04:44:29 PM
that method of multiplying was faster than MUL on older processors (like the 8088)
on any modern processor, MUL or IMUL is much faster (and much smaller   :P )
Title: Re: Mul VS 32x Add
Post by: dedndave on October 29, 2011, 05:17:47 PM
ok - little surprise here   :eek

i used a multiplier constant of 10
for that constant, only 2 bits are set (1010)
if the constant has more than 2 bits set, MUL or IMUL will be faster   :P

prescott w/htt:
Pentium 4 Prescott (2005+), MMX, SSE3
X 10 SHL/ADD: 6 6 6 6 6
X 10     MUL: 7 7 7 7 7
X 10    IMUL: 7 7 7 7 7
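
The exact SHL/ADD sequence that was timed is not shown in the thread, so the following C sketch is an assumption about its shape: one shifted term per set bit of the constant. The names mul10_shl_add and set_bits are hypothetical. The second function counts the set bits of a constant, which is the number of shifted terms such a decomposition needs.

```c
#include <stdint.h>

/* x * 10 using the two set bits of 10 (binary 1010): x*8 + x*2.
   Hypothetical helper; wraps modulo 2^32 on overflow. */
uint32_t mul10_shl_add(uint32_t x)
{
    return (x << 3) + (x << 1);
}

/* Count the set bits of a constant. Each set bit costs one shifted
   term in a shift/add decomposition of a multiply by that constant. */
unsigned set_bits(uint32_t c)
{
    unsigned n = 0;

    while (c != 0) {
        c &= c - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}
```

For the constant 10, set_bits returns 2, so the decomposition needs two shifted terms and one addition. A constant such as 255 has eight set bits and would need eight terms in this plain form.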
Title: Re: Mul VS 32x Add
Post by: raymond on October 29, 2011, 06:28:09 PM
If you are multiplying by a constant 10, how fast in comparison would the following be:

lea eax,[eax*4+eax] ; eax = eax*5
add eax,eax ; eax = eax*10
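
As a rough C rendering of the two instructions above (the function name mul10_lea_add is hypothetical), the first step forms 5x the way the LEA address calculation does, and the second doubles it:

```c
#include <stdint.h>

/* Multiply by 10 in two steps, mirroring the LEA/ADD pair.
   Hypothetical helper; unsigned arithmetic wraps modulo 2^32. */
uint32_t mul10_lea_add(uint32_t x)
{
    uint32_t t = x * 4 + x;   /* like lea eax,[eax*4+eax]: t = 5x */
    return t + t;             /* like add eax,eax: result = 10x */
}
```

A C compiler is free to emit something different for this function, so the sketch shows the arithmetic only and says nothing about timing.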
Title: Re: Mul VS 32x Add
Post by: dedndave on October 29, 2011, 06:57:02 PM
thanks for reminding me, Ray   :P
Pentium 4 Prescott (2005+), MMX, SSE3
X 10         SHL/ADD: 6 6 6 6 6
X 10             MUL: 7 7 7 7 7
X 10            IMUL: 7 7 7 7 7
X 10 LEA [EAX*4+EAX]: 7 7 7 7 7
X 10 ADD [EAX*8+EAX]: 7 7 7 7 7


i believe those methods will be faster on most processors newer than the P4's
Title: Re: Mul VS 32x Add
Post by: jj2007 on October 29, 2011, 07:00:02 PM
Core (2006+), MMX, SSE3
X 10         SHL/ADD: 4 4 4 4 4
X 10             MUL: 6 6 6 6 6
X 10            IMUL: 6 6 6 6 6
X 10 LEA [EAX*4+EAX]: 3 3 3 3 3
X 10 ADD [EAX*8+EAX]: 3 3 3 3 3
Title: Re: Mul VS 32x Add
Post by: MichaelW on October 29, 2011, 08:57:05 PM

P3 (2000+), MMX, SSE1
X 10         SHL/ADD: 6 6 6 6 6
X 10             MUL: 6 6 6 6 6
X 10            IMUL: 7 7 7 7 7
X 10 LEA [EAX*4+EAX]: 6 6 6 6 6
X 10 ADD [EAX*8+EAX]: 6 6 6 6 6

Title: Re: Mul VS 32x Add
Post by: Farabi on October 31, 2011, 08:39:24 AM
I bet the Intel guys did not use one IC for the MUL, but several. On the MCS-51, INC takes 24 clocks, ADD 12 clocks, and MUL and DIV 48 clocks each, so obviously they did something special, except for the INC instruction.