Mul VS 32x Add

Started by Farabi, October 29, 2011, 09:05:13 AM

Farabi

Which is faster? Mul or 32 times addition? If Mul is slower I will reinvent the mul instruction.
Those who had universe knowledges can control the world by a micro processor.
http://www.wix.com/farabio/firstpage

"Etos siperi elegi"

Vortex

Hi Onan,

If I understand correctly, you want to multiply a number by 32. Is that correct? If this is the case, why not use the shl instruction?

shl eax,5 ; multiply eax by 2^5 = 32

dedndave

MUL is surprisingly fast ~8 clock cycles
but, i am sure Erol is right - SHL is probably ~3 clock cycles or less

Farabi

No, not that, Erol. I mean something like this:

http://en.wikipedia.org/wiki/Binary_multiplier#Example
       1011   (this is 11 in binary)
     x 1110   (this is 14 in binary)
     ======
       0000   (this is 1011 x 0)
      1011    (this is 1011 x 1, shifted one position to the left)
     1011     (this is 1011 x 1, shifted two positions to the left)
  + 1011      (this is 1011 x 1, shifted three positions to the left)
  =========
   10011010   (this is 154 in binary)
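
The worked example above can be written as a loop: test the multiplier one bit at a time, add the multiplicand to a running sum whenever the bit is 1, and shift the multiplicand left after every step. This is only a minimal sketch of that idea. The name ShiftAddMul and the register assignments are assumptions made for illustration, and it is not what the CPU's MUL does internally.

```asm
.386
.model flat, stdcall
.code

; Hypothetical helper (name and register use are assumptions).
; In:  eax = multiplicand, ecx = multiplier
; Out: eax = low 32 bits of the product
; Clobbers: ecx, edx
ShiftAddMul proc
    xor   edx, edx          ; edx = running sum
next_bit:
    shr   ecx, 1            ; lowest multiplier bit goes to the carry flag
    jnc   skip_add          ; bit was 0: nothing to add for this row
    add   edx, eax          ; bit was 1: add the shifted multiplicand
skip_add:
    add   eax, eax          ; shift the multiplicand left one position
    test  ecx, ecx          ; any multiplier bits left?
    jnz   next_bit
    mov   eax, edx          ; return the sum
    ret
ShiftAddMul endp

end
```

With eax = 11 and ecx = 14 the loop skips the first row (bit 0 is 0) and then adds 22, 44 and 88, which gives 154, the same as the table. The loop runs once per multiplier bit up to the highest set bit, so a full 32-bit multiplier costs up to 32 iterations with a conditional branch in each.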

Farabi

Well, I guess the Intel Corp guys are far more sophisticated than I am on this stuff.

dedndave

that method of multiplying was faster than MUL on older processors (like the 8088)
on any modern processor, MUL or IMUL is much faster (and much smaller   :P )

dedndave

ok - little surprise here   :eek

i used a multiplier constant of 10
for that constant, only 2 bits are set (1010)
if the constant has more than 2 bits set, MUL or IMUL will be faster   :P

prescott w/htt:
Pentium 4 Prescott (2005+), MMX, SSE3
X 10 SHL/ADD: 6 6 6 6 6
X 10     MUL: 7 7 7 7 7
X 10    IMUL: 7 7 7 7 7
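
The code that was timed is not shown in the post, so the following is only a guess at what a SHL/ADD multiply by 10 could look like, not the routine behind the numbers above. It uses the two set bits of 1010b: 10x = 8x + 2x. The name Mul10ShlAdd is made up for the example.

```asm
.386
.model flat, stdcall
.code

; Hypothetical helper, not the code that was benchmarked.
; In:  eax = x
; Out: eax = low 32 bits of x*10
; Clobbers: edx
Mul10ShlAdd proc
    add   eax, eax          ; eax = 2x  (bit 1 of 1010b)
    mov   edx, eax          ; keep 2x for later
    shl   eax, 2            ; eax = 8x  (bit 3 of 1010b)
    add   eax, edx          ; eax = 8x + 2x = 10x
    ret
Mul10ShlAdd endp

end
```

Every additional set bit in the constant needs another shift and another add, which fits the remark that MUL or IMUL wins once the constant has more than two bits set.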

raymond

If you are multiplying by a constant 10, how fast in comparison would the following be:

lea eax,[eax*4+eax]
add eax,eax
When you assume something, you risk being wrong half the time
http://www.ray.masmcode.com

dedndave

thanks for reminding me, Ray   :P
Pentium 4 Prescott (2005+), MMX, SSE3
X 10         SHL/ADD: 6 6 6 6 6
X 10             MUL: 7 7 7 7 7
X 10            IMUL: 7 7 7 7 7
X 10 LEA [EAX*4+EAX]: 7 7 7 7 7
X 10 ADD [EAX*8+EAX]: 7 7 7 7 7


i believe those methods will be faster on most processors newer than the P4's

jj2007

Core (2006+), MMX, SSE3
X 10         SHL/ADD: 4 4 4 4 4
X 10             MUL: 6 6 6 6 6
X 10            IMUL: 6 6 6 6 6
X 10 LEA [EAX*4+EAX]: 3 3 3 3 3
X 10 ADD [EAX*8+EAX]: 3 3 3 3 3

MichaelW


P3 (2000+), MMX, SSE1
X 10         SHL/ADD: 6 6 6 6 6
X 10             MUL: 6 6 6 6 6
X 10            IMUL: 7 7 7 7 7
X 10 LEA [EAX*4+EAX]: 6 6 6 6 6
X 10 ADD [EAX*8+EAX]: 6 6 6 6 6

eschew obfuscation

Farabi

I bet the Intel guys did not use one IC for the MUL, but several. On the MCS-51, INC takes 24 clocks, ADD takes 12 clocks, and MUL and DIV take 48 clocks, so obviously they did something special, except for the INC instruction.