Multiply timings - reg32, FPU, SSE2

Started by jj2007, December 26, 2011, 08:55:27 PM


qWord

Win7-x64
Intel(R) Core(TM) i5 CPU       M 520  @ 2.40GHz (SSE4)
222 cycles for shl      res=10000
197 cycles for mul      res=10000
87 cycles for imul      res=10000
260 cycles for fimul    res=10000
219 cycles for fmul     res=10000
164 cycles for pmuludq+movd     res=0
190 cycles for pmuludq+mem4     res=10000
239 cycles for movss+mulps      res=10000

209 cycles for shl      res=10000
169 cycles for mul      res=10000
82 cycles for imul      res=10000
260 cycles for fimul    res=10000
171 cycles for fmul     res=10000
160 cycles for pmuludq+movd     res=0
202 cycles for pmuludq+mem4     res=10000
207 cycles for movss+mulps      res=10000


--- ok ---


jj2007

Yep, that's it. In the meantime I found this old thread by googling for "xmm abi"; it's actually the top hit.
There is also a post by sinsi pointing to the x64 register usage page.

Mystery solved, thanks to all.

P.S.: Timings for P4:
Intel(R) Pentium(R) 4 CPU 3.40GHz (SSE3)
318 cycles for shl      res=10000
210 cycles for mul      res=10000
95 cycles for imul      res=10000 <<<<<<<<<<<< !!
569 cycles for fimul    res=10000
273 cycles for fmul     res=10000
599 cycles for pmuludq+movd     res=10000
274 cycles for pmuludq+mem4     res=10000
274 cycles for movss+mulps      res=10000