This is too slow

Started by frktons, November 18, 2010, 03:10:21 AM


Antariy

Quote from: dedndave on November 22, 2010, 01:38:33 AM
no Alex - same problem

Is the first row not copied properly? I changed only one test description.

dedndave

the first row ?
the first algo data row is where it quits
see the previous posts with the short results - it looks the same

Antariy

Dave, how about this? The first row should show.
This is not a CPU issue; it is a mix-up of string lengths.

dedndave

i know - but i was trying to give Frank something to think about, in terms of CPU/SSE   :bdg

yes - that one works, Alex

┌─────────────────────────────────────────────────────────────[22-Nov-2010 at 01:46 GMT]─┐
│OS  : Microsoft Windows XP Professional Service Pack 2 (build 2600)                     │
│CPU : Intel(R) Pentium(R) 4 CPU 3.00GHz with 2 logical core(s) with SSE3                │
├──────────────────────────────────┬─────────┬──────────┬──────────┬──────────┬──────────┤
│        Algorithm notes           │Proc Size│ Test # 1 │ Test # 2 │ Test # 3 │ Test # 4 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│01 ustrv$ + GetNumberFormat       │    95   │   74,853 │   76,643 │   73,079 │   77,197 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│02 udw2str + GetNumberFormat      │    65   │   71,043 │   75,563 │   70,802 │   86,603 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│03 wsprintf + GetNumberFormat     │    73   │   92,975 │   96,372 │  100,538 │   90,121 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│04 Clive - IDIV and Stack         │   120   │    8,812 │    9,840 │    9,729 │   11,084 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│05 Clive - reciprocal IMUL        │   157   │    3,566 │    3,369 │    3,184 │    3,952 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│06 Hutch ustr$ + format algo      │   159   │   12,113 │   12,253 │   11,941 │   11,992 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤


frktons

Instead of the:

   SSE2    EQU  ON


I should do something like:

CALL ChkSSE2
.if eax
  SSE2  EQU  ON
.else
  SSE2  EQU  OFF
.endif

I don't actually know if this syntax is correct. Is it?  
Mind is like a parachute. You know what to do in order to use it :-)

dedndave

well - you are going to want to test for MMX, SSE, SSE2, SSE3   :bg
use the routine i posted earlier
store the result in a dword and use BT on that dword to see if a feature is present

or the "&"

if FeatureFlags & 20h
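
A minimal sketch of the approach Dave describes, assuming the CPU supports the CPUID instruction and using the standard CPUID function 1 bit positions (EDX bit 23 = MMX, bit 25 = SSE, bit 26 = SSE2; ECX bit 0 = SSE3). Dave's own routine and its flag layout (the 20h mask above) are not shown in this part of the thread, so the names GetFeatures, CpuEdx and CpuEcx are hypothetical:

```asm
.686
.model flat, stdcall
option casemap:none

.data?
CpuEdx  DWORD ?             ; hypothetical: CPUID.1 feature bits from EDX
CpuEcx  DWORD ?             ; hypothetical: CPUID.1 feature bits from ECX

.code
GetFeatures PROC USES ebx   ; CPUID overwrites EBX, so preserve it
    mov   eax, 1            ; function 1 = processor info and feature bits
    cpuid
    mov   CpuEdx, edx
    mov   CpuEcx, ecx
    ret
GetFeatures ENDP

start:
    call  GetFeatures

    bt    CpuEdx, 26        ; copy bit 26 (SSE2) into the carry flag
    jnc   NoSSE2
    nop                     ; placeholder for the SSE2 path
NoSSE2:

    .if CpuEcx & 1          ; "&" inside .IF is a bit test: is SSE3 present?
        nop                 ; placeholder for the SSE3 path
    .endif
    ret                     ; a real program would call ExitProcess here
end start
```

Both forms test the stored dword at run time, so one EXE can adapt to whatever CPU it finds itself on.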

Antariy

Quote from: dedndave on November 22, 2010, 01:47:35 AM
i know - but i was trying to give Frank something to think about, in terms of CPU/SSE   :bdg

yes - that one works, Alex

Thanks!  :U

I changed this in ProgData.inc:

AlgoDescSize     DWORD SIZEOF AlgoDesc-1


:lol

Try making this change in the original source, and tell me the results  :U
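
To illustrate why the -1 matters, here is a small sketch. The real definition of AlgoDesc in ProgData.inc is not shown in this thread, so the string below and the name AlgoDescFull are assumptions; the point is only that SIZEOF counts the terminating zero byte of a zero-terminated string:

```asm
.686
.model flat, stdcall
option casemap:none

.data
; Hypothetical definition: a zero-terminated description string
AlgoDesc      BYTE  "01 ustrv$ + GetNumberFormat", 0

AlgoDescFull  DWORD SIZEOF AlgoDesc      ; 28 = 27 characters plus the zero byte
AlgoDescSize  DWORD SIZEOF AlgoDesc-1    ; 27 = characters only

end
```

SIZEOF binds tighter than the minus sign, so the second line is (SIZEOF AlgoDesc) - 1. Copying the full size would also copy the zero byte into the destination, which would end the row early for anything that treats it as a string.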

hutch--

Looks good here Frank.


┌─────────────────────────────────────────────────────────────[22-Nov-2010 at 01:52 GMT]─┐
│OS  : Microsoft Windows XP Professional Service Pack 3 (build 2600)                     │
│CPU : Intel(R) Core(TM)2 Quad CPU Q9650 @ 3.00GHz with 4 logical core(s) with SSE4.1    │
├──────────────────────────────────┬─────────┬──────────┬──────────┬──────────┬──────────┤
│        Algorithm notes           │Proc Size│ Test # 1 │ Test # 2 │ Test # 3 │ Test # 4 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│01 ustrv$ + GetNumberFormat       │    95   │   32,115 │   32,048 │   31,269 │   31,559 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│02 udw2str + GetNumberFormat      │    65   │   32,146 │   31,748 │   32,162 │   31,828 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│03 wsprintf + GetNumberFormat     │    73   │   37,187 │   37,389 │   37,251 │   37,388 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│04 Clive - IDIV and Stack         │   120   │    2,583 │    2,583 │    2,583 │    2,577 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│05 Clive - reciprocal IMUL        │   157   │    2,052 │    2,017 │    2,009 │    2,019 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│06                                │         │          │          │          │          │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│07                                │         │          │          │          │          │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│08                                │         │          │          │          │          │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│09                                │         │          │          │          │          │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│10                                │         │          │          │          │          │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│11                                │         │          │          │          │          │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│12                                │         │          │          │          │          │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│13                                │         │          │          │          │          │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│14                                │         │          │          │          │          │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│15                                │         │          │          │          │          │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│16                                │         │          │          │          │          │
├──────────────────────────────────┴─────────┴──────────┴──────────┴──────────┴──────────┤
│ Esc         Exit       Copy       Run       View       Save       Info       F1 Help   │
└────────────────────────────────────────────────────────────────────────────────────────┘
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

Antariy

Quote from: frktons on November 22, 2010, 01:48:02 AM
Instead of the:

   SSE2    EQU  ON


I should do something like:

CALL ChkSSE2
.if eax
  SSE2  EQU  ON
.else
  SSE2  EQU  OFF
.endif

I don't actually know if this syntax is correct. Is it?  

No, you cannot mix equates and code - equates are resolved at assembly time, before any code runs.


Quote from: dedndave on November 22, 2010, 01:49:55 AM
well - you are going to want to test for MMX, SSE, SSE2, SSE3   :bg

Dave, in the original "New TestBed" thread I already posted a fully functional, working manager for the algos, which excludes unsupported algos from testing at runtime. Its decision is based on the result returned by the CPUID code at the start of the program.

dedndave

that's it Alex - you found it   :U

as for mixing code and equates - you can handle that as an assembly-time conditional
but, that isn't how i would do it - lol
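
To illustrate the difference Alex and Dave are describing, here is a rough sketch with both kinds of conditional side by side. ChkSSE2 is the name from Frank's post; its behaviour (nonzero in EAX when SSE2 is present) is an assumption, and USE_SSE2 and HasSSE2 are hypothetical names:

```asm
.686
.model flat, stdcall
option casemap:none

USE_SSE2 EQU 1              ; hypothetical switch, fixed when the file is assembled

.data
HasSSE2 DWORD 0             ; hypothetical run-time flag, filled in at startup

.code
ChkSSE2 PROC USES ebx       ; assumed behaviour: EAX != 0 if SSE2 is present
    mov   eax, 1
    cpuid
    mov   eax, edx
    and   eax, 04000000h    ; keep only bit 26 (SSE2)
    ret
ChkSSE2 ENDP

start:
    ; Assembly-time choice: only one branch ends up in the EXE
IF USE_SSE2
    nop                     ; placeholder for the SSE2 version
ELSE
    nop                     ; placeholder for the plain version
ENDIF

    ; Run-time choice: both branches are in the EXE, the CPU result picks one
    call  ChkSSE2
    mov   HasSSE2, eax
    .if HasSSE2 != 0
        nop                 ; placeholder: call the SSE2 algo
    .else
        nop                 ; placeholder: call the plain algo
    .endif
    ret                     ; a real program would call ExitProcess here
end start
```

The IF/ELSE/ENDIF block is decided by the assembler on the build machine, so it cannot react to the CPU the program later runs on. The .if/.else/.endif block expands to a compare and jumps, so it can.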

frktons

Quote from: dedndave on November 22, 2010, 01:49:55 AM
well - you are going to want to test for MMX, SSE, SSE2, SSE3   :bg
use the routine i posted earlier
store the result in a dword and use BT on that dword to see if a feature is present

I think the CPU detection routine already has everything I need. At least, I think so.
Alex, the routine is yours; what do you think?
Quote from: Antariy on November 22, 2010, 01:50:49 AM
Quote from: dedndave on November 22, 2010, 01:47:35 AM
i know - but i was trying to give Frank something to think about, in terms of CPU/SSE   :bdg

yes - that one works, Alex

Thanks!  :U

I changed this in ProgData.inc:

AlgoDescSize     DWORD SIZEOF AlgoDesc-1


:lol

Try making this change in the original source, and tell me the results  :U


This is logical as well.  :U I'll change it, but the routine already works with Dave's workaround.
Maybe this will make the workaround unnecessary. Better if Dave tests it.

Quote from: hutch-- on November 22, 2010, 01:53:37 AM
Looks good here Frank.

Yes Steve. Thanks. It has some problems on older machines.

dedndave

Frank
my work-around was merely a debugging tool used to help isolate the problem

Alex's fix is the right way to do it - and it works
remove my temporary code

frktons

Quote from: dedndave on November 22, 2010, 01:58:43 AM
Frank
my work-around was merely a debugging tool used to help isolate the problem

Alex's fix is the right way to do it - and it works
remove my temporary code

Already done.  :P

Antariy

Quote from: frktons
I think the CPU detection routine already has everything I need. At least, I think so.
Alex, the routine is yours; what do you think?

I have already said many times that my CPUid code returns the highest supported instruction set, and that this can be used to decide which algos to run. Moreover, that "ultra hidden" feature is used in my "Algos Manager" :P :lol

Just read the comments in the AxCPUid code.

Quote from: frktons
This is logical as well.  :U I'll change it, but the routine already works with Dave's workaround.

Dave's workaround is as straightforward as possible, and it shows that the culprit is the zero byte, as mentioned. But it is rather slow: it searches the string and replaces the nulls. Some time ago you didn't want to search for "Here can be your advertisement" for my manager  :lol

Quote from: frktons
Maybe this will make the workaround unnecessary. Better if Dave tests it.

He already tested it; it works.