More Tests needed

Started by frktons, November 13, 2010, 11:21:28 PM

frktons

The New Testbed has been restructured and partly rewritten.
Now it manages up to 16 algos and uses a screen with 40 rows and 90 columns.
All the algos, descriptions, data and procs are in include files.

Please test it on your machines and post the partial screens like the following:


┌────────────────────────────────────────────────────────────────────────────────────────┐
│OS  : Microsoft Windows 7 Ultimate Edition, 64-bit (build 7600)                         │
│CPU : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz with 2 logical core(s) with SSSE3           │
├──────────────────────────────────┬─────────┬──────────┬──────────┬──────────┬──────────┤
│        Algorithm notes           │Proc Size│ Test # 1 │ Test # 2 │ Test # 3 │ Test # 4 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│01 Alex / MMX - PUNPCKLBW MOVQ    │    64   │    3.751 │    3.746 │    3.748 │    3.743 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│02 Frank / 486 - MOV-BSWAP        │    43   │   10.702 │   10.708 │   10.703 │   10.712 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│03 Frank / XMM PUNPCKLBW MOVDQA   │    45   │    2.347 │    2.347 │    2.348 │    2.350 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│04 Alex / MMX - PUNPCKLBW MOVNTQ  │    64   │    7.141 │    7.211 │    7.208 │    7.140 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│05 Frank / 386 - MOV-SHIFT        │    42   │   10.353 │   10.161 │   10.291 │   10.370 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤


Frank
Mind is like a parachute. You know what to do in order to use it :-)

MichaelW

The only problem I see is the 0 logical cores.

┌────────────────────────────────────────────────────────────────────────────────────────┐
│OS  : Microsoft Windows 2000 Professional Service Pack 4 (build 2195)                   │
│CPU : Pentium III with 0 logical core(s) with SSE1                                      │
├──────────────────────────────────┬─────────┬──────────┬──────────┬──────────┬──────────┤
│        Algorithm notes           │Proc Size│ Test # 1 │ Test # 2 │ Test # 3 │ Test # 4 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│01 Alex / MMX - PUNPCKLBW MOVQ    │    64   │   17.532 │   17.551 │   17.560 │   17.558 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│02 Frank / 486 - MOV-BSWAP        │    43   │   19.390 │   19.380 │   19.383 │   19.377 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│03 Frank / XMM PUNPCKLBW MOVDQA   │    45   │    1.591 │    1.598 │    1.591 │    1.592 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│04 Alex / MMX - PUNPCKLBW MOVNTQ  │    64   │   10.154 │   10.152 │   10.153 │   10.152 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│05 Frank / 386 - MOV-SHIFT        │    42   │   18.874 │   18.893 │   18.886 │   18.955 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
eschew obfuscation

Antariy

Quote from: MichaelW on November 14, 2010, 12:11:44 AM
The only problem I see is the 0 logical cores.

That is because this field is not implemented on Intel CPUs earlier than the PIV. I read bits 16...23 of EBX after CPUID with EAX=1 to get this value, but on the PIII those EBX bits were "undefined".
This can be fixed, of course, but it is funny enough  :lol

Frank, find this place in the sources; search for shr ebx,16:


mov eax,1
cpuid
shr ebx,16
and ebx,255


and add these lines after and ebx,255:

sete al
add bl,al
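
For anyone who wants to check the logic outside the assembler, the same extract-and-clamp step can be sketched in C. This is only an illustration under assumptions: the helper name logical_count_from_ebx is hypothetical and not part of the testbed sources, and the EBX value is passed in as a plain parameter instead of being read with CPUID.

```c
#include <stdint.h>

/* Hypothetical helper (not from the testbed sources).
   Input: the EBX value returned by CPUID with EAX=1.
   Output: the logical processor count, never less than 1. */
static unsigned logical_count_from_ebx(uint32_t ebx)
{
    unsigned count = (ebx >> 16) & 0xFFu;  /* shr ebx,16 / and ebx,255 */
    count += (count == 0);                 /* sete al    / add bl,al   */
    return count;
}
```

Calling logical_count_from_ebx(0) gives 1, which is the PIII case above; a value with 2 in bits 16...23, such as 0x00020800, gives 2.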




Alex

frktons

OK Alex.

Changed and posted here.

The results MichaelW posted seem a bit strange:

01 Alex / MMX - PUNPCKLBW MOVQ    │    64   │   17.532 │   17.551 │   17.560 │   17.558


and


03 Frank / XMM PUNPCKLBW MOVDQA   │    45   │    1.591 │    1.598 │    1.591 │    1.592

::)



Frank

Antariy

Quote from: frktons on November 14, 2010, 01:39:17 AM
OK Alex.

Changed and posted here.

The results MichaelW posted seem a bit strange:

01 Alex / MMX - PUNPCKLBW MOVQ    │    64   │   17.532 │   17.551 │   17.560 │   17.558


and


03 Frank / XMM PUNPCKLBW MOVDQA   │    45   │    1.591 │    1.598 │    1.591 │    1.592

::)

Your XMM code is SSE2, so it does not work on the PIII. The PIII does not raise #UD for some SSE2 instructions (including the ones used in your proc) but treats them as MMX instructions. So the results mean nothing.
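
One way a testbed could avoid timing such a proc on an unsupported CPU is to check the feature flags first. The C sketch below is an assumption-laden illustration, not code from the testbed: the function name can_run_sse2_algo is hypothetical, and the EDX value from CPUID with EAX=1 is passed in as a parameter. The bit positions (MMX = 23, SSE = 25, SSE2 = 26) are the documented CPUID feature flags.

```c
#include <stdint.h>

/* Feature flags in EDX after CPUID with EAX=1. */
#define CPUID_EDX_MMX  (1u << 23)
#define CPUID_EDX_SSE  (1u << 25)
#define CPUID_EDX_SSE2 (1u << 26)

/* Hypothetical gate: returns 1 only if the CPU reports SSE2,
   so an SSE2 proc can be skipped instead of timed. */
static int can_run_sse2_algo(uint32_t cpuid1_edx)
{
    return (cpuid1_edx & CPUID_EDX_SSE2) != 0;
}
```

A PIII reports MMX and SSE but not SSE2, so the gate returns 0 for it and the XMM row could be shown as unsupported.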

Frank, your new archive does not contain the sources :P

00402E17 C1EB10                 shr     ebx,10h
00402E1A 81E3FF000000           and     ebx,0FFh
00402E20 0F94C0                 sete    al
00402E23 02D8                   add     bl,al
00402E25 895C2404               mov     [esp+4],ebx




Alex

frktons

Quote from: Antariy on November 14, 2010, 02:01:09 AM
Your XMM code is SSE2, so it does not work on the PIII. The PIII does not raise #UD for some SSE2 instructions (including the ones used in your proc) but treats them as MMX instructions. So the results mean nothing.

Frank, your new archive does not contain the sources :P

00402E17 C1EB10                 shr     ebx,10h
00402E1A 81E3FF000000           and     ebx,0FFh
00402E20 0F94C0                 sete    al
00402E23 02D8                   add     bl,al
00402E25 895C2404               mov     [esp+4],ebx


Alex


I only posted the modified pieces, not everything. Just overwrite the previous files in the folder
and everything will be updated.  :U

Antariy

Frank, you put ProgData.inc into the archive instead of ProgProc.inc.

frktons

Quote from: Antariy on November 14, 2010, 02:07:25 AM
Frank, you put ProgData.inc into the archive instead of ProgProc.inc.


The few neurons still ON didn't realize it.  :dazzled:
Here is ProgProc.inc

Slugsnack

For the 0 logical cores problem, not sure if you want to do it but you could alternatively use GetProcessAffinityMask

dedndave

yah - that is the easy way
seeing as you should call that function anyways to get the system mask
(i also like to select core 0 while reading CPUID to ensure the results all come from the same place)

you can use the system affinity mask to get total logical cores
then, examine the HTT bit in 0_1:EDX[28]
if they have hyper-threading, divide the total logical cores by 2 to find physical cores
otherwise, the bits counted in the system affinity mask represent the number of physical cores
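
The counting steps above can be sketched in C as follows. This is a rough sketch under stated assumptions, not a definitive implementation: the names count_mask_bits and estimate_physical_cores are hypothetical, the system affinity mask (on Windows it would come from the lpSystemAffinityMask output of GetProcessAffinityMask) and the EDX value from CPUID with EAX=1 are passed in as parameters, and the HTT flag is taken as bit 28 of that EDX value.

```c
#include <stdint.h>

/* HTT feature flag: bit 28 of EDX after CPUID with EAX=1. */
#define CPUID_EDX_HTT (1u << 28)

/* Count the set bits in an affinity mask (one bit per logical processor). */
static unsigned count_mask_bits(uint64_t mask)
{
    unsigned n = 0;
    while (mask != 0) {
        mask &= mask - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Hypothetical estimate following the steps above:
   halve the logical count when HTT is reported, otherwise
   treat every logical processor as a physical core. */
static unsigned estimate_physical_cores(uint64_t system_mask, uint32_t cpuid1_edx)
{
    unsigned logical = count_mask_bits(system_mask);
    if ((cpuid1_edx & CPUID_EDX_HTT) && logical > 1)
        return logical / 2;
    return logical;
}
```

Note that the halving is only a heuristic: the HTT flag indicates that a package may report more than one logical processor, and some multi-core CPUs without hyper-threading set it too, so the estimate can come out too low on those parts.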

i am having a look at your Info page, Frank
give me some time   :P

Antariy

Quote from: dedndave on November 14, 2010, 07:19:11 PM
yah - that is the easy way

Nothing can be simpler than using the CPU's native info with the help of a couple of instructions :P

Antariy

Quote from: Slugsnack on November 14, 2010, 06:35:27 PM
For the 0 logical cores problem, not sure if you want to do it but you could alternatively use GetProcessAffinityMask

0 cores is not a problem: that feature simply does not exist on old CPUs. If we get zero as the core count, it is a meaningless value because the feature is not implemented, and we can just set the count to 1, which is what the code does now.

frktons

Quote from: Antariy on November 14, 2010, 09:37:21 PM
Quote from: Slugsnack on November 14, 2010, 06:35:27 PM
For the 0 logical cores problem, not sure if you want to do it but you could alternatively use GetProcessAffinityMask

0 cores is not a problem: that feature simply does not exist on old CPUs. If we get zero as the core count, it is a meaningless value because the feature is not implemented, and we can just set the count to 1, which is what the code does now.


Hi Alex, can we keep the:


and add these lines after and ebx,255:

sete al
add bl,al


To display "1 core" or are you thinking about something else?

Frank

Antariy

Quote from: frktons on November 14, 2010, 10:29:24 PM
Quote from: Antariy on November 14, 2010, 09:37:21 PM
Quote from: Slugsnack on November 14, 2010, 06:35:27 PM
For the 0 logical cores problem, not sure if you want to do it but you could alternatively use GetProcessAffinityMask

0 cores is not a problem: that feature simply does not exist on old CPUs. If we get zero as the core count, it is a meaningless value because the feature is not implemented, and we can just set the count to 1, which is what the code does now.


Hi Alex, can we keep the:


and add these lines after and ebx,255:

sete al
add bl,al


To display "1 core" or are you thinking about something else?


I am 99.99% sure that is the simplest possible way. Dave's way is much harder (and slower :P), and it does not guarantee correct results, because the count returned by the CPU is the real count, while the value returned by the OS is a count implemented by the OS :P.

My current solution is: if bits 16...23 of EBX are zero (on old CPUs), the core count is set to 1. That is the simplest way, and reliable enough :P

frktons

Quote from: Antariy on November 14, 2010, 10:40:14 PM
I am 99.99% sure that is the simplest possible way. Dave's way is much harder (and slower :P), and it does not guarantee correct results, because the count returned by the CPU is the real count, while the value returned by the OS is a count implemented by the OS :P.

My current solution is: if bits 16...23 of EBX are zero (on old CPUs), the core count is set to 1. That is the simplest way, and reliable enough :P


OK  :U

We keep the code as it is  :P