A New Testbed for MASM32 Forum

Started by frktons, September 21, 2010, 05:25:26 PM


frktons

Quote from: dedndave on November 10, 2010, 04:31:40 AM
it could make a measurement and calculate a loop count variable
that way, the tests always take ~0.5 seconds

I'm not sure I understood exactly what you meant.
Could you elaborate on that point in more detail?

Frank
Mind is like a parachute. You know what to do in order to use it :-)

dedndave

well - my initial thought was to run the code a few times and adjust the loop count so the test runs 0.5 seconds
but, a better approach might be to run another thread
in that thread, use Sleep,500
when that expires, the timer loop stops and tallies the time and iterations   :P
it would require a little modification on MichaelW's timing code

MichaelW

Perhaps something like this:

;==============================================================================
    include \masm32\include\masm32rt.inc
    .686
    include \masm32\macros\timers.asm
;==============================================================================
    .data
        loopcount dd 1000
        dummy     dd 0
    .code
;==============================================================================
start:
;==============================================================================

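  L0:
    shl loopcount, 1                ; double the loop count on each pass
    invoke GetTickCount
    push eax                        ; save the start tick
    mov esi, loopcount
    counter_begin esi, HIGH_PRIORITY_CLASS
        xchg dummy, edx             ; code under test
    counter_end
    invoke GetTickCount
    pop edx                         ; edx = start tick
    sub eax, edx                    ; eax = elapsed milliseconds
    cmp eax, 500
    jb L0                           ; repeat until one pass takes >= 500 ms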

    print "loopcount   ",9
    print str$(loopcount),13,10,13,10

    invoke GetTickCount
    push eax
    mov esi, loopcount
    counter_begin esi, HIGH_PRIORITY_CLASS
        xchg dummy, edx
    counter_end
    print "actual time ",9
    invoke GetTickCount
    pop edx
    sub eax, edx
    print str$(eax),"ms",13,10,13,10

    inkey "Press any key to exit..."
    exit

;==============================================================================
end start


If the calibration takes too long, it may be possible to speed it up by doing something like a binary search for the loop count.
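One possible reading of the binary-search idea, sketched in C rather than MASM so it can be checked without a real timer: keep the doubling loop to bracket the target, then bisect between the last count that was too short and the first one that was long enough. The names `measure_fn`, `calibrate_loopcount` and `fake_measure` are hypothetical and are not part of timers.asm. The sketch assumes the measured time never decreases as the count grows.

```c
#include <limits.h>

/* Hypothetical callback: runs the test loop `count` times and returns the
   elapsed time in milliseconds. In the posted code this role is played by
   the GetTickCount pair around counter_begin/counter_end. */
typedef unsigned long (*measure_fn)(unsigned long count, void *ctx);

/* Returns the smallest loop count whose measured time reaches target_ms,
   assuming the measured time never decreases as the count grows. */
unsigned long calibrate_loopcount(measure_fn measure, void *ctx,
                                  unsigned long target_ms)
{
    unsigned long lo = 0;      /* largest count known to be too short */
    unsigned long hi = 1000;   /* starting count, as in the posted code */

    /* Phase 1: double until one run is long enough. */
    while (measure(hi, ctx) < target_ms) {
        if (hi > ULONG_MAX / 2)
            return hi;         /* cannot double further; give up here */
        lo = hi;
        hi *= 2;
    }

    /* Phase 2: bisect between the last failing and first passing count. */
    while (hi - lo > 1) {
        unsigned long mid = lo + (hi - lo) / 2;
        if (measure(mid, ctx) >= target_ms)
            hi = mid;
        else
            lo = mid;
    }
    return hi;
}

/* Deterministic stand-in for a real timer, used only to exercise the
   search: pretends that 1000 iterations take exactly 1 ms. */
unsigned long fake_measure(unsigned long count, void *ctx)
{
    (void)ctx;
    return count / 1000;
}
```

With a real timer the readings are noisy, so the bisection would probably want to stop once the bracket is within a few percent instead of narrowing to a single count. Each bisection step is also one more timed run, so whether this shortens the calibration overall depends on how expensive a run is.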
eschew obfuscation

Antariy

Quote from: frktons on November 10, 2010, 03:32:37 AM
The LOOP COUNT was 1.000.000 in the latest version.
I've set it to 100.000 that is fast enough and accurate enough as well.  :P

This is good enough.  :U

frktons

The fresh new release, RC1, is nearing the final release.
It now handles CPU description strings wider than 80 chars, plus some
more cleaning and changes here and there.

Alex's routine is now roughly twice as fast as the previous version thanks to movq,
which replaced movntq  :P
Frank


┌────────────────────────────────────────────────────────────────────────────────────────┐
│OS  : Microsoft Windows 7 Ultimate Edition, 64-bit (build 7600)                         │
│CPU : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz with MMX, SSE1, SSE2, SSE3, SSSE3           │
│                                                                                        │
│Test: Conversion of a screen buffer from DOS to Windows CHAR_INFO structure             │
├──────────────────────────────────┬─────────┬──────────┬──────────┬──────────┬──────────┤
│        Algorithm notes           │Proc Size│ Test # 1 │ Test # 2 │ Test # 3 │ Test # 4 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│01 Alex / MMX - PUNPCKLBW MOVQ    │    64   │    3.183 │    3.183 │    3.182 │    3.180 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│02 Frank / 486 - MOV-BSWAP OVQ    │    43   │    8.397 │    8.379 │    8.368 │    8.375 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│03 Frank / XMM PUNPCKLBW MOVDQA   │    44   │    2.338 │    2.337 │    2.337 │    2.338 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤


Well, as I can see there is still some cleaning to do  :lol

Frank

Antariy


├──────────────────────────────────┬─────────┬──────────┬──────────┬──────────┬──────────┤
│        Algorithm notes           │Proc Size│ Test # 1 │ Test # 2 │ Test # 3 │ Test # 4 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│01 Alex / MMX - PUNPCKLBW MOVQ    │    64   │    9.497 │    9.646 │    9.478 │    9.453 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│02 Frank / 486 - MOV-BSWAP OVQ    │    43   │   13.957 │   14.035 │   13.947 │   14.038 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│03 Frank / XMM PUNPCKLBW MOVDQA   │    44   │    4.605 │    4.668 │    4.606 │    4.616 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤


Apparently, only a big cache makes a real difference to the speed :P

frktons

The string for the algo description was not being cleared. I fixed it.

Yes Alex, cache seems very important for this kind of memory movement.
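For readers new to the test, a plain scalar reference in C may show why it is mostly memory movement: every cell reads 2 bytes and writes 4. This is only a sketch and is not any of the routines timed above. It assumes the DOS buffer holds one character byte followed by one attribute byte per cell, and it uses a hypothetical `CharInfoSketch` struct that mirrors the 16-bit character plus 16-bit attributes layout of the Win32 CHAR_INFO, so that it builds without windows.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the Win32 CHAR_INFO layout (16-bit character, 16-bit
   attributes), declared here so the sketch builds without <windows.h>. */
typedef struct {
    uint16_t ch;
    uint16_t attr;
} CharInfoSketch;

/* Assumes the DOS buffer holds one character byte followed by one
   attribute byte per cell. Each byte is zero-extended to 16 bits. */
void dos_to_char_info(const uint8_t *src, CharInfoSketch *dst, size_t cells)
{
    size_t i;
    for (i = 0; i < cells; i++) {
        dst[i].ch   = src[2 * i];       /* character byte -> 16-bit char */
        dst[i].attr = src[2 * i + 1];   /* attribute byte -> 16-bit attr */
    }
}
```

An 80x25 screen would be 2000 cells. Zero-extending each byte to a word is also what PUNPCKLBW against a zeroed register does for several cells at once, which is presumably why it shows up in the algorithm names.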

Frank

Cleaned algo description display:


┌────────────────────────────────────────────────────────────────────────────────────────┐
│OS  : Microsoft Windows 7 Ultimate Edition, 64-bit (build 7600)                         │
│CPU : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz with MMX, SSE1, SSE2, SSE3, SSSE3           │
│                                                                                        │
│Test: Conversion of a screen buffer from DOS to Windows CHAR_INFO structure             │
├──────────────────────────────────┬─────────┬──────────┬──────────┬──────────┬──────────┤
│        Algorithm notes           │Proc Size│ Test # 1 │ Test # 2 │ Test # 3 │ Test # 4 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│01 Alex / MMX - PUNPCKLBW MOVQ    │    64   │    3.191 │    3.180 │    3.188 │    3.192 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│02 Frank / 486 - MOV-BSWAP        │    43   │    8.361 │    8.358 │    8.369 │    8.373 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│03 Frank / XMM PUNPCKLBW MOVDQA   │    44   │    2.339 │    2.337 │    2.338 │    2.344 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤


frktons

With the first MMX algo by Alex we now have:


┌────────────────────────────────────────────────────────────────────────────────────────┐
│OS  : Microsoft Windows 7 Ultimate Edition, 64-bit (build 7600)                         │
│CPU : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz with MMX, SSE1, SSE2, SSE3, SSSE3           │
│                                                                                        │
│Test: Conversion of a screen buffer from DOS to Windows CHAR_INFO structure             │
├──────────────────────────────────┬─────────┬──────────┬──────────┬──────────┬──────────┤
│        Algorithm notes           │Proc Size│ Test # 1 │ Test # 2 │ Test # 3 │ Test # 4 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│01 Alex / MMX - PUNPCKLBW MOVQ    │    64   │    3.190 │    3.183 │    3.180 │    3.179 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│02 Frank / 486 - MOV-BSWAP        │    43   │    8.998 │    9.004 │    9.012 │    9.006 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│03 Frank / XMM PUNPCKLBW MOVDQA   │    44   │    2.345 │    2.338 │    2.340 │    2.339 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│04 Alex / MMX - PUNPCKLBW MOVNTQ  │    64   │    6.080 │    5.991 │    5.998 │    6.007 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤

:U

With the next release I'll also post this 4th test, along with some improvements.

Enjoy

Frank

frktons

Somebody with an SSE4-capable CPU is needed to test some routines
inside the Testbed.
If anybody owns an SSE4-capable PC, please test the attached code and post the
partial screen, like the following:


┌────────────────────────────────────────────────────────────────────────────────────────┐
│OS  : Microsoft Windows 7 Ultimate Edition, 64-bit (build 7600)                         │
│CPU : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz with MMX, SSE1, SSE2, SSE3, SSSE3           │
│                                                                                        │
│Test: Conversion of a screen buffer from DOS to Windows CHAR_INFO structure             │
├──────────────────────────────────┬─────────┬──────────┬──────────┬──────────┬──────────┤
│        Algorithm notes           │Proc Size│ Test # 1 │ Test # 2 │ Test # 3 │ Test # 4 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│01 Alex / MMX - PUNPCKLBW MOVQ    │    64   │    3.119 │    3.121 │    3.117 │    3.120 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│02 Frank / 486 - MOV-BSWAP        │    43   │    8.354 │    8.364 │    8.346 │    8.347 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│03 Frank / XMM PUNPCKLBW MOVDQA   │    44   │    2.337 │    2.336 │    2.336 │    2.339 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│04 Alex / MMX - PUNPCKLBW MOVNTQ  │    64   │    6.059 │    6.157 │    6.152 │    6.065 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│05 Frank / 386 - MOV-SHIFT        │    42   │    8.521 │    8.646 │    8.521 │    8.520 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤


Thanks

Frank

hutch--

Frank,

The file in zip file RC1A will build but will not run on my XP SP3.

frktons

Quote from: hutch-- on November 11, 2010, 11:11:21 AM
Frank,

The file in zip file RC1A will build but will not run on my XP SP3.

Could you try to use the executable inside the zip?
Can you give me some info about the ML version as well?

I tried to make it build with earlier versions of MASM as well,
but you never know...

Thanks

Frank

dedndave

it runs ok here - even though my CPU does not support SSE4
oh - the one posted above doesn't, either   :P
┌────────────────────────────────────────────────────────────────────────────────────────┐
│OS  : Microsoft Windows XP Professional Service Pack 2 (build 2600)                     │
│CPU : Intel(R) Pentium(R) 4 CPU 3.00GHz with MMX, SSE1, SSE2, SSE3                      │
│                                                                                        │
│Test: Conversion of a screen buffer from DOS to Windows CHAR_INFO structure             │
├──────────────────────────────────┬─────────┬──────────┬──────────┬──────────┬──────────┤
│        Algorithm notes           │Proc Size│ Test # 1 │ Test # 2 │ Test # 3 │ Test # 4 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│01 Alex / MMX - PUNPCKLBW MOVQ    │    64   │    8.111 │    7.890 │    8.220 │    9.305 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│02 Frank / 486 - MOV-BSWAP        │    43   │   13.244 │   13.281 │   13.640 │   13.766 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│03 Frank / XMM PUNPCKLBW MOVDQA   │    44   │    4.364 │    4.372 │    4.385 │    4.383 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│04 Alex / MMX - PUNPCKLBW MOVNTQ  │    64   │    9.085 │    8.859 │    8.790 │    8.824 │
├──────────────────────────────────┼─────────┼──────────┼──────────┼──────────┼──────────┤
│05 Frank / 386 - MOV-SHIFT        │    42   │   13.168 │   13.170 │   13.154 │   13.200 │


Hutch - i cannot build it either
Apparently, ML 10 is required to assemble

frktons

Quote from: dedndave on November 11, 2010, 11:22:53 AM
it runs ok here - even though my CPU does not support SSE4
oh - the one posted above doesn't, either   :P


Apparently you also learned how to copy and paste a text console screen  :clap:
:lol :lol :lol :lol :lol :lol

Did you compile it? If so, which version of MASM did you use?

Thanks Dave anyway  :U

Frank

dedndave

i have copy/pasted it before - review previous posts  :bg

frktons

Quote from: dedndave on November 11, 2010, 11:27:23 AM
i have copy/pasted it before - review previous posts  :bg
:lol :lol :lol :lol

Of course you did  :U