The MASM Forum Archive 2004 to 2012

General Forums => The Laboratory => Topic started by: hutch-- on November 23, 2010, 02:43:10 AM

Title: Yet Another Benchmark Interface ! :)
Post by: hutch-- on November 23, 2010, 02:43:10 AM
The attached zip file is basically a template for a simplified dialog interface for testing algorithms and displaying results. It's a rewrite of the last one that uses radio buttons to indicate which algo is being tested, and it has had CPUID code added to recognise the processor. It may not handle old processors correctly, but it tests for CPUID and tries to get the processor string if CPUID is supported by the processor. The dialog was created with Ketil Olsen's ResEd and it should be reasonably straightforward to modify if you have or download ResEd. It has both a manifest file and a version control block to keep the idiot end of AV scanners happy.

When I remember how to do the OS version I will add that as well.  :P

Output is a text file that looks like this.


Intel(R) Core(TM)2 Quad CPU Q9650 @ 3.00GHz
Results 8 pass average
timing utoa       455 ms
timing utoa_ex    126 ms
timing udw2str    859 ms
timing vcrt ustr$ 671 ms
Title: Re: Yet Another Benchmark Interface ! :)
Post by: dedndave on November 23, 2010, 03:03:51 AM
GetVersionEx, if i remember

results prescott w/htt
Intel(R) Pentium(R) 4 CPU 3.00GHz
Results 8 pass average
timing utoa       872 ms
timing utoa_ex    293 ms
timing udw2str    2638 ms
timing vcrt ustr$ 2785 ms
Title: Re: Yet Another Benchmark Interface ! :)
Post by: oex on November 23, 2010, 03:10:15 AM
Interesting, the version control block compiled correctly for me this time with 6.15 I've had problems with it in the past....

Having said that it's just crashed :lol

Not sure why but it's hanging after the tests have run

I get the following errors when compiling (I changed the drive letter correctly in the Makeit.bat)


rsrc.rc(3) : fatal error RC1015: cannot open include file 'include'.
Microsoft (R) Windows Resource To Object Converter Version 5.00.1736.1
Copyright (C) Microsoft Corp. 1992-1997. All rights reserved.

CVTRES : fatal error CVT1101: cannot open rsrc.res for reading
Microsoft (R) Macro Assembler Version 6.15.8803
Copyright (C) Microsoft Corp 1981-2000.  All rights reserved.

Assembling: bmtemplate.asm


Tried again and it didn't even run the algorithms.... Not hugely important for me, probably something I'm doing wrong
Title: Re: Yet Another Benchmark Interface ! :)
Post by: dedndave on November 23, 2010, 03:36:40 AM
i had that problem on the last one

on this one - i just ran it
let it run 8 times (takes a while)
if i messed with the buttons on the last version while it was running, it died

as for getting reliable numbers.....
rather than running the test 8 times for 0.5 seconds (or however long) and taking the average,
i am getting good repeatable numbers by running each test 10 times in a loop (50 mS each), keeping only the lowest time

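        mov     ecx,10                  ; number of timed runs
        mov     edx,-1                  ; lowest time so far, starts at the largest unsigned value

time_loop0:
        push    ecx                     ; keep the run count and the running minimum
        push    edx                     ; safe across the timing macros
        counter_begin 59524,HIGH_PRIORITY_CLASS
        INVOKE  crt__itoa,dwVal,offset OutBuf,dwRadix
        counter_end                     ; measured count is compared from eax below
        pop     edx
        pop     ecx
        cmp     eax,edx
        jae     time_loop1              ; not lower than the current minimum, discard it

        xchg    eax,edx                 ; new lowest time goes into edx

time_loop1:
        dec     ecx
        jnz     time_loop0

        print   ustr$(edx),13,10        ; show the lowest of the 10 runs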

note: in the example, dwVal = 0FFFFFFFFh and dwRadix = 2
i adjusted the loop count to achieve ~0.5 seconds total time
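
For readers who do not use MASM, the keep-the-lowest-time idea above can be sketched in C. This is only an illustration under assumptions: `sample_algo`, `time_pass` and `best_of` are hypothetical names, and the standard `clock()` function (CPU time in ticks) stands in for the `counter_begin` / `counter_end` macros, which count clock cycles and are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <time.h>

typedef void (*algo_fn)(void);

static volatile unsigned sink;          /* keeps the workload from being optimised away */

/* Hypothetical workload standing in for the routine under test. */
void sample_algo(void)
{
    char buf[12];
    sprintf(buf, "%u", 4294967295u);
    sink += (unsigned char)buf[0];
}

/* Time `loops` back-to-back calls of fn, in clock() ticks. */
clock_t time_pass(algo_fn fn, unsigned long loops)
{
    clock_t start = clock();
    for (unsigned long i = 0; i < loops; ++i)
        fn();
    return clock() - start;
}

/* Run `passes` timed passes, store each in out[], return the lowest. */
clock_t best_of(algo_fn fn, unsigned long loops, clock_t *out, size_t passes)
{
    assert(passes > 0);
    clock_t best = 0;
    for (size_t p = 0; p < passes; ++p) {
        out[p] = time_pass(fn, loops);
        if (p == 0 || out[p] < best)    /* same test as the cmp/jae pair */
            best = out[p];
    }
    return best;
}
```

Calling `best_of(sample_algo, 1000, times, 10)` with a ten-element `times` array mirrors the ten-run loop in the post. The loop count would still need tuning per machine, as described above.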
Title: Re: Yet Another Benchmark Interface ! :)
Post by: oex on November 23, 2010, 04:03:54 AM
Oh nice 1.... Cheers Dave I thought I'd f*cked up :lol

AMD Sempron(tm) Processor 3100+
Results 8 pass average
timing utoa       939 ms
timing utoa_ex    267 ms
timing udw2str    1078 ms
timing vcrt ustr$ 2927 ms
Title: Re: Yet Another Benchmark Interface ! :)
Post by: hutch-- on November 23, 2010, 04:29:20 AM
Apart from the batch file having my local path, I don't see why you are getting build problems unless you are running an unusual setup.

If you look at the benchmark algo it has this line above it.


    iterate equ <15000000>


which means it runs each algo 15 million times in each pass. You can lower the iteration count if it is too long for an older processor. Beware of the technique of taking the lowest time, it is unreliable in real time on a working machine, if an algorithm fluctuates by any large degree in real time then it probably has its limitations.

It's easy enough to disable the buttons while the test is running.
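
The arithmetic behind the pass average and the iteration count can be sketched in C. The function names are hypothetical, and the sketch assumes that each figure in the log is the mean wall-clock time of one pass of `iterate` calls.

```c
#include <assert.h>
#include <stddef.h>

/* Mean of the per-pass times, in milliseconds. */
double average_ms(const double *pass_ms, size_t passes)
{
    assert(passes > 0);
    double sum = 0.0;
    for (size_t i = 0; i < passes; ++i)
        sum += pass_ms[i];
    return sum / (double)passes;
}

/* Nanoseconds per call, given one pass time and the calls made in that pass. */
double ns_per_call(double pass_ms, unsigned long iterations)
{
    assert(iterations > 0);
    return pass_ms * 1.0e6 / (double)iterations;   /* ms -> ns, then per call */
}
```

Under that assumption, the 455 ms reported for utoa at 15,000,000 iterations works out to roughly 30 ns per call. Lowering `iterate` shortens each pass in proportion without changing the per-call figure.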
Title: Re: Yet Another Benchmark Interface ! :)
Post by: hutch-- on November 23, 2010, 04:38:35 AM
Here is a tweaked version, iteration count dropped by 33% and buttons disabled while the test is running. I removed the drive letter from my local paths.

This much, will you try the EXE before you build it, just to avoid any build differences?
Title: Re: Yet Another Benchmark Interface ! :)
Post by: oex on November 23, 2010, 04:50:19 AM
Yep bug fixed, it was just slow on my machine and I thought it had hung when I couldn't click after first test.... don't understand the rsrc error, I may have changed something this end on my setup....

Again (because it got drowned in bug code) ty for the manifest code, I've had problems building that in before but with this app it worked like a dream with 6.15, now I just have to work out what I did wrong before :bg

AMD Sempron(tm) Processor 3100+
Results 8 pass average
timing utoa       585 ms
timing utoa_ex    177 ms
timing udw2str    666 ms
timing vcrt ustr$ 1826 ms
Title: Re: Yet Another Benchmark Interface ! :)
Post by: dedndave on November 23, 2010, 04:51:31 AM
Quote
Beware of the technique of taking the lowest time, it is unreliable in real time on a working machine, if an algorithm fluctuates by any large degree in real time then it probably has its limitations.

you have to convince me on that one, Hutch
we want to measure clock cycles consumed in an algo
not how much time the system decides to give your thread   :bg
Title: Re: Yet Another Benchmark Interface ! :)
Post by: hutch-- on November 23, 2010, 06:32:39 AM
Dave,

This is the problem: cycle counts on a multithread processor are a theory, whereas real time is what an algorithm runs under. It's the performance in real time in a task switching environment that matters, as algorithms that run in applications run in real time, not in theoretical cycle counts.

This is why I suffer the irritations of timing in real time, I get a result that is something like how an algo will work in practice.
Title: Re: Yet Another Benchmark Interface ! :)
Post by: frktons on November 23, 2010, 07:46:02 AM
Couldn't compile with Make.bat, no time to check why. Going to the airport.
Ran the test:
(http://www.masm32.com/board/index.php?action=dlattach;topic=15416.0;id=8502)

Enjoy

Frank - Win/7 64 bit Aero OFF  :P Soon this will be displayed as well   :wink
Title: Re: Yet Another Benchmark Interface ! :)
Post by: dedndave on November 23, 2010, 10:48:20 AM
well, Hutch, we can always devise a test to find out how things go
i will be interested to see an example where testing one way over another reverses your final selection of code

we may not think about it very often, but we use Michael's timing macros in 2 different ways, really
1) comparing algos in the laboratory
2) piece-by-piece selection of small code sequences during the optimization process

the latter is more or less an individual task, that we don't always see in the forum
but, perhaps the 2 goals require different solutions
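
One way to look for such a reversal is to feed the same sets of timings to both rules and see whether they pick the same winner. The C sketch below assumes hypothetical helper names and takes the samples as plain arrays; it says nothing about how the samples were collected.

```c
#include <assert.h>
#include <stddef.h>

/* Lowest of n timing samples. */
double lowest_time(const double *t, size_t n)
{
    assert(n > 0);
    double best = t[0];
    for (size_t i = 1; i < n; ++i)
        if (t[i] < best)
            best = t[i];
    return best;
}

/* Arithmetic mean of n timing samples. */
double mean_time(const double *t, size_t n)
{
    assert(n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i)
        sum += t[i];
    return sum / (double)n;
}

/* 1 if "lowest time wins" and "lowest mean wins" choose different algos, else 0. */
int rankings_disagree(const double *a, size_t na, const double *b, size_t nb)
{
    int a_wins_by_min  = lowest_time(a, na) < lowest_time(b, nb);
    int a_wins_by_mean = mean_time(a, na) < mean_time(b, nb);
    return a_wins_by_min != a_wins_by_mean;
}
```

With made-up samples such as {10, 10, 30} against {12, 12, 12}, the first set wins on lowest time but loses on the mean, so the function returns 1. Whether real algorithms ever produce timings like that is exactly the open question in the thread.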
Title: Re: Yet Another Benchmark Interface ! :)
Post by: ToutEnMasm on November 25, 2010, 05:32:49 PM
Quote
Intel(R) Celeron(R) CPU 2.80GHz
Results 8 pass average
timing utoa    574 ms
timing utoa_ex 189 ms
timing utoa2   375 ms
timing utoa3   500 ms
Title: Re: Yet Another Benchmark Interface ! :)
Post by: lingo on November 26, 2010, 01:09:48 AM
It is my small improvement of the utoa_ex algo from here (http://www.masm32.com/board/index.php?topic=14642.msg119269#msg119269): :wink
Intel(R) Core(TM)2 Duo CPU E8500 @ 3.16GHz
Results 8 pass average
timing utoa    267 ms
timing utoa_ex 78 ms
timing utoa2   156 ms
timing utoa3   234 ms
timing utoa_exLingo   62 ms

Title: Re: Yet Another Benchmark Interface ! :)
Post by: clive on November 26, 2010, 01:26:06 AM
Intel(R) Atom(TM) CPU N450 @ 1.66GHz
Results 8 pass average
timing utoa    1353 ms
timing utoa_ex 524 ms
timing utoa2   850 ms
timing utoa3   1388 ms
timing utoa_exLingo   515 ms
Title: Re: Yet Another Benchmark Interface ! :)
Post by: oex on November 26, 2010, 03:45:55 AM
AMD Sempron(tm) Processor 3100+
Results 8 pass average
timing utoa    615 ms
timing utoa_ex 189 ms
timing utoa2   338 ms
timing utoa3   587 ms
timing utoa_exLingo   176 ms
Title: Re: Yet Another Benchmark Interface ! :)
Post by: hutch-- on November 26, 2010, 04:36:08 AM
 :U

Very good results on this Core2 Quad. It has cut the time of Paul's algo by about 20%.


Intel(R) Core(TM)2 Quad CPU Q9650 @ 3.00GHz
Results 8 pass average
timing utoa    285 ms
timing utoa_ex 78 ms
timing utoa2   156 ms
timing utoa3   234 ms
timing utoa_exLingo   62 ms
Title: Re: Yet Another Benchmark Interface ! :)
Post by: hutch-- on November 26, 2010, 08:41:53 AM
Lingo,

Sad to say your modification of Paul Dixon's algo is broken. Attached is the test piece.
Title: Re: Yet Another Benchmark Interface ! :)
Post by: dedndave on November 26, 2010, 11:49:47 AM
for 900000000-999999999, it returns the string with a leading 0 (0900000000-0999999999)
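
A fault like this can be caught by comparing the routine against the C library at the digit-count boundaries. The sketch below is in C with hypothetical names. `utoa_padded` is a deliberately faulty stand-in that reproduces only the leading-zero symptom; it is not lingo's code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef void (*utoa_fn)(uint32_t value, char *buf);   /* buf needs at least 11 bytes */

/* Plain reference conversion: no leading zeros, "0" for zero. */
void utoa_ref(uint32_t value, char *buf)
{
    char tmp[10];
    int n = 0;
    do {                                /* digits come out least significant first */
        tmp[n++] = (char)('0' + value % 10u);
        value /= 10u;
    } while (value != 0);
    while (n > 0)                       /* copy them back in reading order */
        *buf++ = tmp[--n];
    *buf = '\0';
}

/* Hypothetical faulty routine: always pads to ten digits. */
void utoa_padded(uint32_t value, char *buf)
{
    sprintf(buf, "%010lu", (unsigned long)value);
}

/* Values around each digit-count boundary, plus the range from the post. */
static const uint32_t probes[] = {
    0u, 9u, 10u, 99u, 100u, 999u, 1000u, 99999u, 100000u,
    99999999u, 100000000u, 899999999u, 900000000u, 999999999u,
    1000000000u, 4294967295u
};

/* Index of the first probe where fn disagrees with the C library, or -1. */
int first_mismatch(utoa_fn fn)
{
    char got[16], want[16];
    for (size_t i = 0; i < sizeof probes / sizeof probes[0]; ++i) {
        fn(probes[i], got);
        snprintf(want, sizeof want, "%lu", (unsigned long)probes[i]);
        if (strcmp(got, want) != 0)
            return (int)i;
    }
    return -1;
}
```

A routine under test would be wrapped to match `utoa_fn` and passed to `first_mismatch`; a return of -1 means every probe matched.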
Title: Re: Yet Another Benchmark Interface ! :)
Post by: lingo on November 26, 2010, 03:13:57 PM
Corrected.  Thank you for testing.. :wink
The technical error was:
Wrong: jc  Lo0
Corrected: jbe Lo0
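
The two jumps differ only when the compared values are equal: `jc` is taken when the carry flag is set, while `jbe` is taken when either the carry flag or the zero flag is set. The post does not show the instruction that sets the flags, so the C model below assumes an unsigned `cmp a, b` before the jump. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool cf;    /* carry flag: set when a < b (unsigned borrow) */
    bool zf;    /* zero flag: set when a == b */
} flags_t;

/* Flags left by `cmp a, b`, i.e. a - b with the result discarded. */
flags_t cmp_flags(uint32_t a, uint32_t b)
{
    flags_t f = { a < b, a == b };
    return f;
}

/* jc: jump if CF = 1 (below). */
bool jc_taken(flags_t f)
{
    return f.cf;
}

/* jbe: jump if CF = 1 or ZF = 1 (below or equal). */
bool jbe_taken(flags_t f)
{
    return f.cf || f.zf;
}
```

For equal operands `jc_taken` is false and `jbe_taken` is true, which fits a fault that shows up only at a boundary value.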
Title: Re: Yet Another Benchmark Interface ! :)
Post by: dedndave on November 26, 2010, 03:30:35 PM
w/lingo's fixed proc
prescott w/htt
Intel(R) Pentium(R) 4 CPU 3.00GHz
Results 8 pass average
timing utoa            802 ms
timing utoa_ex         302 ms
timing utoa2           615 ms
timing utoa3           734 ms
timing utoa_exLingo    247 ms
Title: Re: Yet Another Benchmark Interface ! :)
Post by: brethren on November 26, 2010, 03:57:43 PM
AMD Turion(tm) 64 X2 Mobile Technology TL-52
Results 8 pass average
timing utoa    664 ms
timing utoa_ex 208 ms
timing utoa2   367 ms
timing utoa3   626 ms
timing utoa_exLingo   193 ms
Title: Re: Yet Another Benchmark Interface ! :)
Post by: hutch-- on November 26, 2010, 10:09:03 PM
Compliments, this version works fine.

Deciphered and fit to be read by human beings. Will put it into the benchmark a bit later.

Title: Re: Yet Another Benchmark Interface ! :)
Post by: lingo on November 26, 2010, 11:02:01 PM
Thanks Hutch, but you used my old variant in lingo_fixed.zip,
so pls reload my bmtemplate1.zip file  :wink

Title: Re: Yet Another Benchmark Interface ! :)
Post by: hutch-- on November 26, 2010, 11:47:48 PM
All I did was tweak the jump you posted.
Title: Re: Yet Another Benchmark Interface ! :)
Post by: clive on November 26, 2010, 11:55:12 PM
Intel(R) Pentium(R) Dual CPU T2330 @ 1.60GHz
Results 8 pass average
timing utoa    563 ms
timing utoa_ex 173 ms
timing utoa2   329 ms
timing utoa3   466 ms
timing utoa_exLingo   163 ms


AMD Athlon(tm) II X2 215 Processor
Results 8 pass average
timing utoa    419 ms
timing utoa_ex 153 ms
timing utoa2   224 ms
timing utoa3   374 ms
timing utoa_exLingo   144 ms


Intel(R) Atom(TM) CPU N450 @ 1.66GHz
Results 8 pass average
timing utoa    1343 ms
timing utoa_ex 536 ms
timing utoa2   824 ms
timing utoa3   1339 ms
timing utoa_exLingo   526 ms


Intel(R) Celeron(R) CPU 900 @ 2.20GHz
Results 8 pass average
timing utoa    429 ms
timing utoa_ex 126 ms
timing utoa2   247 ms
timing utoa3   364 ms
timing utoa_exLingo   111 ms
Title: Re: Yet Another Benchmark Interface ! :)
Post by: hutch-- on November 27, 2010, 12:13:36 AM
Here is your later version.


Intel(R) Core(TM)2 Quad CPU Q9650 @ 3.00GHz
Results 8 pass average
timing utoa    281 ms
timing utoa_ex 78 ms
timing utoa2   156 ms
timing utoa3   234 ms
timing utoa_exLingo   62 ms


Lingo,

One question, it would be a much more useful algo if it balanced the CALL / RET pairing. I don't like to fiddle with complex algos, do you think it could be done as no big deal ?
Title: Re: Yet Another Benchmark Interface ! :)
Post by: lingo on November 27, 2010, 01:13:52 AM
"...do you think it could be done as no big deal ?"

You can change the end
from:
        pop   esi
        jmp   edx
to:
        pop   esi
        push  edx
        ret

but you will lose a tick... :wink
You can try to preserve ecx too (like jj) ... :lol
Title: Re: Yet Another Benchmark Interface ! :)
Post by: hutch-- on November 27, 2010, 01:17:43 AM
No change in timing.


Intel(R) Core(TM)2 Quad CPU Q9650 @ 3.00GHz
Results 8 pass average
timing utoa    283 ms
timing utoa_ex 78 ms
timing utoa2   156 ms
timing utoa3   234 ms
timing utoa_exLingo   62 ms
Title: Re: Yet Another Benchmark Interface ! :)
Post by: FORTRANS on November 27, 2010, 01:41:09 PM
Hi,

   This is the bmlog.txt from hutch's code in Reply #6.


Results 8 pass average
timing utoa       1634 ms
timing utoa_ex    646 ms
timing udw2str    2443 ms
timing vcrt ustr$ 4435 ms


   The code in Reply #13 does not run under Windows 2000
on this machine.

Regards,

Steve N.
Title: Re: Yet Another Benchmark Interface ! :)
Post by: hutch-- on November 27, 2010, 01:45:53 PM
Steve,

I know the CPUID code does not get an old box properly, what processor did you run it on ?
Title: Re: Yet Another Benchmark Interface ! :)
Post by: FORTRANS on November 27, 2010, 02:01:28 PM
Hi,

   Oops, it is my Windows 2000 Pro, Pentium III system.
Want/need a Win98 system?

Regards,

Steve N.