Benchmark and test for htodw algos.

Started by hutch--, August 03, 2010, 07:05:52 AM

lingo

Guys,
I'm wondering why you continue with these emotions?
It is clear Hutch has invented a new complex phenomenon (wrongly named by him the problem of "code placement")
that occurs in Computer Science. So, he will provide soon a fundamentally new paradigm for modeling, analysis and speed optimizing of this phenomena and his first step was so named from JJ hutchtrick.  :toothy

hutch--

JJ,

that is what the mod was for: to reduce the variation by making every algo use the same string for testing. It still does not solve the problem that, with the identical string assigned to the data section for each algo, some code arrangements affected Lingo's algo but not the others.

I found all of this stuff while building the original benchmark. Lingo's was the fastest, but it was the only one that slowed down to this extent when code was added either before OR after it. The problem is consistency for a freestanding algo: it may be ego massaging to be the top of the food chain with test pieces for timing, but it's of little use with an algorithm that is sensitive to code placement as it renders it inconsistent in general purpose use.
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

hutch--

Now for Lingo continuing to flap off his mouth,

> So, he will provide soon a fundamentally new paradigm for modeling, analysis and speed optimizing of this phenomena and his first step was so named from JJ hutchtrick

It is in fact a fundamentally OLD paradigm for modelling, analysis and speed optimisation called REAL TIME TESTING, as against theoretical test frameworks, for no matter how ego massaging the test bed results are, the real time test is the one that matters.

In the first benchmark I posted, after a lot of fiddling, Lingo's algo was the fastest on my Core2 hardware, but it was inconsistent with code placement (Big word for LINGO = OFFSET): its time fluctuated where the others did not, and it was prone to be slow on its first call, which renders it, as it is, useless for general purpose data processing.

Now given that Lingo is too lazy to try and fix the problem so that a fast algo is available in general purpose terms, there are enough hex to DWORD conversion algos around that are reliable and consistent to not bother with his, if he is more interested in flapping his mouth off than coding something more reliable.

Ain't like I get paid for fixing other people's code, so it's not like it's any great loss.

lingo

"but it's of little use with an algorithm that is sensitive to code placement as it renders it inconsistent in general purpose use."
and
"which renders it, as it is, useless for general purpose data processing."

Wrong again...Why?
According to you, in the worst case, the time of my algo is equal to or faster than the others' algos but not SLOWER,
in the best "code placement" case it is several times faster too...
Hence, it is ALWAYS FASTER than the others' algos, independent of the case!


'he is more interested in flapping his mouth off than coding something more reliable.'
Why continue to do that? You have seen for years what I receive after that... :wink

hutch--

This is the problem in terms of benchmarking: I do most of my work on a Core2 Quad, but with a range of other machines to test on I get wildly different results across all of them. Now with your algo running at its fastest, it's faster on all of them except the antique Celeron, but with fluctuations in its timing depending on the code placement it runs from about as fast as a short version to slower.

Now what I am after is an algo that is faster on most of the processors most of the time, and while this one at its best is fast enough, when it is not at its best it performs poorly. I have tried to track it down with a number of methods: code location (OFFSET), leading and trailing code, inter-algo padding, different code and table alignments, and I even played with changing the table order (ascending, descending and interleaved), and got it faster in its worst case but not running at its full speed.

Real time testing brings out all of these types of problems and they are the hardest ones to solve, so flapping off at me when I have written none of the algos is a sure fire way for me to stop wasting my time and just pick something reliable, as it's hardly a high usage requirement to convert 1 to 8 character hex to DWORD.

dedndave

i am curious...
wouldn't it make sense to take the l_tbl_n tables out of the code segment and place them in the data segment ?
it seems to me that the data cache would work more efficiently that way
particularly if the tables were somewhat close to the string being operated on
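
For readers following the table question, here is a rough C sketch (not the MASM code under discussion) of a table-driven hex to DWORD conversion with the lookup table kept in static data rather than next to the code. The names hex_tbl and htodw_table are made up for illustration and are not Lingo's l_tbl_n tables; the "digit | 10h" encoding is just one way to tell valid entries from empty ones.

```c
#include <stdint.h>

/* Hypothetical 256-entry table in static (data section) storage.
   A valid hex character maps to (digit | 0x10); everything else,
   including the string terminator, stays 0 and means "stop". */
static const uint8_t hex_tbl[256] = {
    ['0'] = 0x10, ['1'] = 0x11, ['2'] = 0x12, ['3'] = 0x13, ['4'] = 0x14,
    ['5'] = 0x15, ['6'] = 0x16, ['7'] = 0x17, ['8'] = 0x18, ['9'] = 0x19,
    ['A'] = 0x1A, ['B'] = 0x1B, ['C'] = 0x1C, ['D'] = 0x1D, ['E'] = 0x1E,
    ['F'] = 0x1F,
    ['a'] = 0x1A, ['b'] = 0x1B, ['c'] = 0x1C, ['d'] = 0x1D, ['e'] = 0x1E,
    ['f'] = 0x1F,
};

/* Convert 1 to 8 hex characters to a 32-bit value.
   Conversion stops at the terminator or the first non-hex character. */
uint32_t htodw_table(const char *s)
{
    uint32_t result = 0;
    for (int i = 0; i < 8; i++) {
        uint8_t t = hex_tbl[(unsigned char)s[i]];
        if (t == 0)
            break;
        result = (result << 4) | (uint32_t)(t & 0x0F);
    }
    return result;
}
```

Called as htodw_table("deadBEEF"), this returns 0DEADBEEFh. Whether such a table sits in .data or after the code is exactly the placement question being asked; the sketch only shows the lookup itself.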

ecube

Thanks for the entertainment guys  :bg

lingo

"and place them in the data segment?"

But he has no data segment in his "test" file.
Ok! I inserted .data segment and some data in it:
.data
align 16
mask39h dq 3939393939393939h
Recompiled the file and oops.... :clap:

C:\5>lingoslow
312 atodw library
109 Alex short
47 Lingo long
110 clive short
Press any key to continue ...
C:\5>lingoslow
297 atodw library
109 Alex short
47 Lingo long
109 clive short
Press any key to continue ...
C:\5>lingoslow
296 atodw library
109 Alex short
47 Lingo long
93 clive short
Press any key to continue ...
C:\5>lingoslow
312 atodw library
109 Alex short
46 Lingo long
94 clive short
Press any key to continue ...
C:\5>lingoslow
312 atodw library
109 Alex short
47 Lingo long
94 clive short
Press any key to continue ...



E^cube,
Y a welcome.. :toothy

Antariy

Jochen's h2dtimings.zip from the last post on the 3rd page:

Intel(R) Celeron(R) CPU 2.13GHz
468 htodw JJ short (124 bytes)
3047 atodw library
1282 Alex short
703 Lingo long
766 Alex long
2078 clive short

469 htodw JJ short (124 bytes)
3000 atodw library
1281 Alex short
703 Lingo long
750 Alex long
2078 clive short

Press any key to continue ...


Bravo, Jochen!
If it works not only fast but also correctly, this is a very good algo!
Do you remember that I talked to you about a 40MB exe? It would implement similar things, but with a bigger "span" :), and would work in about ~10 instructions :)

Bravo!

Alex

Antariy

Note to the community: I really did devise an algo similar to Jochen's algo, but I did not implement it... So Jochen made this very nicely, and I am not accusing Jochen of theft, NOTE this :)
Jochen, BRAVO, you were not too lazy to make this, and this is great!

The next step would be making a table with all DWORDs in the DWORD range :), but this is impossible (even on 64-bit systems)



Alex

jj2007

Quote from: Antariy on August 05, 2010, 11:00:49 PM
Intel(R) Celeron(R) CPU 2.13GHz
468 htodw JJ short (124 bytes)
703 Lingo long

Bravo, Jochen!

Thanks, Alex :bg

However, I did not advertise this one because it is limited to 8-byte strings. If your file has fixed length strings, it will be fine (and it's not even SSE...)

hutch--

Dave,

it was easy enough to put the tables at the end of the algo into the .data section, it just takes adding the .data and .code tags. This is one of the mods I did in early testing when the times started to wander. There was no timing difference either way. I changed the alignment of the table to 4 instead of 16 but there was no change in timing, changed the main algo alignment to 4 but with no change and tried misaligning the lead to the algo but no change.

I have another benchmark testing 1 million random length hex strings which load in dynamic memory, where the algo is close to consistent and it's just barely faster than Alex's long version by a couple of percent on this Core2 Quad, but I have yet to test it over other hardware.
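
A test of that shape could be sketched in C roughly as follows. This is not the benchmark described above, only an assumption-laden outline: the 9-byte slot layout, the names make_hex_strings, htodw_simple, run_pass and count_mismatches, and the use of strtoul as the reference are all illustrative choices.

```c
#include <stdint.h>
#include <stdlib.h>

#define SLOT 9  /* assumed layout: up to 8 hex digits plus terminator */

/* Fill n fixed-size slots in heap memory with random hex strings
   of 1 to 8 digits. The caller frees the returned buffer. */
char *make_hex_strings(size_t n, unsigned seed)
{
    static const char digits[] = "0123456789ABCDEF";
    char *buf = malloc(n * SLOT);
    if (!buf)
        return NULL;
    srand(seed);
    for (size_t i = 0; i < n; i++) {
        int len = 1 + rand() % 8;
        for (int j = 0; j < len; j++)
            buf[i * SLOT + j] = digits[rand() % 16];
        buf[i * SLOT + len] = '\0';
    }
    return buf;
}

/* Plain shift-and-or converter standing in for the algo under test.
   It assumes the input contains only valid hex digits. */
uint32_t htodw_simple(const char *s)
{
    uint32_t r = 0;
    for (; *s; s++) {
        unsigned c = (unsigned char)*s;
        unsigned d = (c <= '9') ? c - '0' : (c | 0x20) - 'a' + 10;
        r = (r << 4) | d;
    }
    return r;
}

/* One benchmark pass: convert every string and return a checksum,
   so the compiler cannot discard the conversions. */
uint32_t run_pass(const char *buf, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += htodw_simple(buf + i * SLOT);
    return sum;
}

/* Sanity check against the C library before trusting any timing. */
size_t count_mismatches(const char *buf, size_t n)
{
    size_t bad = 0;
    for (size_t i = 0; i < n; i++) {
        const char *s = buf + i * SLOT;
        if (htodw_simple(s) != (uint32_t)strtoul(s, NULL, 16))
            bad++;
    }
    return bad;
}
```

A driver would call make_hex_strings(1000000, seed), confirm count_mismatches returns 0, and then time repeated calls to run_pass for each algo being compared.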

mineiro

Intel(R) Pentium(R) Dual CPU E2160 @ 1.80GHz
656 atodw library
172 Alex short
93 Lingo long
110 Alex long
172 clive short
Press any key to continue ...

lingo

Thanks, mineiro, but try it 5 times consecutively and take the best result for every algo. :toothy
Next, please download my file h2dt.zip from page 2, run every exe file 5 times consecutively and take the best result for every algo. Thanks! :U
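
The "run it 5 times and keep the best" procedure can also be automated. Below is a minimal C sketch under stated assumptions: best_of_ms and sample_work are hypothetical names, and clock() measures CPU time at a coarse resolution, so it is only a stand-in for the cycle-level timers the test programs in this thread use.

```c
#include <time.h>

/* Run work(arg) `runs` times and return the shortest elapsed CPU time
   in milliseconds, or -1.0 if runs is not positive. Taking the minimum
   filters out runs disturbed by cache misses or task switches. */
double best_of_ms(int runs, void (*work)(void *), void *arg)
{
    double best = -1.0;
    for (int i = 0; i < runs; i++) {
        clock_t t0 = clock();
        work(arg);
        clock_t t1 = clock();
        double ms = 1000.0 * (double)(t1 - t0) / CLOCKS_PER_SEC;
        if (best < 0.0 || ms < best)
            best = ms;
    }
    return best;
}

/* Placeholder workload: counts up to *(int *)arg through a volatile
   variable so the loop is not optimised away. A real test would call
   the algo being measured here instead. */
void sample_work(void *arg)
{
    volatile int sink = 0;
    int n = *(int *)arg;
    for (int i = 0; i < n; i++)
        sink += i;
}
```

Usage would look like best_of_ms(5, sample_work, &n) with n set to the loop count. Note that the best result hides exactly the slow first call and the fluctuation discussed earlier in the thread, so it answers a different question than a consistency test does.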


lingo

"Intel(R) Celeron(R) CPU 2.13GHz
468 htodw JJ short (124 bytes)->wrong
703 Lingo long
Bravo, Jochen!
Thanks, Alex ...bla..bla..blah.."


It is a new attempt by the two liars to manipulate people again, because JJ didn't include the creation time of his table... :lol