Finding a character in a string - strchr.

Started by KeepingRealBusy, June 24, 2010, 04:25:24 AM

lingo

Dave, please just redownload my code and test it again... :wink

KeepingRealBusy

Lingo,

Please explain in detail what you want me to do to "test it again".

Do you want me to test your .exe? If I do this, how can I test a short string? Your function does not fail with a long string and those were the only tests included with the timing driver calls. To test a short string requires that I re-assemble the source file and then I get into the same problem.

Have you updated your source and produced a new .zip? If so where is it? The path I see in the post for StrChar3a.zip is somewhere in the MASM32 site. Had the .zip been a link to your site, then downloading an updated copy of the .zip would potentially get an updated version that would work.

What assembler did you use to assemble your version? You will note that my makeit.bat had removed the \masm32\bin\ prefix for the ml command and I just call ml and get the Visual Studio masm (9.0). I assembled with masm (8.0) by adding back the \masm32\bin\ and got the same result, the same bad jne @f+15 that goes to the middle of the two byte instruction je @b instead of the start of the next instruction bsf edx,edx.

Please give me some more detail and I will be glad to test it again.

Dave.


lingo

I changed it a bit to become more clear... :wink

KeepingRealBusy

Lingo,

I downloaded the zip and looked at the code. Ugh! I then executed the .exe without re-assembling (this is on my laptop without any development tools). Since it was based on my StrChar4.zip, it contained the short string test, and all of the tests executed. The times were impressive. The code was short, but ugly! One symbol in the middle of 141 bytes, with 2 forward jumps to that symbol at declared offsets of +43 and +29, and 3 back jumps to that symbol at offsets of 0, +78, and +62. If you are trying to impress, you didn't make it this time.

Dave.

jj2007

Quote from: KeepingRealBusy on June 26, 2010, 08:18:35 PM
Lingo,

... If you are trying to impress, you didn't make it this time.

He is not trying to impress anybody. He obfuscates his code by not commenting it and with these "odd" jumps because he is scared shitless that I might "steal" his miraculous code. To be honest, usually his code is faster than mine; but then, it often takes a while, with a little help from some friends here on the forum, until it's bug-free :green

KeepingRealBusy

JJ,

The odd jumps are ok, if and only if one knows what assembler is used to assemble the code. As I found out, his jne @f+15 jumped right into the middle of the other je @b and crashed. Code generation is an art and not a science: there are several ways to encode some instructions, with different lengths to be sure, and these can cause untold debugging misery.

Dave.
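
A small model of the failure Dave describes, as a sketch only. The thread does not include the listing, so the instruction sequence and byte lengths below are hypothetical, and which encoding difference was actually at work in Lingo's code is not established here. The one fact it relies on is that 32-bit x86 allows `mov eax, [var]` to be encoded in 5 bytes (opcode A1) or in 6 bytes (opcode 8B 05). `landing` is an illustrative helper name, and Python is used purely to do the offset arithmetic.

```python
# Hypothetical instructions following a label, as (text, length in bytes).
# The two layouts differ only in how the first instruction is encoded.
TAIL = [
    ("pmovmskb edx, xmm1", 4),
    ("sub ecx, eax", 2),
    ("test edx, edx", 2),
    ("je @b", 2),
    ("bsf edx, edx", 3),
]
LAYOUT_A = [("mov eax, [var]", 5)] + TAIL  # A1 moffs32 form
LAYOUT_B = [("mov eax, [var]", 6)] + TAIL  # 8B 05 disp32 form


def landing(layout, offset):
    """Return (instruction, byte index inside it) reached by label+offset."""
    start = 0
    for text, length in layout:
        if start <= offset < start + length:
            return text, offset - start
        start += length
    raise ValueError("offset is past the end of the listed instructions")


if __name__ == "__main__":
    for name, layout in (("A", LAYOUT_A), ("B", LAYOUT_B)):
        text, inside = landing(layout, 15)
        print(f"layout {name}: label+15 -> '{text}', byte {inside}")
```

With the 5-byte encoding, label+15 reaches the start of `bsf edx, edx`. With the 6-byte encoding, the same hard-coded offset falls on the second byte of `je @b`, which is the kind of mid-instruction landing described above.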

oex

Quote from: KeepingRealBusy on June 26, 2010, 09:52:59 PM
Code generation is an art and not a science

You what? :lol Art has fluffy edges and completely unnecessary features, code generation is logic it contains only what it must to logically complete the task
We are all of us insane, just to varying degrees and intelligently balanced through networking

http://www.hereford.tv

clive

Quote from: oex on June 26, 2010, 10:00:16 PM
Quote from: KeepingRealBusy on June 26, 2010, 09:52:59 PM
Code generation is an art and not a science

You what? :lol Art has fluffy edges and completely unnecessary features, code generation is logic it contains only what it must to logically complete the task

Well the issue is going to be that only AMD and Intel have really accurate models of their cores, and quite a lot of the fine detail changes between CPUs. Some of the differences can profoundly affect the performance/choice of instructions which perform the same task.

I would say there is an art to generating effective code, and that Lingo's code is pretty artful. A compiler can apply all the logic it wants, it's not going to "understand" your algorithm to any real degree.
It could be a random act of randomness. Those happen a lot as well.

Rockoon

When C++ compilers can be coerced to emit rcl and rcr, I *might* consider using one.

KeepingRealBusy

Quote from: clive on June 26, 2010, 10:43:04 PM

I would say there is an art to generating effective code, and that Lingo's code is pretty artful.


I especially liked his way of interleaving two different threads at the beginning (maybe "threads" is a bad term to use in a programming example, because this is not multi-core threading): building the argument for the character test while at the same time testing the first input for nulls, and keeping the two apart with different indentation. I liked that. I also had not thought of using those SSE instructions that way.

Dave.
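
For readers without the zip, here is a rough model of the general idea of testing a block for the search character and for the terminating null at the same time. It is not Lingo's code, which is not shown in this thread. It assumes the usual SSE2 pattern of a byte-wise compare (pcmpeqb), a mask extraction (pmovmskb), and a bit scan (bsf, which the thread does mention). The helper names `mask_eq`, `bsf`, and `strchr_blocks` are made up for this sketch, and Python stands in for the assembly only to show the logic.

```python
def mask_eq(block, value):
    """Model of pcmpeqb + pmovmskb: bit i is set when block[i] == value."""
    mask = 0
    for i, b in enumerate(block):
        if b == value:
            mask |= 1 << i
    return mask


def bsf(mask):
    """Model of bsf: index of the lowest set bit (mask must be non-zero)."""
    return (mask & -mask).bit_length() - 1


def strchr_blocks(buf, ch, block=16):
    """Index of the first ch in NUL-terminated buf, or None if absent."""
    for base in range(0, len(buf), block):
        chunk = buf[base:base + block]
        # One combined mask: bytes equal to ch OR equal to the terminator.
        hits = mask_eq(chunk, ch) | mask_eq(chunk, 0)
        if hits:
            i = base + bsf(hits)
            # Lowest hit decides: either the match or the end of the string.
            return i if buf[i] == ch else None
    return None


if __name__ == "__main__":
    long_str = b"hello world, this is a long string\0"
    print(strchr_blocks(long_str, ord("w")))
    print(strchr_blocks(b"abc\0", 0))
```

Searching for the null itself falls out of the same path, since the terminator then counts as the match. That is the "match null" case the test driver checks.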

Queue

If I turn on DEBUGHALT, it seems like every algo is failing on match null. Is anyone else having that problem?

Queue

KeepingRealBusy

Queue,

I did not try to re-assemble, just executed the .exe, and all tests ran. Without re-assembly, I cannot turn on DEBUGHALT. Later I will move this new .zip to my development system and try.

I just looked at the .asm file. It looks OK and when I tested my version (StrChar4.zip) I did test both ways, with DEBUGHALT, and then without DEBUGHALT to get accurate timing, and both versions ran all tests.

Have you tried to make the same test with my version?

Dave.

Queue

Yeah, I've tried with StrChar4, and Lingo's later 4a, and with either, after the cycle count for "KRBNew3 match long string" prints, it crashes (INT 3).

Edit - Ok, must be my processor (or OS)? I WAS trying it on a 1.3 GHz Athlon on Win98SE and it was failing on the match nulls, but on a 2.8 GHz Core2Duo on WinXP, it works fine.

Edit2 - If I comment out Dummy3 so that the data is aligned, my Athlon only fails on the match null in short string check for KRBLingo.

Queue

KeepingRealBusy

Queue,

I just moved this to my development system, modified the makeit.bat file to define DEBUGHALT and emit a .lst file, modified the .asm to add a .list directive following the .data directive, and assembled. I checked the .lst file and the check code was assembled. I executed the new .exe and all tests ran and verified, no halts.

Note: Lingo did not include a makeit.bat file, nor has he replied to my question about which assembler he is using. I had to copy my makeit.bat file from my StrChar4 directory (which I included in my StrChar4.zip file).

Dave.

Queue

It looks like it's pure luck that it's succeeding on my Athlon when data's aligned. Adding arbitrary lengths of misalignment to the data makes it succeed or fail. For example, if I add between 10 and 16 bytes of padding before Src3, it succeeds on all algos, but between 1 and 9 makes it fail. For the short string null check, I need between 8 and 15 bytes of padding (I assume this is making it long enough to no longer be a short string?).

While I really have no idea what I'm doing, I can assemble and link this code, and it's interesting to me that it fails so arbitrarily on my old computer.

Queue