szLen optimize...

Started by denise_amiga, May 31, 2005, 07:42:44 PM

herge

 Hi jj2007:

Your code is great!

But when I try to trace or Proceed a

CPUID

instruction, Windbg takes its sweet time.

The Go in Windbg works great. But on my

computer, something weird is going on.

It's either my hardware or software

running on my computer.

Your program has always worked from a Dos

Box. [Command Prompt]

Regards herge
// Herge born  Brussels, Belgium May 22, 1907
// Died March 3, 1983
// Cartoonist of Tintin and Snowy

PBrennick

JJ,
Your implementation looks fine to me. Looks like he should ditch that debugger. There have always been issues with it, anyway. A few years back there was a thread about the pros and cons of it, and it seems lots of people have had negative experiences with it.

By the way, I was not bashing your code, I was just wondering if his CPU has some limitations. Option number one is handled by all CPUs as far as I know, however.

Paul
The GeneSys Project is available from:
The Repository or My crappy website

herge

 Hi Paul:

Whatever the problems with windbg, at least I can see my

code. I have a lot of trouble seeing the code in Olly and have

not found out how to change its font size.

Regards herge

Jimg

One of the secrets to tracing source code in Olly is to NOT use the /Fl option when assembling.  It messes something up.

I usually use - /c /coff /Cp /nologo /Zi /Zd

and link - /SUBSYSTEM:WINDOWS /DEBUG /DEBUGTYPE:CV /INCREMENTAL:NO


To change the font of the source window, right click in the window, select appearance and font
Also you can change the font in the menu  Options/Appearance/Fonts

herge

 Hi Jimg:

1. Edit ollydbg.ini
2. Replace the last three lines with this:
Font name[7]=Font Herge
Font[7]=20,0,600,0,0,0,1,2,5,0,0
Face name[7]=Arial
3. Save and restart Olly
4. Right-click, choose Appearance/Font/Font Herge


jj2007 told me how to fix it.
Thanks jj2007.

Regards herge

BeeOnRope

Quote from: hutch-- on March 11, 2009, 10:21:19 AM
Years of reading posts leave you with a reasonably good idea of the value of an "atom cracking" string length algo. Let me think, "As useful as a hip pocket in a singlet", what about the world's fastest "MessageBoxA" algo ? How about a hobbling horse in the Kentucky Derby ?  :P

Quote from: hutch
No I don't, I have been watching musical chairs on string length algos for at least the last 10 years, in about 99.9999999999999999999999999% of cases the slow byte scanner is more than fast enough and in the .0 --- 0001% of other cases Agner Fog's algo is even more than fast enough. Speed is great but the gains must also be useful, and string length algos are rarely ever a big deal.

I'm quite surprised by this statement.  I have been involved in writing and profiling enterprise software for years, and the str* functions are repeatedly found as some of the highest CPU users in various bits of code.  Sure, it is not going to be an issue for an MPEG encoder, but for applications that handle user input, communication with other components, whatever, I've often seen these functions be the bottleneck.  Using better string functions has in some cases resulted in massive improvements in some workflows - even some we didn't know would be affected ahead of time.

Sure, you could argue that strlen itself is kind of useless, since at least someone knew the length (at creation, for example) and this length could be passed around rather than using strlen, but the realities of software engineering, such as interop with other components, use of existing APIs, legacy code, and so on mean that it is useful in practice.  Other functions, such as strcpy, are useful both in theory and in practice since they cannot be optimized away (unlike strlen, arguably).

Saying that the str* functions are useless is like arguing that memcpy and friends aren't important either - since for many programs the former are used more than the latter.

NightWare

Quote from: BeeOnRope on March 31, 2009, 08:32:37 PM
Saying that the str* functions are useless is like arguing that memcpy and friends aren't important either - since for many programs the former are used more than the latter.
?
strlen algos are never used intensively (anyway, not like memcopy) in a serious app, so the comparison is totally inappropriate. plus, if you code YOUR functions correctly (and stop using stupid win APIs), YOU DON'T NEED those algos, coz you "should" return the size with your function with a simple sub instruction... just for info, in ALL my sources i've used a strlen algo just ONCE, and only because i'm too lazy to update a counter, and because speed is not essential... i don't know what your years of writing consist of, but you have things to learn... seriously...

MichaelW

Quote from: BeeOnRope on March 31, 2009, 08:32:37 PM
Sure, it is not going to be an issue for an MPEG encoder, but for applications that handle user input, communication with other components, whatever, I've often seen these functions be the bottleneck.  Using better string functions has in some cases resulted in massive improvements in some workflows - even some we didn't know would be affected ahead of time.

A bottleneck for communication with other components, possibly, but the bottleneck for user input is obviously the user.


eschew obfuscation

jj2007

Quote from: NightWare on March 31, 2009, 09:49:15 PM
just for info, in ALL my sources i've used a strlen algo just ONCE

In all my sources, I use GOTO only once, and for a valid reason, but I just checked len() and found a value of about 6/kLine of code. I wouldn't mind getting rid of some of them, but it is not that easy in a general purpose app. For highly optimised graphics applications, that might be different, though.

hutch--

Bee,

I agree that string functions generally need to be fast, particularly when you are doing complex parsing but I would hold to my original comment that almost all string length requirements are more than adequately handled by the simplest byte scanner using one register. It is very rare to use long strings (> 1 meg) and where you do have an unusual case that has to repeatedly scan strings for their length, you write a different algo. Agner Fog's 1995 DWORD algo is still a very good performer here but if your task requires it you write a dedicated string length algo that is faster.

This is my favourite type of string length algo.


; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««

OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE

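slen proc pstr:DWORD

    mov ecx, [esp+4]            ; ecx = string address (no stack frame, so read the argument directly)
    mov eax, ecx                ; eax = scan pointer
    sub eax, 1                  ; start one byte early so the loop can increment first

  @@:
    add eax, 1                  ; advance to the next byte
    cmp BYTE PTR [eax], 0       ; terminator reached ?
    jne @B                      ; no, keep scanning

    sub eax, ecx                ; length = terminator address - start address

    ret 4                       ; STDCALL: remove the one DWORD argument

slen endp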

OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef

; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
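
For contrast with the one-register byte scanner, here is a minimal sketch of the DWORD-at-a-time idea that the thread keeps referring to. It is written in C rather than MASM so it can be compiled anywhere, and it is not Agner Fog's actual code: it only illustrates the well-known zero-byte test on a 32-bit word. The names has_zero_byte and slen_dword, and the cap parameter (the buffer size, used so the sketch never reads past the allocation), are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if at least one of the four bytes in x is zero.
   Subtracting 1 from every byte borrows into bit 7 only for bytes that
   were 0 (or >= 0x81); masking with ~x removes the >= 0x80 cases. */
int has_zero_byte(uint32_t x)
{
    return ((x - 0x01010101u) & ~x & 0x80808080u) != 0;
}

/* Hypothetical helper: length of the zero-terminated string in buf.
   cap is the size of the buffer in bytes (an assumption of this sketch).
   If no terminator is found within cap bytes, cap is returned. */
size_t slen_dword(const char *buf, size_t cap)
{
    size_t i = 0;

    assert(buf != NULL);

    /* Test four bytes per iteration while a whole DWORD is still in bounds. */
    while (i + 4 <= cap) {
        uint32_t w;
        memcpy(&w, buf + i, 4);      /* alignment-safe 32-bit load */
        if (has_zero_byte(w))
            break;                   /* the terminator is in this DWORD */
        i += 4;
    }

    /* Locate the exact terminator (or finish a short tail) byte by byte. */
    while (i < cap && buf[i] != '\0')
        i++;

    return i;
}
```

The extra setup and the byte-wise finish are the "takeoff" cost hutch describes: on long strings the DWORD loop tends to win, while on short strings the plain byte scanner has less to do before it returns.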


A technique I regularly use when tokenising a large string is to make a copy of the string if preserving the original matters, then do a one-pass, in-place tokenise on the data, overwriting the line terminator with a zero and writing the start offset of each line to an array. Now this leaves me with an array of unaligned members but the tokenising method is faster than any data copy to array method by some considerable amount.

If I then need to get the length of any or all of the tokenised strings, I use the very small one above because in most instances its takeoff time makes it faster than the bigger clunkier ones that are very fast on single long strings but hopeless on variable length unaligned short strings.
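
The tokenising method described above can be sketched roughly as follows. This is a C illustration under stated assumptions, not hutch's code: the names tokenise_lines and slen_byte are hypothetical, CR and LF are both treated as terminator bytes, and the caller is assumed to own a writable buffer with a zero byte at text[len] so the last line is terminated too.

```c
#include <assert.h>
#include <stddef.h>

/* C equivalent of the small one-register byte scanner. */
size_t slen_byte(const char *s)
{
    const char *p = s;
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}

/* Hypothetical one-pass, in-place tokeniser.
   Overwrites each CR and/or LF with 0 and records the start offset of
   every line in offsets[]. Returns the number of lines recorded; it
   stops early once max_lines offsets have been written. */
size_t tokenise_lines(char *text, size_t len, size_t *offsets, size_t max_lines)
{
    size_t count = 0;
    size_t i = 0;

    assert(text != NULL && offsets != NULL);

    while (i < len && count < max_lines) {
        offsets[count++] = i;                    /* a line starts here */

        while (i < len && text[i] != '\r' && text[i] != '\n')
            i++;                                 /* scan to the terminator */

        if (i < len && text[i] == '\r')
            text[i++] = '\0';                    /* zero the CR */
        if (i < len && text[i] == '\n')
            text[i++] = '\0';                    /* zero the LF */
    }
    return count;
}
```

After the pass, text + offsets[n] is an ordinary zero-terminated string, so slen_byte(text + offsets[n]) gives the length of line n without any copying.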
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

NightWare

hi hutch,

   lea eax,[ecx-1]
instead of
   mov eax,ecx
   sub eax,1
no ?

jj2007

Quote from: hutch-- on April 01, 2009, 01:37:24 AM

... and writing the start offset of each line to an array....

If I then need to get the length of any or all of the tokenised strings


Can't you just use (offset n+1)-(offset n)-2?

hutch--

NightWare,

It's a good mod but I tend to avoid LEA on a PIV as it is laggy. I would be interested to see if it has become faster again on a Core 2 Duo or Quad.

JJ,

That suggestion makes sense, except that you have to subtract the length of the CR, the LF, or both. If the task suited it your mod would be faster as the data is already present, but it gets untidy if you pass the address of the tokenised string to another procedure.
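
A rough C sketch of the offset-difference idea, including the CR/LF adjustment mentioned above. It assumes the text has already been tokenised in place (terminator bytes overwritten with 0), that line content itself contains no zero bytes, and that the offsets array carries one extra sentinel entry holding the total text length so that offsets[n + 1] exists for the last line. The name line_len_from_offsets is made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: length of line n computed from the offset array
   instead of scanning. Up to two trailing zero bytes (the former CR
   and/or LF) are subtracted from the raw offset difference. */
size_t line_len_from_offsets(const char *text, const size_t *offsets, size_t n)
{
    size_t start = offsets[n];
    size_t len = offsets[n + 1] - start;   /* content plus terminator bytes */

    assert(text != NULL && offsets != NULL);

    if (len > 0 && text[start + len - 1] == '\0')
        len--;                             /* drop the zeroed LF (or lone CR) */
    if (len > 0 && text[start + len - 1] == '\0')
        len--;                             /* drop the zeroed CR of a CRLF pair */

    return len;
}
```

With fixed CRLF line endings this reduces to jj2007's (offset n+1)-(offset n)-2. The untidy part hutch points out remains: a procedure that is handed only the string address has no offsets to subtract and has to fall back to scanning.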

BeeOnRope

Quote from: NightWare on March 31, 2009, 09:49:15 PM
Quote from: BeeOnRope on March 31, 2009, 08:32:37 PM
Saying that the str* functions are useless is like arguing that memcpy and friends aren't important either - since for many programs the former are used more than the latter.
?
strlen algos are never used intensively (anyway, not like memcopy) in a serious app, so the comparison is totally inappropriate. plus, if you code YOUR functions correctly (and stop using stupid win APIs), YOU DON'T NEED those algos, coz you "should" return the size with your function with a simple sub instruction... just for info, in ALL my sources i've used a strlen algo just ONCE, and only because i'm too lazy to update a counter, and because speed is not essential... i don't know what your years of writing consist of, but you have things to learn... seriously...

That's an interesting statement.  You just said that string algos are *never* used intensively in a serious app, yet I have been involved in developing several "serious" apps, and I've seen string functions, including strlen, be the bottleneck for interesting workflows in many of them.  It's very tough to assert that something never happens when I'm saying plainly, and without any particular secret motivation, that I have seen exactly this in "serious" apps.

I didn't write the functions in question, rather noted the bottleneck in software developed by teams of hundreds of people - I already mentioned that in some cases it is possible to return a length (or to use a class that remembers it), but if you are interoperating with other code you may not have a choice because (a) you don't have the source (b) cannot legally modify the source (c) do not have the time to modify the source, etc.

BeeOnRope

Quote from: MichaelW on March 31, 2009, 10:04:04 PM
Quote from: BeeOnRope on March 31, 2009, 08:32:37 PM
Sure, it is not going to be an issue for an MPEG encoder, but for applications that handle user input, communication with other components, whatever, I've often seen these functions be the bottleneck.  Using better string functions has in some cases resulted in massive improvements in some workflows - even some we didn't know would be affected ahead of time.

A bottleneck for communication with other components, possibly, but the bottleneck for user input is obviously the user.


Agreed - I wasn't totally clear there.  I meant dealing with text that originally came in as user textual input, but is now being processed, perhaps repeatedly.  For example, string columns in a database often came originally from user input, but that happens (for example) once, while the string itself may be queried, returned to clients, sorted, etc. millions of times.  In such applications string functions may be useful and performance sensitive.