Hi! :dance:
I need your opinions about this function I wrote. I believe it's the fastest way (and the simplest).
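StrLen proc Source:dword
    push edi            ; preserve the registers used below
    push ecx
    cld                 ; scan forward (clear the direction flag)
    xor ecx,ecx
    not ecx             ; ecx = 0FFFFFFFFh, the maximum scan count
    mov edi,Source      ; edi -> start of the string
    mov al,0            ; byte to search for: the terminating zero
    repnz scasb         ; scan until the zero is found; ecx drops once per byte
    not ecx             ; ecx = bytes scanned, including the terminator
    dec ecx             ; exclude the terminator
    mov eax,ecx         ; return the length in eax
    pop ecx
    pop edi
    ret
StrLen endp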
What do you think? ::)
It is simple, and it does appear to work correctly, but it is relatively slow compared to the two string length procedures in the MASM32 library. In the attached program I renamed your procedure so it would not conflict with the MASM32 StrLen procedure, eliminated the stack frame as was done for the other procedures, and commented out the cld, since the instruction is (apparently) relatively slow and is unnecessary here.
Results running on a P3:
MASM32 szLen : 73
MASM32 StrLen : 73
_StrLen : 73
MASM32 szLen : 152 cycles
MASM32 StrLen : 96 cycles
_StrLen : 333 cycles
[attachment deleted by admin]
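Since the attachment is no longer available, here is only a rough sketch of what a frameless variant along the lines described above might look like. It is not the deleted code. The _StrLen name comes from the results listing; the OPTION PROLOGUE/EPILOGUE pattern and the [esp+12] offset are assumptions based on a stdcall procedure with one dword argument.

```asm
OPTION PROLOGUE:NONE            ; assumption: suppress the automatic stack frame
OPTION EPILOGUE:NONE

_StrLen proc Source:dword
    push edi
    push ecx
    ; cld                       ; left out: the direction flag is normally already clear
    xor ecx,ecx
    not ecx                     ; ecx = 0FFFFFFFFh
    mov edi,[esp+12]            ; Source: 2 pushes (8 bytes) + return address (4 bytes)
    mov al,0
    repnz scasb                 ; scan for the terminating zero
    not ecx
    dec ecx                     ; ecx = length without the terminator
    mov eax,ecx
    pop ecx
    pop edi
    ret 4                       ; no epilogue, so remove the one dword argument by hand
_StrLen endp

OPTION PROLOGUE:PrologueDef     ; restore the default frame generation
OPTION EPILOGUE:EpilogueDef
```

With the frame gone, Source can no longer be referenced by name (MASM would still resolve it relative to ebp), which is why the sketch reads it through esp.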
Thanks for checking & "fixing" me :U
I have a problem with the code you sent: I see no output at the console :red
When I debug it with Olly I see it's stuck here (loop):
004014AD |> 6A 01 PUSH 1 ; /Timeout = 1. ms
004014AF |. E8 36000000 CALL <JMP.&kernel32.Sleep> ; \Sleep
004014B4 |. FF15 24204000 CALL DWORD PTR DS:[<&msvcrt._kbhit>] ; [_kbhit
004014BA |. 85C0 TEST EAX,EAX
004014BC |.^74 EF JE SHORT test.004014AD
So there is no output, and the program never quits. Weird. Can you tell me what I'm doing wrong here?
(I compiled it under Chrome/MASM)
BTW I didn't know I could check speeds, so thanks hehe :8)
EDIT: Never mind, I built it for Windows, not DOS :red
I'm tired of so many strlens...
http://www.masmforum.com/simple/index.php?topic=3414.0
eliran,
Your algo looks fine. The speed problem is due to the old string instruction SCASB, which is very slow on modern machines. Even though we have all seen too many string length algos, they are in fact good practice at code optimisation, so feel free to experiment with different designs to see how much faster you can make them. You will find that optimal solutions vary from one processor to another: while you may get one that's fast on a recent AMD, it may not be optimal for a recent Intel machine, and of course vice versa.
You know, common sense says that an operation done by the machine is faster than the same operation done in code, e.g.
code:
func() {
do this;
do that;
do that too;
}
machine:
call func()
(.... then the machine does it all)
I hope you understand why I thought that SCASB would be faster than any other possible algo :)
You made a reasonable assumption. In general, a single instruction will execute faster than a sequence of instructions that produces the same result. In this particular case, though, you are using a single complex instruction that executes more slowly than a carefully selected and optimized sequence of simple instructions.
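To make "a sequence of simple instructions" concrete, here is a minimal byte-at-a-time sketch. The ByteLen name is made up for illustration, and this is not the MASM32 szLen or StrLen code, which is more heavily optimized. It assumes the same flat, stdcall setup as the original procedure, and no timing is claimed for it; it would need to be measured like the others.

```asm
; Hypothetical example procedure, not part of the MASM32 library.
ByteLen proc Source:dword
    mov eax,Source              ; eax walks through the string
    dec eax                     ; step back one byte so the loop can increment first
scan:
    inc eax
    cmp byte ptr [eax],0        ; reached the terminating zero?
    jne scan
    sub eax,Source              ; address of terminator - start address = length
    ret
ByteLen endp
```

Only eax is modified, so nothing has to be pushed or popped. Faster designs usually test several bytes per iteration, which is where the per-processor tuning mentioned earlier in the thread comes in.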
eliran,
Quote
I need your opinions about this function I wrote. I believe it's the fastest way (and the simplest).
Read these links and form your own opinion. Ratch
http://www.masmforum.com/simple/index.php?topic=1807.0
http://www.masmforum.com/simple/index.php?topic=2442.0