Looking for better "atol" algo.

Started by hutch--, August 12, 2010, 01:43:36 AM

hutch--

I agree with Rockoon to the extent that the final test of any code design is how well it performs a specific task, but this does not particularly help you when you are designing components to perform the task. The item being tested here is the algorithm that converts an ASCII representation of a signed number to a signed DWORD, and adding another range of variables that is specific to one particular task would remove the reference to any other particular task.

Now test bed design is not without its problems, in that there must be some means of relating the test mechanism to the end task usage. I have always thought that the test technique that Michael developed and that JJ has further refined does something that is very hard to do in any other way: it can test small sections of code against other small sections of code for purposes of comparison. I also know that, like any other technique, it has its limitations, and I have seen these limitations when the timings start to get very low.

I have a background in testing code in real time, as it more closely reflects how code is used in applications. This introduces another set of problems, and getting consistent results is not without its difficulties either. Any ring3 application suffers from timing fluctuations due to interference from higher privilege levels. Then, in a multitasking context on a multicore processor, task and core switching introduce yet another range of variables, and prior core loading is another. For reasons that I don't claim to have quantified all that well, code OFFSET, even when aligned, often affects the speed of a particular algorithm. This is evident when you add another algo to the test and it messes up the timings of the ones already in the test bed.

The layout of the data is yet another variable, and it produces one of two alternate conditions: linearly addressed data with all of its cache advantages, versus random memory access with its page thrashing disadvantages. The problem here is to differentiate between memory restrictions and the speed of the algorithm on the data passed to it.

What I have tried to do with the testbed types I have posted here is reduce the range of variables while retaining the real time conditions: select the core, preload the core, adjust the inter-algorithm padding, space the core loading with timed pauses between algo tests, and let each algo set its own alignment, as this is yet another variable from algo to algo.

What I have not tracked down is why some code changes timing depending on its OFFSET and on the code placed both before and after it. Some algos are relatively insensitive to it, whereas others fluctuate very badly when other code is added to the test piece.
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php
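
One way to picture the kind of real-time comparison described above is the C sketch below. It is only an illustration under assumptions, not the test bed posted in this thread: the names atol_fn, atol_reference, time_pass and best_time are made up here, clock() stands in for a proper timer, and core selection, inter-algo padding and timed pauses are left out because they are OS specific.

```c
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical signature shared by every algorithm placed in the test bed. */
typedef long (*atol_fn)(const char *);

/* Stand-in algorithm for this sketch: the C library conversion. */
static long atol_reference(const char *s) { return strtol(s, NULL, 10); }

/* Run algo over every string `passes` times and return the elapsed CPU
   seconds. The results are summed into *checksum (unsigned, so it may wrap)
   so that the conversions cannot be optimised away. */
static double time_pass(atol_fn algo, const char *const *items, size_t count,
                        int passes, unsigned long *checksum)
{
    unsigned long sum = 0;
    clock_t start = clock();
    for (int p = 0; p < passes; ++p)
        for (size_t i = 0; i < count; ++i)
            sum += (unsigned long)algo(items[i]);
    *checksum = sum;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* One untimed warm-up pass (a crude "preload"), then keep the fastest of
   `runs` timed runs, on the assumption that outside interference only
   ever makes a run slower. */
static double best_time(atol_fn algo, const char *const *items, size_t count,
                        int passes, int runs, unsigned long *checksum)
{
    double best;
    (void)time_pass(algo, items, count, 1, checksum);      /* warm-up */
    best = time_pass(algo, items, count, passes, checksum);
    for (int r = 1; r < runs; ++r) {
        double t = time_pass(algo, items, count, passes, checksum);
        if (t < best)
            best = t;
    }
    return best;
}
```

Passing the same pointer table in linear order and then in shuffled order to best_time would be one simple way to separate the memory access effect from the speed of the algorithm itself.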

lingo

Wow, so much writing and no one believes you and no one wants to try your "testing" garbage...  :lol

"it's my ball, and i am taking it home with me - i don't wanna play anymore"

Try to recruit dedndave as a tester, since he has nothing to do now, and it seems his wife has left him because
his computer is very old and doesn't do the job anymore... :lol


hutch--

 :bg

Awe,

I did not think you would ruin your weekend just because gword writes fast algos.  :P

dedndave

lol lingo
Zara and i are stuck together like glue
i bet your wife has little use for you
let's face it - how badly does she need fast code ?
and.....no matter how fast your code is, you have the personality of a slug

hutch--

 :bg

Now come on Dave, don't be too hard on him, once his wife gets a new i7 she may even let him use it. Then he may be able to write algos as fast as gword.  :P

lingo

"let's face it - how badly does she need fast code ?"

When necessary I can write pretty slow and continuous code, still... :lol

"you have the personality of a slug"

I do not understand what that means, because my specialty is high-voltage generators and transformers
rather than the different species of snails. :lol

"Zara and i are stuck together like glue"

Really? Interesting. If she is on the forum continually, as you are, then who will feed the family and pets...
and why did you replace her photo with this ugly slug or snail... sorry, but I'm not a specialist like you in this area... :lol



"once his wife gets a new i7 she may even let him use it."
I highly doubt it, because she is a database administrator, and I still hesitate because my car wants new winter tires too...  :(

"Then he may be able to write algos as fast as gword."
It would be better to ask gWord for his opinion rather than lying to yourself and others with your garbage. :lol


hutch--

 :bg

> I still hesitate because my car wants new winter tires  too...

I think we all know this problem so you are excused for not writing code as fast as gWord until your wife buys an i7 and lets you use it.  :P

dedndave

.....and your car has new tires
we all know how that can slow a person down

mineiro

I have tried this one, and for me it is hard. I'm posting the code here so anybody can improve it (the alignment, for example), but at least it is better compared to the original.


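align 16
atol_m proc String:DWORD
; assumes OPTION PROLOGUE:NONE and OPTION EPILOGUE:NONE are in effect,
; because the pops below expect the return address on top of the stack
pop eax             ; return address
pop edx             ; edx = String
push eax            ; put the return address back for the final ret
xor ecx,ecx
xor eax,eax
xor al,[edx+ecx+0]  ; al = first character, ZF set if it is the terminator
je @erro            ; empty string: return 0
xor eax,"-"         ; eax = 0 if the first character is a minus sign
push eax            ; keep it as the sign flag (0 = negative)
je @F               ; push leaves the flags alone, so this still tests the xor
xor eax,0000001dh   ; "-" xor 1Dh = 30h, so eax is now the value of the first digit
@@:
xor cl,[ecx+edx+1]
je @F
lea eax,[eax+eax*4]       ; eax = eax*5
lea eax,[ecx+eax*2-30h]   ; eax = eax*10 + digit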
xor cl,cl
or cl,[ecx+edx+2]
je @F
lea eax,[eax+eax*4]
lea eax,[ecx+eax*2-30h]
xor ecx,ecx
or cl,[ecx+edx+3]
je @F
lea eax,[eax+eax*4]
lea eax,[ecx+eax*2-30h]
xor cl,cl
xor cl,[ecx+edx+4]
je @F
lea eax,[eax+eax*4]
lea eax,[ecx+eax*2-30h]
xor ecx,ecx
xor cl,[ecx+edx+5]
je @F
lea eax,[eax+eax*4]
lea eax,[ecx+eax*2-30h]
xor cl,cl
xor cl,[ecx+edx+6]
je @F
lea eax,[eax+eax*4]
lea eax,[ecx+eax*2-30h]
xor ecx,ecx
xor cl,[ecx+edx+7]
je @F
lea eax,[eax+eax*4]
lea eax,[ecx+eax*2-30h]
xor ecx,ecx
xor cl,[ecx+edx+8]
je @F
lea eax,[eax+eax*4]
lea eax,[ecx+eax*2-30h]
xor ecx,ecx
xor cl,[ecx+edx+9]
je @F
lea eax,[eax+eax*4]
lea eax,[ecx+eax*2-30h]
xor ecx,ecx
xor cl,[ecx+edx+10]
je @F
lea eax,[eax+eax*4]
lea eax,[ecx+eax*2-30h]
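pop ecx             ; discard the sign flag: 11 characters can only be "-" plus 10 digits
neg eax
ret
@@:
pop ecx             ; sign flag, 0 = negative
or ecx,ecx          ; clears CF, sets ZF if ecx is 0
ja @F               ; ecx nonzero: positive, skip the negate
neg eax
@@:
@erro:
ret
nop
atol_m endp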

I have a question: is the result for "2147483648" a bug? I ask because these algos (including mine) return 80000000h for it, and that value is not possible for this number.
regards.
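
To make the question concrete, here is a rough C model of what the unrolled code above computes. It is a sketch, not mineiro's code: the name atol_model is made up here, a loop replaces the unrolling, and the 11-character limit is not modelled. Unsigned 32-bit arithmetic is used so that it wraps the same way the lea instructions do.

```c
#include <stdint.h>

/* For each character c: value = value*10 + c - 30h, modulo 2^32.
   Like the assembly, it does not check that c is really a digit. */
static uint32_t atol_model(const char *s)
{
    uint32_t value = 0;
    int negative = (*s == '-');

    if (negative)
        ++s;
    while (*s != '\0') {
        value = value * 5u;                                        /* lea eax,[eax+eax*4]     */
        value = value * 2u + (uint32_t)(unsigned char)*s - 0x30u;  /* lea eax,[ecx+eax*2-30h] */
        ++s;
    }
    return negative ? (uint32_t)0 - value : value;                 /* neg eax */
}
```

With this model both "2147483648" and "-2147483648" produce the bit pattern 80000000h, because negating 80000000h in 32 bits gives 80000000h again.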

MichaelW

For a signed dword the decimal value 2147483648 is outside the valid range, -2147483648 to 2147483647.
eschew obfuscation

mineiro

Quote from: MichaelW on August 19, 2010, 02:52:55 AM
For a signed dword the decimal value 2147483648 is outside the valid range, -2147483648 to 2147483647.

Thank you for the answer, Mr. MichaelW. Well, ...
zero == minus zero, ok
nothing == minus nothing == zero, ok (so this is the reason for the question)
In these algos, 2147483648 == -2147483648. These procedures suppose that this number has a sign, while it does not have one.
regards.
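
If a caller needs to tell the two inputs apart, one option is a checked conversion that rejects anything outside the signed dword range quoted above. The C sketch below is an assumption about how that could look, not something posted in this thread: atol_checked is a hypothetical name, and unlike the algos here it also rejects empty strings and non-digit characters.

```c
#include <stdint.h>

/* Returns 1 and stores the result in *out when s is an optional '-'
   followed by decimal digits whose value fits a signed 32-bit integer;
   returns 0 otherwise and leaves *out untouched. */
static int atol_checked(const char *s, int32_t *out)
{
    int negative = (*s == '-');
    /* The magnitude may reach 2147483648 only when a minus sign is present. */
    const int64_t limit = negative ? (int64_t)INT32_MAX + 1 : (int64_t)INT32_MAX;
    int64_t value = 0;

    if (negative)
        ++s;
    if (*s == '\0')
        return 0;                          /* no digits at all */
    for (; *s != '\0'; ++s) {
        if (*s < '0' || *s > '9')
            return 0;                      /* not a decimal digit */
        value = value * 10 + (*s - '0');
        if (value > limit)
            return 0;                      /* out of range, stop early */
    }
    *out = (int32_t)(negative ? -value : value);
    return 1;
}
```

The extra compare per digit costs time, so this is a trade of speed for a defined answer on out-of-range input.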

mineiro

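align 4
atol_m2 proc String:DWORD
; assumes OPTION PROLOGUE:NONE and OPTION EPILOGUE:NONE are in effect,
; because the pops below expect the return address on top of the stack
pop ebx ;return address for jmp ebx (note: ebx is callee-saved under stdcall and is not preserved here)
pop edx ;edx = String
movzx eax,byte ptr [edx]
cmp eax,"-"
je @neg
jb @F ;below "-" (including the terminator): return the raw character value
movzx ecx,byte ptr [edx+1]
test ecx,ecx
je @ssub30 ;single digit
lea eax, [eax+eax*4] ;eax = c0*5; alternative: lea eax, [eax+eax*4-30h*5]
lea eax, [ecx+eax*2-30h*11] ;c0*10+c1 with both ASCII biases removed at once; alternative: lea eax, [ecx+eax*2-30h]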
movzx ecx, byte ptr [edx+2]
test ecx,ecx
je @F
lea eax,[eax+eax*4]
lea eax,[ecx+eax*2-30h]
movzx ecx, byte ptr [edx+3]
test ecx,ecx ;cmp ecx,?? slow results in my pc
je @F
lea eax,[eax+eax*4]
lea eax,[ecx+eax*2-30h]
movzx ecx, byte ptr [edx+4]
test ecx,ecx
je @F
lea eax,[eax+eax*4]
lea eax,[ecx+eax*2-30h]
movzx ecx, byte ptr [edx+5]
test ecx,ecx
je @F
lea eax,[eax+eax*4]
lea eax,[ecx+eax*2-30h]
movzx ecx, byte ptr [edx+6]
test ecx,ecx
je @F
lea eax,[eax+eax*4]
lea eax,[ecx+eax*2-30h]
movzx ecx, byte ptr [edx+7]
test ecx,ecx
je @F
lea eax,[eax+eax*4]
lea eax,[ecx+eax*2-30h]
movzx ecx, byte ptr [edx+8]
test ecx,ecx ;jecxz @F is slow on my pc
je @F
lea eax,[eax+eax*4]
lea eax,[ecx+eax*2-30h]
movzx ecx, byte ptr [edx+9]
test ecx,ecx
je @F
lea eax,[eax+eax*4]
lea eax,[ecx+eax*2-30h]
@@:
jmp ebx ;dword ptr [esp-2*4]
@ssub30:
xor eax,30h  ;lea eax,[eax-30h]
jmp ebx  ;dword ptr [esp-2*4]

align 4
@neg:
movzx eax, byte ptr [edx+1]
test eax,eax
je @F
movzx ecx, byte ptr [edx+2]
test ecx,ecx
je @sub30
lea eax, [eax+eax*4]
lea eax, [ecx+eax*2-30h*11]
movzx ecx, byte ptr [edx+3]
test ecx,ecx
je @F
lea eax,[eax+eax*4]
lea eax,[ecx+eax*2-30h]
movzx ecx, byte ptr [edx+4]
test ecx,ecx
je @F
lea eax,[eax+eax*4]
lea eax,[ecx+eax*2-30h]
movzx ecx, byte ptr [edx+5]
test ecx,ecx
je @F
lea eax,[eax+eax*4]
lea eax,[ecx+eax*2-30h]
movzx ecx, byte ptr [edx+6]
test ecx,ecx
je @F
lea eax,[eax+eax*4]
lea eax,[ecx+eax*2-30h]
movzx ecx, byte ptr [edx+7]
test ecx,ecx
je @F
lea eax,[eax+eax*4]
lea eax,[ecx+eax*2-30h]
movzx ecx, byte ptr [edx+8]
test ecx,ecx
je @F
lea eax,[eax+eax*4]
lea eax,[ecx+eax*2-30h]
movzx ecx, byte ptr [edx+9]
test ecx,ecx
je @F
lea eax,[eax+eax*4]
lea eax,[ecx+eax*2-30h]
movzx ecx, byte ptr [edx+10]
test ecx,ecx
je @F
lea eax,[eax+eax*4]
lea eax,[ecx+eax*2-30h]
@@:
neg eax
@fim:
jmp ebx  ;dword ptr [esp-2*4]
@sub30:
xor eax,30h  ;lea eax,[eax-30h]
neg eax
jmp ebx  ;dword ptr [esp-2*4]
atol_m2 endp
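
The -30h*11 constant in atol_m2 can be checked with a small C sketch. The function names are made up for this illustration. The idea is that the first two raw ASCII characters c0 and c1 are combined before any bias is removed, so (c0-30h)*10 + (c1-30h) becomes c0*10 + c1 - 30h*11 and one subtraction covers both digits. The single-digit path uses xor 30h, which works because the codes 30h to 39h all have the 30h bits set.

```c
#include <stdint.h>

/* Two raw ASCII digits to a value, removing both biases in one step. */
static uint32_t two_digits_folded(unsigned char c0, unsigned char c1)
{
    uint32_t v = (uint32_t)c0 * 5u;                 /* lea eax,[eax+eax*4]         */
    return (uint32_t)c1 + v * 2u - 0x30u * 11u;     /* lea eax,[ecx+eax*2-30h*11] */
}

/* The straightforward form, for comparison. */
static uint32_t two_digits_plain(unsigned char c0, unsigned char c1)
{
    return (uint32_t)(c0 - '0') * 10u + (uint32_t)(c1 - '0');
}

/* Single digit: xor eax,30h clears the ASCII bias for '0'..'9'. */
static uint32_t one_digit(unsigned char c)
{
    return (uint32_t)c ^ 0x30u;
}
```

For example, two_digits_folded('4', '2') gives 34h*10 + 32h - 210h = 42.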