Print Page - Couting Up

Title: Couting Up - not so fast
Post by: wizzra on November 13, 2006, 10:16:55 PM

Hey guys,

I played - and got stuck coding an algorithm, that is, coding it can be easy. But optimizing it could be a bit problematic.

the Algorithm:
array of 5 bytes (for example, i need upto 11 bytes+-):
array db 0, 0, 0, 0, 0

1. Count up the first byte up to FFh, once over FFh, reset to 0 and add 1 to the second byte.
2. goto 1. than count the second byte upto FFh, once over FFh, reset to 0 and add 1 to the third byte.
and so on..

example:

Code Select


loop1:
00 00 00 00 00
01 00 00 00 00 
02 00 00 00 00 
....
FF 00 00 00 00

loop2:
00 01 00 00 00
01 01 00 00 00 
02 01 00 00 00 
....
FF 01 00 00 00

loop 3:
00 02 00 00 00
01 02 00 00 00 
02 02 00 00 00 
....
FF 02 00 00 00

loop 4:
00 FF 00 00 00
01 FF 00 00 00 
02 FF 00 00 00 
....
FF FF 00 00 00

loop 5:
00 00 01 00 00
01 00 01 00 00 
02 00 01 00 00 
....
FF 00 01 00 00

in asm (simpe code) could be like this:

Code Select


.data
arr db 0,0,0,0,0,0,0

.code
xor eax,eax
lea edi,arr

begin:
cmp al,0ffh
    mov byte ptr [edi],al
    jnz _end
    xor eax,eax
    dec eax
    add byte ptr [edi+1],1
    cmp byte ptr [edi+1],0ffh
    jnz _end
    add byte ptr [edi+2],1
    cmp byte ptr [edi+2],0ffh
    jnz _end
    add byte ptr [edi+3],1
    cmp byte ptr [edi+3],0ffh
    jnz _end
    add byte ptr [edi+4],1
    cmp byte ptr [edi+4],0ffh
    jnz _end
    jmp _exit
_end:
    inc eax
    jmp begin
_exit:

its slow as hell, i know!
i tried making it more registers-wise, but managed to count upto FFFF and way pass it seems to be more trouble:

Code Select


.code

test2:
    xor edx,edx
    xor eax,eax
    xor ecx,ecx
    mov ecx,8
    mov ebx,01000000h
loopy:
    add edx,ebx
    cmp edx,0ffff0000h
    
jz sec_loop
    cmp edx,0ffffffffh
    jz _exit
    mov eax,edx
    and eax, 0ff000000h
    ror eax,24
    cmp eax,0ffh
    jz fix_add1
    jmp loopy
fix_add1:
    and edx,00ffffffh
    ror ebx,cl
    add edx,ebx
    rol
 ebx,cl
    jmp loopy
sec_loop:
    and edx,0000ffffh
    mov ecx,16
    jmp loopy

hope you guys know about this, and if a fast algo already exists i would be glad to use it, if not, would be nice if we can find a good algorithm for it.

Best regards
Wizzra.

Title: Re: Couting Up - not so fast
Post by: ecube on November 13, 2006, 10:33:18 PM

you can do something like this for parsing strings in arrays

Code Select


.586
.MODEL FLAT,STDCALL
OPTION CASEMAP:NONE


include \masm32\include\windows.inc
include \masm32\include\kernel32.inc
include \masm32\include\masm32.inc
include \masm32\include\user32.inc

includelib \masm32\lib\kernel32.lib
includelib \masm32\lib\masm32.lib
includelib \masm32\lib\user32.lib

        mNextListEntry MACRO ML
          cld
          xor     eax, eax
          or      ecx, -1
          repnz scasb
          cmp     byte ptr[edi], 0
          jnz     ML
        ENDM
.data
mystring db "beer",0
            db "is",0
            db "good",0,0
.code
start:

mov     edi, offset mystring
@loopstr:

invoke MessageBox,0,edi,edi,MB_OK
mNextListEntry @loopstr
invoke ExitProcess,0
end start

Also is Donkeys string libray he has great array handling procedures you can add and remove strings to easy

Title: Re: Couting Up - not so fast
Post by: drizz on November 13, 2006, 11:02:13 PM

Hi Ben,

use add with carry, treat the array like a big number

Code Select

	xor eax,eax;; array == ecx::ebx::eax
	xor ebx,ebx
	xor ecx,ecx
	xor edx,edx
	mov edi,0FF000000h; [ 11 ] bytes
@@:	
	add eax,1
	adc ebx,edx
	adc ecx,edx
	test ecx,edi
	jz @B

Title: Re: Couting Up - not so fast
Post by: stanhebben on November 13, 2006, 11:02:58 PM

First, don't worry about your code - it's relatively fast.

You could use the following code:

Code Select


	mov esi, 0
@@:
	add byte ptr [edi+esi], 1
	jnc @F
	add esi, 1
	jmp @B
@@:

Just curious, what do you want to use it for? If you have 11 bytes, you can go until 2^88. If one increment takes 1 cycle, then my computer needs 5.10^12 years to get there.

If you need a large counter, you could perform 2 32-bit increments, for example:

Code Select


add dword ptr [edi], 1
adc dword ptr [edi+4], 0

Edit: and of course, you could also use dwords instead of bytes, as drizz said above

Title: Re: Couting Up - not so fast
Post by: daydreamer on November 14, 2006, 09:26:04 AM

I had such code once, about the same as Stan's code, but it was ascii numbers counting up on an old 6502, perfect for scores, for example if you got 300 points, count up 3 times, much faster than a integer->ascii conversion each time and only the screen limited it to how many digits

Title: Re: Couting Up - not so fast
Post by: wizzra on November 14, 2006, 10:32:01 PM

Just curious, what do you want to use it for?
i was hoping to try some brute-force technique in my engine. but i guess there is no hope since even register wise it will take alot to reach to full 11bytes with FF using this algorithm.

the adc trick is also nice thnx drizz :) i'll see if i can manage to find other ideas from it.

Title: Re: Couting Up - not so fast
Post by: dsouza123 on November 14, 2006, 11:55:49 PM

Code Select


.686
.model flat,stdcall 
option casemap:none

include    \masm32\include\windows.inc
include    \masm32\include\kernel32.inc
include    \masm32\include\user32.inc
includelib \masm32\lib\kernel32.lib
includelib \masm32\lib\user32.lib

.data
align 16
   arr      db 16 dup (0)
   szMsg    db "Done",0
   szCap    db "Counter",0

.code
start:
;--------------------------------------------------------
   mov esi, offset arr           ; version using registers
   xor eax, eax
@@:
   add dword ptr [esi+0], 1
   jnc @B
   adc dword ptr [esi+4], eax
   jnc @B
   adc dword ptr [esi+8], eax
   jnc @B

;------------------- Alternative just using memory ---------------------
@@:                                 
   stc                           ; no register version
   adc dword ptr [arr+0], 0
   jnc @B
   adc dword ptr [arr+4], 0
   jnc @B
   adc dword ptr [arr+8], 0
   jnc @B

   invoke MessageBox,NULL,ADDR szMsg,ADDR szCap,MB_OK
   invoke ExitProcess,0
end start

Updating just the first 4 bytes from 0 to 2^32 - 1 takes 17 seconds on an Athlon 1171 Mhz.
The the first five instructions of the register version fit within a 16 byte paragraph.

Title: Re: Couting Up - not so fast
Post by: wizzra on November 15, 2006, 08:18:58 PM

Thanks guys for your help,
I guess there is no much escape from this solution. It will take alllot of and alllot of years until a cpu will exists which can compute such count in split seconds :) (at least as a single cpu that is - not talking about cpu farm)
such simple task tsk tsk :)

best i could some up from your posts.

eax:edx 0FFFFFFFFFFFFFFFFh range

Code Select

   mov esi, offset arr           ; version using registers
   xor eax, eax
   xor edx,edx
first:
   adc eax,1
   jnc first
@@:
   adc edx,eax
   jnc first
   jmp _exit

thanks guys

Title: Re: Couting Up - not so fast
Post by: wizzra on November 15, 2006, 08:21:03 PM

btw, the above code of mine will count regulary, but u can always swap the order of the bytes to get the orginal idea.

Title: Re: Couting Up - not so fast
Post by: dsouza123 on November 16, 2006, 01:19:11 PM

Looking at your code if carry was already set then it goes from 0 to 2 at the start,
using add eax, 1 would solve it, also esi isn't needed, but the dual use of eax
for both the first dword and as a 0 for the register parameter for the adc
instead of using an immediate 0 or another register was insightful.

Here is a pure register version.

Code Select


   xor eax, eax  ; only register version no memory or immediates
   xor ebx, ebx
   inc ebx
   xor ecx, ecx
   xor edx, edx
@@:
   add eax, ebx
   jnc @B          ; goes back to nearest @@:
   adc edx, ecx
   jnc @B          ; goes back to nearest @@:
   jmp _exit

Title: Re: Couting Up - not so fast
Post by: wizzra on November 16, 2006, 05:11:11 PM

yeah i noticed :) thanks,
though... its still damn slow!!! (relatively slow)

Title: Re: Couting Up - not so fast
Post by: stanhebben on November 28, 2006, 08:45:27 PM

Quote from: wizzra on November 16, 2006, 05:11:11 PM
yeah i noticed :) thanks,
though... its still damn slow!!! (relatively slow)

Maybe change the algorithm to something that doesn't have to count up at all. First develop a quick algo, then a quick implementation of it.

Title: Re: Couting Up - not so fast
Post by: Human on December 07, 2006, 11:48:17 PM

why everybody use jc, less jumps code is faster. for 11 bytes you can do
add eax,1
adc ebx,0
adc edx,0
and ecx is as counter how many times add
if you need just 8 bytes use
paddq mm0,mm1 and mm1 holds 1

dsouza using registers?
i dont know for what he needs that, but for 1 time update speed isnt necessery, for highscore he can change add eax,1 to add eax,points, also with even 1000 runs its faster to move all to regs, do additions and after all write all to memory, not use add [esi+0],1
all is about optimizing, sometimes even add [something+0],1 will be faster if we have complex code and need all registers, so with direct memory write we dont need to save and restore some reg, ofcourse my code is slow, because it cant be fast, every operation stalls pipeline and alus due next opcode is dependent on last, but he can mix it with other code to make it faster. but for just counting idea is useless like whole topic

The MASM Forum Archive 2004 to 2012

General Forums => The Laboratory => Topic started by: wizzra on November 13, 2006, 10:16:55 PM