The MASM Forum Archive 2004 to 2012

Project Support Forums => MASM32 => Topic started by: hutch-- on November 24, 2010, 01:16:07 AM

Title: Notification on the "udw2str" algo as unsound.
Post by: hutch-- on November 24, 2010, 01:16:07 AM
Its an algo that was written and contributed about 10 years ago and I have never used it but while testing a range of similar algorithms recently, the "udw2str" algorithm fails over 1 gig and should not be used. It will be replaced with an equate to another algorithm that has been exhaustively tested over the unsigned DWORD range.
Title: Re: Notification on the "udw2str" algo as unsound.
Post by: dedndave on November 24, 2010, 02:38:19 AM
wow - surprising it wasn't found sooner   :bg
Title: Re: Notification on the "udw2str" algo as unsound.
Post by: MichaelW on November 24, 2010, 02:43:58 AM
It may be unsound, but it's still good enough for my financial calculations.

Title: Re: Notification on the "udw2str" algo as unsound.
Post by: hutch-- on November 24, 2010, 04:24:37 AM
 :bg

Michael,

That just means you have not been working on your first billion yet.  :P

Seriously though, the algo is broken and I have 3 others that are both faster and exhaustively tested.

If you can be bothered, give this one a try.



;  ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE

udw2str2 proc dwNumber:DWORD, pszString:DWORD

    mov eax, [esp+4]

    cmp eax, 9
    ja @F
    mov edx, 1
    jmp nxt
  @@:
    cmp eax, 99
    ja @F
    mov edx, 2
    jmp nxt
  @@:
    cmp eax, 999
    ja @F
    mov edx, 3
    jmp nxt
  @@:
    cmp eax, 9999
    ja @F
    mov edx, 4
    jmp nxt
  @@:
    cmp eax, 99999
    ja @F
    mov edx, 5
    jmp nxt
  @@:
    cmp eax, 999999
    ja @F
    mov edx, 6
    jmp nxt
  @@:
    cmp eax, 9999999
    ja @F
    mov edx, 7
    jmp nxt
  @@:
    cmp eax, 99999999
    ja @F
    mov edx, 8
    jmp nxt
  @@:
    cmp eax, 999999999
    ja @F
    mov edx, 9
    jmp nxt
  @@:
    mov edx, 10

  nxt:
    mov eax, [esp+4]            ; src
    mov ecx, [esp+8]            ; lpdest             ; buffer address

    push ebx
    push esi
    push edi

    mov esi, 0CCCCCCCDh         ; "magic number" multiplier for division by 10
    mov ebx, 10
    add ecx, edx                ; add required buffer length to start address
    mov BYTE PTR [ecx], 0       ; write the terminator
    sub ecx, 1                  ; decrement down to first write position

  @@:
    mul esi                     ; multiply by magic number
    shrd eax, edx, 3            ; binary fractional "remainder" back in EAX
    shr edx, 3                  ; EDX = quotient
    add eax, 1                  ; precaution against occasional "underflow"
    mov edi, edx                ; save current quotient for next division
    mul ebx                     ; x10 gets "decimal" remainder into EDX
    add dl, 30h                 ; convert remainder to ascii
    mov [ecx], dl               ; insert it into buffer
    sub ecx, 1                  ; decrement buffer location
    mov eax, edi                ; retrieve current quotient
    test eax, eax               ; test if done
    jz @F                       ; continue if not done

    mul esi                     ; multiply by magic number
    shrd eax, edx, 3            ; binary fractional "remainder" back in EAX
    shr edx, 3                  ; EDX = quotient
    add eax, 1                  ; precaution against occasional "underflow"
    mov edi, edx                ; save current quotient for next division
    mul ebx                     ; x10 gets "decimal" remainder into EDX
    add dl, 30h                 ; convert remainder to ascii
    mov [ecx], dl               ; insert it into buffer
    sub ecx, 1                  ; decrement buffer location
    mov eax, edi                ; retrieve current quotient
    test eax, eax               ; test if done
    jnz @B                      ; continue if not done

  @@:
    pop edi
    pop esi
    pop ebx

    mov eax, [esp+8]

    ret 8

udw2str2 endp

OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef

;  ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
Title: Re: Notification on the "udw2str" algo as unsound.
Post by: dedndave on November 24, 2010, 05:59:30 AM
i can do my taxes with five digits - and that includes pennies   :lol
Title: Re: Notification on the "udw2str" algo as unsound.
Post by: brethren on November 24, 2010, 06:29:29 PM
Quote from: dedndave on November 24, 2010, 02:38:19 AM
wow - surprising it wasn't found sooner   :bg

lol, you must of been busy with the algo timings :lol
http://www.masm32.com/board/index.php?topic=14642.msg118452#msg118452
Title: Re: Notification on the "udw2str" algo as unsound.
Post by: brethren on November 24, 2010, 06:57:15 PM
hutch, i tested it against mine...just a bit of fun :P

Quote
129 cycles for udw2str2
104 cycles for u32toa

use console, assemble and link to build the test
Title: Re: Notification on the "udw2str" algo as unsound.
Post by: hutch-- on November 24, 2010, 07:25:18 PM
Looks good,

Here is the timing on my Core2 quad.


99 cycles for udw2str2
98 cycles for u32toa
Press any key to continue ...


Now the big question, have you done the exhaustive test over the whole unsigned range ? What beat the old version as that it failed over 1 gig, the older version was about 30% faster once it was optimised.
Title: Re: Notification on the "udw2str" algo as unsound.
Post by: brethren on November 24, 2010, 07:37:16 PM
i tested it by using msvcrt ultoa and my algo to create 2 strings which i then strcmp'ed, starting at 0 and ending at 4294967295 (inclusive). my algos output matched ms ultoa over the full range so i'm certain it works for all unsigned values
Title: Re: Notification on the "udw2str" algo as unsound.
Post by: hutch-- on November 24, 2010, 07:41:15 PM
Just ran the same test and its correct over the entire range.  :U

Will have a play as this algo has room to make it faster.
Title: Re: Notification on the "udw2str" algo as unsound.
Post by: brethren on November 24, 2010, 07:45:22 PM
i'll be interested to see how much faster you can make it :U
Title: Re: Notification on the "udw2str" algo as unsound.
Post by: MichaelW on November 24, 2010, 08:28:28 PM
I also tested against the CRT _ultoa function over the entire 32-bit range and found no differences in the returned strings. And I tested a non-null filled destination buffer and dumped the first 20 bytes of the buffer after the test to verify that the procedure is not writing to the buffer beyond the 11th byte.
Title: Re: Notification on the "udw2str" algo as unsound.
Post by: hutch-- on November 24, 2010, 09:21:15 PM
This tweak is testing up OK over the full unsigned range. Remove the PUSHAD POPAD pair dropped the time, removed the stack frame, unrolled it by 2 then added the code to calculate the result length then removed the reverse string loop at the end. In the benchmark I am using time dropped from 250 ms to 187 ms.


;  ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE

ALIGN 16
; u32toa by brethren
utoa2 PROC, value:DWORD, szbuf:DWORD

    push ebx
    push esi
    push edi

    mov eax, [esp+4][12]       ; value
    mov esi, [esp+8][12]       ; szbuf

    cmp eax, 9
    ja lb1
    add esi, 1
    jmp nxt
  lb1:
    cmp eax, 99
    ja lb2
    add esi, 2
    jmp nxt
  lb2:
    cmp eax, 999
    ja lb3
    add esi, 3
    jmp nxt
  lb3:
    cmp eax, 9999
    ja lb4
    add esi, 4
    jmp nxt
  lb4:
    cmp eax, 99999
    ja lb5
    add esi, 5
    jmp nxt
  lb5:
    cmp eax, 999999
    ja lb6
    add esi, 6
    jmp nxt
  lb6:
    cmp eax, 9999999
    ja lb7
    add esi, 7
    jmp nxt
  lb7:
    cmp eax, 99999999
    ja lb8
    add esi, 8
    jmp nxt
  lb8:
    cmp eax, 999999999
    ja lb9
    add esi, 9
    jmp nxt
  lb9:
    add esi, 10

  nxt:

    mov ebx, 0CCCCCCCDh
    mov BYTE PTR [esi], 0
    sub esi, 1

  stlp:
    mov ecx, eax
    mul ebx
    shr edx, 3
    mov eax, edx
    lea edx, [edx+edx*4]
    lea edx, [edx+edx]
    sub ecx, edx
    add cl, '0'
    mov [esi], cl
    sub esi, 1
    test eax, eax
    jbe lpqt

    mov ecx, eax
    mul ebx
    shr edx, 3
    mov eax, edx
    lea edx, [edx+edx*4]
    lea edx, [edx+edx]
    sub ecx, edx
    add cl, '0'
    mov [esi], cl
    sub esi, 1
    test eax, eax
    ja stlp

  lpqt:
    pop edi
    pop esi
    pop ebx

    ret 8

utoa2 ENDP

OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef

;  ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
Title: Re: Notification on the "udw2str" algo as unsound.
Post by: brethren on November 24, 2010, 09:33:59 PM
a 20% speed increase when run on my pc :clap: btw i really like how you've eliminated the need for reversing the string :wink

Quote131 cycles for udw2str2
104 cycles for u32toa
81 cycles for utoa2

the same improvements could be made to my dwtoa routine as its virtually identical apart from the sign test

.code
dwtoa PROC, val:DWORD, buf:PTR BYTE
pushad
mov eax, val
mov ebx, 0CCCCCCCDh
mov esi, buf

test eax, eax
jge @F
mov BYTE PTR [esi], '-'
add esi, 1
neg eax
@@:
mov edi, esi
@@:
mov ecx, eax
mul ebx
shr edx, 3
mov eax, edx
lea edx, [edx+edx*4]
add edx, edx
sub ecx, edx
add cl, '0'
mov [esi], cl
add esi, 1
test eax, eax
ja @B

mov BYTE PTR [esi], NULL
sub esi, 1
@@:
mov al, [edi]
mov bl, [esi]
mov [edi], bl
mov [esi], al
add edi, 1
sub esi, 1
cmp edi, esi
jb @B

popad
ret
dwtoa ENDP
Title: Re: Notification on the "udw2str" algo as unsound.
Post by: lingo on November 25, 2010, 05:22:56 AM
brethren,
The same but faster coz I started with binary search:  :wink
align 16
;u32toalin by lingo
db  11 Dup (0)
u32toalin  PROC, value:DWORD, szbuf:DWORD
pop     edx        
pop     eax           
cmp    eax, 99999             
pop     ecx       
jbe      L4
cmp     eax, 9999999       
push    edx
jbe     L2
cmp    eax, 999999999   
        push   ebx
ja       L8
cmp    eax, 99999999     
ja       L1
add    ecx, 8
push   esi   
jne     L9
L1 :
add    ecx, 9
push   esi
jne     L9
L2:
cmp    eax, 999999         
push   ebx
ja       L3
add    ecx, 6
push   esi
jne     L9
L3:
add    ecx, 7
push   esi
jne     L9
L4:
cmp    eax, 999             
push   edx       
jbe     L6
cmp    eax, 9999             
push   ebx
ja       L5
add    ecx, 4
push   esi
jne     L9
L5:
add    ecx, 5
push   esi
jne     L9
L6:
cmp    eax, 99            
push   ebx
jbe     L7
add    ecx, 3
push   esi
jne     L9
L8:
push   esi   
add    ecx, 10
L9:
mov    byte ptr [ecx], 0
mov    esi,  0CCCCCCCDh
@@:
    mov ebx, eax
    mul esi
    shr edx, 3
    sub ecx, 1
    mov eax, edx
    lea edx, [edx+edx*4-18h]
    lea edx, [edx+edx]
    sub ebx, edx
    mov [ecx], bl
    test eax, eax
    je   @f

    mov ebx, eax
    mul esi
    shr edx, 3
    sub ecx, 1
    mov eax, edx
    lea edx, [edx+edx*4-18h]
    lea edx, [edx+edx]
    sub ebx, edx
    mov [ecx], bl
    test eax, eax
    jne  @b
@@:
    pop esi
    mov eax, ecx
    pop ebx
    ret
L7:
    cmp     eax, 9            
    push    esi
    jbe     L10
    add     ecx, 2                     
    jne     L9
L10:
    add eax, 30h             
    pop esi
    mov [ecx],  ax
    pop ebx
    mov eax, ecx
    ret
u32toalin ENDP

and results:
76 cycles for udw2str2-Hutch
65 cycles for u32toalin-Lingo
Press any key to continue ...

Title: Re: Notification on the "udw2str" algo as unsound.
Post by: hutch-- on November 25, 2010, 09:27:03 AM
 :bg

Use the later version posted in the Lab with this code then try string lengths from 1 to 10.



udw2str2 proc dwNumber:DWORD, pszString:DWORD

    push ebx
    push esi

    mov eax, [esp+4][8]       ; value
    mov esi, [esp+8][8]       ; szbuf

    cmp eax, 9999
    ja lb4

    cmp eax, 999999
    ja lb6

    cmp eax, 9
    ja lb1
    add esi, 1
    jmp nxt
  lb1:
    cmp eax, 99
    ja lb2
    add esi, 2
    jmp nxt
  lb2:
    cmp eax, 999
    ja lb3
    add esi, 3
    jmp nxt
  lb3:
    cmp eax, 9999
    ja lb4
    add esi, 4
    jmp nxt
  lb4:
    cmp eax, 99999
    ja lb5
    add esi, 5
    jmp nxt
  lb5:
    cmp eax, 999999
    ja lb6
    add esi, 6
    jmp nxt
  lb6:
    cmp eax, 9999999
    ja lb7
    add esi, 7
    jmp nxt
  lb7:
    cmp eax, 99999999
    ja lb8
    add esi, 8
    jmp nxt
  lb8:
    cmp eax, 999999999
    ja lb9
    add esi, 9
    jmp nxt
  lb9:
    add esi, 10

  nxt:

    mov ebx, 0CCCCCCCDh
    mov BYTE PTR [esi], 0
    sub esi, 1

  stlp:
    mov ecx, eax
    mul ebx
    shr edx, 3
    mov eax, edx
    lea edx, [edx+edx*4]
    lea edx, [edx+edx-48]
    sub ecx, edx
    mov [esi], cl
    sub esi, 1
    test eax, eax
    jbe lpqt

    mov ecx, eax
    mul ebx
    shr edx, 3
    mov eax, edx
    lea edx, [edx+edx*4]
    lea edx, [edx+edx-48]
    sub ecx, edx
    mov [esi], cl
    sub esi, 1
    test eax, eax
    ja stlp

  lpqt:
    pop esi
    pop ebx

    ret 8

udw2str2 endp
Title: Re: Notification on the "udw2str" algo as unsound.
Post by: dedndave on November 25, 2010, 10:48:36 AM
wouldn't it make sense to test for longer values first ?
frequency of occurance
32 bit integers converted to decimal strings

unsigned (number of characters = number of digits)

1 digit  10
2 digits 90
3 digits 900
4 digits 9000
5 digits 90000
6 digits 900000
7 digits 9000000
8 digits 90000000
9 digits 900000000
10 digits 3294967296
Title: Re: Notification on the "udw2str" algo as unsound.
Post by: hutch-- on November 25, 2010, 11:08:28 AM
 :bg

Yes but it was not any faster. I think Lingo has the right idea, middle branching on a basic tree layout made the first run of the algo faster but it does not seem to effect the sustain rate.
Title: Re: Notification on the "udw2str" algo as unsound.
Post by: dedndave on November 25, 2010, 11:15:40 AM
it would be faster overall !!!
if you wrote a test piece to take the time for each of the 4294967296 possible values,
added those times together, then divided by the same number, you'd see an improvement
(that's why i posted the frequency of occurance chart)

yes - spliting it into groups reduces the number of CMP's/JMP's that are executed, i think
        cmp     eax,9999999
        ja      more_than_7_digits

        cmp     eax,9999
        ja      more_than_4_digits

;-------------------------------

        cmp     eax,999
        ja      have_4_digits

        cmp     eax,99
        ja      have_3_digits

        cmp     eax,9
        ja      have_2_digits

have_1_digit:
        add     esi,1
        jmp     nxt

have_2_digits:
        add     esi,2
        jmp     nxt

have_3_digits:
        add     esi,3
        jmp     nxt

have_4_digits:
        add     esi,4
        jmp     nxt

;-------------------------------

more_than_4_digits:
        cmp     eax,999999
        ja      have_7_digits

        cmp     eax,99999
        ja      have_6_digits

have_5_digit:
        add     esi,5
        jmp     nxt

have_6_digits:
        add     esi,6
        jmp     nxt

have_7_digits:
        add     esi,7
        jmp     nxt

;-------------------------------

more_than_7_digits:
        cmp     eax,999999999
        ja      have_10_digits

        cmp     eax,99999999
        ja      have_9_digits

have_8_digits:
        add     esi,8
        jmp     nxt

have_9_digits:
        add     esi,9
        jmp     nxt

have_10_digits:
        add     esi,10

;-------------------------------

nxt:
Title: Re: Notification on the "udw2str" algo as unsound.
Post by: dedndave on November 25, 2010, 11:23:56 AM
in this case, i had 9 digits max
i needed to scale the value - not just adjust a pointer
it might be better if i used 32-bit instructions for EBX and ECX, though
i know - i know - i can unroll it, too   :bg
;determine the number of digits in the first dword

;ECX = 0 or 1
;ESI = active index pointer (offset of highest order accumulator dword)
;EBP = stack frame base pointer

Ascii0: mov     cl,9                    ;ECX = 9
        mov     eax,[esi]               ;load the first dword
        mov     bl,3                    ;BL = 3
        push    eax                     ;save the first dword

;scale the first dword by 1,000 to count digits
;we may make 0, 1, or 2 passes on this loop

Ascii1: cmp     eax,1000000             ;is it  >= 1,000,000 ?
        jae     Ascii2                  ;yes - determine mod 3 count

        sub     cl,bl                   ;no - subtract 3 from count
        imul    eax,eax,1000            ;scale it by 1,000
        cmp     cl,bl                   ;last 3 digits ?
        ja      Ascii1                  ;no - until it is
Title: Re: Notification on the "udw2str" algo as unsound.
Post by: hutch-- on November 25, 2010, 01:18:39 PM
Dave,

here is a version that does the calculations top down with branch predictions backwards and its still no faster.


;  ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE

ALIGN 16
; u32toa by brethren
utoa2 PROC, value:DWORD, szbuf:DWORD

    push ebx
    push esi

    mov eax, [esp+4][8]       ; value
    mov esi, [esp+8][8]       ; szbuf

    cmp eax, 1000000000
    jb lb9
    add esi, 10

  nxt:
    mov ebx, 0CCCCCCCDh
    mov BYTE PTR [esi], 0
    sub esi, 1

  align 4
  stlp:
    mov ecx, eax
    mul ebx
    shr edx, 3
    mov eax, edx
    lea edx, [edx+edx*4]
    lea edx, [edx+edx-48]
    sub ecx, edx
    mov [esi], cl
    sub esi, 1
    test eax, eax
    jbe lpqt

    mov ecx, eax
    mul ebx
    shr edx, 3
    mov eax, edx
    lea edx, [edx+edx*4]
    lea edx, [edx+edx-48]
    sub ecx, edx
    mov [esi], cl
    sub esi, 1
    test eax, eax
    ja stlp

  lpqt:
    pop esi
    pop ebx

    ret 8

  align 4
  lb1:
    add esi, 1
    jmp nxt
  lb2:
    cmp eax, 10
    jb lb1
    add esi, 2
    jmp nxt
  lb3:
    cmp eax, 100
    jb lb2
    add esi, 3
    jmp nxt
  lb4:
    cmp eax, 1000
    jb lb3
    add esi, 4
    jmp nxt
  lb5:
    cmp eax, 10000
    jb lb4
    add esi, 5
    jmp nxt
  lb6:
    cmp eax, 100000
    jb lb5
    add esi, 6
    jmp nxt
  lb7:
    cmp eax, 1000000
    jb lb6
    add esi, 7
    jmp nxt
  lb8:
    cmp eax, 10000000
    jb lb7
    add esi, 8
    jmp nxt
  lb9:
    cmp eax, 100000000
    jb lb8
    add esi, 9
    jmp nxt

utoa2 ENDP

OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef

;  ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
Title: Re: Notification on the "udw2str" algo as unsound.
Post by: dedndave on November 25, 2010, 01:54:59 PM
well - it's no faster on your i7   :P
and - testing it with the same value over and over isn't going to reveal anything

i think the tree here is a good plan
http://www.masm32.com/board/index.php?topic=15427.msg126379#msg126379

i have to get my head into your test code - lol
Title: Re: Notification on the "udw2str" algo as unsound.
Post by: lingo on November 25, 2010, 02:52:44 PM
"Use the later version posted in the Lab..."

I did it and my algo is still faster... :wink
results are:
69 cycles for udw2str2-Hutch
65 cycles for u32toalin-Lingo
Press any key to continue ...
Title: Re: Notification on the "udw2str" algo as unsound.
Post by: brethren on November 25, 2010, 05:04:41 PM
lingos test4 on AMD Turion(tm) 64 X2 Mobile Technology TL-52
71 cycles for udw2str2-Hutch
77 cycles for u32toalin-Lingo
Press any key to continue ...

test2
80 cycles for udw2str2-Hutch
68 cycles for u32toalin-Lingo
Press any key to continue ...

suprisingly the only difference in the code in u32toalin between lingos test2 and test4(where's test3 :P) is:
test4

lea edx, [edx+edx*4-24]
lea edx, [edx+edx]

test2

lea edx, [edx+edx*4]
lea edx, [edx+edx-30h]


Title: Re: Notification on the "udw2str" algo as unsound.
Post by: jj2007 on November 25, 2010, 06:21:10 PM
Quote from: brethren on November 25, 2010, 05:04:41 PM
suprisingly the only difference in the code between lingos test2 and test4(where's test3 :P) is:

Same size, same cycle count on the Celeron M. But Hutch's algo runs at 94 cycles instead of 104.
Title: Re: Notification on the "udw2str" algo as unsound.
Post by: hutch-- on November 25, 2010, 10:38:22 PM
 :bg

Dave,

I mainly use the Core2 Quad, the i7 Win64 is for watching TV while I am writing code on the other box.  :P

It runs fine and is slightly faster here and there than the Core2 quad but I confess I don't like Win7 much.

The last algo I posted that does the lead calculation top down should have the lower overhead for 10 digit numbers as it just falls through into the conversion. I have my doubts with the benchmarks posted as they do not sweep the range down to small digit counts, that is why I am doing the timings with the dialog benchmark in the lab.

I played with middle range branching to the comparison list but it only seems to favour samples that are near that length and each leading branch slows the rest down. I could not get the algo faster when sweeping different length ranges by doing it so I gave up on the idea. I still think your logic of doing the calculation top down makes the most sense as the number counts are much higher for the large values.
Title: Re: Notification on the "udw2str" algo as unsound.
Post by: dedndave on November 26, 2010, 12:09:35 AM
even the Core2 Quad is fast on so many things
you need to run it on your ShitBox   :bg  (closely emulates what the rest of us can afford - lol)

i still think the tree in reply #18 is the way to fly, Hutch