News:

MASM32 SDK Description, downloads and other helpful links
MASM32.com New Forum Link
masmforum WebSite

Bug in udw2str function?

Started by michael_trm, June 16, 2005, 11:10:39 PM

Previous topic - Next topic

michael_trm

Bug in udw2str (udw2str converts an UNSIGNED DWORD to a string) function?
For example:

   mov eax , 80D309BEh
   invoke udw2str, eax,  offset Buffer

Must return 2 161 314 238, actually return "216 131 424",2Eh

Best regards,
   Michael Kuznetsov

MichaelW

Yes, I verified this problem. It starts with:

1073741829 = "107374183/"

And repeats when the last digit is 9. Eventually an additional problem with a trailing "." starts repeating when the last digit is 8, and then later another problem with some other character that repeats when the last digit is 7.
eschew obfuscation

Phil

#2
This link about reciprocal multiplication helps explain how the algorithm works. I was able to determine that the problem occured because the approximation of 1/10 that is used in the current library procedure cannot accurately handle numbers larger than the first error that Michael identified. To fix the problem I scaled the approximation of 1/10 by shifting it left by 3-bits and then shifted the resultant 'quotient' right by 3-bits. The quotient is produced in the hi-order bits by the multiply. It's certainly slower than the current library procedure because I added a shift. I also noticed that it doesn't store a final NULL byte. Maybe that is intentional. The decimal constant that had been used in the original code didn't give a clue about what was going on. It's previous value was 1999999Ah and I changed it to 0CCCCCCCDh to normalize it and get a full 32-bits of precision. The link explains it well.

Here's the modified procedure and the brute-force test that I ran to make sure it handles all values from 1 to 2^32-1.  It took more than 40-minutes to run on a 996 MHz P3 but it was worth knowing that it always produces the right values. (I'm trusting ustr2dw for the comaparison!)
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««

    include \masm32\include\masm32rt.inc

    .data

      buffer db 16 dup (0)

    .code

; #########################################################################

  ; ---------------------------------------------------
  ; The original algorithm was written by comrade
  ; <comrade2k@hotmail.com>; http://www.comrade64.com/
  ;
  ;  It has been optimised by Alexander Yackubtchik
  ;  Commented and de-optimized by Phil (fix rounding)
  ; ---------------------------------------------------

  ; udw2str

  ; Parameters
  ;     dwNumber - 32-bit double-word to be converted
  ;     pszString - null-terminated string (output)
  ; Result
  ;     None

; #########################################################################

udw2str proc dwNumber:DWORD, pszString:DWORD

    push ebx
    push esi
    push edi

    mov     eax, [dwNumber]
    mov     esi, [pszString]
    mov     edi, [pszString]
    mov ecx,0CCCCCCCDh  ; 32-bit scaled approximation of 1/10 with rounding

  @@redo:
    mov ebx,eax                         ; save rest of dword to convert
    mul ecx                             ; multiply dword * 1/10 to divide
    shr edx,3                           ; remove scaling from hi-product
    mov eax,edx                         ; remember quotient (hi product)
    lea edx,[edx*4+edx]                 ; multiply quotient by 5
    add edx,edx                         ; then multiply by 2
    sub ebx,edx                         ; ebx = remainder
    add bl,'0'                          ; convert to ASCII and store
    mov [esi],bl
    inc esi
    test    eax, eax
    jnz     @@redo
    jmp     @@chks

  @@invs:                               ; reverse the string in place
    dec esi
    mov al, [edi]
    xchg [esi], al
    mov [edi], al
    inc edi
  @@chks:
    cmp edi, esi
    jb @@invs

    pop edi
    pop esi
    pop ebx

    ret

udw2str endp

; #########################################################################

start:

    xor ebx,ebx

top:

    add ebx,1
    jz  done
    invoke udw2str, ebx, ADDR buffer
    invoke ustr2dw, ADDR buffer
    cmp eax, ebx
    je top
    print "Failure: for hex "
    print hex$(ebx)
    print " decimal conversion was "
    print ADDR buffer,13,10
    jmp top

done:

    exit


end start

I should also be sure to point out that this is not in any way a *fix* for the current library procedure. It is just something that I did and tested that seems to fix the problem that was pointed out. I'm trying to find things to do around here that might end up saving others a little time and energy while I continue to learn MASM and x86 assembly.

MichaelW

Phil,

Good work. I took a look at the original code hoping for an easy fix, but then I realized that I would actually need to understand how it worked to fix it :bg

I used a different brute-force test that does a comparison between the strings returned by udw2str and the CRT _i64toa function (because there is apparently no unsigned 32-bit conversion available). Your version passes this test OK when the test value is increased from zero to 0FFFFFFFFh, but it bombs when the test value is decreased from 0FFFFFFFFh (at the point where the string length decreases from 10 to 9 digits). To correct this I added an instruction to append a null to the end of the string, and after that it passed the test OK in either direction.

; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
    include \masm32\include\masm32rt.inc

    _udw2str PROTO :DWORD,:DWORD
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
    .data
        buffer1 db 100 dup(0)
        buffer2 db 100 dup(0)
        i64     dq 0
    .code
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
start:
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««

    invoke GetTickCount
    push  eax

    ;xor   ebx,ebx
    ;.WHILE (ebx <= 0ffffffffh)
    mov   ebx,0ffffffffh
    .WHILE (ebx)
        invoke _udw2str,ebx,ADDR buffer1
        mov   dword ptr i64,ebx
        invoke crt__i64toa,i64,ADDR buffer2,10
        .IF (FUNC(szCmp,ADDR buffer1,ADDR buffer2) == 0)
            print "crt__i64toa: "
            print ADDR buffer2,32,32
            print "udw2str: "
            print ADDR buffer1,13,10
            .BREAK
        .ENDIF
        ;inc   ebx
        dec   ebx
        test  ebx,0fffffh
        jnz   @F
        print ADDR(buffer2),13,10 ; pacifier every 1048576 values
      @@:
    .ENDW

    invoke GetTickCount
    pop   ebx
    sub   eax, ebx
    print ustr$(eax),'ms',13,10

    mov   eax, input(13,10,"Press enter to exit...")
    exit
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««

; #########################################################################

  ; ---------------------------------------------------
  ; The original algorithm was written by comrade
  ; <comrade2k@hotmail.com>; http://www.comrade64.com/
  ;
  ;  It has been optimised by Alexander Yackubtchik
  ;  Commented and de-optimized by Phil (fix rounding)
  ; ---------------------------------------------------

  ; udw2str

  ; Parameters
  ;     dwNumber - 32-bit double-word to be converted
  ;     pszString - null-terminated string (output)
  ; Result
  ;     None

; #########################################################################

_udw2str proc dwNumber:DWORD, pszString:DWORD

    push ebx
    push esi
    push edi

    mov     eax, [dwNumber]
    mov     esi, [pszString]
    mov     edi, [pszString]
    mov ecx,0CCCCCCCDh  ; 32-bit scaled approximation of 1/10 with rounding

  @@redo:
    mov ebx,eax                         ; save rest of dword to convert
    mul ecx                             ; multiply dword * 1/10 to divide
    shr edx,3                           ; remove scaling from hi-product
    mov eax,edx                         ; remember quotient (hi product)
    lea edx,[edx*4+edx]                 ; multiply quotient by 5
    add edx,edx                         ; then multiply by 2
    sub ebx,edx                         ; ebx = remainder
    add bl,'0'                          ; convert to ASCII and store
    mov [esi],bl
    inc esi
    test    eax, eax
    jnz     @@redo

    mov byte ptr[esi],0                 ; APPEND NULL

    jmp     @@chks

  @@invs:                               ; reverse the string in place
    dec esi
    mov al, [edi]
    xchg [esi], al
    mov [edi], al
    inc edi
  @@chks:
    cmp edi, esi
    jb @@invs

    pop edi
    pop esi
    pop ebx

    ret

_udw2str endp
; #########################################################################

; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
end start


Test time 5.3 hours on my wimpy P3-500.


eschew obfuscation

Phil

Thanks for the reassurance Michael and also for the introduction to the CRT functions! I think the APPEND NULL should be added to the procedure but I thought maybe it was by design so I left is *as was* except for the scaled approximation and shr instruction. The link I'd mentioned from Douglas Jones mentions the math involved to determine the number of bits required in the approximation of the reciprocal to ensure that the quotient is always correct but I figured it would be quicker just to do the brute force approach rather than to fully understand what he was saying. Thanks for posting your test and for adding the final null to the routine.

I took a look at the dwtoa procedure in the current library today that is called by the str$ macro and it's almost identical to what we have added ... including the final null. It uses the decimal equivalent of 0xCCCCCCCD instead of the hex and prepends the '-' if the sign is negative. Other than that, it's the same code so putting the fix into the library shouldn't be a problem when Hutch is ready. I just wish I had thought to look there first! It would have saved me some reading and scratching my head time but still ... I have managed to learn something I hadn't known before and it was comforting to see that the extra shift that I added was probably the right way to do it!