Has anyone got recent code for dword to ascii, signed and unsigned ?

Started by hutch--, August 17, 2010, 04:40:43 PM

Previous topic - Next topic

hutch--

I have modified Ray's unsigned DWORD version so that it can be used from both the bufffer address or the return value. I have not timed it but I have tested it over the full unsigned DWORD range and it is producing the correct results for the whole range.


; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE

    align 16

utoa32 proc valu:DWORD,dst:DWORD

    mov eax, [esp+4]            ; valu
    mov ecx, [esp+8]            ; dst          ; buffer address

    push ebx
    push esi
    push edi

    mov esi, 0CCCCCCCDh         ; "magic number" multiplier for division by 10
    mov ebx, 10
    sub ecx, 1
@@:
    add ecx, 1                  ; increment to next digit in buffer
    mul esi                     ; multiply by magic number
    shrd eax, edx, 3            ; binary fractional "remainder" back in EAX
    shr edx, 3                  ; EDX = quotient
    add eax, 1                  ; precaution against occasional "underflow"
    mov edi, edx                ; save current quotient for next division
    mul ebx                     ; x10 gets "decimal" remainder into EDX
    add dl, 48                  ; convert remainder to ascii
    mov [ecx], dl               ; insert it into buffer
    mov eax, edi                ; retrieve current quotient
    test edi, edi               ; test if done
    jnz @B                      ; continue if not done

    mov BYTE PTR [ecx+1], 0     ; terminate string
    mov eax, [esp+20]           ; get 1st byte address in destination

  REPEAT 5                      ; reverse the string
    movzx edx, BYTE PTR [eax]
    movzx ebx, BYTE PTR [ecx]
    mov [eax], bl
    mov [ecx], dl
    add eax, 1
    sub ecx, 1
    cmp eax, ecx
    jge @F
  ENDM

  @@:

    pop edi
    pop esi
    pop ebx

    mov eax, [esp+8]            ; return the buffer address

    ret 8

utoa32 endp

OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php


hutch--

Thanks Paul, it looks like an interesting algo using the character table.
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

jj2007

Just for fun:
Intel(R) Celeron(R) M CPU        420  @ 1.60GHz (SSE3)
1.6 Ghz: 200 cycles == 8 Mio loops per second

210     cycles for signed Str$(12345)
210     cycles for unsigned Str$(12345)
468     cycles for wsprintf 12345
69      cycles for umdtoa 12345
65      cycles for dwtoa 12345

207     cycles for signed Str$(12345)
210     cycles for unsigned Str$(12345)
453     cycles for wsprintf 12345
69      cycles for umdtoa 12345
65      cycles for dwtoa 12345

Some fun with Str$():
On FPU, ST(6):  6000.00
On FPU, ST(1):  1000.00
PI in ST(0):    3.141592653589793238
Qword:          1234567890123456789
Real4:          12345.67773437500000
Real8:          1.234567889999999925e+37
as xmm:         1.234567889999999925e+37
Real10:         -1.234567890123456789e-111
ecx=-1, uns.:   4294967295
ecx=-1, signed: -1
ecx=-1, cx:     65535
ecx=-1, cl:     255
Simple arithmetics (eax=100, xmm0=200, ecx=1000):
-eax*xmm0/cx+100=80.00000


You need good arguments to get me back to invoke dwtoa, reg32, addr buffer ;-)

Print "Some fun with Str$():", CrLf$
push 1000*7
REPEAT 7
fild dword ptr [esp]
sub dword ptr [esp], 1000
ENDM
fldpi
Print Str$("On FPU, ST(6):\t%6f\n", ST(6)) ; prints but 6+7 are trashed afterwards
Print Str$("On FPU, ST(1):\t%5f\n", ST(1))
Print Str$("PI in ST(0):\t%Jf\n", ST(0))
Print Str$("Qword:   \t%i\n", MyQword)
Print Str$("Real4:    \t%Jf\n", MyR4)
Print Str$("Real8:    \t%Jf\n", MyR8)
movq xmm1, MyR8
Print Str$("as xmm:    \t%Jf\n", f:xmm1)
Print Str$("Real10:   \t%Jf\n", MyR10)
mov ecx, -1
Print Str$("ecx=-1, uns.:\t%u\n", ecx)
Print Str$("ecx=-1, signed:\t%i\n", ecx)
Print Str$("ecx=-1, cx:\t%i\n", cx)
Print Str$("ecx=-1, cl:\t%i\n", cl)
Print "Simple arithmetics (eax=100, xmm0=200, ecx=1000):", CrLf$
mov eax, 200
movd xmm0, eax
mov eax, 100
mov ecx, 1000
Print Str$("-eax*xmm0/cx+100=%f\n", -eax*xmm0/cx+100)
pop eax

dedndave

well - speed is what they are going for   :P
however, not very often do you need to convert thousands of dwords to ascii in a big hurry - lol
taking that into consideration, the routine should be small, flexible and, of course, reliable

hutch--

The main factor in using Ray's algo is it delivers the correct results over the whole DWORD range. There may be minor speedups but its reasonably fast as it is allowing that I put back the string reverse so that the result began at the beginning of the allocated buffer.

I need to have a look at Paul Dixon's table based algo as it has the potential to be faster again, main problem is I will have to have a good play with it to comprehend how its done.
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

dioxin

Hutch,
it'll not take long to verify if it works on all possible input values, signed and unsigned. If it does work then there's no need to understand why!

Paul.

dioxin

Hutch,
here's an attempt to explain the code, I don't know if it'll help you or confuse you even more:



Step 1. The sign (if needed)
There are 2 entry points, sdword: will calculate the signed result and udword will calculate the unsigned result.
Calling sdword sorts out the sign and then falls into udword as the routine is the same for both once the sign is sorted.




Step 2. Multiply the number by 2^45 \ 10000.
Imagine if the number was multiplied by 2^32. That would result in the number being transferred from eax to edx, just like a 32 bit shift.
To multiply by 2^32 \ 10000 will calculate the number\10000 to edx.
2^45 = 2^32 x 2^ 13 so to multiply the number by 2^45 \ 10000 will calculate the number\10000 shifted 13 bits left to edx

The 2^32 is hopefully clear.
The \10000 is splitting the 10 digit (max) number into 2 parts, the high 6 digits and the low 4 digits. \10000 is leaving 4 digits off.
The extra 2^13 is needed bacause the accuracy of the multiply causes rounding errors and those errors are reduced to < half a bit (i.e.no error) because of the extra 13 bits of accuracy.


step 3. Shift edx right 13 bits
This corrects for the 13 bit extra shift explained above and leaves just the top 6 digits in edx.
If this results in zero then the calculation of the first 6 digits can be skipped completely to save time.


Step 4. Calculate low 4 digits
We have the top 6 digits so multiply by 10000 will add 4 zeros to the end and subtract from the original number to give the lower 4 digits.


At this point we have the original number split into 2 parts, the top 6 digits in edx and the bottom 4 digits in edi.
We repeat a very similar process on each of those sets of digits to break them down into pairs of digits to look up in the table.


Step 5. Extract digits
Multiply a 6 digit number by 2^32\10000 + 1 and edx will receive the top 2 digits. The plus 1 is a rounding correction to make sure the result always rounds the correct way.

Subsequent multiplies by 100 will shift the next 2 digits into edx.

Each pair of digits is checked to see if it's no digits (00) 1 digit (0d) or 2 digits (dd)
No digits is ignored if it's at the start of a number. 1 digit is treated as a special case the first time it's found to account for an odd number of digits in the number as the leading zero needs to be supressed.
Once a digit is found, subsequent digits don't need to be checked so a faster "ZeroSupressed" code does the same thing but passes all digits to the result as they're all relevant once the first digit is found.
e.g if the number is 000100 then the first pair of zeroes is ignored. The next pair of digits (01) is treated as a special case and checked to see if it's 1 digit or 2 and treated accordingly. In this case it's 1 digit (the 1). Now that a valid digit is found, we must pass all following digits to the result as they're all relevant. The next 2 digits 00 are not supressed like the first pair or we'd get a result of 1 instead of 100.

There are 2 blocks of code which are very similar. The first block (labels next1, next2 etc.) searches for the first digit and as soon as the first significant digit is found it jumps to the corresponding position in the second block (label's zs1, zs2 etc.) where the leading zero supression checks are no longer needed so can be coded more efficiently.


When it comes to processing the lower 4 digits we multiply by 2^32\100 +1. The plus 1 is again the correction to ensure rounding goes the right way. This puts the first 2 digits in edx, as before and a subsequent multiply of eax by 100 gives the next 2 digits.


Since the numbers are extracted in order from sign, most significant character to Least signficant character there is no awkward reversing of the strings.
3 divisions are needed but they're all integrated with required multiplictaions so they take no aditional time.
Worst case there are only 7 multplies and no divides so the routine is quite fast.


Paul.

dioxin

I don't know if this one is of interest but it's a QWORD to ASCII version based on the same code. It has entry points for signed and unsigned DWORD and QWORD to ASCII.
It's a PowerBasic program so you'll need to extract the routine and put it in your favourite test program to check it out.

Paul.

hutch--

Paul,

I am tied up for the next day or so due to the election here in Australia but I will come back as soon as I can as this will be useful stuff.
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

oex

Quote from: hutch-- on August 20, 2010, 06:37:46 PM
I am tied up for the next day or so due to the election here in Australia

We're all rooting for you Hutch :bg
We are all of us insane, just to varying degrees and intelligently balanced through networking

http://www.hereford.tv

dioxin

Hutch,
you have elections in Australia? I thought we just sent down a Governor to keep you in line.

Paul.

hutch--

 :bg

Paul,

I think we lost that lurk in a round of budget cutbacks in 1901. Still, we get the Royal family for free and it keeps a polititian out of the job of President.
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

dedndave

i thought they hashed it out at the watering hole with a nice cold foster's - lol