
The MASM Forum Archive 2004 to 2012

General Forums => The Laboratory => Topic started by: FORTRANS on September 19, 2008, 04:40:14 PM

Title: Binary to Decimal using multiply
Post by: FORTRANS on September 19, 2008, 04:40:14 PM

Hi,

I was thinking about doing a binary to decimal display routine
using a multiply rather than dividing by ten as is usually done.
I have read discussions of this, but never seen any code for the
X86. So I hope this is also relatively new to some of you.

The idea is simple in decimal, the number is treated as a
fraction, and multiplying by ten pulls off the leading digit.
123 => .123, 123 * 10 = 1.230, pull off the integer and repeat.
So on a decimal computer it probably could make sense.

So I coded up a routine to do the equivalent with a binary
number to see how good it would be. I used byte numbers to do a proof
of concept. And basically it works, but the conventional divide
algorithm is better. Thus no effort to write a word or larger
routine.

It requires word multiplies to process a byte (actually three
digit) number, whereas a divide works with a byte divide. And the
suppression of leading zeroes looks to be more involved. With the
change in number of digits to print out, the first multiplication
would require a different multiplier, and the divide routines do
not require changes in the divisor. The first multiplier has a bit
of tolerance with a byte, but for usage up to 999 only 290H worked.
So a word version should be doable, though there seems no point.
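
The tolerance of the first multiplier can be checked by brute force. What follows is a minimal Python sketch, not the method used in the post; the helper names `three_digits` and `working_multipliers` are made up for illustration, and the 16-bit MUL is modelled with plain integers (high word = DX, low word = AX).

```python
def three_digits(n, first_mul):
    """Model the routine: one multiply by first_mul, then two multiplies by ten.

    Each product is split the way a 16-bit MUL leaves it in DX:AX:
    the high word is the digit, the low word is the remaining fraction.
    """
    digits = []
    frac = n
    for mul in (first_mul, 10, 10):
        product = frac * mul
        digits.append(product >> 16)   # DX: integer part, i.e. the digit
        frac = product & 0xFFFF        # AX: fraction carried to the next step
    return digits


def working_multipliers(limit, candidates=range(0x280, 0x2A0)):
    """Return the candidate multipliers that convert every n in 0..limit correctly."""
    good = []
    for m in candidates:
        if all(three_digits(n, m) == [n // 100, n // 10 % 10, n % 10]
               for n in range(limit + 1)):
            good.append(m)
    return good
```

Under this model, `working_multipliers(255)` leaves a little slack around 290H, while `working_multipliers(999)` leaves 290H alone, which agrees with the observation above.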

Examples using the numbers 255 and 999:

Thinking in terms of fixed point arithmetic, 0FFH is .99609375;
.5 + .25 + .125 + .0625 + .03125 + .015625 + .0078125 + .00390625.
And 0290H (656) is then 2.5625. 2.5625 * 0.99609375 = 2.55249,
giving the first digit and the proper fraction so that multiplying
by ten can get the next digits. And 999 => 3E7H, 3E7 * 290 = 9FFF0,
which is decimal 9.99+. You can see that the math is not exact, a
minor nit that the divide algorithm avoids.
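
The two examples can be traced step by step. This is a small Python sketch under the same assumptions as the prose (an 8:8 input times an 8:8 multiplier, split into a 16-bit integer and a 16-bit fraction); the function name `trace` is hypothetical.

```python
def trace(n, first_mul=0x290):
    """Print each product and the digit pulled from its high word; return the digits."""
    frac = n
    digits = ""
    for mul in (first_mul, 10, 10):
        product = frac * mul                       # what MUL leaves in DX:AX
        digit, frac = product >> 16, product & 0xFFFF
        print(f"  x {mul:X}h -> {product:05X}h  digit {digit}  "
              f"fraction {frac / 65536:.5f}")
        digits += str(digit)
    return digits


if __name__ == "__main__":
    for value in (255, 999):
        print(value, "->", trace(value))
```

The first line printed for 999 shows the product 9FFF0h from the text: the integer part is 9 and the fraction is just short of one, which is the inexactness mentioned above.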

Code follows, the SCALL macro was written for DOS, so replace for
another environment. I can attach the macro file if it is allowed,
but it was not created by me. Dated 1984 by ZDS, with a boilerplate
message. Though I see no real proprietary content.

Comments?

Steve N.

Code Select


        TITLE - BINary to DECimal conversion, test inverted logic.
        COMMENT *
   Do a binary to decimal routine trying to use a left to right
output order using a multiply, rather than a divide.
   Do with bytes at first to simplify debugging.
26 June 2008 by SRN.
*

; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
        .XCREF
        .XLIST
INCLUDE  DEFMS.ASM ; MACROs and MS-DOS definitions from Heath/Zenith software.
        .LIST
        .CREF

; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
; Example of usage.

        MOV     AL,BYTE PTR [Dividend]
        CALL Bin2DecM   ; Prints AL with min 3 Digits.
        CALL Newline

        SCALL   EXIT    ; .EXE exit

; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
; BINary to DECimal conversion a different way.  Assume unsigned numbers.
; The number in AL is printed to the console as three digits.  Actually
; lightly tested up to 999.
; 26 June 2008, 19 September 2008 cleaned up for posting.

;   INPUT:  AL contains the number to be converted to decimal ASCII.
;  OUTPUT:  No registers changed.

Bin2DecM:
        PUSH    AX      ; Save register used.  Change to EAX as needed.
        PUSH    BX
        PUSH    DX

        XOR     AH,AH   ; Optional to ensure AH is clear.

        MOV     BX,290H ; Multiplier to get first digit into DL.

        MUL     BX      ; Do a fixed point 8:8 x 8:8 multiply to get a 16:16.

        PUSH    AX      ; Calling DOS functions destroy AX.
        ADD     DL,'0'  ; Convert binary to ASCII.
        SCALL   CONOUT  ; Print leading decimal digit.
        POP     AX      ; And restore fraction.

        MOV     BX,10   ; Multiplier to get remaining digits.

        MUL     BX      ; And repeat as necessary.

        PUSH    AX
        ADD     DL,'0'
        SCALL   CONOUT
        POP     AX

        MUL     BX

        ADD     DL,'0'  ; The last digit, so no need to preserve AX.
        SCALL   CONOUT

        POP     DX      ; Restore and return.
        POP     BX
        POP     AX

        RET

Title: Re: Binary to Decimal using multiply
Post by: FORTRANS on September 20, 2008, 01:59:25 PM

And of course this morning, figured out how to do it all with byte
arithmetic. This is because the first multiplier has its low nybble
set to zero. One can shift 290H down to 29H and not lose precision.
This then places the integer in the high nybble of AH, where it is in the
middle of bit soup, and must be shifted up into DL or down into the
low nybble of AH.
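
A quick way to see where the integer lands is to model both multiplies. This is a Python sketch with made-up helper names, assuming the byte multiply is AL * 29H with the result in AX:

```python
def leading_digit_word(n):
    """Word multiply by 290H: the integer part ends up in DL (bits 16 and up)."""
    return (n * 0x290) >> 16


def leading_digit_byte(n):
    """Byte multiply by 29H: the same value shifted down four bits,
    so the integer part sits in the high nybble of AH (bits 12..15 of AX)."""
    ax = (n * 0x29) & 0xFFFF
    return ax >> 12


def fraction_byte(n):
    """The 12 bits below the integer are the fraction left for later digits."""
    return (n * 0x29) & 0x0FFF
```

For every byte value the two leading digits agree, and the 12-bit fraction from the byte multiply is exactly the 16-bit fraction from the word multiply shifted right by four, so nothing is lost at this first step.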

The first is equivalent to two multiplies and an implicit use of
word size logic. Not to mention getting the four bits out of AH/AX
and into DL/DX. Not much of a gain over explicit use of a single word
multiply with the integer ending up in DL for free.

The second is then equivalent to a multiply followed by a divide
through the use of a word shift (or byte shifts, and byte rotates
through carry), and then requires a move of AH to DL for each digit.
And on checking, it produces the wrong answer due to a loss of
precision.

Ugly. But it implies a word (dword) size routine might be possible
using only word (dword) logic, if the multiplier gets really lucky,
and you cheat a little.

Phooey, I thought this was dead. I guess I'll have to get a
stake to finish it off (calculate the numbers for those cases).
It figures that I only figure these things out after posting.

Oops,

Steve N.

Title: Re: Binary to Decimal using multiply
Post by: Mark_Larson on September 20, 2008, 06:27:00 PM

I am going to analyze your code. I know you are going to be posting new code. You want to make sure you don't do any 16-bit code in Windows on Intel processors. It is very slow. You want to use a byte or dword.

Quote from: FORTRANS on September 19, 2008, 04:40:14 PM
Code Select

; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
; BINary to DECimal conversion a different way.  Assume unsigned numbers.
; The number in AL is printed to the console as three digits.  Actually
; lightly tested up to 999.
; 26 June 2008, 19 September 2008 cleaned up for posting.

;   INPUT:  AL contains the number to be converted to decimal ASCII.
;  OUTPUT:  No registers changed.

Bin2DecM:
; With Windows you don't need to preserve ax, or dx ( eax, or edx).
; you also don't have to preserve ecx, so you should use it in place of bx.
; pushing AX shouldn't work under windows, the stack is 32-bit
        PUSH    AX      ; Save register used.  Change to EAX as needed.
        PUSH    BX
        PUSH    DX

;this causes a stall since it only updates part of the
; register, use xor eax,eax
        XOR     AH,AH   ; Optional to ensure AH is clear.

        MOV     BX,290H ; Multiplier to get first digit into DL.

Title: Re: Binary to Decimal using multiply
Post by: FORTRANS on September 21, 2008, 02:01:32 PM

Quote from: Mark_Larson on September 20, 2008, 06:27:00 PM
I am going to analyze your code.

Good. Nice to see some interest.

Quote
I know you are going to be posting new code. You want to make sure you don't do any 16-bit code in Windows on Intel processors. It is very slow. You want to use a byte or dword.

This is just "proof of concept" and/or algorithm tweaking. If it
proves to be useful I will follow your guidelines.

Code Select


; With Windows you don't need to preserve ax, or dx ( eax, or edx).
; you also don't have to preserve ecx, so you should use it in place of bx.
; pushing AX shouldn't work under windows, the stack is 32-bit

Noted. I am developing using DOS as I am more familiar with it,
and not currently set up for windows. I _am_ trying to get a Windows
environment set up. I am preserving registers due to the debug
style environment. And of course if the word version pans out, I
will look at a double word version, and will try to follow Windows
coding conventions.

Code Select


        PUSH    AX      ; Save register used.  Change to EAX as needed.

Code Select


;this causes a stall since it only updates part of the
; register, use xor eax,eax
        XOR     AH,AH   ; Optional to ensure AH is clear.

Um, no can do, the input is in AL. Is AND AX,00FFH better than
XOR AH,AH? Is AND EAX,000000FFH again better?

Thank you for your inputs.

Regards,

Steve N.

Title: Re: Binary to Decimal using multiply
Post by: hutch-- on September 21, 2008, 02:23:35 PM

Steve,

If you can make the conversion in your direct coding from 16 bit to full 32 bit it will be like getting out of a Model T Ford into a Formula One car, the difference is so great. Full FLAT memory model gives you gigabytes of address space, more instructions that are a lot faster, and the addressing is cleaner and simpler.

Title: Re: Binary to Decimal using multiply
Post by: FORTRANS on September 22, 2008, 07:03:02 PM

Hi Hutch,

I'm working on it. It would be nice to not run out of memory.
But about half (or a bit more) of my projects target an 80186
with DOS 5.0. Hmm, more instructions to misuse.

And I procrastinate. I've downloaded your MASM32 package, and
I'll put it on a computer when I find the disk space on the proper
one of them. And I got tied up on the current project.
Needed to code up a fixed point arithmetic decoder. Kept making
errors trying to calculate the constants.

Best regards,

Steve N.

Title: Re: Binary to Decimal using multiply
Post by: Mark_Larson on September 23, 2008, 05:12:43 PM

I rarely do "windows" programs. Drives me nuts. I always do "console" programs under Windows, which is exactly like running under DOS, except you can't use DOS interrupts. But you get the full 32-bit mode. It's also easier to program ( less coding) than doing Windows programming. Your programs are really similar to DOS programs ( I was a big DOS programmer, go figure).

I'd highly recommend you try it. You can still call the Win32 API, which is really important, and you don't have to worry about a Window.

The one exception is if I am doing 3D programming, I have to use a Window, but some 3D APIs allow for window creation through the API, and you don't have to do much.

SDL does that, and that is what I use. Supports Windows and Linux, so you can write one set of code and use both OSes. Supports threads, audio and other cool stuff, that also ports back and forth. Windows and Linux use really different threading models.

You have to do VERY little in order to do a Window in SDL as compared to Windows.

It also allows you to look for events (keyboard, mouse), other than that, there is no other Windows stuff that you have to handle.

It supports, at the low level, both DirectX and OpenGL, or just a software renderer. I use the software renderer when I do my frame buffer for raytracing. If you want to do hardware acceleration you can pick DirectX or OpenGL; the API under SDL is the same for both. You just have to specify which one you want.

www.libsdl.org

I have been using it for quite a few years, and I love it :)

Title: Re: Binary to Decimal using multiply
Post by: FORTRANS on September 23, 2008, 05:53:04 PM

Hi,

Coded up the 16-bit binary to decimal conversion using 32-bit code.
Tried to use word logic, but when I finally got the correct multiplier
it demonstrated that a word has insufficient precision. And I got to
write up a fixed point number display routine to speed up the search
for the best multiplier. I tried to follow Mark_Larson's suggestions
for 32-bit code. Replace the macro to output DL and it should work in
Windows.

Summary:
I coded up an algorithm to convert a binary number to ASCII decimal
using multiplies rather than the usual divide based routines.

Pros;
Uses multiplies rather than divides. I do not know how much that
matters on current processors, but it sounded good at the beginning.
It outputs the digits in a left to right fashion. This means that
no temporary storage is needed, unlike most divide based algorithms.
Those tend to generate the digits right to left, saving them, to then
display them left to right.

Cons;
It uses five multiplies even if the number is small. A divide based
routine can check for zero after each digit is created, and exit early.
Of course that then requires testing the number, conditional jumps, and
other logic.
The multiply algorithm requires more precision than the divide based
routine. To convert a word, a number larger than a word is used. That
one will finish this exercise for this investigation. The byte sized
routine could work around that to a certain extent but the word routine
can't.
If you do not want to display the leading zeroes, the divide routine
is probably easier to use.
If you want to print out a different number of digits, the initial
multiplier must be recalculated. The divide routine just uses ten for
any size number.

Regards,

Steve.

P.S.

Hi Mark_Larson,

Just saw your post. Hey, that looks interesting.
SRN

Code Select

; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
; BINary to DECimal conversion using multiplies.  Assume unsigned numbers.
; The number in AX is printed to the console as five digits.  Actually,
; works with all five digit numbers.

; 21 Sep Start word logic.  Which didn't work.
; 23 September 2008, start double word version.

;   INPUT:  AX or low half of EAX contains the 16 bit number to be
;           converted to decimal ASCII.
;    Uses:  EAX, ECX, EDX
;   Calls:  SCALL CONOUT, macro to write DL to standard output.

Bin2DecM:
        AND     EAX,0000FFFFH   ; Optional safety check, limit to 16 bits.
        MOV     EDX,00068DB9H   ; Multiplier to get leading decimal digit
                                ; into low byte of EDX (DL).
        MUL     EDX     ; Do a fixed point 16:16 x 16:16 multiply to get
                        ; a 32:32 result.  

        PUSH    EAX
        ADD     DL,'0'  ; Convert binary to ASCII.
        SCALL   CONOUT  ; Print leading decimal digit.
        POP     EAX     ; And restore fraction.

        MOV     ECX,10  ; Multiplier to get remaining digits.

        MUL     ECX     ; Second digit.

        PUSH    EAX
        ADD     DL,'0'
        SCALL   CONOUT
        POP     EAX

        MUL     ECX     ; And repeat as necessary.

        PUSH    EAX
        ADD     DL,'0'
        SCALL   CONOUT
        POP     EAX

        MUL     ECX

        PUSH    EAX
        ADD     DL,'0'
        SCALL   CONOUT
        POP     EAX

        MUL     ECX

        ADD     DL,'0'
        SCALL   CONOUT   ; Final digit

        RET
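
The routine above can be modelled in a few lines to check every 16-bit input. This is a Python sketch, not a replacement for the assembly; `bin2dec_mul` is a hypothetical name, and the 32-bit MUL is modelled by splitting the product into a high half (EDX) and a low half (EAX).

```python
MASK32 = 0xFFFFFFFF


def bin2dec_mul(n):
    """Return five ASCII digits for 0 <= n <= 0xFFFF, most significant first."""
    frac = n & 0xFFFF                       # AND EAX,0000FFFFH
    out = []
    for mul in (0x00068DB9, 10, 10, 10, 10):
        product = frac * mul                # MUL leaves the result in EDX:EAX
        out.append(chr((product >> 32) + ord("0")))   # DL + '0'
        frac = product & MASK32             # EAX: fraction for the next multiply
    return "".join(out)


if __name__ == "__main__":
    print(bin2dec_mul(65535), bin2dec_mul(42), bin2dec_mul(0))
```

The digits come out exact here because 68DB9H * 10000 exceeds 2^32 by only 2704, so the accumulated error stays below one unit in the last digit for any five-digit input.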

Title: Re: Binary to Decimal using multiply
Post by: qWord on September 23, 2008, 10:01:34 PM

hi, fortrans,

this is a very interesting idea! I've tested your code and there were no problems for all values (0-0ffffh).
Because I'm not familiar with fixed-point arithmetic I want to ask the following:
Is it possible to obtain more than one digit in one step (multiplication by FP-constant)? For example from the DWORD-value 0123456789 the leading 5 digits (01234)

regards, qWord

EDIT: an example to my question:

Code Select

mov eax,Fpconstant
mov edx,075BCD15h ; = 0123456789
mul edx
now edx = 04d2h = 1234
is this doable?

Title: Re: Binary to Decimal using multiply
Post by: MichaelW on September 23, 2008, 10:21:49 PM

I modified the code to copy the digits to a buffer, so I could compare cycle counts with the MASM32 dwtoa procedure. Running on my P3 the modified code is more than twice as fast as dwtoa. Typical results:

Code Select


39 cycles, Bin2DecM
95 cycles, dwtoa

If the code were modified to handle the full 32-bit range, and optimized, it might still be faster than dwtoa, even if the parameters were passed on the stack.

[attachment deleted by admin]

Title: Re: Binary to Decimal using multiply
Post by: drizz on September 24, 2008, 01:06:44 AM

Quote from: qWord on September 23, 2008, 10:01:34 PM
EDIT: an example to my question:
Code Select

mov eax,Fpconstant
mov edx,075BCD15h ; = 0123456789
mul edx
now edx = 04d2h = 1234
is this doable?

Yes it's doable!

you basically multiply the number (rounded) by (2^32/10^(something))

for example:

   mov eax,1234567890; **
   mov edx,01ADH; 2^32/10000000
   mul edx; you'll get 123

   mov eax,1234567890
   mov edx,010C7H; 2^32/1000000
   mul edx; you'll get 1234

   mov eax,1234567890
   mov edx,0A7C6H; 2^32/100000
   mul edx; you'll get 12345

   mov eax,1234567890
   mov edx,68DB9H; 2^32/10000
   mul edx; you'll get 123456

   mov eax,1234567890
   mov edx,418937h; 2^32/1000
   mul edx; you'll get 1234567

you can add some extra precision by multiplying the magic number by 2^x (so you can adjust by shifting right)
   mov eax,1234567890; **
   mov edx,1AD7F2Ah; (2^32/10000000)*2.0^16
   mul edx
   shr edx,16; you'll get 123

Of course these constants have to be carefully tested, as some numbers may lose precision and give wrong results.
The longer the integer part of 2^x/10^y (that is, the larger the magic number), the smaller the chance of error.
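
The constants above can be replayed with exact integers. This is a Python sketch with a hypothetical helper `top_digits`; it models `mul edx` followed by reading EDX as a right shift by 32, and the scaled variant as a further shift by 16.

```python
def top_digits(n, magic, extra_shift=0):
    """High 32 bits of n * magic, optionally shifted right again (shr edx,x)."""
    return ((n * magic) >> 32) >> extra_shift


N = 1234567890

if __name__ == "__main__":
    for magic in (0x1AD, 0x10C7, 0xA7C6, 0x68DB9, 0x418937):
        print(f"{magic:>7X}h -> {top_digits(N, magic)}")
    # The short constant 1ADh (429) is rounded down from 429.4967...,
    # so it already fails on an exact power of ten:
    print(top_digits(10_000_000, 0x1AD))          # gives 0, not 1
    # The 2^16-scaled constant carries more of the fraction:
    print(top_digits(10_000_000, 0x1AD7F2A, 16))  # gives 1
```

The failing case illustrates the warning about testing: a short magic number can be off by one near digit boundaries, while the scaled constant handles the same input.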

Title: Re: Binary to Decimal using multiply
Post by: qWord on September 24, 2008, 02:24:43 AM

thx drizz,

I've just got the idea that it could be possible to obtain the quotient and remainder of a division by 100000 with only two multiplications (particularly with regard to SIMD instructions), but afaics the precision is the problem.

regards qWord

Title: Re: Binary to Decimal using multiply
Post by: NightWare on September 24, 2008, 03:34:28 AM

guys, read this topic http://www.masm32.com/board/index.php?topic=8974.0 , you are recreating the michaelw/pdixon algo...

Quote from: qWord on September 24, 2008, 02:24:43 AM
but afaics the precision is the problem

it's ok as long as you don't exceed 100000...

Title: Re: Binary to Decimal using multiply
Post by: drizz on September 24, 2008, 03:41:52 PM

qWord check this out:

Code Select


OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE
align  16
DwToStr2 proc dwValue,pBuffer
	push edi
	push ebx
	mov edi,[esp+2*4+8];buf
	mov ebx,[esp+1*4+8];val
	;; split the value to two five digit numbers
	mov edx,0A7C5AC47h; 1/100000
	lea eax,[ebx+1]
	mul edx
	shr edx,16
	mov eax,edx
	imul edx,100000
	sub ebx,edx
	; first five
	mov ecx,68DB9h
	mul ecx
	add dl,'0'
	mov [edi+0],dl
	xi = 1
	rept 4
		mov edx,10
		mul edx
		add dl,'0'
		mov [edi+xi],dl
		xi = xi + 1
	endm
	; next five
	mov eax,ebx
	mul ecx
	add dl,'0'
	mov [edi+5],dl
	xi = 6
	rept 4
		mov edx,10
		mul edx
		add dl,'0'
		mov [edi+xi],dl
		xi = xi + 1
	endm
	mov byte ptr [edi+10],0
	pop ebx
	pop edi
	ret 2*4
DwToStr2 endp
OPTION PROLOGUE:PROLOGUEDEF 
OPTION EPILOGUE:EPILOGUEDEF
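
The structure of DwToStr2 (split by 100000 with a reciprocal multiply, then five digits from each half) can be modelled as follows. This is a Python sketch with hypothetical names; 32-bit registers are modelled by masking, and nothing here is taken from outside the listing above.

```python
RECIP_100K = 0xA7C5AC47      # floor(2**48 / 100000)
FIRST_MUL = 0x68DB9          # ceil(2**32 / 10000)
MASK32 = 0xFFFFFFFF


def five_digits(value):
    """Five characters from a value below 100000, using multiplies only."""
    frac, out = value, []
    for mul in (FIRST_MUL, 10, 10, 10, 10):
        product = frac * mul
        out.append(chr(((product >> 32) + 0x30) & 0xFF))   # add dl,'0'
        frac = product & MASK32
    return "".join(out)


def dw_to_str2(n):
    bumped = (n + 1) & MASK32                 # lea eax,[ebx+1] wraps at 32 bits
    high = (bumped * RECIP_100K) >> 48        # mul edx / shr edx,16
    low = (n - high * 100000) & MASK32        # imul edx,100000 / sub ebx,edx
    return five_digits(high) + five_digits(low)
```

One thing the model makes visible: for the input 0FFFFFFFFh the `+1` wraps to zero, so the quotient comes out as 0 and the result is wrong. The listing above has no special case for that value, so a caller would need to handle it separately.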

Title: Re: Binary to Decimal using multiply
Post by: FORTRANS on September 24, 2008, 04:33:46 PM

Hi MichaelW,

Thanks for the test case.

23 cycles, Bin2DecM
53 cycles, dwtoa

AMD 2000 MHz

However, you can comment out the push and pop of EAX, as that was used
because the macro destroys AX.

; PUSH EAX
ADD DL,'0'
;SCALL CONOUT

mov [ebx+1], dl

; POP EAX

NightWare,

Thanks for the link. Now that I wrote my own code, that makes more
sense than it did on first reading. My code is way simpler than that!
Now I see how that thread was compensating for the loss of precision.
It's food for thought. I just tried adjusting my code to print another
digit. It is good up to 900,000, and fails somewhere greater than that.

qWord,

I see your question was answered. But I wrote a small DOS program
to play with 16:16 bit fixed point arithmetic in binary and decimal,
and I could post the binary if it interests you. I used it to find
the multiplier for my code. It was coded up quickly, so it may still
have a bug, though it seems alright. It looks something like this:

Fixed Point to Decimal calculator.
Esc = Quit, 1 = Set, 0 = Reset, Move = Cursor

0 0 0 0 : 0 0 0 0 : 0 0 0 0 : 1 0 1 0 x 0 1 1 1 : 1 0 0 0 : 0 0 0 0 : 0 0 0 0

000A:7800

00010.4687500000000000
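
The decimal line of that display can be reproduced with the same multiply-by-ten trick applied to the fraction. This is a Python sketch, not the DOS program itself; `fixed_16_16_to_decimal` is a made-up name and the output format is copied from the sample above.

```python
def fixed_16_16_to_decimal(value, frac_digits=16):
    """Format a 16:16 fixed point number as NNNNN.dddd... using multiplies by ten."""
    whole, frac = value >> 16, value & 0xFFFF
    digits = []
    for _ in range(frac_digits):
        frac *= 10
        digits.append(str(frac >> 16))   # the integer part that pops out is the digit
        frac &= 0xFFFF                   # keep only the remaining fraction
    return f"{whole:05d}." + "".join(digits)


if __name__ == "__main__":
    print(fixed_16_16_to_decimal(0x000A7800))
```

Sixteen fraction digits are enough to show any 16-bit binary fraction exactly, since k/65536 never needs more than 16 decimal places.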

Regards,

Steve N.

Edit: The 900,000 figure is a bit wrong. The multiplier does not cover the full range.
The multiplier for 1 - 100,000 is too big for, say 400,000 - 500,000. You
would need to test for what range the number was and adjust for it.

Title: Re: Binary to Decimal using multiply
Post by: drizz on September 24, 2008, 04:53:41 PM

Quote from: NightWare on September 24, 2008, 03:34:28 AM...

NightWare, it's fun reinventing the wheel :bg

Title: Re: Binary to Decimal using multiply
Post by: drizz on September 24, 2008, 08:20:54 PM

like this

Code Select

OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE

align 16
DwToStr5 proc dwValue,pBuffer
	mov eax,[esp+1*4];val
	mov edx,89705F41h; bits(4) == 3 == b, 2^(64-b)/1000000000
	add eax,1
	mov ecx,10
	jz @F
	mul edx
	shrd eax,edx,(64-3) and 31
	shr edx,(64-3) and 31
	for mmx,<0,1,2,3,4,5,6>
	%movd mm&mmx&,edx
	mul ecx
	endm
	movd mm7,edx
	punpcklbw mm0,mm1
	punpcklbw mm2,mm3
	punpcklbw mm4,mm5
	punpcklbw mm6,mm7
	punpcklwd mm0,mm2
	punpcklwd mm4,mm6
	punpckldq mm0,mm4
	mul ecx
	movd mm1,edx
	mul ecx
	movd mm2,edx
	punpcklbw mm1,mm2
	mov edx,'00'
	mov eax,'0000'
	mov ecx,[esp+2*4];buf
	movd mm6,edx
	movd mm7,eax
	punpckldq mm7,mm7
	paddb mm1,mm6
	paddb mm0,mm7
	movq [ecx+8],mm1
	movq [ecx+0],mm0
	ret 2*4
@@:
	mov edx,[esp+2*4];buf
	mov dword ptr [edx+0],'4924'
	mov dword ptr [edx+4],'2769'
	mov dword ptr [edx+8],'59'
	ret 2*4
	
DwToStr5 endp

OPTION PROLOGUE:PROLOGUEDEF 
OPTION EPILOGUE:EPILOGUEDEF
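
Leaving the MMX packing aside, the arithmetic of DwToStr5 is a single reciprocal multiply followed by nine multiplies by ten. This is a Python sketch of that arithmetic only, with hypothetical names and registers modelled by masking:

```python
RECIP_1E9 = 0x89705F41        # floor(2**61 / 10**9)
MASK32 = 0xFFFFFFFF


def dw_to_str5(n):
    bumped = (n + 1) & MASK32              # add eax,1
    if bumped == 0:                        # jz @F: input was 0FFFFFFFFh
        return "4294967295"
    fixed = (bumped * RECIP_1E9) >> 29     # shrd/shr by (64-3) and 31 = 29
    out = [fixed >> 32]                    # EDX: leading digit
    frac = fixed & MASK32                  # EAX: 32-bit fraction
    for _ in range(9):
        product = frac * 10                # mul ecx
        out.append(product >> 32)
        frac = product & MASK32
    return "".join(str(d) for d in out)
```

The `+1` before the multiply compensates for the reciprocal being rounded down, and the explicit branch covers the one input where that increment wraps to zero.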

Title: Re: Binary to Decimal using multiply
Post by: NightWare on September 24, 2008, 09:09:42 PM

Quote from: drizz on September 24, 2008, 04:53:41 PM
it's fun reinventing the wheel :bg

yep, especially this algo, it's a very interesting one: stable (only a small difference between large and small values), very fast (it appears slower than a LUT in speed tests, but in real use it's another thing :wink) and, more important, it's very adaptable to many cases/digit counts...

Title: Re: Binary to Decimal using multiply
Post by: qWord on September 24, 2008, 11:59:40 PM

Quote from: FORTRANS on September 24, 2008, 04:33:46 PM
... I could post the binary if it interests you.

looks interesting, please post - TIA

-----

Quote from: drizz on September 24, 2008, 08:20:54 PM
like this

On my Core 2 Duo it takes ~39 clocks.

I've also written a function using SSE2:

Code Select


;RETURN: eax == pointer in buffer
OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE
align 16
d2a proc uDword:DWORD,lpBuffer:DWORD


    .data
        align 16
        fp_100k_div OWORD 0A7C5AC47h
        fp_100k_mul OWORD 100000
        fp_const    QWORD 068DB9h, 068DB9h
        fp_10       QWORD 10     , 10
        fp_asc      db 4 dup(030h)
                    db 030h,0,0,0
                    db 4 dup(030h)
                    db 030h,0,0,0
    ;   fp_cmp      db 10 dup (030h)
    ;               db 6 dup (0)
    .code

    mov eax,DWORD ptr [esp+4] ; uDword
    test eax,eax
    .if !ZERO?
        .if eax != -1

            lea eax,[eax+1]
            movd xmm0,eax
            pmuludq xmm0,fp_100k_div
            psrlq xmm0,16
            movdqa xmm1,xmm0
            pmuludq xmm1,fp_100k_mul

            psrlq xmm0,32
            psrlq xmm1,32

            punpcklqdq xmm0,xmm1

            pmuludq xmm0,OWORD ptr fp_const
            movdqa xmm1,xmm0

            pmuludq xmm1,OWORD ptr fp_10
            movdqa xmm2,xmm1
            pslld xmm1,8

            pmuludq xmm2,OWORD ptr fp_10
            movdqa xmm3,xmm2
            pslld xmm2,16

            pmuludq xmm3,OWORD ptr fp_10
            movdqa xmm4,xmm3
            pslld xmm3,24
            por xmm0,xmm1

            pmuludq xmm4,OWORD ptr fp_10
            por xmm2,xmm3
            psrlq xmm4,32
            psllq xmm4,32

            por xmm0,xmm2

            mov eax,DWORD ptr [esp+8] ; lpBuffer
            pxor xmm2,xmm2
            psrlq xmm0,32
            por xmm0,xmm4
            paddb xmm0,OWORD ptr fp_asc
            movdqa xmm1,xmm0
            punpcklqdq xmm1,xmm2
            punpckhqdq xmm2,xmm0
            psrldq xmm2,3
            por xmm1,xmm2

            ;movdqa xmm7,xmm1               ; suppress leading zeros
            ;pcmpeqb xmm7,OWORD ptr fp_cmp  ;
            ;pmovmskb edx,xmm7              ;
            ;not edx                        ;
            ;bsf edx,edx                    ;

            movdqa OWORD ptr [eax],xmm1
            lea eax,[eax+edx]              ; NOTE: EDX is only set by the commented-out zero-suppression block above
            ret 8
        .else
            mov eax,DWORD ptr [esp+8] ; lpBuffer
            mov DWORD ptr [eax],034393234h
            mov DWORD ptr [eax+4],032373639h
            mov DWORD ptr [eax+8],03539h
            ret 8
        .endif
    .else
        mov eax,DWORD ptr [esp+8] ;lpBuffer
        mov DWORD ptr [eax],030h
        ret 8
    .endif

d2a endp
OPTION PROLOGUE:PROLOGUEDEF
OPTION EPILOGUE:EPILOGUEDEF

It's a bit faster than drizz's one:
(test-value = 1234567890)
without suppressing leading zeros: ~ 29 clocks :green
with suppressing: ~ 37 clocks

regards qWord

EDIT: there were some senseless instructions (movdqa xmm3,xmm0 and psubd xmm3,xmm1) in the code. I've deleted them.

Title: Re: Binary to Decimal using multiply
Post by: FORTRANS on September 25, 2008, 02:29:54 PM

Quote from: qWord on September 24, 2008, 11:59:40 PM
Quote from: FORTRANS on September 24, 2008, 04:33:46 PM
... I could post the binary if it interests you.
looks interesting, please post - TIA

Here it is.

I should note that the thread that NightWare pointed out also shows that
leading zero suppression is easily done using the fact that a MULtiply sets the
carry and overflow flags if the high byte, word, or double word is nonzero. I
stated earlier that that looked messy. Oops. And using a constant in memory,
rather than loading it in a register, would save an instruction.
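
The flag trick can be sketched as follows. This is a Python model of the idea only, not the code from the linked thread; `bin2dec_no_leading_zeros` is a hypothetical name, and the carry flag is modelled as "high half of the product is nonzero", which is how MUL sets CF and OF.

```python
MASK32 = 0xFFFFFFFF


def bin2dec_no_leading_zeros(n):
    """Decimal string for 0 <= n <= 0xFFFF without leading zeroes."""
    frac = n & 0xFFFF
    out = []
    started = False
    for step, mul in enumerate((0x68DB9, 10, 10, 10, 10)):
        product = frac * mul
        high, frac = product >> 32, product & MASK32
        carry = high != 0          # MUL: CF = OF = 1 when the high half is nonzero
        # Skip digits until the first carry; always emit the final digit.
        if carry or started or step == 4:
            out.append(chr(high + ord("0")))
            started = True
    return "".join(out)
```

In assembly the `carry` test would be a JC or JNC straight after each MUL, so no extra compare is needed to detect a zero digit.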

Regards,

Steve N.

[attachment deleted by admin]

Title: Re: Binary to Decimal using multiply
Post by: jj2007 on September 25, 2008, 09:52:54 PM

Quote from: qWord on September 24, 2008, 11:59:40 PM
I've also written a function using SSE2:

Looks interesting, especially since it does not use the FPU registers. Can you extend it to qwords?

258 cycles for float$ REAL8 1.234568 (http://www.masm32.com/board/index.php?topic=9756.msg72641#msg72641)

Title: Re: Binary to Decimal using multiply
Post by: drizz on September 25, 2008, 11:36:46 PM

and what do you guys think of this one :)

Code Select


; no frame
align 16
DwToStr7 proc dwValue,pBuffer
	mov edx,089705F41H
	mov eax,[esp+1*4];val
	mul edx
	mov [esp+1*4],ebx
	add eax,070000000H
	adc edx,0
	movd mm0,edx
	psrld mm0,1
	and edx,01FFFFFFFH
	mov ecx,eax
	mov ebx,edx
	shld edx,eax,2;mul by 5
	shl eax,2
	add eax,ecx
	adc edx,ebx
	mov ecx,00FFFFFFFH
	movd mm1,edx
	and edx,ecx
	add edx,edx
	lea edx,[edx*4+edx]
	movd mm2,edx
	and edx,ecx
	add edx,edx
	lea edx,[edx*4+edx]
	movd mm3,edx
	and edx,ecx
	add edx,edx
	lea edx,[edx*4+edx]
	movd mm4,edx
	and edx,ecx
	add edx,edx
	lea edx,[edx*4+edx]
	movd mm5,edx
	and edx,ecx
	add edx,edx
	lea edx,[edx*4+edx]
	movd mm6,edx
	and edx,ecx
	add edx,edx
	lea edx,[edx*4+edx]
	movd mm7,edx
	punpckldq mm0,mm1
	punpckldq mm2,mm3
	punpckldq mm4,mm5
	punpckldq mm6,mm7
	psrld mm0,32-4
	psrld mm2,32-4
	psrld mm4,32-4
	psrld mm6,32-4
	mov eax,'0000'
	movd mm7,eax
	packssdw mm0,mm2
	packssdw mm4,mm6
	punpckldq mm7,mm7
	packsswb mm0,mm4
	paddb mm0,mm7
	and edx,ecx
	add edx,edx
	lea edx,[edx*4+edx]
	and ecx,edx
	shr edx,28
	add ecx,ecx
	lea ecx,[ecx*4+ecx]
	shr ecx,28-8
	mov ebx,[esp+1*4]
	mov eax,[esp+2*4];buf
	and ecx,0FF00h
	lea edx,[edx+ecx+'00']
	movq [eax+0],mm0
	mov [eax+8],edx
	ret 2*4
DwToStr7 endp

Title: Re: Binary to Decimal using multiply
Post by: NightWare on September 26, 2008, 02:13:07 AM

Quote from: jj2007 on September 25, 2008, 09:52:54 PM
Looks interesting, especially since it does not use the FPU registers. Can you extend it to qwords?

jj, i've no time for a real8 version, but this real4 version may help you :

Code Select

.DATA
ALIGN 16
_TCA_Simd_Multiplicateur_Decimales_		REAL4 100000000.0f

.CODE
ALIGN 16
;
; convert an IEEE value to text in signed decimal format (format: -x xxx xxx xxx . xxx xxx xx, i.e. 20 characters
; counting the sign...).
; note: XMM0 and XMM1 are modified
;
; syntaxe :
; mov eax,{the real4 value}
; mov esi,{OFFSET of the string to create}
; call Real4ToString
;
; Return :
; eax = length of the string
;
;
Real4ToString PROC
		push ebx										;; push ebx
		push ecx										;; push ecx
		push edx										;; push edx
		push esi										;; push esi
		push edi										;; push edi

; start by getting the absolute value; test whether the value is signed
		btr eax,31									;; test the sign, and keep the absolute value of eax
		jnc Label00									;; if the bit is not set, go to Label00
		mov BYTE PTR [esi],"-"							;; otherwise, put the - sign in the first byte of the string
		inc esi										;; and increment the address in esi
; next, separate the integer part and the fractional part
Label00:	movd XMM1,eax									;; put the variable in XMM1
		cvttss2si eax,XMM1								;; put the integer (truncated, no rounding) of XMM1 in eax
		cvtsi2ss XMM0,eax								;; put the integer in eax into XMM0 in real4 format
		subss XMM1,XMM0								;; subtract XMM0 from XMM1
		mulss XMM1,_TCA_Simd_Multiplicateur_Decimales_		;; multiply the remaining decimals by the multiplier (now XMM1 holds the decimals in its integer part)
; here we test whether there is rounding up to the next integer
		cvtss2si edi,XMM1								;; put the integer (the decimals multiplied by our multiplier) of XMM1 in edi
		cmp edi,99999999								;; test whether anything is left in edi (after a virtual decimal pass)
		jnae Label01									;; if not above or equal, there is no rounding, so go to Label01
		inc eax										;; otherwise we must round up to the next value, so increase eax by 1
Label01:	mov ecx,eax									;; copy eax to ecx
;		inc eax										;; so that the division that follows gives the exact result (unnecessary since the values are signed)
;		jnz Label02									;; if FFFFFFFFh+1 <> 0, go to Label02
;; CasSpecial (special case) :
;		mov DWORD PTR [esi],"4924"						;; ) store the corresponding value directly
;		mov DWORD PTR [esi+4],"2769"						;; )
;		mov WORD PTR [esi+8],"59"						;; )
;		jmp Label14									;; go to Label14
;ALIGN 4
; divide eax by 100000 in an optimized way
Label02:	mov edx,2814749768								;; put 2814749768 (replaces 2814749767 and eax+1) in edx
;		mov edx,2814749767								;; put 2814749767 in edx
		mul edx										;; multiply eax by 2814749768
		shr edx,16									;; shift edx right by 16 bits
		mov eax,100000									;; put 100000 in eax
		mov ebx,edx									;; copy edx to ebx
		mul edx										;; multiply edx by 100000
		sub ecx,eax									;; subtract the result in eax from ecx
		test ebx,ebx									;; set the flags from ebx
		mov edx,ebx									;; put ebx back in edx
		mov ebx,10									;; put 10 (our decimal multiplier) in ebx
		jz Label03									;; if it is equal to 0, go to Label03
; otherwise, process the upper part of the number
		mov eax,429497									;; replace eax with 429497
		mul edx										;; multiply edx by 429497 (so the upper 5 decimal digits can be extracted correctly)
		jc Label04									;; if there is an overflow (carry set), go to Label04
		dec esi										;; decrement the address
		mul ebx										;; multiply eax by 10
		jc Label05									;; if there is an overflow (carry set), go to Label05
		dec esi										;; decrement the address
		mul ebx										;; multiply eax by 10
		jc Label06									;; if there is an overflow (carry set), go to Label06
		dec esi										;; decrement the address
		mul ebx										;; multiply eax by 10
		jc Label07									;; if there is an overflow (carry set), go to Label07
		dec esi										;; decrement the address
		jmp Label08									;; go to Label08
;ALIGN 4
; ici, on traite la partie inférieure du nombre
Label03:	mov eax,429497									;; on remplace eax par 429497
		sub esi,5										;; enlever 5 caractères à esi
		mul ecx										;; multiplier ecx par 429497 (pour pouvoir extraire correctement les 5 décimales supérieures)
		jc Label09									;; s'il existe un dépassement, aller Label09
		dec esi										;; décrémenter l'adresse
		mul ebx										;; multiplier eax par 10
		jc Label10									;; s'il existe un dépassement, aller Label10
		dec esi										;; décrémenter l'adresse
		mul ebx										;; multiplier eax par 10
		jc Label11									;; s'il existe un dépassement, aller Label11
		dec esi										;; décrémenter l'adresse
		mul ebx										;; multiplier eax par 10
		jc Label12									;; s'il existe un dépassement, aller Label12
		dec esi										;; décrémenter l'adresse
		jmp Label13									;; aller Label13
;ALIGN 4
; here, store XXXXXXXXXX (the terminating zero is stored at Label18)
Label04:	add dl,"0"									;; add the base character "0" to the value in dl
		mov BYTE PTR [esi],dl							;; store the byte in dl at [esi]
		mul ebx										;; multiply eax by 10
Label05:	add dl,"0"									;; add the base character "0" to the value in dl
		mov BYTE PTR [esi+1],dl							;; store the byte in dl at [esi+1]
		mul ebx										;; multiply eax by 10
Label06:	add dl,"0"									;; add the base character "0" to the value in dl
		mov BYTE PTR [esi+2],dl							;; store the byte in dl at [esi+2]
		mul ebx										;; multiply eax by 10
Label07:	add dl,"0"									;; add the base character "0" to the value in dl
		mov BYTE PTR [esi+3],dl							;; store the byte in dl at [esi+3]
Label08:	mul ebx										;; multiply eax by 10
		add dl,"0"									;; add the base character "0" to the value in dl
		mov BYTE PTR [esi+4],dl							;; store the byte in dl at [esi+4]
		mov eax,429497									;; replace eax with 429497
		mul ecx										;; multiply ecx by 429497 (so the lower 5 digits can be extracted correctly)
Label09:	add dl,"0"									;; add the base character "0" to the value in dl
		mov BYTE PTR [esi+5],dl							;; store the byte in dl at [esi+5]
		mul ebx										;; multiply eax by 10
Label10:	add dl,"0"									;; add the base character "0" to the value in dl
		mov BYTE PTR [esi+6],dl							;; store the byte in dl at [esi+6]
		mul ebx										;; multiply eax by 10
Label11:	add dl,"0"									;; add the base character "0" to the value in dl
		mov BYTE PTR [esi+7],dl							;; store the byte in dl at [esi+7]
		mul ebx										;; multiply eax by 10
Label12:	add dl,"0"									;; add the base character "0" to the value in dl
		mov BYTE PTR [esi+8],dl							;; store the byte in dl at [esi+8]
Label13:	mul ebx										;; multiply eax by 10
		add dl,"0"									;; add the base character "0" to the value in dl
		mov BYTE PTR [esi+9],dl							;; store the byte in dl at [esi+9]
; next, test whether there is a fractional part
Label14:	add esi,10									;; (cannot be merged with the addition that follows, because of the special case...)
		test edi,edi									;; ) no decimals? then go to Label18
		jz Label18									;; )
		cmp edi,99999999								;;	) number already rounded? then go to Label18
		ja Label18									;;	)
; continue by storing the decimal point and the decimals
		mov eax,edi									;; put the decimals in eax
		lea edi,[esi+1]								;; save the address (start of the decimals) in edi
		mov BYTE PTR [esi],"."							;; otherwise, write the decimal point
		add esi,9										;; put the end of the string (start of the decimals + maximum number of decimals) in esi
		mov ecx,3435973837								;; put our magic multiplier in ecx
Label15:	dec esi										;; decrement the write address
		mov ebx,eax									;; save eax in ebx
		mul ecx										;; ) divide eax by 10
		shr edx,3										;; )
		mov eax,edx									;; copy the quotient obtained in edx to eax
		lea edx,[edx*4+eax]								;; ) multiply eax by 10 and put the result in edx
		add edx,edx									;; )
		sub ebx,edx									;; subtract edx from ebx (ebx = remainder digit)
		jz Label15									;; if ebx is 0 (trailing zero, nothing to write), go to Label15
		add bl,"0"									;; otherwise, add the base character (and move on to the next loop...)
		mov BYTE PTR [esi],bl							;; store the character
		cmp esi,edi									;; ) test needed in case there is only one character
		jbe Label17									;; )
		push esi										;; save the address of the last character
; from here on, trailing zeros no longer matter
Label16:	dec esi										;; decrement the write address
		mov ebx,eax									;; save eax in ebx
		mul ecx										;; ) divide eax by 10
		shr edx,3										;; )
		mov eax,edx									;; copy the quotient obtained in edx to eax
		lea edx,[edx*4+edx]								;; ) multiply edx by 10
		add edx,edx									;; )
		sub ebx,edx									;; subtract edx from ebx (ebx = remainder digit)
		add bl,"0"									;; add the base character
		mov BYTE PTR [esi],bl							;; store the character
		cmp esi,edi									;; compare the address with the start of the fractional part
		ja Label16									;; while it is above, go to Label16
; finally, exit
		pop esi										;; restore the address of the last character
Label17:	inc esi										;; increment the address to store the terminating zero
; alternative exit... (when there is no fractional part)
Label18:	mov BYTE PTR [esi],0							;; store the terminating 0
		mov eax,esi									;; copy esi (the current address) to eax

		pop edi										;; restore edi
		pop esi										;; restore esi
		pop edx										;; restore edx
		pop ecx										;; restore ecx
		pop ebx										;; restore ebx
		sub eax,esi									;; to get the length of the created string in eax
	ret												;; return (leave the procedure)
Real4ToString ENDP

Title: Re: Binary to Decimal using multiply
Post by: jj2007 on September 26, 2008, 07:07:00 AM

Quote from: NightWare on September 26, 2008, 02:13:07 AM
jj, i've no time for a real8 version, but this real4 version may help you :

Thanks a lot, I'll see whether it speeds up the r4 conversions.

Title: Re: Binary to Decimal using multiply
Post by: NightWare on September 26, 2008, 10:48:05 PM

Here are 2 uDw2A algos. Instead of 5+5 digits they split the number as 2+4+4 and 4+4+2 (easier to convert to MMX/SSE2). Both algos are a bit slower for large values, but a bit faster for small (most used) values. Besides, because of the "divisions" used here, there is no need to take special care of FFFFFFFFh.

Code Select

ALIGN 16
;
; convert a dword to ascii string (2+4+4 digits)
;
; syntax :
; mov eax,{the value}
; mov esi,{OFFSET of the string to create}
; call uDw2A
;
; Return :
; eax = length of the string
;
uDw2A PROC
		push ebx										;; save ebx
		push ecx										;; save ecx
		push edx										;; save edx
		push esi										;; save esi
		push edi										;; save edi

; divide eax by 10000 the optimised way
		mov edx,3518437209								;; put 3518437209 in edx
		mov ecx,eax									;; copy eax to ecx
		mul edx										;; multiply eax by 3518437209
		shr edx,13									;; shift edx right by 13 bits
		mov eax,10000									;; put 10000 in eax
		mov edi,edx									;; copy edx (the quotient) to edi
		mul edx										;; multiply edx by 10000
		sub ecx,eax									;; subtract the result in eax from ecx (ecx = lower 4 digits)
		test edi,edi									;; set the flags from edi
		mov edx,edi									;; put edi back in edx
		mov edi,10									;; put 10 (our decimal multiplier) in edi
		jz Label01									;; if the quotient is 0, go to Label01
; divide the quotient (in edx) by 10000 the optimised way
		mov eax,3518437209								;; put 3518437209 in eax
		mov ebx,edx									;; copy edx to ebx
		mul edx										;; multiply edx by 3518437209
		shr edx,13									;; shift edx right by 13 bits
		mov eax,10000									;; put 10000 in eax
		mov edi,edx									;; copy edx (the quotient) to edi
		mul edx										;; multiply edx by 10000
		sub ebx,eax									;; subtract the result in eax from ebx (ebx = middle 4 digits)
		test edi,edi									;; set the flags from edi
		mov edx,edi									;; put edi back in edx
		mov edi,10									;; put 10 (our decimal multiplier) in edi
		jz Label00									;; if the quotient is 0, go to Label00
;ALIGN 4
; otherwise, process the XX-------- part of the number
		mov eax,429496730								;; replace eax with 429496730
		mul edx										;; multiply edx by 429496730 (so the upper 2 digits can be extracted correctly)
		jc Label02									;; if it overflows into edx, go to Label02
		dec esi										;; decrement the address
		jmp Label03									;; go to Label03
;ALIGN 4
; otherwise, process the --XXXX---- part of the number
Label00:	sub esi,2										;; take 2 characters off esi (XX--------)
		mov eax,4294968								;; replace eax with 4294968
		mul ebx										;; multiply ebx by 4294968 (so the middle 4 digits can be extracted correctly)
		jc Label04									;; if it overflows into edx, go to Label04
		dec esi										;; decrement the address
		mul edi										;; multiply eax by 10
		jc Label05									;; if it overflows into edx, go to Label05
		dec esi										;; decrement the address
		mul edi										;; multiply eax by 10
		jc Label06									;; if it overflows into edx, go to Label06
		dec esi										;; decrement the address
		jmp Label07									;; go to Label07
;ALIGN 4
; here, process the ------XXXX part of the number
Label01:	sub esi,6										;; take 6 characters off esi (XXXXXX----)
		mov eax,4294968								;; replace eax with 4294968
		mul ecx										;; multiply ecx by 4294968 (so the lower 4 digits can be extracted correctly)
		jc Label08									;; if it overflows into edx, go to Label08
		dec esi										;; decrement the address
		mul edi										;; multiply eax by 10
		jc Label09									;; if it overflows into edx, go to Label09
		dec esi										;; decrement the address
		mul edi										;; multiply eax by 10
		jc Label10									;; if it overflows into edx, go to Label10
		dec esi										;; decrement the address
		jmp Label11									;; go to Label11
;ALIGN 4
; here, store XXXXXXXXXX and the terminating zero
Label02:	add dl,30h									;; add the base character "0" to the value in dl
		mov BYTE PTR [esi],dl							;; store the byte in dl at [esi]
Label03:	mul edi										;; multiply eax by 10
		add dl,30h									;; add the base character "0" to the value in dl
		mov BYTE PTR [esi+1],dl							;; store the byte in dl at [esi+1]
		mov eax,4294968								;; replace eax with 4294968
		mul ebx										;; multiply ebx by 4294968 (so the middle 4 digits can be extracted correctly)
Label04:	add dl,30h									;; add the base character "0" to the value in dl
		mov BYTE PTR [esi+2],dl							;; store the byte in dl at [esi+2]
		mul edi										;; multiply eax by 10
Label05:	add dl,30h									;; add the base character "0" to the value in dl
		mov BYTE PTR [esi+3],dl							;; store the byte in dl at [esi+3]
		mul edi										;; multiply eax by 10
Label06:	add dl,30h									;; add the base character "0" to the value in dl
		mov BYTE PTR [esi+4],dl							;; store the byte in dl at [esi+4]
Label07:	mul edi										;; multiply eax by 10
		add dl,30h									;; add the base character "0" to the value in dl
		mov BYTE PTR [esi+5],dl							;; store the byte in dl at [esi+5]
		mov eax,4294968								;; replace eax with 4294968
		mul ecx										;; multiply ecx by 4294968 (so the lower 4 digits can be extracted correctly)
Label08:	add dl,30h									;; add the base character "0" to the value in dl
		mov BYTE PTR [esi+6],dl							;; store the byte in dl at [esi+6]
		mul edi										;; multiply eax by 10
Label09:	add dl,30h									;; add the base character "0" to the value in dl
		mov BYTE PTR [esi+7],dl							;; store the byte in dl at [esi+7]
		mul edi										;; multiply eax by 10
Label10:	add dl,30h									;; add the base character "0" to the value in dl
		mov BYTE PTR [esi+8],dl							;; store the byte in dl at [esi+8]
Label11:	mul edi										;; multiply eax by 10
		add dx,30h									;; add the base character "0" to the value in dx
		mov WORD PTR [esi+9],dx							;; store the byte in dl, and the terminating 0 in dh, at [esi+9]
		lea eax,[esi+10]								;; put esi+10 (the address of the terminating zero) in eax

		pop edi										;; restore edi
		pop esi										;; restore esi
		pop edx										;; restore edx
		pop ecx										;; restore ecx
		pop ebx										;; restore ebx
		sub eax,esi									;; to get the length of the created string in eax
	ret												;; return (leave the procedure)
uDw2A ENDP
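
A note for anyone who wants to check the digit-extraction step without an assembler: the sketch below models it in Python. It is an illustration only, not NightWare's code. The names `mul32` and `digits_by_multiply` are invented here, and a Python integer stands in for the edx:eax pair. The scale constants 4294968 and 429496730 are taken from the routine above; they are 2^32/1000 and 2^32/10, rounded up.

```python
MASK32 = 0xFFFFFFFF

SCALE_4_DIGITS = 4294968      # ceil(2**32 / 1000), used for the XXXX groups
SCALE_2_DIGITS = 429496730    # ceil(2**32 / 10), used for the XX group


def mul32(a, b):
    """Model of the x86 `mul`: return (edx, eax) for the 64-bit product a*b."""
    product = (a & MASK32) * (b & MASK32)
    return product >> 32, product & MASK32


def digits_by_multiply(value, ndigits, scale):
    """Return `value` as `ndigits` decimal characters (leading zeros kept).

    The first multiply leaves the leading digit in edx and a binary
    fraction in eax. Each further multiply by 10 pushes the next digit
    out of the fraction into edx.
    """
    edx, eax = mul32(value, scale)
    chars = [chr(edx + 0x30)]
    for _ in range(ndigits - 1):
        edx, eax = mul32(eax, 10)
        chars.append(chr(edx + 0x30))
    return "".join(chars)


if __name__ == "__main__":
    print(digits_by_multiply(1234, 4, SCALE_4_DIGITS))
    print(digits_by_multiply(7, 2, SCALE_2_DIGITS))
```

Run over every 4-digit and every 2-digit value, this model gives the same characters as ordinary formatting, which suggests the small rounding excess in the constants never reaches a digit for groups this short.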

Code Select

ALIGN 16
;
; convert a dword to ascii string (4+4+2 digits)
;
; syntax :
; mov eax,{the value}
; mov esi,{OFFSET of the string to create}
; call uDw2A
;
; Return :
; eax = length of the string
;
uDw2A PROC
		push ebx										;; save ebx
		push ecx										;; save ecx
		push edx										;; save edx
		push esi										;; save esi
		push edi										;; save edi

; divide eax by 100 the optimised way
		mov edx,2748779070								;; put 2748779070 (2748779069+1) in edx
		mov ecx,eax									;; copy eax to ecx
		mul edx										;; multiply eax by 2748779070
		shr edx,6										;; shift edx right by 6 bits
		mov eax,100									;; put 100 in eax
		mov edi,edx									;; copy edx (the quotient) to edi
		mul edx										;; multiply edx by 100
		sub ecx,eax									;; subtract the result in eax from ecx (ecx = lower 2 digits)
		test edi,edi									;; set the flags from edi
		mov edx,edi									;; put edi back in edx
		mov edi,10									;; put 10 (our decimal multiplier) in edi
		jz Label01									;; if the quotient is 0, go to Label01
; divide the quotient (in edx) by 10000 the optimised way
		mov eax,3518437209								;; put 3518437209 in eax
		mov ebx,edx									;; copy edx to ebx
		mul edx										;; multiply edx by 3518437209
		shr edx,13									;; shift edx right by 13 bits
		mov eax,10000									;; put 10000 in eax
		mov edi,edx									;; copy edx (the quotient) to edi
		mul edx										;; multiply edx by 10000
		sub ebx,eax									;; subtract the result in eax from ebx (ebx = middle 4 digits)
		test edi,edi									;; set the flags from edi
		mov edx,edi									;; put edi back in edx
		mov edi,10									;; put 10 (our decimal multiplier) in edi
		jz Label00									;; if the quotient is 0, go to Label00
;ALIGN 4
; otherwise, process the XXXX------ part of the number
		mov eax,4294968								;; replace eax with 4294968 (so the upper 4 digits can be extracted correctly)
		mul edx										;; multiply edx by 4294968
		jc Label02									;; if it overflows into edx, go to Label02
		dec esi										;; decrement the address
		mul edi										;; multiply eax by 10
		jc Label03									;; if it overflows into edx, go to Label03
		dec esi										;; decrement the address
		mul edi										;; multiply eax by 10
		jc Label04									;; if it overflows into edx, go to Label04
		dec esi										;; decrement the address
		jmp Label05									;; go to Label05
;ALIGN 4
; otherwise, process the ----XXXX-- part of the number
Label00:	mov eax,4294968								;; replace eax with 4294968 (so the middle 4 digits can be extracted correctly)
		sub esi,4										;; take 4 characters off esi (XXXX------)
		mul ebx										;; multiply ebx by 4294968
		jc Label06									;; if it overflows into edx, go to Label06
		dec esi										;; decrement the address
		mul edi										;; multiply eax by 10
		jc Label07									;; if it overflows into edx, go to Label07
		dec esi										;; decrement the address
		mul edi										;; multiply eax by 10
		jc Label08									;; if it overflows into edx, go to Label08
		dec esi										;; decrement the address
		jmp Label09									;; go to Label09
;ALIGN 4
; here, process the --------XX part of the number
Label01:	mov eax,429496730								;; replace eax with 429496730
		sub esi,8										;; take 8 characters off esi (XXXXXXXX--)
		mul ecx										;; multiply ecx by 429496730 (so the lower 2 digits can be extracted correctly)
		jc Label10									;; if it overflows into edx, go to Label10
		dec esi										;; decrement the address
		jmp Label11									;; go to Label11
;ALIGN 4
; here, store XXXXXXXXXX and the terminating zero
Label02:	add dl,30h									;; add the base character "0" to the value in dl
		mov BYTE PTR [esi],dl							;; store the byte in dl at [esi]
		mul edi										;; multiply eax by 10
Label03:	add dl,30h									;; add the base character "0" to the value in dl
		mov BYTE PTR [esi+1],dl							;; store the byte in dl at [esi+1]
		mul edi										;; multiply eax by 10
Label04:	add dl,30h									;; add the base character "0" to the value in dl
		mov BYTE PTR [esi+2],dl							;; store the byte in dl at [esi+2]
Label05:	mul edi										;; multiply eax by 10
		add dl,30h									;; add the base character "0" to the value in dl
		mov BYTE PTR [esi+3],dl							;; store the byte in dl at [esi+3]
		mov eax,4294968								;; replace eax with 4294968
		mul ebx										;; multiply ebx by 4294968 (so the middle 4 digits can be extracted correctly)
Label06:	add dl,30h									;; add the base character "0" to the value in dl
		mov BYTE PTR [esi+4],dl							;; store the byte in dl at [esi+4]
		mul edi										;; multiply eax by 10
Label07:	add dl,30h									;; add the base character "0" to the value in dl
		mov BYTE PTR [esi+5],dl							;; store the byte in dl at [esi+5]
		mul edi										;; multiply eax by 10
Label08:	add dl,30h									;; add the base character "0" to the value in dl
		mov BYTE PTR [esi+6],dl							;; store the byte in dl at [esi+6]
Label09:	mul edi										;; multiply eax by 10
		add dl,30h									;; add the base character "0" to the value in dl
		mov BYTE PTR [esi+7],dl							;; store the byte in dl at [esi+7]
		mov eax,429496730								;; replace eax with 429496730
		mul ecx										;; multiply ecx by 429496730 (so the lower 2 digits can be extracted correctly)
Label10:	add dl,30h									;; add the base character "0" to the value in dl
		mov BYTE PTR [esi+8],dl							;; store the byte in dl at [esi+8]
Label11:	mul edi										;; multiply eax by 10
		add dx,30h									;; add the base character "0" to the value in dx
		mov WORD PTR [esi+9],dx							;; store the byte in dl, and the terminating 0 in dh, at [esi+9]
		lea eax,[esi+10]								;; put esi+10 (the address of the terminating zero) in eax

		pop edi										;; restore edi
		pop esi										;; restore esi
		pop edx										;; restore edx
		pop ecx										;; restore ecx
		pop ebx										;; restore ebx
		sub eax,esi									;; to get the length of the created string in eax
	ret												;; return (leave the procedure)
uDw2A ENDP
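
The "optimised divisions" at the top of these two routines can be checked in the same spirit. The Python sketch below uses the multiplier and shift pairs from the listings (2748779070 with a total shift of 32+6, and 3518437209 with a total shift of 32+13) and compares them with ordinary integer division around the multiples of the divisor. The helper names `div100`, `div10000`, `split_4_4_2` and `boundary_values` are made up for this example, and it is a spot check rather than a proof.

```python
TOP = 0xFFFFFFFF


def div100(n):
    """n // 100 for 0 <= n < 2**32: `mul` by 2748779070, then `shr edx,6`."""
    return (n * 2748779070) >> (32 + 6)


def div10000(n):
    """n // 10000 for 0 <= n < 2**32: `mul` by 3518437209, then `shr edx,13`."""
    return (n * 3518437209) >> (32 + 13)


def split_4_4_2(n):
    """Split n into (upper 4 digits, middle 4 digits, lower 2 digits)."""
    quotient = div100(n)
    lo = n - quotient * 100
    hi = div10000(quotient)
    mid = quotient - hi * 10000
    return hi, mid, lo


def boundary_values(divisor, stride=997):
    """Yield both ends of the range plus neighbours of sampled multiples."""
    yield 0
    yield TOP
    last = TOP // divisor * divisor
    yield last - 1
    yield last
    for k in range(1, TOP // divisor + 1, stride):
        for n in (k * divisor - 1, k * divisor, k * divisor + 1):
            if n <= TOP:
                yield n


if __name__ == "__main__":
    print(split_4_4_2(0xFFFFFFFF))
```

The values just below a multiple of the divisor are the interesting ones, since a rounded-up multiplier can only err by producing a quotient that is one too large there. On the sampled values, including FFFFFFFFh, both constants agree with plain division.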

And here is an SSE2 test, 21 cycles on my computer:

Code Select

.DATA
ALIGN 16
Div100			DWORD 0A3D70A3Eh,000000000h,000000000h,000000000h	;; 0,0,0,2748779070 (follow with a shr by 6)
Div10000			DWORD 0D1B71759h,000000000h,000000000h,000000000h	;; 0,0,0,3518437209 (follow with a shr by 13)
Mul10x2			DWORD 00000000Ah,000000000h,00000000Ah,000000000h	;; 0,10,0,10
Mul100			DWORD 000000064h,000000000h,000000000h,000000000h	;; 0,0,0,100
Mul10000			DWORD 000002710h,000000000h,000000000h,000000000h	;; 0,0,0,10000
PushBits100		DWORD 01999999Ah,000000000h,000000000h,000000000h ;; 0,0,0,429496730
PushBits10000x2	DWORD 000418938h,000000000h,000418938h,000000000h ;; 0,4294968,0,4294968
BaseDecValues		DWORD 030303030h,030303030h,000003030h,000000000h	;; "0000000000"


.CODE
ALIGN 16
;
; convert a dword to ascii string, sse2 version (leading zeros are kept)
;
; syntax :
; mov eax,{the value}
; mov esi,{OFFSET of the string to create, 16-byte aligned, 16 bytes are written}
; call uDw2A_Sse2
;
; Return :
; nothing
;
uDw2A_Sse2 PROC

		movd XMM3,eax									;; XMM3 = 0,0,0,Val
; split off the XXXX, XXXX and XX parts
		movss XMM6,DWORD PTR Div100						;; XMM6 = 0,0,0,2748779070
		movss XMM7,DWORD PTR Mul100						;; XMM7 = 0,0,0,100
		movdqa XMM4,XMM3								;; XMM4 = _,_,_,Val
		pmuludq XMM6,XMM3								;; XMM6 = _,_,Hi+Mi,_
		psrlq XMM6,32+6								;; XMM6 = 0,0,0,Hi+Mi/100
		movss XMM3,XMM6								;; XMM3 = 0,0,0,Hi+Mi/100
		pmuludq XMM7,XMM6								;; XMM7 = 0,0,0,Hi+Mi
		psubd XMM4,XMM7								;; XMM4 = 0,0,0,Lo
; split the XXXX and XXXX parts
		movss XMM6,DWORD PTR Div10000						;; XMM6 = 0,0,0,3518437209
		movss XMM7,DWORD PTR Mul10000						;; XMM7 = 0,0,0,10000
		movdqa XMM1,XMM3								;; XMM1 = _,_,_,Hi+Mi
		pmuludq XMM6,XMM3								;; XMM6 = _,_,Hi,_
		psrlq XMM6,32+13								;; XMM6 = 0,0,0,Hi/10000
		movss XMM0,XMM6								;; XMM0 = 0,0,0,Hi/10000
		pmuludq XMM7,XMM6								;; XMM7 = 0,0,0,Hi
		psubd XMM1,XMM7								;; XMM1 = 0,0,0,Mi
; here, compute and separate the character values
		movdqa XMM6,OWORD PTR Mul10x2						;; XMM6 = 0,10,0,10
		movlhps XMM0,XMM1								;; XMM0 = 0,Mi,0,Hi
		pmuludq XMM0,OWORD PTR PushBits10000x2				;; XMM0 = _,Mi*4294968 (D4),_,Hi*4294968 (D0)
		pmuludq XMM4,OWORD PTR PushBits100					;; XMM4 = _,_,_,Lo*429496730 (D8)
		movdqa XMM1,XMM0								;; XMM1 = _,Mi,_,Hi
		pmuludq XMM1,XMM6								;; XMM1 = _,Mi*10 (D5),_,Hi*10 (D1)
		movdqa XMM5,XMM4								;; XMM5 = _,_,_,Lo
		movdqa XMM2,XMM1								;; XMM2 = _,Mi,_,Hi
		pmuludq XMM2,XMM6								;; XMM2 = _,Mi*10 (D6),_,Hi*10 (D2)
		pmuludq XMM5,XMM6								;; XMM5 = _,_,_,Lo*10 (D9)
		movdqa XMM3,XMM2								;; XMM3 = _,Mi,_,Hi
		pmuludq XMM3,XMM6								;; XMM3 = _,Mi*10 (D7),_,Hi*10 (D3)
; here, merge the characters
		punpcklbw XMM4,XMM5								;; XMM4 = _,D9+D8,_,_
		pslld XMM1,8									;; XMM1 = D5,_,D1,_
		pslld XMM2,16									;; XMM2 = D6,_,D2,_
		pslld XMM3,24									;; XMM3 = D7,_,D3,_
		por XMM0,XMM1									;; ) XMM0 = Mi,_,Hi,_
		por XMM0,XMM2									;; )
		por XMM0,XMM3									;; )
		shufps XMM0,XMM4,0EDh							;; XMM0 = _,Lo,Mi,Hi
; here, add the base character to the characters obtained, and store
		paddb XMM0,OWORD PTR BaseDecValues					;; XMM0 = _,Lo,Mi,Hi + "0000000000"
		movdqa OWORD PTR [esi],XMM0						;; store the value (esi must be 16-byte aligned for movdqa)

;		movdqa XMM1,XMM0								;; ) suppress leading zeros
;		pcmpeqb XMM1,OWORD PTR BaseDecValues				;; )
;		pmovmskb edx,XMM1								;; )
;		not edx										;; )
;		bsf edx,edx									;; )
;		add esi,edx									;; )

	ret												;; return (leave the procedure)
uDw2A_Sse2 ENDP
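
To make the merge step and the commented-out zero suppression easier to follow, here is a scalar Python sketch of the bytes the routine stores. It is a model under assumptions, not a translation: plain division stands in for the reciprocal multiplies, all function names are invented for this example, and `leading_zero_skip` is only my reading of the commented-out pcmpeqb / pmovmskb / not / bsf sequence.

```python
import struct

BASE_DEC_VALUES = b"0" * 10 + b"\0" * 6    # same 16 bytes as BaseDecValues


def decimal_digits(value, count):
    """Digit values of `value`, most significant first (plain division here)."""
    return [(value // 10 ** (count - 1 - i)) % 10 for i in range(count)]


def pack_group(digits):
    """Merge up to four digit values into one dword, first digit in the low byte.

    This is the role of the pslld 8/16/24 + por sequence in the listing.
    """
    dword = 0
    for position, digit in enumerate(digits):
        dword |= digit << (8 * position)
    return dword


def udw2a_sse2_model(value):
    """Return the 16 bytes the routine would store at [esi]."""
    hi_mi, lo = divmod(value & 0xFFFFFFFF, 100)
    hi, mi = divmod(hi_mi, 10000)
    dwords = (
        pack_group(decimal_digits(hi, 4)) + 0x30303030,  # like paddb: digits <= 9, no carries
        pack_group(decimal_digits(mi, 4)) + 0x30303030,
        pack_group(decimal_digits(lo, 2)) + 0x00003030,
        0,
    )
    return struct.pack("<4I", *dwords)


def leading_zero_skip(raw):
    """Offset the commented-out block would add to esi, as modelled here."""
    mask = sum(1 << i for i in range(16) if raw[i] == BASE_DEC_VALUES[i])
    inverted = ~mask & 0xFFFFFFFF                      # `not edx` flips all 32 bits
    return (inverted & -inverted).bit_length() - 1     # bsf: index of lowest set bit


if __name__ == "__main__":
    raw = udw2a_sse2_model(42)
    print(raw[:10], leading_zero_skip(raw))
```

One thing the model brings out: for the value 0 every byte matches BaseDecValues, so the computed offset is 16 and the whole buffer is skipped. If that reading of the commented-out block is right, the zero case would need separate handling.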

Title: Re: Binary to Decimal using multiply
Post by: eaOn on October 15, 2008, 02:23:40 PM

Quote from: qWord on September 24, 2008, 02:24:43 AM
thx drizz,

I've just got the idea that it could be possible to obtain the quotient and remainder of a division by 100000 with only two multiplications (particularly with regard to SIMD instructions), but afaics the precision is the problem.

regards qWord

.
Actually, there is something to what you're saying.
Look here: http://www.masm32.com/board/index.php?topic=10039.0
Check the lastDigit, thirdDigit.... section there.
The only difference is that it is a hex to decimal conversion and it is built in 16-bit DOS asm.

As a matter of fact, it is possible. From my experience in 16-bit programming, div leaves the quotient in ax and the remainder in dx when you divide dx:ax by a 16-bit operand.
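
On the precision question in the quote: the Python sketch below models the two-multiplication quotient and remainder by 100000, using the constant and shift from the Real4ToString listing earlier in the thread (2814749768, total shift 32+16). The names `divmod_100000` and `first_mismatch` are invented for this example. It only tests values ending in 99999, on the assumption that those are the worst case for a rounded-up multiplier.

```python
def divmod_100000(n):
    """(quotient, remainder) of n / 100000 using two multiplications.

    Models: `mul` by 2814749768, `shr edx,16`, then `mul` by 100000
    and a subtraction from the original value.
    """
    quotient = (n * 2814749768) >> (32 + 16)
    remainder = n - quotient * 100000
    return quotient, remainder


def first_mismatch(start, stop):
    """Smallest tested n in [start, stop) where the model disagrees with divmod.

    Only values of the form k*100000 + 99999 are tested. Returns None
    when no disagreement is found.
    """
    n = start - start % 100000 + 99999
    while n < stop:
        if divmod_100000(n) != divmod(n, 100000):
            return n
        n += 100000
    return None


if __name__ == "__main__":
    print(first_mismatch(0, 2 ** 31))        # signed range
    print(first_mismatch(2 ** 31, 2 ** 32))  # upper half of the unsigned range
```

In this model the signed range comes out clean, while the upper half of the unsigned range does not: 4294899999, for instance, gets a quotient that is one too large. That fits the remark in the listing that the extra adjustment is not needed for signed values, and it is presumably the precision problem meant in the quote.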

