News:

MASM32 SDK Description, downloads and other helpful links
MASM32.com New Forum Link
masmforum WebSite

Binary to Decimal using multiply

Started by FORTRANS, September 19, 2008, 04:40:14 PM

Previous topic - Next topic

FORTRANS

Hi,

   I was thinking about doing a binary to decimal display routine
using a multiply rather than dividing by ten as is usually done.
I have read discussions of this, but never seen any code for the
X86.  So I hope this is also relatively new to some of you.

   The idea is simple in decimal, the number is treated as a
fraction, and multiplying by ten pulls off the leading digit.
123 => .123, 123 * 10 = 1.230, pull off the integer and repeat.
So on a decimal computer it probably could make sense.

   So I coded up a routine to do the equivalent with a binary
number to how good it would be.  I used byte numbers to do a proof
of concept.  And basically it works, but the conventional divide
algorithm is better.  Thus no effort to write a word or larger
routine.

   It requires word multiplies to process a byte (actually three
digit) number, whereas a divide works with a byte divide.  And the
suppression of leading zeroes looks to be more involved.  With the
change in number of digits to print out, the first multiplication
would require a different multiplier, and the divide routines do
not require changes in the divisor.  The first multiplier has a bit
of tolerance with a byte, but for usage up to 999 only 290H worked.
So a word version should be doable, though there seems no point.

   Examples using the numbers 255 and 999:

   Thinking in terms of fixed point arithmetic, 0FFH is .99609375;
.5 + .25 + .125 + .0625 + .03125 + .015625 + .0078125 + .00390625.
And 0290H (656) is then 2.5625.  2.5625 * 0.99609375 = 2.55249,
giving the first digit and the proper fraction so that multiplying
by ten can get the next digits.  And 999 => 3E7H, 3E7 * 290 = 9FFF0,
which is decimal 9.99+.  You can see that the math is not exact, a
minor nit that the divide algorithm avoids.

   Code follows, the SCALL macro was written for DOS, so replace for
another environment.  I can attach the macro file if it is allowed,
but it was not created by me.  Dated 1984 by ZDS, with a boilerplate
message.  Though I see no real proprietary content.

Comments?

Steve N.



        TITLE - BINary to DECimal conversion, test inverted logic.
        COMMENT *
   Do a binary to decimal routine trying to use a left to right
output order using a multiply, rather than a divide.
   Do with bytes at first to simplify debugging.
26 June 2008 by SRN.
*

; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
        .XCREF
        .XLIST
INCLUDE  DEFMS.ASM ; MACROs and MS-DOS definitions from Heath/Zenith software.
        .LIST
        .CREF

; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
; Example of usage.

        MOV     AL,BYTE PTR [Dividend]
        CALL Bin2DecM   ; Prints AL with min 3 Digits.
        CALL Newline

        SCALL   EXIT    ; .EXE exit

; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
; BINary to DECimal conversion a different way.  Assume unsigned numbers.
; The number in AL is printed to the console as three digits.  Actually
; lightly tested up to 999.
; 26 June 2008, 19 September 2008 cleaned up for posting.

;   INPUT:  AL contains the number to be converted to decimal ASCII.
;  OUTPUT:  No registers changed.

Bin2DecM:
        PUSH    AX      ; Save register used.  Change to EAX as needed.
        PUSH    BX
        PUSH    DX

        XOR     AH,AH   ; Optional to ensure AH is clear.

        MOV     BX,290H ; Multiplier to get first digit into DL.

        MUL     BX      ; Do a fixed point 8:8 x 8:8 multiply to get a 16:16.

        PUSH    AX      ; Calling DOS functions destroy AX.
        ADD     DL,'0'  ; Convert binary to ASCII.
        SCALL   CONOUT  ; Print leading decimal digit.
        POP     AX      ; And restore fraction.

        MOV     BX,10   ; Multiplier to get remaining digits.

        MUL     BX      ; And repeat as necessary.

        PUSH    AX
        ADD     DL,'0'
        SCALL   CONOUT
        POP     AX

        MUL     BX

        ADD     DL,'0'  ; The last digit, so no need to preserve AX.
        SCALL   CONOUT

        POP     DX      ; Restore and return.
        POP     BX
        POP     AX

        RET

FORTRANS

   And of course this morning, figured out how to do it all with byte
arithmetic.  This is because the first multiplier has its low nybble
set to zero.  One can shift 290H down to 29H and not lose precision.
This then places the integer in the high nybble of AH, were is in the
middle of bit soup, and must be shifted up into DL or down into the
low nybble of AH.

   The first is equivalent to two multiplies and an implicit use of
word size logic.  Not to mention getting the four bits out of AH/AX
and into DL/DX.  Not much of a gain over explicit use of a single word
multiply with the integer ending up in DL for free.

   The second is then equivalent to a multiply followed by a divide
through the use of a word shift (or byte shifts, and byte rotates
through carry), and then requires a move of AH to DL for each digit.
And on checking, it produces the wrong answer due to a loss of
precision.

   Ugly.  But it implies a word (dword) size routine might be possible
using only word (dword) logic, if the multiplier gets really lucky,
and you cheat a little.

   Phooey, I thought this was dead.  I guess i'll have to get a
stake to finish it off (calculate the numbers for those cases).
It figures that I only figure these things out after posting.

Oops,

Steve N.

Mark_Larson

 I am going to analyze your code.  I know you are going to be posting new code.  You want to make sure you don't do any 16-bit code in Windows in Intel processors.  It is very slow.  You want to use a byte or dword.

Quote from: FORTRANS on September 19, 2008, 04:40:14 PM

; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
; BINary to DECimal conversion a different way.  Assume unsigned numbers.
; The number in AL is printed to the console as three digits.  Actually
; lightly tested up to 999.
; 26 June 2008, 19 September 2008 cleaned up for posting.

;   INPUT:  AL contains the number to be converted to decimal ASCII.
;  OUTPUT:  No registers changed.

Bin2DecM:

; With Windows you don't need to preserve ax, or dx ( eax, or edx).
; you also don't have to preserve ecx, so you should use it in place of bx.
; pushing AX shouldn't work under windows, the stack is 32-bit

        PUSH    AX      ; Save register used.  Change to EAX as needed.
        PUSH    BX
        PUSH    DX

;this causes a stall since it only updates part of hte
; register, use xor eax,eax
        XOR     AH,AH   ; Optional to ensure AH is clear.

        MOV     BX,290H ; Multiplier to get first digit into DL.


BIOS programmers do it fastest, hehe.  ;)

My Optimization webpage
htttp://www.website.masmforum.com/mark/index.htm

FORTRANS

Quote from: Mark_Larson on September 20, 2008, 06:27:00 PM
I am going to analyze your code.

   Good.  Nice to see some interest.

Quote
  I know you are going to be posting new code.  You want to make sure you don't do any 16-bit code in Windows in Intel processors.  It is very slow.  You want to use a byte or dword.

   This is just "proof of concept" and/or algorithm tweaking.  If it
proves to be useful I will follow your guidelines.


; With Windows you don't need to preserve ax, or dx ( eax, or edx).
; you also don't have to preserve ecx, so you should use it in place of bx.
; pushing AX shouldn't work under windows, the stack is 32-bit


   Noted.  I am developing using DOS as I am more familar with it,
and not currently set up for windows.  I _am_ trying to get a Windows
environment set up.  I am preserving registers due to the debug
style environment.  And of course if the word version pans out, I
will look at a double word version, and will try to follow Windows
coding conventions.


        PUSH    AX      ; Save register used.  Change to EAX as needed.



;this causes a stall since it only updates part of hte
; register, use xor eax,eax
        XOR     AH,AH   ; Optional to ensure AH is clear.


   Um, no can do, the input is in AL.  Is AND  AX,00FFH better than
XOR  AH,AH?  Is AND  EAX,000000FFH again better?

   Thank you for your inputs.

Regards,

Steve N.

hutch--

Steve,

If you can make the conversion is your direct coding from 16 bit to full 32 bit it will be like getting out of a T model ford into a formula one car, the difference is so great. Full FLAT memory model gives you gigabytes of address space, more instructions that are a lot faster and the addressing is cleaner and simpler.
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

FORTRANS

Hi Hutch,

   I'm working on it.  It would be nice to not run out of memory.
But about half (or a bit more) of my projects targets an 80186
with DOS 5.0.  Hmm, more instructions to misuse.

   And I procrastinate.  I've download your MASM32 package, and
I'll put it on a computer when I find the disk space on the proper
one of them them.  And I got tied up on the current project.
Needed to code up a fixed point arithmetic decoder.  Kept making
errors trying to calculate the constants.

Best regards,

Steve N.

Mark_Larson

I rarely do "windows" programs.  Drives me nuts.  I always do "console" programs under Windows, which is exactly like running under DOS, except you can't use DOS interrupts.  But you get the full 32-bit mode.  It's also easier to program ( less coding) than doing Windows programming. Your programs are really similar to DOS programs ( I was a big DOS programmer, go figure).

I'd highly recommend you try it.  You can still call the Win32 API, which is really important, and you don't have to worry about a Window.

The one exception is if I am doing 3D programming, I have to use a Window, but some 3D APIs allow for window creation through the API, and you don't have to do much.

SDL does that, and that is what I use.  Supports Windows and Linux, so you can write one set of code and use both OSes.  Supports threads, audio and other cool stuff, that also ports back and forth.  Windows and Linux use really different threading models. 

You have to do VERY little in order to do a Window in SDL as compared to Windows. 

It also allows you to look for events (keyboard, mouse), other than that, there is no other Windows stuff that you have to handle.

It supports at the low level, both DirectX and OpenGl, or just a software renderer.  I use the software renderer when I do my frame buffer for raytracing.  If you want to do hardware acceleration you can pick DirectX or OpenGl, the api under SDL is the same for both.  You just have to specifiy which one you want.

www.libsdl.org

I have been using it for quite a few years, and I love it :)



BIOS programmers do it fastest, hehe.  ;)

My Optimization webpage
htttp://www.website.masmforum.com/mark/index.htm

FORTRANS

Hi,

   Coded up the 16-bit binary to decimal conversion using 32-bit code.
Tried to use word logic, but when I finally got the correct multiplier
it demonstrated that a word has insufficient precision.  And I got to
write up a fixed point number display routine to speed up the search
for the best multiplier.  I tried to follow Mark_Larson's suggestions
for 32-bit code.  Replace the macro to output DL and it should work in
Windows.

Summary:
   I coded up an algorithm to convert a binary number to ASCII decimal
using multiplies rather than the usual divide based routines.

Pros;
   Uses multiplies rather than divides.  I do not know how much that
matters on current processors, but it sounded good at the beginning.
   It outputs the digits in a left to right fashion.  This means that
no temporary storage is needed, unlike most divide based algorithms.
Those tend to generate the digits right to left, saving them, to then
display them right to left.

Cons;
   It uses five multiplies even if the number is small.  A divide based
routine can check for zero after each digit is created, and exit early.
Of course that then requires testing the number, conditional jumps, and
other logic.
   The multiply algorithm requires more precision than the divide based
routine.  To convert a word, a number larger than a word is used.  That
one will finish this exercise for this investigation.  The byte sized
routine could work around that to a certain extent but the word routine
can't.
   If you do not want to display the leading zeroes, the divide routine
is probably easier to use.
   If you want to print out a different number of digits, the initial
multiplier must be recalculated.  The divide routine just uses ten for
any size number.

Regards,

Steve.

P.S.

Hi Mark_Larson,

   Just saw your post.  Hey, that looks interesting.
SRN

; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
; BINary to DECimal conversion using multiplies.  Assume unsigned numbers.
; The number in AX is printed to the console as five digits.  Actually,
; works with all five digit numbers.

; 21 Sep Start word logic.  Which didn't work.
; 23 September 2008, start double word version.

;   INPUT:  AX or low half of EAX contains the 16 bit number to be
;           converted to decimal ASCII.
;    Uses:  EAX, ECX, EDX
;   Calls:  SCALL CONOUT, macro to write DL to standard output.

Bin2DecM:
        AND     EAX,0000FFFFH   ; Optional safety check, limit to 16 bits.
        MOV     EDX,00068DB9H   ; Multiplier to get leading decimal digit
                                ; into low byte of EDX (DL).
        MUL     EDX     ; Do a fixed point 16:16 x 16:16 multiply to get
                        ; a 32:32 result. 

        PUSH    EAX
        ADD     DL,'0'  ; Convert binary to ASCII.
        SCALL   CONOUT  ; Print leading decimal digit.
        POP     EAX     ; And restore fraction.

        MOV     ECX,10  ; Multiplier to get remaining digits.

        MUL     ECX     ; Second digit.

        PUSH    EAX
        ADD     DL,'0'
        SCALL   CONOUT
        POP     EAX

        MUL     ECX     ; And repeat as necessary.

        PUSH    EAX
        ADD     DL,'0'
        SCALL   CONOUT
        POP     EAX

        MUL     ECX

        PUSH    EAX
        ADD     DL,'0'
        SCALL   CONOUT
        POP     EAX

        MUL     ECX

        ADD     DL,'0'
        SCALL   CONOUT   ; Final digit

        RET

qWord

hi, fortrans,

this is a very interesting idea! I've test your code and there was no problems for all values (0-0ffffh).
Because I'm not familiar with fixed-Point Arithmetic i want to ask the following:
Is it possible to obtain more than one digit in one step (multiplication by FP-constant)? For example from the DWORD-value 0123456789 the leading 5 digits (01234)

regards, qWord

EDIT: an example to my question:

mov eax,Fpconstant
mov edx,075BCD15h ; = 0123456789
mul edx
now edx = 04d2h = 1234
is this doable?
FPU in a trice: SmplMath
It's that simple!

MichaelW

I modified the code to copy the digits to a buffer, so I could compare cycle counts with the MASM32 dwtoa procedure. Running on my P3 the modified code is more than twice as fast as dwtoa. Typical results:


39 cycles, Bin2DecM
95 cycles, dwtoa


If the code were modified to handle the full 32-bit range, and optimized, it might still be faster than dwtoa, even if the parameters were passed on the stack.



[attachment deleted by admin]
eschew obfuscation

drizz

Quote from: qWord on September 23, 2008, 10:01:34 PM
EDIT: an example to my question:
mov eax,Fpconstant
mov edx,075BCD15h ; = 0123456789
mul edx
now edx = 04d2h = 1234
is this doable?

Yes it's doable!

you basically multiply the number (rounded) by (2^32/10^(something))

for example:

   mov eax,1234567890; **
   mov edx,01ADH;  2^32/10000000
   mul edx; you'll get 123

   mov eax,1234567890
   mov edx,010C7H; 2^32/1000000
   mul edx; you'll get 1234

   mov eax,1234567890
   mov edx,0A7C6H; 2^32/100000
   mul edx; you'll get 12345

   mov eax,1234567890
   mov edx,68DB9H; 2^32/10000
   mul edx; you'll get 123456

   mov eax,1234567890
   mov edx,418937h; 2^32/1000
   mul edx; you'll get 1234567


you can add some extra precision by multiplying the magic number by 2^x (so you can adjust by shifting right)
   mov eax,1234567890; **
   mov edx,1AD7F2Ah;  (2^32/10000000)*2.0^16
   mul edx
   shr edx,16; you'll get 123

   
Of course this constants have to be carefully tested as it may happen to loose precision on some numbers and get wrong results.
The longer the integer part of division of 2^x/10^y is the less is the chance for error. (large enough magic number)
The truth cannot be learned ... it can only be recognized.

qWord

thx drizz,

I've just got the idea that it could be possible to obtain the quotient and reminder of and division by 100000 with only two multiplications (particularly with regard to SIMD-instructions), but afaics the precession is the problem.

regards qWord

FPU in a trice: SmplMath
It's that simple!

NightWare

guys, read this topic http://www.masm32.com/board/index.php?topic=8974.0 ,you recreate the michaelw/pdixon algo...

Quote from: qWord on September 24, 2008, 02:24:43 AMbut afaics the precession is the problem
it's ok until you don't exceed 100000...

drizz

qWord check this out:

OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE
align  16
DwToStr2 proc dwValue,pBuffer
push edi
push ebx
mov edi,[esp+2*4+8];buf
mov ebx,[esp+1*4+8];val
;; split the value to two five digit numbers
mov edx,0A7C5AC47h; 1/100000
lea eax,[ebx+1]
mul edx
shr edx,16
mov eax,edx
imul edx,100000
sub ebx,edx
; first five
mov ecx,68DB9h
mul ecx
add dl,'0'
mov [edi+0],dl
xi = 1
rept 4
mov edx,10
mul edx
add dl,'0'
mov [edi+xi],dl
xi = xi + 1
endm
; next five
mov eax,ebx
mul ecx
add dl,'0'
mov [edi+5],dl
xi = 6
rept 4
mov edx,10
mul edx
add dl,'0'
mov [edi+xi],dl
xi = xi + 1
endm
mov byte ptr [edi+10],0
pop ebx
pop edi
ret 2*4
DwToStr2 endp
OPTION PROLOGUE:PROLOGUEDEF
OPTION EPILOGUE:EPILOGUEDEF

The truth cannot be learned ... it can only be recognized.

FORTRANS

#14
Hi MichaelW,

   Thanks for the test case.

23 cycles, Bin2DecM
53 cycles, dwtoa

AMD 2000 MHz

   However, you can comment out the push and pop of EAX, as that was used
because the macro destroys AX.

      ;  PUSH    EAX
        ADD     DL,'0'
        ;SCALL   CONOUT

        mov [ebx+1], dl

      ;  POP     EAX


NightWare,

   Thanks for the link.  Now that I wrote my own code, that makes more
sense than it did on first reading.  My code is way simpler than that!
Now I see how that thread was compensating for the loss of precision.
It's food for thought.  I Just tried adjusting my code to print another
digit.  It is good up to 900,000, and fails somewhere greater than that.


qWord,

   I see your question was answered.  But I wrote a small DOS program
to play with 16:16 bit fixed point arithmetic in binary and decimal,
and I could post the binary if it interests you.  I used it to find
the multiplier for my code.  It was coded up quickly, so it may still
have a bug, though it seems alright.  It looks something like.


                        Fixed Point to Decimal calculator.
                  Esc = Quit, 1 = Set, 0 = Reset, Move = Cursor

  0 0 0 0 : 0 0 0 0 : 0 0 0 0 : 1 0 1 0 x 0 1 1 1 : 1 0 0 0 : 0 0 0 0 : 0 0 0 0

  000A:7800

  00010.4687500000000000


Regards,

Steve N.

Edit:  The 900,00 is a bit wrong.  the multiplier does not cover the full range.
The multiplier for 1 - 100,000 is too big for, say 400,000 - 500,000.  You
would need to test for what range the number was and adjust for it.