News:

MASM32 SDK Description, downloads and other helpful links
MASM32.com New Forum Link
masmforum WebSite

translate a dword number in a base from 2 to 16

Started by ToutEnMasm, November 10, 2010, 06:02:54 PM

Previous topic - Next topic

ToutEnMasm

I have not see it in the masm32 package,so i made it
The function give only the significant char,you have to made the print format.
I am curious to compare his speed with the other

Quote
.const
BaseN PROTO  :DWORD,  :DWORD,:DWORD
.data
RefHexTamp db "0","1","2","3","4","5","6","7","8","9","A","B","C","D","E","F"
Chiffre db 50 dup (0)
.code
;translate a number,value,in a base from 2 to 16
;################################################################
BaseN PROC uses esi edi ebx value:DWORD,  base:DWORD,  pout:DWORD
         mov ebx,0
   lea edi,Chiffre
   add edi,LENGTHOF Chiffre - 2   ;let a zero
   lea esi,RefHexTamp   
   ;début
   mov eax,value
   mov ecx,base      ;975 = 9*10!2 + 7*10!1 + 5*10!0     10!0 = 1   
   newpowerofbase:      ; ! = power of         
   xor edx,edx
   div ecx         ;divide by the base!1 in loop
   ;the rest give the value of Number*base!N-1
   mov dl,[esi+edx]
   mov [edi],dl
   inc ebx      ;number of char +1
   .if eax != 0    ;dividend not null,use the next power of base
         ;ascii numbers are written right to left
      dec edi   ;pointer of ascii number -1      
      jmp newpowerofbase
   .endif
   ;--- The end ---------
   mov esi,edi   ;start of chiffre
   mov ecx,ebx   ;number of ascii char
   inc ecx      ;add the zero
   mov edi,pout   ;output
   rep movsb
FindeBaseN:
   mov eax,ebx   ;return lenght of ascii chiffre without the zero
         ret
BaseN endp


dedndave

hi Yves
i would use Ling Long Kai Fang (Horner's Rule) and - get rid of the DIV   :P

ToutEnMasm

Quote
would use Ling Long Kai Fang (Horner's Rule) and - get rid of the DIV   

Have you a sample to avoid too much read ?

dedndave

well - some time ago, i wrote some arbitrary length integer to ascii decimal routines
however, they require a lot of reading to understand
let me see what i can come up with, Yves   :U
give me some time....

dedndave

the method i have used has always been fixed-base
however, that doesn't mean it couldn't be adapted
you might even consider base 2, 4, 8, 16 as special
the code could be really fast for those, and they are likely to be used more often than the others
let me get my nap and i will see what i can do with a variable base routine   :P

dedndave

i am not sure i can make an advantage using Ling Long Kai Fang on dword sizes
i ran across a similar issue with simple conversion to ascii decimal
for Ling Long Kai Fang to yield an advantage, the input size had to be larger than a dword
(in other words, for dword to decimal, there was no advantage)

nonetheless, i am working on a routine to see how it goes   :P

the MSVCT _itoa functions can do this
that's probably why there is no masm32 lib function for it

http://msdn.microsoft.com/en-us/library/yakksftt%28VS.80%29.aspx

ToutEnMasm


I have made a test of speed that show it's a little slow
Quote
Intel(R) Celeron(R) CPU 2.80GHz (SSE3)
1982    cycles for BaseN 2        ;--------
35      cycles for dw2bin_ex
632     cycles for BaseN 16       ;--------
18      cycles for dw2hex_ex
775     cycles for BaseN 10       ;--------
154     cycles for dwtoa
I will have a look to the link

dedndave

yes - that is the DIV instruction at work
i am using crt__itoa, but having a hard time getting correct output   :P
i must be mashing the wrong buttons - lol

here is my current code:
        INVOKE  crt__itoa,0FFFFFFFFh,offset OutBuf,sizeof OutBuf,dwBase
where dwBase is a dword = 2
and OutBuf is a 33 byte buffer
from your routine, i get "11111111111111111111111111111111", which is correct
from crt__itoa, i get "3aokg93"   :lol

i am sure it is reasonably fast, though
most of these CRT conversion functions tend to perform fairly well

dedndave

ok - the buffer size isn't needed - i am surprised it assembled - lol
        INVOKE  crt__itoa,0FFFFFFFFh,offset OutBuf,dwBase
now, let me run some speed tests

dedndave

well - the CRT times aren't as good as i would have expected
i would say there is a lot of room for improvement

        BaseN   crt__itoa
Base 2  2521    2364
Base 3  1758    1570
Base 4  1323    1226
Base 5  1178    1089
Base 6  1115    1005
Base 7  1052    947
Base 8  977     872
Base 9  1040    967
Base 10 980     110  <---
Base 11 993     791
Base 12 856     736
Base 13 902     695
Base 14 968     724
Base 15 844     664
Base 16 772     622

in all cases, the value is 0FFFFFFFFh
the times seem to be related to output string length, as i would expect
except for base 10 - it obviously uses one of the faster functions for that

ToutEnMasm


I have found a method to decrease the time needed by the division.
Here is the principle:
What is done faster is a division by a power of 2
Then we get the one who is immediatly below the diviser
We done the divide and search what is the difference with a divide by the diviser
The result of a division can be define as the number of summ of diviser needed to retrieve the dividende.
Dividende - rest= (diviseur +....+diviseur) 
The number of terms of the summ is the result of the division
Stay to search by how many we have to decrease the first result to find the final result.
The function give the rest in edx,the result in eax.
The method can be generalised to any diviser with a gain of speed maximal for a power of two
For the base 10 time is decrease by about 20%
The test time prog give result in periodes (real time)


Here is a simple for a divide by 10
Quote
.data
uncinquieme real8 0.2
;################################################################
.code
Division PROC  dividende:DWORD,diviseur:DWORD
   ;suite de somme   resultat 12D687 reste 8
   ;division par 8          2!3           ;!=power         
   mov edx,dividende
   shr edx,3
   mov eax,edx
   push edx
   ;by how many is increase the result if we had two to 8 = eax * 2
   ;shl eax,1
   ;nombre de termes à enlever pour retrouver la valeur du dividende
   fild dword ptr [esp]
   fmul real8 ptr uncinquieme
   fistp dword ptr [esp]
   pop edx
   sub  eax,edx
   mov edx,0
   mov ecx,10
   push eax
   mul ecx
   mov edx,dividende      
   sub edx,eax
   pop eax
   
FindeDivision:
         ret
Division endp

brethren

Quote from: dedndave on November 11, 2010, 07:28:38 PM
well - the CRT times aren't as good as i would have expected
i would say there is a lot of room for improvement

        BaseN   crt__itoa
Base 2  2521    2364
Base 3  1758    1570
Base 4  1323    1226
Base 5  1178    1089
Base 6  1115    1005
Base 7  1052    947
Base 8  977     872
Base 9  1040    967
Base 10 980     110  <---
Base 11 993     791
Base 12 856     736
Base 13 902     695
Base 14 968     724
Base 15 844     664
Base 16 772     622

in all cases, the value is 0FFFFFFFFh
the times seem to be related to output string length, as i would expect
except for base 10 - it obviously uses one of the faster functions for that

that base 10 is fast, i'm gonna reverse that call and see how its done

btw using the value 0FFFFFFFFh and itoa to convert it to a hex string takes 383 cycles on mine. This code that i use does it in 29 clock cycles

.data
hextable BYTE "0123456789ABCDEF"

.code
dwtohex PROC USES esi edi,val:DWORD, buff:PTR BYTE
mov ecx, 8
mov esi, val
mov edi, buff
@@:
rol esi, 4
mov eax, esi
and eax,0000000Fh
mov al, hextable[eax]
mov [edi], al
add edi, 1
sub ecx, 1
jnz @B
mov BYTE PTR [edi], 0

ret
dwtohex ENDP

dedndave

well - you can use multiply-to-divide with a table for the constants   :P

ToutEnMasm

The cycles for dw2hex_ex is 18 cycles.Faster than dwtohex upper.
This is done by the use of a table.The table is filled by the ascii values for 4 bits.
Binary and hexa can use use tables and are very fast like this.

brethren

#14
Quote from: ToutEnMasm on November 12, 2010, 03:44:46 PM
The cycles for dw2hex_ex is 18 cycles.Faster than dwtohex upper.
This is done by the use of a table.The table is filled by the ascii values for 4 bits.
Binary and hexa can use use tables and are very fast like this.


the proc i've just posted is just as fast as dw2hex_ex and would be faster if i got rid of the prologue and epilogue code and used esp and also it could be faster if i used trashable regs (maybe use edx instead of edi to save a push and pop)

btw after running itoa through a debugger i know why its much quicker converting 0FFFFFFFFh its because it only needs to convert 1 digit (-1), try it with 0FFFFFFFh and you'll see it takes 6 times as long. dwtoa from masm32 is miles faster :toothy