I have not see it in the masm32 package,so i made it
The function give only the significant char,you have to made the print format.
I am curious to compare his speed with the other
Quote
.const
BaseN PROTO :DWORD, :DWORD,:DWORD
.data
RefHexTamp db "0","1","2","3","4","5","6","7","8","9","A","B","C","D","E","F"
Chiffre db 50 dup (0)
.code
;translate a number,value,in a base from 2 to 16
;################################################################
BaseN PROC uses esi edi ebx value:DWORD, base:DWORD, pout:DWORD
mov ebx,0
lea edi,Chiffre
add edi,LENGTHOF Chiffre - 2 ;let a zero
lea esi,RefHexTamp
;début
mov eax,value
mov ecx,base ;975 = 9*10!2 + 7*10!1 + 5*10!0 10!0 = 1
newpowerofbase: ; ! = power of
xor edx,edx
div ecx ;divide by the base!1 in loop
;the rest give the value of Number*base!N-1
mov dl,[esi+edx]
mov [edi],dl
inc ebx ;number of char +1
.if eax != 0 ;dividend not null,use the next power of base
;ascii numbers are written right to left
dec edi ;pointer of ascii number -1
jmp newpowerofbase
.endif
;--- The end ---------
mov esi,edi ;start of chiffre
mov ecx,ebx ;number of ascii char
inc ecx ;add the zero
mov edi,pout ;output
rep movsb
FindeBaseN:
mov eax,ebx ;return lenght of ascii chiffre without the zero
ret
BaseN endp
hi Yves
i would use Ling Long Kai Fang (Horner's Rule) and - get rid of the DIV :P
Quote
would use Ling Long Kai Fang (Horner's Rule) and - get rid of the DIV
Have you a sample to avoid too much read ?
well - some time ago, i wrote some arbitrary length integer to ascii decimal routines
however, they require a lot of reading to understand
let me see what i can come up with, Yves :U
give me some time....
the method i have used has always been fixed-base
however, that doesn't mean it couldn't be adapted
you might even consider base 2, 4, 8, 16 as special
the code could be really fast for those, and they are likely to be used more often than the others
let me get my nap and i will see what i can do with a variable base routine :P
i am not sure i can make an advantage using Ling Long Kai Fang on dword sizes
i ran across a similar issue with simple conversion to ascii decimal
for Ling Long Kai Fang to yield an advantage, the input size had to be larger than a dword
(in other words, for dword to decimal, there was no advantage)
nonetheless, i am working on a routine to see how it goes :P
the MSVCT _itoa functions can do this
that's probably why there is no masm32 lib function for it
http://msdn.microsoft.com/en-us/library/yakksftt%28VS.80%29.aspx
I have made a test of speed that show it's a little slow
Quote
Intel(R) Celeron(R) CPU 2.80GHz (SSE3)
1982 cycles for BaseN 2 ;--------
35 cycles for dw2bin_ex
632 cycles for BaseN 16 ;--------
18 cycles for dw2hex_ex
775 cycles for BaseN 10 ;--------
154 cycles for dwtoa
I will have a look to the link
yes - that is the DIV instruction at work
i am using crt__itoa, but having a hard time getting correct output :P
i must be mashing the wrong buttons - lol
here is my current code:
INVOKE crt__itoa,0FFFFFFFFh,offset OutBuf,sizeof OutBuf,dwBase
where dwBase is a dword = 2
and OutBuf is a 33 byte buffer
from your routine, i get "11111111111111111111111111111111", which is correct
from crt__itoa, i get "3aokg93" :lol
i am sure it is reasonably fast, though
most of these CRT conversion functions tend to perform fairly well
ok - the buffer size isn't needed - i am surprised it assembled - lol
INVOKE crt__itoa,0FFFFFFFFh,offset OutBuf,dwBase
now, let me run some speed tests
well - the CRT times aren't as good as i would have expected
i would say there is a lot of room for improvement
BaseN crt__itoa
Base 2 2521 2364
Base 3 1758 1570
Base 4 1323 1226
Base 5 1178 1089
Base 6 1115 1005
Base 7 1052 947
Base 8 977 872
Base 9 1040 967
Base 10 980 110 <---
Base 11 993 791
Base 12 856 736
Base 13 902 695
Base 14 968 724
Base 15 844 664
Base 16 772 622
in all cases, the value is 0FFFFFFFFh
the times seem to be related to output string length, as i would expect
except for base 10 - it obviously uses one of the faster functions for that
I have found a method to decrease the time needed by the division.
Here is the principle:
What is done faster is a division by a power of 2
Then we get the one who is immediatly below the diviser
We done the divide and search what is the difference with a divide by the diviser
The result of a division can be define as the number of summ of diviser needed to retrieve the dividende.
Dividende - rest= (diviseur +....+diviseur)
The number of terms of the summ is the result of the division
Stay to search by how many we have to decrease the first result to find the final result.
The function give the rest in edx,the result in eax.
The method can be generalised to any diviser with a gain of speed maximal for a power of two
For the base 10 time is decrease by about 20%
The test time prog give result in periodes (real time)
Here is a simple for a divide by 10
Quote
.data
uncinquieme real8 0.2
;################################################################
.code
Division PROC dividende:DWORD,diviseur:DWORD
;suite de somme resultat 12D687 reste 8
;division par 8 2!3 ;!=power
mov edx,dividende
shr edx,3
mov eax,edx
push edx
;by how many is increase the result if we had two to 8 = eax * 2
;shl eax,1
;nombre de termes à enlever pour retrouver la valeur du dividende
fild dword ptr [esp]
fmul real8 ptr uncinquieme
fistp dword ptr [esp]
pop edx
sub eax,edx
mov edx,0
mov ecx,10
push eax
mul ecx
mov edx,dividende
sub edx,eax
pop eax
FindeDivision:
ret
Division endp
Quote from: dedndave on November 11, 2010, 07:28:38 PM
well - the CRT times aren't as good as i would have expected
i would say there is a lot of room for improvement
BaseN crt__itoa
Base 2 2521 2364
Base 3 1758 1570
Base 4 1323 1226
Base 5 1178 1089
Base 6 1115 1005
Base 7 1052 947
Base 8 977 872
Base 9 1040 967
Base 10 980 110 <---
Base 11 993 791
Base 12 856 736
Base 13 902 695
Base 14 968 724
Base 15 844 664
Base 16 772 622
in all cases, the value is 0FFFFFFFFh
the times seem to be related to output string length, as i would expect
except for base 10 - it obviously uses one of the faster functions for that
that base 10 is fast, i'm gonna reverse that call and see how its done
btw using the value 0FFFFFFFFh and itoa to convert it to a hex string takes 383 cycles on mine. This code that i use does it in 29 clock cycles
.data
hextable BYTE "0123456789ABCDEF"
.code
dwtohex PROC USES esi edi,val:DWORD, buff:PTR BYTE
mov ecx, 8
mov esi, val
mov edi, buff
@@:
rol esi, 4
mov eax, esi
and eax,0000000Fh
mov al, hextable[eax]
mov [edi], al
add edi, 1
sub ecx, 1
jnz @B
mov BYTE PTR [edi], 0
ret
dwtohex ENDP
well - you can use multiply-to-divide with a table for the constants :P
The cycles for dw2hex_ex is 18 cycles.Faster than dwtohex upper.
This is done by the use of a table.The table is filled by the ascii values for 4 bits.
Binary and hexa can use use tables and are very fast like this.
Quote from: ToutEnMasm on November 12, 2010, 03:44:46 PM
The cycles for dw2hex_ex is 18 cycles.Faster than dwtohex upper.
This is done by the use of a table.The table is filled by the ascii values for 4 bits.
Binary and hexa can use use tables and are very fast like this.
the proc i've just posted is just as fast as dw2hex_ex and would be faster if i got rid of the prologue and epilogue code and used esp and also it could be faster if i used trashable regs (maybe use edx instead of edi to save a push and pop)
btw after running itoa through a debugger i know why its much quicker converting 0FFFFFFFFh its because it only needs to convert 1 digit (-1), try it with 0FFFFFFFh and you'll see it takes 6 times as long. dwtoa from masm32 is miles faster :toothy
i think i have found what could be described as a bug in crt__itoa
while it uses a faster routine for base 10 values, it is a signed routine :lol
BaseN crt__itoa
Base 2 2498 2325 11111111111111111111111111111111
Base 3 1743 1556 102002022201221111210
Base 4 1311 1201 3333333333333333
Base 5 1175 1087 32244002423140
Base 6 1105 998 1550104015503
Base 7 1067 934 211301422353
Base 8 967 862 37777777777
Base 9 969 864 12068657453
Base 10 902 107 -1
Base 11 900 798 1904440553
Base 12 834 657 9ba461593
Base 13 828 645 535a79888
Base 14 834 737 2ca5b7463
Base 15 832 656 1a20dcd80
Base 16 764 634 ffffffff
It's not a bug, that is the documented behavior:
http://msdn.microsoft.com/en-us/library/yakksftt(v=VS.71).aspx
yah - ok
"we found a bug - let's change the documentation and call it a feature"
well - "-1" is a short string and, thus, a faster conversion
so, in order to get a valid comparison, i changed the value to 7FFFFFFF...
BaseN crt__itoa
Base 2 2439 2257 1111111111111111111111111111111
Base 3 1700 1520 12112122212110202101
Base 4 1321 1213 1333333333333333
Base 5 1180 1074 13344223434042
Base 6 1041 941 553032005531
Base 7 1038 935 104134211161
Base 8 971 864 17777777777
Base 9 902 801 5478773671
Base 10 909 788 2147483647
Base 11 864 781 a02220281
Base 12 836 682 4bb2308a7
Base 13 836 689 282ba4aaa
Base 14 835 659 1652ca931
Base 15 768 645 c87e66b7
Base 16 768 591 7fffffff
personally, i say they should all either be signed or unsigned
not changing underwear because the radix has changed :lol
Quote
the proc i've just posted is just as fast as dw2hex_ex and would be faster if i got rid of the prologue and epilogue code and used esp and also it could be faster if i used trashable regs (maybe use edx instead of edi to save a push and pop)
I agree with that
For those who are interested i join the brother of your dwtohex,dwtobin
The masm32 have a very bigger table
Quote
.data
;---------------0------1------2------3------4------5-------6-----7---
bintable BYTE "0000","0001","0010","0011","0100","0101","0110","0111"
;------8----9------10----11----12-------13----14-----15---
DB "1000","1001","1010","1011","1100","1101","1110","1111",0
;-------------------------------------------------------------
.code
dwtobin PROC USES esi edi,value:DWORD, buff:DWORD
mov ecx, 8
mov esi, value
mov edi, buff
@@:
rol esi, 4 ;4 bits de poids fort
mov eax, esi
and eax,0000000Fh
mov eax, dword ptr bintable[eax*4]
mov [edi], eax
add edi, 4
sub ecx, 1
jnz @B
mov BYTE PTR [edi], 0
;suprimer zeros inutiles
mov esi,buff
mov edi,buff
@@:
.if byte ptr [esi] == "0"
inc esi
jmp @B
.endif
.if byte ptr [esi] == 0
dec esi
mov byte ptr [esi],"0"
.endif
mov eax,0
@@:
.if byte ptr [esi] != 0
movsb
inc eax ;count the number of char
jmp @B
.endif
mov BYTE PTR [edi], 0 ;and return the number of char
ret
dwtobin ENDP
Quote from: dedndave on November 12, 2010, 05:09:18 PM
"we found a bug - let's change the documentation and call it a feature"
Efficiency in action :bg
Actually, that behavior has existed since the DOS days.