float$ macro and algo for testing

Started by jj2007, August 26, 2008, 03:48:53 PM

jj2007

Given that the MasmLib FloatToStr has some limitations, I have tried to write a replacement. It is marginally smaller than FloatToStr and somewhat faster. Furthermore, float$ accepts Real4, Real8, Real10, registers and immediate integers as arguments, and is therefore a bit more flexible. It also has basic printf-style features:

MsgBox 0, float$("The number %f is far too big", MyReal10), "Test float$ macro:", MB_OK
(%f will be replaced by the converted number MyReal10)

MsgBox 0, float$("The number\t%f\nis far too small", MyReal10/12345678+11), "Test float$ macro:", MB_OK
(%f will be replaced by the result of MyReal10/12345678+11, \t creates tab, \n creates newline)

.data
   mta REAL4 1234.5 ; just for fun, we mix a real 4 with a real 10
   mtb REAL10 123.45 ; for calculating mta/mtb+113
.code
   print float$("a=1234.5, b=123.45, c=113:\na/b+c=%f", mta/mtb+113)

Output:
a=1234.5, b=123.45, c=113:
a/b+c=123.0000


Cycle counts for P4 are below. I would be grateful if Core Duo owners could give it a try, too. On a Celeron M, there was only a marginal difference between float$ and FloatToStr. I am pretty sure it could be made somewhat faster; load FloatStr.asm in an editor and search for "crash": the algo is just below that keyword (the FloatStr.asc version is meant for printing from Word or Wordpad).

Happy testing, jj

EDIT: New version attached, with minor bug fixes (display of exponents)
EDIT (2): New version attached with changes between f2sL0: and f2sPutDot2:
EDIT (3): New version correct up to 15 digits precision; new P4 timings inserted
EDIT (4): Now accepts immediate real numbers as arguments, e.g.

.data
MyR10   REAL10 1.2345678e9
.code
mov edx, 12345678      ; = 1.2345678e7
MsgBox 0, float$("MyR10/edx+23.4567=%f", MyR10/edx+23.4567), "Test float$ macro:", MB_OK

Output: MyR10/edx+23.4567=123.4567
One to three arguments are allowed; they can be Real4, Real8, Real10, registers (all sizes), immediate integers or floats. Allowed operators are + - * /
No check for precedence, i.e. the macro does sequential calculation of the type 5+3*5=40

Code sizes and FPU register preservation:
FloatToStr      size=895, ST 6-8 trashed
float$          size=746, all ST regs preserved
Ray's lib       size=700, all ST regs preserved
crt sprintf     size=???, all ST regs preserved

finit is ON
2143 cycles for FloatToStr      1.234568e-004
979 cycles for float$ Real4     1.234568e-05   <- three different numbers
974 cycles for float$ Real8     1.234568e-04   here to show automatic switch
925 cycles for float$ Real10    0.001234568   to normal notation
3352 cycles for Ray's lib       0.001235
7412 cycles for sprintf         0.000123457

---------
2238 cycles for FloatToStr      0.1234568
931 cycles for float$ Real4     0.1234568
937 cycles for float$ Real8     0.1234568
929 cycles for float$ Real10    0.1234568
3352 cycles for Ray's lib       0.123457
6322 cycles for sprintf         0.123457

---------
1737 cycles for FloatToStr      1.234568
932 cycles for float$ Real4     1.234568
913 cycles for float$ Real8     1.234568
940 cycles for float$ Real10    1.234568
3355 cycles for Ray's lib       1.234568
7670 cycles for sprintf         1.23457

---------
2207 cycles for FloatToStr      1234.568
940 cycles for float$ Real4     1234.568
945 cycles for float$ Real8     1234.568
945 cycles for float$ Real10    1234.568
3560 cycles for Ray's lib       1234.567890
7170 cycles for sprintf         1234.57

---------
2188 cycles for FloatToStr      1.234568e+123
987 cycles for float$ Real4     1.234568e+23        <--- 123 exceeds Real4 range, so I put e23
1035 cycles for float$ Real8    1.234568e+123
1053 cycles for float$ Real10   1.234568e+123
3608 cycles for Ray's lib       1.234567890123457E+0123
9804 cycles for sprintf         1.23457e+123

---------
2159 cycles for FloatToStr      -1.234568e-123
977 cycles for float$ Real4     -1.234568e-23
1019 cycles for float$ Real8    -1.234568e-123
1047 cycles for float$ Real10   -1.234568e-123
3348 cycles for Ray's lib       -0.000000
10526 cycles for sprintf        -1.23457e-123

---------
21 cycles for FloatToStr        0
183 cycles for float$ Real4     0.0
196 cycles for float$ Real8     0.0
181 cycles for float$ Real10    0.0
1180 cycles for Ray's lib       0
1233 cycles for sprintf         0


Latest version of 1 September 2008, 11:06 GMT+1 attached here.

[attachment deleted by admin]

sinsi

Some weird stuff here (Q6600)

Code sizes and FPU register preservation:
FloatToStr      size=895, ST 6-8 trashed
float$          size=689, all ST regs preserved
Ray's lib       size=700, all ST regs preserved
crt sprintf     size=???, all ST regs preserved

finit is ON
461 cycles for FloatToStr       1.234568e-003
422 cycles for float$ Real4     0.0012345
418 cycles for float$ Real8     0.0012345
420 cycles for float$ Real10    0.0012345
446 cycles for float$ edi       2.1474836e+09
1171 cycles for Ray's lib        0.001235
4409 cycles for sprintf         0.00123457

---------
495 cycles for FloatToStr       0.1234568
428 cycles for float$ Real4     0.1234567
433 cycles for float$ Real8     0.1234567
430 cycles for float$ Real10    0.1234567
1170 cycles for Ray's lib        0.123457
3559 cycles for sprintf         0.123457

---------
685 cycles for FloatToStr       1.234568
432 cycles for float$ Real4     1.2345678
434 cycles for float$ Real8     1.2345678
472 cycles for float$ Real10    1.2345678
1185 cycles for Ray's lib        1.234568
4259 cycles for sprintf         1.23457

---------
682 cycles for FloatToStr       1234.568
431 cycles for float$ Real4     1234.5678
432 cycles for float$ Real8     1234.5678
430 cycles for float$ Real10    1234.5678
1238 cycles for Ray's lib        1234.567890
4283 cycles for sprintf         1234.57

---------
497 cycles for FloatToStr       1.234568e+123
443 cycles for float$ Real4     1.2345678e+23
467 cycles for float$ Real8     1.2345678e+123
460 cycles for float$ Real10    1.2345678e+123
1232 cycles for Ray's lib        1.234567890100000E+0123
5787 cycles for sprintf         1.23457e+123

---------
473 cycles for FloatToStr       -1.234568e-123
444 cycles for float$ Real4     -1.234567e-23
455 cycles for float$ Real8     -1.234567e-123
455 cycles for float$ Real10    -1.234567e-123
1167 cycles for Ray's lib       -0.000000
5636 cycles for sprintf         -1.23457e-123

---------
8 cycles for FloatToStr 0
69 cycles for float$ Real4      0.0
68 cycles for float$ Real8      0.0
68 cycles for float$ Real10     0.0
399 cycles for Ray's lib         0
644 cycles for sprintf          0

Light travels faster than sound, that's why some people seem bright until you hear them.

jj2007

Quote from: sinsi on August 27, 2008, 07:45:26 AM
Some weird stuff here (Q6600)
It seems that the Core 2 line of processors has a much faster FPU (typically a factor 5); here are timings for a "conventional" P4:

2221 cycles for FloatToStr      0.1234568
901 cycles for float$ Real4     0.1234567  <-- must check my rounding routine
906 cycles for float$ Real8     0.1234567
929 cycles for float$ Real10    0.1234567
3397 cycles for Ray's lib        0.123457
6453 cycles for sprintf         0.123457

Comparing FloatToStr and float$:
2221/912=2.4, against 495/(428+433+430)*3=1.15 for your timings. Given that there is a fair amount of non-FPU instructions in here, this demonstrates the advantage of the Core 2 FPU. I have googled for "faster fpu" "core 2", with very few results; Wiki says Pentium MMX has a faster FPU, but that's not what we observe here. Strange that such a major improvement goes unnoticed, especially since many compilers still don't know about SSE2...

jj2007

Quote from: jj2007 on August 27, 2008, 08:23:43 AM
I have googled for "faster fpu" "core 2", with very few results

Here is a table showing FPU scores for a number of different processors. But it cannot explain the stark differences observed e.g. by Raymond.

MichaelW

Core 2 Duo E6300, 1867 MHz: 395645
Pentium 4 Willamette S423, 1300 MHz: 90631

After adjusting for the different clock speed on the Pentium 4 the ratio is ~3.8:1.
eschew obfuscation

herge

 Hi All:


------- New float$ macro: ------------------
Divide MyReal10 (=1.2345678e9)
by 12345678
add 11
Result= 111.00000
-- This para printed by one line of code! --


Code sizes and FPU register preservation:
FloatToStr size=895, ST 6-8 trashed
float$    size=689, all ST regs preserved
Ray's lib size=700, all ST regs preserved
crt sprintf size=???, all ST regs preserved

finit is ON
468 cycles for FloatToStr 1.234568e-003
425 cycles for float$ Real4 0.0012345
424 cycles for float$ Real8 0.0012345
424 cycles for float$ Real10 0.0012345
450 cycles for float$ edi 2.1474836e+09
1175 cycles for Ray's lib 0.001235
4449 cycles for sprintf  0.00123457

---------
500 cycles for FloatToStr 0.1234568
435 cycles for float$ Real4 0.1234567
437 cycles for float$ Real8 0.1234567
435 cycles for float$ Real10 0.1234567
1179 cycles for Ray's lib 0.123457
3601 cycles for sprintf  0.123457

---------
688 cycles for FloatToStr 1.234568
438 cycles for float$ Real4 1.2345678
438 cycles for float$ Real8 1.2345678
476 cycles for float$ Real10 1.2345678
1192 cycles for Ray's lib 1.234568
4349 cycles for sprintf  1.23457

---------
687 cycles for FloatToStr 1234.568
436 cycles for float$ Real4 1234.5678
436 cycles for float$ Real8 1234.5678
435 cycles for float$ Real10 1234.5678
1242 cycles for Ray's lib 1234.567890
4285 cycles for sprintf  1234.57

---------
499 cycles for FloatToStr 1.234568e+123
441 cycles for float$ Real4 1.2345678e+23
461 cycles for float$ Real8 1.2345678e+123
461 cycles for float$ Real10 1.2345678e+123
1237 cycles for Ray's lib 1.234567890100000E+0123
5855 cycles for sprintf  1.23457e+123

---------
480 cycles for FloatToStr -1.234568e-123
445 cycles for float$ Real4 -1.234567e-23
458 cycles for float$ Real8 -1.234567e-123
457 cycles for float$ Real10 -1.234567e-123
1180 cycles for Ray's lib -0.000000
5734 cycles for sprintf  -1.23457e-123

---------
8 cycles for FloatToStr 0
69 cycles for float$ Real4 0.0
68 cycles for float$ Real8 0.0
68 cycles for float$ Real10 0.0
402 cycles for Ray's lib 0
595 cycles for sprintf  0


I have a Duo Core

Regards herge
// Herge born  Brussels, Belgium May 22, 1907
// Died March 3, 1983
// Cartoonist of Tintin and Snowy

Mark Jones


------- New float$ macro: ------------------
Divide  MyReal10 (=1.2345678e9)
by      12345678
add     11
Result= 111.00000
-- This para printed by one line of code! --


Code sizes and FPU register preservation:
FloatToStr      size=895, ST 6-8 trashed
float$          size=689, all ST regs preserved
Ray's lib       size=700, all ST regs preserved
crt sprintf     size=???, all ST regs preserved

finit is ON
542 cycles for FloatToStr       1.234568e-003
443 cycles for float$ Real4     0.0012345
443 cycles for float$ Real8     0.0012345
449 cycles for float$ Real10    0.0012345
491 cycles for float$ edi       2.1474836e+09
993 cycles for Ray's lib         0.001235
4586 cycles for sprintf         0.00123457

---------
458 cycles for FloatToStr       0.1234568
432 cycles for float$ Real4     0.1234567
432 cycles for float$ Real8     0.1234567
434 cycles for float$ Real10    0.1234567
970 cycles for Ray's lib         0.123457
3695 cycles for sprintf         0.123457

---------
465 cycles for FloatToStr       1.234568
468 cycles for float$ Real4     1.2345678
467 cycles for float$ Real8     1.2345678
470 cycles for float$ Real10    1.2345678
997 cycles for Ray's lib         1.234568
4201 cycles for sprintf         1.23457

---------
457 cycles for FloatToStr       1234.568
466 cycles for float$ Real4     1234.5678
458 cycles for float$ Real8     1234.5678
453 cycles for float$ Real10    1234.5678
1000 cycles for Ray's lib        1234.567890
4346 cycles for sprintf         1234.57

---------
559 cycles for FloatToStr       1.234568e+123
488 cycles for float$ Real4     1.2345678e+23
528 cycles for float$ Real8     1.2345678e+123
530 cycles for float$ Real10    1.2345678e+123
1046 cycles for Ray's lib        1.234567890100000E+0123
6075 cycles for sprintf         1.23457e+123

---------
547 cycles for FloatToStr       -1.234568e-123
488 cycles for float$ Real4     -1.234567e-23
530 cycles for float$ Real8     -1.234567e-123
530 cycles for float$ Real10    -1.234567e-123
970 cycles for Ray's lib        -0.000000
6004 cycles for sprintf         -1.23457e-123

---------
10 cycles for FloatToStr        0
82 cycles for float$ Real4      0.0
84 cycles for float$ Real8      0.0
101 cycles for float$ Real10    0.0
401 cycles for Ray's lib         0
808 cycles for sprintf          0


AMD x2 x64 4000+ / WinXP32
"To deny our impulses... foolish; to revel in them, chaos." MCJ 2003.08

dsouza123

Athlon 1172 Mhz  XP Pro 32-bit


  ------- New float$ macro: ------------------
  Divide  MyReal10 (=1.2345678e9)
  by      12345678
  add     11
  Result= 111.00000
  -- This para printed by one line of code! --
 
 
  Code sizes and FPU register preservation:
  FloatToStr      size=895, ST 6-8 trashed
  float$          size=689, all ST regs preserved
  Ray's lib       size=700, all ST regs preserved
  crt sprintf     size=???, all ST regs preserved
 
  finit is ON
  565 cycles for FloatToStr       1.234568e-003
  508 cycles for float$ Real4     0.0012345
  487 cycles for float$ Real8     0.0012345
  492 cycles for float$ Real10    0.0012345
  551 cycles for float$ edi       2.1474836e+09
  990 cycles for Ray's lib         0.001235
  4862 cycles for sprintf         0.00123457
 
  ---------
  473 cycles for FloatToStr       0.1234568
  483 cycles for float$ Real4     0.1234567
  489 cycles for float$ Real8     0.1234567
  482 cycles for float$ Real10    0.1234567
  969 cycles for Ray's lib         0.123457
  4074 cycles for sprintf         0.123457
 
  ---------
  486 cycles for FloatToStr       1.234568
  548 cycles for float$ Real4     1.2345678
  520 cycles for float$ Real8     1.2345678
  516 cycles for float$ Real10    1.2345678
  1003 cycles for Ray's lib        1.234568
  4732 cycles for sprintf         1.23457
 
  ---------
  484 cycles for FloatToStr       1234.568
  523 cycles for float$ Real4     1234.5678
  508 cycles for float$ Real8     1234.5678
  505 cycles for float$ Real10    1234.5678
  1008 cycles for Ray's lib        1234.567890
  4756 cycles for sprintf         1234.57
 
  ---------
  581 cycles for FloatToStr       1.234568e+123
  550 cycles for float$ Real4     1.2345678e+23
  598 cycles for float$ Real8     1.2345678e+123
  580 cycles for float$ Real10    1.2345678e+123
  1049 cycles for Ray's lib        1.234567890100000E+0123
  6463 cycles for sprintf         1.23457e+123
 
  ---------
  567 cycles for FloatToStr       -1.234568e-123
  544 cycles for float$ Real4     -1.234567e-23
  580 cycles for float$ Real8     -1.234567e-123
  578 cycles for float$ Real10    -1.234567e-123
  977 cycles for Ray's lib        -0.000000
  6422 cycles for sprintf         -1.23457e-123
 
  ---------
  13 cycles for FloatToStr        0
  105 cycles for float$ Real4     0.0
  107 cycles for float$ Real8     0.0
  73 cycles for float$ Real10     0.0
  360 cycles for Ray's lib         0
  915 cycles for sprintf          0

jj2007

Thanks for testing this. In the meantime, I have refined the algo a little bit, improving inter alia the compatibility with FloatToStr and fixing a rounding bug. Timings are still slightly below FloatToStr on my Celeron M (=Yonah), especially in the "ordinary" range of the 123.456 type:

Code sizes and FPU register preservation:
FloatToStr      size=895, ST 6-8 trashed
float$          size=700, all ST regs preserved
Ray's lib       size=700, all ST regs preserved
crt sprintf     size=???, all ST regs preserved

finit is ON
426 cycles for FloatToStr       1.234568e-004
457 cycles for float$ Real4     1.234568e-05    <--- not a bug; I deliberately
449 cycles for float$ Real8     1.234568e-04    put e-5, e-4, e-3 to demonstrate
436 cycles for float$ Real10    0.001234568     the change from scientific to normal notation
1082 cycles for Ray's lib       0.001235
4219 cycles for sprintf         0.000123457

---------
430 cycles for FloatToStr       0.1234568
432 cycles for float$ Real4     0.1234568
433 cycles for float$ Real8     0.1234568
432 cycles for float$ Real10    0.1234568
1083 cycles for Ray's lib       0.123457
3569 cycles for sprintf         0.123457

---------
598 cycles for FloatToStr       1.234568
441 cycles for float$ Real4     1.234568
434 cycles for float$ Real8     1.234568
440 cycles for float$ Real10    1.234568
1081 cycles for Ray's lib       1.234568
4260 cycles for sprintf         1.23457

---------
598 cycles for FloatToStr       1234.568
438 cycles for float$ Real4     1234.568
435 cycles for float$ Real8     1234.568
439 cycles for float$ Real10    1234.568
1081 cycles for Ray's lib       1234.567890
4340 cycles for sprintf         1234.57

---------
475 cycles for FloatToStr       1.234568e+123
442 cycles for float$ Real4     1.234568e+23
456 cycles for float$ Real8     1.234568e+123
465 cycles for float$ Real10    1.234568e+123
1140 cycles for Ray's lib       1.234567890123456E+0123
5739 cycles for sprintf         1.23457e+123

---------
443 cycles for FloatToStr       -1.234568e-123
448 cycles for float$ Real4     -1.234568e-23    <-- -123 would have been beyond the R4 range, so I put 23
463 cycles for float$ Real8     -1.234568e-123
460 cycles for float$ Real10    -1.234568e-123
1080 cycles for Ray's lib       -0.000000
5810 cycles for sprintf         -1.23457e-123

---------
12 cycles for FloatToStr        0
58 cycles for float$ Real4      0.0
56 cycles for float$ Real8      0.0
59 cycles for float$ Real10     0.0
340 cycles for Ray's lib        0
667 cycles for sprintf          0


float$ isn't any better in the "ordinary" range, but FloatToStr is 20% slower in this range.
Precision can be configured at assembly time (3-15 digits), as well as the breakpoint for switching from 0.001 to 1.00e-3

For the assembly purists and professional macro haters:

------- New float$ macro: ------------------
Divide  MyReal10 (=1.2345678e9)
by      12345678
add     11
Result= 111.0000
-- This para printed by one line of code! --


EDIT: Newest version attached to first post above

Mark Jones


------- New float$ macro: ------------------
Divide  MyReal10 (=1.2345678e9)
by      12345678
add     11
Result= 111.0000
-- This para printed by one line of code! --


Code sizes and FPU register preservation:
FloatToStr      size=895, ST 6-8 trashed
float$          size=738, all ST regs preserved
Ray's lib       size=700, all ST regs preserved
crt sprintf     size=???, all ST regs preserved

finit is ON
541 cycles for FloatToStr       1.234568e-004
473 cycles for float$ Real4     1.234568e-05
490 cycles for float$ Real8     1.234568e-04
457 cycles for float$ Real10    0.001234568
971 cycles for Ray's lib        0.001235
4275 cycles for sprintf         0.0001234568

---------
456 cycles for FloatToStr       0.1234568
431 cycles for float$ Real4     0.1234568
432 cycles for float$ Real8     0.1234568
433 cycles for float$ Real10    0.1234568
952 cycles for Ray's lib        0.123457
3643 cycles for sprintf         0.1234568

---------
453 cycles for FloatToStr       1.234568
452 cycles for float$ Real4     1.234568
451 cycles for float$ Real8     1.234568
455 cycles for float$ Real10    1.234568
979 cycles for Ray's lib        1.234568
4344 cycles for sprintf         1.234568

---------
459 cycles for FloatToStr       1234.568
456 cycles for float$ Real4     1234.568
455 cycles for float$ Real8     1234.568
460 cycles for float$ Real10    1234.568
981 cycles for Ray's lib        1234.567890
4146 cycles for sprintf         1234.568

---------
578 cycles for FloatToStr       1.234568e+123
473 cycles for float$ Real4     1.234568e+23
515 cycles for float$ Real8     1.234568e+123
517 cycles for float$ Real10    1.234568e+123
1030 cycles for Ray's lib       1.234567890123457E+0123
5794 cycles for sprintf         1.234568e+123

---------
547 cycles for FloatToStr       -1.234568e-123
474 cycles for float$ Real4     -1.234568e-23
515 cycles for float$ Real8     -1.234568e-123
520 cycles for float$ Real10    -1.234568e-123
953 cycles for Ray's lib        -0.000000
5860 cycles for sprintf         -1.234568e-123

---------
9 cycles for FloatToStr 0
83 cycles for float$ Real4      0
93 cycles for float$ Real8      0
86 cycles for float$ Real10     0
374 cycles for Ray's lib        0
855 cycles for sprintf          0

jj2007

Quote from: jj2007 on August 27, 2008, 09:55:12 PM
float$ isn't any better in the "ordinary" range, but FloatToStr is 20% slower in this range.

Not for your machine, Mark...

Quote from: Mark Jones on August 31, 2008, 04:38:07 PM

---------
453 cycles for FloatToStr       1.234568
452 cycles for float$ Real4     1.234568
451 cycles for float$ Real8     1.234568
455 cycles for float$ Real10    1.234568

---------
459 cycles for FloatToStr       1234.568
456 cycles for float$ Real4     1234.568
455 cycles for float$ Real8     1234.568
460 cycles for float$ Real10    1234.568


herge


Hi ALL:


------- New float$ macro: ---------------------
Divide MyReal10 (=1.2345678e9)
by 12345678 (=1.2e7, in eax)
add 11.1111    (an immediate real)
Result= 111.1111
-- This para printed by one line of code! -----


Code sizes and FPU register preservation:
FloatToStr size=895, ST 6-8 trashed
float$    size=771, all ST regs preserved
Ray's lib size=700, all ST regs preserved
crt sprintf size=???, all ST regs preserved

finit is ON Version 1.1, 1 September 2008
424 cycles for FloatToStr 1.234568e-004
435 cycles for float$ Real5 1.234568e-05
441 cycles for float$ Real8 1.234568e-04
435 cycles for float$ Real10 0.001234568
1194 cycles for Ray's lib 0.001235
4198 cycles for sprintf  0.0001234568

---------
458 cycles for FloatToStr 0.1234568
436 cycles for float$ Real5 0.1234568
432 cycles for float$ Real8 0.1234568
432 cycles for float$ Real10 0.1234568
1197 cycles for Ray's lib 0.123457
3418 cycles for sprintf  0.1234568

---------
655 cycles for FloatToStr 1.234568
430 cycles for float$ Real5 1.234568
432 cycles for float$ Real8 1.234568
431 cycles for float$ Real10 1.234568
1199 cycles for Ray's lib 1.234568
4559 cycles for sprintf  1.234568

---------
649 cycles for FloatToStr 1234.568
441 cycles for float$ Real5 1234.568
434 cycles for float$ Real8 1234.568
431 cycles for float$ Real10 1234.568
1246 cycles for Ray's lib 1234.567890
4422 cycles for sprintf  1234.568

---------
466 cycles for FloatToStr 1.234568e+123
431 cycles for float$ Real5 1.234568e+23
448 cycles for float$ Real8 1.234568e+123
448 cycles for float$ Real10 1.234568e+123
1230 cycles for Ray's lib 1.234567890123457E+0123
5373 cycles for sprintf  1.234568e+123

---------
434 cycles for FloatToStr -1.234568e-123
431 cycles for float$ Real5 -1.234568e-23
450 cycles for float$ Real8 -1.234568e-123
448 cycles for float$ Real10 -1.234568e-123
1169 cycles for Ray's lib -0.000000
5746 cycles for sprintf  -1.234568e-123

---------
11 cycles for FloatToStr 0
68 cycles for float$ Real5 0
66 cycles for float$ Real8 0
59 cycles for float$ Real10 0
402 cycles for Ray's lib 0
588 cycles for sprintf  0



Regards herge

jj2007

New version attached, enhanced with drizz qwtoa algo.
Timings on Core 2 Celeron M below. The bad news: It's now 18 bytes longer than the old FloatToStr, and is still much slower if the number to print is zero. Otherwise, I am quite satisfied. Can somebody test it on a non-Core 2 please? Thanx.

Code sizes and FPU register preservation:
FloatToStr      size=895, ST 6-8 trashed
float$          size=913, all ST regs preserved
Ray's lib       size=700, all ST regs preserved
crt sprintf     size=???, all ST regs preserved

finit is ON     Version 1.2, 12 September 2008
Credits to drizz for the qwtoa algo

493 cycles for FloatToStr       1.234568e-004
353 cycles for float$ REAL4     1.23456e-05
347 cycles for float$ REAL8     1.23456e-04
345 cycles for float$ REAL10    1.23456e-03
1115 cycles for Ray's lib       0.001235
4490 cycles for sprintf         0.0001234568

---------
492 cycles for FloatToStr       0.1234568
325 cycles for float$ REAL4     0.1234568
319 cycles for float$ REAL8     0.1234568
319 cycles for float$ REAL10    0.1234568
1110 cycles for Ray's lib       0.123457
3620 cycles for sprintf         0.1234568

---------
658 cycles for FloatToStr       1.234568
329 cycles for float$ REAL4     1.234568
316 cycles for float$ REAL8     1.234568
323 cycles for float$ REAL10    1.234568
1114 cycles for Ray's lib       1.234568
4530 cycles for sprintf         1.234568

---------
653 cycles for FloatToStr       1234.568
325 cycles for float$ REAL4     1234.568
322 cycles for float$ REAL8     1234.568
322 cycles for float$ REAL10    1234.568
1113 cycles for Ray's lib       1234.567890
4555 cycles for sprintf         1234.568

---------
526 cycles for FloatToStr       1.234568e+123
343 cycles for float$ REAL4     1.23456e+23
370 cycles for float$ REAL8     1.23456e+123
366 cycles for float$ REAL10    1.23456e+123
1175 cycles for Ray's lib       1.234567890123457E+0123
5925 cycles for sprintf         1.234568e+123

---------
503 cycles for FloatToStr       -1.234568e-123
359 cycles for float$ REAL4     -1.23456e-23
394 cycles for float$ REAL8     -1.23456e-123
388 cycles for float$ REAL10    -1.23456e-123
1124 cycles for Ray's lib       -0.000000
6085 cycles for sprintf         -1.234568e-123

---------
11 cycles for FloatToStr        0
125 cycles for float$ REAL4     0
125 cycles for float$ REAL8     0
124 cycles for float$ REAL10    0
349 cycles for Ray's lib        0
695 cycles for sprintf          0




[attachment deleted by admin]

herge


Hi ALL:



Test variable precision (default: 7 digits)
0.0012345: 1.23456e-03
0.012345: 0.01234568
0.12345: 0.1234568
1.2345:      1.234568
1.2345:      1.
1.2345:      1.2
1.2345:      1.23
1.2345:      1.235
1.2345:      1.2346
1.2345:      1.23457
1.2345:      1.234568
1.2345:      1.2345679
1.2345:      1.23456789
1.2345:      1.234567890
1.2345:      1.2345678901
1.2345:      1.23456789012
1.2345:      1.234567890123
1.2345:      1.2345678901235
1.2345:      1.23456789012346
12.345:      12.34568
12345.678: 12345.68
1.2345e+9: 1.23456e+09
1.2345e+12: 1.23456e+12
1.2345e+15: 1.23456e+15

Code sizes and FPU register preservation:
FloatToStr size=895, ST 6-8 trashed
float$    size=913, all ST regs preserved
Ray's lib size=700, all ST regs preserved
crt sprintf size=???, all ST regs preserved

finit is ON Version 1.2, 12 September 2008
Credits to drizz for the qwtoa algo

464 cycles for FloatToStr 1.234568e-004
305 cycles for float$ REAL4 1.23456e-05
308 cycles for float$ REAL8 1.23456e-04
299 cycles for float$ REAL10 1.23456e-03
1202 cycles for Ray's lib 0.001235
4196 cycles for sprintf  0.0001234568

---------
502 cycles for FloatToStr 0.1234568
281 cycles for float$ REAL4 0.1234568
284 cycles for float$ REAL8 0.1234568
273 cycles for float$ REAL10 0.1234568
1185 cycles for Ray's lib 0.123457
3413 cycles for sprintf  0.1234568

---------
692 cycles for FloatToStr 1.234568
281 cycles for float$ REAL4 1.234568
283 cycles for float$ REAL8 1.234568
277 cycles for float$ REAL10 1.234568
1184 cycles for Ray's lib 1.234568
4555 cycles for sprintf  1.234568

---------
684 cycles for FloatToStr 1234.568
284 cycles for float$ REAL4 1234.568
286 cycles for float$ REAL8 1234.568
287 cycles for float$ REAL10 1234.568
1243 cycles for Ray's lib 1234.567890
4386 cycles for sprintf  1234.568

---------
505 cycles for FloatToStr 1.234568e+123
299 cycles for float$ REAL4 1.23456e+23
324 cycles for float$ REAL8 1.23456e+123
316 cycles for float$ REAL10 1.23456e+123
1254 cycles for Ray's lib 1.234567890123457E+0123
5454 cycles for sprintf  1.234568e+123

---------
485 cycles for FloatToStr -1.234568e-123
310 cycles for float$ REAL4 -1.23456e-23
361 cycles for float$ REAL8 -1.23456e-123
355 cycles for float$ REAL10 -1.23456e-123
1196 cycles for Ray's lib -0.000000
5834 cycles for sprintf  -1.234568e-123

---------
10 cycles for FloatToStr 0
132 cycles for float$ REAL4 0
134 cycles for float$ REAL8 0
128 cycles for float$ REAL10 0
398 cycles for Ray's lib 0
587 cycles for sprintf  0



Regards herge

Mark Jones

AMD x2 Dual 4GHz x64 on WinXPx32:

Test variable precision (default: 7 digits)
0.0012345:      1.23456e-03
0.012345:       0.01234568
0.12345:        0.1234568
1.2345:         1.234568
1.2345:         1.
1.2345:         1.2
1.2345:         1.23
1.2345:         1.235
1.2345:         1.2346
1.2345:         1.23457
1.2345:         1.234568
1.2345:         1.2345679
1.2345:         1.23456789
1.2345:         1.234567890
1.2345:         1.2345678901
1.2345:         1.23456789012
1.2345:         1.234567890123
1.2345:         1.2345678901235
1.2345:         1.23456789012346
12.345:         12.34568
12345.678:      12345.68
1.2345e+9:      1.23456e+09
1.2345e+12:     1.23456e+12
1.2345e+15:     1.23456e+15

Code sizes and FPU register preservation:
FloatToStr      size=895, ST 6-8 trashed
float$          size=913, all ST regs preserved
Ray's lib       size=700, all ST regs preserved
crt sprintf     size=???, all ST regs preserved

finit is ON     Version 1.2, 12 September 2008
Credits to drizz for the qwtoa algo

540 cycles for FloatToStr       1.234568e-004
461 cycles for float$ REAL4     1.23456e-05
476 cycles for float$ REAL8     1.23456e-04
479 cycles for float$ REAL10    1.23456e-03
1000 cycles for Ray's lib       0.001235
4180 cycles for sprintf         0.0001234568

---------
455 cycles for FloatToStr       0.1234568
384 cycles for float$ REAL4     0.1234568
384 cycles for float$ REAL8     0.1234568
386 cycles for float$ REAL10    0.1234568
978 cycles for Ray's lib        0.123457
3455 cycles for sprintf         0.1234568

---------
470 cycles for FloatToStr       1.234568
425 cycles for float$ REAL4     1.234568
424 cycles for float$ REAL8     1.234568
427 cycles for float$ REAL10    1.234568
1011 cycles for Ray's lib       1.234568
4177 cycles for sprintf         1.234568

---------
458 cycles for FloatToStr       1234.568
404 cycles for float$ REAL4     1234.568
421 cycles for float$ REAL8     1234.568
412 cycles for float$ REAL10    1234.568
1008 cycles for Ray's lib       1234.567890
4111 cycles for sprintf         1234.568

---------
564 cycles for FloatToStr       1.234568e+123
458 cycles for float$ REAL4     1.23456e+23
501 cycles for float$ REAL8     1.23456e+123
504 cycles for float$ REAL10    1.23456e+123
1056 cycles for Ray's lib       1.234567890123457E+0123
5695 cycles for sprintf         1.234568e+123

---------
546 cycles for FloatToStr       -1.234568e-123
459 cycles for float$ REAL4     -1.23456e-23
503 cycles for float$ REAL8     -1.23456e-123
505 cycles for float$ REAL10    -1.23456e-123
978 cycles for Ray's lib        -0.000000
5679 cycles for sprintf         -1.234568e-123

---------
9 cycles for FloatToStr 0
132 cycles for float$ REAL4     0
131 cycles for float$ REAL8     0
134 cycles for float$ REAL10    0
398 cycles for Ray's lib        0
842 cycles for sprintf          0