Title: float$ macro and algo for testing
Post by: jj2007 on August 26, 2008, 03:48:53 PM

Given that the MasmLib FloatToStr has some limitations, I have tried to write a replacement. It is marginally smaller than FloatToStr and somewhat faster. Furthermore, float$ accepts Real4, Real8, Real10, registers and immediate integers as arguments, and is therefore a bit more flexible. It also has basic printf-style features:

MsgBox 0, float$("The number %f is far too big", MyReal10), "Test float$ macro:", MB_OK
(%f will be replaced by the converted number MyReal10)

MsgBox 0, float$("The number\t%f\nis far too small", MyReal10/12345678+11), "Test float$ macro:", MB_OK
(%f will be replaced by the result of MyReal10/12345678+11, \t creates tab, \n creates newline)
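
For readers unfamiliar with the printf-style behaviour described above, here is a minimal Python sketch of the idea. It is not how the macro works internally (float$ is expanded by the assembler); `float_str`, its parameters and the choice of seven significant digits are assumptions made purely for illustration.

```python
def float_str(template: str, value: float, digits: int = 7) -> str:
    """Hypothetical stand-in for float$: substitute %f and expand \\t, \\n."""
    # Convert the number with a fixed count of significant digits (assumed: 7).
    number = f"{value:.{digits}g}"
    # Expand the two escape sequences mentioned in the post.
    text = template.replace("\\t", "\t").replace("\\n", "\n")
    # Replace only the first %f, since one number is converted per call.
    return text.replace("%f", number, 1)


my_real10 = 1.2345678e9
result = float_str(r"The number\t%f\nis far too small", my_real10 / 12345678 + 11)
print(result)
```

Note that Python's g format drops trailing zeros, so the sketch prints 111 where the macro pads the result with zeros.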

.data
   mta	REAL4	1234.5	; just for fun, we mix a real 4 with a real 10
   mtb	REAL10	123.45	; for calculating mta/mtb+113
.code
   print float$("a=1234.5, b=123.45, c=113:\na/b+c=%f", mta/mtb+113)

Output:

a=1234.5, b=123.45, c=113:
a/b+c=123.0000

Cycle counts for a P4 are below. I would be grateful if Core Duo owners could give it a try, too. On a Celeron M, there was only a marginal difference between float$ and FloatToStr. I am pretty sure it could be made somewhat faster; load FloatStr.asm in an editor and search for "crash": the algo is just below that keyword (the FloatStr.asc version is meant for printing from Word or Wordpad).

Happy testing, jj

EDIT: New version attached, with minor bug fixes (display of exponents)
EDIT (2): New version attached with changes between f2sL0: and f2sPutDot2:
EDIT (3): New version correct up to 15 digits precision; new P4 timings inserted
EDIT (4): Now accepts immediate real numbers as arguments, e.g.

.data
MyR10 REAL10 1.2345678e9
.code
mov edx, 12345678 ; = 1.2345678e7
MsgBox 0, float$("MyR10/edx+23.4567=%f", MyR10/edx+23.4567), "Test float$ macro:", MB_OK

Output: MyR10/edx+23.4567=123.4567
One to three arguments are allowed; they can be Real8, Real10, registers (all sizes), immediate integers or floats. Allowed operators are + - * /
There is no check for operator precedence: the macro evaluates strictly from left to right, so 5+3*5 yields 40 rather than 20.
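
The left-to-right rule can be sketched in Python as follows. This only mimics the evaluation order described above; the function name `eval_sequential` and the simple tokenizer (plain non-negative literals only, no registers or variables) are assumptions for the sake of the example.

```python
import re


def eval_sequential(expr: str) -> float:
    """Evaluate + - * / strictly left to right, ignoring operator precedence."""
    # Split into numbers and operators; only non-negative literals are handled.
    tokens = re.findall(r"\d+\.?\d*(?:[eE][+-]?\d+)?|[+\-*/]", expr)
    acc = float(tokens[0])
    # Apply each operator to the running result as soon as it is read.
    for op, operand in zip(tokens[1::2], tokens[2::2]):
        x = float(operand)
        if op == "+":
            acc += x
        elif op == "-":
            acc -= x
        elif op == "*":
            acc *= x
        else:
            acc /= x
    return acc


print(eval_sequential("5+3*5"))                         # evaluated as (5+3)*5
print(eval_sequential("1.2345678e9/12345678+23.4567"))  # evaluated as (a/b)+c
```

For expressions of the a/b+c type used in the examples, sequential evaluation and normal precedence give the same result; they differ only when + or - comes before * or /.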

Code sizes and FPU register preservation:
FloatToStr      size=895, ST 6-8 trashed
float$          size=746, all ST regs preserved
Ray's lib       size=700, all ST regs preserved
crt sprintf     size=???, all ST regs preserved

finit is ON
2143 cycles for FloatToStr      1.234568e-004
979 cycles for float$ Real4     1.234568e-05   <- three different numbers
974 cycles for float$ Real8     1.234568e-04   here to show automatic switch
925 cycles for float$ Real10    0.001234568   to normal notation
3352 cycles for Ray's lib       0.001235
7412 cycles for sprintf         0.000123457

---------
2238 cycles for FloatToStr      0.1234568
931 cycles for float$ Real4     0.1234568
937 cycles for float$ Real8     0.1234568
929 cycles for float$ Real10    0.1234568
3352 cycles for Ray's lib       0.123457
6322 cycles for sprintf         0.123457

---------
1737 cycles for FloatToStr      1.234568
932 cycles for float$ Real4     1.234568
913 cycles for float$ Real8     1.234568
940 cycles for float$ Real10    1.234568
3355 cycles for Ray's lib       1.234568
7670 cycles for sprintf         1.23457

---------
2207 cycles for FloatToStr      1234.568
940 cycles for float$ Real4     1234.568
945 cycles for float$ Real8     1234.568
945 cycles for float$ Real10    1234.568
3560 cycles for Ray's lib       1234.567890
7170 cycles for sprintf         1234.57

---------
2188 cycles for FloatToStr      1.234568e+123
987 cycles for float$ Real4     1.234568e+23        <--- 123 exceeds Real4 range, so I put e23
1035 cycles for float$ Real8    1.234568e+123
1053 cycles for float$ Real10   1.234568e+123
3608 cycles for Ray's lib       1.234567890123457E+0123
9804 cycles for sprintf         1.23457e+123

---------
2159 cycles for FloatToStr      -1.234568e-123
977 cycles for float$ Real4     -1.234568e-23
1019 cycles for float$ Real8    -1.234568e-123
1047 cycles for float$ Real10   -1.234568e-123
3348 cycles for Ray's lib       -0.000000
10526 cycles for sprintf        -1.23457e-123

---------
21 cycles for FloatToStr        0
183 cycles for float$ Real4     0.0
196 cycles for float$ Real8     0.0
181 cycles for float$ Real10    0.0
1180 cycles for Ray's lib       0
1233 cycles for sprintf         0

Latest version of 1 September 2008, 11:06 GMT+1 attached here.

[attachment deleted by admin]

Title: Re: float$ macro and algo for testing
Post by: sinsi on August 27, 2008, 07:45:26 AM

Some weird stuff here (Q6600)

Code sizes and FPU register preservation:
FloatToStr      size=895, ST 6-8 trashed
float$          size=689, all ST regs preserved
Ray's lib       size=700, all ST regs preserved
crt sprintf     size=???, all ST regs preserved

finit is ON
461 cycles for FloatToStr       1.234568e-003
422 cycles for float$ Real4     0.0012345
418 cycles for float$ Real8     0.0012345
420 cycles for float$ Real10    0.0012345
446 cycles for float$ edi       2.1474836e+09
1171 cycles for Ray's lib        0.001235
4409 cycles for sprintf         0.00123457

---------
495 cycles for FloatToStr       0.1234568
428 cycles for float$ Real4     0.1234567
433 cycles for float$ Real8     0.1234567
430 cycles for float$ Real10    0.1234567
1170 cycles for Ray's lib        0.123457
3559 cycles for sprintf         0.123457

---------
685 cycles for FloatToStr       1.234568
432 cycles for float$ Real4     1.2345678
434 cycles for float$ Real8     1.2345678
472 cycles for float$ Real10    1.2345678
1185 cycles for Ray's lib        1.234568
4259 cycles for sprintf         1.23457

---------
682 cycles for FloatToStr       1234.568
431 cycles for float$ Real4     1234.5678
432 cycles for float$ Real8     1234.5678
430 cycles for float$ Real10    1234.5678
1238 cycles for Ray's lib        1234.567890
4283 cycles for sprintf         1234.57

---------
497 cycles for FloatToStr       1.234568e+123
443 cycles for float$ Real4     1.2345678e+23
467 cycles for float$ Real8     1.2345678e+123
460 cycles for float$ Real10    1.2345678e+123
1232 cycles for Ray's lib        1.234567890100000E+0123
5787 cycles for sprintf         1.23457e+123

---------
473 cycles for FloatToStr       -1.234568e-123
444 cycles for float$ Real4     -1.234567e-23
455 cycles for float$ Real8     -1.234567e-123
455 cycles for float$ Real10    -1.234567e-123
1167 cycles for Ray's lib       -0.000000
5636 cycles for sprintf         -1.23457e-123

---------
8 cycles for FloatToStr         0
69 cycles for float$ Real4      0.0
68 cycles for float$ Real8      0.0
68 cycles for float$ Real10     0.0
399 cycles for Ray's lib         0
644 cycles for sprintf          0

Title: Re: float$ macro and algo for testing
Post by: jj2007 on August 27, 2008, 08:23:43 AM

Quote from: sinsi on August 27, 2008, 07:45:26 AM
Some weird stuff here (Q6600)

It seems that the Core 2 line of processors has a much faster FPU (typically by a factor of 5); here are timings for a "conventional" P4:

2221 cycles for FloatToStr 0.1234568
901 cycles for float$ Real4 0.1234567 <-- must check my rounding routine
906 cycles for float$ Real8 0.1234567
929 cycles for float$ Real10 0.1234567
3397 cycles for Ray's lib 0.123457
6453 cycles for sprintf 0.123457

Comparing FloatToStr and float$:
2221/912=2.4 on the P4 (912 being the average of the three float$ timings), against 495/(428+433+430)*3=1.15 for your timings. Given that there is a fair amount of non-FPU instructions in here, this demonstrates the advantage of the Core 2 FPU. I have googled for "faster fpu" "core 2" (http://www.google.it/search?num=50&hl=en&newwindow=1&safe=off&q=%22faster+fpu%22+%22core+2%22&btnG=Search), with very few results; Wiki (http://en.wikipedia.org/wiki/X86) says Pentium MMX has a faster FPU, but that's not what we observe here. Strange that such a major improvement goes unnoticed, especially since many compilers still don't know about SSE2...
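
The two ratios can be reproduced with a few lines of Python. The timings are copied from the posts above; the helper name `speed_ratio` is just an illustrative choice.

```python
# P4 timings from this post, for the value 0.1234568
p4_float_to_str = 2221
p4_float_str = [901, 906, 929]        # float$ Real4, Real8, Real10

# Q6600 timings from sinsi's post, same value
q6600_float_to_str = 495
q6600_float_str = [428, 433, 430]     # float$ Real4, Real8, Real10


def speed_ratio(reference: int, candidates: list) -> float:
    """Ratio of the reference timing to the mean of the candidate timings."""
    return reference / (sum(candidates) / len(candidates))


print(round(speed_ratio(p4_float_to_str, p4_float_str), 2))
print(round(speed_ratio(q6600_float_to_str, q6600_float_str), 2))
```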

Title: Re: float$ macro and algo for testing
Post by: jj2007 on August 27, 2008, 09:15:00 AM

Quote from: jj2007 on August 27, 2008, 08:23:43 AM
I have googled for "faster fpu" "core 2" (http://www.google.it/search?num=50&hl=en&newwindow=1&safe=off&q=%22faster+fpu%22+%22core+2%22&btnG=Search), with very few results

Here (http://www.mikusite.de/pages/listx32g.gif) is a table showing FPU scores for a number of different processors. But it cannot explain the stark differences observed e.g. by Raymond (http://www.masm32.com/board/index.php?topic=9257.msg69484#msg69484).

Title: Re: float$ macro and algo for testing
Post by: MichaelW on August 27, 2008, 10:05:16 AM

Core 2 Duo E6300, 1867 MHz: 395645
Pentium 4 Willamette S423, 1300 MHz: 90631

After adjusting for the different clock speed on the Pentium 4 the ratio is ~3.8:1.

Title: Re: float$ macro and algo for testing
Post by: herge on August 27, 2008, 02:13:10 PM

Hi All:

------- New float$ macro: ------------------
Divide	MyReal10 (=1.2345678e9)
by	12345678
add	11
Result=	111.00000
-- This para printed by one line of code! --


Code sizes and FPU register preservation:
FloatToStr	size=895, ST 6-8 trashed
float$   	size=689, all ST regs preserved
Ray's lib	size=700, all ST regs preserved
crt sprintf	size=???, all ST regs preserved

finit is ON
468 cycles for FloatToStr	1.234568e-003
425 cycles for float$ Real4	0.0012345
424 cycles for float$ Real8	0.0012345
424 cycles for float$ Real10	0.0012345
450 cycles for float$ edi	2.1474836e+09
1175 cycles for Ray's lib	 0.001235
4449 cycles for sprintf  	0.00123457

---------
500 cycles for FloatToStr	0.1234568
435 cycles for float$ Real4	0.1234567
437 cycles for float$ Real8	0.1234567
435 cycles for float$ Real10	0.1234567
1179 cycles for Ray's lib	 0.123457
3601 cycles for sprintf  	0.123457

---------
688 cycles for FloatToStr	1.234568
438 cycles for float$ Real4	1.2345678
438 cycles for float$ Real8	1.2345678
476 cycles for float$ Real10	1.2345678
1192 cycles for Ray's lib	 1.234568
4349 cycles for sprintf  	1.23457

---------
687 cycles for FloatToStr	1234.568
436 cycles for float$ Real4	1234.5678
436 cycles for float$ Real8	1234.5678
435 cycles for float$ Real10	1234.5678
1242 cycles for Ray's lib	 1234.567890
4285 cycles for sprintf  	1234.57

---------
499 cycles for FloatToStr	1.234568e+123
441 cycles for float$ Real4	1.2345678e+23
461 cycles for float$ Real8	1.2345678e+123
461 cycles for float$ Real10	1.2345678e+123
1237 cycles for Ray's lib	 1.234567890100000E+0123
5855 cycles for sprintf  	1.23457e+123

---------
480 cycles for FloatToStr	-1.234568e-123
445 cycles for float$ Real4	-1.234567e-23
458 cycles for float$ Real8	-1.234567e-123
457 cycles for float$ Real10	-1.234567e-123
1180 cycles for Ray's lib	-0.000000
5734 cycles for sprintf  	-1.23457e-123

---------
8 cycles for FloatToStr	0
69 cycles for float$ Real4	0.0
68 cycles for float$ Real8	0.0
68 cycles for float$ Real10	0.0
402 cycles for Ray's lib	 0
595 cycles for sprintf  	0

I have a Duo Core

Regards herge

Title: Re: float$ macro and algo for testing
Post by: Mark Jones on August 27, 2008, 06:41:02 PM

------- New float$ macro: ------------------
Divide  MyReal10 (=1.2345678e9)
by      12345678
add     11
Result= 111.00000
-- This para printed by one line of code! --


Code sizes and FPU register preservation:
FloatToStr      size=895, ST 6-8 trashed
float$          size=689, all ST regs preserved
Ray's lib       size=700, all ST regs preserved
crt sprintf     size=???, all ST regs preserved

finit is ON
542 cycles for FloatToStr       1.234568e-003
443 cycles for float$ Real4     0.0012345
443 cycles for float$ Real8     0.0012345
449 cycles for float$ Real10    0.0012345
491 cycles for float$ edi       2.1474836e+09
993 cycles for Ray's lib         0.001235
4586 cycles for sprintf         0.00123457

---------
458 cycles for FloatToStr       0.1234568
432 cycles for float$ Real4     0.1234567
432 cycles for float$ Real8     0.1234567
434 cycles for float$ Real10    0.1234567
970 cycles for Ray's lib         0.123457
3695 cycles for sprintf         0.123457

---------
465 cycles for FloatToStr       1.234568
468 cycles for float$ Real4     1.2345678
467 cycles for float$ Real8     1.2345678
470 cycles for float$ Real10    1.2345678
997 cycles for Ray's lib         1.234568
4201 cycles for sprintf         1.23457

---------
457 cycles for FloatToStr       1234.568
466 cycles for float$ Real4     1234.5678
458 cycles for float$ Real8     1234.5678
453 cycles for float$ Real10    1234.5678
1000 cycles for Ray's lib        1234.567890
4346 cycles for sprintf         1234.57

---------
559 cycles for FloatToStr       1.234568e+123
488 cycles for float$ Real4     1.2345678e+23
528 cycles for float$ Real8     1.2345678e+123
530 cycles for float$ Real10    1.2345678e+123
1046 cycles for Ray's lib        1.234567890100000E+0123
6075 cycles for sprintf         1.23457e+123

---------
547 cycles for FloatToStr       -1.234568e-123
488 cycles for float$ Real4     -1.234567e-23
530 cycles for float$ Real8     -1.234567e-123
530 cycles for float$ Real10    -1.234567e-123
970 cycles for Ray's lib        -0.000000
6004 cycles for sprintf         -1.23457e-123

---------
10 cycles for FloatToStr        0
82 cycles for float$ Real4      0.0
84 cycles for float$ Real8      0.0
101 cycles for float$ Real10    0.0
401 cycles for Ray's lib         0
808 cycles for sprintf          0

AMD x2 x64 4000+ / WinXP32

Title: Re: float$ macro and algo for testing
Post by: dsouza123 on August 27, 2008, 09:35:29 PM

Athlon 1172 MHz XP Pro 32-bit

  ------- New float$ macro: ------------------
  Divide  MyReal10 (=1.2345678e9)
  by      12345678
  add     11
  Result= 111.00000
  -- This para printed by one line of code! --
  
  
  Code sizes and FPU register preservation:
  FloatToStr      size=895, ST 6-8 trashed
  float$          size=689, all ST regs preserved
  Ray's lib       size=700, all ST regs preserved
  crt sprintf     size=???, all ST regs preserved
  
  finit is ON
  565 cycles for FloatToStr       1.234568e-003
  508 cycles for float$ Real4     0.0012345
  487 cycles for float$ Real8     0.0012345
  492 cycles for float$ Real10    0.0012345
  551 cycles for float$ edi       2.1474836e+09
  990 cycles for Ray's lib         0.001235
  4862 cycles for sprintf         0.00123457
  
  ---------
  473 cycles for FloatToStr       0.1234568
  483 cycles for float$ Real4     0.1234567
  489 cycles for float$ Real8     0.1234567
  482 cycles for float$ Real10    0.1234567
  969 cycles for Ray's lib         0.123457
  4074 cycles for sprintf         0.123457
  
  ---------
  486 cycles for FloatToStr       1.234568
  548 cycles for float$ Real4     1.2345678
  520 cycles for float$ Real8     1.2345678
  516 cycles for float$ Real10    1.2345678
  1003 cycles for Ray's lib        1.234568
  4732 cycles for sprintf         1.23457
  
  ---------
  484 cycles for FloatToStr       1234.568
  523 cycles for float$ Real4     1234.5678
  508 cycles for float$ Real8     1234.5678
  505 cycles for float$ Real10    1234.5678
  1008 cycles for Ray's lib        1234.567890
  4756 cycles for sprintf         1234.57
  
  ---------
  581 cycles for FloatToStr       1.234568e+123
  550 cycles for float$ Real4     1.2345678e+23
  598 cycles for float$ Real8     1.2345678e+123
  580 cycles for float$ Real10    1.2345678e+123
  1049 cycles for Ray's lib        1.234567890100000E+0123
  6463 cycles for sprintf         1.23457e+123
  
  ---------
  567 cycles for FloatToStr       -1.234568e-123
  544 cycles for float$ Real4     -1.234567e-23
  580 cycles for float$ Real8     -1.234567e-123
  578 cycles for float$ Real10    -1.234567e-123
  977 cycles for Ray's lib        -0.000000
  6422 cycles for sprintf         -1.23457e-123
  
  ---------
  13 cycles for FloatToStr        0
  105 cycles for float$ Real4     0.0
  107 cycles for float$ Real8     0.0
  73 cycles for float$ Real10     0.0
  360 cycles for Ray's lib         0
  915 cycles for sprintf          0

Title: Re: float$ macro and algo for testing
Post by: jj2007 on August 27, 2008, 09:55:12 PM

Thanks for testing this. In the meantime, I have refined the algo a little bit, improving, among other things, the compatibility with FloatToStr and fixing a rounding bug. Timings are still slightly below FloatToStr on my Celeron M (=Yonah), especially in the "ordinary" range of the 123.456 type:

Code sizes and FPU register preservation:
FloatToStr      size=895, ST 6-8 trashed
float$          size=700, all ST regs preserved
Ray's lib       size=700, all ST regs preserved
crt sprintf     size=???, all ST regs preserved

finit is ON
426 cycles for FloatToStr       1.234568e-004
457 cycles for float$ Real4     1.234568e-05    <--- not a bug; I deliberately
449 cycles for float$ Real8     1.234568e-04    put e-5, e-4, e-3 to demonstrate
436 cycles for float$ Real10    0.001234568     the change from scientific to normal notation
1082 cycles for Ray's lib       0.001235
4219 cycles for sprintf         0.000123457

---------
430 cycles for FloatToStr       0.1234568
432 cycles for float$ Real4     0.1234568
433 cycles for float$ Real8     0.1234568
432 cycles for float$ Real10    0.1234568
1083 cycles for Ray's lib       0.123457
3569 cycles for sprintf         0.123457

---------
598 cycles for FloatToStr       1.234568
441 cycles for float$ Real4     1.234568
434 cycles for float$ Real8     1.234568
440 cycles for float$ Real10    1.234568
1081 cycles for Ray's lib       1.234568
4260 cycles for sprintf         1.23457

---------
598 cycles for FloatToStr       1234.568
438 cycles for float$ Real4     1234.568
435 cycles for float$ Real8     1234.568
439 cycles for float$ Real10    1234.568
1081 cycles for Ray's lib       1234.567890
4340 cycles for sprintf         1234.57

---------
475 cycles for FloatToStr       1.234568e+123
442 cycles for float$ Real4     1.234568e+23
456 cycles for float$ Real8     1.234568e+123
465 cycles for float$ Real10    1.234568e+123
1140 cycles for Ray's lib       1.234567890123456E+0123
5739 cycles for sprintf         1.23457e+123

---------
443 cycles for FloatToStr       -1.234568e-123
448 cycles for float$ Real4     -1.234568e-23    <-- -123 would have been beyond the R4 range, so I put 23
463 cycles for float$ Real8     -1.234568e-123
460 cycles for float$ Real10    -1.234568e-123
1080 cycles for Ray's lib       -0.000000
5810 cycles for sprintf         -1.23457e-123

---------
12 cycles for FloatToStr        0
58 cycles for float$ Real4      0.0
56 cycles for float$ Real8      0.0
59 cycles for float$ Real10     0.0
340 cycles for Ray's lib        0
667 cycles for sprintf          0

float$ isn't any better in the "ordinary" range, but FloatToStr is 20% slower in this range.
Precision can be configured at assembly time (3-15 digits), as can the breakpoint at which the output switches from normal notation (0.001) to scientific notation (1.00e-3).
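
To illustrate the two settings, here is a rough Python sketch. In float$ they are fixed at assembly time, whereas here they are ordinary function parameters; the names `format_float`, `digits` and `sci_below` and the default breakpoint of 1e-3 are assumptions, not the macro's actual interface.

```python
def format_float(value: float, digits: int = 7, sci_below: float = 1e-3) -> str:
    """Format with `digits` significant digits; use scientific notation
    when the magnitude drops below the `sci_below` breakpoint."""
    if not 3 <= digits <= 15:
        raise ValueError("digits must be between 3 and 15")
    if value != 0 and abs(value) < sci_below:
        # Scientific notation: one digit before the point, digits-1 after it.
        return f"{value:.{digits - 1}e}"
    # Otherwise keep `digits` significant digits in total.
    return f"{value:.{digits}g}"


print(format_float(1.234568e-4))           # below the breakpoint
print(format_float(0.001234568))           # at or above the breakpoint
print(format_float(1.2345678, digits=3))   # reduced precision
```

Very large values still come out in scientific notation here, because Python's g format switches on its own once the exponent reaches the requested precision.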

For the assembly purists and professional macro haters:

------- New float$ macro: ------------------
Divide  MyReal10 (=1.2345678e9)
by      12345678
add     11
Result= 111.0000
-- This para printed by one line of code! --

EDIT: Newest version attached to first post above

Title: Re: float$ macro and algo for testing
Post by: Mark Jones on August 31, 2008, 04:38:07 PM

------- New float$ macro: ------------------
Divide  MyReal10 (=1.2345678e9)
by      12345678
add     11
Result= 111.0000
-- This para printed by one line of code! --


Code sizes and FPU register preservation:
FloatToStr      size=895, ST 6-8 trashed
float$          size=738, all ST regs preserved
Ray's lib       size=700, all ST regs preserved
crt sprintf     size=???, all ST regs preserved

finit is ON
541 cycles for FloatToStr       1.234568e-004
473 cycles for float$ Real4     1.234568e-05
490 cycles for float$ Real8     1.234568e-04
457 cycles for float$ Real10    0.001234568
971 cycles for Ray's lib        0.001235
4275 cycles for sprintf         0.0001234568

---------
456 cycles for FloatToStr       0.1234568
431 cycles for float$ Real4     0.1234568
432 cycles for float$ Real8     0.1234568
433 cycles for float$ Real10    0.1234568
952 cycles for Ray's lib        0.123457
3643 cycles for sprintf         0.1234568

---------
453 cycles for FloatToStr       1.234568
452 cycles for float$ Real4     1.234568
451 cycles for float$ Real8     1.234568
455 cycles for float$ Real10    1.234568
979 cycles for Ray's lib        1.234568
4344 cycles for sprintf         1.234568

---------
459 cycles for FloatToStr       1234.568
456 cycles for float$ Real4     1234.568
455 cycles for float$ Real8     1234.568
460 cycles for float$ Real10    1234.568
981 cycles for Ray's lib        1234.567890
4146 cycles for sprintf         1234.568

---------
578 cycles for FloatToStr       1.234568e+123
473 cycles for float$ Real4     1.234568e+23
515 cycles for float$ Real8     1.234568e+123
517 cycles for float$ Real10    1.234568e+123
1030 cycles for Ray's lib       1.234567890123457E+0123
5794 cycles for sprintf         1.234568e+123

---------
547 cycles for FloatToStr       -1.234568e-123
474 cycles for float$ Real4     -1.234568e-23
515 cycles for float$ Real8     -1.234568e-123
520 cycles for float$ Real10    -1.234568e-123
953 cycles for Ray's lib        -0.000000
5860 cycles for sprintf         -1.234568e-123

---------
9 cycles for FloatToStr         0
83 cycles for float$ Real4      0
93 cycles for float$ Real8      0
86 cycles for float$ Real10     0
374 cycles for Ray's lib        0
855 cycles for sprintf          0

Title: Re: float$ macro and algo for testing
Post by: jj2007 on September 01, 2008, 06:51:08 PM

Quote from: jj2007 on August 27, 2008, 09:55:12 PM
float$ isn't any better in the "ordinary" range, but FloatToStr is 20% slower in this range.

Not for your machine, Mark...

Quote from: Mark Jones on August 31, 2008, 04:38:07 PM
---------
453 cycles for FloatToStr       1.234568
452 cycles for float$ Real4     1.234568
451 cycles for float$ Real8     1.234568
455 cycles for float$ Real10    1.234568

---------
459 cycles for FloatToStr       1234.568
456 cycles for float$ Real4     1234.568
455 cycles for float$ Real8     1234.568
460 cycles for float$ Real10    1234.568

Title: Re: float$ macro and algo for testing
Post by: herge on September 01, 2008, 07:02:03 PM

Hi ALL:

------- New float$ macro: ---------------------
Divide	MyReal10	(=1.2345678e9)
by	12345678	(=1.2e7, in eax)
add	 11.1111   	(an immediate real)
Result=	111.1111
-- This para printed by one line of code! -----


Code sizes and FPU register preservation:
FloatToStr	size=895, ST 6-8 trashed
float$   	size=771, all ST regs preserved
Ray's lib	size=700, all ST regs preserved
crt sprintf	size=???, all ST regs preserved

finit is ON	Version 1.1, 1 September 2008
424 cycles for FloatToStr	1.234568e-004
435 cycles for float$ Real5	1.234568e-05
441 cycles for float$ Real8	1.234568e-04
435 cycles for float$ Real10	0.001234568
1194 cycles for Ray's lib	0.001235
4198 cycles for sprintf  	0.0001234568

---------
458 cycles for FloatToStr	0.1234568
436 cycles for float$ Real5	0.1234568
432 cycles for float$ Real8	0.1234568
432 cycles for float$ Real10	0.1234568
1197 cycles for Ray's lib	0.123457
3418 cycles for sprintf  	0.1234568

---------
655 cycles for FloatToStr	1.234568
430 cycles for float$ Real5	1.234568
432 cycles for float$ Real8	1.234568
431 cycles for float$ Real10	1.234568
1199 cycles for Ray's lib	1.234568
4559 cycles for sprintf  	1.234568

---------
649 cycles for FloatToStr	1234.568
441 cycles for float$ Real5	1234.568
434 cycles for float$ Real8	1234.568
431 cycles for float$ Real10	1234.568
1246 cycles for Ray's lib	1234.567890
4422 cycles for sprintf  	1234.568

---------
466 cycles for FloatToStr	1.234568e+123
431 cycles for float$ Real5	1.234568e+23
448 cycles for float$ Real8	1.234568e+123
448 cycles for float$ Real10	1.234568e+123
1230 cycles for Ray's lib	1.234567890123457E+0123
5373 cycles for sprintf  	1.234568e+123

---------
434 cycles for FloatToStr	-1.234568e-123
431 cycles for float$ Real5	-1.234568e-23
450 cycles for float$ Real8	-1.234568e-123
448 cycles for float$ Real10	-1.234568e-123
1169 cycles for Ray's lib	-0.000000
5746 cycles for sprintf  	-1.234568e-123

---------
11 cycles for FloatToStr	0
68 cycles for float$ Real5	0
66 cycles for float$ Real8	0
59 cycles for float$ Real10	0
402 cycles for Ray's lib	0
588 cycles for sprintf  	0

Regards herge

Title: Re: float$ macro and algo for testing
Post by: jj2007 on September 12, 2008, 09:09:37 AM

New version attached, enhanced with drizz's qwtoa algo (http://www.masm32.com/board/index.php?topic=9857.15).
Timings on Core 2 Celeron M below. The bad news: it is now 18 bytes longer than the old FloatToStr (913 vs. 895 bytes), and it is still much slower if the number to print is zero. Otherwise, I am quite satisfied. Can somebody test it on a non-Core 2, please? Thanx.

Code sizes and FPU register preservation:
FloatToStr      size=895, ST 6-8 trashed
float$          size=913, all ST regs preserved
Ray's lib       size=700, all ST regs preserved
crt sprintf     size=???, all ST regs preserved

finit is ON     Version 1.2, 12 September 2008
Credits to drizz for the qwtoa algo

493 cycles for FloatToStr       1.234568e-004
353 cycles for float$ REAL4     1.23456e-05
347 cycles for float$ REAL8     1.23456e-04
345 cycles for float$ REAL10    1.23456e-03
1115 cycles for Ray's lib       0.001235
4490 cycles for sprintf         0.0001234568

---------
492 cycles for FloatToStr       0.1234568
325 cycles for float$ REAL4     0.1234568
319 cycles for float$ REAL8     0.1234568
319 cycles for float$ REAL10    0.1234568
1110 cycles for Ray's lib       0.123457
3620 cycles for sprintf         0.1234568

---------
658 cycles for FloatToStr       1.234568
329 cycles for float$ REAL4     1.234568
316 cycles for float$ REAL8     1.234568
323 cycles for float$ REAL10    1.234568
1114 cycles for Ray's lib       1.234568
4530 cycles for sprintf         1.234568

---------
653 cycles for FloatToStr       1234.568
325 cycles for float$ REAL4     1234.568
322 cycles for float$ REAL8     1234.568
322 cycles for float$ REAL10    1234.568
1113 cycles for Ray's lib       1234.567890
4555 cycles for sprintf         1234.568

---------
526 cycles for FloatToStr       1.234568e+123
343 cycles for float$ REAL4     1.23456e+23
370 cycles for float$ REAL8     1.23456e+123
366 cycles for float$ REAL10    1.23456e+123
1175 cycles for Ray's lib       1.234567890123457E+0123
5925 cycles for sprintf         1.234568e+123

---------
503 cycles for FloatToStr       -1.234568e-123
359 cycles for float$ REAL4     -1.23456e-23
394 cycles for float$ REAL8     -1.23456e-123
388 cycles for float$ REAL10    -1.23456e-123
1124 cycles for Ray's lib       -0.000000
6085 cycles for sprintf         -1.234568e-123

---------
11 cycles for FloatToStr        0
125 cycles for float$ REAL4     0
125 cycles for float$ REAL8     0
124 cycles for float$ REAL10    0
349 cycles for Ray's lib        0
695 cycles for sprintf          0

[attachment deleted by admin]

Title: Re: float$ macro and algo for testing
Post by: herge on September 12, 2008, 09:17:56 AM

Hi ALL:

Test variable precision (default: 7 digits)
0.0012345:	1.23456e-03
0.012345:	0.01234568
0.12345:	0.1234568
1.2345:      	1.234568
1.2345:      	1.
1.2345:      	1.2
1.2345:      	1.23
1.2345:      	1.235
1.2345:      	1.2346
1.2345:      	1.23457
1.2345:      	1.234568
1.2345:      	1.2345679
1.2345:      	1.23456789
1.2345:      	1.234567890
1.2345:      	1.2345678901
1.2345:      	1.23456789012
1.2345:      	1.234567890123
1.2345:      	1.2345678901235
1.2345:      	1.23456789012346
12.345:      	12.34568
12345.678:	12345.68
1.2345e+9:	1.23456e+09
1.2345e+12:	1.23456e+12
1.2345e+15:	1.23456e+15

Code sizes and FPU register preservation:
FloatToStr	size=895, ST 6-8 trashed
float$   	size=913, all ST regs preserved
Ray's lib	size=700, all ST regs preserved
crt sprintf	size=???, all ST regs preserved

finit is ON	Version 1.2, 12 September 2008
Credits to drizz for the qwtoa algo

464 cycles for FloatToStr	1.234568e-004
305 cycles for float$ REAL4	1.23456e-05
308 cycles for float$ REAL8	1.23456e-04
299 cycles for float$ REAL10	1.23456e-03
1202 cycles for Ray's lib	0.001235
4196 cycles for sprintf  	0.0001234568

---------
502 cycles for FloatToStr	0.1234568
281 cycles for float$ REAL4	0.1234568
284 cycles for float$ REAL8	0.1234568
273 cycles for float$ REAL10	0.1234568
1185 cycles for Ray's lib	0.123457
3413 cycles for sprintf  	0.1234568

---------
692 cycles for FloatToStr	1.234568
281 cycles for float$ REAL4	1.234568
283 cycles for float$ REAL8	1.234568
277 cycles for float$ REAL10	1.234568
1184 cycles for Ray's lib	1.234568
4555 cycles for sprintf  	1.234568

---------
684 cycles for FloatToStr	1234.568
284 cycles for float$ REAL4	1234.568
286 cycles for float$ REAL8	1234.568
287 cycles for float$ REAL10	1234.568
1243 cycles for Ray's lib	1234.567890
4386 cycles for sprintf  	1234.568

---------
505 cycles for FloatToStr	1.234568e+123
299 cycles for float$ REAL4	1.23456e+23
324 cycles for float$ REAL8	1.23456e+123
316 cycles for float$ REAL10	1.23456e+123
1254 cycles for Ray's lib	1.234567890123457E+0123
5454 cycles for sprintf  	1.234568e+123

---------
485 cycles for FloatToStr	-1.234568e-123
310 cycles for float$ REAL4	-1.23456e-23
361 cycles for float$ REAL8	-1.23456e-123
355 cycles for float$ REAL10	-1.23456e-123
1196 cycles for Ray's lib	-0.000000
5834 cycles for sprintf  	-1.234568e-123

---------
10 cycles for FloatToStr	0
132 cycles for float$ REAL4	0
134 cycles for float$ REAL8	0
128 cycles for float$ REAL10	0
398 cycles for Ray's lib	0
587 cycles for sprintf  	0

Regards herge

Title: Re: float$ macro and algo for testing
Post by: Mark Jones on September 12, 2008, 12:10:52 PM

AMD x2 Dual 4000+ x64 on WinXPx32:

Test variable precision (default: 7 digits)
0.0012345:      1.23456e-03
0.012345:       0.01234568
0.12345:        0.1234568
1.2345:         1.234568
1.2345:         1.
1.2345:         1.2
1.2345:         1.23
1.2345:         1.235
1.2345:         1.2346
1.2345:         1.23457
1.2345:         1.234568
1.2345:         1.2345679
1.2345:         1.23456789
1.2345:         1.234567890
1.2345:         1.2345678901
1.2345:         1.23456789012
1.2345:         1.234567890123
1.2345:         1.2345678901235
1.2345:         1.23456789012346
12.345:         12.34568
12345.678:      12345.68
1.2345e+9:      1.23456e+09
1.2345e+12:     1.23456e+12
1.2345e+15:     1.23456e+15

Code sizes and FPU register preservation:
FloatToStr      size=895, ST 6-8 trashed
float$          size=913, all ST regs preserved
Ray's lib       size=700, all ST regs preserved
crt sprintf     size=???, all ST regs preserved

finit is ON     Version 1.2, 12 September 2008
Credits to drizz for the qwtoa algo

540 cycles for FloatToStr       1.234568e-004
461 cycles for float$ REAL4     1.23456e-05
476 cycles for float$ REAL8     1.23456e-04
479 cycles for float$ REAL10    1.23456e-03
1000 cycles for Ray's lib       0.001235
4180 cycles for sprintf         0.0001234568

---------
455 cycles for FloatToStr       0.1234568
384 cycles for float$ REAL4     0.1234568
384 cycles for float$ REAL8     0.1234568
386 cycles for float$ REAL10    0.1234568
978 cycles for Ray's lib        0.123457
3455 cycles for sprintf         0.1234568

---------
470 cycles for FloatToStr       1.234568
425 cycles for float$ REAL4     1.234568
424 cycles for float$ REAL8     1.234568
427 cycles for float$ REAL10    1.234568
1011 cycles for Ray's lib       1.234568
4177 cycles for sprintf         1.234568

---------
458 cycles for FloatToStr       1234.568
404 cycles for float$ REAL4     1234.568
421 cycles for float$ REAL8     1234.568
412 cycles for float$ REAL10    1234.568
1008 cycles for Ray's lib       1234.567890
4111 cycles for sprintf         1234.568

---------
564 cycles for FloatToStr       1.234568e+123
458 cycles for float$ REAL4     1.23456e+23
501 cycles for float$ REAL8     1.23456e+123
504 cycles for float$ REAL10    1.23456e+123
1056 cycles for Ray's lib       1.234567890123457E+0123
5695 cycles for sprintf         1.234568e+123

---------
546 cycles for FloatToStr       -1.234568e-123
459 cycles for float$ REAL4     -1.23456e-23
503 cycles for float$ REAL8     -1.23456e-123
505 cycles for float$ REAL10    -1.23456e-123
978 cycles for Ray's lib        -0.000000
5679 cycles for sprintf         -1.234568e-123

---------
9 cycles for FloatToStr         0
132 cycles for float$ REAL4     0
131 cycles for float$ REAL8     0
134 cycles for float$ REAL10    0
398 cycles for Ray's lib        0
842 cycles for sprintf          0

Title: Re: float$ macro and algo for testing
Post by: jj2007 on September 12, 2008, 01:40:01 PM

Thanxalot, Herge & Mark!

Title: Re: float$ macro and algo for testing
Post by: sinsi on September 12, 2008, 02:03:49 PM

Hey jj can you reduce the size of the posts? There's a lot of b i g posts here...and here's another.
Q6600 2.4GHz - I can test it on an Athlon 2600+ if you like (it's a small hassle).

Code Select


Credits to drizz for the qwtoa algo

462 cycles for FloatToStr       1.234568e-004
295 cycles for float$ REAL4     1.23456e-05
300 cycles for float$ REAL8     1.23456e-04
299 cycles for float$ REAL10    1.23456e-03
1174 cycles for Ray's lib       0.001235
4132 cycles for sprintf         0.0001234568

---------
504 cycles for FloatToStr       0.1234568
278 cycles for float$ REAL4     0.1234568
277 cycles for float$ REAL8     0.1234568
272 cycles for float$ REAL10    0.1234568
1178 cycles for Ray's lib       0.123457
3341 cycles for sprintf         0.1234568

---------
682 cycles for FloatToStr       1.234568
279 cycles for float$ REAL4     1.234568
277 cycles for float$ REAL8     1.234568
278 cycles for float$ REAL10    1.234568
1185 cycles for Ray's lib       1.234568
4577 cycles for sprintf         1.234568

---------
679 cycles for FloatToStr       1234.568
277 cycles for float$ REAL4     1234.568
283 cycles for float$ REAL8     1234.568
282 cycles for float$ REAL10    1234.568
1239 cycles for Ray's lib       1234.567890
4306 cycles for sprintf         1234.568

---------
499 cycles for FloatToStr       1.234568e+123
300 cycles for float$ REAL4     1.23456e+23
317 cycles for float$ REAL8     1.23456e+123
311 cycles for float$ REAL10    1.23456e+123
1241 cycles for Ray's lib       1.234567890123457E+0123
5340 cycles for sprintf         1.234568e+123

---------
484 cycles for FloatToStr       -1.234568e-123
302 cycles for float$ REAL4     -1.23456e-23
351 cycles for float$ REAL8     -1.23456e-123
350 cycles for float$ REAL10    -1.23456e-123
1189 cycles for Ray's lib       -0.000000
5708 cycles for sprintf         -1.234568e-123

---------
9 cycles for FloatToStr         0
126 cycles for float$ REAL4     0
128 cycles for float$ REAL8     0
127 cycles for float$ REAL10    0
398 cycles for Ray's lib        0
585 cycles for sprintf          0

Funny how we use 'code' tags to quote something...

Title: Re: float$ macro and algo for testing
Post by: jj2007 on September 12, 2008, 02:16:55 PM

Quote from: sinsi on September 12, 2008, 02:03:49 PM
Hey jj can you reduce the size of the posts? There's a lot of b i g posts here...and here's another.

ok, next version will output only the "typical" range, i.e. 123.456 etc.; although it's unfair to the MasmLib FloatToStr, which is particularly slow in that range.

Quote
Q6600 2.4GHz - I can test it on an Athlon 2600+ if you like (it's a small hassle).

Yes please. The AMD seems to show the smallest improvement against the MasmLib algo.
I could shave off a few cycles by cutting down the macro, but then... I wanted to have this feature:

Code Select

.data
Sales2006	REAL8 300.0
Sales2007	REAL8 309.6
.code
	print float$('\nMarketing report:\nSales were up %2f% in 2007\n', Sales2007/Sales2006-1*100)

Output:

Code Select

Marketing report:
Sales were up 3.2% in 2007
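
The output above suggests that float$ applies its operators strictly from left to right, without operator precedence: (Sales2007/Sales2006 - 1) * 100 gives 3.2, whereas conventional precedence would give roughly -98.97. A minimal Python sketch of that evaluation order follows; `eval_left_to_right` is a hypothetical helper for illustration, not part of float$, and the left-to-right rule is an assumption inferred from the printed result.

```python
import operator

# Hypothetical helper illustrating strict left-to-right evaluation;
# this is an assumption about float$ inferred from the output above.
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def eval_left_to_right(first, *pairs):
    """Apply (op, operand) pairs to an accumulator in the order given."""
    acc = float(first)
    for op, operand in pairs:
        acc = OPS[op](acc, operand)
    return acc

sales_2006, sales_2007 = 300.0, 309.6

# Sales2007/Sales2006-1*100, taken strictly left to right
growth = eval_left_to_right(sales_2007, ("/", sales_2006), ("-", 1), ("*", 100))
# The same expression under conventional operator precedence
conventional = sales_2007 / sales_2006 - 1 * 100

print("left to right: %.1f" % growth)
print("precedence:    %.3f" % conventional)
```

If that reading is right, an expression that needs precedence has to be reordered by hand before it is passed to the macro.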

:bg

Title: Re: float$ macro and algo for testing
Post by: sinsi on September 12, 2008, 02:46:50 PM

Your wish is my command...
Athlon XP 2600+ at 2.13GHz, XP Home SP3, 1.5GB RAM

Code Select


584 cycles for FloatToStr       1.234568e-004
489 cycles for float$ REAL4     1.23456e-05
525 cycles for float$ REAL8     1.23456e-04
528 cycles for float$ REAL10    1.23456e-03
1017 cycles for Ray's lib       0.001235
4811 cycles for sprintf         0.0001234568

---------
502 cycles for FloatToStr       0.1234568
416 cycles for float$ REAL4     0.1234568
407 cycles for float$ REAL8     0.1234568
408 cycles for float$ REAL10    0.1234568
1006 cycles for Ray's lib       0.123457
3915 cycles for sprintf         0.1234568

---------
520 cycles for FloatToStr       1.234568
453 cycles for float$ REAL4     1.234568
450 cycles for float$ REAL8     1.234568
442 cycles for float$ REAL10    1.234568
1023 cycles for Ray's lib       1.234568
4638 cycles for sprintf         1.234568

---------
505 cycles for FloatToStr       1234.568
427 cycles for float$ REAL4     1234.568
433 cycles for float$ REAL8     1234.568
430 cycles for float$ REAL10    1234.568
1035 cycles for Ray's lib       1234.567890
4612 cycles for sprintf         1234.568

---------
601 cycles for FloatToStr       1.234568e+123
482 cycles for float$ REAL4     1.23456e+23
540 cycles for float$ REAL8     1.23456e+123
520 cycles for float$ REAL10    1.23456e+123
1068 cycles for Ray's lib       1.234567890123457E+0123
6097 cycles for sprintf         1.234568e+123

---------
592 cycles for FloatToStr       -1.234568e-123
512 cycles for float$ REAL4     -1.23456e-23
548 cycles for float$ REAL8     -1.23456e-123
552 cycles for float$ REAL10    -1.23456e-123
1002 cycles for Ray's lib       -0.000000
6164 cycles for sprintf         -1.234568e-123

---------
12 cycles for FloatToStr        0
143 cycles for float$ REAL4     0
141 cycles for float$ REAL8     0
142 cycles for float$ REAL10    0
361 cycles for Ray's lib        0
941 cycles for sprintf          0

Not as bad as I thought.

jj, I am in the mood to try and overclock my quadcore beastie.
It seems that a lot of these speed tests rely on clock speed.
What do you reckon? Write the code and I will roll with the hoops...

Title: Re: float$ macro and algo for testing
Post by: jj2007 on September 12, 2008, 07:49:01 PM

Quote from: sinsi on September 12, 2008, 02:46:50 PM
Your wish is my command...
Athlon XP 2600+ at 2.13GHz, XP Home SP3, 1.5GB RAM

Grazie!

Quote
It seems that a lot of these speed tests rely on clock speed.

Normally Michael's counterXX macro should produce cycles independently of clock speed. But I might be wrong.

Attached one more for the road. Minor bug fixes (the exponent was one position too far to the left), 10% less size, 10% more speed.

EDIT: Obsolete - see end of page 2 for March 30 version

[attachment deleted by admin]

Title: Re: float$ macro and algo for testing
Post by: sinsi on September 13, 2008, 03:21:02 AM

Tried it at 2.4 (normal), 2.6, 2.8 and 3GHz with no real differences, so clock speed doesn't have anything to do with it I guess.
Here's the latest at 2.4

Code Select


428 cycles for FloatToStr       1.234568e-004
282 cycles for float$ REAL4     1.234568e-05
256 cycles for float$ REAL8     0.0001234568
267 cycles for float$ REAL10    0.001234568
1171 cycles for Ray's lib       0.001235
4061 cycles for sprintf         0.0001234568

---------
633 cycles for FloatToStr       1.234568
265 cycles for float$ REAL4     1.234568
272 cycles for float$ REAL8     1.234568
273 cycles for float$ REAL10    1.234568
1191 cycles for Ray's lib       1.234568
4456 cycles for sprintf         1.234568

---------
635 cycles for FloatToStr       1234.568
260 cycles for float$ REAL4     1234.568
263 cycles for float$ REAL8     1234.568
264 cycles for float$ REAL10    1234.568
1233 cycles for Ray's lib       1234.567890
4261 cycles for sprintf         1234.568

[attachment deleted by admin]

Title: Re: float$ macro and algo for testing
Post by: jj2007 on September 13, 2008, 06:27:45 AM

Quote from: sinsi on September 13, 2008, 03:21:02 AM
Tried it at 2.4 (normal), 2.6, 2.8 and 3GHz with no real differences, so clock speed doesn't have anything to do with it I guess.

Quod erat demonstrandum :U

Quote
Code Select
633 cycles for FloatToStr       1.234568
265 cycles for float$ REAL4     1.234568

I love it. Unfortunately, I am no good at SSE2...

Title: float$ demo
Post by: jj2007 on September 14, 2008, 10:50:38 PM

After quite a bit of testing, here is the float$ macro for casual use (attached - the usual disclaimers apply). It is pretty flexible, does not trash any FPU registers, and in its basic variant it is roughly twice as fast as the MasmLib FloatToStr routine; not to mention the crt sprintf routine, which is slower by a factor of about 15:

Code Select

282 cycles for 4*float$         1234.568
618 cycles for 4*FloatToStr     1234.568

float$: reg32, Real4, Real8, Real10
MasmLib FloatToStr: 4*Real8

617 cycles for FloatToStr       1.234568
277 cycles for float$ REAL4     1.234568
258 cycles for float$ REAL8     1.234568
277 cycles for float$ REAL10    1.234568
1113 cycles for Ray's lib       1.234568
4533 cycles for sprintf         1.234568

Code sizes and FPU register preservation:
FloatToStr      size=895, ST 6-8 trashed
float$          size=823, all ST regs preserved
Ray's lib       size=700, all ST regs preserved
crt sprintf     size=???, all ST regs preserved

Usage:
   Basic:
   print float$(MyReal10)
   print float$(MyInt32)
   mov al, 123
   print float$(al)

   Simple:
   print float$("The number %f is very high", MyReal10)

   Four digits precision (n=1-15, use uppercase A-F for 10-15 digits):
   print float$("The number %4f is very high", MyReal10)

   Simple calculations: You can mix up to 5 registers (1-4 bytes), integers, immediate numbers, local and global variables:
   mov ecx, 3000   ; Caution: edx cannot be used here, eax not after an immediate integer
   print float$("Divide ecx by 10, add Sales2005,\nadd 10, mul 111.111;\nresult=%f", ecx/10+@Sales2005+10*111.1111)
   use %f (or %Af, 10 digits precision) as placeholder for the number; use \n for newline, \t for tab
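
The precision character in %nf described above is a single character: 1-9 for one to nine digits, uppercase A-F for 10 to 15. That is effectively one hexadecimal digit. A small Python sketch of the mapping, with a hypothetical `precision_from_spec` helper; how float$ parses the character internally is not shown in this thread.

```python
def precision_from_spec(ch):
    """Map one precision character ('1'-'9', 'A'-'F') to a digit count."""
    if len(ch) != 1 or ch not in "123456789ABCDEF":
        raise ValueError("expected 1-9 or uppercase A-F, got %r" % ch)
    return int(ch, 16)

for spec in ("4", "9", "A", "F"):
    print("%%%sf -> %d digits" % (spec, precision_from_spec(spec)))
```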

EDIT: Obsolete - see end of page 2 for March 30 version

[attachment deleted by admin]

Title: float$ macro
Post by: jj2007 on March 28, 2009, 01:31:40 PM

Updated with slightly higher precision:

PI = 3.14159265358979324

Extract all to a temporary folder, then double-click on Float2Asc_INSTALL.bat

Precision will be somewhat lower for very high and very low exponents.
The usual disclaimers apply.

EDIT: Obsolete - see end of page 2

[attachment deleted by admin]

Title: Re: float$ macro and algo for testing
Post by: herge on March 28, 2009, 07:35:41 PM

Hi jj2007:

The latest results from My Computer.

Code Select

 Saturday, March 28, 2009 3:32 PM 
------- New float$ macro: ---------------------
Divide	MyReal10	(=1.2345678e9)
by	12345678	(=1.2e7, in eax)
add	 11.1111   	(an immediate real)
Result=	111.1111
-- This para printed by one line of code! -----


Code sizes and FPU register preservation:
FloatToStr	size=895, ST 6-8 trashed
float$   	size=771, all ST regs preserved
Ray's lib	size=700, all ST regs preserved
crt sprintf	size=???, all ST regs preserved

finit is ON	Version 1.1, 1 September 2008
415 cycles for FloatToStr	1.234568e-004
428 cycles for float$ Real5	1.234568e-05
429 cycles for float$ Real8	1.234568e-04
424 cycles for float$ Real10	0.001234568
1167 cycles for Ray's lib	0.001235
4103 cycles for sprintf  	0.0001234568

---------
448 cycles for FloatToStr	0.1234568
424 cycles for float$ Real5	0.1234568
422 cycles for float$ Real8	0.1234568
421 cycles for float$ Real10	0.1234568
1169 cycles for Ray's lib	0.123457
3323 cycles for sprintf  	0.1234568

---------
634 cycles for FloatToStr	1.234568
425 cycles for float$ Real5	1.234568
424 cycles for float$ Real8	1.234568
422 cycles for float$ Real10	1.234568
1169 cycles for Ray's lib	1.234568
4468 cycles for sprintf  	1.234568

---------
634 cycles for FloatToStr	1234.568
426 cycles for float$ Real5	1234.568
426 cycles for float$ Real8	1234.568
424 cycles for float$ Real10	1234.568
1234 cycles for Ray's lib	1234.567890
4346 cycles for sprintf  	1234.568

---------
461 cycles for FloatToStr	1.234568e+123
431 cycles for float$ Real5	1.234568e+23
448 cycles for float$ Real8	1.234568e+123
449 cycles for float$ Real10	1.234568e+123
1228 cycles for Ray's lib	1.234567890123457E+0123
5335 cycles for sprintf  	1.234568e+123

---------
430 cycles for FloatToStr	-1.234568e-123
430 cycles for float$ Real5	-1.234568e-23
447 cycles for float$ Real8	-1.234568e-123
448 cycles for float$ Real10	-1.234568e-123
1164 cycles for Ray's lib	-0.000000
5688 cycles for sprintf  	-1.234568e-123

---------
10 cycles for FloatToStr	0
66 cycles for float$ Real5	0
65 cycles for float$ Real8	0
57 cycles for float$ Real10	0
400 cycles for Ray's lib	0
580 cycles for sprintf  	0

Regards herge

Title: Re: float$ macro and algo for testing
Post by: herge on March 30, 2009, 01:10:09 AM

Hi jj2007:

I thought PI was a constant you loaded in to the FPU.

Code Select

 D9 EB FLDPI

Regards herge

Title: Re: float$ macro and algo for testing
Post by: UtillMasm on March 30, 2009, 07:34:35 AM

Question 1
C:\FloatStr.asm
This file contains characters which can be lost in current encoding.
Do you want to select one of other encoding options?
==================================
Windows-936 is not right
Line: 1656
; ?
Windows-1252 is ok?
Line: 1656
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
Question 2
==================================
Line: 1659
OPT_Linker   link
OPT_Icon   Calc
OPT_WAIT   1
OPT_Susy   CONSOLE
OPT_Tmp2Asm   1
What do these lines mean?

Title: Re: float$ macro and algo for testing
Post by: jj2007 on March 30, 2009, 08:39:56 AM

Quote from: herge on March 30, 2009, 01:10:09 AM
Hi jj2007:

I thought PI was a constant you loaded in to the FPU.

Code Select
D9 EB FLDPI

Regards herge

Hi Herge,
You are right, it could be done that way, but I don't see an elegant way to tell the Float2Asc proc that it should take PI instead of a "normal" number. It would take an extra magic number and/or a branch - too much of a hassle. That's why I chose a REAL10 in .data to show PI.

Quote from: UtillMasm on March 30, 2009, 07:34:35 AM
Question 1
C:\FloatStr.asm
This file contains characters which can be lost in current encoding.
Do you want to select one of other encoding options?
==================================
Windows-936 is not right
Line: 1656
; ?
Windows-1252 is ok?
Line: 1656

Hi UtillMasm,

Which editor/IDE/assembler are you using? The float$ macro uses ° (Ascii 176). I never saw this error message using RichMasm (http://www.masm32.com/board/index.php?topic=9044.msg73814#msg73814), and the usual assemblers (ml, JWasm) swallow it without problems.

Quote

Question 2
==================================
Line: 1659
OPT_Linker   link
OPT_Icon   Calc
OPT_WAIT   1
OPT_Susy   CONSOLE
OPT_Tmp2Asm   1
What do these lines mean?

RichMasm options specifying which linker to use, an icon, subsystem CONSOLE, and whether you want to keep the (otherwise temporary) FloatStr.asm file.

Obsolete, see next post: Since both Herge and UtillMasm downloaded a much older version, I have updated everything and attach it here. Unzip to a temporary folder and double-click on Float2Asc_INSTALL.bat (the usual disclaimers apply).

[attachment deleted by admin]

Title: Re: float$ macro and algo for testing
Post by: jj2007 on July 06, 2009, 12:09:08 AM

float$ was somewhat inexact for numbers close to 10^n, so I rewrote it and called it Str$.
Download it from this thread (http://www.masm32.com/board/index.php?topic=11781.msg89037#msg89037).

Title: Re: float$ macro and algo for testing
Post by: ToutEnMasm on July 06, 2009, 05:33:13 AM

The math.sdk file (translated from VC++ 2008) has some important constant values defined.

Quote
M_PI equ < 3.14159265358979323846>
jj2007 PI = 3.14159265358979324

It is better to use this file. It can be added to windows.inc, with translate.inc coming before any .sdk file.
It still needs to be verified that the masm32 float functions follow the IEEE specification. I am not sure of that (some features may be missing).
The IEEE standard for floating point arithmetic
http://www.psc.edu/general/software/packages/ieee/ieee.php

Title: Re: float$ macro and algo for testing
Post by: jj2007 on July 06, 2009, 06:22:17 AM

Quote from: ToutEnMasm on July 06, 2009, 05:33:13 AM

The math.sdk file (translated from VC++ 2008) has some important constant values defined.
Quote
M_PI equ < 3.14159265358979323846>
jj2007 PI = 3.14159265358979324
It is better to use this file.

ToutEnMasm,

you should at least try 1. to read posts properly and 2. to understand them.

Title: Re: float$ macro and algo for testing
Post by: MichaelW on July 06, 2009, 06:23:42 AM

AFAIK a REAL10 cannot represent 21 digits. Using TC 3.0, for which the CRT does support an 80-bit long double, this code:

Code Select


#include <conio.h>   /* getch() */
#include <math.h>
#include <stdio.h>
int main()
{
    long double pi10 = 3.14159265358979323846L;
    long double pif10;
    double pif8;
    asm {
        fldpi
        fstp pif10
        fldpi
        fstp pif8
    }
    printf( "3.14159265358979323846\n" );
    printf( "%.19Lf\n", (long double) pi10 );
    printf( "%.19Lf\n", (long double) pif10 );
    printf( "%.19f\n", (double) pif8 );
    getch();
    return 0;
}

Produces these results:

Code Select


3.14159265358979323846
3.1415926535897932400
3.1415926535897932400
3.1415926535897931200
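
Why 21 digits do not fit can be estimated from the significand width: each bit carries log10(2), about 0.301, decimal digits. The Python sketch below applies that to the three formats discussed here. It is a rule of thumb only, and the widths are the usual ones for these formats (counting the leading 1 bit), not something measured in this thread.

```python
import math

# Significand widths in bits, including the leading 1 bit
# (implied for REAL4/REAL8, stored explicitly for REAL10).
WIDTHS = {"REAL4": 24, "REAL8": 53, "REAL10": 64}

def decimal_digits(bits):
    """Approximate number of decimal digits carried by a binary significand."""
    return bits * math.log10(2)

for name, bits in WIDTHS.items():
    print("%-6s %2d bits  ~%.2f decimal digits" % (name, bits, decimal_digits(bits)))
```

For REAL10 this comes out between 19 and 20 digits, so digits beyond the 19th or 20th in a printed value say more about the print routine than about the stored number.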

Title: Re: float$ macro and algo for testing
Post by: jj2007 on July 06, 2009, 07:03:11 AM

Quote from: MichaelW on July 06, 2009, 06:23:42 AM
AFAIK a REAL10 cannot represent 21 digits....
Produces these results:
Code Select
3.14159265358979323846
3.1415926535897932400
3.1415926535897932400
3.1415926535897931200

Exactly. After a fldpi, Olly shows 3.1415926535897932380. The same identical value is obtained when pushing the value of 3.14159265358979323846 proposed by ToutEnMasm.

My new Str$ produces 3.14159265358979324, which is technically speaking more correct than the 32400 which wrongly pretends a higher precision, see here (http://www.physics.uoguelph.ca/tutorials/sig_fig/SIG_dig.htm), point d.

Title: Re: float$ macro and algo for testing
Post by: ToutEnMasm on July 06, 2009, 07:57:46 AM

To jj2007,
Be cool, I was saying that the CRT gives us all we need to use floating point.
atof, ftoa are very useful.
Add to this the strsafe library, and there is a lot of work to do with masm.
Useful comparisons can be made between the two.

Title: Re: float$ macro and algo for testing
Post by: jj2007 on July 06, 2009, 08:06:24 AM

Quote from: ToutEnMasm on July 06, 2009, 07:57:46 AM

To jj2007,
Be cool, I was saying that the CRT gives us all we need to use floating point.
atof, ftoa are very useful.
Add to this the strsafe library, and there is a lot of work to do with masm.
Useful comparisons can be made between the two.

I love useful comparisons. Please post some code showing the precision and the speed of your "alternatives". I suggest you use the value of 3.14159265358979323846 that you posted yourself.

Title: Re: float$ macro and algo for testing
Post by: ToutEnMasm on July 06, 2009, 09:12:16 AM

I am not Microsoft, it's not my value. It's the value given by the math.h header file.

For the moment, the value can be put in a DT; a 32-bit build translates it like this:

Quote
35 c2 68 21 a2 da 0f c9-00 40 ;dump of adress of the dt

viewed as follows by windbg

Quote
4000c90fdaa22168c235
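
The two quotes show the same ten bytes: the memory dump lists them lowest address first (little-endian), while windbg prints the value as one number, most significant byte first. A short Python check of that reversal; the dump string is copied from the quote above, with the hyphen replaced by a space.

```python
# Memory dump of the DT (REAL10) value, lowest address first
dump = "35 c2 68 21 a2 da 0f c9 00 40"

raw = bytes.fromhex(dump)          # bytes.fromhex skips the spaces
as_number = raw[::-1].hex()        # most significant byte first

print(as_number)
```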

fld accepts the DT.
Next I need a viewer that can show 20 digits after the decimal point.
I will search for one.

Title: Re: float$ macro and algo for testing
Post by: jj2007 on July 06, 2009, 09:40:48 AM

Quote from: ToutEnMasm on July 06, 2009, 09:12:16 AM

Next I need a viewer that can show 20 digits after the decimal point.

Hmmmm.... I thought you had already a "viewer":

Quote from: ToutEnMasm on July 06, 2009, 07:57:46 AM
the CRT gives us all we need to use floating point.

Title: Re: float$ macro and algo for testing
Post by: herge on July 06, 2009, 10:45:18 AM

Hi JJ:

06:44 AM EST July 6 - Monday 2009
I thought I saw a puddy Cat!

I like your avatar!

Regards herge

Title: Re: float$ macro and algo for testing
Post by: sinsi on July 06, 2009, 10:48:33 AM

Hah! so this is jj2007 vs ToutEnMasm! But we know who always wins between those two avatars... :lol

Title: Re: float$ macro and algo for testing
Post by: ToutEnMasm on July 06, 2009, 12:23:05 PM

Fans of Titi (Tweety), see this one in French

http://video.google.com/videoplay?docid=2852273791155524323

Title: Re: float$ macro and algo for testing
Post by: MichaelW on July 06, 2009, 02:43:13 PM

Borland used the same value:

#define M_PI 3.14159265358979323846

As does MinGW:

#define M_PI 3.14159265358979323846

I think the idea was to exceed the ~19 digits that the FPU can handle and truncate the value at a point where the next digit was < 5.

3.1415926535897932384626433832795...

Title: Re: float$ macro and algo for testing
Post by: dedndave on July 06, 2009, 02:45:56 PM

i thought that, with MASM, you could define it to as many places as you like, and the assembler rounds it to fit the define type
i suppose that leaves a big spot for the assembler to make a mistake - lol
(we all know how perfect MASM is, right?)
i guess that also means that different assemblers might yield different results
to fix that, define it in raw float bytes ?

Title: Re: float$ macro and algo for testing
Post by: jj2007 on July 06, 2009, 02:56:16 PM

Good point - what can the FPU handle, internally? A snippet for inspiration:

   fldpi      ; we push the exact PI
   fst MyPI8      ; a REAL8 for crt_printf
   fstp MyPI      ; a REAL10 for Str$
   fld MyPI      ; original value
   mov al, byte ptr MyPI
   dec eax
   mov byte ptr MyPI, al
   fld MyPI
   inc eax
   mov byte ptr MyPI, al
   fld MyPI
   inc eax
   mov byte ptr MyPI, al
   fld MyPI
   inc eax
   mov byte ptr MyPI, al
   fld MyPI
   inc eax
   mov byte ptr MyPI, al
   fld MyPI
   inc eax
   mov byte ptr MyPI, al
   fld MyPI
   inc eax
   mov byte ptr MyPI, al
   fld MyPI
   int 3

Title: Re: float$ macro and algo for testing
Post by: ToutEnMasm on July 06, 2009, 03:47:33 PM

Results with the FPU viewer (libcmt).
Three values of PI: jj_PI, SDK_PI, fldpi_value

Quote
;35 c2 68 21 a2 da 0f c9-00 40 ;dump of memory
;4000c90fdaa22168c235 ; the DT show by windbg

Quote
jj_PI dt thevalue
SDK_PI dt the value
the two show the same value in memory ;dump of memory

Printout of the values loaded in the FPU registers; all three are the same

Quote
3.14159265358979310000 ;float print
54442D18h ;hexa print

Title: Re: float$ macro and algo for testing
Post by: MichaelW on July 06, 2009, 04:46:46 PM

Using TC 3.0 again, this (16-bit DOS) code displays the values in the vicinity of pi that the REAL10 format can represent. The basis for this code is here (http://www.cygnus-software.com/papers/comparingfloats/Comparing%20floating%20point%20numbers.htm).

Code Select


#include <conio.h>   /* getch() */
#include <math.h>
#include <stdio.h>
int main()
{
    long double pi10;
    int i=0;
    asm {
        fldpi
        fstp pi10
    }
    printf( "\t3.14159265358979323846\n" );
    asm {
        lea bx, pi10
        sub WORD PTR [bx], 10
        sbb WORD PTR [bx+2], 0
        sbb WORD PTR [bx+4], 0
        sbb WORD PTR [bx+6], 0
        sbb WORD PTR [bx+8], 0
    }
    printf( "%d\t%.19Lf\n", i-10, (long double) pi10 );
    for( i=1; i<60; i++ )
    {
        asm {
            lea bx, pi10
            add WORD PTR [bx], 1
            adc WORD PTR [bx+2], 0
            adc WORD PTR [bx+4], 0
            adc WORD PTR [bx+6], 0
            adc WORD PTR [bx+8], 0
        }
        if( i < 20 || i == 52 || i == 53 )
          printf( "%d\t%.19Lf\n", i-10, (long double) pi10 );
    }
    getch();
    return 0;
}

The representable values in the immediate vicinity of pi are so close together that the Borland CRT cannot display the differences. To see a difference, it was necessary to go 6 values down or 43 values up. Perhaps Olly can do better.

Code Select


        3.14159265358979323846
-10     3.1415926535897932300
-9      3.1415926535897932300
-8      3.1415926535897932300
-7      3.1415926535897932300
-6      3.1415926535897932300
-5      3.1415926535897932400
-4      3.1415926535897932400
-3      3.1415926535897932400
-2      3.1415926535897932400
-1      3.1415926535897932400
0       3.1415926535897932400
1       3.1415926535897932400
2       3.1415926535897932400
3       3.1415926535897932400
4       3.1415926535897932400
5       3.1415926535897932400
6       3.1415926535897932400
7       3.1415926535897932400
8       3.1415926535897932400
9       3.1415926535897932400
42      3.1415926535897932400
43      3.1415926535897932500

Title: Re: float$ macro and algo for testing
Post by: dedndave on July 06, 2009, 04:57:50 PM

if this is the value from the fpu fldpi instruction:

Quote
35 c2 68 21 a2 da 0f c9 00 40

then these should be correct
Pi_80 label tbyte
db 35h,0C2h,68h,21h,0A2h,0DAh,0Fh,0C9h,0,40h

Pi_64 label qword
db 69h,21h,0A2h,0DAh,0Fh,0C9h,0,40h

Pi_32 label dword
db 10h,0C9h,0,40h

Title: Re: float$ macro and algo for testing
Post by: jj2007 on July 06, 2009, 05:17:13 PM

Quote from: MichaelW on July 06, 2009, 04:46:46 PM
Perhaps Olly can do better.

Marginally - I get 370, 380, 380, 400, 400, 410, 420 for increments of 4 of the lowest byte of MyPI10.
I must admit I am a bit confused - I always thought that REAL10 and FPU use the same precision, and therefore had assumed that incrementing the lowbyte would result in a change of respective ST(n)...
3.141592653589793238 is 19 digits

Title: Re: float$ macro and algo for testing
Post by: dedndave on July 06, 2009, 05:24:50 PM

well - that tells me the float to string routine has a problem
although, you can't expect perfection because the float format is base2 and the ascii string is base10
whenever you perform a base conversion, there are going to be rounding errors, however small
if the string routine returned a few more digits, you might see a more linear pattern
but, those digits are really meaningless, as they represent a value less than +/- 1/2 LSB of the float
that is why 80 bits are generally used for calculations, but are not really intended for display

EDIT:
there is no reason you couldn't write a test routine that performs the conversion with, say, 88 bits
that would let you see the effect of stepping one LSB with more resolution
in fact, 96 bits would be easy enough - speed is not an issue

Title: Re: float$ macro and algo for testing
Post by: jj2007 on July 06, 2009, 05:40:29 PM

Quote from: dedndave on July 06, 2009, 05:24:50 PM
well - that tells me the float to string routine has a problem

The values are from Olly's display, and they have the 19 digits that are theoretically possible.

Quote
that is why 80 bits are generally used for calculations, but are not really intended for display

The 80 bits of the Real10 format include a 64-bit mantissa, so a bit more than 19 digits are possible (64 x log10(2) is about 19.3). This is why I do not understand why fumbling with the lowest byte does not change the display in Olly.

Title: Re: float$ macro and algo for testing
Post by: dedndave on July 06, 2009, 05:42:14 PM

here is a simpler way, perhaps...
it might be interesting to see the result if you stripped the exponent and sign bits from the 80-bit value
then, display the remaining 64-bit value as an unsigned integer using one of the reliable routines we were playing with in the other thread

EDIT:
one little flaw, here
due to the implied leading 1 of the float format, a 64-bit mantissa is actually a 65 bit value
but, that could be ignored if you are only looking for the LSB delta
of course, you will see a delta of 1, but you get the idea

Title: Re: float$ macro and algo for testing
Post by: dedndave on July 06, 2009, 05:45:25 PM

Quote
The values are from Olly's display, and they have the 19 digits that are theoretically possible.

there are more digits there because the true binary value is a much longer number
they simply have little meaning because they are beyond the represented precision (i.e. less than +/- 1/2 LSB)
because of this fact, display routines do not show them
that does not mean you cannot examine those digits for test purposes

Title: Re: float$ macro and algo for testing
Post by: MichaelW on July 06, 2009, 06:23:12 PM

Near the ends of the range the representable values are further apart.

Incrementing the low-order byte alone will not work correctly when the operation generates a carry.

For the REAL10 format the 1 in the first bit is stored, so the significand is 64 bits.
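
The carry problem can be shown on the raw bytes: incrementing byte 0 alone wraps from FFh to 00h without touching byte 1, so the value jumps back by 255 steps instead of moving forward by one. Treating bytes 0-7 as one little-endian 64-bit integer carries correctly. The Python sketch below uses hypothetical helper names and an illustrative starting value near pi whose low byte is already FFh; a carry out of the significand into the exponent is deliberately not handled.

```python
def bump_low_byte(real10):
    """Increment only byte 0, as an 8-bit increment of the low byte would."""
    out = bytearray(real10)
    out[0] = (out[0] + 1) & 0xFF       # wraps silently, no carry out
    return bytes(out)

def bump_significand(real10):
    """Increment the 64-bit significand (bytes 0-7) with carry between bytes."""
    sig = int.from_bytes(real10[:8], "little") + 1
    if sig >> 64:
        raise OverflowError("carry out of the significand is not handled here")
    return sig.to_bytes(8, "little") + real10[8:]

def significand(real10):
    """Return bytes 0-7 as an unsigned 64-bit integer."""
    return int.from_bytes(real10[:8], "little")

# A REAL10 near pi whose low byte is already FFh (illustrative value)
start = bytes.fromhex("ff c2 68 21 a2 da 0f c9 00 40")

print(hex(significand(start)))
print(hex(significand(bump_low_byte(start))))     # low byte wrapped to 00
print(hex(significand(bump_significand(start))))  # carry reached byte 1
```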

Title: Re: float$ macro and algo for testing
Post by: dedndave on July 06, 2009, 06:26:42 PM

ahhhh - ty Michael
so - it is easy to show the true binary value, and simply ignore the decimal placement and sign

Title: Re: float$ macro and algo for testing
Post by: MichaelW on July 06, 2009, 06:44:03 PM

Same idea for a REAL4, with the representable values displayed as a REAL8.

Code Select


; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
    include \masm32\include\masm32rt.inc
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
    .data
      r8  REAL8 0.0
      r4 REAL4 0.0
    .code
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
start:
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
    fldpi
    fstp r4
    sub DWORD PTR r4, 10
    mov ebx, 21
    mov esi, -10
    .WHILE ebx
      inc DWORD PTR r4
      fld r4
      fstp r8
      invoke crt_printf, chr$("%d",9,"%.17f",10), esi, r8
      dec ebx
      inc esi
    .ENDW

    inkey "Press any key to exit..."
    exit
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
end start

Code Select


-10     3.14159059524536130
-9      3.14159083366394040
-8      3.14159107208251950
-7      3.14159131050109860
-6      3.14159154891967770
-5      3.14159178733825680
-4      3.14159202575683590
-3      3.14159226417541500
-2      3.14159250259399410
-1      3.14159274101257320
0       3.14159297943115230
1       3.14159321784973140
2       3.14159345626831050
3       3.14159369468688960
4       3.14159393310546870
5       3.14159417152404790
6       3.14159440994262700
7       3.14159464836120610
8       3.14159488677978520
9       3.14159512519836430
10      3.14159536361694340
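
The table can be cross-checked by stepping the REAL4 bit pattern as a 32-bit integer. The Python sketch below assumes IEEE-754 single precision (via `struct`) and that fldpi followed by fstp r4 stores the pattern 40490FDBh. If so, the row labelled -1 above holds pi rounded to REAL4: the loop increments r4 before printing, so each label appears to be one lower than the actual offset from pi.

```python
import struct

def real4_from_bits(bits):
    """Reinterpret a 32-bit integer as an IEEE-754 single, widened to a Python float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

PI_BITS = 0x40490FDB        # pi rounded to REAL4 (assumed result of fstp r4)

for offset in range(-2, 3):
    print("%+d\t%.17f" % (offset, real4_from_bits(PI_BITS + offset)))
```

Adjacent values are 2^-22 (about 2.4e-7) apart, which matches the spacing between rows in the table.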

Title: Re: float$ macro and algo for testing
Post by: dedndave on July 06, 2009, 07:23:09 PM

here is the result i got - 20 usable digits

0C90FDAA22168C235h / 2^62 = 3.1415926535897932385

Pi is roughly 3.1415926535897932384626433832795

of course, if i were to draw a circle with a compass and try to measure its circumference, it would have to be precisely drawn
and i would have to use extremely fine measurement techniques to measure it with 7 usable digits of accuracy
5 digits would be more practical in day-to-day use
i would be doing well to measure Pi at 3.1416, in fact - lol
i think my good buddy Archimedes calculated it to 6 places or something like that
oops - i looked it up - Archie only got to 3.1418 - still, not bad for 250 b.c.
i find this a little disappointing, though - i thought he did better
imagine what he could have done with a pentium pc - lol (without pi already in it, of course)
the OS would be named Eureka instead of windows
and we would be getting our updates from a vacuum cleaner manufacturer

Title: Re: float$ macro and algo for testing
Post by: dedndave on July 06, 2009, 09:11:04 PM

0C90FDAA22168C234h / 2^62 = 3.1415926535897932383 (Pi-1 lsb)
0C90FDAA22168C235h / 2^62 = 3.1415926535897932385 (Pi)
0C90FDAA22168C236h / 2^62 = 3.1415926535897932387 (Pi+1 lsb)
actual value of Pi = 3.14159265358979323846
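
The division by 2^62 follows from the REAL10 layout: bytes 0-7 hold the 64-bit significand with its integer bit stored, and bytes 8-9 hold the sign plus a 15-bit exponent biased by 16383. With the exponent field 4000h the value is significand / 2^63 * 2^1. The Python sketch below decodes the dump exactly with rational arithmetic; `decode_real10` and `to_decimal_string` are hypothetical helpers, and only normal numbers are handled.

```python
from decimal import Decimal, localcontext
from fractions import Fraction

def decode_real10(raw):
    """Decode a 10-byte little-endian x87 REAL10 (normal numbers only) exactly."""
    significand = int.from_bytes(raw[:8], "little")     # integer bit is stored
    sign_exp = int.from_bytes(raw[8:], "little")
    sign = -1 if sign_exp & 0x8000 else 1
    exponent = (sign_exp & 0x7FFF) - 16383              # remove the bias
    # value = significand / 2**63 * 2**exponent
    return sign * Fraction(significand, 2 ** 63) * Fraction(2) ** exponent

def to_decimal_string(value, places):
    """Round an exact Fraction to a fixed number of decimal places."""
    with localcontext() as ctx:
        ctx.prec = 80                                   # keeps the quotient exact here
        exact = Decimal(value.numerator) / Decimal(value.denominator)
        return str(exact.quantize(Decimal("1e-%d" % places)))

pi_bytes = bytes.fromhex("35 c2 68 21 a2 da 0f c9 00 40")
pi_real10 = decode_real10(pi_bytes)
lsb = Fraction(1, 2 ** 62)

for label, value in (("Pi-1 lsb", pi_real10 - lsb),
                     ("Pi      ", pi_real10),
                     ("Pi+1 lsb", pi_real10 + lsb)):
    print(label, to_decimal_string(value, 19))
```

One step of the significand is about 2.2e-19 here, so neighbouring values differ by roughly two units in the 19th decimal place.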

Title: Re: float$ macro and algo for testing
Post by: dedndave on July 07, 2009, 12:58:42 AM

oh - btw
my short and long real values are incorrect, as i forgot about the 80 bit leading 1 not being implied

let's see if i can do this without screwing it up - lol

Pi_80 label tbyte
db 35h,0C2h,68h,21h,0A2h,0DAh,0Fh,0C9h,0,40h

Pi_64 label qword
db 18h,2Dh,44h,54h,0FBh,21h,9,40h

Pi_32 label dword
db 0DBh,0Fh,49h,40h
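
The byte sequences above can be decoded to confirm they all encode Pi. This is a sketch in Python rather than MASM; `decode_real10` is a hypothetical helper written for this illustration, it handles normal numbers only, and it rounds the result to a Python double, so the last 11 mantissa bits of the REAL10 are lost in the printed value.

```python
import math
import struct

# Little-endian byte images copied from the db lines above
PI_80 = bytes([0x35, 0xC2, 0x68, 0x21, 0xA2, 0xDA, 0x0F, 0xC9, 0x00, 0x40])
PI_64 = bytes([0x18, 0x2D, 0x44, 0x54, 0xFB, 0x21, 0x09, 0x40])
PI_32 = bytes([0xDB, 0x0F, 0x49, 0x40])

def decode_real10(raw: bytes) -> float:
    """Decode a little-endian 80-bit extended real (normal numbers only)."""
    mantissa = int.from_bytes(raw[:8], "little")   # 64 bits, leading 1 stored explicitly
    sign_exp = int.from_bytes(raw[8:], "little")   # 1 sign bit + 15 exponent bits
    sign = -1.0 if sign_exp & 0x8000 else 1.0
    exponent = (sign_exp & 0x7FFF) - 16383         # remove the exponent bias
    # float(mantissa) rounds to 53 bits; the power of two is exact
    return sign * float(mantissa) * 2.0 ** (exponent - 63)

print("REAL10:", decode_real10(PI_80))
print("REAL8: ", struct.unpack("<d", PI_64)[0])
print("REAL4: ", struct.unpack("<f", PI_32)[0])
```

The REAL8 and REAL4 images keep only the fraction bits (the leading 1 is implied), while the REAL10 image stores the full mantissa 0C90FDAA22168C235h.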

Title: Re: float$ macro and algo for testing
Post by: MichaelW on July 07, 2009, 04:39:00 AM

I had another go at this using the Digital Mars compiler version 8.50 from here (http://www.digitalmars.com/download/freecompiler.html). Unlike the Microsoft compilers, GCC, and, I think, probably many others, the DM RTL apparently does not use MSVCRT under Windows. Using my previous TC 3.0 source and changing only the precision value from 19 to 20, I get an EXE that produces these results:

Code Select


3.14159265358979323846
3.14159265358979323850
3.14159265358979323850
3.14159265358979311590

One of my goals here was to find a RTL that supports an 80-bit long double and that I could link with a MASM app. Unfortunately, the DM libraries appear to be incompatible with the Microsoft linker.

Title: Re: float$ macro and algo for testing
Post by: ToutEnMasm on July 07, 2009, 06:34:48 AM

The precision of rounded values seems to be a vast subject.
The mapm library (free) seems to offer valuable source code and documentation.

Quote
http://www.mpfr.org/

Title: Re: float$ macro and algo for testing
Post by: jj2007 on July 07, 2009, 06:45:24 AM

Quote from: MichaelW on July 07, 2009, 04:39:00 AM
I get an EXE that produces these results:
Code Select
3.14159265358979323846
3.14159265358979323850

I am stuck at 3.141592653589793239 using print Str$("\nPI=\t%Jf", MyPI) :bg

Title: Re: float$ macro and algo for testing
Post by: ToutEnMasm on July 07, 2009, 08:02:46 AM

The mapm.lib that I have compiled with VC++ 2008 Express has further samples.
This one calculates Pi in five different ways.

Quote
PI1 = [3.141592653589793238462643383279502884197169399375105820974945]
PI2 = [3.141592653589793238462643383279502884197169399375105820974945]
PI3 = [3.141592653589793238462643383279502884197169399375105820974945]
PI4 = [3.141592653589793238462643383279502884197169399375105820974945]
PI5 = [3.141592653589793238462643383279502884197169399375105820974945]

Title: Re: float$ macro and algo for testing
Post by: herge on July 07, 2009, 11:03:28 AM

Hi Mike:

Results from a used AMD Athlon 64

Press any key to continue . . .

C:\masm32\bin>FloatMw
-10 3.14159059524536130
-9 3.14159083366394040
-8 3.14159107208251950
-7 3.14159131050109860
-6 3.14159154891967770
-5 3.14159178733825680
-4 3.14159202575683590
-3 3.14159226417541500
-2 3.14159250259399410
-1 3.14159274101257320
0 3.14159297943115230
1 3.14159321784973140
2 3.14159345626831050
3 3.14159369468688960
4 3.14159393310546870
5 3.14159417152404790
6 3.14159440994262700
7 3.14159464836120610
8 3.14159488677978520
9 3.14159512519836430
10 3.14159536361694340
Press any key to exit...

Regards herge
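
The rows in the table above are evenly spaced by about 2.384e-7, which is 2^-22, the gap between neighbouring REAL4 values in the range 2 to 4. The row labelled -1 agrees with the REAL4 nearest to Pi (bit pattern 40490FDBh). Here is a small sketch in Python, not the FloatMw source, that steps through adjacent REAL4 bit patterns; `real4_from_bits` is a name invented for this example, and the offsets here count from the REAL4 Pi pattern rather than following the table's own labels.

```python
import struct

def real4_from_bits(bits: int) -> float:
    """Reinterpret a 32-bit pattern as a REAL4 and widen it to a double (exact)."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

PI_REAL4_BITS = 0x40490FDB  # REAL4 nearest to Pi

# Adding 1 to the bit pattern moves to the next representable REAL4
for offset in range(-2, 3):
    value = real4_from_bits(PI_REAL4_BITS + offset)
    print(f"{offset:3d} {value:.17f}")

step = real4_from_bits(PI_REAL4_BITS + 1) - real4_from_bits(PI_REAL4_BITS)
print("step =", step, "(2^-22 =", 2.0 ** -22, ")")
```

Python prints all requested decimals exactly, so its trailing digits can differ from a C runtime that pads with zeros after 17 significant digits.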

The MASM Forum Archive 2004 to 2012

General Forums => The Laboratory => Topic started by: jj2007 on August 26, 2008, 03:48:53 PM