News:

MASM32 SDK Description, downloads and other helpful links
MASM32.com New Forum Link
masmforum WebSite

Multiplier table

Started by jj2007, July 09, 2009, 10:50:41 AM

Previous topic - Next topic

dedndave

hmmmmmmmmmmmm
log2 and antilog2
they sound tailor made for base conversion
i.e. what are we doing trying to mul and divide by 10 - lol
use logarithms to convert reals to decimal and vica versa
then, you don't need the table

loga X = logn X / logn a

should be bloody fast, too

jj2007

Quote from: dedndave on July 10, 2009, 09:24:23 PM

the type of pattern suggests that masm may be off on some and i may be off on others


Good grief - I thought Masm was perfect :wink

Below my slightly adjusted table - I am down to 0 fails, i.e. generated and "original" table are identical. Have to leave the whole of today, tonight I want to see results, my friend :bg


.data
f2sMore REAL10 1.0000000000000000001
f2sLess REAL10 0.9999999999999999999
;         01234567890123456789012345678901
f2sMulTable\
dd     00000000000000001110011000000000b ; 0
dd     00000001001100011000000001000110b ; 4
dd     00000000110000000000000011000000b ; 8
dd     00000100001101011000000011100000b ; 12
dd     10000110010000001100000000000011b ; 16
dd     01100100000000000000000000000000b ; 20
dd     00000000011000000000000100000110b ; 24
dd     00100000011001011001000110010000b ; 28
dd     01110011000000000010001100000000b ; 32
dd     00010000011000000110000000001100b ; 36
dd     00010001001100000011000000000000b ; 40



TestCurve proc uses esi edi ebx
LOCAL f2sShCt:SDWORD, f2sShPos:DWORD, fail
mov eax, Str$(1) ; init calibrated table
mov f2sShPos, offset f2sMulTable
mov ebx, f2sMulTable
fld f2s0dot1 ; 10.0 ------- create multplier table --------
fld f2sSeed ; 1.0e-144
mov ecx, ExRange*2-1
mov edx, offset f2sR10b ; generated table
mov edi, offset f2sR10 ; compare against the real thing
add edx, ExRange*2*10-2*10
add edi, ExRange*2*10-2*10
; 12345678901234567890123456789012
NumBits = 31
m2m f2sShCt, NumBits
and fail, 0
xor esi, esi
; SHL ebx, 1 ; carry set = correction needed
@@:
inc esi
fmul st, st(1) ; create new entry
fld st ; create a copy
fstp REAL10 ptr [edx] ; store & pop
mov eax, [edx]
sub eax, [edi]
; .if eax
; int 3
; .endif
SHL ebx, 1 ; carry set = correction needed
; int 3
.if Carry?
SHL ebx, 1
.if !Carry?
fld f2sMore
.else
fld f2sLess
.endif
fmul
fld st ; create a copy
fstp REAL10 ptr [edx] ; store & pop
mov eax, [edx]
sub eax, [edi]
.if eax
inc fail
mov eax, f2sShPos
sub eax, offset f2sMulTable
pushad
print str$(eax), 9
print str$(esi), 13, 10
popad
; int 3
.endif
dec f2sShCt
.endif
dec f2sShCt
.if Sign?
mov ebx, f2sShPos
add ebx, 4
mov f2sShPos, ebx
mov ebx, [ebx]
m2m f2sShCt, NumBits
.endif
sub edx, 10
sub edi, 10
dec ecx
jne @B
fstp st
fstp st
print Str$("Fail=%i", fail)
getkey
exit
  ret
TestCurve endp

dedndave

i was thinking of diddling the RC bits in the FPU to handle the offsets - lol
but, this logarithm thing has my curiousity up a bit
i may play with a logarithm routine to convert 64-bit integers to decimal strings
once you get that going on, you can use a similar technique to convert reals
i may take some play time, myself - lol
enjoy your day, Jochen

jj2007

The attached table makes the values look exact in OllyDbg. I wonder, though, whether Masm codes bad values, or whether Olly displays good values badly. Any idea how to check that??



[attachment deleted by admin]

dedndave

#49
well, i have checked a few points manually
for the ones i checked, masm had the right stuff
here is an example of how i did them....
take the 64-bit mantissa and convert it to decimal, just as though it were a 64-bit integer
then, take the exponent, subtract 16383 (the bias), subtract 63 (63 trailing bits under the MSB)
that value tells you how many times to multipy or divide by 2

this is masm's value for 1.0e+29
E5 0B B9 36 D7 07 8F A1 5F 40

exponent
40 5F (=16479 decimal: 16479-16383-63=33  that means we need to multiply the mantissa by 2^33, if it were negative, then divide)

mantissa
A1 8F 07 D7 36 B9 0B E5
in decimal, that is
11,641,532,182,693,481,445

so to see the result, multiply
11,641,532,182,693,481,445 X 2^33

i use my 64-bit unsigned integer routine
then i made a quicky routine to multiply and divide decimal strings repeatedly by 2
slow and crude, but it is a basic 80-bit real to decimal ascii routine with no formatting
it is capable of displaying the precise evaluation of reals, even though many of the trailing digits are unusable

jj2007

Now it gets philosophical: Do we want the correct values, or those that are needed to display correct results after multiplication...??

; Input for Masm                          ; display in OllyDbg         hex in OllyDbg
e62a REAL10 1.000000000000000000043e-62 ; 1.0000000000000000001E-62, 3F31 83A3EEEE F9153E8A
e62b REAL10 1.000000000000000000042e-62 ; 9.9999999999999999990E-63, 3F31 83A3EEEE F9153E89
e62c REAL10 1.000000000000000000000e-62 ; 9.9999999999999999990E-63, 3F31 83A3EEEE F9153E89


I had the bright idea to launch WinDbg for comparison, but the coward stops at 16 digits precision for displaying an 80-bit float :tdown

dedndave

#51
we want the closest value, of course
i think a really good real2str routine would allow me to choose how many places i want to display
if i elect to see 20 decimal digits, then i should be prepared to see a lot of "9"s
if i set the length to 18, they should all go away
at 19, i think you may still see some "9"s
you might even provide a setting that calculates the "usable digits" and sets the length to that

you could have a routine where the digit count is set as a constant - so long as it remains set to one value, the routine spits em out that long
(rather than having to pass the length as a parameter each time)
of course, you could have the routine call under two different names and do both

when i want to insure they display that way, i add a small delta, then truncate - the routine could provide that function
another possibility is to have a rounding control, similar to the fpu rc bits

the deluxe has all of the above - lol
the point is, it is the duty of the display routine
we always want the value that is nearest the intended value, especially if it is to be used for calculation, later
remember, extended reals weren't meant to be used for display
they were made for intermediate calculations, to retain more precision than required for presentation

i am trying to think of an application where i would want to generate thousands upon thousands of these numbers
this is the app that would require the routine to be very fast
otherwise, we either don't need a super-fast routine, or we don't need that many digits
the only answer i can come up with would be a special scientific app
there may be many apps that may store the reals, but not display them (CAD, animation, etc) - no conversion needed for that
accounting apps aren't likely to see that many digits unless we are printing the check that the US gov't pays to haliburton
if that check were off by 1 cent, the tax-payers might have a reason to get really upset

at any rate, the features mentioned above would make the routine much more useful
i think this is a case where features outweigh speed
of course, as assembler programmers, we want both - lol

dedndave

#52
i wanted to clean up my extended real evaluation program a bit and share it
i didn't spend a lot of time on it, but it works - lol - try not to be too critical of my code
before i post it, i am going to add the decimal point, sign, and clean up leading and trailing zeros - lol
i also want to detect QNaN's, SNaN's, infinity, and unsupported values
these values are defined a little differently for the extended real format, due to the explicit MSB
the information wasn't easy to find and i thought i might post this table for those who are interested...

EDIT
notice that Signaling NaN's require at least one of the lower 62 bits to be set
this is not so for Quiet NaN's
EDIT
Quiet NaN's and Signaling NaN's may have the sign bit set to 1
EDIT
the value: ffff c0000000 00000000 is an "Indefinate"
a special case of the set of Quiet NaN values
EDIT
all other values are "Invalid"

dedndave

one thing i have not been able to find out
perhaps someone out there knows the answer (hint, hint Raymond)
is it possible for NaN's to be negative ? - or are those invalid values
i suppose i could try loading one and multiply it, huh

raymond

Here's what I have in the tutorial. I most probably researched it at that time.

QuoteApart from the INFINITY and INDEFINITE values which can be generated by the FPU, there is a very large number of other NANs with all the possible permutations of fraction bits and sign bit being set to 1 when all the bits in the exponent field are set to 1.
When you assume something, you risk being wrong half the time
http://www.ray.masmcode.com

dedndave

ok Ray - thank you for helping out - i have modified the post above to clarify a few values

dedndave

here is something for you to chew on, Jochen - lol
Quotewe want the closest value, of course
now - how do we define "closest"
i think i know the answer, but it may be a matter of opinion/interpretation, as well
the application could affect what "the right answer" is, too
let's say we have a hypothetical numbering system
we want to represent the value 100.000000
in this numbering system, the values closest to 100.000000 are 99.999000 and 100.001000
at first glance, you might say either answer will work equally well
|100.001000-100.000000| = 0.001
|100.000000-99.999000| = 0.001

BUT !

100.001000/100.000000 = 1.00001
100.000000/99.999000 = 1.000010000100001.....

it is easy to see that, in intel extended reals, the closest value algebraically
may be one value, while the closest value geometrically may be the other
i think the geometrically closest value is the right answer in most applications
in accounting, however, the algebraically closest may be correct
this poses a problem to us of a different nature
finding the geometrically closest value requires division, which is slower
we need Ray, again - lol

raymond

When you start comparing apples, you have to continue comparing apples.

100.001000/100.000000 and 100.000000/99.999000 all have effectively 8 significant digits.

If you want to compare the results of divisions 1.00001000 to 1.000010000100001....., you should retain only 8 significant digits, thus comparing 1.00001000 to 1.00001000 which are identical in this example.
When you assume something, you risk being wrong half the time
http://www.ray.masmcode.com

dedndave

well - i chose a bad exmple, perhaps
the same conclusions could be drawn from 100.999, 101.000, and 101.001

jj2007

Quote from: dedndave on July 13, 2009, 10:44:58 PM
here is something for you to chew on, Jochen - lol
Quotewe want the closest value, of course
now - how do we define "closest"

That depends on the context, I guess. For an everyday routine like print Str$(1/3) (see my update, using the new algo for generating the multiplier table), the rule should be "choose the one that yields the most usable results". For climate change modelling, you will simply choose an algo with a defined precision, e.g. 128 bits - slow and bulky but as precise as you need it. Here is an example how IBM tested higher precision.

By the way, I found it difficult to google up exact values for constants such as PI = 3.14159265358979323846
Does Google have a wildcard option??