FPU doesn't see a real8 value of 0.0 properly

Started by braincell, March 29, 2012, 07:06:25 PM

braincell

Hi,

after some FPU calculations, i store all 8 registers in an array of real8. Then i go through lots and lots of macros, pushing nesting to its limits (i think).
Sometimes the registers are 0.0 values, and when i output them to text, they show up properly as "0.0".

However, when i use FCOM on the real8 values (with fstsw and sahf), if one is 0.0, it often (not always) gives an indeterminate result - meaning "jpe" is activated.
Also, when i do FLD MyReal8Array[index] and then FISTP MyInteger, the integer value becomes -2147483648.

If the value is anything other than 0.0 (like 0.2 or -0.545 or whatever) both FCOM and FISTP work properly.

Note: before using FCOM i always do FINIT so that's not it.

I tried reproducing the problem in a separate project, but i couldn't, because then everything worked as expected.
I am starting to think i might somehow be messing up the stored values with all of my code, or that i flipped some flags somewhere by accident.
The code is very complex by now and i can't even keep track of all the macros and conditionals, so it took a lot to just make it work, let alone figure out this mystery easily.

What could it be?
Should i convert more of my macros to procs? Could it be the MASM assembler? I'm clueless.

Thanks.

qWord

You want to compare two REAL-values for equality? That's not simple because the numbers can differ in the least significant fraction bits.
One simple solution would be to test for a range: e.g.
if (abs(a-b) <= threshold)
    equal
else
    not equal
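
A minimal MASM32 sketch of the range test above, not a definitive implementation. The names valA, valB and threshold and the tolerance of 1.0e-6 are made up for illustration; it assumes the standard masm32rt.inc runtime and a P6 or later CPU for FCOMIP.

```asm
include \masm32\include\masm32rt.inc
.686                            ; FCOMIP needs a P6 or later instruction set

.data
valA      REAL8 0.1             ; hypothetical test values
valB      REAL8 0.1000001
threshold REAL8 1.0e-6          ; hypothetical tolerance, pick one that suits your data

.code
start:
    finit
    fld    threshold            ; st0 = threshold
    fld    valA                 ; st0 = valA, st1 = threshold
    fsub   valB                 ; st0 = valA - valB
    fabs                        ; st0 = |valA - valB|
    fcomip st, st(1)            ; compare st0 with threshold, set CF/ZF/PF, pop
    fstp   st(0)                ; drop threshold, FPU stack is empty again
    jp     unordered            ; PF=1: a NaN or an empty register was involved
    jbe    equal                ; CF=1 or ZF=1: |valA - valB| <= threshold
    print  "not equal",13,10
    jmp    done
equal:
    print  "equal",13,10
    jmp    done
unordered:
    print  "unordered",13,10
done:
    exit
end start
```

The jp test has to come first: an unordered compare sets CF, ZF and PF together, so jbe alone would treat it as "equal".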


BTW: use FCOMI[P] instead of FCOM
FPU in a trice: SmplMath
It's that simple!

dedndave

1) the FPU registers are real10 - not real8
2) it sounds like the tag word may not be restored properly
however, if you address issue 1, this may go away
otherwise, you may want to look at the FSAVE/FRSTOR instructions, which save/restore the entire FPU state, including registers (FSTENV/FLDENV only cover the environment: control, status and tag words)
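
A hedged sketch of saving and restoring the complete FPU state. Note that FSTENV/FLDENV store only the environment (control, status and tag words plus pointers), while FSAVE/FRSTOR also cover the eight data registers. The names fpuState and someVal are made up for illustration, and the 108-byte size assumes 32-bit protected mode.

```asm
include \masm32\include\masm32rt.inc

.data?
fpuState BYTE 108 dup(?)        ; FSAVE image: 28-byte environment + 8 x 10-byte registers

.data
someVal  REAL8 1.5              ; hypothetical value to keep alive across the save

.code
start:
    finit
    fld    someVal              ; st0 = 1.5
    fsave  fpuState             ; store environment and registers, then reinitialize the FPU
    ; ... code here sees a freshly initialized FPU and can use all 8 registers ...
    frstor fpuState             ; bring back the saved state, st0 = 1.5 again
    fstp   st(0)                ; clean up the stack
    exit
end start
```

Keep in mind that FSAVE leaves the FPU in the same state as FINIT, so the code between the two instructions starts with an empty stack.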

qWord

Quote from: dedndave on March 29, 2012, 07:20:55 PM
1) the FPU registers are real10 - not real8
but you can also save and load other formats (double, float, sdword, sword, sqword)  :wink

dedndave

true
but, without seeing his test code, i am taking shots in the dark   :P

braincell

Quote from: qWord on March 29, 2012, 07:18:29 PM
You want to compare two REAL-values for equality?

BTW: use FCOMI[P] instead of FCOM

- no, i am comparing only for > and <, i am comparing two real8 values in memory by loading one into st0 and using FCOM[P] to compare it to the other.

- I used FCOMI but it really makes no difference. I just do FSTSW and SAHF by myself after FCOM. I also tried using FWAIT but it didn't change anything.

Quote from: dedndave on March 29, 2012, 07:20:55 PM
1) the FPU registers are real10 - not real8
2) it sounds like the tag word may not be restored properly
however, if you address issue 1, this may go away
otherwise, you may want to look at the FSAVE/FRSTOR instructions, which save/restore the entire FPU state, including registers (FSTENV/FLDENV only cover the environment: control, status and tag words)

1) ok this might be the campus but i'm not that much of a noob. :) my code has 5271 lines by now and i know the SimplyFPU document by heart (even some byte opcodes).

2) i thought that finit restores the tag word too, no? i use finit before each command so how can that be it? maybe you just didn't read i mentioned FINIT, or im wrong about it restoring the tag word?

Quote from: dedndave on March 29, 2012, 07:39:18 PM
true
but, without seeing his test code, i am taking shots in the dark   :P

I can give you code that works and that has sections that look identical to what i'm doing in my more complex code (only obviously something is different). The full code i really can't post, it's too huge and contains private info sadly.

Anyway, the main question could be now: Does FINIT restore the tag word, and if not, what would be the best way to do it? I couldn't find any documentation on that.

braincell

This might be the reason:

Proc1 calls Proc2.
The FPU is initialized in Proc2, gets the values for the registers and returns from the proc (with ret) back to Proc1.
It does nothing else but load zeros into registers and some other non-zero numbers and performs FPU calculations and commands.
When we return to Proc1, within Proc1 it then stores the FPU into my array of real8 (using FSTP 8 times). This is done immediately after Proc2 returns.

Could the push/pop from this proc call (and ret) be messing up the tag word?


Edit: never mind, i just tested storing st(0) into a single variable within the same proc, and when compared to a constant of 10.0, it jumps to "jpe", so still errorish.

qWord

braincell,
did you ever use a debugger?
I would suggest you place a INT 3 before the point of interest and then use OllyDbg to step through the code.

dedndave

Quote from: braincell on March 29, 2012, 08:08:52 PM
i use finit before each command so how can that be it?

we may have found the problem, right there   :P
i suggest FINIT once at the beginning of the program
then - try not to stuff more than 8 values into the FPU at any given time

oh - and read the FINIT part of Ray's tutorial one more time   :bg

braincell

Quote from: qWord on March 29, 2012, 08:28:49 PM
braincell,
did you ever use a debugger?
I would suggest you place a INT 3 before the point of interest and then use OllyDbg to step through the code.

Did i ever? I use Olly every day. I can't get it to work with INT3 because this is a dll called from .Net (managed code) it just hangs. I debug by outputting to a text file. :)

Quote from: dedndave on March 29, 2012, 08:30:26 PM
Quote from: braincell on March 29, 2012, 08:08:52 PM
i use finit before each command so how can that be it?

we may have found the problem, right there   :P
i suggest FINIT once at the beginning of the program
then - try not to stuff more than 8 values into the FPU at any given time

oh - and read the FINIT part of Ray's tutorial one more time   :bg

I just removed all FINIT from my code except for the first one. I wasn't sure it would work, but obviously i was careful enough with it, so the code worked.
However the problem is still there.

I read it again 1 hour ago. Did i miss something for the 5th time reading it?

Never mind the comparison problem, i can do FTST for zeros and simply scoot around the issue.
The bigger problem is why FISTP returns a -217million value to the integer when it's a 0.0 value.



dedndave

when you ask these questions, it is hard for us to see what the real (pun) problems are without seeing specific code   :P
for the most part, you can assume that the FPU works correctly
so - and i do this, myself - lol - "if it doesn't work, it must be me"   :bg

-217 million sounds like a stored real being interpreted (or treated) as a large signed integer

from Ray...
Quote: This instruction initializes the FPU by resetting all the registers and flags to their default values.
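
One way to check what FINIT does to the tag word is to dump the environment right after it. This is only a sketch: the buffer name fpuEnv is made up, and the 28-byte layout with the tag word at offset 8 assumes 32-bit protected mode.

```asm
include \masm32\include\masm32rt.inc

.data?
fpuEnv BYTE 28 dup(?)           ; FSTENV image size in 32-bit protected mode

.code
start:
    fldz                        ; occupy one register so the tag word is not all-empty
    finit                       ; control word = 037Fh, status word = 0, tag word = FFFFh
    fstenv fpuEnv               ; store control, status and tag words plus pointers
    movzx  eax, word ptr fpuEnv[8]  ; the tag word sits at offset 8 of the image
    print  str$(eax),13,10      ; FFFFh (all eight registers tagged empty) prints as 65535
    exit
end start
```

So FINIT does reset the tag word: every register is marked empty, whatever was loaded before.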

qWord

Quote from: braincell on March 29, 2012, 08:38:45 PM
I can't get it to work with INT3 because this is a dll called from .Net (managed code) it just hangs. I debug by outputting to a text file. :)
should not be a problem, because VS can show you the x86 disassembly of the program when debugging it.

jj2007

The odd result can easily be reproduced with this snippet:

include \masm32\MasmBasic\MasmBasic.inc   ; download
   Init
   Dim MyR8(9) As REAL8
   FpuFill  ; pushes 1001....1007 on the FPU
   deb 4, "Fpu", ST(0), ST(1), ST(2), ST(3), ST(4), ST(5), ST(6), ST(7), 80000000h
   fld MyR8(0)
   push eax
   fist dword ptr [esp]
   pop eax
   deb 4, "Result", eax, ST(0)
   fstp st
   Inkey "ok"
   Exit
end start

Fpu
ST(0)           1001.00000000000000
ST(1)           1002.00000000000000
ST(2)           1003.00000000000000
ST(3)           1004.00000000000000
ST(4)           1005.00000000000000
ST(5)           1006.00000000000000
ST(6)           0.0  <<<<<<<<< deb cannot display the last two regs, so no surprise
ST(7)           0.0
80000000h       -2147483648  <<<<<<<< the 217 Mio is just the FPU's way to express anger about unfair treatment ;-)

Result
eax             -2147483648 <<<<<<<<<<<<<<<<<<<<<<<<
ST(0)           0.0


Most probably, you push values on the FPU in a loop, and after iteration 7 the FPU is full. Which results in 80000000h popped into an integer...

Try to put an ffree ST(7) before each and every fld or fild command.
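
The silent overflow can also be shown in plain MASM32, without MasmBasic. This is a sketch under assumptions: the names one and result are made up, and it relies on FINIT masking all FPU exceptions, so the ninth load raises no exception and just leaves the "indefinite" NaN in ST(0).

```asm
include \masm32\include\masm32rt.inc

.data
one     REAL8 1.0               ; hypothetical value to load
result  DWORD 0

.code
start:
    finit                       ; all exceptions masked, all registers empty
    mov    ecx, 9               ; one load more than the 8 registers can hold
fill:
    fld    one                  ; the 9th load hits an occupied register
    dec    ecx
    jnz    fill
    fstsw  ax                   ; copy the FPU status word to AX
    fistp  result               ; st0 is the indefinite NaN, so 80000000h gets stored
    test   ax, 40h              ; SF (bit 6) = stack fault
    jz     no_fault
    print  "stack fault, FISTP stored "
    print  str$(result),13,10   ; signed decimal view of 80000000h
    jmp    done
no_fault:
    print  "no stack fault",13,10
done:
    finit                       ; leave the FPU clean
    exit
end start
```

Checking the SF bit after a suspicious FLD is a cheap way to find the spot where the stack overflows, instead of putting ffree ST(7) everywhere.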

braincell

Quote from: dedndave on March 29, 2012, 08:42:50 PM
when you ask these questions, it is hard for us to see what the real (pun) problems are without seeing specific code   :P
for the most part, you can assume that the FPU works correctly
so - and i do this, myself - lol - "if it doesn't work, it must be me"   :bg

-217 million sounds like a stored real being interpreted (or treated) as a large signed integer

from Ray...
Quote: This instruction initializes the FPU by resetting all the registers and flags to their default values.

Hehe it's the realest problem i've had so far, that's why im posting. ;)
Of course it's "just me" and not the FPU, it's just a mystery. Yeah -214million (-217 was a mistake) is exactly that. I think it's probably what jj said below.

Quote from: qWord on March 29, 2012, 08:44:46 PM
should not be a problem, because VS can show you the x86 disassembly of the program when debugging it.

Hmm, it hung the last few times i tried it. I will try again soon.

Quote from: jj2007 on March 29, 2012, 08:58:55 PM
Most probably, you push values on the FPU in a loop, and after iteration 7 the FPU is full. Which results in 80000000h popped into an integer...

Try to put an ffree ST(7) before each and every fld or fild command.


Oh wow, now that's a really good answer. I thought that if i did FLD into a full register that i would get a stack overflow, or some other kind of program-halting problem. I had no idea it could be silent, so i didn't even look into it.
I'll check all my FLDs and let you know how that goes. Thanks!

braincell

I think i found the problem.

I am creating an array of bytes as FPU opcodes which i insert at a label within a VirtualProtected proc.
When i allow only FADD opcodes to be inserted, even when the value is 0.0, the error is not there.
Obviously, some of the more complex opcodes i am inserting are messing up the FPU. The question is now to find out which ones those are.
I can't really "see" the code as it's only passed as an array of bytes, randomly generated (but within logical constraints to not break anything), but maybe Olly can help.

I'll try to get Olly running with INT3 again, so thanks for that suggestion.
Thanks jj for suggesting your idea too, obviously, that solved it for me.

I'm glad i keep adding features to my code, otherwise i would miss bugs like this.

Thanks everyone!