Comparing real variables: fcmp

Started by jj2007, December 28, 2009, 09:51:59 PM

jj2007

A quick way to compare real variables:
Quote
fcmp MyReal10, MyOtherReal8
.if Sign?
        print "lower"
.elseif Zero?
        print "equal"
.else
        print "higher"
.endif
First parameter must be a real variable.
The second parameter can be blank (= compare against zero), a real variable, an immediate integer or a reg32.

include \masm32\include\masm32rt.inc

fcmp MACRO cmp1:REQ, cmp2 ; floatcmp
LOCAL oa
  ffree st(7)
  ifb <cmp2>
fldz ; no second arg; compare against zero
  else
oa = (opattr cmp2) AND 127
if (oa eq 36) or (oa eq 48)
push cmp2
fild dword ptr [esp] ; integer or reg32 on stack, then on FPU
pop eax
else
fld cmp2 ; real on FPU
endif
  endif
  ffree st(7)
  fld cmp1
  call fcmpP
ENDM
.code
fcmpP_proc:
fcmpP proc
xor edx, edx ; default retval: equal
fcompp ; pop twice
fstsw ax ; move FPU flags C1 etc to ax
test ah, 64 ; C3 is set if ST(0) = ST(1) (bt eax, 14)
jne @F ; equal, edx=0
test ah, 1 ; C0 (bt eax, 8)
je fcPos
dec edx ; negative
ret
@@: dec edx ; or zero (0-1+1=0)
fcPos: inc edx ; positive
ret
fcmpP endp
fcmpP_endp:

; --------------------- some test variables: ---------------------
f4 REAL4 4.4
f8 REAL8 8.8
f10 REAL10 0.0

start:
print chr$("expected: higher ", 9)
C1_start:
fcmp f4 ; only one argument: compare against zero
C1_end:
.if Sign?
print "lower"
.elseif Zero?
print "equal"
.else
print "higher"
.endif

print chr$(13, 10, "expected: lower  ", 9)
C2_start:
fcmp f4, f8 ; compare a real4 against a real8
C2_end:
.if Sign?
print "lower"
.elseif Zero?
print "equal"
.else
print "higher"
.endif

print chr$(13, 10, "expected: higher  ", 9)
mov eax, -1
C3_start:
fcmp f10, eax ; compare a real10 against eax
C3_end:
.if Sign?
print "lower"
.elseif Zero?
print "equal"
.else
print "higher"
.endif

print chr$(13, 10, "expected: equal  ", 9)
C4_start:
fcmp f10 ; compare a real10 against zero
C4_end:
.if Sign?
print "lower"
.elseif Zero?
print "equal"
.else
print "higher"
.endif

print chr$(13, 10, 10, "Code sizes:", 13, 10, "proc: ", 9)
mov eax, fcmpP_endp
sub eax, fcmpP_proc
print str$(eax), 13, 10, "call 1:", 9
mov eax, C1_end
sub eax, C1_start
print str$(eax), 13, 10, "call 2:", 9
mov eax, C2_end
sub eax, C2_start
print str$(eax), 13, 10, "call 3:", 9
mov eax, C3_end
sub eax, C3_start
print str$(eax), 13, 10, "call 4:", 9
mov eax, C4_end
sub eax, C4_start
print str$(eax)
getkey
exit
end start


(the fcmpP_proc and fcmpP_endp labels are just for getting the code size)
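The branch order inside fcmpP can be modelled outside the FPU. The following is a minimal Python sketch, not part of the original post: `status_after_compare` is a hypothetical stand-in for what fcompp leaves in the status word (C3 = bit 14, C2 = bit 10, C0 = bit 8), and `fcmp_result` mirrors the two `test ah, ...` instructions.

```python
# Bit positions of the x87 condition codes inside the 16-bit status word.
C0 = 1 << 8    # what the proc checks with "test ah, 1"
C2 = 1 << 10   # set together with C3 and C0 when the operands are unordered
C3 = 1 << 14   # what the proc checks with "test ah, 64"


def status_after_compare(st0, st1):
    """Hypothetical model of the condition codes after comparing st0 with st1."""
    if st0 != st0 or st1 != st1:   # a NaN operand makes the comparison unordered
        return C3 | C2 | C0
    if st0 == st1:
        return C3
    return C0 if st0 < st1 else 0


def fcmp_result(status_word):
    """Mirror the branch order in fcmpP: C3 first, then C0."""
    if status_word & C3:
        return 0     # edx = 0, so Zero? is true
    if status_word & C0:
        return -1    # edx = -1, so Sign? is true
    return 1         # edx = 1, neither flag: "higher"


print(fcmp_result(status_after_compare(4.4, 0.0)))   # fcmp f4
print(fcmp_result(status_after_compare(4.4, 8.8)))   # fcmp f4, f8
```

Under this model an unordered comparison (a NaN operand) also sets C3, so the first version of the proc would report it as "equal".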

dedndave

great idea JJ
and you don't have to worry about saving the FPU state/stack   :U

here it is - i thought you may find this interesting...
http://www.cygnus-software.com/papers/comparingfloats/comparingfloats.htm

jj2007

Quote from: dedndave on December 28, 2009, 10:34:09 PM
great idea JJ
and you don't have to worry about saving the FPU state/stack   :U

here it is - i thought you may find this interesting...
http://www.cygnus-software.com/papers/comparingfloats/comparingfloats.htm

"AlmostEqual" looks suspiciously close to my old Parser routine. Check attachment for RealComp ;-)

G1=12.34, G2=43.21 (both global REAL10)
f1=11.11, f2=22.22 (both local REAL10)

Calculating 68.16122-G2*(f1+3*1.8**2.2)+G2*f1 :
Expected value is       -404.2335
Calculated value is     -404.2335
-- Second operand is smaller

Calculating 4-G1*(f2-3.4)+G2*f1 :
Expected value is       251.8243
Calculated value is     251.8243
** Operands are equal or almost equal at 8 digits precision
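The RealComp routine itself is only in the attachment, so the following Python sketch is an assumption about what "almost equal at 8 digits precision" could mean: a relative comparison scaled by the larger operand. The function name `almost_equal` and the use of doubles instead of REAL10 are choices made for illustration only.

```python
def almost_equal(a, b, digits=8):
    """True when a and b agree to roughly `digits` significant digits."""
    scale = max(abs(a), abs(b))
    return abs(a - b) <= scale * 10.0 ** (-digits)


# Second test expression from the post, evaluated in double precision.
G1, G2 = 12.34, 43.21
f1, f2 = 11.11, 22.22
calculated = 4 - G1 * (f2 - 3.4) + G2 * f1

print(almost_equal(calculated, 251.8243))
```

Scaling by the larger magnitude keeps the test meaningful for both very large and very small operands; near zero an absolute tolerance would be needed in addition.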

GregL

Here are some macros I wrote a while back for comparing real numbers, with help from the example code in Raymond's 'Simply FPU'.

Note to jj: I wasn't concerned about speed, just functionality.  :bg


; ====================================
isgreater MACRO r1:REQ, r2:REQ
    LOCAL error, true, false, clear
    finit
    fld r2
    fld r1
    fcom
    fstsw ax
    fwait
    sahf
    jpe   error
    ja    true
    jbe   false
  error:
    mov eax, -1
    jmp clear
  true:
    mov eax, 1
    jmp clear
  false:
    xor eax, eax
  clear:
    fstp st(0)
    fstp st(0)
    EXITM <eax>
ENDM
; ====================================
isgreaterequal MACRO r1:REQ, r2:REQ
    LOCAL error, true, false, clear
    finit
    fld r2
    fld r1
    fcom
    fstsw ax
    fwait
    sahf
    jpe   error
    jae   true
    jb    false
  error:
    mov eax, -1
    jmp clear
  true:
    mov eax, 1
    jmp clear
  false:
    xor eax, eax
  clear:
    fstp st(0)
    fstp st(0)
    EXITM <eax>
ENDM
; ====================================
isless MACRO r1:REQ, r2:REQ
    LOCAL error, true, false, clear
    finit
    fld r2
    fld r1
    fcom
    fstsw ax
    fwait
    sahf
    jpe   error
    jae   false
    jb    true
  error:
    mov eax, -1
    jmp clear
  true:
    mov eax, 1
    jmp clear
  false:
    xor eax, eax
  clear:
    fstp st(0)
    fstp st(0)
    EXITM <eax>
ENDM
; ====================================
islessequal MACRO r1:REQ, r2:REQ
    LOCAL error, true, false, clear
    finit
    fld r2
    fld r1
    fcom
    fstsw ax
    fwait
    sahf
    jpe   error
    ja    false
    jbe   true
  error:
    mov eax, -1
    jmp clear
  true:
    mov eax, 1
    jmp clear
  false:
    xor eax, eax
  clear:
    fstp st(0)
    fstp st(0)
    EXITM <eax>
ENDM
; ====================================
isnotequal MACRO r1:REQ, r2:REQ
    LOCAL error, true, false, clear
    finit
    fld r2
    fld r1
    fcom
    fstsw ax
    fwait
    sahf
    jpe   error
    ja    true
    jb    true
    jz    false
  error:
    mov eax, -1
    jmp clear
  true:
    mov eax, 1
    jmp clear
  false:
    xor eax, eax
  clear:
    fstp st(0)
    fstp st(0)
    EXITM <eax>
ENDM
; ====================================
isequal MACRO r1:REQ, r2:REQ
    LOCAL error, true, false, clear
    finit
    fld r2
    fld r1
    fcom
    fstsw ax
    fwait
    sahf
    jpe   error
    ja    false
    jb    false
    jz    true
  error:
    mov eax, -1
    jmp clear
  true:
    mov eax, 1
    jmp clear
  false:
    xor eax, eax
  clear:
    fstp st(0)
    fstp st(0)
    EXITM <eax>
ENDM
; ====================================
islessgreater MACRO r1:REQ, r2:REQ
    EXITM <isnotequal(r1, r2)>
ENDM
; ====================================
isapproxequal MACRO r1:REQ, r2:REQ, tolerance:REQ
    LOCAL diff
    .DATA?
        diff REAL10 ?
    .CODE
    finit
    .IF isgreater(r1, r2)
        fld   r2
        fld   r1
    .ELSE
        fld   r1
        fld   r2
    .ENDIF
    fsub
    fabs              ; diff = |r1 - r2|, whatever operand order fsub uses
    fstp  diff
    fwait
    EXITM <islessequal(diff, tolerance)>
ENDM
; ====================================
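As a cross-check of the return convention used by these macros (1 = true, 0 = false, -1 = comparison not possible), here is a small Python sketch. It is not a translation of the FPU code; the function names are borrowed from the macros and NaN detection stands in for the `jpe error` branch.

```python
import math


def _unordered(r1, r2):
    """Stand-in for the 'jpe error' branch: a NaN operand cannot be ordered."""
    return math.isnan(r1) or math.isnan(r2)


def isgreater(r1, r2):
    if _unordered(r1, r2):
        return -1
    return 1 if r1 > r2 else 0


def islessequal(r1, r2):
    if _unordered(r1, r2):
        return -1
    return 1 if r1 <= r2 else 0


def isapproxequal(r1, r2, tolerance):
    """1 when |r1 - r2| <= tolerance, 0 otherwise, -1 on a NaN operand."""
    if _unordered(r1, r2):
        return -1
    return islessequal(abs(r1 - r2), tolerance)


print(isgreater(2.0, 1.0), isapproxequal(1.0, 1.05, 0.1))
```

One thing the sketch makes visible: a caller that only tests the result for non-zero will treat the error value -1 as true, so the three-way result needs an explicit check.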

jj2007

Quote from: Greg Lyon on December 29, 2009, 03:30:56 AM
Here are some macros I wrote a while back for comparing real numbers, with help from the example code in Raymond's 'Simply FPU'.

Greg, looks nice, but finit trashes the whole FPU content. Especially in a context of comparing real variables, one should assume that the user is working with the FPU, and needs its content. My macro leaves all FPU settings intact and trashes only the top registers, st 6+7 - it is extremely unlikely that a user uses all 8 registers simultaneously when calling the fcmp macro, so I guess that "loss" can be tolerated.

Re functionality, imho one macro is enough - there is also only one cmp eax, nnn for integers. The right jumps can be set e.g. like this:
fcmp f4, f8 ; compare a real4 against a real8
.if Sign? && !Zero?
print "lower"
.elseif !Sign? && !Zero?
print "higher"
.elseif Zero?
print "equal"
.elseif Sign? || Zero?
print "lower or equal"
.elseif Zero? || !Sign?
print "higher or equal"
.endif


Below a modified version that preserves edx (but not eax). At 23 bytes for the proc, and 17...21 bytes for calling the macro, it is still fairly contained in size.

Quote
fcmp MACRO cmp1:REQ, cmp2   ; -------- floatcmp --------
LOCAL oa
  ffree st(7)
  ifb <cmp2>
   fldz                        ; no second arg; compare against zero
  else
   oa = (opattr cmp2) AND 127
   if (oa eq 36) or (oa eq 48)
      push cmp2
      fild dword ptr [esp]      ; integer or reg32 on stack, then on FPU
      pop eax
   else
      fld cmp2               ; real on FPU
   endif
  endif
  ffree st(7)
  fld cmp1
  call fcmpP
ENDM
.code   ; --------- end of macro --------

fcmpP proc
   push edx
   xor edx, edx   ; clear the flag register
   fcompp      ; compare ST(0) with ST(1) and pop twice
   fstsw ax      ; move FPU flags C1 etc to ax
   test ah, 64   ; C3 is set if ST(0) = ST(1) (bt eax, 14)
   jne @F      ; equal, edx=0
   test ah, 1      ; C0 (bt eax, 8)
   je fcPos
   dec edx      ; negative (-2+1=-1)
@@:   dec edx      ; or zero (0-1+1=0)
fcPos:   inc edx      ; positive
   pop edx
   ret
fcmpP endp


jj2007

Coming back to Dave's suggestion to look at the "almost equal" question: Here is a testbed for one approach that consists of saving the two operands to memory as REAL4. The limitation of this approach is that the exponent cannot exceed the REAL4 range.

include \masm32\include\masm32rt.inc

.data
aeL1 REAL10 3.9999999
aeH1 REAL10 4.0
aeL2 REAL10 4.0
aeH2 REAL10 4.0000001
aeL3 REAL10 -4.0000001
aeH3 REAL10 -4.0
aeL4 REAL10 -4.000001e-39 ; will falsely report equal for exponents above 37 or below -39
aeH4 REAL10 -4.0e-39
aeL5 REAL10 -4.000001 ; this one not equal
aeH5 REAL10 -4.0

.code
start:
ct = 0
REPEAT 5
ct = ct +1
@CatStr(<fld aeL>, %ct)
@CatStr(<fld aeH>, %ct)
sub esp, 8 ; allocate two REAL4 slots
fstp REAL4 ptr [esp]
pop eax
fstp REAL4 ptr [esp]
pop edx
sub edx, eax
.if edx
print "NOT equal", 13, 10
.else
print "equal", 13, 10
.endif
ENDM
getkey
exit
end start
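The same idea can be tried without an assembler. This Python sketch rounds both operands to single precision with the standard `struct` module and compares the 32-bit patterns, which is what the two `fstp REAL4 ptr [esp]` / `pop` pairs do. The operands here are Python doubles rather than REAL10, and the subnormal pair (aeL4/aeH4) is left out, so treat it as an approximation of the testbed rather than a replica.

```python
import struct


def real4_bits(x):
    """Round x to single precision and return its 32-bit pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def equal_as_real4(a, b):
    """'Almost equal' in the sense of the testbed: identical REAL4 patterns."""
    return real4_bits(a) == real4_bits(b)


pairs = [
    (3.9999999, 4.0),
    (4.0, 4.0000001),
    (-4.0000001, -4.0),
    (-4.000001, -4.0),
]
for low, high in pairs:
    print(low, high, "equal" if equal_as_real4(low, high) else "NOT equal")
```

A side effect of comparing rounded values is that two operands lying just either side of a rounding boundary are reported as different even when they are very close, for example 4.0000002 and 4.0000003.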

GregL

Quote from: jj2007
Greg, looks nice, but finit trashes the whole FPU content. Especially in a context of comparing real variables, one should assume that the user is working with the FPU, and needs its content.

I knew you were going to say something like that.  Usually when I'm working with the FPU I do some calculations and save the result to a memory variable, do some more calculations and save the result to a memory variable.  When doing the compare, the FPU calculations would be done.  I rarely, if ever, need to preserve the FPU contents.  I like to use finit to be sure I'm getting the precision I want; API calls can change the precision.

Quote from: jj2007
Re functionality, imho one macro is enough - there is also only one cmp eax, nnn for integers.
I disagree, if I'm testing for a specific thing, like "isgreater", that's the macro I am going to want to use.

With these macros it doesn't matter what the data type is, they handle REAL4, REAL8 or REAL10.


I'm sick and tired of posting code here only to have it torn apart, torn down and criticized.  I should have learned by now.  I know these macros work because I have used them and they do the job for me.  If you don't like them, don't use them.


jj2007

Quote from: Greg Lyon on December 29, 2009, 07:30:07 PM
I'm sick and tired of posting code here only to have it torn apart, torn down and criticized.

Sorry Greg if I stepped on your toes. My apologies. And thanks for reminding me of the sahf instruction, it is really handy.

I have modified the fcmp macro, and stumbled over a problem with the error detection:

;  Typical call:
;  ffree st(7)
;  fldz ; no second arg; compare against zero
;  ffree st(7)
;  fld MyReal4
;  call fcmpP
fcmpP proc
push edx
xor edx, edx ; clear the flag register
fucompp ; compare ST(0) with ST(1) and pop twice
fstsw ax ; move FPU flags C1 etc to ax
sahf ; translate flags
jpo @F ; jpo=no error
dec edx ; produce -2 as error flag
@@: ja fcPos
je @F
dec edx ; negative (-2+1=-1)
@@: dec edx ; or zero (0-1+1=0)
fcPos: inc edx ; positive
xchg edx, [esp] ; save edx, use eax as retval
pop eax
ret
fcmpP endp

This works fine and is reasonably short (once 24 bytes for the proc, plus 17...21 bytes per call). However, I can't convince the FPU to set the error flag, as described in FPU Chapter 7, fcom.

I tried the following to provoke an error:
.data
f10Err REAL10 1.0e99 ; will be written to f4Err
f4Err REAL4 0.0 ; will receive a bad number, exponent too high for a REAL4
.code
fld f10Err
fstp f4Err ; for error testing
finit
fclex
; int 3 ; start Olly here
fcmp f4Err ; only one argument: compare against zero
cmp eax, -2
je FatError ; there should be an error!!!

Tracing this with Olly reveals that ST0 does get a BAD number, but the C1...C3 flags are not being set. Can somebody tell me where I am wrong?? Full code attached.

qWord

Quote from: jj2007 on December 29, 2009, 10:12:57 PM
Tracing this with Olly reveals that ST0 does get a BAD number, but the C1...C3 flags are not being set.
It is the OE-Flag (Overflow Exception) that indicates too large values.

BTW: For my own macros, I'm using fcomi/fcomip - both AMD and Intel recommend them for comparing FPU values. These instructions set the rFLAGS directly.

qWord
FPU in a trice: SmplMath
It's that simple!

jj2007

Quote from: qWord on December 29, 2009, 10:37:26 PM
Quote from: jj2007 on December 29, 2009, 10:12:57 PM
Tracing this with Olly reveals that ST0 does get a BAD number, but the C1...C3 flags are not being set.
It is the OE-Flag (Overflow Exception) that indicates too large values.
Thanks, qWord. But how would one detect it? Do you have a practical example? Simply FPU states that the C2 flag should be used, but I can't get it to work.

Quote
BTW: For my own macros, I'm using fcomi/fcomip - both AMD and Intel recommend them for comparing FPU values. These instructions set the rFLAGS directly.
Yes, faster and shorter, but it seems to require a P6. Some members here still use a P3. On the other hand, I also went for SSE2 code in my own library. Difficult to say where one should draw the line...

qWord

Quote from: jj2007 on December 29, 2009, 10:58:05 PM
Do you have a practical example?
What about this:

    .data
        f10Err  REAL10  1.0e99
        f4Err   REAL4   0.0
    .code
   
    fld f10Err
    fstp f4Err
    fstsw ax
    .if ax&01000y
        fn MessageBox,0,"value too large",0,0
    .else
        fn MessageBox,0,"OK","OK",0
    .endif


EDIT:
Another method is to test, after the store, for the value +INFINITY = 07f800000h.
    fld f10Err
    fstp f4Err
    .if f4Err == 07f800000h ;
        fn MessageBox,0,"value too large",0,0
    .else
        fn MessageBox,0,"OK","OK",0
    .endif
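The second method relies on a masked overflow storing an infinity. A short Python sketch of that bit pattern follows; `store_as_real4` is a hypothetical helper, and because `struct.pack` may refuse to overflow, the infinity is substituted by hand to imitate what the FPU is assumed to store with the overflow exception masked.

```python
import math
import struct


def store_as_real4(x):
    """Return the 32-bit pattern assumed to result from storing x as REAL4."""
    try:
        packed = struct.pack("<f", x)
    except OverflowError:
        # Too large for single precision: imitate the masked-overflow result,
        # an infinity carrying the sign of x.
        packed = struct.pack("<f", math.copysign(math.inf, x))
    return struct.unpack("<I", packed)[0]


print(hex(store_as_real4(1.0e99)))    # 0x7f800000
print(hex(store_as_real4(-1.0e99)))   # 0xff800000
```

Note that a negative overflow produces 0FF800000h, so a test against 07f800000h alone only catches the positive case.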


jj2007

Hmmpfffff....! Your example works, but now I am thoroughly confused. Does fstsw report past overflow??
After some testing, it seems the answer is yes. The O flag is set by fstp, and remains set until cleared by fclex.

    fld f10Err  <<< O flag still clear
    fstp f4Err  <<< after this instruction, the FPU is empty but the O and P flags are set
    fstsw ax


There is a good description of the status register here (by Randy Hyde); except that I cannot confirm "Bit seven of the status register is set if any error condition bit is set". Although bits 3 and 5 are set, bit 7 remains clear.
One more unclear bit is why Simply FPU recommends jpe error_handler ;the comparison was indeterminate after sahf - this does not seem to work.

Anyway, the corrected code is now as follows (bloated to 31 bytes :8)):
;  Typical call:
;  ffree st(7)
;  fldz ; no second arg; compare against zero
;  ffree st(7)
;  fld MyReal4
;  call fcmpP
fcmpP proc
push edx
xor edx, edx ; clear the flag register
fucompp ; compare ST(0) with ST(1) and pop twice
fstsw ax ; move FPU flags C1 etc to ax
test al, 8+32 ; test if overflow or precision flags are set
je @F ; neither flag set = no error
sub edx, 127 ; produce error code
fclex ; clear exceptions
@@: sahf
ja fcPos
je @F
dec edx ; negative (-2+1=-1)
@@: dec edx ; or zero (0-1+1=0)
fcPos: inc edx ; positive
xchg edx, [esp] ; save edx, use eax as retval
pop eax
ret
fcmpP endp

qWord

Quote from: jj2007 on December 30, 2009, 12:04:39 AM
Hmmpfffff....! Your example works, but now i am thoroughly confused. Does fstsw report past overflow??
After some testing, it seems the answer is yes. The O flag is set by fstp, and remains set until cleared by fclex.
fstsw copies the whole status word. The exception flags are in the low byte.

Quote from: jj2007 on December 30, 2009, 12:04:39 AM
There is a good description of the status register here (by Randy Hyde); except that I cannot confirm "Bit seven of the status register is set if any error condition bit is set". Although bits 3 and 5 are set, bit 7 remains clear.
A quote from AMD's developer manuals:
Quote
Exception Status (ES). Bit 7. The processor calculates the value of this bit at each instruction
boundary and sets the bit to 1 when one or more unmasked floating-point exceptions occur...
By default, OE is masked in the control register, so bit 7 will not be set.
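The relationship between the exception flags, their masks and the ES bit can be written down in a few lines. This Python sketch assumes the usual layout (exception flags in status word bits 0..5, matching mask bits in control word bits 0..5) and the control word 037Fh that finit installs; the function names are made up for the example.

```python
EXCEPTION_NAMES = ["IE", "DE", "ZE", "OE", "UE", "PE"]   # status word bits 0..5


def pending_exceptions(status_word):
    """Names of the exception flags currently set in the status word."""
    return [name for bit, name in enumerate(EXCEPTION_NAMES)
            if status_word & (1 << bit)]


def es_bit(status_word, control_word=0x037F):
    """ES (bit 7) as described in the quote: set only by unmasked exceptions."""
    flags = status_word & 0x3F
    masks = control_word & 0x3F
    return 1 if flags & ~masks else 0


status = 0x28                          # bits 3 and 5, as observed in the debugger
print(pending_exceptions(status))      # ['OE', 'PE']
print(es_bit(status))                  # 0: everything is masked by default
print(es_bit(status, 0x037F & ~0x08))  # 1 once OE is unmasked
```

This matches the observation that bits 3 and 5 were set while bit 7 stayed clear.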

Quote from: jj2007 on December 30, 2009, 12:04:39 AM
jpe error_handler ;the comparison was indeterminate after sahf - this does not seem to work.
This only applies when you are comparing values: if the operands (one or both, depending on the instruction) are not comparable (e.g. NaNs), the C2 flag is set. C2 becomes the parity flag when AH is copied into rFLAGS by sahf.
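To make the fstsw ax / sahf route concrete, here is a Python sketch of which status word bits end up in which CPU flags. Only the high byte (AH) is transferred: C0 lands in CF, C2 in PF and C3 in ZF. The helper names are invented for the example.

```python
def sahf_flags(status_word):
    """Flags that 'fstsw ax' followed by 'sahf' would produce from AH."""
    ah = (status_word >> 8) & 0xFF
    return {
        "CF": ah & 1,          # C0, status word bit 8
        "PF": (ah >> 2) & 1,   # C2, status word bit 10
        "ZF": (ah >> 6) & 1,   # C3, status word bit 14
    }


def classify(status_word):
    """Decision order of: jpe error / je equal / jb below / ja above."""
    flags = sahf_flags(status_word)
    if flags["PF"]:
        return "unordered"
    if flags["ZF"]:
        return "equal"
    if flags["CF"]:
        return "below"
    return "above"


print(classify(0x4500))   # C3, C2 and C0 set: unordered
print(classify(0x0028))   # only OE and PE set in the low byte: above
```

Because the exception flags sit in the low byte, a past overflow never reaches the parity flag through sahf, which is consistent with jpe not firing in the overflow test.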

Here are two good references (primary literature  :green2) for FPU stuff (IMO):
AMD64 Architecture Programmer's Manual Volume 1: Application Programming (chapter 6)
AMD64 Architecture Programmer's Manual Volume 5: 64-Bit Media and x87 Floating-Point Instructions

MichaelW

Quote
but it seems to require a P6. Some members here still use a P3

fcomi and fcomip work fine on a Pentium III. P6 normally refers to the sixth generation x86 processors that started with the Pentium Pro.

http://en.wikipedia.org/wiki/Intel_P6


Also, for EQ, GT, LT, etc you should be able to compare the values as integers and interpret the flags just as you would for integers.
eschew obfuscation

dedndave

i think if you weed out special values like NaN's, infinities, negative 0's, etc, that is correct
but comparing reals depends on the application
when you have a case where an epsilon is applicable, everything becomes difficult (most of the time, i guess)
the point is, no one solution will cover all the cases
i wonder why we haven't heard from Ray ? - lol
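The integer-compare idea and its caveats can be illustrated with REAL4 bit patterns. In this Python sketch, `signed_bits` is the raw pattern read as a signed 32-bit integer and `ordered_key` is one possible adjustment for the sign-magnitude encoding; both names are hypothetical, and NaNs are deliberately not handled.

```python
import struct


def real4_bits(x):
    """32-bit pattern of x rounded to single precision."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def signed_bits(x):
    """The pattern as a signed 32-bit integer, i.e. what a plain cmp sees."""
    bits = real4_bits(x)
    return bits - (1 << 32) if bits & 0x80000000 else bits


def ordered_key(x):
    """Integer whose ordering matches the float ordering (NaNs excluded)."""
    bits = real4_bits(x)
    if bits & 0x80000000:
        # Negative: a larger magnitude must sort lower; -0.0 maps to 0.
        return -(bits & 0x7FFFFFFF)
    return bits


print(signed_bits(1.0) < signed_bits(2.0))     # True: positives order correctly
print(signed_bits(-2.0) < signed_bits(-1.0))   # False: raw order is reversed
print(ordered_key(-2.0) < ordered_key(-1.0))   # True after the adjustment
```

For two positive values the raw integer compare is enough; as soon as negative values or negative zero are involved, some adjustment of this kind is needed.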