The MASM Forum Archive 2004 to 2012

General Forums => The Laboratory => Topic started by: ToutEnMasm on June 30, 2009, 06:46:19 AM

Title: view of fpu registers step by step
Post by: ToutEnMasm on June 30, 2009, 06:46:19 AM
Hello,
This tool makes it easier to work with the FPU.
Put your FPU code in the f_callback proc and insert a step call (call step) wherever you want to inspect the state.
Choose a format to view the result (float, hex, exponent, decimal).
You can then see the data registers, the status register and the control register.
Required: at least the VC++ Express edition, because libcmt is used.

I want to do the same thing for the SSE and MMX instructions, but saving the registers doesn't seem to be so easy. Any ideas?
Reports are welcome.

If you have trouble with the strsafe functions, see the remark and zip below.




[attachment deleted by admin]
Title: Re: view of fpu registers step by step
Post by: ToutEnMasm on October 10, 2010, 02:50:30 PM

I had a bad surprise with the new strsafe.lib: it no longer contains the functions used by fpuvisu.lib.
Those functions are now provided as source in the SDK header and cannot be used that way with MASM.
So I extracted strsafe.obj from an old library and added that object to the new fpuvisu.lib.
With this new library you no longer need strsafe.lib.




Title: Re: view of fpu registers step by step
Post by: clive on October 10, 2010, 03:14:46 PM
Quote from: ToutEnMasm
I want to do the same thing for the SSE and MMX instructions, but saving the registers doesn't seem to be so easy. Any ideas?

Use FXSAVE and pull the STx, MMx and xMM registers from the memory image.
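
As a rough illustration of "pulling the registers from the memory image", here is a minimal C sketch that reads fields out of a 512-byte FXSAVE buffer. The offsets are those of the 32-bit legacy FXSAVE layout; the names (fx_u16, fx_mm, fx_xmm and the FX_* constants) are made up for this example and are not part of fpuvisu.lib.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Byte offsets inside the 512-byte FXSAVE image (32-bit legacy layout). */
enum {
    FX_IMAGE_SIZE = 512,
    FX_OFF_FCW    = 0,    /* x87 control word                    */
    FX_OFF_FSW    = 2,    /* x87 status word                     */
    FX_OFF_FTW    = 4,    /* abridged tag byte: bit set = in use */
    FX_OFF_MXCSR  = 24,   /* SSE control/status register         */
    FX_OFF_ST0    = 32,   /* ST0/MM0, one 16-byte slot per reg   */
    FX_OFF_XMM0   = 160,  /* XMM0, one 16-byte slot per reg      */
    FX_SLOT_SIZE  = 16
};

/* Read little-endian values byte by byte, so a hex dump works too. */
static uint16_t fx_u16(const uint8_t *img, size_t off)
{
    return (uint16_t)(img[off] | (img[off + 1] << 8));
}

static uint32_t fx_u32(const uint8_t *img, size_t off)
{
    return (uint32_t)fx_u16(img, off) | ((uint32_t)fx_u16(img, off + 2) << 16);
}

/* Low 64 bits of slot i: the mantissa of ST(i), which is also MMi
   once an MMX instruction has reset the stack top to 0. */
static uint64_t fx_mm(const uint8_t *img, int i)
{
    uint64_t v = 0;
    assert(i >= 0 && i < 8);
    for (int b = 7; b >= 0; --b)
        v = (v << 8) | img[FX_OFF_ST0 + FX_SLOT_SIZE * i + b];
    return v;
}

/* Copy the 16 raw bytes of XMMi (i = 0..7 in 32-bit mode). */
static void fx_xmm(const uint8_t *img, int i, uint8_t out[16])
{
    assert(i >= 0 && i < 8);
    memcpy(out, img + FX_OFF_XMM0 + FX_SLOT_SIZE * i, 16);
}
```

The buffer itself would be filled on the assembly side by FXSAVE with a memory operand that is 512 bytes long and 16-byte aligned; a misaligned operand faults.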
Title: Re: view of fpu registers step by step
Post by: jj2007 on October 12, 2010, 09:02:37 PM
Quote from: clive on October 10, 2010, 03:14:46 PM
Quote from: ToutEnMasm
I want to do the same thing for the SSE and MMX instructions, but saving the registers doesn't seem to be so easy. Any ideas?

Use FXSAVE and pull the STx, MMx and xMM registers from the memory image.

The Streaming SIMD Extensions fields in the save image (XMM0-XMM7 and MXCSR) may not be loaded into the processor if the CR4.OSFXSR bit is not set. This CR4 bit must be set in order to enable execution of Streaming SIMD Extensions. (http://www.rz.uni-karlsruhe.de/rz/docs/VTune/reference/vc129.htm#Layout)

How would you set CR4.OSFXSR?
Title: Re: view of fpu registers step by step
Post by: Antariy on October 12, 2010, 09:28:54 PM
Quote from: jj2007 on October 12, 2010, 09:02:37 PM
How would you set CR4.OSFXSR?

This is done by the OS. You cannot set any CRx register from user mode.



Alex
Title: Re: view of fpu registers step by step
Post by: dedndave on October 12, 2010, 10:57:46 PM
it isn't easy to find information about how different versions of windows treat the control registers
some may be set by BIOS and not messed with by windows
others may be set by BIOS, then changed by windows
still others may be set by windows only, or left as is altogether
it is an area of complete documentation mystery
Title: Re: view of fpu registers step by step
Post by: Antariy on October 12, 2010, 11:05:41 PM
Quote from: dedndave on October 12, 2010, 10:57:46 PM
it isn't easy to find information about how different versions of windows treat the control registers
some may be set by BIOS and not messed with by windows
others may be set by BIOS, then changed by windows
still others may be set by windows only, or left as is altogether
it is an area of complete documentation mystery

Well, this is not the case for CR4.OSFXSR.
I doubt that most BIOSes rely on something as incompatible as SSE.
If the OS knows about and supports SSE, it sets this flag; otherwise, running SSE instructions on an *SSE-capable* CPU leads to #UD, and the program crashes with fun :)



Alex
Title: Re: view of fpu registers step by step
Post by: clive on October 13, 2010, 12:07:38 AM
Quote from: jj2007 on October 12, 2010, 09:02:37 PM
The Streaming SIMD Extensions fields in the save image (XMM0-XMM7 and MXCSR) may not be loaded into the processor if the CR4.OSFXSR bit is not set. This CR4 bit must be set in order to enable execution of Streaming SIMD Extensions. (http://www.rz.uni-karlsruhe.de/rz/docs/VTune/reference/vc129.htm#Layout)

How would you set CR4.OSFXSR?

And what clown writes a multitasking, SSE compatible OS that DOESN'T enable this? As I recall, this instruction was first implemented in Deschutes (Pentium II), prior to Katmai (Pentium III), which supported SSE for the first time. Microsoft supported enabling this feature as far back as February 1998, in one of the beta releases of Windows 98.
Title: Re: view of fpu registers step by step
Post by: clive on October 13, 2010, 12:15:30 AM
Quote from: Antariy
If the OS knows about and supports SSE, it sets this flag; otherwise, running SSE instructions on an *SSE-capable* CPU leads to #UD, and the program crashes with fun

Exactly, otherwise much hilarity would ensue.

The point of the flag in CPUID and the enable bit in CR4 is that setting the latter is an implicit acknowledgment by the OS that the CPU has this function, and that the OS should use it as the most efficient method of preserving the CPU context.

If you are concerned about executing it, wrap it in an SEH handler, or work with a kernel-mode driver that can touch CR4.
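
For reference, the CPUID flag and the CR4 enable bit sit at fixed bit positions, and the check described above can be sketched in C as below. This is only an illustration: the function names are invented here, the EDX value is assumed to come from CPUID leaf 1, and the CR4 value is a plain input because CR4 can only be read in kernel mode. CR0.EM and CR0.TS are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Feature bits in EDX after CPUID with EAX = 1. */
enum { CPUID1_EDX_MMX = 23, CPUID1_EDX_FXSR = 24, CPUID1_EDX_SSE = 25 };

/* CR4.OSFXSR: set by the OS to say it saves context with FXSAVE/FXRSTOR. */
enum { CR4_OSFXSR = 9 };

static bool bit_set(uint32_t reg, int bit)
{
    assert(bit >= 0 && bit < 32);
    return ((reg >> bit) & 1u) != 0;
}

static bool cpu_has_mmx(uint32_t edx)  { return bit_set(edx, CPUID1_EDX_MMX); }
static bool cpu_has_fxsr(uint32_t edx) { return bit_set(edx, CPUID1_EDX_FXSR); }
static bool cpu_has_sse(uint32_t edx)  { return bit_set(edx, CPUID1_EDX_SSE); }

/* Simplified rule from the discussion: SSE instructions raise #UD
   unless the CPU reports SSE and the OS has set CR4.OSFXSR. */
static bool sse_usable(uint32_t edx, uint32_t cr4)
{
    return cpu_has_sse(edx) && cpu_has_fxsr(edx) && bit_set(cr4, CR4_OSFXSR);
}
```

From user mode only the CPUID half can be checked directly; the CR4 half has to be inferred, for example by guarding the first SSE instruction with SEH as suggested above.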
Title: Re: view of fpu registers step by step
Post by: dedndave on October 13, 2010, 12:58:38 AM
i agree, Clive
my point was...
where is all that documented ?
you know it, but you are sort of a wizard - lol
we can't all be fricken wizards
Title: Re: view of fpu registers step by step
Post by: clive on October 13, 2010, 01:40:26 AM
Quote from: dedndave on October 13, 2010, 12:58:38 AM
i agree, Clive
my point was...
where is all that documented ? you know it, but you are sort of a wizard - lol we can't all be fricken wizards

Hard to say. When the features were first implemented, Intel was rather sketchy with the details, basically limiting them to preferred development partners; as the product came into mainstream production, the details were reflected in the technical manuals. I can certainly trace the use of FXSAVE/FXRSTOR back to early 1998, when Microsoft put it in the floating point driver for what became Windows 98, and then later in Windows 2000. It is not supported in NT 4.0 or Windows 95, but may have made it into subsequent service packs.

BIOSes didn't tend to use or set up the control registers; they basically had to bring up DOS and support the INT 15 functions. They deal with the A20 switch, the fast-reset mode switching, and setting up the north bridge, south bridge, peripherals and memory. Once in DOS, things like HIMEM or EMM386 are where most of the control registers and virtualization are set up, or where ownership is taken from the BIOS.

The BIOS also looks at the CPUID, checking for and disabling the serial number function, checking the stepping of the processor and applying a microcode patch if available. It enables and initializes the cache, and tweaks paging, memory size, or other options (MSRs) as needed or as directed by the CMOS/BIOS settings.

The BIOS writer would likely work directly with Intel engineers, and have documentation that is not generally available. A lot of the real fun aspects are covered by the chip errata, but the fixes and the function of the MSRs are not well or openly documented (at least the last time I cared to look).
Title: Re: view of fpu registers step by step
Post by: dedndave on October 13, 2010, 02:18:42 AM
well - the definition of the registers seems to be fairly well documented, as far as Intel and AMD are concerned
where the documentation is lacking is what BIOS and Windows do with them at boot
Title: Re: view of fpu registers step by step
Post by: jj2007 on October 13, 2010, 06:39:23 AM
Quote from: clive on October 13, 2010, 12:07:38 AM
Quote from: jj2007 on October 12, 2010, 09:02:37 PM
The Streaming SIMD Extensions fields in the save image (XMM0-XMM7 and MXCSR) may not be loaded into the processor if the CR4.OSFXSR bit is not set. This CR4 bit must be set in order to enable execution of Streaming SIMD Extensions. (http://www.rz.uni-karlsruhe.de/rz/docs/VTune/reference/vc129.htm#Layout)

How would you set CR4.OSFXSR?

And what clown writes a multitasking, SSE compatible OS that DOESN'T enable this?

Ok, so what you are saying is it's enabled by default. Although there are many clowns working for Microsoft, of course :green2
Title: Re: view of fpu registers step by step
Post by: clive on October 13, 2010, 06:18:51 PM
Quote from: jj2007
How would you set CR4.OSFXSR?
..
Ok, so what you are saying is it's enabled by default. Although there are many clowns working for Microsoft, of course :green2

Rich clowns at that. They implemented it in the OS as soon as the processor supported it, but it clearly was not in Win 3.x or Win 95 because it had not been invented at that time. The MMX register state was saved using the FSAVE/FRSTOR functions of the FPU.

The OS enables it as part of its handling of the math coprocessor context, to preserve the FPU/MMX/SSE state as it switches from task to task. Microsoft has had out-of-the-box support for FXSAVE/FXRSTOR since Windows 98 (circa Feb 98, at the time the Deschutes processor was released, it was in a beta release and might have been in earlier releases). As I recall they allocated 512 bytes for the context, but FXSAVE only writes out portions of that; how much clearly depends on the processor, and on whether there is an XMM8..XMM15 set of registers.

When the processor boots, CR4.OSFXSR is NOT enabled. So to use FXSAVE/FXRSTOR under DOS you have to explicitly enable it.
Title: Re: view of fpu registers step by step
Post by: ToutEnMasm on October 16, 2010, 09:28:08 AM

I have just one question now.
When MMX instructions are used, the EMMS instruction must be used to clear the registers before an FPU instruction.
Is there a way to test whether the CPU is in FPU or MMX mode?
Title: Re: view of fpu registers step by step
Post by: ToutEnMasm on October 18, 2010, 12:53:43 PM

There are not too many answers  :P
Perhaps this link gives one:
http://iptraf.org/instruction-mmx/page-1/index-Instruction%20mmx.html

If I understand correctly, this means:
XSAVE_FORMAT.TagWord == 0 (Byte) ----> MMX mode ELSE FPU mode
And if we execute an FPU instruction in MMX mode:
Quote
If a floating-point instruction loads one of the registers in the FPU register stack before the FPU tag word has been reset by the EMMS instruction, a floating-point stack overflow can occur that will result in a floating-point exception or incorrect result.
Is that right?

Title: Re: view of fpu registers step by step
Post by: FORTRANS on October 18, 2010, 01:24:56 PM
Hi,

   The tag word shows the status of the FPU registers (in FPU
mode of course).  Four states are shown in the documentation.


   Tag  Register
  value  state

   00  =  Valid
   01  =  Zero
   10  =  Special
   11  =  Empty


   So, all zeroes can exist outside the MMX mode if all of
the registers are full and contain valid numbers.
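
The table above can be applied to a 16-bit tag word with a few shifts. The following is a minimal C sketch, assuming the full (FSAVE-style) tag word with two bits per physical register and register 0 in bits 1:0; the names tag_of, count_tag and tag_name are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    TAG_VALID   = 0,  /* 00 */
    TAG_ZERO    = 1,  /* 01 */
    TAG_SPECIAL = 2,  /* 10 */
    TAG_EMPTY   = 3   /* 11 */
} fpu_tag;

/* Tag of physical register r (0..7); bits 1:0 belong to register 0. */
static fpu_tag tag_of(uint16_t tag_word, int r)
{
    assert(r >= 0 && r < 8);
    return (fpu_tag)((tag_word >> (2 * r)) & 3u);
}

/* Number of registers whose tag equals t. */
static int count_tag(uint16_t tag_word, fpu_tag t)
{
    int n = 0;
    for (int r = 0; r < 8; ++r)
        if (tag_of(tag_word, r) == t)
            ++n;
    return n;
}

/* Human-readable name, matching the table. */
static const char *tag_name(fpu_tag t)
{
    static const char *const names[] = { "Valid", "Zero", "Special", "Empty" };
    return names[t];
}
```

With this, a tag word of 0x0000 decodes as eight Valid registers, which is the full-stack case described above, so a zero tag word by itself does not identify MMX mode.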

Regards,

Steve N.
Title: Re: view of fpu registers step by step
Post by: clive on October 18, 2010, 03:29:04 PM
Quote from: ToutEnMasm
There are not too many answers  :P

I was thinking about it, but couldn't find a quick cite, and wasn't sufficiently invested to look at the CPU behaviour. I don't think it would be too hard to verify/validate.

The information should also exist in the FSAVE/FNSAVE frames, as MMX was supposed to work in systems without explicit knowledge of the scheme. So legacy code to preserve the FPU context should maintain the MMX state across interrupts and task switches.
Title: Re: view of fpu registers step by step
Post by: ToutEnMasm on October 18, 2010, 03:47:26 PM

Perhaps I can simply do this:
;first rule out the cases where FXSAVE doesn't work (2 cases)
FXSAVE [savearea]   ;savearea (placeholder name): 512 bytes, 16-byte aligned
emms                ;make sure we are in FPU mode
;--- use of the FPU registers is allowed here ---
FXRSTOR [savearea]
;if this does not put us back in MMX mode,
;the first MMX instruction will switch the FPU back to MMX mode (that is what is said about them)
Title: Re: view of fpu registers step by step
Post by: clive on October 19, 2010, 12:33:38 AM
Ok, so it is the Tag Word. In the 108-byte FSAVE/FNSAVE variant it is the word at +8; with MMX it is 0xAAAA (the AA AA bytes marked below) and after EMMS it is 0xFFFF. So the MMX state is "special".

FSAVE after PXOR MM7,MM7
                               vv vv
0000 : 7F 03 FF FF 00 00 FF FF-AA AA FF FF 00 00 00 00 ................
0010 : 00 00 00 00 00 00 00 00-00 00 FF FF 00 00 00 00 ................
0020 : A0 39 16 00 00 00 AA B8-15 00 0A D8 90 7C 11 B4 .9...........|..
0030 : 17 00 00 00 4C F5 13 00-24 00 01 00 00 00 48 98 ....L...$.....H.
0040 : 15 00 61 B4 00 00 03 02-00 00 00 00 04 00 00 00 ..a.............
0050 : 00 00 00 00 00 00 04 00-D8 F9 13 00 38 02 15 00 ............8...
0060 : AC F5 00 00 00 00 00 00-00 00 FF FF             ............

FSAVE after EMMS

0000 : 7F 03 FF FF 00 00 FF FF-FF FF FF FF 00 00 00 00 ................
0010 : 00 00 00 00 00 00 00 00-00 00 FF FF 00 00 00 00 ................
0020 : A0 39 16 00 00 00 AA B8-15 00 0A D8 90 7C 11 B4 .9...........|..
0030 : 17 00 00 00 4C F5 13 00-24 00 01 00 00 00 48 98 ....L...$.....H.
0040 : 15 00 61 B4 00 00 03 02-00 00 00 00 04 00 00 00 ..a.............
0050 : 00 00 00 00 00 00 04 00-D8 F9 13 00 38 02 15 00 ............8...
0060 : AC F5 00 00 00 00 00 00-00 00 FF FF             ............
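
To read such a dump programmatically, a small C sketch along these lines may help. It assumes the 108-byte 32-bit protected-mode FSAVE layout that the dumps appear to use (control word at +0, status word at +4, tag word at +8, eight 10-byte registers from +0x1C); the helper names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

enum {
    FSAVE_SIZE     = 108,
    FSAVE_OFF_CW   = 0,   /* control word                         */
    FSAVE_OFF_SW   = 4,   /* status word                          */
    FSAVE_OFF_TW   = 8,   /* tag word                             */
    FSAVE_OFF_REGS = 28,  /* 0x1C: eight 10-byte registers follow */
    FSAVE_REG_SIZE = 10
};

/* Little-endian 16-bit read from the image. */
static uint16_t fsave_u16(const uint8_t *img, int off)
{
    assert(off >= 0 && off + 1 < FSAVE_SIZE);
    return (uint16_t)(img[off] | (img[off + 1] << 8));
}

/* 64-bit mantissa of stored register i; this is the MMX value
   while the unit is in MMX mode. */
static uint64_t fsave_mantissa(const uint8_t *img, int i)
{
    uint64_t v = 0;
    assert(i >= 0 && i < 8);
    for (int b = 7; b >= 0; --b)
        v = (v << 8) | img[FSAVE_OFF_REGS + FSAVE_REG_SIZE * i + b];
    return v;
}

/* Sign and exponent word of stored register i (bytes 8..9 of its slot). */
static uint16_t fsave_sign_exp(const uint8_t *img, int i)
{
    assert(i >= 0 && i < 8);
    return fsave_u16(img, FSAVE_OFF_REGS + FSAVE_REG_SIZE * i + 8);
}
```

Applied to the first dump, the tag word reads 0xAAAA and the last register slot (bytes 0x62 to 0x6B) has a zero mantissa with a sign/exponent word of 0xFFFF, which fits MM7 having been cleared by PXOR MM7,MM7 and explains why FSAVE classifies it as "special".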