
Non-temporal writes: CAUTION

Started by Apl_and_Asm, May 19, 2008, 11:37:21 PM


Apl_and_Asm

We all know that non-temporal writes (e.g. movnti, movntq, movntps) minimize cache pollution
and should be used whenever the memory locations being updated are unlikely to be accessed
"soon". Because non-temporal writes use WC (Write Combining) semantics, memory locations
are not updated "at once": data is combined in buffers and stored to memory when these
buffers are full or when an SFENCE or MFENCE instruction is executed.
However, Intel's words of caution about non-temporal writes in the instruction set reference
pertain to multiprocessor environments only, which is misleading. There are two possibilities.
Either a (single) processor always knows that the contents of its WC buffers have not yet been
written to memory and will not load stale data from those locations, in which case the manuals
should say so, and they do not. Or the processor could be oblivious of this fact, and then an
MFENCE (an SFENCE would not be enough) would be necessary to serialize all loads and stores,
ensuring that every store preceding the MFENCE in program order completes before a load
following it can read from the memory locations which should have been updated.
I would recommend ending any portion of code doing non-temporal writes with an
MFENCE instruction, at least for "peace of mind".
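
As a rough illustration of the recommendation above, here is a minimal sketch in C using SSE2 intrinsics rather than MASM. The helper name nt_fill_fenced is made up for this example. It assumes an x86 compiler that provides <emmintrin.h>, where _mm_stream_si32 maps to MOVNTI and _mm_mfence maps to MFENCE.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2: _mm_stream_si32 (MOVNTI), _mm_mfence (MFENCE) */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the post): fill dst[0..n) with value using
   non-temporal stores, then fence before returning to the caller. */
void nt_fill_fenced(int *dst, size_t n, int value)
{
    /* Keep the 32-bit stores naturally aligned. */
    assert(((uintptr_t)dst % sizeof(int)) == 0);

    for (size_t i = 0; i < n; ++i)
        _mm_stream_si32(&dst[i], value);  /* non-temporal store, WC semantics */

    /* Orders all earlier loads and stores, including the non-temporal
       stores above, before any later ones. _mm_sfence() would order only
       the stores. */
    _mm_mfence();
}
```

A caller would simply write nt_fill_fenced(buf, count, 0) and read buf afterwards. Placing the fence once after the loop, rather than after every store, leaves the WC buffers free to combine writes.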

Alain

You can only come to the morning through the shadows.
(JRR Tolkien)

Apl_and_Asm

In Intel's Architectures Software Developer's Manual, Vol. 1, about non-temporal writes:
"If the memory location being written to is present in the cache hierarchy, the data in
the caches is evicted. The non-temporal data is written to memory with WC semantics."

But in the System Programming Guide, about Write Combining (WC):
"If the WC buffer is partially filled, writes may be delayed until the next occurrence of
a serializing event such as an SFENCE or MFENCE instruction, CPUID execution, a read
or write to uncached memory, an interrupt occurrence or a LOCK instruction execution."

So it would suffice, to tame any fears, for the manual to state explicitly that the "uncached
memory" referred to here also includes locations targeted by non-temporal writes, whose
data has been evicted from the caches as part of the non-temporal write.

In other words, that there is no distinction as to the "nature" of this uncached memory.

I will then remove my hopefully useless MFENCE.
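
The single-processor case discussed here can be checked with a small sketch: a non-temporal store followed immediately by an ordinary load of the same location on the same thread, with no fence in between. The helper name nt_store_then_load is hypothetical, and the sketch again assumes C with SSE2 intrinsics. It only exercises same-thread visibility; it says nothing about what another processor observes, where a fence would still matter.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2: _mm_stream_si32 (MOVNTI) */
#include <stddef.h>

/* Hypothetical helper: write *slot with a non-temporal store, then read it
   back on the same thread without any SFENCE or MFENCE. */
int nt_store_then_load(int *slot, int value)
{
    assert(slot != NULL);

    _mm_stream_si32(slot, value);   /* non-temporal store, WC semantics */

    /* The volatile access forces a real load instead of letting the
       compiler reuse 'value' from a register. */
    return *(volatile int *)slot;
}
```

If the processor honours its own pending WC data, the function returns the value just stored.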