Hey guys, can somebody give me a good explanation of how the LOCK prefix really works?
As far as I understand, LOCK mostly ensures that the memory being accessed isn't used by another function at the same time (multiprocessor).
From that point of view, emulating the LOCK prefix in my sandbox wouldn't be needed at all...
I'd still like to know in depth how the LOCK prefix works.
Thanks
It "locks" the memory bus so that, generally, other CPUs on multi-CPU systems don't access the same memory at the same time (much like a hardware-level mutex or semaphore). With everything being in the cache these days, it's not as important.
-r
Quote from: redskull on November 08, 2010, 01:07:23 AM
It "locks" the memory bus so that, generally, other CPUs on multi-CPU systems don't access the same memory at the same time (much like a hardware-level mutex or semaphore). With everything being in the cache these days, it's not as important.
-r
Now what happens if another processor tries to access memory while it is LOCKed? Will it raise an exception, or will it simply wait until the LOCKed operation is done?
I can't speak for certain, but I believe the second processor will simply wait until the first processor is finished. I am fairly sure it doesn't throw an exception.
-r
You expose the true latency of the memory subsystem, because the read can only occur after all other pending writes to the location have completed (i.e. write buffers, write-back, flushing of matching dirty cache lines on all processors), and during the actual read and modification the bus is exclusively held (owned) until the write itself completes.
Remember the buses will likely be running at different speeds, with different widths, and ownership (CPUs, DMA, etc.) is arbitrated.
It is a very expensive operation. An exchange on memory (implicitly locked) will cost at least twice the access time of the underlying memory, plus whatever setup and turnaround times that might have.
It does not fault, it just stalls everything.
Thanks, both redskull and clive. All perfectly clear in my mind :clap:
Quote from: clive
It does not fault, it just stalls everything.
I might refine that. It stalls everything on the common memory bus. It would not have to stall a second processor from running its own code out of cache or local memory. Any conflicting cache lines should have been voided, so the processor should/could act autonomously. Attempts to refill the cache from the common memory will stall.
If you have NUMA-style multiprocessors, they should be able to proceed if the locked region is outside of their local memory arena, whereas SMP would block.
Much of this will clearly depend on the CPU, chipset, busing and memory architecture.
with the 8086, i seem to recall we sometimes used lock with I/O instructions, too
maybe my memory is bus-locked - lol
Quote from: dedndave on November 08, 2010, 11:54:32 PM
with the 8086, i seem to recall we sometimes used lock with I/O instructions, too
maybe my memory is bus-locked - lol
Not sure I can help you there Dave. I can't think of any IO code using RMW. There's DMA and video refresh, but LOCKing a CGA access would just snow things up more.
the point was - i think bus lock applies to I/O the same as it applies to memory
unfortunately, i don't recall any specific instance for discussion - lol
it may be used that way in device drivers, and is probably why we don't see it too often
there is INSB and OUTSB that can be REP'd, but i don't think REP and LOCK go together well :P
in any case, the instruction prefix is rarely used altogether
and, there is no great way to emulate its behavior, other than using the prefix itself
From the Intel manual:
The LOCK prefix can be prepended only to the following instructions and to those forms of the
instructions that use a memory operand: ADD, ADC, AND, BTC, BTR, BTS, CMPXCHG,
DEC, INC, NEG, NOT, OR, SBB, SUB, XOR, XADD, and XCHG. An undefined opcode
exception will be generated if the LOCK prefix is used with any other instruction. The XCHG
instruction always asserts the LOCK# signal regardless of the presence or absence of the LOCK
prefix.
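For anyone emulating this rule in a sandbox, here is a minimal sketch in C of the check described in the manual excerpt above. The function name `lock_prefix_is_valid` and its parameters are made up for illustration, and it assumes the decoder already knows the mnemonic and whether the instruction form uses a memory operand. Newer editions of the manual phrase the rule in terms of the destination operand being in memory, so a real decoder would probably want to test that instead.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Mnemonics taken from the manual excerpt quoted above. */
static const char *const lockable_mnemonics[] = {
    "ADD", "ADC", "AND", "BTC", "BTR", "BTS", "CMPXCHG", "DEC", "INC",
    "NEG", "NOT", "OR", "SBB", "SUB", "XOR", "XADD", "XCHG"
};

/* Hypothetical helper (not from any real emulator):
   returns true if a LOCK prefix is acceptable, false if the sandbox
   should raise an undefined opcode exception instead. */
bool lock_prefix_is_valid(const char *mnemonic, bool has_memory_operand)
{
    /* Register-only forms can never be locked. */
    if (mnemonic == NULL || !has_memory_operand)
        return false;

    size_t count = sizeof lockable_mnemonics / sizeof lockable_mnemonics[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(mnemonic, lockable_mnemonics[i]) == 0)
            return true;
    }
    return false;
}
```

A decoder could call this right after it has seen the F0 prefix byte and decoded the following instruction, and signal the exception when it returns false.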
thanks Paul
guess my memory is useless nowadays :P
that also explains why XCHG is such a slug
however, i doubt that applies to XCHG reg,reg
they probably mean to say...
Quote
The XCHG instruction always asserts the LOCK# signal regardless of the presence or absence of the LOCK prefix, if a memory operand is used.
Hi,
Well it is a bit unlikely that another processor is going to
mess with a register's contents. Whereas memory is fair
game. As far as I/O and memory being similar, it started
out as being the state of one pin on the CPU package that
specified whether I/O or memory was accessed.
Regards,
Steve N.
Real life examples powered by microsoft :lol:
kernel32.InterlockedExchange
7C80981E    8B4C24 04     MOV ECX,DWORD PTR SS:[ESP+4]           ; ntdll.7C920208
7C809822    8B5424 08     MOV EDX,DWORD PTR SS:[ESP+8]
7C809826    8B01          MOV EAX,DWORD PTR DS:[ECX]             ; ntdll.7C91DC9C
7C809828    F0:0FB111     LOCK CMPXCHG DWORD PTR DS:[ECX],EDX    ; LOCK prefix
7C80982C  ^ 75 FA         JNZ SHORT kernel32.7C809828
7C80982E    C2 0800       RETN 8

kernel32.InterlockedCompareExchange
7C809832    8B4C24 04     MOV ECX,DWORD PTR SS:[ESP+4]           ; ntdll.7C920208
7C809836    8B5424 08     MOV EDX,DWORD PTR SS:[ESP+8]
7C80983A    8B4424 0C     MOV EAX,DWORD PTR SS:[ESP+C]
7C80983E    F0:0FB111     LOCK CMPXCHG DWORD PTR DS:[ECX],EDX    ; LOCK prefix
7C809842    C2 0C00       RETN 0C

kernel32.InterlockedExchangeAdd
7C809846    8B4C24 04     MOV ECX,DWORD PTR SS:[ESP+4]           ; ntdll.7C920208
7C80984A    8B4424 08     MOV EAX,DWORD PTR SS:[ESP+8]
7C80984E    F0:0FC101     LOCK XADD DWORD PTR DS:[ECX],EAX       ; LOCK prefix
7C809852    C2 0800       RETN 8
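For comparison, the three routines in the listing can be sketched in portable C11 with <stdatomic.h>. This is only an approximation of their behaviour, not Microsoft's source: the `sketch_*` names are invented here, and the assumption is that on x86 a compiler will typically turn these calls into LOCK CMPXCHG and LOCK XADD, much like the disassembly.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Like the InterlockedExchange listing: store new_value and return the
   previous value, retrying the compare-exchange until it succeeds
   (the JNZ loop in the disassembly). */
int32_t sketch_exchange(_Atomic int32_t *target, int32_t new_value)
{
    int32_t seen = atomic_load(target);
    /* On failure 'seen' is refreshed with the current contents,
       much as CMPXCHG reloads EAX, so the loop simply tries again. */
    while (!atomic_compare_exchange_weak(target, &seen, new_value)) {
    }
    return seen;
}

/* Like InterlockedCompareExchange: store new_value only if *target equals
   comparand; either way, return the value that was there before. */
int32_t sketch_compare_exchange(_Atomic int32_t *target,
                                int32_t new_value, int32_t comparand)
{
    int32_t seen = comparand;
    atomic_compare_exchange_strong(target, &seen, new_value);
    return seen;
}

/* Like InterlockedExchangeAdd: add addend and return the old value. */
int32_t sketch_exchange_add(_Atomic int32_t *target, int32_t addend)
{
    return atomic_fetch_add(target, addend);
}
```

In each case the caller gets back the value that was in memory before the operation, which is what makes these usable as building blocks for locks and counters.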
i can see where you might use it while writing semaphores between threads
Quote from: dedndave on November 09, 2010, 05:07:34 PM
i can see where you might use it while writing semaphores between threads
Yes, and maybe in I/O operations. I don't see any other application for the LOCK prefix.
Indeed, if my sandbox were to execute threads (not sure yet whether I'll add that option), I would have to add something like the LOCK prefix, but nothing more complex than setting a flag that locks memory operations while another one is being executed.
Also, the exception raised is C000001Eh (INVALID_LOCK_SEQUENCE).
Does anybody have a website or list of the exceptions that might be raised in all possible cases?
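The "set a flag" idea could look roughly like the C sketch below. Everything in it is hypothetical (`emulated_bus_lock`, `emulate_lock_xadd`, and the assumption that guest memory is reachable through a plain pointer); it only shows one flag that every emulated LOCKed access takes first, and a second emulated thread spinning until the flag is released rather than faulting.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical sandbox state: a single flag standing in for the bus lock. */
static atomic_flag emulated_bus_lock = ATOMIC_FLAG_INIT;

static void bus_lock_acquire(void)
{
    /* Spin until the previous holder clears the flag: wait, never fault. */
    while (atomic_flag_test_and_set(&emulated_bus_lock)) {
    }
}

static void bus_lock_release(void)
{
    atomic_flag_clear(&emulated_bus_lock);
}

/* Emulates LOCK XADD on one guest memory cell: adds value to the cell
   and returns what the cell held before the addition. */
uint32_t emulate_lock_xadd(uint32_t *guest_cell, uint32_t value)
{
    bus_lock_acquire();
    uint32_t old = *guest_cell;
    *guest_cell = old + value;
    bus_lock_release();
    return old;
}
```

Note that this only protects against other emulated accesses that also take the flag. If guest threads can write the same cell without LOCK, the sandbox would need to route those writes through the same flag, or accept the same races real hardware would show.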