The MASM Forum Archive 2004 to 2012

General Forums => The Campus => Topic started by: theunknownguy on November 08, 2010, 01:03:19 AM

Title: LOCK prefix question
Post by: theunknownguy on November 08, 2010, 01:03:19 AM
Hey guys, can somebody give me a good explanation of how the LOCK prefix really works?

From what I understand, LOCK mostly ensures that the memory being accessed exclusively isn't used by another function at the same time (multiprocessor).

So from this point of view, emulating the LOCK prefix in my sandbox wouldn't be needed at all...

I still want to know in depth how the LOCK prefix works.

Thanks
Title: Re: LOCK prefix question
Post by: redskull on November 08, 2010, 01:07:23 AM
It "locks" the memory bus so that, generally, other CPUs on multi-CPU systems don't access the same memory at the same time (much like a hardware-level mutex or semaphore).  With everything being in the cache these days, it's not as important.

-r
Title: Re: LOCK prefix question
Post by: theunknownguy on November 08, 2010, 01:14:44 AM
Quote from: redskull on November 08, 2010, 01:07:23 AM
It "locks" the memory bus so that, generally, other CPUs on multi-CPU systems don't access the same memory at the same time (much like a hardware-level mutex or semaphore).  With everything being in the cache these days, it's not as important.

-r

Now what happens when trying to access memory that is being held with the LOCK prefix? Would it raise an exception, or would it simply wait until the LOCK operation is done?
Title: Re: LOCK prefix question
Post by: redskull on November 08, 2010, 01:36:59 AM
I can't speak for certain, but I believe the second processor will simply wait until the first processor is finished.  I am fairly sure it doesn't throw an exception.

-r
Title: Re: LOCK prefix question
Post by: clive on November 08, 2010, 01:59:19 AM
You expose the true latency of the memory subsystem, because the read can only occur after all other pending writes to the location have completed (i.e. write buffers, write-back, flushing of matching dirty cache lines on all processors), and during the actual read and modification the bus is exclusively held (owned) until the write itself completes.

Remember the buses will likely be running at different speeds and different widths, and ownership (CPUs, DMA, etc.) is arbitrated.

It is a very expensive operation. An exchange on memory (implicitly locked) will cost at least twice the access time of the underlying memory, plus whatever setup and turnaround times that might have.

It does not fault, it just stalls everything.
Title: Re: LOCK prefix question
Post by: theunknownguy on November 08, 2010, 03:05:01 AM
Thanks, both redskull and clive. All perfectly clear in my mind now  :clap:
Title: Re: LOCK prefix question
Post by: clive on November 08, 2010, 04:43:14 PM
Quote from: clive
It does not fault, it just stalls everything.

I might refine that. It stalls everything on the common memory bus. It would not have to stall a second processor from running its own code out of cache or local memory. Any conflicting cache lines should have been voided, so the processor should/could act autonomously. Attempts to refill the cache from the common memory will stall.

If you have NUMA-style multiprocessors, they should be able to proceed if the locked region is outside of their local memory arena, whereas SMP would block.

Much of this will clearly depend on the CPU, chipset, busing and memory architecture.
Title: Re: LOCK prefix question
Post by: dedndave on November 08, 2010, 11:54:32 PM
with the 8086, i seem to recall we sometimes used lock with I/O instructions, too
maybe my memory is bus-locked - lol
Title: Re: LOCK prefix question
Post by: clive on November 09, 2010, 03:48:09 AM
Quote from: dedndave on November 08, 2010, 11:54:32 PM
with the 8086, i seem to recall we sometimes used lock with I/O instructions, too
maybe my memory is bus-locked - lol

Not sure I can help you there Dave. I can't think of any IO code using RMW. There's DMA and video refresh, but LOCKing a CGA access would just snow things up more.
Title: Re: LOCK prefix question
Post by: dedndave on November 09, 2010, 04:13:51 AM
the point was - i think bus lock applies to I/O the same as it applies to memory
unfortunately, i don't recall any specific instance for discussion - lol
it may be used that way in device drivers, and is probably why we don't see it too often
there is INSB and OUTSB that can be REP'd, but i don't think REP and LOCK go together well   :P

in any case, the instruction prefix is rarely used altogether
and, there is no great way to emulate its behavior, other than using the prefix itself
Title: Re: LOCK prefix question
Post by: dioxin on November 09, 2010, 10:15:58 AM
From the Intel manual:
The LOCK prefix can be prepended only to the following instructions and to those forms of the
instructions that use a memory operand: ADD, ADC, AND, BTC, BTR, BTS, CMPXCHG,
DEC, INC, NEG, NOT, OR, SBB, SUB, XOR, XADD, and XCHG. An undefined opcode
exception will be generated if the LOCK prefix is used with any other instruction. The XCHG
instruction always asserts the LOCK# signal regardless of the presence or absence of the LOCK
prefix.
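For anyone writing a decoder or sandbox, the rule in the excerpt above can be sketched as a small check. This is only an illustrative sketch in Python: the names `LOCKABLE`, `UndefinedOpcode`, `check_lock_prefix` and `implicitly_locked` are hypothetical, the mnemonic list is copied from the excerpt (later manual revisions may list more), and the "memory operand" condition for XCHG is an assumption based on the fact that registers are not shared between processors.

```python
# Mnemonics copied from the Intel manual excerpt quoted above.
LOCKABLE = frozenset({
    "ADD", "ADC", "AND", "BTC", "BTR", "BTS", "CMPXCHG", "DEC", "INC",
    "NEG", "NOT", "OR", "SBB", "SUB", "XOR", "XADD", "XCHG",
})


class UndefinedOpcode(Exception):
    """Hypothetical stand-in for the undefined opcode exception."""


def check_lock_prefix(mnemonic, has_memory_operand):
    """Raise UndefinedOpcode if LOCK may not be combined with this instruction.

    LOCK is accepted only for the listed mnemonics, and only for the
    forms that use a memory operand.
    """
    if mnemonic.upper() not in LOCKABLE or not has_memory_operand:
        raise UndefinedOpcode("LOCK " + mnemonic + ": invalid lock sequence")


def implicitly_locked(mnemonic, has_memory_operand):
    """XCHG on memory is treated as locked even without the prefix (assumption:
    the register-only form needs no bus lock)."""
    return mnemonic.upper() == "XCHG" and has_memory_operand
```

A decoder would call `check_lock_prefix` whenever it sees the F0 prefix byte, and consult `implicitly_locked` for every XCHG it executes.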
Title: Re: LOCK prefix question
Post by: dedndave on November 09, 2010, 10:44:48 AM
thanks Paul
guess my memory is useless, nowadays   :P

that also explains why XCHG is such a slug
however, i doubt that applies to XCHG reg,reg
they probably mean to say...
Quote
The XCHG instruction always asserts the LOCK# signal regardless of the presence or absence of the LOCK prefix, if a memory operand is used.
Title: Re: LOCK prefix question
Post by: FORTRANS on November 09, 2010, 01:36:51 PM
Hi,

   Well it is a bit unlikely that another processor is going to
mess with a register's contents.  Whereas memory is fair
game.  As far as I/O and memory being similar, it started
out as being the state of one pin on the CPU package that
specified whether I/O or memory was accessed.

Regards,

Steve N.
Title: Re: LOCK prefix question
Post by: theunknownguy on November 09, 2010, 05:05:04 PM
Real-life examples powered by Microsoft  :lol:

kernel32.InterlockedExchange
7C80981E    8B4C24 04      MOV ECX,DWORD PTR SS:[ESP+4]           ; ntdll.7C920208
7C809822    8B5424 08      MOV EDX,DWORD PTR SS:[ESP+8]
7C809826    8B01           MOV EAX,DWORD PTR DS:[ECX]             ; ntdll.7C91DC9C
7C809828    F0:0FB111      LOCK CMPXCHG DWORD PTR DS:[ECX],EDX    ; LOCK prefix
7C80982C  ^ 75 FA          JNZ SHORT kernel32.7C809828
7C80982E    C2 0800        RETN 8

kernel32.InterlockedCompareExchange
7C809832    8B4C24 04      MOV ECX,DWORD PTR SS:[ESP+4]           ; ntdll.7C920208
7C809836    8B5424 08      MOV EDX,DWORD PTR SS:[ESP+8]
7C80983A    8B4424 0C      MOV EAX,DWORD PTR SS:[ESP+C]
7C80983E    F0:0FB111      LOCK CMPXCHG DWORD PTR DS:[ECX],EDX    ; LOCK prefix
7C809842    C2 0C00        RETN 0C

kernel32.InterlockedExchangeAdd
7C809846    8B4C24 04      MOV ECX,DWORD PTR SS:[ESP+4]           ; ntdll.7C920208
7C80984A    8B4424 08      MOV EAX,DWORD PTR SS:[ESP+8]
7C80984E    F0:0FC101      LOCK XADD DWORD PTR DS:[ECX],EAX       ; LOCK prefix
7C809852    C2 0800        RETN 8
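To make the three listings above easier to follow, here is a rough Python model of what each routine does. It is a sketch under assumptions, not Microsoft's code: the `Memory` class, its method names and the single `threading.Lock` standing in for the hardware lock are all hypothetical, and values are treated as unsigned 32-bit. Note how InterlockedExchange is built from a LOCK CMPXCHG retry loop (the JNZ jumps back to the CMPXCHG, which has already reloaded EAX with the current memory value).

```python
import threading

MASK32 = 0xFFFFFFFF


class Memory:
    """Toy 32-bit memory; one Python lock stands in for the hardware lock."""

    def __init__(self):
        self.cells = {}
        self._bus = threading.Lock()

    def lock_cmpxchg(self, addr, eax, src):
        """Model of LOCK CMPXCHG [addr],src. Returns (ZF, new EAX)."""
        with self._bus:
            current = self.cells.get(addr, 0)
            if current == eax:
                self.cells[addr] = src & MASK32
                return True, eax
            return False, current   # on failure EAX receives the memory value

    def lock_xadd(self, addr, reg):
        """Model of LOCK XADD [addr],reg. Returns the old memory value."""
        with self._bus:
            old = self.cells.get(addr, 0)
            self.cells[addr] = (old + reg) & MASK32
            return old


def interlocked_exchange(mem, addr, value):
    eax = mem.cells.get(addr, 0)              # MOV EAX,[ECX] (plain read)
    while True:
        zf, eax = mem.lock_cmpxchg(addr, eax, value)
        if zf:                                # JNZ retries the CMPXCHG
            return eax                        # previous value


def interlocked_compare_exchange(mem, addr, exchange, comparand):
    _, eax = mem.lock_cmpxchg(addr, comparand, exchange)
    return eax                                # value found at addr


def interlocked_exchange_add(mem, addr, value):
    return mem.lock_xadd(addr, value)         # previous value
```

All three return the value that was in memory before the operation, which matches what the listings leave in EAX at the RETN.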

Title: Re: LOCK prefix question
Post by: dedndave on November 09, 2010, 05:07:34 PM
i can see where you might use it while writing semaphores between threads
Title: Re: LOCK prefix question
Post by: theunknownguy on November 09, 2010, 05:09:15 PM
Quote from: dedndave on November 09, 2010, 05:07:34 PM
i can see where you might use it while writing semaphores between threads

Yes, and maybe on I/O operations. I don't see any other application for the LOCK prefix.

Indeed, if my sandbox were to execute threads (not sure yet whether I'll add this option), I would have to add something like the LOCK prefix, but nothing more complex than setting a flag that holds off other memory operations while a locked one is being executed.
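The "flag" idea for a threaded sandbox could look roughly like the following Python sketch. Everything here is hypothetical (the `SandboxMemory` class, `locked_rmw`, `run_guest_threads`, the dictionary used as guest memory); it only illustrates holding one flag across the whole read-modify-write of a LOCKed instruction so that no other emulated write can land in between.

```python
import threading


class SandboxMemory:
    """Hypothetical guest memory for an emulator that runs guest threads."""

    def __init__(self):
        self._cells = {}
        # The "flag": held for the duration of a LOCKed instruction.
        # Re-entrant so locked_rmw can call write() while holding it.
        self._lock_flag = threading.RLock()

    def read(self, addr):
        return self._cells.get(addr, 0)

    def write(self, addr, value):
        # Ordinary writes also honor the flag, so they wait for a
        # locked operation in progress instead of faulting.
        with self._lock_flag:
            self._cells[addr] = value & 0xFFFFFFFF

    def locked_rmw(self, addr, op):
        """Run read -> op -> write as one indivisible step; return the old value."""
        with self._lock_flag:
            old = self.read(addr)
            self.write(addr, op(old))
            return old


def run_guest_threads(mem, addr, threads=4, increments=1000):
    """Each worker models a guest thread executing LOCK INC [addr] in a loop."""
    def guest():
        for _ in range(increments):
            mem.locked_rmw(addr, lambda v: v + 1)

    workers = [threading.Thread(target=guest) for _ in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return mem.read(addr)
```

With the flag held across the whole operation, four workers doing 1000 increments each always leave 4000 in the cell. A waiting thread simply blocks until the flag is released, which mirrors the "it stalls, it does not fault" behavior described earlier in the thread.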

Also, the exception raised when LOCK is used on an instruction that doesn't allow it is: C000001Eh INVALID_LOCK_SEQUENCE.

Does anybody have a website or list of the exceptions that might be raised in all possible cases?