I just looked at some of the source for library modules and noticed that many of them save various registers by using a negative stack reference. I can see that this would be much faster than doing a push/pop pair to save and restore, but I'm curious about the implications and any restrictions this might place on programs that use the library. Does this preclude the program from setting up and using interrupts and exception traps? Are those types of events automatically handled using a different stack? A common example is a program that sets timer events in its process ... I know that can be handled in the message loop, but I also thought that a user application could be written to service such events, and other exceptions and traps, through a process interrupt vector.
General question is what restrictions does this place on the program that calls functions like this? In 32-bit flat mode running in ring-3 are there ever conditions that would make it unsafe to call szCopy because it saves a value in an unreserved stack location?
szCopy proc src:DWORD,dst:DWORD
; ----------------------------------------------------------
; copy source to destination returning the byte count copied
; ----------------------------------------------------------
mov [esp-4], esi
mov esi, [esp+4]
mov edx, [esp+8]
It is the same as using a local variable (without all the ebp stuff).
No, it's not the same at all, because you are using an unreserved dword on the stack in this case. If your program is allowed to receive interrupts that use the process stack, then the real value of esi will be replaced by the flags or the current instruction pointer when an interrupt or trap is serviced. It certainly avoids the overhead of setting up a stack frame using ebp, and it is fine if no interrupts or traps are allowed ... What about the debugger traps that use 0CCh (int 3) as breakpoints? What about a divide error trap? Do they automatically use a separate stack?
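For comparison, here is a minimal sketch of the same kind of prologue with the save slot reserved by a push, so that anything else using the process stack lands below the saved esi instead of on top of it. The name szCopySafe is hypothetical, and the copy loop is only an assumed plain byte copy (returning a count that includes the terminator), not the library's actual source. Note that the argument offsets grow by 4 after the push.

```asm
include \masm32\include\masm32rt.inc

szCopySafe PROTO :DWORD,:DWORD        ; hypothetical name, not a library routine

.data
srcText db "negative stack reference",0
dstText db 32 dup(0)

.code
OPTION PROLOGUE:NONE                  ; no automatic stack frame
OPTION EPILOGUE:NONE
szCopySafe proc src:DWORD,dst:DWORD
    push esi                ; reserves the dword: esp now points at the saved esi
    mov esi, [esp+8]        ; src (was [esp+4] before the push)
    mov edx, [esp+12]       ; dst (was [esp+8] before the push)
    xor eax, eax            ; eax = bytes copied so far
  @@:
    mov cl, [esi+eax]       ; assumed copy loop: one byte at a time
    mov [edx+eax], cl
    add eax, 1
    test cl, cl
    jnz @B                  ; stop after the terminating zero is copied
    pop esi                 ; restore esi from its reserved slot
    ret 8                   ; stdcall: remove the two dword arguments
szCopySafe endp
OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef

start:
    invoke szCopySafe, offset srcText, offset dstText
    print offset dstText,13,10
    exit
end start
```

The cost is one push and one pop per call; in exchange the saved register no longer depends on nothing touching the stack below esp.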
Phil,
It's a pretty standard way to write a proc without a stack frame, and probably the best comparison is to write the same proc WITH a stack frame and disassemble it. Locals normally go below the stack pointer, where parameters go above it.
LOCAL var :DWORD; = [esp-4]
args on the stack as passed to a procedure start at [esp+4].
You cannot call an interrupt directly from ring 3; you need a driver to do it, except on very early Win95 OEM versions that allowed a few of the old DOS interrupts.
Whether you copy ESP to EBP with a stack frame or use ESP directly, it still refers to the same stack memory, so there is no loss there.
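As a rough sketch of the comparison suggested above, the following writes out by hand the kind of frame a standard prologue and epilogue set up for a proc with one dword local. FrameDemo and arg1 are hypothetical names used only for illustration. The point to notice is that the frame lowers esp before the local is used, so the local at [ebp-4] sits at esp rather than below it.

```asm
include \masm32\include\masm32rt.inc

FrameDemo PROTO :DWORD                ; hypothetical name for illustration

.code
OPTION PROLOGUE:NONE                  ; write the frame by hand to make it visible
OPTION EPILOGUE:NONE
FrameDemo proc arg1:DWORD
    push ebp                ; save the caller's frame pointer
    mov ebp, esp            ; ebp = frame base; return address at [ebp+4]
    sub esp, 4              ; reserve one dword local at [ebp-4]
    mov eax, [ebp+8]        ; first argument
    mov [ebp-4], eax        ; the local is at [esp], so a later push cannot hit it
    mov eax, [ebp-4]        ; read the local back as the return value
    leave                   ; mov esp, ebp / pop ebp
    ret 4                   ; stdcall: remove the one dword argument
FrameDemo endp
OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef

start:
    invoke FrameDemo, 12345678h
    print uhex$(eax),13,10
    exit
end start
```

In the frameless version, by contrast, [esp-4] is used while esp still points at the return address, so the slot is not reserved.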
Thanks for the explanation, Hutch! As long as an interrupt, trap, or exception can't cause the data below the stack pointer in the process stack to be destroyed, it's a lot quicker than push/pop. It's a neat trick!
I tested an INT 2Eh call under Windows 2000 SP4. My results would seem to indicate that the interrupt call is not being bypassed, and it is not using the current stack.
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
include \masm32\include\masm32rt.inc
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
.data
.code
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
start:
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
; Verify that the interrupt call is being made.
mov eax,1
int 2eh
print uhex$(eax),13,10
; Verify that the interrupt call does not overwrite [esp-4].
mov dword ptr [esp-4],12345678h
mov eax,1
int 2eh
mov eax,[esp-4]
print uhex$(eax),13,10
mov eax, input(13,10,"Press enter to exit...")
exit
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
end start
Results:
C0000005
12345678
Phil,
A procedure written without a stack frame is usually what you call a leaf procedure, in that it does not call other procedures, but it's very difficult to get one to work where the procedure is used recursively.
Michael,
Interesting test, it surprised me that it could be used in ring3.
Hello Michael,
what does int 2eh with the parameter 1 do? GetLastError?
Code in ring 0 has a different stack from ring 3. During the transition, the different states are stored and restored. I don't see how an int can affect the given code.
I added an 'int 3' to your code Michael and I was surprised to see that it used the process stack and destroyed [esp-4]. I think this means a user breakpoint set inside one of these procedures would also cause unreserved stack elements to be destroyed when it is executed. I'll try to test that "hypothesis" and post again later.
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
include \masm32\include\masm32rt.inc
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
.data
.code
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
start:
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
; Verify that the interrupt call is being made.
mov eax,1
int 2eh
print uhex$(eax),13,10
; Verify that the interrupt call does not overwrite [esp-4].
mov dword ptr [esp-4],12345678h
mov eax,1
int 2eh
mov eax,[esp-4]
print uhex$(eax),13,10
; Show that a debugger breakpoint (int 3) will overwrite [esp-4].
mov dword ptr [esp-4],12345678h
mov eax,1
int 3 ; invoke debugger with a db 0CCh (int 3)
mov eax,[esp-4]
print uhex$(eax),13,10
mov eax, input(13,10,"Press enter to exit...")
exit
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
end start
I'm not sure where the FFFFFFFF that replaced the original [esp-4] came from but it looks like the 'int 3' service routine is using the ring-3 process stack.
c:\asm>testint3
C0000005
12345678
FFFFFFFF
Hello roticv,
The value 1 was just the first value I found that Windows would tolerate.
Mhm, what happens if there's an SEH handler installed?
I'm using an old Microsoft Visual C++ Debugger, and the stack was okay after the breakpoint when I loaded the program first and set a normal breakpoint. I'll see if I can get some code working to test asynchronous timer interrupts and see if they ever affect the stack.
Michael had just posted some code that uses timers, and I'll see if I can build a test to loop thru szCopy while a timer is ticking and see if it ever 'forgets' the actual contents of 'esi'. So far, so good. The earlier example of using an 'int 3' to invoke the debugger was just a trick I learned that doesn't look to be safe to use within a procedure like szCopy. The normal breakpoints worked fine.
QvasiModo: I'm not sure what an SEH handler is, but if interrupts are allowed to be processed using the ring-3 process stack it might present a problem.
SEH = Structured Exception Handler
For more info see Jeremy Gordon's tutorial...
http://www.jorgon.freeserve.co.uk/Except/Except.htm
Quote from: QvasiModo on June 17, 2005, 10:49:59 PM
Mhm, what happens if there's an SEH handler installed?
Indeed, if an exception occurs there is a strong chance the stack will be overwritten. AFAIK SEH uses the ring-3 stack.
I've just checked out SetConsoleCtrlHandler to see if I could create an interrupt with Ctrl-C or Ctrl-Break that could affect the process stack ... but it creates a separate thread that has its own registers and stack. The program I used to test was modified from a processor idle time ditty by MichaelW ... Thanks for the code, I couldn't have done this (in less than a year or two) without you!
Thanks for the link and info about Structured Exception Handlers. I'll see what I can do with that. My guess at this point is that Hutch *really* knows what's going on and that it isn't going to be a problem. I just knew when I saw it that it could have spelled *disaster* if kernel DOS were still in charge!
The only restriction on using negative stack references is that pushing another value will overwrite the stored one, and popping won't return it. Using [esp-4] places the value in the space that would be used by the next push, so it isn't overwriting any parameters or locals, because they are kept at addresses forward of esp (i.e., esp+x). Also, if some other code decides to do the same thing it will overwrite the value, but it's fine for use over a small piece of code. In a larger, more complicated setting, the proper naming and security of a local make it the better choice.
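The push restriction can be seen with a small test in the style of the ones posted earlier. This is only a sketch; the marker value 0AAAAAAAAh is arbitrary and chosen just to be easy to spot. The push writes to exactly the dword that [esp-4] referred to, so after the matching pop the stored value should be gone.

```asm
include \masm32\include\masm32rt.inc

.code
start:
    mov dword ptr [esp-4], 12345678h   ; store in the unreserved dword below esp
    push 0AAAAAAAAh                    ; push lowers esp by 4 and writes to that same dword
    pop ecx                            ; esp is back where it started; ecx = 0AAAAAAAAh
    mov eax, [esp-4]                   ; reads what the push left behind, not 12345678h
    print uhex$(eax),13,10
    exit
end start
```

The same applies to any call or invoke made while the value is parked at [esp-4], since the arguments and return address are pushed through the same location.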
I haven't seen anything recently discussing speed issues with push and pop, though I remember on 386/486 popping was slow compared to normal memory reads.