
Lowest possible stack address

Started by jj2007, December 16, 2010, 08:01:59 PM


jj2007

Quote from: Antariy on December 17, 2010, 02:57:01 AM
Please try the code which I suggested. If you want to find the end of the stack, you should go to its end; any "hypothetical" calculations are not right.

I have run your code. It says 30000h, and it crashes at 33000h with "Stack overflow".
However, with "mem access violation allowed" in Olly, you can restart with F9, and it will finally crash at 31000h.
The practical relevance is low, as you would have to write an exception handler to use the last 2000h, aka 0.7% of your available stack. I am not an expert on SEH, but how should the handler know the difference between a "normal exception" somewhere in the legit range and the "final two exceptions"? By checking whether esp is below 33000h? That is the subject of the whole bloody thread: this value is not documented...

Anybody willing to run the attachment (here), in XP and other OS?

Antariy

Jochen, run the code attached.

It will work only on my system and yours, since the ESP constants depend on the system and can vary.

I get this:


0012FFC4         stack on entry

00030000        would be the hard end of the stack

Difference is: Latest_Successfully_Accessed_Address - value printed above

00033024        Latest successfully accessed address
00033020        Latest successfully accessed address
0003301C        Latest successfully accessed address
00033018        Latest successfully accessed address
00033014        Latest successfully accessed address
00033010        Latest successfully accessed address
0003300C        Latest successfully accessed address
00033008        Latest successfully accessed address
00033004        Latest successfully accessed address
00033000        Latest successfully accessed address



Now, computation of difference.

"My result" is returned at the line:

00030000        would be the hard end of the stack


Result of latest successfully accessed address ("your result"):

00033000        Latest successfully accessed address


I.e.:
00033000h-00030000h=3000h

The difference is 3000h (12 KB). It contains the last guard pages and one final inaccessible page.

So you can use my method precisely: just subtract from EAX not 1024*1024, but 1024*1024-16384.

I.e.

mov eax,fs:[4]
add eax,-(1024*1024-16384)
; NOW in EAX *safe* end of the stack


*Safe end* means that you can still return from code that deep without a crash :bg
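
A minimal sketch of how that limit might be used to stop a recursion before it gets too deep. It assumes the MASM32 package (masm32rt.inc, crt_printf) and the default 1 MB stack reserve. The names STACK_RESERVE, SAFETY_MARGIN, FRAME_BYTES, Recurse, szFmt and nDepth are made up for this illustration and do not come from the attachment.

```asm
include \masm32\include\masm32rt.inc

STACK_RESERVE equ 1024*1024     ; assumption: must match the linker's /STACK reserve
SAFETY_MARGIN equ 16384         ; the 16 KB margin suggested above
FRAME_BYTES   equ 2048          ; assumption: generous estimate of one call's stack use

.data
szFmt  db "stopped after %u recursive calls", 10, 0
nDepth dd 0

.code
assume fs:nothing               ; allow direct access to the TIB through fs

Recurse proc
    local localBuf[1024]:byte   ; some stack load per call (never written)
    inc nDepth
    mov eax, fs:[4]             ; top of this thread's stack
    sub eax, STACK_RESERVE - SAFETY_MARGIN   ; eax = "safe end" of the stack
    lea ecx, [esp - FRAME_BYTES]             ; lowest esp the next call may reach
    cmp ecx, eax
    jb  @F                      ; too deep: unwind instead of recursing
    call Recurse
@@:
    ret
Recurse endp

start:
    call Recurse
    invoke crt_printf, addr szFmt, nDepth
    invoke ExitProcess, 0
end start
```

Built as a console application and linked with a different /STACK size, the STACK_RESERVE constant would have to be changed to match; the sketch has no way of detecting the reserve size by itself.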

japheth

VirtualQuery will tell some info about the memory block which contains the stack:


.386
.MODEL FLAT,stdcall
option casemap:none

.nolist
.nocref
WIN32_LEAN_AND_MEAN equ 1
include windows.inc
include stdio.inc
.cref
.list

lf equ 10

CStr macro text:VARARG
local xxx
.const
xxx db text,0
.code
exitm <offset xxx>
endm

.CODE

main PROC c

local mbi:MEMORY_BASIC_INFORMATION
local dwOldProt:dword

invoke VirtualQuery, addr mbi, addr mbi, sizeof MEMORY_BASIC_INFORMATION
invoke printf, CStr(<"VirtualQuery(%X)=%X",lf>), addr mbi, eax
invoke printf, CStr(<"BaseAddress=%X",lf>), mbi.BaseAddress
invoke printf, CStr(<"AllocationBase=%X",lf>), mbi.AllocationBase
invoke printf, CStr(<"RegionSize=%X",lf>), mbi.RegionSize
ret
main ENDP

mainCRTStartup:
call main
invoke ExitProcess, 0

END mainCRTStartup


It's written for WinInc, but it should be trivial to make it work with Masm32.

Here's output on my machine (XP, binary with default stack size):

VirtualQuery(12FFA0)=1C
BaseAddress=12F000
AllocationBase=30000
RegionSize=1000

Note that BaseAddress + RegionSize = 12F000h + 1000h = 130000h, and 130000h - AllocationBase (30000h) = 100000h, i.e. exactly the default 1 MB stack reserve.

japheth

I also checked whether ESP may reach the start of the memory block:


.386
.MODEL FLAT,stdcall
option casemap:none

.nolist
.nocref
WIN32_LEAN_AND_MEAN equ 1
include windows.inc
include stdio.inc
.cref
.list

lf equ 10

CStr macro text:VARARG
local xxx
.const
xxx db text,0
.code
exitm <offset xxx>
endm

.CODE

assume fs:nothing

main PROC c

local mbi:MEMORY_BASIC_INFORMATION
local   bottom:dword

invoke VirtualQuery, addr mbi, addr mbi, sizeof MEMORY_BASIC_INFORMATION
invoke printf, CStr("VirtualQuery(%X)=%X",lf), addr mbi, eax
invoke printf, CStr("BaseAddress=%X",lf), mbi.BaseAddress
invoke printf, CStr("AllocationBase=%X",lf), mbi.AllocationBase
invoke printf, CStr("RegionSize=%X",lf), mbi.RegionSize
mov esi, mbi.BaseAddress
sub esi, mbi.AllocationBase
invoke VirtualAlloc, mbi.AllocationBase, esi, MEM_COMMIT, PAGE_READWRITE
invoke printf, CStr("VirtualAlloc(%X, %X)=%X",lf), mbi.AllocationBase, esi, eax
push myexc
pushd fs:[0]
mov fs:[0],esp
mov esi, esp
.while 1
push 0
.endw
cont_exec:
mov esp,esi
pop fs:[0]
add esp,4
invoke printf, CStr("exception occured at ESP=%X",lf), bottom
ret
myexc:
mov eax, [esp+12] ;get context
mov [eax].CONTEXT.Eip_, cont_exec
mov eax, [eax].CONTEXT.Esp_
mov [bottom], eax
xor eax, eax ;continue execution
retn

main ENDP

mainCRTStartup:
call main
invoke ExitProcess, 0

END mainCRTStartup


The result: it may reach the start of the block, but you cannot detect it with a simple exception handler. The "myexc" label is never reached when you try this program.

With the CDB debugger, it can be seen however:

Quote
image00400000+0x10d0:
004010d0 e82bffffff      call    image00400000+0x1000 (00401000)
0:000>g
VirtualQuery(12FFA0)=1C
BaseAddress=12F000
AllocationBase=30000
RegionSize=1000
VirtualAlloc(30000, FF000)=30000
(1fc.120): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=00000021 ebx=7ffdc000 ecx=77c118bf edx=77c31b78 esi=0012ff94 edi=00000000
eip=00401096 esp=00030000 ebp=0012ffbc iopl=0         nv up ei pl nz na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00010206
image00400000+0x1096:
00401096 6a00            push    0
0:000>

It crashes at ESP=30000h.

redskull

Quote from: jj2007 on December 17, 2010, 01:44:28 AM
My theory is that there are guard pages above 33000h; they are committed until esp hits 33000h minus -1, in which latter case the exception handler says stop. The question is, I repeat, whether the 33000h changes for randomising Windows versions.

The last page of the stack is set as a guard page.  When it's accessed, the exception handler just commits the next sequential one (or reserves another and commits it), and sets that as a guard page.  If it can't reserve it, then it tosses the exception.  In this case, those three pages already exist, so you get the exception.  I don't see how not being able to allocate pages that you already allocated is undocumented...

But what's important is that it's still YOUR MEMORY.  You can set ESP to those addresses and read from them, just like it was the stack.  In fact, you can even get rid of that memory using VirtualFree, and let your stack keep growing until it hits the next range at 20000.  If your 1MB stack starts at 12E000, and you reserve a page at 2D000, then your new "hard limit" is there, instead.
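
To see this layout on a given machine, the stack's allocation can be walked region by region with VirtualQuery. The following is only a sketch, assuming the MASM32 package (masm32rt.inc, crt_printf); WalkStack and szFmt are names invented for the example.

```asm
include \masm32\include\masm32rt.inc

.data
szFmt db "%08X  size=%X  state=%X  protect=%X", 10, 0

.code
WalkStack proc uses esi edi
    local mbi:MEMORY_BASIC_INFORMATION
    ; query the page holding our own local variable to find the stack's allocation
    invoke VirtualQuery, addr mbi, addr mbi, sizeof MEMORY_BASIC_INFORMATION
    mov edi, mbi.AllocationBase     ; lowest address of the stack reservation
    mov esi, edi                    ; esi = address of the region to query next
@@:
    invoke VirtualQuery, esi, addr mbi, sizeof MEMORY_BASIC_INFORMATION
    test eax, eax
    jz  done                        ; query failed
    cmp mbi.AllocationBase, edi
    jne done                        ; left the stack's allocation
    ; state:   1000h = MEM_COMMIT, 2000h = MEM_RESERVE
    ; protect: 4 = PAGE_READWRITE, 104h = PAGE_READWRITE or PAGE_GUARD
    invoke crt_printf, addr szFmt, mbi.BaseAddress, mbi.RegionSize, mbi.State, mbi.Protect
    add esi, mbi.RegionSize         ; step to the next region
    jmp @B
done:
    ret
WalkStack endp

start:
    call WalkStack
    invoke ExitProcess, 0
end start
```

The output lists the reserved part, the guard page(s) and the committed part from the bottom up; the exact sizes and addresses depend on the OS version and on how deep the stack has already grown.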

-r
Strange women, lying in ponds, distributing swords, is no basis for a system of government

jj2007

Quote from: redskull on December 17, 2010, 02:41:30 PM
... sets that as a guard page.  If it can't reserve it, then it tosses the exception.

Yes indeed. And I doubt that you can simply free the 'OS crap', as you called it earlier. What's worse is that I tested it today on another PC, XP with Prescott P4, and instead of at 33000h, the crash happens at 43000h...

Still working on a fool-proof solution...

redskull

I tried executing a

invoke VirtualFree,00030000h, 0, MEM_RELEASE

with no problem, and it vanished off the memory list, same as any other page.  But what's *really* strange is that now it seems to have permanently vanished!  New processes don't seem to have it allocated anymore.  :dazzled:


-r


Ah, found it, it came back at 130000


BogdanOntanu

This kind of behaviour does not have to be documented.

In fact with current address randomization techniques and having PE's with relocations the OS only needs to know the size of your executables' image, stack and heap requirements.

However the OS is free to place your application at whatever address it likes and also free to place your stack at whatever address it likes.

In consequence you should NOT base your application on assumptions about undocumented OS features, such as the address where the stack is located or the lowest address in the stack memory range.

jj2007

Quote from: BogdanOntanu on December 17, 2010, 06:14:37 PM
In consequence you should NOT base your application on assumptions about undocumented OS features like the address where the stack is located and where is the lowest address in the stack memory range.

Dear Bogdan,

Since I want to avoid that this thread is moved to The Colosseum, let me just kindly express my opinion that knowing when my app should stop the recursion is not a luxury but should rather be a basic feature of an OS.

:thumbu

redskull

I think it's well documented (or at least well known) that you are only guaranteed a maximum stack size equal to that specified in the link switch (which defaults to 1MB).  Anything above that is just an extra luxury, not a guarantee.

hutch--

That is pretty much the case, you get what you ask for if that much stack space is available and the OS has enough room to load the application into memory.

Antariy

Jochen,
Quote
Still working on a fool-proof solution...

I already suggested a solution when I proposed subtracting the stack allocation size from the value taken from FS:[4].

But you did not trust it :P

Well, I made a small test app; have a look at it. It is simple and straightforward. I use a macro to determine whether we can make further recursive calls.

The program is multithreaded and will run as many threads as you ask for.

For each thread you should enter the stack size. Depending on the size entered, you will see that threads with different stack sizes reach different numbers of recursion levels, and none of them crashes: they simply return when they "see" that this place "is too deep".

Here are my results; the numbers after each prompt are what I entered:

Quote
Hi, this is testing program
Enter a number of threads to test: 5
Enter a stack size for the thread #1: 200000
Enter a stack size for the thread #2: 400000
Enter a stack size for the thread #3: 600000
Enter a stack size for the thread #4: 800000
Enter a stack size for the thread #5: 1000000
Hi, this is thread #2, starting recursive function...
The thread #2 do 91 recursive calls (limited to the stack size only)
Hi, this is thread #4, starting recursive function...
The thread #4 do 191 recursive calls (limited to the stack size only)
Hi, this is thread #1, starting recursive function...
The thread #1 do 41 recursive calls (limited to the stack size only)
Hi, this is thread #3, starting recursive function...
The thread #3 do 141 recursive calls (limited to the stack size only)
Hi, this is thread #5, starting recursive function...
The thread #5 do 240 recursive calls (limited to the stack size only)
All threads are finished, find and press [Any] key to exit...

The program supports multi-core/multi-CPU systems, of course. The results may not be printed in the order the threads were created. This is just the scheduler "playing": the threads run simultaneously, and it is not known which will return first or next.

Each recursive call has a fat local buffer, to make the test realistic.

So it would be nice to test this program on different systems, especially on x64. On 32-bit systems it should work properly anyway, since only documented things are used in the program. If it does not work properly (crashes), that means the bug is in the *app*, not in the *idea* :bg
On x64 systems it should also work properly, at least as long as 32-bit programs exist. Otherwise many system tools would not work.

The program and sources are in the attached archive.

P.S. The return values of the CRT input functions are not checked, so if you enter bad numbers (0, for example) or no numbers at all, you will be the initiator of the app's crash :P

jj2007

Not bad, Alex, and it seems to work, but in real life the app needs to know how much stack has been assigned. Attached is what I consider "foolproof": it crashes only if you give it one more push than allowed.

The archive contains the *.obj, so that you can link it with different stack sizes.

Quote
004B3000         is the lowest address
009B0000         initial top of the stack
009AFFC4         current top of the stack
004B3024
004B3020
004B301C
004B3018
004B3014
004B3010
004B300C
004B3008
004B3004
004B3000
OK, no overflow
Hit a key to see it crash:
If you see this, the proggie did not work as it should
(red part visible in Olly)


jj2007

Dave,

The first 8 lines after start: tell a more complete story :wink
All three elements necessary to know the exact limit LowStack are somewhat unofficial, especially the three guard pages (3000h); but it works perfectly, at least on Win XP. Grateful for a test with Win2000, Vista and Win 7, though.
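
The attachment itself is not reproduced in the thread, so the following is only a guess at the kind of calculation involved, not the attached code: take the AllocationBase that VirtualQuery reports for the stack and add the three pages (3000h) observed on XP. It assumes the MASM32 package (masm32rt.inc, crt_printf); GUARD_AREA, ShowLimits and szFmt are hypothetical names.

```asm
include \masm32\include\masm32rt.inc

GUARD_AREA equ 3000h    ; assumption: three pages, as observed on XP in this thread

.data
szFmt db "%08X  assumed lowest usable address", 10
      db "%08X  initial top of the stack", 10
      db "%08X  current top of the stack", 10, 0

.code
assume fs:nothing       ; allow direct access to the TIB through fs

ShowLimits proc
    local mbi:MEMORY_BASIC_INFORMATION
    ; query the page holding our own local variable to find the stack's allocation
    invoke VirtualQuery, addr mbi, addr mbi, sizeof MEMORY_BASIC_INFORMATION
    mov eax, mbi.AllocationBase     ; start of the stack reservation
    add eax, GUARD_AREA             ; skip the pages that turned out to be unusable
    mov ecx, fs:[4]                 ; top of this thread's stack
    invoke crt_printf, addr szFmt, eax, ecx, esp
    ret
ShowLimits endp

start:
    call ShowLimits
    invoke ExitProcess, 0
end start
```

Because the 3000h offset is an observation rather than a documented value, a result from this sketch on Win2000, Vista or Win 7 would need to be checked against an actual overflow test before being relied on.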