Following a heated debate (http://www.masm32.com/board/index.php?topic=12460.msg128535#msg128535) on the virtues of recursion, here is a snippet showing that the lowest possible stack address is 33000h, at least on Windows XP:
include \masm32\include\masm32rt.inc
.code
start:
xor ecx, ecx
mov ebx, esp
.Repeat
push eax
add ecx, 4
.Until esp<=33000h ; decimal 208896
mov esi, esp
mov esp, ebx
print hex$(ecx), 9, " bytes of stack available", 13, 10
inkey hex$(esi), 9, " lowest possible stack without crashing"
exit
end start
Try replacing esp<=33000h with esp<33000h...
Now my question: I have googled a lot for this 33000h limit, as it would be the ideal way to test a recursion limit, but I found practically nothing. It seems not to be documented anywhere. Does somebody have insider knowledge?
When you get down that low, the kernel probably has parts of it reserved for whatever, so you are crashing into whatever's already there. You've piqued my curiosity, so I'll look into it further when I have some time.
However, you can easily get around this limit via a linker switch; try the same thing with /STACK:1000000000, and it will put the stack way up high in memory where it can grow as needed (or, at least, until it crashes into your code at the other end).
-r
Yeah, there's definitely something there: 3 pages ranging from 30000 to 33000 of "shareable" memory. It has some Unicode strings in it, with some side-by-side assembly stuff, networking stuff, GDI+... definitely O/S crap.
-r
The correct question is not "how low can you go with the stack address" :D
The correct questions are:
a) how big you can go with the stack size if you need it for recursion
b) how small you can go with the stack size if you want to have many, many threads, each with its own stack space
Quote from: redskull on December 16, 2010, 08:23:32 PM
However, you can easily get around this limit via a linker switch; try the same thing with /STACK:1000000000, and it will put the stack way up high in memory where it can grow as needed (or, at least, until it crashes into your code at the other end).
Interesting idea. The 'high' stack will be reserved, not committed, so there should be no speed penalty.
Quote from: redskull on December 16, 2010, 08:34:32 PM
Yeah, there's definitely something there: 3 pages ranging from 30000 to 33000 of "shareable" memory. It has some Unicode strings in it, with some side-by-side assembly stuff, networking stuff, GDI+... definitely O/S crap.
Can't see that with Olly - how did you see this memory range?
mov esi, esp ; 33000h
mov esp, ebx
mov eax, [esi] ; OK
mov eax, [esi-4] ; stack overflow exception
@Bogdan: Think about some more intelligent questions, always appreciated :U
I'm thinking it's the page table directory, but that's only a gut feeling. I'll have to look at it more later.
Also, yes, the first number is the reserved size, which is allocated page by page as you access it; technically you get a speed penalty once every 1024 pushes, instead of one big one all up front, but if you prefer the latter you can tell the same linker switch to commit the whole thing at once. But either way, the stack should grow until it hits something else. When I reserved a gig, the stack got put all the way at 3BDBE000!
In Olly, under the memory page, the third line, the range at 00030000; I just double-clicked the line. But you should be able to dump the same range any which way you choose.
-r
Quote from: redskull on December 16, 2010, 09:00:49 PM
In Olly, under the memory page, the third line, the range at 00030000; I just double-clicked the line.
Thanks. For others who read this post: Click View, Memory Map, then double-click the desired range. For example, 20000h yields this:
Dump - 00020000..00020FFF
Address Hex dump Decoded data Comments
00020000 ┌. 00100000 dd 00001000 ; MaximumLength = 4096.
00020004 │. 80070000 dd 00000780 ; Length = 1920.
00020008 │. 01000000 dd 00000001 ; Flags = 1
0002000C │. 00000000 dd 00000000 ; DebugFlags = 0
00020010 │. 02006711 dd 11670002 ; ConsoleHandle = 11670002
00020014 │. 00000000 dd 00000000 ; ConsoleFlags = 0
00020018 │. 03000000 dd 00000003 ; StdInputHandle = 00000003
0002001C │. 07000000 dd 00000007 ; StdOutputHandle = 00000007
00020020 │. 0B000000 dd 0000000B ; StdErrorHandle = 0000000B
00020024 │. 2600 dw 26 ; CurrentDir_Size = 38.
00020026 │. 0802 dw 208 ; CurrentDir_Maxsize = 520.
00020028 │. 90020200 dd 00020290 ; CurrentDir = "D:\masm32\RichMasm\"
0002002C │. 0C000000 dd 0000000C ; CurrentDirectoryHandle = 0000000C
Still, I'd like to see official Microsoft documentation saying "tell your app to stop when esp falls below 33000h". It's exciting to see this stuff, but it's called "undocumented behaviour" :bg
Unless you consider this May 1996 "Under the Hood" article on the Thread Information Block (http://www.microsoft.com/msj/archive/s2ce.aspx) to be official Microsoft documentation.
:bg
There is another wildcard on the way: later OS designs randomise the stack address to decrease the viability of stack exploits. If you need much more stack room for recursive algos, set it with the linker and code a recursion depth limiter.
I am trying to imagine how "randomising" the stack address would work in this context. By default, an app has 1MB of stack. A randomised version would add a multiple of DWORD to that, in order to confuse exploiters. The question is whether the lowest possible address, 33000h, would also change...
Win XP SP3:
0012FFC4 stack on entry
000FCFC4 bytes of stack available
00033000 lowest possible stack without crashing
0000303C difference to 1 MB
include \masm32\include\masm32rt.inc
.code
start:
mov ebx, esp
print hex$(ebx), 9, " stack on entry", 13, 10
xor edi, edi
.Repeat
push eax
add edi, 4
.Until esp<=33000h ; decimal 208896
xchg esp, ebx
print hex$(edi), 9, " bytes of stack available", 13, 10
print hex$(ebx), 9, " lowest possible stack without crashing", 13, 10
mov eax, 100000h ; 1 MB
sub eax, edi
inkey hex$(eax), 9, " difference to 1 MB"
exit
end start
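The arithmetic behind the output lines above can be checked with a small model. This is only a sketch in Python, not part of the original MASM program; the constants are the XP SP3 values printed above, and the function name `stack_report` is made up for illustration.

```python
MB = 0x100000  # 1 MB, the default stack reserve assumed in this thread

def stack_report(esp_on_entry, lowest_esp):
    """Return (bytes pushed before reaching lowest_esp, shortfall versus 1 MB)."""
    available = esp_on_entry - lowest_esp
    return available, MB - available

# Values taken from the Win XP SP3 run shown above.
available, shortfall = stack_report(0x0012FFC4, 0x00033000)
print(f"{available:08X} bytes of stack available")
print(f"{shortfall:08X} difference to 1 MB")
```

The shortfall is made of the 3Ch bytes already used above the entry ESP (130000h minus 12FFC4h) plus the 3000h bytes between 30000h and 33000h.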
Quote from: jj2007 on December 17, 2010, 01:23:08 AM
Win XP SP3:
Also add:
mov eax,fs:[4] ; the top of the stack
add eax,-(1024*1024) ; or value which you pass to the linker
; EAX = lowest possible address of the stack; nothing below it belongs to the stack
mov eax,fs:[8] ; the current end of the stack (page granularity)
Have a look at the second line of the snippet: it is a way to determine "the end" of the thread's stack. You subtract the (reserved) stack size that you specified to the linker for the main thread.
For newly created threads, use the same value as for the main thread, OR the value you pass (if any) to CreateThread.
Quote from: jj2007 on December 17, 2010, 01:36:44 AM
Quote from: Antariy on December 17, 2010, 01:28:13 AM
Quote from: jj2007 on December 17, 2010, 01:23:08 AM
Win XP SP3:
Also add:
mov eax,fs:[4] ; the top of the stack
error A2108:use of register assumed to ERROR
assume fs:nothing
... do anything...
assume fs:error
Also, I have made an addition to my original post; have a look at it.
Yes I looked but your assumption is incompatible with my test results.
0012FFC4 stack on entry
000FCFC4 bytes of stack available
00130000 top of the stack
00033000 lowest possible stack without crashing
0000303C difference to 1 MB
My theory is that there are guard pages above 33000h; they are committed one by one until esp drops below 33000h, at which point the exception handler says stop. The question is, I repeat, whether the 33000h changes for randomising Windows versions.
Quote from: jj2007 on December 17, 2010, 01:44:28 AM
Yes I looked but your assumption is incompatible with my test results.
0012FFC4 stack on entry
000FCFC4 bytes of stack available
00130000 top of the stack
00033000 lowest possible stack without crashing
0000303C difference to 1 MB
Can you post total code?
Quote from: jj2007 on December 17, 2010, 01:44:28 AM
My theory is that there are guard pages above 33000h; they are committed one by one until esp drops below 33000h, at which point the exception handler says stop. The question is, I repeat, whether the 33000h changes for randomising Windows versions.
The last guard page works as a flag to raise the "stack overflow" exception.
Please post the complete code; at this moment I'm not in a position to write anything.
in the first post, Alex :U
Quote from: jj2007 on December 17, 2010, 01:44:28 AM
Yes I looked but your assumption is incompatible with my test results.
This is not assumption.
Microsoft support (http://support.microsoft.com/kb/315937):
Quote
This is implemented by placing a page with PAGE_GUARD access at the end of the current stack. When your code causes the stack pointer to point to an address on this page, an exception occurs. The system then does the three following things:
1. Remove the PAGE_GUARD protection on the guard page, so that the thread can read and write data to the memory.
2. Allocate a new guard page that is located one page below the last one.
3. Rerun the instruction that raised the exception.
In this way, the system can grow the stack for your thread automatically. Each thread in a process has a maximum stack size. The stack size is set at compile time by the /STACK:reserve[,commit] linker switch, or by the STACKSIZE statement in the .def file for the project. When this maximum stack size is exceeded, the system does the three following things:
* Remove the PAGE_GUARD protection on the guard page, as described earlier.
* Try to allocate a new guard page below the last one. However, this fails because the maximum stack size has been exceeded.
* Raise an exception, so that the thread can handle it in the exception block.
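The steps quoted above can be played through in a toy model. This is a sketch in Python under stated assumptions: the class `ToyStack`, its fields and the exception name are invented for illustration, one page is assumed committed at start, and only the behaviour described in the quote is modelled, not what any particular Windows version really does.

```python
PAGE = 0x1000  # 4 KB page size on x86

class StackOverflow(Exception):
    """Stands in for the exception raised when no new guard page fits."""

class ToyStack:
    """Toy model of guard-page stack growth; all names are illustrative."""

    def __init__(self, top, reserve, committed_pages=1):
        self.top = top                    # one past the highest stack byte
        self.limit = top - reserve        # lowest reserved address
        # The committed region is [commit_low, top); the guard page sits below it.
        self.commit_low = top - committed_pages * PAGE
        self.guard = self.commit_low - PAGE

    def touch(self, addr):
        """Access addr; grow the stack if the guard page is hit."""
        if addr >= self.commit_low:
            return                        # ordinary committed page
        if self.guard is not None and self.guard <= addr < self.guard + PAGE:
            # Step 1: the guard page becomes a normal committed page.
            self.commit_low = self.guard
            # Step 2: try to place a new guard page one page lower.
            new_guard = self.guard - PAGE
            if new_guard < self.limit:
                self.guard = None
                raise StackOverflow(hex(addr))
            self.guard = new_guard
            return                        # Step 3: the access is retried and succeeds
        raise MemoryError(hex(addr))      # neither committed nor guard page

# Push dwords until the model refuses, using the XP values from this thread.
stack = ToyStack(top=0x130000, reserve=0x100000)
esp = stack.top
try:
    while True:
        esp -= 4
        stack.touch(esp)
except StackOverflow:
    print(f"overflow raised at {esp:08X}")
```

In this toy model the overflow is raised only when the very last reserved page is touched (at 00030FFC). The measurements in this thread show XP stopping three pages earlier, at 33000h, which the quoted text alone does not explain.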
Complete code (already posted above):
include \masm32\include\masm32rt.inc
.code
start:
mov ebx, esp
print hex$(ebx), 9, " stack on entry", 13, 10
xor edi, edi
.Repeat
push eax
add edi, 4
.Until esp<=33000h ; decimal 208896
xchg esp, ebx
print hex$(edi), 9, " bytes of stack available", 13, 10
ASSUME fs:NOTHING
mov eax, fs:[4]
print hex$(eax), 9, " top of the stack", 13, 10
ASSUME fs:ERROR
print hex$(ebx), 9, " lowest possible stack without crashing", 13, 10
mov eax, 100000h ; 1 MB
sub eax, edi
inkey hex$(eax), 9, " difference to 1 MB"
exit
end start
Quote from: dedndave on December 17, 2010, 01:51:29 AM
in the first post, Alex :U
Oh... Why does nobody ever read my posts carefully? Didn't you see that I suggested a method of determination, and that Jochen has not posted code incorporating that suggestion?
Quote from: jj2007 on December 17, 2010, 01:57:22 AM
Complete code (already posted above):
No need for the MS quotations.
The code is not right.
The same question as in my previous post... :( :( :(
The right code is:
mov eax, fs:[4]
sub eax,1024*1024
EAX is the lowest possible address for the current thread's stack. But this address is usually not committed, since the app doesn't go down to the last page.
I.e. somewhere in the code:
mov eax, fs:[4]
sub eax,1024*1024
cmp esp,eax
jbe @WeCannotPushAnyMore ; unsigned compare: also catches ESP already below the limit
push etc.
Alex,
Try running my post with linker option /stack:0
You will see that top of stack is 70000h (not zero, surprise surprise... :wink), but lowest possible address is again 33000h. The system tries to protect the memory below 33000h, it doesn't care how much stack you wanted. You get roughly what you wanted, but very roughly.
Quote from: jj2007 on December 17, 2010, 02:08:18 AM
And instead of whining that nobody reads your posts, start testing the code.
Jochen, I take offence at this point. If you think that I post things for nothing, that is your problem.
Your code is not a real test. Each page of the stack that lies below the current one is guarded. Access to that page causes an access violation, but this is intentional, so that the stack is allocated piece by piece. An accessed page becomes committed. But the very last page cannot be committed, because it is the last page.
Your code does not overflow reliably, since you limit it, and your constants can vary.
Now I'll try to make a testing plan, even if this is not the time for it, and it is a great pity to me that you did not trust me.
Quote from: Antariy on December 17, 2010, 02:12:40 AM
Your code does not overflow reliably, since you limit it, and your constants can vary.
Perhaps you should read my posts carefully:
Try replacing esp<=33000h with esp<33000h...
In Olly's Options->Exceptions, select the "Memory Access Violation" checkbox in the "Ignore..." section.
Now compile this code and run it in Olly.
Press F9. Then press F9 again. Now go to the ESP value in the dump window.
For me, I get
00030000 - would be the lowest address
and a current ESP value of 00031000. The difference is 1000h, one page size. The last page is totally inaccessible. And this is the end of the stack.
Q.E.D.
include \masm32\include\masm32rt.inc
.code
start:
ASSUME fs:NOTHING
mov eax, fs:[4]
sub eax,1024*1024
print hex$(eax)," - would be the lowest address",13,10
@@:
push eax
jmp @B
end start
Which OS? For mine (XP SP3) the value is 33000h.
0012FFC4 stack on entry
000FCFC4 bytes of stack truely available (relative to entry stack)
00130000 top of the stack
00030000 lowest possible address according to Alex
00033000 lowest possible stack according to my computer
0000303C difference 1 MB minus truely available stack
Quote from: jj2007 on December 17, 2010, 02:34:43 AM
Which OS? For mine (XP SP3) the value is 33000h.
Compile the code which I wrote. Only it will show the true end of the stack. Do the things I explained in my previous post when running that code.
You should not rely on the ESP value, initial, final or in between: it is different for each OS and for each thread.
Quote from: jj2007 on December 17, 2010, 02:34:43 AM
Which OS? For mine (XP SP3) the value is 33000h.
00030000 lowest possible address according to Alex
00033000 lowest possible stack according to my computer
The difference is 3000h: 3 pages, 12 KB. The last guard pages signal to you that the end of the stack is close. But the next 2 pages are available, too. So the last accessible address is 31000; 30000 is the lowest possible address, but this page (30000-30FFF) is not accessible.
Not on my PC. But even if it was, why should an app crash into the last two pages, requiring an exception handler? And how should that exception handler know that the end is near???
Complete code again - the exe must be run from a commandline if TestVal=0.
My output for TestVal=0:
0012FFC4 stack on entry
00033024
00033020
0003301C
00033018
00033014
00033010
0003300C
00033008
00033004
00033000
... and there it stops.
Quote from: jj2007 on December 17, 2010, 02:49:08 AM
Not on my PC. But even if it was, why should an app crash into the last two pages, requiring an exception handler? And how should that exception handler know that the end is near???
Complete code again - the exe must be run from a commandline if TestVal=0.
Jochen, you do not calculate the difference correctly. The difference is the distance between "my" stack end and "your" stack end.
Please try the code which I suggested. If you want to find the end of the stack, you should go to its end; any "hypothetical" calculation is not right.
Why not run the code which I suggested as a test? Just follow my way of testing. I say this not because "my way" is tricky and good for me, but because it is the only reliable way.
Quote from: Antariy on December 17, 2010, 02:57:01 AM
Please try the code which I suggested. If you want to find the end of the stack, you should go to its end; any "hypothetical" calculation is not right.
I have run your code. It says 30000h, and it crashes at 33000h with "Stack overflow".
However, with "mem access violation allowed" in Olly, you can restart with F9, and it will finally crash at 31000h.
The practical relevance is low, as you would have to write an exception handler to use the last 2000h, aka roughly 0.8%, of your available stack. I am not an expert on SEH, but how should the handler know the difference between a "normal exception" somewhere in the legit range, and the "final two exceptions"? By checking whether esp is below 33000h? That is the subject of the whole bloody thread: that this value is not documented...
Anybody willing to run the attachment (here (http://www.masm32.com/board/index.php?topic=15665.msg128746#msg128746)), in XP and other OS?
Jochen, run the attached code.
It will work only on my system and yours, since the ESP constants depend on the system and can vary.
I get this:
0012FFC4 stack on entry
00030000 would be the hard end of the stack
Difference is: Latest_Successfully_Accessed_Address - value printed above
00033024 Latest successfully accessed address
00033020 Latest successfully accessed address
0003301C Latest successfully accessed address
00033018 Latest successfully accessed address
00033014 Latest successfully accessed address
00033010 Latest successfully accessed address
0003300C Latest successfully accessed address
00033008 Latest successfully accessed address
00033004 Latest successfully accessed address
00033000 Latest successfully accessed address
Now, computation of difference.
"My result" is returned at the line:
00030000 would be the hard end of the stack
Result of latest successfully accessed address ("your result"):
00033000 Latest successfully accessed address
I.e.:
00033000h-00030000h=3000h
The difference is 3000h (12 KB). It contains the last guard pages and the one last inaccessible page.
So you can use my method precisely: just subtract from EAX not 1024*1024, but 1024*1024-16384.
I.e.
mov eax,fs:[4]
add eax,-(1024*1024-16384)
; NOW in EAX *safe* end of the stack
*Safe end* means that you can still return from such deep code without a crash :bg
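For readers who want to check the numbers, here is the same calculation as a minimal Python sketch. It is not the MASM code itself; the function name `stack_bounds` is invented, and the stack top of 00130000 is the value that fs:[4] returned in the XP runs above.

```python
def stack_bounds(stack_top, reserve=1024 * 1024, margin=16384):
    """Return (hard end, safe end) of a stack, given its top and reserved size."""
    hard_end = stack_top - reserve             # what "sub eax,1024*1024" yields
    safe_end = stack_top - (reserve - margin)  # what "add eax,-(1024*1024-16384)" yields
    return hard_end, safe_end

hard_end, safe_end = stack_bounds(0x00130000)
print(f"{hard_end:08X} hard end, {safe_end:08X} safe end")
```

With these inputs the safe end comes out at 34000h, one page above the 33000h at which the overflow was observed, because the 16384-byte margin is four pages rather than three.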
VirtualQuery will tell some info about the memory block which contains the stack:
.386
.MODEL FLAT,stdcall
option casemap:none
.nolist
.nocref
WIN32_LEAN_AND_MEAN equ 1
include windows.inc
include stdio.inc
.cref
.list
lf equ 10
CStr macro text:VARARG
local xxx
.const
xxx db text,0
.code
exitm <offset xxx>
endm
.CODE
main PROC c
local mbi:MEMORY_BASIC_INFORMATION
local dwOldProt:dword
invoke VirtualQuery, addr mbi, addr mbi, sizeof MEMORY_BASIC_INFORMATION
invoke printf, CStr(<"VirtualQuery(%X)=%X",lf>), addr mbi, eax
invoke printf, CStr(<"BaseAddress=%X",lf>), mbi.BaseAddress
invoke printf, CStr(<"AllocationBase=%X",lf>), mbi.AllocationBase
invoke printf, CStr(<"RegionSize=%X",lf>), mbi.RegionSize
ret
main ENDP
mainCRTStartup:
call main
invoke ExitProcess, 0
END mainCRTStartup
It's written for WinInc, but should be trivial to make it work with Masm32.
Here's output on my machine (XP, binary with default stack size):
VirtualQuery(12FFA0)=1C
BaseAddress=12F000
AllocationBase=30000
RegionSize=1000
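One way to read these three values is sketched below in Python. It is only arithmetic on the printed numbers, not a call to VirtualQuery; the helper name is hypothetical, and it assumes that the queried region (the one containing the local variable) extends up to the top of the stack, which matches the 00130000 reported by fs:[4] earlier in the thread.

```python
def reserved_stack_size(base_address, region_size, allocation_base):
    """Derive (stack top, reserved size) from MEMORY_BASIC_INFORMATION fields."""
    stack_top = base_address + region_size  # end of the region that holds ESP
    return stack_top, stack_top - allocation_base

# Values from the XP output above.
top, reserve = reserved_stack_size(0x12F000, 0x1000, 0x30000)
print(f"top={top:X} reserve={reserve:X}")
```

Here AllocationBase (30000h) is the bottom of the whole reservation, so the distance from it to the stack top is the full 100000h (1 MB) reserve, without the program having to know the linker setting.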
I also checked whether ESP may reach the start of the memory block:
.386
.MODEL FLAT,stdcall
option casemap:none
.nolist
.nocref
WIN32_LEAN_AND_MEAN equ 1
include windows.inc
include stdio.inc
.cref
.list
lf equ 10
CStr macro text:VARARG
local xxx
.const
xxx db text,0
.code
exitm <offset xxx>
endm
.CODE
assume fs:nothing
main PROC c
local mbi:MEMORY_BASIC_INFORMATION
local bottom:dword
invoke VirtualQuery, addr mbi, addr mbi, sizeof MEMORY_BASIC_INFORMATION
invoke printf, CStr("VirtualQuery(%X)=%X",lf), addr mbi, eax
invoke printf, CStr("BaseAddress=%X",lf), mbi.BaseAddress
invoke printf, CStr("AllocationBase=%X",lf), mbi.AllocationBase
invoke printf, CStr("RegionSize=%X",lf), mbi.RegionSize
mov esi, mbi.BaseAddress
sub esi, mbi.AllocationBase
invoke VirtualAlloc, mbi.AllocationBase, esi, MEM_COMMIT, PAGE_READWRITE
invoke printf, CStr("VirtualAlloc(%X, %X)=%X",lf), mbi.AllocationBase, esi, eax
push myexc
pushd fs:[0]
mov fs:[0],esp
mov esi, esp
.while 1
push 0
.endw
cont_exec:
mov esp,esi
pop fs:[0]
add esp,4
invoke printf, CStr("exception occured at ESP=%X",lf), bottom
ret
myexc:
mov eax, [esp+12] ;get context
mov [eax].CONTEXT.Eip_, cont_exec
mov eax, [eax].CONTEXT.Esp_
mov [bottom], eax
xor eax, eax ;continue execution
retn
main ENDP
mainCRTStartup:
call main
invoke ExitProcess, 0
END mainCRTStartup
The result: it may reach the start of the block, but you cannot detect it with a simple exception handler. The "myexc" label is never reached when you try this program.
With the CDB debugger, it can be seen however:
Quote
image00400000+0x10d0:
004010d0 e82bffffff call image00400000+0x1000 (00401000)
0:000>g
VirtualQuery(12FFA0)=1C
BaseAddress=12F000
AllocationBase=30000
RegionSize=1000
VirtualAlloc(30000, FF000)=30000
(1fc.120): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=00000021 ebx=7ffdc000 ecx=77c118bf edx=77c31b78 esi=0012ff94 edi=00000000
eip=00401096 esp=00030000 ebp=0012ffbc iopl=0 nv up ei pl nz na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010206
image00400000+0x1096:
00401096 6a00 push 0
0:000>
It crashes at ESP=30000h.
Quote from: jj2007 on December 17, 2010, 01:44:28 AM
My theory is that there are guard pages above 33000h; they are committed one by one until esp drops below 33000h, at which point the exception handler says stop. The question is, I repeat, whether the 33000h changes for randomising Windows versions.
The last page of the stack is set as a guard page. When it's accessed, the exception handler just commits the next sequential one (or reserves another and commits it), and sets that as a guard page. If it can't reserve it, then it tosses the exception. In this case, those three pages already exist, so you get the exception. I don't see how not being able to allocate pages that you already allocated is undocumented...
But what's important is that it's still YOUR MEMORY. You can set ESP to those addresses and read from it, just like it was the stack. In fact, you can even get rid of that memory using VirtualFree, and let your stack keep growing until it hits the next range at 20000. If your 1MB stack starts at 12E000, and you reserve a page at 2D000, then your new "hard limit" is there, instead.
-r
Quote from: redskull on December 17, 2010, 02:41:30 PM
... sets that as a guard page. If it can't reserve it, then it tosses the exception.
Yes indeed. And I doubt that you can simply free the 'OS crap', as you called it earlier. What's worse is that I tested it today on another PC, XP with Prescott P4, and instead of at 33000h, the crash happens at 43000h...
Still working on a fool-proof solution...
I tried executing a
invoke VirtualFree,00030000h, 0, MEM_RELEASE
with no problem, and it vanished off the memory list, same as any other page. But what's *really* strange is that now it seems to have permanently vanished! New processes don't seem to have it allocated anymore. :dazzled:
-r
Ah, found it, it came back at 130000
This kind of behaviour does not have to be documented.
In fact with current address randomization techniques and having PE's with relocations the OS only needs to know the size of your executables' image, stack and heap requirements.
However the OS is free to place your application at whatever address it likes and also free to place your stack at whatever address it likes.
In consequence you should NOT base your application on assumptions about undocumented OS features like the address where the stack is located and where is the lowest address in the stack memory range.
Quote from: BogdanOntanu on December 17, 2010, 06:14:37 PM
In consequence you should NOT base your application on assumptions about undocumented OS features like the address where the stack is located and where is the lowest address in the stack memory range.
Dear Bogdan,
Since I want to avoid that this thread is moved to The Colosseum, let me just kindly express my opinion that knowing when my app should stop the recursion is not a luxury but should rather be a basic feature of an OS.
:thumbu
I think it's well documented (or at least well known) that you are only guaranteed a maximum stack size equal to that specified in the link switch (which defaults to 1MB). Anything above that is just an extra luxury, not a guarantee.
That is pretty much the case, you get what you ask for if that much stack space is available and the OS has enough room to load the application into memory.
Jochen,
Quote
Still working on a fool-proof solution...
I already suggested a solution, when I suggested subtracting the stack allocation size from the value taken from FS:[4].
But you did not trust me :P
Well, I have made a small test app; have a look at it. It is simple and straightforward. I use a macro to determine whether we can make further recursive calls.
The program is multi-threaded and will run the number of threads you ask for.
For each thread you enter the stack size. Depending on the size entered, you will see that threads with different stack sizes reach different numbers of recursion levels, and none of them crashes: they just return when they "see" that this place "is too deep".
Here are my results; the things in bold are what I entered:
Quote
Hi, this is testing program
Enter a number of threads to test: 5
Enter a stack size for the thread #1: 200000
Enter a stack size for the thread #2: 400000
Enter a stack size for the thread #3: 600000
Enter a stack size for the thread #4: 800000
Enter a stack size for the thread #5: 1000000
Hi, this is thread #2, starting recursive function...
The thread #2 do 91 recursive calls (limited to the stack size only)
Hi, this is thread #4, starting recursive function...
The thread #4 do 191 recursive calls (limited to the stack size only)
Hi, this is thread #1, starting recursive function...
The thread #1 do 41 recursive calls (limited to the stack size only)
Hi, this is thread #3, starting recursive function...
The thread #3 do 141 recursive calls (limited to the stack size only)
Hi, this is thread #5, starting recursive function...
The thread #5 do 240 recursive calls (limited to the stack size only)
All threads are finished, find and press [Any] key to exit...
The program supports multi-core/multi-CPU systems, of course. The results may be printed out of thread-creation order; this is just the scheduler "playing": the threads run simultaneously, and it is not known which will return first or next.
Each recursive function has a fat local buffer, for real-life testing.
So it would be nice to test this program on different systems, especially on x64. On 32-bit systems it should work properly anyway, since only documented things were used in the program. If it does not work properly (crashes), then the bug is in the *app*, not in the *idea* :bg
On x64 systems it should also work properly, at least as long as 32-bit programs exist. Otherwise many system tools would not work.
Program and sources are in the attached archive.
P.S. The return values of the CRT input functions are not checked; if you enter bad numbers (0 for example), or no numbers at all, you will be the initiator of the app's crash :P
Not bad, Alex, and it seems to work, but in real life the app needs to know how much stack has been assigned. Attached what I consider "foolproof" - it crashes only if you give it one more push than allowed.
The archive contains the *.obj, so that you can link it with different stack sizes.
Quote
004B3000 is the lowest address
009B0000 initial top of the stack
009AFFC4 current top of the stack
004B3024
004B3020
004B301C
004B3018
004B3014
004B3010
004B300C
004B3008
004B3004
004B3000
OK, no overflow
Hit a key to see it crash:
If you see this, the proggie did not work as it should
(red part visible in Olly)
I was poking around a little and found some articles that may be of interest...
http://www.masm32.com/board/index.php?topic=1538.0
http://www.asmcommunity.net/book/The_Stack
http://en.wikipedia.org/wiki/Win32_Thread_Information_Block
Dave,
The first 8 lines after start: tell a more complete story :wink
All three elements necessary to know the exact limit LowStack are somewhat unofficial, especially the three guard pages (3000h); but it works perfectly, at least on Win XP. Grateful for a test with Win2000, Vista and Win 7, though.
Quote from: jj2007 on December 18, 2010, 04:40:18 PM
Not bad, Alex, and it seems to work, but in real life the app needs to know how much stack has been assigned. Attached what I consider "foolproof" - it crashes only if you give it one more push than allowed.
Jochen, it does not *seem to work*, it *works*. And this is a real-life app: look at the big local buffer.
The method I suggested is the fastest compatible way of calculating the far end of the stack. The one thing the macro requires is that you know the allocation size of the stack. If you don't specify any size, the default value (1 MB) is used.
If you want to know *the size* of the stack (i.e. when you do not know it), or if you want to know the lowest address without having a size, then Japheth's method is a good way to go.
Alex,
If you had bothered to look into my code, you would have discovered that it calculates the allocated stack size, instead of asking the user.
Quote from: jj2007 on December 19, 2010, 12:03:24 AM
Alex,
If you had bothered to look into my code, you would have discovered that it calculates the allocated stack size, instead of asking the user.
No, if you had bothered to look into my code, you would have seen that I ask for the stack size of *newly* created threads. The asking is needed only to prove to you that the method works :P
The main point is: I let you specify the stack size that will be allocated for a new thread, and I let you do this in a simple and fast way for multiple threads, instead of changing the linker's switch.
I tried to make it a convenience, but you did not understand this :(
P.S. SizeOfStackReserve applies only to the main thread of the program, or to newly created threads with zero specified as the stack size.
But usually threads are created with a specified stack size that is smaller than 1 MB, because some "handy" background threads without a GUI etc. can work fine with a small stack.
The code which Japheth posted is perfect for getting the stack size of the current thread at runtime, even if you do not know the thread's stack size (for example if you determine it in code which can be called from any thread with any stack size).
By the way, I usually use VirtualQuery in SEH to get the name of the module which crashed: get AllocationBase, and pass it as the handle to GetModuleFileName.
Alex,
If I double the stack size with an add ecx, ecx as below, the thread still performs the # of iterations that corresponds to the stack size entered by the user. Which implies that you pass the stack size twice. Well hidden :U
Quote
add ecx, ecx ; requested stack size doubled
invoke CreateThread,0,ecx,offset TheThread,edx,0,esp
Your macro relies on the second para:
AxGetStackBottom MACRO thereg:REQ, theallocationsize
ASSUME fs:NOTHING
mov thereg, fs:[4]
ifdif <theallocationsize>,<>
sub thereg,theallocationsize
add thereg,1024*32
else
sub thereg,(1024*1024-(1024*32))
endif
ASSUME fs:ERROR
EXITM<thereg>
ENDM
In contrast, my code gets the reserved size independently of linker settings and/or extra paras passed by the user.
Quote from: jj2007 on December 19, 2010, 12:42:37 AM
Alex,
If I double the stack size with an add ecx, ecx as below, the thread still performs the # of iterations that corresponds to the stack size entered by the user. Which implies that you pass the stack size twice. Well hidden :U
Quote
add ecx, ecx ; requested stack size doubled
invoke CreateThread,0,ecx,offset TheThread,edx,0,esp
Your macro relies on the second para:
AxGetStackBottom MACRO thereg:REQ, theallocationsize
ASSUME fs:NOTHING
mov thereg, fs:[4]
ifdif <theallocationsize>,<>
sub thereg,theallocationsize
add thereg,1024*32
else
sub thereg,(1024*1024-(1024*32))
endif
ASSUME fs:ERROR
EXITM<thereg>
ENDM
In contrast, my code gets the reserved size independently of linker settings and/or extra paras passed by the user.
I'm not hide that. I just load ECX from the place, *where* is located size of stack which would be used in the RecursiveFunction. I just construct dynamical structure on the stack.
Your method would not work for threads which created with different size of the stack. If you did not believe, then try to determine stack size with your method, and use it in the thread created as that:
invoke CreateThread,0,262144,offset TheThread,0,0,offset dwThrId
In the "TheThread" proc, run the recursive function.
You will quickly find that the stack is 4 times smaller than what your method "calculated".
The only compatible way to get the stack size of any thread at runtime is VirtualQuery.
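A sketch of such a test, assuming a console build with the default linker settings. The thread measures its own stack with VirtualQuery; the names hThread and dwThrId are placeholders. What it prints depends on the OS and the linker's /STACK setting, so no expected value is given here.

```asm
include \masm32\include\masm32rt.inc

.data?
hThread dd ?
dwThrId dd ?

.code
TheThread proc param:DWORD
    local mbi:MEMORY_BASIC_INFORMATION
    ; mbi lives on this thread's stack, so querying its own address
    ; returns the allocation that holds the stack
    invoke VirtualQuery, addr mbi, addr mbi, sizeof mbi
    assume fs:nothing
    mov ecx, fs:[4]                 ; top of this thread's stack
    assume fs:error
    sub ecx, mbi.AllocationBase     ; top minus lowest address = stack size
    print str$(ecx), " bytes of stack in the thread", 13, 10
    xor eax, eax                    ; thread exit code
    ret
TheThread endp

start:
    invoke CreateThread, 0, 262144, offset TheThread, 0, 0, offset dwThrId
    mov hThread, eax
    invoke WaitForSingleObject, eax, INFINITE
    invoke CloseHandle, hThread
    inkey "Press any key to exit..."
    exit
end start
```

Changing the second CreateThread argument and comparing the printed values is a quick way to see how the requested size maps to the real one on a given system.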
Quote from: Antariy on December 19, 2010, 12:55:29 AM
Your method would not work for threads which created with different size of the stack.
Yes, that's correct.
Quote
Only one compatible way to get stack size of anything thread at runtime, is VirtualQuery.
How would you do that?
typedef struct _MEMORY_BASIC_INFORMATION {
PVOID BaseAddress;
PVOID AllocationBase;
DWORD AllocationProtect;
SIZE_T RegionSize;
DWORD State;
DWORD Protect;
DWORD Type;
} MEMORY_BASIC_INFORMATION, *PMEMORY_BASIC_INFORMATION;
Quote from: jj2007 on December 19, 2010, 12:42:37 AM
Alex,
If I double the stack size with an add ecx, ecx as below, the thread still performs the # of iterations that corresponds to the stack size entered by the user. Which implies that you pass the stack size twice. Well hidden :U
1. The code has comments at that part. So, if you did not try to read my Cyrillic English...
The stack size is stored in an object constructed dynamically on the stack. If you want to specify a wrong size, twice as big, you must not only increase ECX but also "inform" the constructed object (structure).
Now:
mov ecx,[esp] ; thread stack size a second parameter of the structure
Your "trick" would have to be:
mov ecx,[esp]
add ecx,ecx
mov [esp],ecx
2. Now I understand that I made the right decision NOT to post the code of AxMsgTableViewer.
In AxMsgTableViewer I wrote code in the same style... So it is right that I did not post it; burning at the stake has not yet disappeared from the world.
If you want, I'll post one part of AxMsgTableViewer: the nice-buttons code, which draws an icon on the button. The icon interacts with the button: it sinks when you press the button.
I'll post this part only if you ask, and only to show that I did not try to hide anything in the TestStack program. This is just a style.
3. Japhet's code with a small change.
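local mbi:MEMORY_BASIC_INFORMATION
invoke VirtualQuery, addr mbi, addr mbi, sizeof MEMORY_BASIC_INFORMATION ; query the region that holds mbi itself, i.e. the current stack
mov eax,mbi.AllocationBase ; EAX = the hard (lowest) end of the stack; the "soft" end is ~12 KB higher
assume fs:nothing ; needed before FS can be used as a segment override
mov ecx,fs:[4] ; get the top of the stack
sub ecx,eax ; ECX = the size of the stack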
Only this code works even on Win95 and in any environment: when your code runs in a thread that you did not create, when your code runs in a DLL called by unknown code, etc.
Quote from: jj2007 on December 19, 2010, 01:17:27 AM
How would you do that?
The post above took me a long time to write, and you posted before I had submitted it.
Here is code which prints the size of the stack, regardless of where it is called from and in which thread.
include \masm32\include\masm32rt.inc
.code
start proc
local mbi:MEMORY_BASIC_INFORMATION
invoke VirtualQuery, addr mbi, addr mbi, sizeof MEMORY_BASIC_INFORMATION
mov eax,mbi.AllocationBase ; EAX = the hard (lowest) end of the stack; the "soft" end is ~12 KB higher
assume fs:nothing
mov ecx,fs:[4] ; get the top of the stack
sub ecx,eax ; ECX = the size of the stack
invoke crt_printf,CTXT("The size of the stack: %u",10,10),ecx
invoke crt_printf,CTXT("Find and press [Any] key to exit...")
invoke crt__getch
ret
start endp
end start
OK, great :U
So for threads, VQ is the only way to get the reserved size. For the ordinary exe, the GetModuleHandle plus IMAGE_NT_HEADERS will do the job, it is slightly shorter.
:thumbu
Quote from: jj2007 on December 19, 2010, 01:38:21 AM
OK, great :U
So for threads, VQ is the only way to get the reserved size. For the ordinary exe, the GetModuleHandle plus IMAGE_NT_HEADERS will do the job, it is slightly shorter.
:thumbu
Yes, the info from the PE header applies only to the main thread of the program (i.e. the thread which starts at the program's entry point, usually the "start:" label).
For any new thread, either the value specified by the user is used, or the stack size of the main thread (if zero is specified).
But in reusable general-purpose code it is not safe to rely on any assumptions; it is better to just calculate it at runtime. :thumbu
Now I suggest a great banquet to celebrate settling these debates :bg
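For comparison, a sketch of the PE header route mentioned above (GetModuleHandle plus IMAGE_NT_HEADERS). It is only my reading of that suggestion, not jj2007's actual code. It reads SizeOfStackReserve from the exe's optional header, which describes the main thread only.

```asm
include \masm32\include\masm32rt.inc

.code
start:
    invoke GetModuleHandle, 0                   ; eax = image base of the exe
    mov edx, [eax].IMAGE_DOS_HEADER.e_lfanew    ; offset of the PE header
    add edx, eax                                ; edx -> IMAGE_NT_HEADERS
    mov ecx, [edx].IMAGE_NT_HEADERS.OptionalHeader.SizeOfStackReserve
    print str$(ecx), " bytes reserved according to the PE header", 13, 10
    inkey "Press any key to exit..."
    exit
end start
```

The value printed is whatever the linker wrote via /STACK, so it says nothing about threads created later with their own size.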
Who is voting for?
Quote from: Antariy on December 19, 2010, 01:46:56 AM
Now I'm suggest to make a great banquet for a case of solving these debates :bg
Who is voting for?
:U