a question about globals

joemc · January 22, 2010, 10:05:21 PM

In C/C++, many people assume that global variables are a bad idea. I believe the main reason is that globals do not hide data from procedures and functions, but some have claimed they can cause speed issues as well. I personally do not know how most compilers store data. My question is: for structures I am creating, should I make them all on the stack and pass pointers to them, or should I use some form of ALLOC and pass a pointer? Is this what most compilers are actually doing? It does seem like a fairly long way to go to the data section of the file and back, but I assume it does not cost much in performance. I have been trying to read up on how the stack is actually implemented and on different ways of handling things. Almost everything I find is an opinion. Some opinions are worth more than others, and I would like to hear what the opinions on this forum are.

I am also searching this forum. If there is a noteworthy thread or any external links on the subject, that would be exceptional. It seems everyone does their own thing. I have tried disassembling some code that my compiler has output, and it is far more difficult to understand than MASM that uses macros.

donkey · January 22, 2010, 10:24:19 PM

Hi joemc,

Yes, data on the stack is generally faster to access than global data, partly because the stack is more likely to be cached than any particular global memory location. Another good reason to avoid global data is that it is not thread safe: if you are writing a multi-threaded program, each thread should use local storage for its memory so that one thread does not overwrite another's data. There are times when it is nearly impossible to avoid globals, however, and in reality the speed difference of a few cache misses is virtually negligible on modern processors, so I don't worry too much about them in normal single-threaded applications. Another place where you might avoid global memory is in reusable code blocks. If you write a piece of code that you intend to reuse in other projects, you should avoid global memory, since it can be a nightmare coercing more than a few of them to work together. So which is better? Well, that depends on the application and your requirements. A compiler has no idea what you intend to use the code for, so it is best for it to go with the fastest solution even if that complicates the code a bit. As an assembly language programmer, you have the choice of simplicity, size and speed, and can make the decision based on your own criteria.
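
For anyone who thinks in C first, here is a minimal sketch of the two options being compared. The names (Point, g_cursor, move_global, move_point, sum_after_move) are made up for illustration and do not come from the question; it only shows who can touch the data in each case, not any speed difference.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structure, used only for illustration. */
typedef struct {
    int x;
    int y;
} Point;

/* Option 1: a global. Every function in the program can read and
   write it, and all threads would share this single copy. */
Point g_cursor = { 0, 0 };

void move_global(int dx, int dy)
{
    g_cursor.x += dx;
    g_cursor.y += dy;
}

/* Option 2: the caller owns the structure and passes its address.
   The function touches only what it is handed. */
void move_point(Point *p, int dx, int dy)
{
    assert(p != NULL);
    p->x += dx;
    p->y += dy;
}

/* The structure lives in this function's stack frame and is gone
   once the function returns. */
int sum_after_move(int dx, int dy)
{
    Point local = { 1, 2 };
    move_point(&local, dx, dy);
    return local.x + local.y;
}
```

move_point works on any Point it is given, so two callers (or two threads) with their own structures cannot step on each other, whereas every caller of move_global changes the same g_cursor.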

Slugsnack · January 22, 2010, 10:57:24 PM

A part of good software design is the general rule to make all variables as local/private as possible. That is, to minimise the scope of every variable/structure. The aim is to achieve low coupling and high cohesion. Think in terms of readability and maintainability and it is clear that if everything is as local as possible you will be able to read and understand the code better. Not only this, you can also be surer that changing one part of your program will not break another part.

As donkey says, there are indeed times it is not practical to completely avoid global variables but you should use the above generalisation as far as is sensible.

joemc · January 22, 2010, 11:10:26 PM

It's interesting that the first two responses did give me the answer I thought I would get; it just seems that none of the ASM tutorials show any normal way of doing this. I suppose in MASM with LOCAL it should not be too difficult to just push the address, or use the address in invoke.

BogdanOntanu · January 22, 2010, 11:22:47 PM

Quote from: joemc on January 22, 2010, 11:10:26 PM
It's interesting that the first two responses did give me the answer I thought I would get; it just seems that none of the ASM tutorials show any normal way of doing this. I suppose in MASM with LOCAL it should not be too difficult to just push the address, or use the address in invoke.

Inside a PROC when writing INVOKE statements you can simply use the ADDR keyword in front of any local variable name in order to obtain the address of it and not the value of it.

Beware that ADDR will generate a LEA eax, [local_variable] / push eax .... and this will make the EAX register dirty (silently). This can be a problem IF you also use EAX as one of the parameters for the INVOKE.

redskull · January 22, 2010, 11:26:51 PM

Also, it should be noted that many of the acceptable "encapsulated" variables in high-level languages are really just globals in disguise. A static variable or a private class member, which are professor-approved and have none of the global-variable stigma, are just globals whose scope and access are tightly controlled by the compiler. Lacking these fancy abstractions, MASM just calls them what they are. But if it's preserved across function calls, it's a global, by any other name.

-r
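
A rough C illustration of that point, assuming two hypothetical ID generators (g_next_id, next_id_global and next_id_static are invented names). Both counters exist for the whole run of the program; the only difference is which code is allowed to name them.

```c
#include <assert.h>
#include <limits.h>

/* A plain global: visible to every function in this file and,
   without `static`, to other files as well. */
int g_next_id = 1;

int next_id_global(void)
{
    assert(g_next_id < INT_MAX);   /* sketch only: no overflow handling */
    return g_next_id++;
}

/* A static local: same lifetime as the global (it keeps its value
   between calls), but only this function can refer to it. */
int next_id_static(void)
{
    static int next_id = 1;        /* initialised once, not on every call */
    assert(next_id < INT_MAX);
    return next_id++;
}
```

Any other function can assign to g_next_id and change what next_id_global returns; nothing outside next_id_static can do the same to next_id, even though both are stored the same way.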

Slugsnack · January 22, 2010, 11:33:46 PM

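redskull makes a very relevant point there. The main problem is that in MASM, local variables cannot be declared as initialised. In a high-level language it only appears that they can: the compiler either emits code that stores the initial value each time the function is entered, or, for a static variable, places the initialised data in the data section at compile time.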
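
A small C sketch of how the two kinds of "initialised" variable behave at run time. The function count_calls and its variables are hypothetical. The automatic variable gets its value again on every call, while the static one is set up once and keeps whatever was last written to it.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the number of calls made so far and writes the final value
   of the automatic variable to *scratch_out. */
int count_calls(int *scratch_out)
{
    int scratch = 10;          /* automatic: set to 10 on every call */
    static int calls = 0;      /* static: set to 0 once, before first use */

    assert(scratch_out != NULL);

    scratch += 5;              /* this change is lost on return */
    calls += 1;                /* this change persists across calls */

    *scratch_out = scratch;
    return calls;
}
```

Calling it twice gives a call count of 1 and then 2, while the value written through scratch_out is 15 both times.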

hutch-- · January 23, 2010, 01:03:41 AM

This has been a good discussion and very informative to a new member. With the current PE file format and, from memory, the coming PE32+, there is only one place for data in the file format, and that is the initialised data section. Uninitialised storage, apart from dynamic memory allocation, is done either in the uninitialised data section or constructed on the stack as the procedure is called.

Within this framework you can construct things like STATIC variables inside a procedure, which are effectively GLOBAL values that can only be accessed from within that procedure, and in MASM you can do that with a MACRO if you need to.

The choice of LOCAL or GLOBAL is best made on the basis of required scope. If a variable is only ever required within a procedure and does not need to be persistent, a LOCAL is the best choice for the reasons mentioned above. If you need a variable that can be accessed across multiple procedures, which may be callback-style procedures, a GLOBAL value is a good choice, as it saves very messy and complicated archipelagos of stack arguments.

Now one place where globals are not a viable choice is re-entrant thread design, and this would be a common requirement for software that handles multiple web connections on a high-count basis. The type of design that best fits that requirement is a structure that is allocated as the thread is called, with further procedures nested within that thread having the structure address passed to them by the thread initiator procedure.
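
The shape of that design can be sketched in C roughly as follows. This is only an assumption-laden outline: ConnContext, handle_chunk and serve_connection are invented names, the fields are placeholders, and no real threads are created. serve_connection merely stands in for a thread entry procedure so the example stays portable.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-connection state; the fields are invented. */
typedef struct {
    int connection_id;
    unsigned long bytes_handled;
} ConnContext;

/* Nested helper: works only on the context it is given. */
static void handle_chunk(ConnContext *ctx, unsigned long chunk_size)
{
    assert(ctx != NULL);
    ctx->bytes_handled += chunk_size;
}

/* Stand-in for a thread entry procedure. It allocates its own context,
   passes the address down to the helper, and frees it before returning.
   Returns the total bytes handled, or 0 if the allocation fails. */
unsigned long serve_connection(int connection_id,
                               const unsigned long *chunks, size_t count)
{
    ConnContext *ctx = malloc(sizeof *ctx);
    unsigned long total;
    size_t i;

    if (ctx == NULL)
        return 0;

    ctx->connection_id = connection_id;
    ctx->bytes_handled = 0;

    for (i = 0; i < count; i++)
        handle_chunk(ctx, chunks[i]);

    total = ctx->bytes_handled;
    free(ctx);
    return total;
}
```

Because every call builds and tears down its own ConnContext, any number of these could run at once without sharing state, which is what a global would prevent.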

jj2007 · January 23, 2010, 03:12:01 AM

Quote from: BogdanOntanu on January 22, 2010, 11:22:47 PM

Inside a PROC when writing INVOKE statements you can simply use the ADDR keyword in front of any local variable name in order to obtain the address of it and not the value of it.

Beware that ADDR will generate a LEA eax, [local_variable] / push eax .... and this will make the EAX register dirty (silently). This can be a problem IF you also use EAX as one of the parameters for the INVOKE.

Well, not that silently, actually. Both Masm (ml.exe) and JWasm throw an error if you attempt that. And you can use eax, if it is used after the last addr, see MyTest2 below.


include \masm32\include\masm32rt.inc

MyTest1	PROTO :DWORD, :DWORD
MyTest2	PROTO :DWORD, :DWORD
MyTest3	PROTO :DWORD, :DWORD

.code
start:
	invoke MyTest1, chr$("My first title"), chr$("My first message")	; the chr$ macro
	invoke MyTest2, chr$("My second title"), chr$("My second message")	; generates strings
	invoke MyTest3, chr$("My third title"), chr$("My third message")	; in the .data section
	exit

MyTest1 proc pTitle, pMsg
LOCAL LocTitle[20]:BYTE, LocMsg[40]:BYTE
  invoke lstrcpy, addr LocTitle, pTitle	; we copy the strings
  invoke lstrcpy, addr LocMsg, pMsg	; to a local buffer

  invoke MessageBox, 0, addr LocMsg, addr LocTitle, MB_OK

  ret
MyTest1 endp

MyTest2 proc pTitle, pMsg
LOCAL LocTitle[20]:BYTE, LocMsg[40]:BYTE
  invoke lstrcpy, addr LocTitle, pTitle
  invoke lstrcpy, addr LocMsg, pMsg

  lea eax, LocTitle		; we use eax in an invoke, and it works fine!
  invoke MessageBox, 0, addr LocMsg, eax, MB_OK

  ret
MyTest2 endp

MyTest3 proc pTitle, pMsg
LOCAL LocTitle[20]:BYTE, LocMsg[40]:BYTE
  invoke lstrcpy, addr LocTitle, pTitle
  invoke lstrcpy, addr LocMsg, pMsg

BadVersion = 0	; put 1 to see "error A2133:register value overwritten by INVOKE"
  if BadVersion
	lea eax, LocMsg		; eax sits left of an addr argument, so invoke's own lea eax would overwrite it and the assembler chokes
	invoke MessageBox, 0, eax, addr LocTitle, MB_OK
  else
	lea edx, LocMsg		; so we have to revert to another register
	invoke MessageBox, 0, edx, addr LocTitle, MB_OK
  endif

  ret
MyTest3 endp

end start

JWasm throws "Error A2188: Register value overwritten by INVOKE" but otherwise the behaviour is identical.

hutch-- · January 23, 2010, 08:01:29 AM

MASM's invoke and its use of EAX is a known issue that is very easy to get around: if you have multiple values that convert to the EAX register, copy them to a local variable or another register. invoke is generally used in high-level emulation, so the transfer time to a local is no big deal, especially if you design the rest of the code to work with it. If you are using a MASM-style function notation, just write the return value to a variable.

  mov myVar, MyFunc(args etc ...)
  invoke Afunction,myVar
