Isz tutorial: pop goes the variable

Started by thomas_remkus, June 10, 2005, 01:55:26 PM

thomas_remkus

I was working on understanding, and not just memorizing, the first Iczelion Windows tutorial and something really stuck out. It was this:

    push  hInstance
    pop   wc.hInstance

I would have thought you would simply write "mov wc.hInstance, hInstance", as this seems most logical. From chatting on IRC about this I learned that I cannot move something from one memory location directly to another. Everything must go through either the stack or a register first. In the IRC channel it was stated that pushing my value to the stack was much slower than using a register, but that with a register I would "dirty" it. I think I understand more about that now, but it's not easy wrapping my head around the "why" of this.

My other thought is this: because either pushing and popping or mov-ing something through eax is so common, there must be a very common macro out there that does this. Am I right?

thomas

MichaelW

The problem with "dirtying" a register occurs in code where you are already using all available registers. To use a register to perform the move operation under these conditions, you must preserve the current register value, perform the move, and restore the register value. This would require 4 move instructions or a push, two move instructions, and a pop. Using the stack to perform the move requires only two instructions (although the 4-move instruction form might still be somewhat faster). The MASM32 macros include a m2m pseudo mnemonic that performs a memory-to-memory move using the stack.
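The options above can be sketched side by side. This is only an illustration under stated assumptions: MASM32 is installed in \masm32, the data names (src, dstA, dstB, dstC) are made up, and copy_via_stack is a hypothetical macro written for this example in the spirit of m2m, not a copy of the MASM32 source.

```asm
; Sketch only: assumes MASM32 is installed in \masm32 on the current drive.
.386
.model flat, stdcall
option casemap:none

ExitProcess PROTO :DWORD
includelib \masm32\lib\kernel32.lib

; Hypothetical macro for illustration: copies one dword from memory
; to memory through the stack, leaving every register untouched.
copy_via_stack MACRO memDst, memSrc
    push memSrc
    pop  memDst
ENDM

.data
src   dd 12345678h
dstA  dd 0
dstB  dd 0
dstC  dd 0

.code
start:
    ; 1) through a free register: two instructions, eax is overwritten
    mov  eax, src
    mov  dstA, eax

    ; 2) through the stack: two instructions, no register is changed
    copy_via_stack dstB, src

    ; 3) through a register that is already in use:
    ;    save it, do the copy, restore it (push, two movs, pop)
    push ebx
    mov  ebx, src
    mov  dstC, ebx
    pop  ebx

    invoke ExitProcess, 0
end start
```

Which form is faster depends on the processor and the surrounding code, so it is worth timing in the actual algorithm before settling on one.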

eschew obfuscation

dsouza123

If you aren't using the x87 FPU, or are using only some of its registers, you could use the following:

FILD memfrom
FISTP memto

It is almost certainly slower, but it doesn't use any of the general-purpose CPU registers or the stack.

If the value is two dwords (64 bits) instead of one, it will still work using just the two instructions.
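A minimal sketch of the 64-bit case, assuming MASM32 is installed in \masm32 and using made-up names (src64, dst64). It also assumes the FPU has at least one free stack register when the copy runs, which holds at program start but not necessarily in the middle of other FPU or MMX code.

```asm
; Sketch only: assumes MASM32 is installed in \masm32 on the current drive.
.386
.model flat, stdcall
option casemap:none

ExitProcess PROTO :DWORD
includelib \masm32\lib\kernel32.lib

.data
src64  dd 55667788h, 11223344h   ; 64-bit value 1122334455667788h, low dword first
dst64  dd 0, 0

.code
start:
    fild  qword ptr src64        ; load the 64-bit integer into ST(0)
    fistp qword ptr dst64        ; store it back as a 64-bit integer and pop ST(0)

    invoke ExitProcess, 0
end start
```

The x87 extended format has a 64-bit mantissa, so a 64-bit integer passes through the FPU register without losing bits.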

hutch--

Thomas,

It will come with practice, but with algo design you tend to set aside a register or two if possible so that you can handle transient values within the algo and reuse the register in a number of places. In truly processor-intensive code, you try to avoid push/pop as it is slower than mov to and from a register. API calls, on the other hand, are so slow in comparison to the inner workings of a fast algo that push/pop around them simply does not matter in speed terms.
Download site for MASM32: https://masm32.com
New MASM Forum: https://masm32.com/board/index.php

AeroASM

Quote from: thomas_remkus on June 10, 2005, 01:55:26 PM
My other thought is this: because either pushing and popping or mov-ing something through eax is so common, there must be a very common macro out there that does this. Am I right?

No, there is no need. All a macro would do is allow you to type something like:

ld MyVar

instead of

mov eax,MyVar

The macro makes the code less clear, does not save much typing and makes no difference to the final code output.

Mark_Larson

You can also do it through MMX.

    movd  mm0,hInstance
    movd  wc.hInstance,mm0

Movd does a DWORD move.
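As a complete sketch, assuming MASM32 is installed in \masm32, an assembler version that accepts the .mmx directive, and made-up names (src, dst). The emms at the end is there because the MMX registers share storage with the x87 register stack, so any FPU code that runs afterwards expects it.

```asm
; Sketch only: assumes MASM32 is installed in \masm32 on the current drive.
.586
.mmx
.model flat, stdcall
option casemap:none

ExitProcess PROTO :DWORD
includelib \masm32\lib\kernel32.lib

.data
src  dd 12345678h
dst  dd 0

.code
start:
    movd  mm0, src       ; load 32 bits from memory into the low half of mm0
    movd  dst, mm0       ; store the low 32 bits of mm0 to memory
    emms                 ; mark the x87 registers empty again for later FPU code

    invoke ExitProcess, 0
end start
```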
BIOS programmers do it fastest, hehe.  ;)

My Optimization webpage
http://www.website.masmforum.com/mark/index.htm

thomas_remkus

MichaelW, you said "... would require 4 move instructions or a push, two move instructions, and a pop. Using the stack to ... requires only two instructions (although the 4-move instruction form might still be somewhat faster).". This, to me, means that while moving the data directly through eax might be faster, it is preferred to use the stack because of ease of maintenance and to avoid possible unknown issues. This, to me, also says that if I need to squeeze that important cycle out of what I am doing then the register method might be faster ... but I need to be careful. I understand, and think that I'll use the stack for now until I understand more about the registers.

hutch, you said "...  truly processor intensive code, you try to avoid push/pop as it is slower than mov ...". Of course, one of my ultimate goals is to take an application that is very critical to my business and rewrite it entirely in MASM32. I took some PowerBuilder code that used to take 7 hours to 3 days to run and rewrote it in C/C++ (using ADO connections to SQL) and it is now running on very expensive Dell hardware and takes just 180 seconds on average. I am positive MASM will offer me the chance to understand the strange-nesses of how this works, and I will be able to, in the short term, develop my C/C++ code better and, in the long term, develop a new version of crushing speed and performance. Currently the image is just a few hundred K.

AeroASM, you said " ... macro makes the code less clear ...". Hey, this is what a programmer I was chatting with on IRC last evening talked about. He stated that because of the ".if" and other macro additions to the dialect you lost needed opcodes and practice with "jse" and such. This was so over my head that I had to look it up. He's sort of right because the ".if" has no indication of negation. But personally, I love the "invoke" and checking for me. MASM is just heavenly.

Mark_Larson, you said "... also do it through MMX ...". But not all machines come with MMX, so you need to detect it first, right? And then optionally using the MMX registers in your code would just slow it down with the check. I presume you would need to compile specifically for MMX.

Thanks all! I am trying to work out why I still can't use something like "mov eax, offset hInstance" or maybe even "mov eax, offset offset hInstance". A long time ago I needed to understand pointers and not just memorize them. It's the same now. I don't want to use something unless I feel that I understand it. Can you offer some more patience and help me understand what "offset" and "addr" do, and how they relate to just putting the value of one variable into another variable?

I know that this is more like four separate responses, but as all this help is sort of specific to me ... I did not think it appropriate to open 3 new topics.

thanks so much,
thomas

Mark Jones

Hey Thomas. Look in the ASMINTRO.HLP file, it explains why and when to use ADDR, OFFSET, and DWORD/WORD/BYTE PTR. :)
"To deny our impulses... foolish; to revel in them, chaos." MCJ 2003.08

thomas_remkus

Mark Jones .... dude !!! Ya know, I didn't even know that stuff was there. There are HLP files with this? Well, I was just looking all over the place and running wild. I'll have to get into these right now. Lucky for me, my work's all complete for the day !!!!

dsouza123

hInstance  dd  12345678h   ; suppose the address of hInstance is 403000h

mov eax, hInstance ; copies the contents of the 4 bytes of memory starting at address 403000h, which is 12345678h
                   ; the value consists of the bytes at 403000h 403001h 403002h 403003h
                   ; because x86 stores the low byte first, those bytes are 78h 56h 34h 12h
                   ; the variable's content is shown as 12345678h because that is the dword (dd) value of hInstance and
                   ; the convention for writing numbers is high digit to low digit, from left to right

lea eax, hInstance ; copies the address of hInstance, which is 403000h, into eax

AeroASM

Quote from: thomas_remkus on June 10, 2005, 06:49:06 PM
MichaelW, you said "... would require 4 move instructions or a push, two move instructions, and a pop. Using the stack to ... requires only two instructions (although the 4-move instruction form might still be somewhat faster).". This, to me, means that while moving the data directly through eax might be faster, it is preferred to use the stack because of ease of maintenance and to avoid possible unknown issues. This, to me, also says that if I need to squeeze that important cycle out of what I am doing then the register method might be faster ... but I need to be careful. I understand, and think that I'll use the stack for now until I understand more about the registers.

You don't need to be so cautious; just make sure that you are not already using a register when you use it to copy memory.

Quote from: thomas_remkus on June 10, 2005, 06:49:06 PM
hutch, you said "...  truly processor intensive code, you try to avoid push/pop as it is slower than mov ...". Of course, one of my ultimate goals is to take an application that is very critical to my business and rewrite it entirely in MASM32. I took some PowerBuilder code that used to take 7 hours to 3 days to run and rewrote it in C/C++ (using ADO connections to SQL) and it is now running on very expensive Dell hardware and takes just 180 seconds on average. I am positive MASM will offer me the chance to understand the strange-nesses of how this works, and I will be able to, in the short term, develop my C/C++ code better and, in the long term, develop a new version of crushing speed and performance. Currently the image is just a few hundred K.

If you wrote it in MASM, the image would be less than 50k and would run 2-5 times faster.

Quote from: thomas_remkus on June 10, 2005, 06:49:06 PM
AeroASM, you said " ... macro makes the code less clear ...". Hey, this is what a programmer I was chatting with on IRC last evening talked about. He stated that because of the ".if" and other macro additions to the dialect you lost needed opcodes and practice with "jse" and such. This was so over my head that I had to look it up. He's sort of right because the ".if" has no indication of negation. But personally, I love the "invoke" and checking for me. MASM is just heavenly.

The .if macros and the invoke macros are superb because they make the code clearer, especially when it is complex. However, a macro to put a value in eax is so simple that it would make the code less clear.

Quote from: thomas_remkus on June 10, 2005, 06:49:06 PM
Mark_Larson, you said "... also do it through MMX ...". But not all machines come with MMX, so you need to detect it first, right? And then optionally using the MMX registers in your code would just slow it down with the check. I presume you would need to compile specifically for MMX.

Practically every x86 processor made since the Pentium MMX and Pentium II has MMX, so it is safe to use it (IMO). However, for a single dword copy like this it is very slow.

Quote from: thomas_remkus on June 10, 2005, 06:49:06 PM
Thanks all! I am trying to work out why I still can't use something like "mov eax, offset hInstance" or maybe even "mov eax, offset offset hInstance". A long time ago I needed to understand pointers and not just memorize them. It's the same now. I don't want to use something unless I feel that I understand it. Can you offer some more patience and help me understand what "offset" and "addr" do, and how they relate to just putting the value of one variable into another variable?

An important thing to keep in mind is that, to the processor, there are no variables, only locations in memory. You can give a location a label like hInstance, if you want. The dd in the definition tells the assembler which operand size to assume, but the location itself has no fixed size: you could put a byte, a dword, or even a long string of bytes there.
To get the address of a label you use offset or addr (offset cannot handle local variables, whereas addr works only inside an invoke statement and cannot handle forward references).
Therefore mov eax,offset hInstance is perfectly fine.
However, how can you get the address of an address? An address is just a number, not something stored at a location, so it doesn't make sense. Therefore you cannot do mov eax,offset offset hInstance.
In this sense offset and addr are like & in C.

Consider this:
mov eax,offset hInstance
mov edx,[eax]

This means:
eax=&hInstance
edx=*eax

[] and * tell the processor to fetch the value stored in memory at a particular location. You tell it where by giving it an address. The address can be stored in a register or in memory. When an address is stored it is called a pointer, because you can use it like this to get the value that it is pointing to.

Note that the above example is superfluous, since conceptually it amounts to:
mov eax,[offset hInstance]
The assembler provides a shorthand for this, which is:
mov eax,hInstance
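
Putting the pieces together, here is a small sketch of address versus contents for both a global and a local. It assumes MASM32 is installed in \masm32; ShowLocal and tmp are made-up names for illustration. A local lives on the stack, so its address is only known at run time and has to be computed with lea (or passed with addr inside an invoke), while offset works for the global because its address is fixed when the program is built.

```asm
; Sketch only: assumes MASM32 is installed in \masm32 on the current drive.
.386
.model flat, stdcall
option casemap:none

ExitProcess PROTO :DWORD
includelib \masm32\lib\kernel32.lib

.data
hInstance dd 12345678h

.code
; Hypothetical procedure, for illustration only.
ShowLocal proc
    LOCAL tmp:DWORD
    mov  tmp, 5
    lea  eax, tmp              ; eax = address of tmp (ebp-relative, computed at run time)
    mov  eax, [eax]            ; eax = contents at that address = 5
    ret
ShowLocal endp

start:
    mov  eax, offset hInstance ; eax = &hInstance (the address)
    mov  edx, [eax]            ; edx = *eax = 12345678h (the contents)
    mov  ecx, hInstance        ; ecx = 12345678h, the same load in one instruction

    call ShowLocal             ; returns 5 in eax

    invoke ExitProcess, 0
end start
```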