Discussion on instructions

Started by jj2007, August 10, 2008, 02:01:21 PM

NightWare

Quote from: jj2007 on August 26, 2008, 05:55:13 PM
So we can consider .if "evil legacy code"  :toothy
you can. macros are made for general use, and NOT for optimized code... otherwise everybody would use macros!  :toothy

besides, your timing is more than debatable... GetTickCount has a possible margin of error of 7 ms, and since you call it twice that means 14 ms (and i remind you that the difference you obtained is smaller than that...)  :wink

jj2007

Quote from: NightWare on August 26, 2008, 09:52:35 PM
besides, your timing is more than debatable... GetTickCount has a possible margin of error of 7 ms, and since you call it twice that means 14 ms (and i remind you that the difference you obtained is smaller than that...)  :wink

I would be surprised if Michael had used GetTickCount for the test eax, eax vs. or eax, eax comparison. Where did you see that?

jj2007

Quote from: NightWare on August 26, 2008, 09:52:35 PM
Quote from: jj2007 on August 26, 2008, 05:55:13 PM
So we can consider .if "evil legacy code"  :toothy
you can. macros are made for general use, and NOT for optimized code... otherwise everybody would use macros!  :toothy

Finally I understand why my code is so slow... it's those damn macros! Thanxalot! :green2

NightWare

Quote from: jj2007 on August 26, 2008, 10:07:52 PM
Where did you see that?
Kip Irvine's x86 assembly book, 5th edition (the French one), specifies 10 ms on Win98, and i read somewhere else 7 ms for WinXP... but i don't remember where...  :(

Quote from: jj2007 on August 26, 2008, 11:01:14 PM
Finally I understand why my code is so slow... it's those damn macros! Thanxalot! :green2
no... here it's slow because you use slow instructions like fdiv...  :dance:

hutch--

The problem with timings is they are run in RING3 so ALL METHODS of timing that are not done under RING0 will be subject to variation due to various OS level priorities.

The only virtue of GetTickCount() is that it's simple to use. It has a granularity of about 15 ms in its results and wanders in its timings by about 3% in practice, but then so does every other timing method under RING3. The solution is to increase the sample size to get the duration higher; by doing this you reduce the relative error in proportion to the increase in duration.

In practice anything much under half a second starts to become unreliable, and timings under 100 ms generally produce nonsense. As the error margin is reasonably constant, you improve the timing accuracy tenfold by running the test over 1 second instead of 100 ms.

It becomes a point of diminishing returns to run much longer than that; once you are well under 1% you are not gaining much.
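
A rough sketch of the arithmetic in this post, not a measurement. It assumes the absolute uncertainty of a timed run is about one timer tick (15 ms, the granularity quoted above) no matter how long the run lasts. The name `relative_error` and the constant `TICK_MS` are made up for illustration.

```python
# Sketch: how a fixed timer granularity turns into relative error.
# Assumption: the absolute uncertainty is about one tick (15 ms here),
# regardless of how long the timed run lasts.

TICK_MS = 15.0  # assumed GetTickCount() granularity, as quoted in the post


def relative_error(duration_ms, tick_ms=TICK_MS):
    """Worst-case relative error of one timed run of the given length."""
    return tick_ms / duration_ms


if __name__ == "__main__":
    for duration_ms in (100, 500, 1000, 5000):
        pct = 100.0 * relative_error(duration_ms)
        print(f"{duration_ms:>5} ms run: up to {pct:.1f}% error")
```

Under this assumption, stretching a run from 100 ms to 1 second cuts the worst-case relative error tenfold, and each further increase buys less in absolute terms.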
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

Rockoon

Quote from: jj2007 on August 26, 2008, 05:19:39 PM
Quote from: Rockoon on August 26, 2008, 01:36:05 PM
Why is there an argument about this?
Because "your" side produces no evidence (lab tests, timings), just hearsay

Timings are irrelevant. Even if they clock exactly the same on your time test, it is still no excuse to use more resources than necessary.

I realize that you backed yourself into a corner here, which is why you then did this:

Quote from: jj2007 on August 26, 2008, 05:19:39 PM
It is always a pleasure to see your blood pressure rising, Rockoon  :green

Exactly whose blood pressure is rising here? The overly defensive person who is now making personal attacks towards a person who's made a single post in this thread, or me?

Why are you so hostile on this subject? It's real simple. Admit when you are wrong. Your ego doesn't get bruised when you do so. Your ego gets bruised when people like me call you out while you perpetuate a stupid line of reasoning in order to shift the goal post away from the flaws of your original argument.

Quote
Microsoft programmers thought 10 years ago that .if was a real Masm instruction. It still works in Masm 9.0

Way to redefine meanings in an attempt to save your prior statements while mucking up a new argument.

.if is a MASM directive.

It's called Assembly Language and .IF isn't an instruction.

You are grasping at straws here because you are too stubborn to admit the plainly obvious: that it is always superior to use fewer resources when all other things are equal.
When C++ compilers can be coerced to emit rcl and rcr, I *might* consider using one.

Rockoon

Quote from: hutch-- on August 26, 2008, 11:22:18 PM
The problem with timings is they are run in RING3 so ALL METHODS of timing that are not done under RING0 will be subject to variation due to various OS level priorities.

The only virtue of GetTickCount() is that it's simple to use. It has a granularity of about 15 ms in its results and wanders in its timings by about 3% in practice, but then so does every other timing method under RING3. The solution is to increase the sample size to get the duration higher; by doing this you reduce the relative error in proportion to the increase in duration.

In practice anything much under half a second starts to become unreliable, and timings under 100 ms generally produce nonsense. As the error margin is reasonably constant, you improve the timing accuracy tenfold by running the test over 1 second instead of 100 ms.

It becomes a point of diminishing returns to run much longer than that; once you are well under 1% you are not gaining much.

The problem with timing code fragments via artificial repetition is that you aren't timing a practical real world scenario...

...unless the real world scenario you are trying to time happens to be that code fragment, nestled between that loop code, surrounded by those timing calls, with that specific instruction alignment, and so on..

hutch--


> The problem with timing code fragments via artificial repetition is that you aren't timing a practical real world scenario...

I generally agree with this view. That's why you time algorithms, not just fragments: the algorithm is much closer to a real world operation. But the repetitive method of large count looping still gives you a method of comparison in which improvements in test speed correlate with improvements in real world performance.

The other factor is that in almost every instance applications run under RING3, so timings taken under RING3 are far closer to how the code will be used than fancy methods done in RING0 that don't have OS interference.
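
A minimal sketch of the "large count looping" method described above, run entirely in user mode. The helper `time_repeated` and the two workloads `candidate_a` and `candidate_b` are hypothetical stand-ins, not code from this thread.

```python
import time


def time_repeated(fn, iterations):
    """Return the average seconds per call of fn over `iterations` calls."""
    start = time.perf_counter()
    for _ in range(iterations):
        fn()
    elapsed = time.perf_counter() - start
    # Note: the loop and call overhead are included in the measurement.
    return elapsed / iterations


def candidate_a():
    # Hypothetical workload standing in for one version of an algorithm.
    return sum(range(200))


def candidate_b():
    # Hypothetical alternative that produces the same result in closed form.
    return 199 * 200 // 2


if __name__ == "__main__":
    n = 200_000
    print("a:", time_repeated(candidate_a, n))
    print("b:", time_repeated(candidate_b, n))
```

The comment inside `time_repeated` is the objection raised earlier in the thread: the figure you get describes the code as it sits inside this particular loop, so it is best read as a comparison between candidates rather than an absolute cost.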

Rockoon

Quote from: hutch-- on August 26, 2008, 11:40:07 PM
I generally agree with this view. That's why you time algorithms, not just fragments: the algorithm is much closer to a real world operation. But the repetitive method of large count looping still gives you a method of comparison in which improvements in test speed correlate with improvements in real world performance.

I agree that it gives a method of comparison, but it is unreliable. Both false positives and false negatives plague the results of these sorts of tests.

It is certainly the case that if X > Y in these sorts of timing tests, then X > Y is likely to remain the case in real world code and X < Y is extremely unlikely in real world code.

But when X = Y in these sorts of timing tests, it is not unrealistic to find that X != Y in a real world case...

...and furthermore, and more to the point, if we choose randomly between these two alternatives then we have a 50% chance of picking the superior method in those cases where X != Y. That 50% probability is as bad as it gets, while an intelligent chooser can expect to do better than 50% by simply using a rational line of reasoning.

X is better than Y because ....

Now, if I were faced with this thread and didn't know anything at all about computer architecture, I would see that one side has given a reason why they believe their X is better than Y, and it certainly sounds reasonable, and that the other side has not given any reason at all why their Y is better than X and in fact is simply declaring that X = Y. Well, you know what, I'm going to use X, since X is safe no matter who is right.





MichaelW

Quote from: Rockoon on August 26, 2008, 11:26:41 PM
...that it is always superior to use fewer resources when all other things are equal.

It's a shot in the dark. All other things seldom are equal, how is one to know when they are, and exactly how would you go about determining the amount of resources used?

Quote
The problem with timing code fragments via artificial repetition is that you aren't timing a practical real world scenario...

...unless the real world scenario you are trying to time happens to be that code fragment, nestled between that loop code, surrounded by those timing calls, with that specific instruction alignment, and so on..

The point of timing code, whether small fragments or entire algorithms, is to provide a reasonable basis for selecting the fastest instructions, instruction sequences, or algorithms. At least practically speaking, there is no other way to do this.
eschew obfuscation

jj2007

Quote from: Rockoon on August 26, 2008, 11:23:31 PM
Timings are irrelevant. Even if they clock exactly the same on your time test, it is still no excuse to use more resources than necessary.

As an economist, I am truly concerned about resource use. Bring me evidence (timings, lab tests, proof that CPU temperature is rising, and calculations how that would increase energy consumption if applied to all Google servers in the World) that there is a significant difference between .if eax and test eax, eax, and I will fall on my knees, praise Rockoon The Genius and quote you the rest of my life.

Quote
I realize that you backed yourself into a corner here, which is why you then did this:

Quote from: jj2007 on August 26, 2008, 05:19:39 PM
It is always a pleasure to see your blood pressure rising, Rockoon  :green

Exactly whose blood pressure is rising here? The overly defensive person who is now making personal attacks towards a person who's made a single post in this thread, or me?

Well, you arrive in this thread fuming like an attacking pitbull:

Quote
Choosing the extra operation in equivalent performance situations is just being stubborn, and promoting it would be irresponsible.

Which means the authors of .if (Microsoft, I guess?) are irresponsible  :bg

Quote
Why are you so hostile on this subject? It's real simple. Admit when you are wrong.

My first argument was and still is that there is no substantial argument for using
test eax, eax
je @F
  nop
@@:


instead of

.if eax
  nop
.endif


My second argument was that it would be particularly stupid to tell newbies not to use the .if directive (sorry, my mistake; you are perfectly right here). They need to be encouraged, not discouraged, to use the built-in macros. Once they have acquired confidence, they will ask the right questions, and learn how to write their own optimised macros.

I like this thread, and feel comfortable in my corner. There are some amused observers, and a number of people who behave as if they had themselves miniaturised, crept into the CPU, and watched with horror how jj ruthlessly destroyed registers, thus overheating the CPU and damaging the Earth's climate. As long as you do not bring proof, i.e. figures showing the difference between the two instructions, I will continue to call these arguments "folkloristic"; as soon as you bring such evidence, I will admit my failure. Not a ms earlier :bg

As I said earlier, if I am not using .if for some reason, I do use test eax, eax instead of or eax, eax. It's good practice, but it's not a religion.

sinsi

Quote from: jj2007 on August 27, 2008, 08:06:53 AM
My first argument was and still is that there is no substantial argument for using
test eax, eax
je @F
  nop
@@:


instead of

.if eax
  nop
.endif

My argument against that is: I want to control what happens in my code as far as opcodes go. When the code coming from ".if" changes - as it can at any time - it might break something (or several somethings) that I've used for years.

I hate macros (for the above reason) but I have a few of my own. One is "jeaxz" (I like the idea of jecxz)
jeaxz macro lbl:REQ
    test eax,eax
    jz lbl
endm
Light travels faster than sound, that's why some people seem bright until you hear them.

jj2007

Quote from: sinsi on August 27, 2008, 09:20:44 AM
My argument against that is: I want to control what happens in my code as far as opcodes go. When the code coming from ".if" changes - as it can at any time - it might break something (or several somethings) that I've used for years.

I understand. However, it is extremely unlikely that Microsoft would tell ml to produce different code (e.g. test eax, eax  :wink) - imagine the pile of their own code that would be broken :red

The argument is more complex when talking about macros produced by third parties like members of the Masm32 forum. Example: szstring to szstring copy:
      cst MACRO arg1,arg2
        invoke szCopy,reparg(arg2),tstarg(arg1)
      ENDM

That is a nice macro, of course, but you become dependent on one call and two other macros. The question is not whether macros are good or bad (an optimised macro can save a lot of time), but rather how much confidence we can have in their authors. I trust Hutch that m2m will never break my future code... :wink

Quote
jeaxz macro lbl:REQ
    test eax,eax
    jz lbl
endm


Cute :bg

hutch--

I must admit I find some humour in some of these discussions. The notion that there is a RIGHT WAY in a field as variable as assembler is not without its problems: make a rule and someone will always break it by doing something different that just works better. If you accept the axiom of change, you are never disappointed, as change will continue with or without you.

In OZ we call it "Rafferty's rules", something like "Anything goes" and if you want to get technical "epistemological anarchy" which means little more than "if it works, do it !".  :bg

Mark_Larson

Quote from: jj2007 on August 27, 2008, 08:06:53 AM
Quote from: Rockoon on August 26, 2008, 11:23:31 PM
Timings are irrelevant. Even if they clock exactly the same on your time test, it is still no excuse to use more resources than necessary.

As an economist, I am truly concerned about resource use. Bring me evidence (timings, lab tests, proof that CPU temperature is rising, and calculations how that would increase energy consumption if applied to all Google servers in the World) that there is a significant difference between .if eax and test eax, eax, and I will fall on my knees, praise Rockoon The Genius and quote you the rest of my life.


  So let's look at it from a processor point of view.  In the case of TEST it only has to do ONE thing, and that is update the flags.  In the case of OR it has to do TWO things: update the flags AND update the destination register.  There is no way the OR doesn't use more resources, since it has to do more.
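
A small model of the architectural behaviour being compared, as an illustration only. It assumes the documented x86 semantics for 32-bit logical operations (ZF, SF and PF set from the result, CF and OF cleared, AF undefined and left out here). The function names are invented for this sketch, and it says nothing about timing or power.

```python
MASK32 = 0xFFFFFFFF


def logic_flags(result):
    """Flags after a 32-bit logical op: ZF, SF, PF from the result; CF, OF cleared."""
    result &= MASK32
    return {
        "ZF": result == 0,
        "SF": bool(result >> 31),
        # PF is set when the low byte holds an even number of 1 bits.
        "PF": bin(result & 0xFF).count("1") % 2 == 0,
        "CF": False,
        "OF": False,
    }


def model_test(eax):
    """Model of `test eax, eax`: AND the operands, set flags, discard the result."""
    eax &= MASK32
    return eax, logic_flags(eax & eax)  # eax itself is never written


def model_or(eax):
    """Model of `or eax, eax`: OR the operands, set flags, write the result to eax."""
    result = (eax | eax) & MASK32
    return result, logic_flags(result)  # eax is rewritten with the same value


if __name__ == "__main__":
    for value in (0, 1, 0x80000000, 0xFFFFFFFF):
        print(hex(value), model_test(value) == model_or(value))
```

Both models return the same register value and the same flags for every input, which is why the two instructions are interchangeable as a zero test. The only modelled difference is the register write performed by `or`; whether that write costs anything measurable on a given CPU is what the thread is arguing about, and this sketch does not settle it.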

  When I was at Dell the hardware guys had to do thermal testing of the processors.  They had a program that ran different kinds of code.  It started out with ALU code, then FP code, MMX code, and finally SSE2 code.  When it hit the SSE2 code, the temperature was at its hottest. :thumbu
BIOS programmers do it fastest, hehe.  ;)

My Optimization webpage
http://www.website.masmforum.com/mark/index.htm