> cmp SomeString[0], 0
Paul, I had not yet tested this one, thanks. Apparently it's difficult to do this simple test with fewer than 7 bytes...
int 3 ; tell Olly you want to see this
cmp MyText[0], 0 ; 7 bytes
mov al, MyText[0] ; 5+2=7 bytes
test al, al
mov al, byte ptr MyText ; 5+2=7 (same code as before)
test al, al
test byte ptr MyText, 0FFh ; 7 bytes (a mask of 0 would always set ZF)
mov eax, offset MyText ; 9 bytes
mov al, [eax]
test al, al
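The byte counts above can be checked against the instruction encodings. The listing below is a minimal sketch, assuming a 32-bit flat-model build in which MyText is a byte array in .data; `<addr32>` stands for its 4-byte address. The last line, with the pointer already held in esi, is an added illustration and not something proposed in the thread.

```asm
; Encodings for the variants discussed above (32-bit code, MyText in .data).
cmp  byte ptr MyText, 0     ; 80 3D <addr32> 00   -> 7 bytes
mov  al, byte ptr MyText    ; A0 <addr32>         -> 5 bytes
test al, al                 ; 84 C0               -> 2 bytes
mov  eax, offset MyText     ; B8 <addr32>         -> 5 bytes
mov  al, [eax]              ; 8A 00               -> 2 bytes

; Assumption: if the address happens to be in a register already,
; the 4-byte address disappears from the encoding.
cmp  byte ptr [esi], 0      ; 80 3E 00            -> 3 bytes
```

Most of the 7 bytes are the absolute address itself, which is why the register-indirect form is so much shorter.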
JJ,
For me, it is not a question about the amount of bytes consumed, it is no longer that important. I like the simplicity of the instruction.
And God knows I do not want to beat a dead horse but you DO seem to have a penchant to do two things:
1. Use destructive instructions that can be done nondestructively (CMP opposed to MOV).
2. Utilize precious registers to do a job that can be done without tying them up.
Now, those two things are important to ME.
Just an observation,
-- Paul
Quote from: PBrennick on August 10, 2008, 10:26:43 PM
1. Use destructive instructions that can be done nondestructively (CMP opposed to MOV).
2. Utilize precious registers to do a job that can be done without tying them up.
Paul, I was just playing around trying to find an even more elegant solution - but yours is clearly the best, it seems. As to destructive solutions, my definition of destructive is "changes the content of a precious register". That is not the case for
mov eax, 12345678h ; test value
cmp eax, 0 ; 3 bytes
test eax, eax ; 2 bytes
or eax, eax ; 2 bytes
and eax, eax ; 2 bytes
... and even this pair of instructions is not destructive according to that definition:
inc eax ; 2*1 = 2 bytes
dec eax
So you see I am indeed concerned about not wasting registers :thumbu
As to size, yes I tend to be a bit ideological, as a manifestation against the 15,000,000,000 bytes of hot air that Vista wants to stuff down our throats. On the other hand, even under speed aspects size matters: If you have a medium sized project, you may reach the limits of the L1 cache, and then it pays off if your libraries take care of speed and size.
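Whether the register content survives the sequence above is easy to check. The program below is a minimal sketch, assuming the MASM32 SDK (masm32rt.inc with its print and hex$ macros) and a console build; it only tests the "content unchanged" definition and says nothing about what happens inside the CPU.

```asm
; Sketch only: assumes the MASM32 SDK and a console-subsystem link.
include \masm32\include\masm32rt.inc

.code
start:
    mov eax, 12345678h      ; test value
    cmp eax, 0              ; writes flags only
    test eax, eax           ; writes flags only
    or eax, eax             ; writes the same value back to eax, plus flags
    and eax, eax            ; same
    inc eax                 ; eax is temporarily 12345679h
    dec eax                 ; value restored; ZF now reflects eax == 0
    print hex$(eax), 13, 10 ; display eax after the whole sequence
    exit
end start
```

All of these instructions do modify the flags, so "non-destructive" here refers to the general-purpose register only.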
Almost every example you posted is destructive. Any time the value of a register has been changed, it has been destroyed, as it 'has' changed. Important info is lost.
mov eax, 12345678h ; eax has been changed
cmp eax, 0
test eax, eax
or eax, eax ; Technically has changed but doesn't matter, still a bad habit
and eax, eax ;same as above, but worse
inc eax ; eax has been changed
dec eax ; eax has been changed
Changed means destroyed as the original value has been lost.
-- Paul
Quote from: PBrennick on August 11, 2008, 11:58:04 AM
mov eax, 12345678h ; eax has been changed
Please quote me correctly:
mov eax, 12345678h ; test value
I dubbed this "test value" so that the reader of this post can easily check whether eax has been "destroyed" or not by the code that follows the wrongly quoted line, e.g. with a simple
MsgBox 0, hex$(eax), addr AppName, MB_OK
Quote from: PBrennick on August 11, 2008, 11:58:04 AM
Changed means destroyed as the original value has been lost.
It also (probably) means that accesses to that register cannot be reordered around that instruction. With register renaming, the actual register changes even though the name and contents remain the same.
It would be possible for the CPU design to optimise this out, as xor reg32,reg32 is (apparently) optimised, though I see no reason why the CPU design should specially handle what Paul so aptly describes as a bad habit.
(Also, cache concerns only really apply for a tight loop. If you plan on fitting an entire application in there you'd better inform the user that they can't have any other programs running at the same time, since they will also want to use the cache.)
Cheers,
Zooba :U
The common notion of "destructive" is one where a register is WRITTEN TO.
mov eax, eax
The register remains the same but it has had itself written back to itself.
Quote from: jj2007 on August 10, 2008, 02:01:21 PM
> cmp SomeString[0], 0
Paul, I had not yet tested this one, thanks. Apparently it's difficult to do this simple test with fewer than 7 bytes...
That's how I entered into the thread - with a thank you.
Quote from: PBrennick on August 10, 2008, 10:26:43 PM
...you DO seem to have a penchant to do two things:
1. Use destructive instructions that can be done nondestructively (CMP opposed to MOV).
2. Utilize precious registers to do a job that can be done without tying them up.
That's the answer I got - a personal attack saying I am an evil man who destroys and wastes registers.
Quote from: zooba on August 11, 2008, 12:54:32 PM
With register renaming, the actual register changes
Correct - and it might have an impact on a relevant aspect of programming, i.e. speed.
I appreciate your intent to contribute with logical arguments. I think it would help if we separated two models of the CPU:
1. The logical model
2. The physical model
In the logical model, or eax, eax and and eax, eax do not change the content of the register; so there is no harm in using these instructions because they will not break your code (and indeed, Masm and JWasm use them for high level constructs, although nobody would have stopped them from using test eax, eax instead).
Now, in the physical model, we have two sub-versions:
a) the folkloristic one: "These instructions are destructive, so the code must be slower because the poor CPU is working harder"
b) the one underpinned by measurable facts:
Quote from: MichaelW on August 07, 2008, 08:18:15 PM
On my P3, in a contrived test that does not use the result, OR is faster [than test eax, eax], but in a more realistic test that does use the result there is no measurable difference.
Speed is measurable. So is CPU temperature - maybe the CPU heats up some millidegrees after some zillion loops with "evil" or eax, eax instead of "benign" test eax, eax instructions; but I don't have the means to test that. Grateful if the hardware freaks could provide a link to Intel's lab research on this topic... :wink
To conclude, I am always open to a nice exchange of arguments based on figures.
:bg
> That's the answer I got - a personal attack saying I am an evil man who destroys and wastes registers.
I am an evil person who does NOT waste registers. :P
Oh no, is this the part where we argue over whether CP/M or VAX "destroy" more registers? :wink
Quote from: Mark Jones on August 11, 2008, 08:40:13 PM
Oh no, is this the part where we argue over whether CP/M or VAX "destroy" more registers? :wink
Me grown up with Fortran IV on a PDP-11, never had any problems with destroyed registers... :bg
JJ,
Perhaps you did not notice my comment about Michael's test, which was performed with no cache loading. That makes it invalid because, even if it is NOT a regular scenario, the cache can load and at that point the instruction will slow down. You do not seem to want to listen to good advice, and when these things are pointed out to you, you always claim that we are attacking you. No one is attacking you and you are definitely free to do whatever you wish; but when you misrepresent information on this board, which people who are learning will read, reaching wrong conclusions and developing bad habits, you can expect several of us to take notice and speak up.
Mark Larson recently spoke up about an error in advice that I had made. I did not cry about it, I did not try to fight an unwinnable war; I moved on.
This is what makes this board work. On a bad day when someone makes an unfortunate mistake, there is someone to come to the rescue. We all work together this way and no one takes it personally. You, on the other hand, are a discordant note. I do not know how this will all resolve, but I fear the worst. In the other topic, we all moved on to other things to take the heat off you, as everyone likes you. You do not seem to have noticed that and are just hijacking other threads to continue preaching. Please stop, it is sad.
-- Paul
Quote from: PBrennick on August 11, 2008, 10:25:24 PM
JJ,
Perhaps you did not notice my comment about Michael's test, which was performed with no cache loading. That makes it invalid because, even if it is NOT a regular scenario, the cache can load and at that point the instruction will slow down.
That's interesting. You mean this post, right?
Quote from: PBrennick on August 08, 2008, 12:33:13 AM
Michael,
None of those tests involve a memory to register, or register to memory operation, which is where the bottleneck is, as I already said.
Also, Michael, something for you to think about in terms of your macros is the fact that they do not perform any cache loading so a test between instructions where one writes to the cache and another doesn't is not really valid.
And finally, back to the original point of all this: OR 'is' destructive.
-- Paul
Red highlighting by myself. Could you please explain which cache is involved in the or eax, eax vs test eax, eax case? As you see, I do listen, and I am eager to learn from older colleagues.
Quote from: jj2007 on August 11, 2008, 11:01:15 PM
Could you please explain which cache is involved in the or eax, eax vs test eax, eax case? As you see, I do listen, and I am eager to learn from older colleagues.
jj, rewriting the register is really a bad habit, because you get a stall (waiting for the result to be written to the register), which is not the case for 'non destructive' instructions... :wink
Read the documentation, it is clear you have not - to this point. Especially if you do not know what a 'write cache' is. That is what the stall is all about: the pending write will happen and the stall ends. If the CPU is not too busy, the stall is minor or non-existent. This is what leads to results in timing macros that are suspect.
Since Michael's timing macros are mostly used by the 'optimization crowd' who know all this all too well and do not have these habits, the results of his macros are very dependable. But, like anything else, if you misuse a tool ...
You are arguing at this point, and angry; this is what leads you to write things like "I do listen, and I am eager to learn from older colleagues", especially when you have no intention of listening at all. Since you are arguing and you will NOT listen, this has become a waste of time that could be better spent.
-- Paul
Up here in New England, we have been decimated by rainfall this year. Is this happening in other parts of the world? Not much of a summer, just a handful of days in the 90s and the nights are quite chilly!
-- Paul
Quote from: PBrennick on August 12, 2008, 12:13:17 AM
Up here in New England, we have been decimated by rainfall this year. Is this happening in other parts of the world? Not much of a summer, just a handful of days in the 90s and the nights are quite chilly!
It's the same here in New France (just a few miles from you I guess). Rain, rain and rain.
In the holiday season we are a little egocentric and we don't care about the positive effects of rain :green2
Here in the northwest US, our spring was very rainy, overcast and cool, much more so than normal. The garden got off to a really slow start. But now our summer has been sunny and dry for the most part, but not excessively warm.
Quote from: hutch-- on August 11, 2008, 01:56:55 PM
The common notion of "destructive" is one where a register is WRITTEN TO.
mov eax, eax
The register remains the same but it has had itself written back to itself.
No. If this were true, then the term "non-destructive write" would be nonsense. However, it isn't. A destructive write is a write which changes content. IMO that's a matter of course.
If "destructive" is referring to cache content which is modified, then this has to be said explicitly; it's surely not "common notion".
I seem to remember reading somewhere that a simple "or eax,eax" to test for eax=0 was a bad idea and to use "test eax,eax" (Agner maybe?), because of the stall.
I remember because typing "or" was easier than typing "test", but my habit now is to use "test".
Off topic, but I use "sub eax,eax" rather than "xor eax,eax" because it sounds better in my brain when I say it...
japheth,
> No. If this were true, then the term "non-destructive write" would be nonsense.
The term "non destructive write" IS nonsense. A write is a write is a write. Code like,
mov eax, eax
differs from code like,
mov eax, ecx
only in what is written to EAX.
I agree with Sinsi here. Using an Athlon box, if I take "de-facto standard code" which uses instructions like ADD REG,1 and CMP REG,0 and replace those with INC REG and TEST REG,REG then the code generally runs much faster.
Of course there are inherent differences in the various processors and now this code will run like crap on an Intel CPU... ::)
Therefore, we must all use SSE5 to test for zero instead! :bdg :lol
Quote from: sinsi on August 12, 2008, 12:01:12 PM
I seem to remember reading somewhere that a simple "or eax,eax" to test for eax=0 was a bad idea and to use "test eax,eax" (Agner maybe?), because of the stall.
Agner, microarchitecture, page 45:
If the SUB ECX,EAX instruction in the first triplet is changed to CMP ECX,EAX then ECX is not written to, and we will get a stall
Hmmm... so we must use a destructive instruction to avoid a stall?
I don't want to appear stupid, but I am still waiting for an explanation, better: a URL, explaining which cache is affected by or eax, eax
I have tried Google with a highly specific destructive "test eax" "or eax" (http://www.google.it/search?num=50&hl=en&newwindow=1&safe=off&q=destructive+%22test+eax%22+%22or+eax%22&btnG=Search&meta=) search, but it yields only 15 hits, most of them irrelevant; and this thread here is on top (yeah, they are fast at Google :U).
EDIT: Slowly going mad. I want to understand these things... I even downloaded the Intel® Architecture Optimization Reference Manual here (http://www.intel.com/design/pentiumii/manuals/245127.htm). 322 pages, 330 times cache, 0 times destructive.
The only relevant thing I could find in Agner Fog's recent optimization manuals was the statement that TEST can cause partial flags stalls when followed by LAHF or PUSHF(D), where under the same conditions AND and OR will not, but I think this is not really worth considering because of the unusual conditions.
From the IA-32 Intel Architecture Optimization Reference Manual (24896611.pdf):
Quote
Use test when comparing a value in a register with zero. Test essentially ands the operands together without writing to a destination register. Test is preferred over and because and produces an extra result register. Test is better than cmp ..., 0 because the instruction size is smaller.
Assembly/Compiler Coding Rule 50. (ML impact, M generality)
Use the test instruction instead of and when the result of the logical and is not used. This saves uops in execution. Use a test of a register with itself instead of a cmp of the register to zero, this saves the need to encode the zero and saves encoding space. Avoid comparing a constant to a memory operand. It is preferable to load the memory operand and compare the constant to a register.
I think the reasonable approach would be to assume that under normal conditions TEST may be faster. I tend to use TEST, same as I tend to use XOR reg, reg instead of MOV reg, 0, even though AFAIK the former had a speed advantage only on the 8086/88.
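The quoted coding rules can be put side by side in a short program. This is a minimal sketch, assuming the MASM32 SDK; MyCount is a hypothetical variable introduced only for illustration, and the comments restate the manual's reasoning, not a measurement.

```asm
; Sketch only: assumes the MASM32 SDK; MyCount is a hypothetical variable.
include \masm32\include\masm32rt.inc

.data
MyCount dd 10

.code
start:
    ; Rule: load the memory operand, then compare the constant to a register,
    ; in preference to "cmp MyCount, 10" with a memory operand.
    mov eax, MyCount

    ; Rule: test a register with itself for zero.
    ;   test eax, eax   2 bytes, writes flags only
    ;   cmp  eax, 0     3 bytes, has to encode the zero
    ;   and  eax, eax   2 bytes, but also writes a result to eax
    test eax, eax
    jz done

    cmp eax, 10
    jne done
    print "MyCount is 10", 13, 10
done:
    exit
end start
```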
Quote from: MichaelW on August 12, 2008, 03:50:53 PM
The only relevant thing I could find in Agner Fog's recent optimization manuals was the statement that TEST can cause partial flags stalls when followed by LAHF or PUSHF(D), where under the same conditions AND and OR will not, but I think this is not really worth considering because of the unusual conditions.
Agreed.
Quote
From the IA-32 Intel Architecture Optimization Reference Manual (24896611.pdf):
Quote
Use test when comparing a value in a register with zero. Test essentially ands the operands together without writing to a destination register. Test is preferred over and because and produces an extra result register.
Which may indeed affect speed, although it's probably not the original physical register, due to register renaming. The accent is on "might", because setting flags is also a write operation, and who knows whether the two writes can be performed in parallel or not...
Quote
I think the reasonable approach would be to assume that under normal conditions TEST may be faster. I tend to use TEST, same as I tend to use XOR reg, reg instead of MOV reg, 0, even though AFAIK the former had a speed advantage only on the 8086/88.
The whole point is and was whether this potential hypothetical but yet unproven gain is important enough to incite newbies to fumble with jxx instead of using .if eax==0... oh well. By the way, I also use test if for some reason I don't want the high level syntax.
Thanks for a sober post, Michael.
Any instruction sequence with more uops than another "functionally equivalent" sequence could very well affect performance on Intel CPUs with a Trace Cache <-- a cache
We sometimes forget that we aren't programming for 386s anymore.
Why is there an argument about this?
An extra operation, even if it does nothing functional, is still an extra operation. That extra operation may not carry a performance penalty (today), but it's a silly argument to run headlong into your code with the idea that extra operations are "OK" if they don't affect performance.
Given two methods that perform equivalently but one of them actually does less work, there is no debate at all about which one is superior. Really. No debate at all.
Choosing the extra operation in equivalent performance situations is just being stubborn, and promoting it would be irresponsible.
As with all performance-related questions, first you should determine which methods are most performant in real world scenarios (which doesn't mean some tight loop around code fragments). Then you break ties using logic and common sense, considering relevant factors.
Relevant factors may include resource usage, but may also include other existing and future architectures (legacy & longevity) or even power usage. You might shrug off power usage for your particular application, but others may not be so comfortable in doing so. Maybe their code has a target of a trillion hours of execution time spread over a hundred million customers. That small per-iteration power savings really starts to look like something at these scales. What's relevant is very situational.
Quote from: jj2007 on August 12, 2008, 04:03:15 PM
The whole point is and was whether this potential hypothetical but yet unproven gain is important enough to incite newbies to fumble with jxx instead of using .if eax==0... oh well. By the way, I also use test if for some reason I don't want the high level syntax.
Thanks for a sober post, Michael.
Fumble with the jxx instructions?
vs what? Thinking that .if is a real instruction?
Newbies, right?
Should newbies really be using macros that emit more than 1 instruction, ever?
Should newbies be shielded from the details of the flags register?
The flags register should be thrust upon newbies, because nearly every instruction executed within a program writes to it, and the best tweak-style optimizations leverage that fact.
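A small example of leaning on flags that an instruction has already written: the loop below never compares its counter with zero, because dec sets ZF itself. This is a minimal sketch, assuming the MASM32 SDK for the print and str$ macros; the loop is purely illustrative.

```asm
; Sketch only: assumes the MASM32 SDK and a console-subsystem link.
include \masm32\include\masm32rt.inc

.code
start:
    xor eax, eax            ; running sum
    mov ecx, 5              ; loop counter
  @@:
    add eax, ecx            ; 5 + 4 + 3 + 2 + 1
    dec ecx                 ; dec sets ZF when ecx reaches 0 ...
    jnz @B                  ; ... so no cmp ecx, 0 or test ecx, ecx is needed
    print str$(eax), 13, 10 ; eax holds 15 at this point
    exit
end start
```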
well said Rockoon.
Kram
Quote from: Rockoon on August 26, 2008, 01:36:05 PM
Why is there an argument about this?
Because "your" side produces no evidence (lab tests, timings), just hearsay
Quote
You might shrug off power usage for your particular application
I never would. Show me how many % of electricity you can save by poking your code directly into memory, instead of fumbling with hl constructs
Quote
Fumble with the jxx instructions?
vs what? Thinking that .if is a real instruction?
Microsoft programmers thought 10 years ago that .if was a real Masm instruction. It still works in Masm 9.0
It is always a pleasure to see your blood pressure rising, Rockoon :green
MASM 6.0 was released ~1991.
Quote from: MichaelW on August 26, 2008, 05:31:41 PM
MASM 6.0 was released ~1991.
So we can consider .if "evil legacy code" :toothy
Quote from: jj2007 on August 26, 2008, 05:55:13 PM
So we can consider .if "evil legacy code" :toothy
you can, macros are made for a general use. and NOT for optimized code... otherwise everybody will use macros ! :toothy
besides, your timing is more than debatable... GetTickCount has a possible 7 ms margin of error; since you use it twice, that means 14 (and I remind you that the difference obtained was smaller...) :wink
Quote from: NightWare on August 26, 2008, 09:52:35 PM
besides, your timing is more than debatable... GetTickCount has a possible 7 ms margin of error; since you use it twice, that means 14 (and I remind you that the difference obtained was smaller...) :wink
I would be surprised if Michael had used GetTickCount for the test eax, eax vs or eax, eax comparison. Where did you see that?
Quote from: NightWare on August 26, 2008, 09:52:35 PM
Quote from: jj2007 on August 26, 2008, 05:55:13 PM
So we can consider .if "evil legacy code" :toothy
you can, macros are made for a general use. and NOT for optimized code... otherwise everybody will use macros ! :toothy
Finally I understand why my code (http://www.masm32.com/board/index.php?topic=9756.msg71425#msg71425) is so slow... it's those damn macros! Thanxalot! :green2
Quote from: jj2007 on August 26, 2008, 10:07:52 PM
Where did you see that?
Kip Irvine's x86 asm book, 5th edition (the French one), specifies 10 ms on Win98, and I read somewhere else 7 ms for WinXP... but I don't remember where... :(
Quote from: jj2007 on August 26, 2008, 11:01:14 PM
Finally I understand why my code (http://www.masm32.com/board/index.php?topic=9756.msg71425#msg71425) is so slow... it's those damn macros! Thanxalot! :green2
no... here it's slow because you use slow instructions like fdiv... :dance:
The problem with timings is they are run in RING3 so ALL METHODS of timing that are not done under RING0 will be subject to variation due to various OS level priorities.
The only virtue of GetTickCount() is that it's simple to use. It has a granularity of about 15ms in results and wanders in its timings by about 3% in practice, but then so does every other timing method under RING3. The solution is to increase the sample size to get the duration higher, as by doing this you reduce the error amount by the scale of the increase in duration.
In practice anything much under a half a second starts to become unreliable and the results for timings under 100ms generally produce nonsense but as the error margin is reasonably constant, you improve the timing accuracy from 100ms by ten times by running the test over 1 second.
It becomes a point of diminishing returns to run much longer than that, once you are well under 1% you are not gaining much.
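The scaling argument is simple arithmetic: a 15 ms granularity is up to 15% of a 100 ms run, but only 1.5% of a 1 second run. The skeleton below is a minimal sketch of a GetTickCount timing loop, assuming the MASM32 SDK; LOOPS and t0 are hypothetical names, and the loop count is a placeholder to be tuned until the run lasts about a second on the machine at hand.

```asm
; Sketch only: assumes the MASM32 SDK and a console-subsystem link.
include \masm32\include\masm32rt.inc

LOOPS equ 1000000000        ; assumption: tune so the run lasts about 1 second

.data?
t0 dd ?                     ; tick count at the start of the run

.code
start:
    invoke GetTickCount
    mov t0, eax
    mov ecx, LOOPS
  @@:
    test eax, eax           ; instruction under test
    dec ecx
    jnz @B
    invoke GetTickCount
    sub eax, t0             ; elapsed milliseconds
    print str$(eax), " ms", 13, 10
    exit
end start
```

The result includes the loop overhead (dec/jnz), so it is only meaningful when compared with a second run that differs in the instruction under test alone.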
Quote from: jj2007 on August 26, 2008, 05:19:39 PM
Quote from: Rockoon on August 26, 2008, 01:36:05 PM
Why is there an argument about this?
Because "your" side produces no evidence (lab tests, timings), just hearsay
Timings are irrelevant. Even if they clock exactly the same on your time test, it is still no excuse to use more resources than necessary.
I realize that you backed yourself into a corner here, which is why you then did this:
Quote from: jj2007 on August 26, 2008, 05:19:39 PM
It is always a pleasure to see your blood pressure rising, Rockoon :green
Exactly whose blood pressure is rising here? The overly defensive person who is now making personal attacks towards a person who's made a single post in this thread, or me?
Why are you so hostile on this subject? It's real simple. Admit when you are wrong. Your ego doesn't get bruised when you do so. Your ego gets bruised when people like me call you out while you perpetuate a stupid line of reasoning in order to shift the goal post away from the flaws of your original argument.
Quote
Microsoft programmers thought 10 years ago that .if was a real Masm instruction. It still works in Masm 9.0
Way to redefine meanings in an attempt to save your prior statements while mucking up a new argument.
.if is a masm directive.
It's called Assembly Language and .IF isn't an instruction.
You are grasping at straws here because you are too stubborn to admit the plainly obvious... that it is always superior to use less resources when all other things are equal.
Quote from: hutch-- on August 26, 2008, 11:22:18 PM
The problem with timings is they are run in RING3 so ALL METHODS of timing that are not done under RING0 will be subject to variation due to various OS level priorities.
The only virtue of GetTickCount() is that it's simple to use. It has a granularity of about 15ms in results and wanders in its timings by about 3% in practice, but then so does every other timing method under RING3. The solution is to increase the sample size to get the duration higher, as by doing this you reduce the error amount by the scale of the increase in duration.
In practice anything much under a half a second starts to become unreliable and the results for timings under 100ms generally produce nonsense but as the error margin is reasonably constant, you improve the timing accuracy from 100ms by ten times by running the test over 1 second.
It becomes a point of diminishing returns to run much longer than that, once you are well under 1% you are not gaining much.
The problem with timing code fragments via artificial repetition is that you aren't timing a practical real world scenario...
...unless the real world scenario you are trying to time happens to be that code fragment, nestled between that loop code, surrounded by those timing calls, with that specific instruction alignment, and so on..
> The problem with timing code fragments via artificial repetition is that you aren't timing a practical real world scenario...
I generally agree with this view; that's why you time algorithms, not just fragments, as the algorithm is much closer to a real world operation. But the repetitive method of large count looping gives you a method of comparison where improvements in test speed correlate to improvements in real world performance.
The other factor is in almost every instance applications run under RING3 so timings taken under RING3 are far closer to how they will be used than fancy methods done in RING0 that don't have OS interference.
Quote from: hutch-- on August 26, 2008, 11:40:07 PM
I generally agree with this view; that's why you time algorithms, not just fragments, as the algorithm is much closer to a real world operation. But the repetitive method of large count looping gives you a method of comparison where improvements in test speed correlate to improvements in real world performance.
I agree that it gives a method of comparison, but it is unreliable. Both false positives and false negatives plague the results of these sorts of tests.
It is certainly the case that if X > Y in these sorts of timing tests, then X > Y is likely to remain the case in real world code and X < Y is extremely unlikely in real world code.
But when X = Y in these sorts of timing tests, it is not unrealistic to find that X != Y in a real world case...
...and furthermore, and more to the point, if we choose randomly between these two alternatives then we have a 50% chance of picking the superior method in those cases where X != Y. That 50% probability is as bad as it gets, while an intelligent chooser can expect to do better than 50% by simply using a rational line of reasoning.
X is better than Y because ....
Now, if I were faced with this thread and didn't know anything at all about computer architecture, I would see that one side has given a reason why they believe their X is better than Y, and it certainly sounds reasonable, and that the other side has not given any reason at all why their Y is better than X and in fact is simply declaring that X = Y. Well, you know what? I'm going to use X, since X is safe no matter who is right.
Quote from: Rockoon on August 26, 2008, 11:26:41 PM
. . .that it is always superior to use less resources when all other things are equal.
It's a shot in the dark. All other things seldom are equal, how is one to know when they are, and exactly how would you go about determining the amount of resources used?
Quote
The problem with timing code fragments via artificial repetition is that you aren't timing a practical real world scenario...
...unless the real world scenario you are trying to time happens to be that code fragment, nestled between that loop code, surrounded by those timing calls, with that specific instruction alignment, and so on..
The point of timing code, whether small fragments or entire algorithms is to provide a reasonable basis for selecting the fastest instructions, instruction sequences, or algorithms. At least practically speaking, there is no other way to do this.
Quote from: Rockoon on August 26, 2008, 11:23:31 PM
Timings are irrelevant. Even if they clock exactly the same on your time test, it is still no excuse to use more resources than necessary.
As an economist, I am truly concerned about resource use. Bring me evidence (timings, lab tests, proof that CPU temperature is rising, and calculations how that would increase energy consumption if applied to all Google servers in the World) that there is a significant difference between .if eax and test eax, eax, and I will fall on my knees, praise Rockoon The Genius and quote you the rest of my life.
Quote
I realize that you backed yourself into a corner here, which is why you then did this:
Quote from: jj2007 on August 26, 2008, 05:19:39 PM
It is always a pleasure to see your blood pressure rising, Rockoon :green
Exactly whose blood pressure is rising here? The overly defensive person who is now making personal attacks towards a person who's made a single post in this thread, or me?
Well, you arrive in this thread fuming like an attacking pitbull:
Quote
Choosing the extra operation in equivalent performance situations is just being stubborn, and promoting it would be irresponsible.
Which means the authors of .if (Microsoft, I guess?) are irresponsible :bg
Quote
Why are you so hostile on this subject? It's real simple. Admit when you are wrong.
My first argument was and still is that there is no substantial argument for using
test eax, eax
je @F
nop
@@:
instead of
.if eax
nop
.endif
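The two forms can be placed in one program and compared in a listing. This is a minimal sketch, assuming the MASM32 SDK; the label name skip_msg is hypothetical. Earlier in this thread it is said that Masm emits or eax, eax for the high level construct; that is the thread's claim and is not verified here. Assembling with ML's /Fl and /Sg options should produce a listing that includes the generated code, so the claim can be checked for a given assembler version.

```asm
; Sketch only: assumes the MASM32 SDK and a console-subsystem link.
include \masm32\include\masm32rt.inc

.code
start:
    ; High level form: the assembler chooses the instructions.
    mov eax, 1
    .if eax
        print "eax is non-zero (.if)", 13, 10
    .endif

    ; Hand-written form: the instructions are fixed by the source.
    mov eax, 1
    test eax, eax
    je skip_msg
    print "eax is non-zero (test/je)", 13, 10
skip_msg:
    exit
end start
```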
My second argument was that it would be particularly stupid to tell newbies not to use the .if directive (sorry, my mistake, you are perfectly right here). They need to be encouraged, not discouraged, to use the built-in macros. Once they have acquired confidence, they will ask the right questions, and learn how to write their own optimised macros.
I like this thread, and feel comfortable in my corner. There are some amused observers, and a number of people who behave as if they had themselves miniaturised, crept into the CPU, and watched with horror how jj ruthlessly destroyed registers, thus overheating the CPU and damaging the Earth's climate. As long as you do not bring proof, i.e. figures showing the difference between the two instructions, I will continue to call these arguments "folkloristic"; as soon as you bring such evidence, I will admit my failure. Not a ms earlier :bg
As I said earlier, if I am not using .if for some reason, I do use test eax, eax instead of or eax, eax. It's good practice, but it's not a religion.
Quote from: jj2007 on August 27, 2008, 08:06:53 AM
My first argument was and still is that there is no substantial argument for using
test eax, eax
je @F
nop
@@:
instead of
.if eax
nop
.endif
My argument against that is: I want to control what happens in my code as far as opcodes go. When the code coming from ".if" changes - as it can at any time - it might break something (or several somethings) that I've used for years.
I hate macros (for the above reason) but I have a few of my own. One is "jeaxz" (i like the idea of jecxz)
jeaxz macro lbl:REQ
test eax,eax
jz lbl
endm
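A possible use of the macro, as a minimal sketch assuming the MASM32 SDK: the Sample string and the labels empty and done are hypothetical, and lstrlen merely stands in for any call that returns zero in eax for the "nothing there" case.

```asm
; Sketch only: assumes the MASM32 SDK and a console-subsystem link.
include \masm32\include\masm32rt.inc

jeaxz macro lbl:REQ
    test eax, eax           ; sets ZF when eax is 0, leaves eax untouched
    jz lbl
endm

.data
Sample db "hello", 0        ; hypothetical test string

.code
start:
    invoke lstrlen, addr Sample   ; length returned in eax
    jeaxz empty                   ; branch if the length is 0
    print "string is not empty", 13, 10
    jmp done
empty:
    print "string is empty", 13, 10
done:
    exit
end start
```

Because the macro body is two visible instructions under the author's own control, a new assembler version cannot change what it emits.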
Quote from: sinsi on August 27, 2008, 09:20:44 AM
My argument against that is: I want to control what happens in my code as far as opcodes go. When the code coming from ".if" changes - as it can at any time - it might break something (or several somethings) that I've used for years.
I understand. However, it is extremely unlikely that Microsoft would tell ml to produce different code (e.g. test eax, eax :wink) - imagine the pile of their own code that would be broken :red
The argument is more complex when talking about macros produced by third parties like members of the Masm32 forum. Example szstring to szstring copy:
cst MACRO arg1,arg2
invoke szCopy,reparg(arg2),tstarg(arg1)
ENDM
That is a nice macro, of course, but you become dependent on one call and two other macros. The question is not whether macros are good or bad (an optimised macro can save a lot of time), but rather how much confidence we can have in their authors. I trust Hutch that m2m will never break my future code... :wink
Quote
jeaxz macro lbl:REQ
test eax,eax
jz lbl
endm
Cute :bg
I must admit I find some humour in some of these discussions. The notion that there is a RIGHT WAY in a field as variable as assembler is not without its problems: make a rule and someone will always break it by doing something different that just works better. If you accept the axiom of change, you are never disappointed, as change will continue with or without you.
In OZ we call it "Rafferty's rules", something like "Anything goes" and if you want to get technical "epistemological anarchy" which means little more than "if it works, do it !". :bg
Quote from: jj2007 on August 27, 2008, 08:06:53 AM
Quote from: Rockoon on August 26, 2008, 11:23:31 PM
Timings are irrelevant. Even if they clock exactly the same on your time test, it is still no excuse to use more resources than necessary.
As an economist, I am truly concerned about resource use. Bring me evidence (timings, lab tests, proof that CPU temperature is rising, and calculations how that would increase energy consumption if applied to all Google servers in the World) that there is a significant difference between .if eax and test eax, eax, and I will fall on my knees, praise Rockoon The Genius and quote you the rest of my life.
So let's look at it from the processor's point of view. In the case of TEST it only has to do ONE thing: update the flags. In the case of OR it has to do TWO things: update the flags AND update the destination register. There is no way the OR uses fewer resources, since it has to do more.
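To make the comparison concrete, here is a small sketch of the two instructions side by side, assuming a MASM32-style console build with the usual \masm32 include paths; the test value and label are arbitrary. It shows only what each instruction does architecturally and says nothing about timings or power.

```asm
.386
.model flat, stdcall
option casemap:none

include \masm32\include\kernel32.inc
includelib \masm32\lib\kernel32.lib

.code
start:
    mov eax, 12345678h      ; arbitrary non-zero test value

    test eax, eax           ; computes eax AND eax, sets SF/ZF/PF, clears CF/OF,
                            ; then discards the result: eax is only read
    jz  is_zero             ; not taken, ZF = 0

    or  eax, eax            ; computes eax OR eax, same flag results,
                            ; but also writes the (unchanged) value back to eax
    jz  is_zero             ; not taken either

    invoke ExitProcess, 1   ; exit code 1: eax was non-zero
is_zero:
    invoke ExitProcess, 0   ; exit code 0: eax was zero
end start
```

Both forms drive the branch identically; the only architectural difference is that OR names eax as a destination as well as a source.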
When I was at Dell the hardware guys had to do thermal testing of the processors. They had a program that ran different kinds of code. It started out with ALU code, FP code, MMX code, and finally SSE2 code. When it hit the SSE2 code, the temperature was at its hottest. :thumbu
Quote from: Mark_Larson on August 27, 2008, 02:29:47 PM
Quote from: jj2007 on August 27, 2008, 08:06:53 AM
Bring me evidence (timings, lab tests, proof that CPU temperature is rising, and calculations how that would increase energy consumption if applied to all Google servers in the World) that there is a significant difference between .if eax and test eax, eax
Quote
When I was at Dell the hardware guys had to do thermal testing of the processors. They had a program that ran different kinds of code. It started out with ALU code, FP code, MMX code, and finally SSE2 code. When it hit the SSE2 code, the temperature was at its hottest. :thumbu
I hope you are not seriously suggesting that using SSE2 is bad programming practice?
Please contact your old friends at Dell. We'll put up a serious test: How much % extra energy consumption when using .if eax (aka or eax, eax) instead of test eax, eax while booting Vista? Since they are building notebooks, they must be concerned about battery life issues.
Quote from: jj2007 on August 27, 2008, 02:41:23 PM
Quote from: Mark_Larson on August 27, 2008, 02:29:47 PM
Quote from: jj2007 on August 27, 2008, 08:06:53 AM
Bring me evidence (timings, lab tests, proof that CPU temperature is rising, and calculations how that would increase energy consumption if applied to all Google servers in the World) that there is a significant difference between .if eax and test eax, eax
Quote
When I was at Dell the hardware guys had to do thermal testing of the processors. They had a program that ran different kinds of code. It started out with ALU code, FP code, MMX code, and finally SSE2 code. When it hit the SSE2 code, the temperature was at its hottest. :thumbu
I hope you are not seriously suggesting that using SSE2 is bad programming practice?
Please contact your old friends at Dell. We'll put up a serious test: How much % extra energy consumption when using .if eax (aka or eax, eax) instead of test eax, eax while booting Vista? Since they are building notebooks, they must be concerned about battery life issues.
They are right; you just don't want to admit you are wrong.
You said you wanted proof of it using more resources. Done. And I'm done posting, since it's obviously a waste of time.
Quote from: jj2007 on August 27, 2008, 08:06:53 AM
As an economist, I am truly concerned about resource use. Bring me evidence (timings, lab tests, proof that CPU temperature is rising, and calculations how that would increase energy consumption if applied to all Google servers in the World) that there is a significant difference between .if eax and test eax, eax, and I will fall on my knees, praise Rockoon The Genius and quote you the rest of my life.
The first, obvious significant difference:
'.if eax' produces 2 instructions while 'test eax, eax' is a single instruction.
Now I'm sure that you are cursing yourself for not including a conditional branch with the test instruction, so I am going to give you a pass on that.
.if eax produces 2 instructions right next to each other which you cannot separate. You cannot place an instruction in between these two when it is sometimes advantageous from a performance standpoint to do so.
Further, it is non-trivial to make the second instruction a target of a branch which is sometimes advantageous from a performance perspective (why perform a cmp/test/whatever when the flags already have the information?)
And finally, my point about .if wasn't even performance related. I said exactly what my point was, but you conveniently ignored that point in order to have a leg to stand upon.
My point was related to you suggesting that newbie assembly language programmers should use it. No, they shouldn't. They need to learn about the flags register and about which instructions affect them, as well as which instructions affect only some of them. The flags are the core of program flow, and are key to many big optimizations which avoid changing program flow. Directives such as '.if' are shackles.
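The two performance points above (putting another instruction between the test and the branch, and branching on flags that an earlier instruction already produced) can be sketched roughly like this. It assumes a MASM32-style console build with the usual \masm32 include paths; the variable, labels and values are invented for illustration, and whether either trick pays off on a given CPU is a separate question that needs measuring.

```asm
.386
.model flat, stdcall
option casemap:none

include \masm32\include\kernel32.inc
includelib \masm32\lib\kernel32.lib

.data
counter dd 3                ; hypothetical loop count

.code
start:
    mov ecx, counter
    xor edx, edx            ; running total

    ; 1) An instruction placed between the flag-setting test and the branch.
    test ecx, ecx           ; sets ZF
    mov eax, 10             ; mov does not modify any flags
    jz  done                ; still branches on the result of the test above

    ; 2) Branching on flags that an earlier instruction already produced.
again:
    add edx, eax            ; edx += 10
    dec ecx                 ; dec sets ZF itself (and leaves CF alone)
    jnz again               ; no separate test/cmp needed

done:
    invoke ExitProcess, edx ; exit code 30 with the values above
end start
```

Neither pattern can be written with a bare .if eax, because the directive always emits its own flag-setting instruction immediately followed by the jump.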
The area of "newbie" programming is one I have a reasonable amount of mileage in, and the assumption that learner programmers should have to start with bare mnemonic decision making is a serious mistake that is still being flogged by some. My generation grew up with bare mnemonic coding and many held that prejudice well past its necessity through habit and conservatism.
A field of programming like assembler is abstract enough to start with and this is where learners have the greatest difficulty even though many have programming experience in other languages. The old approach saw assembler programming collapse through lack of interest while having a reputation of being unreadable, unlearnable and unnecessary. The new approach of using everything that was available including pseudo high level constructs, RC style resources, macros, libraries, modular programming techniques, algorithm design, complex data structures and anything else that was in the range of conventional programming techniques put assembler programming back on the map as a viable language for high performance applications.
Put in short form, the old style failed where the new style succeeded because it was learnable where the old stuff wasn't. A large number of people who started the old way failed and wrote NO ASSEMBLER code; those who start with the new approach get much more written much more easily and get to see enough assembler code to learn the more complex and abstract lower level formats.
Quote from: Rockoon on August 27, 2008, 04:52:38 PM
Quote from: jj2007 on August 27, 2008, 08:06:53 AM
As an economist, I am truly concerned about resource use. Bring me evidence (timings, lab tests, proof that CPU temperature is rising, and calculations how that would increase energy consumption if applied to all Google servers in the World) that there is a significant difference between .if eax and test eax, eax, and I will fall on my knees, praise Rockoon The Genius and quote you the rest of my life.
The first, obvious significant difference:
'.if eax' produces 2 instructions while 'test eax, eax' is a single instruction.
Maybe you should read this post on page 3 (http://www.masm32.com/board/index.php?topic=9650.msg71475#msg71475) of this thread? It's nicely described there, at newbie level.
Quote
Now I'm sure that you are cursing yourself for not including a conditional branch with the test instruction, so I am going to give you a pass on that.
No, I am not cursing myself; on the contrary, I will drink a glass of wine to your health - it's good to know you can be generous and have a sense of humour :toothy
Quote
.if eax produces 2 instructions right next to each other which you cannot separate. You cannot place an instruction in between these two when it is sometimes advantageous from a performance standpoint to do so.
Further, it is non-trivial to make the second instruction a target of a branch which is sometimes advantageous from a performance perspective (why perform a cmp/test/whatever when the flags already have the information?)
What you say is absolutely correct. Newbies are bound to discover optimisation after having successfully written their first ten thousand lines of code.
Quote
And finally, my point about .if wasn't even performance related. I said exactly what my point was, but you conveniently ignored that point in order to have a leg to stand upon.
If I understood you well, you wanted to help Al Gore save The Climate by banning or eax, eax. To which Mark conveniently added that at Dell they found higher temperatures for SSE2 instructions (which for him is The Proof, although I am just a humble economist unable to follow his line of argument).
Quote
My point was related to you suggesting that newbie assembly language programmers should use it. No, they shouldn't. They need to learn about the flags register and about which instructions affect them, as well as which instructions affect only some of them. The flags are the core of program flow, and are key to many big optimizations which avoid changing program flow. Directives such as '.if' are shackles.
Hmmm... I had the good intention not to spoil the excellent post of Sir Hutch right above this one, but I cannot refrain from asking the obvious question: If .if directives are shackles, then for whom did Microsoft write them? For seasoned young programmers like you?
Prost :U
JJ,
Quote
No, I am not cursing myself; on the contrary, I will drink a glass of wine to your health - it's good to know you can be generous and have a sense of humour
I think, between typing nonsense on the Internet and drinking that wine all day, you have time for little else. How old are you, anyway? This whole conversation thing is beginning to remind me of someone else who used to do this until he hit the age of 18. He used to drive me crazy, too. Have you tried Harold's board? They may be more attuned to your way of thinking and you all can have a very cheery time.
Seriously, JJ, this is not fun any more and you should stop it.
-- Paul
It is definitely time that this thread got put to sleep. :boohoo:
Quote from: ChrisLeslie on August 28, 2008, 05:52:01 AM
It is definitely time that this thread got put to sleep. :boohoo:
.if humour == 0
.break
.else
MsgBox 0, "One more game?", " :bg "
.endif
Quote from: jj2007 on August 28, 2008, 07:07:48 AM
MsgBox 0, "One more game?", " :bg "
Game over. Go to bed young man.
.if eax & eax
test eax,eax
.if !zero?
It's all about not having to make up labels for every meaningless loop/branch...
PS: didn't read all the posts
Good point, drizz, your examples are certainly not as evil as the simple .if eax :U
.if eax & eax
nop
.endif
;test eax, eax
;je short ..
;nop
test eax,eax
.if !zero?
nop
.endif
;test eax, eax
;je short ..
;nop
.if eax
nop
.endif
;or eax, eax
;je short ..
;nop
.if eax !=0
nop
.endif
;or eax, eax
;je short ..
;nop
.if !eax==0
nop
.endif
;or eax, eax
;jne short ..
;nop
:bg
> it's all about not having to make up labels for every meaningless loop/branch...
@@: