Which is faster?

Started by Neil, May 01, 2009, 10:56:52 AM

jj2007

Quote from: lingo on May 05, 2009, 01:43:44 PM
Let's see what is "yours" and what is "mine"

strlen64B     proc szBuffer : dword
   pop        ecx
   pop        eax
....
strlen32s    proc      src:DWORD
      pop       eax         ; trash the return address
      pop       eax         ; the src pointer


Quote from: jj2007 on November 25, 2008, 08:57:12 PM
The SetSmallRect procedure looks ok. Here is another one, just in case - only 20 bytes long and pretty fast.

OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE
SetRect16 proc ps_r:DWORD,left:DWORD,top:DWORD,right:DWORD,bottom:DWORD
pop edx ; trash the return address
pop edx ; move the first argument to edx
pop dword ptr [edx].SMALL_RECT.Left
pop dword ptr [edx].SMALL_RECT.Top
pop dword ptr [edx].SMALL_RECT.Right
pop word ptr [edx].SMALL_RECT.Bottom
sub esp, 5*4+2 ; correct for 5 dword + 1 word pop, restore return address
ret 5*4 ; correct stack for five arguments
SetRect16 endp
OPTION PROLOGUE:PrologueDef
OPTION EPILOGUE:EpilogueDef


etc etc - so who is the thief here?

But I simply don't have the time to follow Lingo's game. Nobody "steals" here, I even acknowledge Lingo when I take over bits of his code. The Intel set has only a limited number of mnemonics, so it is inevitable that certain sequences pop up all over the place. Try this Google search for pcmpeqb pmovmskb. Does he ever admit where he gets his inspiration? Does he "steal" from Intel when he uses their manuals?

I couldn't care less for Lingo, but the whole forum loses credibility if members are insulted as "idiots" and "thieves", and no moderator intervenes.

Mark Jones

Quote from: MichaelW on May 04, 2009, 10:26:49 PM
Quote:
it seems obvious that the counters are coming from the 2 cores

If you think that is so, then try restricting the process to the first core by adding these statements to your source somewhere above the tests:

    invoke GetCurrentProcess
    invoke SetProcessAffinityMask, eax, 1


And you might also want to try the second core, specified with an affinity mask value of 2.

I don't think this is the issue, because the source does set the process affinity as suggested, and the timing routine (Petroizki's ptimers.inc) is included in-line and seems to be a single-threaded routine. Thus all of it should run in one process, thread, and core, no?

Perhaps what Dave is seeing is a power-saving feature of his CPU causing errors in the timing resolution, due to clock variance?
"To deny our impulses... foolish; to revel in them, chaos." MCJ 2003.08

MichaelW

Quote:
Perhaps what Dave is seeing is a power-saving feature of his CPU causing errors in the timing resolution, due to clock variance?

AFAIK the TSC count should be independent of the clock frequency. Although I can't back this up, I was under the impression that recent processors use a fully static design that can accommodate any clock frequency from zero to the rated maximum, without missing a step. I think it's more likely that some other process is "borrowing" the processor while the test (or reference loop) is running.
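
For reference, a minimal sketch of the kind of measurement under discussion, assuming the MASM32 runtime include and its print and str$ macros. It is not the ptimers.inc routine mentioned above; the delay loop is a hypothetical stand-in for the code under test, and only the low dword of the counter is kept. The process is pinned to the first core before any RDTSC is issued, so both readings come from the same counter.

```
include \masm32\include\masm32rt.inc
.586                            ; cpuid and rdtsc need at least .586

.code
start:
    invoke GetCurrentProcess
    invoke SetProcessAffinityMask, eax, 1   ; mask 1 = first core only

    xor eax, eax
    cpuid                       ; serialize; trashes eax, ebx, ecx, edx
    rdtsc                       ; edx:eax = counter at start
    mov esi, eax                ; keep the low dword of the start count

    mov ecx, 1000000            ; hypothetical code under test:
@@: dec ecx                     ; a plain delay loop
    jnz @B

    xor eax, eax
    cpuid                       ; serialize again before the second reading
    rdtsc
    sub eax, esi                ; elapsed count, low dword only

    print str$(eax)
    print " cycles", 13, 10
    exit

end start
```

If a result still jumps around with the affinity mask set, the two-counter explanation becomes less likely and interference from other processes more so.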
eschew obfuscation

dedndave

well, i see two possibilities:
1) the RDTSC instruction is reading TS counter values from the 2 cores (makes sense)
2) the floating point math used is executing differently on my machine (as in a FP instruction serialization type problem)
    (i.e. it is possible a fwait is missing that does not cause trouble until it gets executed on a dual core cpu)

as for power/standby/hibernation, the very first thing i do after rebuilding a drive is turn all that stuff off
always on (desktop) - never turn off drives - never turn off monitor - disable hibernation - i also select screensaver: none

i have all the toys in place to test it
you guys will probably chuckle when i say, "the hard part is displaying the processor type" - lol
i swear - i thought MS was bad
i am half-tempted to copy/paste JJ's CpuID code in there - lol
but, i don't learn anything by doing that

NightWare

Quote from: jj2007 on May 05, 2009, 01:37:47 PM
Very funny, Sir Hutch. What is your official policy in this forum regarding calling other members (Nightware, myself) idiots? Do you recommend it nowadays officially?

hmm, personally i don't care, everybody can think what they want (plus, it's just words, if you take them all seriously...). now, drinking WATER ? why not MILK ? it's clearly an insult. i will report the author to the moderators, one day...  :bg

Quote from: lingo on May 05, 2009, 01:43:44 PM
Everyone (including sick people) can do with my code what they want but when someone tolerate idiotic behavior as a useless registers preservation I can't be quiet.

you mean like preserving ebx/esi/edi when it has been already done by the OS ?  :lol

plus, just for info, i don't preserve ecx/edx by default, i ONLY preserve the registers i use/alter (including ebx/esi/edi) AND for MY OWN use. it's MY OWN calling convention, and i perfectly know why i proceed like that. if i don't preach to impose this calling technique it's because i understand that others can see it differently.

however, blindly following Microsoft's recommendations (or an interpretation of those recommendations) isn't very clever... (hmm... maybe one day, if i want to become another sheep, later...).



hutch--

hmmmm,

> but the whole forum loses credibility if members are insulted as "idiots" and "thieves", and no moderator intervenes.

Seems that bulldozer approach will have to be put in place soon. Why does the "pregnant schoolgirl" image come to mind? Perhaps Deja Vu (and it seems, ahhhhhhhhh been here before ..... and it makes me wonder [pause] what's goin' on etc .......) (apologies to Crosby Stills Nash and Young).

I am loath to close down a topic that has code in it, but sooner or even sooner still, if the PHUKING nonsense does not stop, I will turn Mount Everest into the Utah Salt Flats. I have no feel whatsoever for "camp" melodramas and I don't see that it has a place in a technical forum for programmers. I will not take sides between members in a dispute as silly as this one, I will just pull the plug on it if it continues.
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

dedndave

well - there has been a lot of fruitful discussion on this thread, also
hate to see it disappear, as i am using some of it as reference for a current project

jj2007

Quote from: dedndave on May 05, 2009, 10:25:51 PM

i am half-tempted to copy/paste JJ's CpuID code in there - lol
but, i don't learn anything by doing that


Attached for copy & paste is the minimum code, displaying brand string and SSE level; it adds only 142 bytes to the exe. It is commented, but reading it together with the Wikipedia description of CPUID might help. Don't forget to move the PROTO upstairs :thumbu
Output:
Intel(R) Celeron(R) M CPU        420  @ 1.60GHz (SSE3)

[attachment deleted by admin]

jj2007

Quote from: NightWare on May 05, 2009, 11:46:33 PM

i don't preserve ecx/edx by default, i ONLY preserve the registers i use/alter (including ebx/esi/edi) AND for MY OWN use. it's MY OWN calling convention, and i perfectly know why i proceed like that. if i don't preach to impose this calling technique it's because i understand that others can see it differently.


I agree 100%. It is a question of personal taste, everybody is free to do whatever he/she likes. My taste is to preserve ecx and edx when I alter them, it costs me 4 bytes and 3 cycles. Not a big "loss" for routines that typically run in hundreds or thousands of cycles, and are often being called more than once in a context where these two registers already serve a purpose and therefore must be saved anyway.
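
A sketch of what that preservation looks like, with a hypothetical routine name and an empty body. Each push or pop of a 32-bit register encodes as a single byte, which is where the 4 bytes come from; the cycle figure is the poster's estimate and is not measured here.

```
MyRoutine proc                  ; hypothetical name, no arguments
    push ecx                    ; 1 byte
    push edx                    ; 1 byte

    ; ... body that is free to alter ecx and edx ...

    pop edx                     ; 1 byte, restore in reverse order
    pop ecx                     ; 1 byte
    ret
MyRoutine endp
```

The caller can then keep a loop counter in ecx across the call without saving it itself.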

:bg

dedndave

Thanks JJ

I found this pdf file from Intel, "Intel Processor Identification and the CPUID Instruction"

www.intel.com/Assets/PDF/appnote/241618.pdf

You can save the file as text, or google "241618.pdf" and use google to convert it to HTML, then use the browser to save it as text
It has a very comprehensive program in assembler that IDs Intel CPUs

Then, use this one from AMD, "CPUID Specification" to add a few touches

www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/25481.pdf

Really, this is beyond the scope of what I wanted to do for CPU identification
I really just want something like your "Short Version" - I don't even care about the clock frequency
In fact, as a minimalist approach, all I really need is how many cores the CPU has
GetProcessAffinityMask tells me how many the system uses - that is really enough for the program to function
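
A minimal sketch of that minimalist approach, assuming the MASM32 runtime include and its print and str$ macros: it counts the set bits in the system affinity mask, one bit per processor the system is using. The variable names ProcMask and SysMask are made up for the example.

```
include \masm32\include\masm32rt.inc

.data?
ProcMask dd ?                   ; processors this process may run on
SysMask  dd ?                   ; processors the system is using

.code
start:
    invoke GetCurrentProcess
    invoke GetProcessAffinityMask, eax, offset ProcMask, offset SysMask

    mov edx, SysMask
    xor esi, esi                ; esi = processor count
@@: shr edx, 1                  ; lowest bit goes into the carry flag
    adc esi, 0                  ; add the carry to the count
    test edx, edx               ; any bits left?
    jnz @B

    print str$(esi)
    print " processor(s) in the system affinity mask", 13, 10
    exit

end start
```

Note that the count is of processors as Windows sees them, which is not necessarily the number of physical cores.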

jj2007

Quote from: dedndave on May 06, 2009, 07:40:44 AM

In fact, as a minimalist approach, all I really need is how many cores the CPU has
GetProcessAffinityMask tells me how many the system uses - that is really enough for the program to function


Remember your own post here?
The code gives me the same results:
Quote:
CPU family 15, model 4, Pentium 4 Prescott (2005+), MMX, SSE6
Cores           2
... but according to Wikipedia the Prescott has only one core ::)

GetProcessAffinityMask sounds promising, though - thanks for the hint. But it also says I have two cores:

SystemAffinityMask:     00000000000000000000000000000011
ProcessAffinityMask:    00000000000000000000000000000011


Any hardware experts around...?

include \masm32\include\masm32rt.inc

.data?
ProcessAffinityMask dd ?    ; processors this process may run on
SystemAffinityMask dd ?     ; processors the system is using
buffer dd 10 dup (?)        ; 40 bytes, room for 32 binary digits

.code
start:
invoke GetCurrentProcess    ; pseudo handle returned in eax
invoke GetProcessAffinityMask, eax, offset ProcessAffinityMask, offset SystemAffinityMask
print "SystemAffinityMask: ", 9
invoke dw2bin_ex, SystemAffinityMask, offset buffer    ; dword to binary string
print offset buffer,13,10
print "ProcessAffinityMask: ", 9
invoke dw2bin_ex, ProcessAffinityMask, offset buffer
print offset buffer,13,10
getkey                      ; wait for a key press
exit

end start

dedndave

Well, I can tell you the Prescott has 2 cores - lol
I re-read the Wikipedia article you linked and it does not really say it has a single core, per se
In that article, they use the term "core" to mean the overall core - not stating it has 2 (or 1, either)
That is odd that you pointed that out JJ - lol - I had read that page earlier this week and had not noticed the omission
I guess I was more interested in the heat issue it mentions
I am an Electronics Engineer, although I avoid the term "expert" because those who use it are usually showing how little they know
In any event, the system affinity mask returned by the GetProcessAffinityMask really tells you what you are able to access
If you had a CPU with 8 cores, but the system only uses 7, 7 is probably all you could use without generating some kind of protection fault
The real authority on how many cores the CPU has is the manufacturer, I suppose
If you use CPUID (and all its whack-a-mole caveats), it will tell you that the Prescott is dual core

jj2007

Quote from: dedndave on May 06, 2009, 08:55:55 AM
Well, I can tell you the Prescott has 2 cores - lol

You probably have seen them with your own eyes, so I believe you :bg

That Prescott story is truly confusing. Apparently, before the first "true" Dual Core came out, they fumbled two Prescotts on one board and called it "Smithfield" - see Google search. Besides, I could not find any clear documentation of the CPUID code that is supposed to tell you the number of cores :(
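
One documented place to look is CPUID leaf 1: bit 28 of EDX is the HTT flag, and when it is set, bits 23:16 of EBX give the maximum number of addressable logical processor IDs in the physical package. A minimal sketch, assuming the MASM32 runtime include and its print and str$ macros. The value counts logical processors, so a single-core CPU with Hyper-Threading can report 2 here, and it is not by itself a physical core count.

```
include \masm32\include\masm32rt.inc
.586                            ; cpuid needs at least .586

.code
start:
    mov eax, 1
    cpuid                       ; leaf 1: feature flags in edx, misc info in ebx
    mov esi, 1                  ; default: one logical processor
    bt edx, 28                  ; HTT flag set?
    jnc @F                      ; no: the EBX field is not valid
    shr ebx, 16
    movzx esi, bl               ; EBX[23:16] = logical processors per package
@@:
    print str$(esi)
    print " logical processor(s) per package", 13, 10
    exit

end start
```

Telling physical cores apart from Hyper-Threading siblings takes further CPUID leaves, which differ between Intel and AMD and are not covered by this sketch.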

dedndave


check this document out JJ - go to the end and look at their masm code
www.intel.com/Assets/PDF/appnote/241618.pdf

the prescott - 2 cores right across the middle
now, you can say you have seen them, too - lol