The MASM Forum Archive 2004 to 2012

Project Support Forums => 64 Bit Assembler => Topic started by: sinsi on February 03, 2012, 06:18:51 AM

Title: Zeroing a register
Post by: sinsi on February 03, 2012, 06:18:51 AM
Is there any performance difference between using a 32-bit and a 64-bit register?
    sub r8d,r8d
    sub r8,r8

Both do the same thing, both are encoded as 3 bytes, but is one better?
I used r8 in this example because with eax/rax there is a size difference (an extra byte for the REX prefix on the 64-bit form).
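
The size comparison above can be sketched as a small ml64 source file. This is only an illustration: the procedure name zero_demo is hypothetical, and only instruction lengths are given because the exact opcode byte chosen can differ between assemblers.

```asm
; zero.asm - minimal sketch, assemble with: ml64 /c zero.asm
; zero_demo is a hypothetical name used only for this illustration.
.code
zero_demo PROC
    sub r8d,r8d     ; 3 bytes: REX prefix (needed to reach r8), opcode, ModRM
    sub r8,r8       ; 3 bytes: same REX prefix with the W bit also set
    sub eax,eax     ; 2 bytes: opcode, ModRM, no REX prefix needed
    sub rax,rax     ; 3 bytes: REX.W prefix, opcode, ModRM
    ret
zero_demo ENDP
END
```

All four instructions leave the full 64-bit register at zero, because a write to a 32-bit register zero-extends into the upper half.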
Title: Re: Zeroing a register
Post by: sinsi on February 03, 2012, 09:59:36 AM
Maybe I should qualify my question: it's not so much the performance (1 clock cycle ain't a killer) but more like a gotcha.
I am thinking of stalls, like the ones you get when using an 8- or 16-bit register in 32-bit mode.
Title: Re: Zeroing a register
Post by: habran on February 03, 2012, 11:09:52 AM
Hi sinsi,

I use XOR reg,reg instead of SUB reg,reg,
even though both instructions take 1 clock cycle on a 486 processor.
XOR looks better and more sophisticated because it looks like you understand binary numbers.

You can check clock cycles here: http://classes.engr.oregonstate.edu/eecs/summer2008/cs271/Instructions.htm

regards
Title: Re: Zeroing a register
Post by: hutch-- on February 03, 2012, 11:21:19 AM
sinsi,

I don't know if 64-bit capable hardware suffers from the problem that early PIVs had, where a partial register write stalled a read or write of the larger register shortly after it. I personally doubt that a zeroing operation fits into that style of problem, as both SUB and XOR tend to live in silicon, not microcode, but probably the only safe way is to make a small test piece and time it. I remember that on a PIII you used to get very bad stalls if you performed a BYTE operation followed shortly after by a DWORD operation on the same register, and it was blatantly obvious that the timing was different.

If you don't get major differences in the timing, then it probably is not a big deal.
Title: Re: Zeroing a register
Post by: sinsi on February 04, 2012, 10:13:25 AM
Yeah, I can't really see a problem with zeroing the upper bits, that's built into all sorts of other instructions.
Interesting, I was wondering about 32/64 and never thought about e.g. r8b and how that affects r8/r8d/r8w. The same, I should think, as al/eax on 32-bit CPUs.

All we need is for MichaelW to make timers64...although I am having a go at it on and off.
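
As a rough illustration of the upper-bits point above, the sketch below shows the architectural result of writing each sub-register size. The procedure name upper_bits_demo is hypothetical. It shows only the resulting values, not whether any particular CPU stalls on the partial writes.

```asm
; upper.asm - minimal sketch, assemble with: ml64 /c upper.asm
; upper_bits_demo is a hypothetical name used only for this illustration.
.code
upper_bits_demo PROC
    mov r8,-1           ; r8  = 0FFFFFFFFFFFFFFFFh
    sub r8d,r8d         ; 32-bit write zero-extends: r8 = 0
    mov r9,-1           ; r9  = 0FFFFFFFFFFFFFFFFh
    sub r9w,r9w         ; 16-bit write keeps bits 16..63: r9 = 0FFFFFFFFFFFF0000h
    mov r10,-1          ; r10 = 0FFFFFFFFFFFFFFFFh
    sub r10b,r10b       ; 8-bit write keeps bits 8..63: r10 = 0FFFFFFFFFFFFFF00h
    mov rax,r8          ; return the fully zeroed value
    ret
upper_bits_demo ENDP
END
```

Only the 32-bit form clears the rest of the register; the 8- and 16-bit forms merge with the old contents, which is where a dependency on the previous value comes from.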
Title: Re: Zeroing a register
Post by: MichaelW on February 04, 2012, 10:30:57 AM
If Dave would hurry up and win the lottery he could buy me a new system as he promised, and then I could make the move to 64 bits :bg
Title: Re: Zeroing a register
Post by: qWord on February 04, 2012, 12:07:56 PM
Quote from: sinsi on February 04, 2012, 10:13:25 AM
All we need is for MichaelW to make timers64...although I am having a go at it on and off.
I translated them a while ago:
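; x64 version of MichaelW's timing macros
; Note: CPUID overwrites eax, ebx, ecx and edx, so the caller must preserve rbx if it needs it.
counter_begin MACRO loopcount:REQ, priority
    LOCAL label

    ; Define the shared timing variables only once, on first use.
    IFNDEF tmcb__nLoops
        .data
        align 16
        tmcb__nLoops dd 0           ; requested loop count
        tmcb__cntr   dd 0           ; working loop counter
        tmcb__qw     dq 2 dup (?)   ; [0] = overhead / result, [8] = start timestamp
        .code
    ENDIF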

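    mov tmcb__nLoops,loopcount
    ; Optionally raise the process priority for the duration of the test.
    IFNB <priority>
        call GetCurrentProcess
        mov rdx,priority
        mov rcx,rax
        call SetPriorityClass
    ENDIF

    ; Pass 1: time an empty loop to measure the loop overhead.
    xor rax,rax
    cpuid                           ; serialize before reading the time stamp counter
    rdtsc

    mov DWORD ptr tmcb__qw[0],eax   ; save start timestamp (low dword)
    mov DWORD ptr tmcb__qw[4],edx   ; save start timestamp (high dword)
    mov tmcb__cntr, loopcount
    xor rax,rax
    cpuid
    align 16
    @@:
    sub tmcb__cntr,1
    jnz @B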

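    xor rax,rax
    cpuid
    rdtsc
    shl rdx,32
    or rax,rdx                      ; rax = full 64-bit end timestamp
    sub rax,tmcb__qw[0]
    mov tmcb__qw[0],rax             ; tmcb__qw[0] = overhead of the empty loop

    ; Pass 2: start timing the real loop; counter_end closes it.
    xor rax, rax
    cpuid
    rdtsc
    mov tmcb__cntr,loopcount
    mov DWORD ptr tmcb__qw[8],eax   ; save start timestamp (low dword)
    mov DWORD ptr tmcb__qw[12],edx  ; save start timestamp (high dword)
    xor rax,rax
    cpuid
    align 16
    label:
    tmcb__label equ <label>         ; lets counter_end jump back to the loop start
ENDM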

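; x64 version of MichaelW's timing macros
counter_end MACRO
    sub tmcb__cntr,1
    jnz tmcb__label                 ; repeat the timed code until the counter reaches zero

    xor rax,rax
    cpuid
    rdtsc
    shl rdx,32
    or rax,rdx                      ; rax = full 64-bit end timestamp
    sub rax,tmcb__qw[0]             ; subtract the empty-loop overhead
    sub rax,tmcb__qw[8]             ; subtract the start timestamp
    mov tmcb__qw[0],rax             ; net cycles for all iterations

    ; Restore normal priority.
    call GetCurrentProcess
    mov rdx,NORMAL_PRIORITY_CLASS
    mov rcx,rax
    call SetPriorityClass

    IFDEF _EMMS
        EMMS
    ENDIF

    ; Average = net cycles / loop count, computed on the x87 FPU.
    finit
    fild tmcb__qw[0]
    fild tmcb__nLoops
    fdiv
    fistp tmcb__qw[0]

    mov rax,tmcb__qw[0]             ; return the average cycles per iteration in rax
ENDM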
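
A possible way to use these macros for the original question is sketched below. It has not been timed here and rests on several assumptions: the two macros are saved in a file called timers64.inc (a hypothetical name), the procedure names time_sub32 and time_sub64 are made up for the example, and the caller links against kernel32.lib. The push of rbx and the 32 bytes of shadow space are there because CPUID overwrites rbx and the macros call Windows API functions.

```asm
; timetest.asm - minimal sketch, assemble with: ml64 /c timetest.asm
extern GetCurrentProcess:PROC
extern SetPriorityClass:PROC

NORMAL_PRIORITY_CLASS equ 20h
HIGH_PRIORITY_CLASS   equ 80h

include timers64.inc        ; hypothetical file holding counter_begin / counter_end

.code
; Returns in rax the average cycles per loop iteration for the 32-bit form.
time_sub32 PROC
    push rbx                ; CPUID in the macros overwrites rbx (non-volatile on Win64)
    sub rsp,32              ; shadow space for the API calls, keeps rsp 16-byte aligned
    counter_begin 1000000, HIGH_PRIORITY_CLASS
        REPEAT 64           ; repeat the instruction so it is not lost in loop overhead
            sub r8d,r8d
        ENDM
    counter_end             ; rax = average cycles per iteration (64 instructions)
    add rsp,32
    pop rbx
    ret
time_sub32 ENDP

; Same test for the 64-bit form.
time_sub64 PROC
    push rbx
    sub rsp,32
    counter_begin 1000000, HIGH_PRIORITY_CLASS
        REPEAT 64
            sub r8,r8
        ENDM
    counter_end
    add rsp,32
    pop rbx
    ret
time_sub64 ENDP
END
```

Comparing the two return values on the target machine is the "small test piece" approach suggested earlier in the thread. If the numbers are close, the choice between the two forms probably does not matter on that CPU.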