help with MMX/SSE2 theory on different cpus

Started by daydreamer, June 01, 2006, 07:01:45 PM

daydreamer

I'd like help getting this tested on as many different CPUs as possible, from an old Pentium II that is still running to the newest AMD64 / Pentium D, etc.
I'd prefer that you step through it in a debugger to see whether a CPU without SSE2 runs it in the MMX registers and a CPU with SSE2 runs it in the XMM registers instead.
The general theory I want to prove: all non-SSE2 CPUs ignore the 66h prefix and execute the instructions as MMX.
If there are exceptions, it's good to know.
If it's true, the same code with a different loop setup can be used for both MMX and SSE2, instead of twice the amount of code, once for each case.
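
The encoding behind this theory can be sketched as raw bytes. This is only an illustration under stated assumptions, not the attachment from this post: LoadBoth is a hypothetical name, and the byte values follow the standard opcode tables, where movq mm, m64 is 0F 6F /r and movdqa xmm, m128 is the same bytes with a 66h prefix. What a non-SSE2 CPU actually does with the prefixed form is the very thing this thread is testing.

```asm
.586
.mmx
.model flat, stdcall
.code
; Hypothetical helper: the same load, with and without the 66h prefix.
; pData should be 16-byte aligned, because movdqa faults on unaligned data.
LoadBoth proc uses ebx pData:DWORD
    mov ebx, pData
    ; ModRM 03h = mod 00, reg 000 (MM0 or XMM0), r/m 011 ([ebx])
    db 0Fh, 6Fh, 03h          ; movq   MM0, [ebx]   (MMX, 64-bit load)
    db 66h, 0Fh, 6Fh, 03h     ; movdqa XMM0, [ebx]  on an SSE2 CPU (128-bit load)
    emms                      ; leave MMX state so later FPU code is safe
    ret
LoadBoth endp
end
```

Stepping over the second db line in a debugger and watching whether MM0 or XMM0 changes is one way to see which path a given CPU takes.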

Tested on my Sempron (Socket A), which has no SSE2: it works, ignoring the prefix and executing as MMX.
You are also free to experiment and improve this procedural code to produce better things.
Please post your result and your CPU in this thread.


[attachment deleted by admin]

GregL

Runs fine on my Pentium III 1 GHz. The CPU ignores the 66h prefix and executes the instructions as MMX.




EduardoS

I noticed this when my XP was returning wrong results when running SSE2 code.
CPUs without SSE2 will ignore the 66h prefix (when it is present), as you see,
but writing code that works fine with both MMX and SSE registers may be more complex than writing the code twice...

dsouza123

Runs on an Athlon Thunderbird 1.2 (uc at 1170 MHz) with XP Pro.

Produces left to right moving waves that change from blueish/green to yellow.

CPU only supports ALU, FPU, MMX+, 3DNow+.

EduardoS

I tested here (with SSE2), and it doesn't work...
I looked at the code: MMX registers use a different register file than SSE, so a movd MM0, [eax] won't change XMM0 (you should put a 66h in front for SSE to do that). I can't recompile here; I will try tomorrow with an Athlon XP.

daydreamer

thanks everyone for participating in this test
Quote from: EduardoS
"I tested here (with SSE2), and it doesn't work...
I looked at the code: MMX registers use a different register file than SSE, so a movd MM0, [eax] won't change XMM0 (you should put a 66h in front for SSE to do that). I can't recompile here; I will try tomorrow with an Athlon XP."

It doesn't work because it's a little more complicated: the data must be correctly aligned, or you get a GP fault.
The advantage is movq <-> movdqa: the db 66h only changes the data size between 64 bits and 128 bits, so you make it work on 2 pixels or 4 pixels at a time.

.data
iterations dd 200000
stepping dd 8

align 16
mmxdata1 dq ?????,??????
mmxdata2 dq ?????,?????
.code
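; initialization: with SSE2 each pass handles 16 bytes instead of 8,
; so double the step and halve the iteration count
.IF SSE2bit==1
mov eax,stepping
add eax,eax
mov stepping,eax
mov eax,iterations
sar eax,1
mov iterations,eax
.ENDIF
; MMX/SSE2 loop
mov ecx,iterations
mov ebx,adress
and ebx, 0FFFFFFF0h ; round adress down to a 16-byte boundary (movdqa faults on unaligned data)
PixelLoop: ; "Loop" itself is a reserved instruction mnemonic in MASM
db 66h
movq MM0,[ebx] ; with the prefix, an SSE2 CPU runs this as movdqa XMM0,[ebx]
; do your calculation here, sticking to MMX instructions that work on a qword at a time, each with a db 66h before its opcode
db 66h
movq [ebx],MM0 ; store back result
add ebx,stepping
dec ecx
jne PixelLoop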
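
The snippet above tests SSE2bit without showing where the flag comes from. One possible way to set it, sketched here under assumptions, is CPUID function 1, which reports SSE2 in bit 26 of EDX. DetectSSE2 is a hypothetical helper name and the SSE2bit declaration is added only for illustration. The sketch assumes the CPU supports the CPUID instruction and checks the CPU only, not whether the operating system saves the XMM registers.

```asm
.586
.model flat, stdcall
.data
SSE2bit dd 0                  ; assumed flag: 1 if the CPU reports SSE2, else 0
.code
; Hypothetical helper: returns the SSE2 feature bit in EAX and stores it in SSE2bit.
DetectSSE2 proc uses ebx      ; cpuid overwrites ebx, so preserve it
    mov eax, 1                ; CPUID function 1: feature flags
    cpuid
    shr edx, 26               ; move the SSE2 flag (EDX bit 26) down to bit 0
    and edx, 1
    mov SSE2bit, edx
    mov eax, edx
    ret
DetectSSE2 endp
end
```

Calling DetectSSE2 once before the initialization block would let the .IF SSE2bit==1 test pick the 16-byte step only on CPUs that report SSE2.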


EduardoS

I can't try it here: what happens if I put a 0F3h prefix before a movq instruction on a CPU without SSE2?