The MASM Forum Archive 2004 to 2012

General Forums => The Campus => Topic started by: stanhebben on May 19, 2006, 11:19:27 AM

Title: Getting SSE working
Post by: stanhebben on May 19, 2006, 11:19:27 AM
I'm trying to use SSE/SSE2 instructions in my program, but the assembler doesn't recognize them. For example,

  movss mmx1, [esi+eax]

generates the following error:
  error A2006: undefined symbol : mmx1

I'm using MASM 6.15, and XMM support is enabled (.686 .xmm).

Does anyone know what I've forgotten?
Title: Re: Getting SSE working
Post by: MichaelW on May 26, 2006, 08:33:32 AM
I don't know much about this, but I can tell you that the registers are mm0-mm7 and xmm0-xmm7.
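
To make the point above concrete, here is a minimal sketch of the original load using a valid register name. The data label "values" and the offset in eax are made up for illustration, and the directive order (option casemap:none before .xmm) is an assumption chosen so that lower-case register names are accepted; if your setup rejects them, try XMM1 in upper case.

```asm
    .486
    .model flat, stdcall
    option casemap :none
    .686
    .xmm                                   ; enables the XMM registers and SSE mnemonics

    .data
        values REAL4 1.0, 2.0, 3.0, 4.0    ; hypothetical test data

    .code
start:
    mov   esi, offset values
    mov   eax, 4                           ; byte offset of the second element
    movss xmm1, dword ptr [esi+eax]        ; scalar single load into xmm1 (not "mmx1")
    ret
end start
```

movss only works with the XMM registers; there is no register called mmx1, which is why MASM reports it as an undefined symbol.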


Title: Re: Getting SSE working
Post by: Ossa on May 26, 2006, 09:09:18 AM
For the sake of completeness, I'll repost the macros I put here before the "event"...

The MM0-MM7 and XMM0-XMM7 (or higher) registers need to be in upper case in MASM, so here are some macros that allow lower-case register names (if, like me, you prefer them this way). You can extend these to cover the XMM8-XMM15 registers (as I think they are called) that exist on 64-bit processors.

IFDEF MM0
mm0 TEXTEQU MM0
mm1 TEXTEQU MM1
mm2 TEXTEQU MM2
mm3 TEXTEQU MM3
mm4 TEXTEQU MM4
mm5 TEXTEQU MM5
mm6 TEXTEQU MM6
mm7 TEXTEQU MM7
ENDIF

IFDEF XMM0
xmm0 TEXTEQU XMM0
xmm1 TEXTEQU XMM1
xmm2 TEXTEQU XMM2
xmm3 TEXTEQU XMM3
xmm4 TEXTEQU XMM4
xmm5 TEXTEQU XMM5
xmm6 TEXTEQU XMM6
xmm7 TEXTEQU XMM7
ENDIF


Ossa
Title: Re: Getting SSE working
Post by: MichaelW on May 26, 2006, 12:03:20 PM
If I recall correctly the case sensitivity for the register names is related to the relative positions of the option casemap:none and the .MMX or .XMM directives. This assembled and ran without error:


; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
    include \masm32\include\masm32rt.inc
    .686
    .mmx
    .xmm
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
    .data
        x dq 0,0
    .code
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
start:
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
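    movq mm1, x                 ; MMX register: movq, not movss (movss only takes XMM registers)
    movss xmm1, dword ptr x     ; movss reads 32 bits, so override the qword size of x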
    inkey "Press any key to exit..."
    exit
; «««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««««
end start


Title: Re: Getting SSE working
Post by: Mark Jones on May 26, 2006, 08:16:23 PM
I have a question. What would be the best approach to using the most advanced FPU functions available on a given PC? Would you determine the FPU capabilities first with CPUID and set a global flag to branch to individual code sections? (If FPU=MMX jmp xxx, else if FPU=XMM jmp yyy, etc.) Or is the most efficient method to set up a series of error handlers and just "skip" over the unsupported instructions? (i.e. set up errorhandler1, try an XMM instruction; if it fails, set up errorhandler2, try an MMX instruction, etc.)
Title: Re: Getting SSE working
Post by: Ossa on May 26, 2006, 08:29:42 PM
Personally (if the speedup was very important), I would write a separate version of every procedure that could make use of more powerful features, one for each "level" that I wanted to support: e.g. one for FPU only, one for MMX, one for SSE, one for SSE2. Then, rather than calling them directly, I would have a jump table (or call table... whatever), load the address from it and jump to it. The jump table would be created at startup, so that no time is lost while running checking which mode we're in. Maybe that deserves some code:


.data?

pMathsHeavyProc DWORD ?

...

.code
start:

invoke CallTableInit

...

mov eax, pMathsHeavyProc
call eax

...

CallTableInit PROC
    ; if FPU only
    mov pMathsHeavyProc, offset MathsHeavyProcFPU

    ; if MMX
    mov pMathsHeavyProc, offset MathsHeavyProcMMX

    ; if SSE
    mov pMathsHeavyProc, offset MathsHeavyProcSSE

    ...

CallTableInit ENDP

MathsHeavyProcFPU PROC
    ...
MathsHeavyProcFPU ENDP

MathsHeavyProcMMX PROC
    ...
MathsHeavyProcMMX ENDP

MathsHeavyProcSSE PROC
    ...
MathsHeavyProcSSE ENDP


It's a lot of work, but I think it's worth it if you need the speed up.

Ossa
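
As a rough sketch of how the "if MMX" / "if SSE" comments above could be filled in, the following uses CPUID function 1 and tests the MMX (bit 23) and SSE (bit 25) feature flags in EDX. This is only one possible way to do it, not necessarily what Ossa had in mind. It assumes the CPU supports the CPUID instruction and that the OS saves the SSE state; the stub procedure bodies are placeholders.

```asm
    .486
    .model flat, stdcall
    option casemap :none
    .686
    .xmm

    .data?
        pMathsHeavyProc DWORD ?

    .code

MathsHeavyProcFPU PROC                     ; placeholder body
    ret
MathsHeavyProcFPU ENDP

MathsHeavyProcMMX PROC                     ; placeholder body
    ret
MathsHeavyProcMMX ENDP

MathsHeavyProcSSE PROC                     ; placeholder body
    ret
MathsHeavyProcSSE ENDP

CallTableInit PROC USES ebx                ; cpuid clobbers eax, ebx, ecx, edx
    mov   pMathsHeavyProc, offset MathsHeavyProcFPU   ; baseline choice
    mov   eax, 1
    cpuid                                  ; edx = feature flags
    test  edx, 00800000h                   ; bit 23: MMX
    jz    done
    mov   pMathsHeavyProc, offset MathsHeavyProcMMX
    test  edx, 02000000h                   ; bit 25: SSE
    jz    done
    mov   pMathsHeavyProc, offset MathsHeavyProcSSE
done:
    ret
CallTableInit ENDP

start:
    call  CallTableInit                    ; fill in the pointer once at startup
    call  pMathsHeavyProc                  ; indirect call through the pointer
    ret
end start
```

An SSE2 version would follow the same pattern with bit 26 of EDX (04000000h).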
Title: Re: Getting SSE working
Post by: stanhebben on May 27, 2006, 09:21:21 AM
I recently rewrote a drawing routine so it used SSE2 instructions. The result was about 3 times faster.

I'll put it on my (future) website. It's part of a larger project, but I'll split it off and make it available as an example.
Title: Re: Getting SSE working
Post by: daydreamer on May 27, 2006, 04:23:12 PM
So if I write MMX code with a db 66h prefix, the instructions become SSE2.
Will CPUs without SSE2 capability still execute them as MMX by ignoring the prefix?
If so, the only things that change in a loop are add ecx,stepping and mov ecx,loopcount: stepping and loopcount take different values depending on whether the loop runs on 16-byte or 8-byte data?
Title: Re: Getting SSE working
Post by: stanhebben on May 27, 2006, 04:49:14 PM
No, the SSE instruction set is different from the MMX instruction set. If you want code that is compatible with both, then you'll have to write two separate pieces of code and switch between them, depending on the processor.
Title: Re: Getting SSE working
Post by: daydreamer on May 27, 2006, 05:31:39 PM
Quote from: stanhebben on May 27, 2006, 04:49:14 PM
No, the SSE instruction set is different from the MMX instruction set. If you want code that is compatible with both, then you'll have to write two separate pieces of code and switch between them, depending on the processor.

Stan, read this thread, download the macros and examine them in your favourite text editor:
http://www.masm32.com/board/index.php?topic=973.0
I just ran a test, and my CPU ignores the prefix on my SSE2 integer opcodes and still executes them as MMX, because it has no SSE2 capability.
SSE2 integer is just MMX with a db 66h prefix; you only need minor changes, like the number of loop iterations and the address stepping (8 or 16).
You only need to set these variables up during initialization, after checking CPUID:
http://nono40.developpez.com/sources/source0068/
You can still follow the international x86 language in this example, which is the one I found.

I just need to run tests on several non-SSE2 CPUs to confirm that it works in general and didn't just happen to work on mine.
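
For anyone who wants to see what this looks like, here is a rough sketch of the idea. With a 66h prefix, the MMX encodings of movq (0F 6F / 0F 7F) and paddb (0F FC) become the SSE2 encodings of movdqa and paddb on XMM registers. The procedure name, the "stepping" variable and the register usage are made up for illustration. Whether a CPU without SSE2 really ignores the prefix is exactly the untested assumption from the post above, so don't treat it as guaranteed behaviour.

```asm
    .486
    .model flat, stdcall
    option casemap :none
    .686
    .mmx

    .data
        stepping DWORD 8                   ; hypothetical: set to 16 at startup if CPUID reports SSE2

    .code
; Hypothetical routine: adds the bytes at [esi] to the bytes at [edi].
; ecx = byte count, assumed non-zero and a multiple of stepping.
; On the SSE2 path both buffers must be 16-byte aligned.
AddBytes PROC
next:
    db    66h                              ; 66 0F 6F = movdqa xmm0, [esi] on SSE2 CPUs
    movq  mm0, qword ptr [esi]
    db    66h                              ; 66 0F FC = paddb xmm0, [edi] on SSE2 CPUs
    paddb mm0, qword ptr [edi]
    db    66h                              ; 66 0F 7F = movdqa [edi], xmm0 on SSE2 CPUs
    movq  qword ptr [edi], mm0
    add   esi, stepping                    ; 8 or 16 bytes per iteration
    add   edi, stepping
    sub   ecx, stepping
    jnz   next
    emms                                   ; needed after MMX use; harmless on the SSE2 path
    ret
AddBytes ENDP
end
```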
Title: Re: Getting SSE working
Post by: stanhebben on May 27, 2006, 07:45:28 PM
Sorry, I wasn't aware of that.

Interesting behavior... I wonder how well it works.