SSE2 and SSE3 macros for ml 6.14

Started by hutch--, February 24, 2005, 09:52:59 AM

hutch--

char,

You get that help info with ML by typing

ml /? > text.txt


You get the best SSE3 info from the Intel manuals.
Download site for MASM32      New MASM Forum
https://masm32.com          https://masm32.com/board/index.php

daydreamer

#16
Quote from: hutch-- on February 26, 2005, 03:02:16 AM
daydreamer,

It sounds like an interesting project, what I wonder is if the object module format is the same from 32 bit COFF to whatever will be used in win64 on AMD64 hardware.
I was thinking of case code in the installer that chooses between the .exe built with ml + emu64.inc and the one built with ml64 without it,
but maybe it is better to let the user choose the 64-bit or 32-bit version.
I was first thinking of using emu64.inc and equates for the extra regs.
OK, I could make macros for add RAX,RBX, sub RAX,RBX etc., but I don't want to waste time on it if people aren't interested in using it.
If anybody is interested I will put more work into it.


[attachment deleted by admin]

daydreamer

#17
Quote from: hutch-- on February 25, 2005, 03:37:27 AM
If ML 6.15 was available, we would not need them. I own ML up to version 7 and every version in between, but the p5 processor pack is the only external source of ML 6.15 and it is only licenced for VC owners.

6.14 almost does it all and it's a very reliable version, so stretching it to do SSE2 would give it life until the end of 32 bit Windows, which will be longer than many think. When Mark gets back into business with his new job and the like, I will ask him if he has time to implement the SSE2 he suggested in a thread in the old forum. Any other reliable sources would be appreciated.
I don't have 6.14, so I would appreciate it if you could test these macros. I have also broken my new CPU, so I can't test-run SSE2 either.
How do I write SSE2 macros that have the same names as the MMX instructions but take mem128/XMM register operands, without a name conflict in 6.14? For them it is also a db 66h prefix on the same instructions.
And is it possible inside macros to put a db 0F2h prefix on the PS instructions to make all the SD instructions, which are 64 bits wide?
I am not the most experienced macro writer, but I know a bit of SSE from research for my tutorial and from making a few programs.
Uploaded all SD instructions as well.

hutch--

daydreamer,

Thanks for making the effort writing these, I had a quick play to format them up so they are easier to read and I would like to see your NAME on the include file so we all know who did the work. I can easily enough make the alternate lower case versions but what I don't have floating around at the moment is any SSE2 code to test it with.

If we can get these going reliably, it will extend the life of MASM 6.14 for some time as it is a very reliable version apart from not supporting SSE2.

AeroASM

My fractal code in the Laboratory uses SSE2; you could try that.

hutch--

 :U

Magnus,

These look good. I have again tweaked the format so they are easier to read but the count is 21 so far and this will go a long way to making SSE2 available to many people.

I have formatted them like this,


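    ADDPD MACRO M1,M2
      db 066h         ; 66h prefix: ADDPS (0F 58) is decoded as ADDPD (66 0F 58)
      ADDPS M1,M2
    ENDM

    ADDSD MACRO M1,M2
      db 0F2h         ; F2h prefix: ADDPS (0F 58) is decoded as ADDSD (F2 0F 58)
      ADDPS M1,M2
    ENDM

    SUBPD MACRO M1,M2
      db 066h         ; 66h prefix: SUBPS (0F 5C) is decoded as SUBPD (66 0F 5C)
      SUBPS M1,M2
    ENDM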


I think it is easy enough to make both upper and lower case versions by having a base macro and calling it with two others, an upper case and a lower case version. The important part at the moment is to get them working and make sure they are reliable. The rest is just detail so they are easier to use.
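
A minimal sketch of that base-plus-wrappers idea, not a tested include file. It assumes the source is assembled with `option casemap:none`, so that `ADDPD` and `addpd` are distinct macro names, and it assumes ML 6.14, where neither name is already a reserved mnemonic. The base macro name `sse2_addpd` is hypothetical and only used for illustration.

```masm
    ; Assumption: option casemap:none is in effect, so macro names
    ; that differ only in case are treated as different names.

    ; Hypothetical base macro: emits the 66h prefix, then lets ML 6.14
    ; encode the operands through the SSE instruction it already knows.
    sse2_addpd MACRO M1,M2
      db 066h
      addps M1,M2
    ENDM

    ; Upper case wrapper
    ADDPD MACRO M1,M2
      sse2_addpd M1,M2
    ENDM

    ; Lower case wrapper
    addpd MACRO M1,M2
      sse2_addpd M1,M2
    ENDM
```

With these in place, `addpd xmm0, xmm1` and `ADDPD xmm0, xmm1` should both emit the bytes 66 0F 58 C1, which is the ADDPD encoding. Keeping the prefix byte in one base macro means a fix only has to be made in one place.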

Now the next trick is I have almost no SSE2 code handy to test the macros on. I wonder if anyone has some test code with SSE2 that can be used for testing the macros?

phase_verocity

I put the macros in a small program of mine and then ran the program through OllyDbg. It formats them like this at run time

066:0F2805

but the debugger still says it's running movaps instead of movapd like I want it to.

004011D9   > 66:0F2805 70304000  MOVAPS XMM0,DQWORD PTR DS:[403070]


cheers

Mark Jones

Quote from: phase_verocity on July 23, 2006, 02:36:00 PM
but the debugger still says it's running movaps instead of movapd like I want it to.

Ollydbg has known limitations on SSE2 registers. We are all waiting on version 2.0 :bg
"To deny our impulses... foolish; to revel in them, chaos." MCJ 2003.08
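
For reference, the bytes in the OllyDbg listing quoted above can be read off by hand. The breakdown below follows the standard IA-32 encoding, in which 66 0F 28 is MOVAPD, so the emitted bytes are the intended instruction and only the disassembler's label is off. The `MOVAPD` macro shown is an assumption about what was assembled, written in the same shape as the other macros in this thread; the actual macro used in the test program was not posted.

```masm
    ; Bytes at 004011D9:  66 0F 28 05 70 30 40 00
    ;
    ;   66           prefix that selects the packed double form
    ;   0F 28        opcode shared by MOVAPS and MOVAPD
    ;   05           ModRM: mod=00, reg=000 (XMM0), r/m=101 (32-bit displacement)
    ;   70 30 40 00  displacement 00403070h, stored low byte first
    ;
    ; Assumed macro that would produce this sequence under ML 6.14:
    MOVAPD MACRO M1,M2
      db 066h         ; turns MOVAPS (0F 28) into MOVAPD (66 0F 28)
      MOVAPS M1,M2
    ENDM
```

A disassembler that treats 66h as a plain operand-size prefix, rather than as part of the SSE2 opcode, will print `66:` followed by the MOVAPS mnemonic, which matches what the listing shows.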

agner

I have made macros for SSE2, SSE3, SSE4 at
http://www.agner.org/optimize/#macros

Some versions of ML recognize SSE4 but put the code bytes in the wrong order  :lol

hutch--

Agner,

Thanks for the macros, they look like good stuff.  :U