I vaguely remember someone writing some macros to stretch ML 6.14 a little so it could produce SSE2/SSE3 code. Has anyone done any work in this area? It would help to give a good, accessible version a new lease of life.
Hutch, you wrote at least one of them. :U Or maybe it was really Mark Larson as a special guest?
http://www.old.masmforum.com/viewtopic.php?t=3007
http://www.old.masmforum.com/viewtopic.php?t=2999
It must have been Mark, thanks for finding them.
You can also get them off the Intel website.
AeroASM, could you post the link ?
This one is for SSE3:
http://www.intel.com/cd/ids/developer/asmo-na/eng/167741.htm?page=9
I couldn't find the one for SSE2; you could possibly download MASM 6.15 from Microsoft?
If ML 6.15 were available, we would not need them. I own ML up to version 7 and every version in between, but the p5 processor pack is the only external source of ML 6.15 and it is only licensed for VC owners.
6.14 almost does it all and it's a very reliable version, so stretching it to do SSE2 would give it life until the end of 32 bit Windows, which will be longer than many think. When Mark gets back into business with his new job and the like, I will ask him if he has time to implement the SSE2 he suggested in a thread in the old forum. Any other reliable sources would be appreciated.
Writing the SSE3 macros wasn't that bad. There are only 13 of them. Having to do SSE2 macros is a bit more time consuming. I made sure the macros I wrote would work with MASM 6.14, since that is what comes with MASM32. So someone could use the macros I wrote as a basis for writing the SSE2 macros.
Hutch--, any luck on getting 6.15 from Microsoft? Or maybe providing a few links in the install of where to get it?
Mark,
I have the VC6 Processor Pack upgrade on the forum website but it's only licensed to VC6 owners. They are VERY SLOW to deal with unfortunately but we may be lucky one day. :P
If it's not a big deal, a set of SSE2 macros that were reliable would be a blessing as it will extend the life of 6.14 until the end of win32. What I envisaged was a separate include file named something like "sse2.inc" so just by including the file, sse2 is enabled through the macros.
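A minimal sketch of how such an include file could be laid out, assuming the prefix technique discussed in this thread. The guard symbol SSE2_INC and the @Version test are assumptions made for illustration, not something from the posts; the idea is to define the macros only on assemblers older than 6.15, which lack the real mnemonics.

```asm
; sse2.inc : sketch only; SSE2_INC is a hypothetical guard symbol
IFNDEF SSE2_INC
SSE2_INC EQU 1

IF @Version LT 615              ; assumption: ML 6.14 reports @Version as 614

; packed double add: 66h prefix in front of the ADDPS encoding
ADDPD MACRO dst, src
    db 066h
    addps dst, src
ENDM

; further SSE2 macros would follow the same pattern

ENDIF                           ; @Version LT 615
ENDIF                           ; SSE2_INC
```

A source file would then only need an "include sse2.inc" line after its .686 and .xmm directives.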
Quote from: hutch-- on February 25, 2005, 04:21:41 PM
Mark,
I have the VC6 Processor Pack upgrade on the forum website but it's only licensed to VC6 owners. They are VERY SLOW to deal with unfortunately but we may be lucky one day. :P
If it's not a big deal, a set of SSE2 macros that were reliable would be a blessing as it will extend the life of 6.14 until the end of win32. What I envisaged was a separate include file named something like "sse2.inc" so just by including the file, sse2 is enabled through the macros.
Well, I thought I would make emu64.inc, emulating the extra registers for those who are interested in writing 64-bit code but still only have a 32-bit machine.
I would like help writing the case code that detects whether to run the 64-bit code or the emulated counterpart.
I guess XMM8-XMM15 should be placed in local variables for best speed.
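One way to approach the detection part is to ask the CPU whether it supports long mode via CPUID. A minimal sketch, assuming a flat-model 32-bit build and a CPU that has the CPUID instruction; HasLongMode is a hypothetical name. It only reports CPU capability and says nothing about the operating system, and a 32-bit process cannot normally execute 64-bit code, so the actual choice between builds would probably still fall to an installer or launcher.

```asm
.586                            ; CPUID needs at least the Pentium instruction set
.model flat, stdcall
.code

; Returns EAX = 1 if the CPU reports long mode (64-bit) support, else 0.
HasLongMode PROC
    push ebx                    ; CPUID overwrites EBX, which callers expect preserved
    mov  eax, 80000000h         ; ask for the highest extended CPUID function
    cpuid
    cmp  eax, 80000001h
    jb   not_supported          ; extended feature leaf is not available
    mov  eax, 80000001h         ; extended feature flags
    cpuid
    xor  eax, eax
    bt   edx, 29                ; EDX bit 29 = long mode
    setc al                     ; EAX = 1 when the bit is set
    pop  ebx
    ret
not_supported:
    xor  eax, eax
    pop  ebx
    ret
HasLongMode ENDP

END
```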
daydreamer,
It sounds like an interesting project, what I wonder is if the object module format is the same from 32 bit COFF to whatever will be used in win64 on AMD64 hardware.
http://www.textpad.com/add-ons/synh2m.html
I found the syntax for ML 6.15. If you prefer a direct link: http://www.textpad.com/add-ons/files/syntax/masm.zip
Char,
Thanks for the link, you will find that 6.15 uses identical syntax to 6.14 but has added capacity with SSE2 instructions. 6.15 is just a warmed-over 6.14.
Microsoft (R) Macro Assembler Version 6.15.8803
Copyright (C) Microsoft Corp 1981-2000. All rights reserved.
ML [ /options ] filelist [ /link linkoptions ]
/AT             Enable tiny model (.COM file)          /omf        generate OMF format object file
/Bl<linker>     Use alternate linker                   /Sa         Maximize source listing
/c              Assemble without linking               /Sc         Generate timings in listing
/Cp             Preserve case of user identifiers      /Sf         Generate first pass listing
/Cu             Map all identifiers to upper case      /Sl<width>  Set line width
/Cx             Preserve case in publics, externs      /Sn         Suppress symbol-table listing
/coff           generate COFF format object file       /Sp<length> Set page length
/D<name>[=text] Define text macro                      /Ss<string> Set subtitle
/EP             Output preprocessed listing to stdout  /St<string> Set title
/F <hex>        Set stack size (bytes)                 /Sx         List false conditionals
/Fe<file>       Name executable                        /Ta<file>   Assemble non-.ASM file
/Fl[file]       Generate listing                       /w          Same as /W0 /WX
/Fm[file]       Generate map                           /WX         Treat warnings as errors
/Fo<file>       Name object file                       /W<number>  Set warning level
/FPi            Generate 80x87 emulator encoding       /X          Ignore INCLUDE environment path
/Fr[file]       Generate limited browser info          /Zd         Add line number debug info
/FR[file]       Generate full browser info             /Zf         Make all symbols public
/G<c|d|z>       Use Pascal, C, or Stdcall calls        /Zi         Add symbolic debug info
/H<number>      Set max external name length           /Zm         Enable MASM 5.10 compatibility
/I<name>        Add include path                       /Zp[n]      Set structure alignment
/link <linker options and libraries>                   /Zs         Perform syntax check only
/nologo         Suppress copyright message
I also found this on a website, but the website only contained that plain text, so I just copied and pasted it and saved it.
I searched for SSE3 source code as far as page 4 of the Google results. However, I just noticed that Fasm supports SSE3.
Edit
If HLA translates from HLA to Fasm and from HLA to Masm32, I think you could take the code from Fasm, if that is legal, and translate it with HLA so we would have the SSE3 source.
char,
You get that help info with ML by typing
ml /? > text.txt
You get the best SSE3 info from the Intel manuals.
Quote from: hutch-- on February 26, 2005, 03:02:16 AM
daydreamer,
It sounds like an interesting project, what I wonder is if the object module format is the same from 32 bit COFF to whatever will be used in win64 on AMD64 hardware.
I was thinking of case code in the installer that chooses between an .exe built with ml + emu64.inc and one built with ml64 without it,
but maybe it is better to let the user choose the 64-bit or 32-bit version.
I was first thinking of using emu64.inc and equates for the extra registers.
OK, I could make macros for add RAX,RBX, sub RAX,RBX etc., but I don't want to waste time on it if people aren't interested in using it.
If anybody is interested I will put more work into it.
Quote from: hutch-- on February 25, 2005, 03:37:27 AM
If ML 6.15 were available, we would not need them. I own ML up to version 7 and every version in between, but the p5 processor pack is the only external source of ML 6.15 and it is only licensed for VC owners.
6.14 almost does it all and it's a very reliable version, so stretching it to do SSE2 would give it life until the end of 32 bit Windows, which will be longer than many think. When Mark gets back into business with his new job and the like, I will ask him if he has time to implement the SSE2 he suggested in a thread in the old forum. Any other reliable sources would be appreciated.
I don't have 6.14, so I would appreciate it if you could test these macros. I have also broken my new CPU, so I can't test-run SSE2 either.
How do I write SSE2 macros that have the same names as the MMX instructions but take mem128/XMM registers, without a name conflict in 6.14? For those it is also a db 66h prefix on the same instructions.
And is it possible, inside macros, to override the PS instructions with a db 0F2h prefix to make all the SD instructions, which are 64 bits wide?
I am not the most experienced macro writer, but I know a bit about SSE from research for my tutorial and from writing a few programs.
I have uploaded all the SD instructions as well.
daydreamer,
Thanks for making the effort writing these, I had a quick play to format them up so they are easier to read and I would like to see your NAME on the include file so we all know who did the work. I can easily enough make the alternate lower case versions but what I don't have floating around at the moment is any SSE2 code to test it with.
If we can get these going reliably, it will extend the life of MASM 6.14 for some time as it is a very reliable version apart from not supporting SSE2.
My fractal code in the Laboratory uses SSE2; you could try that.
:U
Magnus,
These look good. I have again tweaked the format so they are easier to read but the count is 21 so far and this will go a long way to making SSE2 available to many people.
I have formatted them like this,
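; packed double add: 66h prefix + ADDPS encoding
ADDPD MACRO M1,M2
    db 066h
    ADDPS M1,M2
ENDM

; scalar double add: F2h prefix + ADDPS encoding
ADDSD MACRO M1,M2
    db 0F2h
    ADDPS M1,M2
ENDM

; packed double subtract: 66h prefix + SUBPS encoding
SUBPD MACRO M1,M2
    db 066h
    SUBPS M1,M2
ENDM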
I think it is easy enough to make both upper and lower case versions by having a base macro and calling it with two others, an upper case and a lower case version. The important part at the moment is to get them working and make sure they are reliable. The rest is just detail so they are easier to use.
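The base-macro idea could look something like the following. A sketch only: sse2_addpd is a hypothetical name, and the two wrappers are only distinct when identifiers are case sensitive (for example with "option casemap:none", as MASM32 sources normally use). Without that, the second definition should simply replace the first, which is harmless here.

```asm
; base macro that does the real work (hypothetical name)
sse2_addpd MACRO dst, src
    db 066h                     ; 66h prefix turns ADDPS into ADDPD
    addps dst, src
ENDM

; upper case wrapper
ADDPD MACRO dst, src
    sse2_addpd dst, src
ENDM

; lower case wrapper
addpd MACRO dst, src
    sse2_addpd dst, src
ENDM
```

Keeping the encoding in one base macro means a fix only has to be made in one place.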
Now the next trick is I have almost no SSE2 code handy to test the macros on. I wonder if anyone has some test code with SSE2 that can be used for testing the macros ?
I put the macros in a small program of mine and then ran the program through OllyDbg. It formats them like this at run time
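For a quick check, something as small as the fragment below may do. It is a sketch, not code from the thread: the MOVUPD macro, the data names and the procedure name are invented for illustration. It uses unaligned moves and a register-to-register add so that no assumption about 16-byte data alignment is needed.

```asm
.686
.xmm
.model flat, stdcall
option casemap:none

; same prefix technique as the macros above
MOVUPD MACRO dst, src           ; hypothetical helper: 66h + MOVUPS
    db 066h
    movups dst, src
ENDM

ADDPD MACRO dst, src            ; 66h + ADDPS
    db 066h
    addps dst, src
ENDM

.data
va  REAL8 1.0, 2.0
vb  REAL8 3.0, 4.0
vr  REAL8 0.0, 0.0              ; should hold 4.0, 6.0 after the call

.code
TestAddpd PROC
    mov    eax, OFFSET va
    mov    ecx, OFFSET vb
    mov    edx, OFFSET vr
    MOVUPD xmm0, [eax]          ; xmm0 = 1.0, 2.0
    MOVUPD xmm1, [ecx]          ; xmm1 = 3.0, 4.0
    ADDPD  xmm0, xmm1           ; xmm0 = 4.0, 6.0
    MOVUPD [edx], xmm0          ; store the result for inspection
    ret
TestAddpd ENDP

END
```

If vr holds anything other than 4.0 and 6.0 afterwards, the macros (or the CPU's SSE2 support) would be the first things to check.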
066:0F2805
but the debugger still says it's running movaps instead of movapd like I want it to.
004011D9 > 66:0F2805 70304000 MOVAPS XMM0,DQWORD PTR DS:[403070]
cheers
Quote from: phase_verocity on July 23, 2006, 02:36:00 PM
but the debugger still says it's running movaps instead of movapd like I want it to.
Ollydbg has known limitations on SSE2 registers. We are all waiting on version 2.0 :bg
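For what it is worth, the bytes in that debugger line match the MOVAPD encoding given in the Intel manuals (66 0F 28 /r), so the macro appears to be doing its job and only the disassembly text is off. A sketch of the macro with the bytes from the post annotated; the macro body is an assumption about what was actually used.

```asm
; MOVAPD via the prefix technique: 66h in front of the MOVAPS encoding
MOVAPD MACRO dst, src
    db 066h
    movaps dst, src
ENDM

; Loading xmm0 from the variable at 00403070h gives the bytes shown in the post:
;   66            operand-size prefix (selects the packed double form)
;   0F 28         MOVAPS / MOVAPD load opcode
;   05            ModRM: mod=00, reg=000 (xmm0), r/m=101 (disp32 follows)
;   70 30 40 00   little-endian address 00403070h
```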
I have made macros for SSE2, SSE3, SSE4 at
http://www.agner.org/optimize/#macros
Some versions of ML recognize SSE4 but put the code bytes in wrong order :lol
Agner,
Thanks for the macros, they look like good stuff. :U