I need to write something that gets the values from GlobalMemoryStatus (returned in a MEMORYSTATUS structure) and converts all the values in that structure (except the first member) from bytes to megabytes (divide by 1024*1024). Instead of using the div instruction I thought of using SSE instructions. I tried once but got bad results because I have never worked with SSE. Any help would be appreciated.
If a truncated integer value is acceptable you could just shift the byte count right by 20 bit positions. Or you could multiply the byte count by some appropriate power of 10 before the shift and treat the result as a fixed-point decimal number.
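To make the two options above concrete, here is a minimal sketch in C. The function names are made up for illustration, and the scale factor of 100 (two decimal places) is just one possible choice of "appropriate power of 10". The multiply is widened to 64 bits because a 32-bit byte count times 100 can overflow 32 bits.

```c
#include <stdint.h>

/* Truncating conversion: one megabyte is 2^20 bytes, so shift right by 20. */
uint32_t bytes_to_mb(uint32_t bytes)
{
    return bytes >> 20;
}

/* Fixed-point variant (hypothetical helper): returns megabytes scaled by 100,
   so 150 means 1.50 MB. The product is computed in 64 bits to avoid overflow. */
uint32_t bytes_to_mb_x100(uint32_t bytes)
{
    return (uint32_t)(((uint64_t)bytes * 100u) >> 20);
}
```

For example, bytes_to_mb(536870912) gives 512, and bytes_to_mb_x100(1572864) gives 150, which you would print as 1.50.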
SSE is mostly new floating-point instructions. Regardless, there is no integer division instruction in MMX, SSE, or SSE2. You could technically convert the values to floating point, do the division, and convert them back. Or you can follow Michael's suggestion but do the shifts in parallel. Look at pslld and psrld. With XMM registers they both require SSE2, and they shift each dword value in the register. Also note that any data you access in memory with aligned moves such as movaps has to be on a 16-byte boundary, unless you use the unaligned moves (movups or movdqu). You can also use the MMX versions, which have the same mnemonics, pslld and psrld. The register operand (MM0-MM7 rather than XMM0-XMM7) distinguishes the MMX form from the SSE2 form. MMX lets you do two dwords in parallel and doesn't raise an exception if the data isn't aligned.
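As a rough illustration of the parallel shift, here is a sketch in C using the SSE2 intrinsics from emmintrin.h. The function name shift_right_4 and the HAVE_SSE2 macro are made up for this example. _mm_srl_epi32 corresponds to psrld with the count in a register, and the unaligned load/store (movdqu) is used so the array does not need 16-byte alignment. On a compiler without SSE2 it falls back to a plain loop.

```c
#include <stdint.h>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define HAVE_SSE2 1
#endif

/* Shift four dwords right by `bits` (0..31) positions. */
void shift_right_4(uint32_t v[4], int bits)
{
#ifdef HAVE_SSE2
    /* movdqu: load 128 bits without any alignment requirement */
    __m128i x = _mm_loadu_si128((const __m128i *)v);
    /* psrld: logical right shift of each of the four dwords */
    x = _mm_srl_epi32(x, _mm_cvtsi32_si128(bits));
    _mm_storeu_si128((__m128i *)v, x);
#else
    /* Scalar fallback: same result, one dword at a time */
    for (int i = 0; i < 4; i++)
        v[i] >>= bits;
#endif
}
```

If you can guarantee 16-byte alignment, _mm_load_si128 and _mm_store_si128 (movdqa) could replace the unaligned versions.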
Thanks for the help! I know that movaps moves 128-bit float data, but how do I move integers to xmm registers? Maybe it doesn't make any difference, but the code I wrote doesn't work:
movaps xmm0, memstatus ;MEMORYSTATUS structure (256-bits)
movaps xmm7, memstatus+16
psrld xmm0, 10
psrld xmm7, 10
movaps memstatus, xmm0
movaps memstatus+16, xmm7
Please correct me.
Quote from: russian on January 16, 2005, 06:33:28 PM
The integer version is MOVDQA. However, MOVAPS actually works fine for copying integer data to XMM registers. The XMM form of PSRLD is an SSE2 instruction, so it requires MASM 6.15 or later and an AMD64 or Pentium 4 processor. Other than that your code is fine. You are shifting each dword right by 10 bits, and 2^10 = 1024, so you are converting the values to kilobytes. Shift by 20 bits instead to get megabytes.
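Putting the pieces together, a sketch of the whole conversion in C might look like the following. MemStatus32 is a hypothetical stand-in for the 32-bit MEMORYSTATUS layout (eight dwords), with the six byte counts from dwTotalPhys through dwAvailVirtual grouped into an array for convenience. It is not the real Windows declaration, and on 64-bit Windows the size fields are wider than a dword, so this would not apply as written. The sketch leaves dwLength alone as requested, and also dwMemoryLoad, since that one is a percentage rather than a byte count.

```c
#include <stdint.h>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define HAVE_SSE2 1
#endif

/* Hypothetical mirror of the 32-bit MEMORYSTATUS layout (32 bytes). */
typedef struct {
    uint32_t dwLength;      /* structure size, left unchanged */
    uint32_t dwMemoryLoad;  /* percentage, left unchanged */
    uint32_t dwSizes[6];    /* dwTotalPhys .. dwAvailVirtual, in bytes */
} MemStatus32;

/* Convert the six byte counts to megabytes in place (shift right by 20). */
void memstatus_to_mb(MemStatus32 *ms)
{
#ifdef HAVE_SSE2
    /* The sizes start at offset 8, which is not 16-byte aligned,
       so use the unaligned load/store (movdqu) for the first four. */
    __m128i x = _mm_loadu_si128((const __m128i *)ms->dwSizes);
    x = _mm_srli_epi32(x, 20);              /* psrld xmm, 20 */
    _mm_storeu_si128((__m128i *)ms->dwSizes, x);
    /* The remaining two dwords are cheap enough to do with scalar shifts. */
    ms->dwSizes[4] >>= 20;
    ms->dwSizes[5] >>= 20;
#else
    for (int i = 0; i < 6; i++)
        ms->dwSizes[i] >>= 20;
#endif
}
```

With only six dwords to convert, the scalar loop is probably just as good in practice. The SSE2 path is here mainly to show how the shift maps onto psrld.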