Hi!
I'm looking for a way to move the high quadword of an XMM register to a specified memory location. The low quadword is no problem, movq does the job, but I have no idea how to handle the high quadword.
Any ideas? Thanks for your help! Hannes
shift it first, before movq.
Really no other possibility?
psrldq xmm0, 8   ; byte-wise shift of the whole register: 8 bytes = 64 bits
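For anyone who wants to try the shift-then-movq idea from C, here is a minimal sketch using SSE2 intrinsics. The whole-register shift in SSE2 is psrldq, which counts in bytes, so moving the high quadword down is a shift by 8. The helper name store_high_by_shift is made up for illustration, and the instruction named in each comment is what a compiler would typically emit, not a guarantee.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: write the high quadword of v to *dst.
   The register is shifted right by 8 bytes so the high quadword
   lands in the low half, which movq can then store. */
static void store_high_by_shift(int64_t *dst, __m128i v)
{
    __m128i hi = _mm_srli_si128(v, 8);       /* psrldq xmm, 8 */
    _mm_storel_epi64((__m128i *)dst, hi);    /* movq m64, xmm */
}
```

The shift is logical (zeros are shifted in), but that does not matter here: the 64 bits that end up in the low half are copied unchanged, so a signed value keeps its sign.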
That's a bit tough performance-wise, isn't it? It's part of an attempt to replace MMX code, but that shift will kill the additional performance for sure.
Thanks for your reply! Have a nice day, Hannes
movlpd and movhpd, I think.
Thanks for your tip!
I will try it ASAP!
Cheers, Hannes
Sorry, that doesn't work, since I'm dealing with signed integers not floats.
Shifting doesn't work either, since only logical shifting is possible.
Does anybody have any ideas to get the 2 quadwords separated into whatever, 2 mmx registers or 4 general purpose registers?
Thanks! Hannes
Hey, pshufd xmm1,xmm1,4Eh exchanges the quadwords - great. Only to find out that the algorithm has another problem elsewhere.
Yeah, assembly likes games...
Have a nice day, Hannes
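A quick way to check a pshufd immediate is to decode it two bits at a time. Each 2-bit field, from the lowest bits upward, picks the source doubleword for destination doublewords 0, 1, 2, 3. The immediate 4Eh (01 00 11 10b) therefore swaps the two quadwords, whereas E4h (11 10 01 00b) leaves the register unchanged. A small sketch with SSE2 intrinsics, where swap_qwords is a hypothetical helper name:

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: exchange the low and high quadwords of v.
   0x4E = 01 00 11 10b picks source dwords 1,0,3,2 for
   destination dwords 3,2,1,0. */
static __m128i swap_qwords(__m128i v)
{
    return _mm_shuffle_epi32(v, 0x4E);       /* pshufd xmm, xmm, 4Eh */
}
```

After the swap, a plain movq store writes what used to be the high quadword.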
Quote from: a_h on October 08, 2005, 08:35:52 AM
Sorry, that doesn't work, since I'm dealing with signed integers not floats.
It will still work. Unlike the FPU, MMX and SSE do not keep track of what type of data is in the registers.
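To illustrate the point, here is a small sketch in C with SSE2 intrinsics that splits a register holding two signed 64-bit integers using the "double" stores. The name split_qwords is invented for this example. The temporaries and memcpy are only there to satisfy C's aliasing rules; in assembly you would store straight to the destination with movlpd/movhpd.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write the low and high quadwords of v to *lo and *hi.
   The cast only reinterprets the register; no int-to-float conversion happens. */
static void split_qwords(int64_t *lo, int64_t *hi, __m128i v)
{
    double tmp_lo, tmp_hi;
    __m128d as_pd = _mm_castsi128_pd(v);
    _mm_storel_pd(&tmp_lo, as_pd);           /* movlpd m64, xmm */
    _mm_storeh_pd(&tmp_hi, as_pd);           /* movhpd m64, xmm */
    memcpy(lo, &tmp_lo, sizeof tmp_lo);      /* copy the raw 64 bits */
    memcpy(hi, &tmp_hi, sizeof tmp_hi);
}
```

Negative integers survive the round trip because the stores copy 64 bits verbatim, whatever those bits would mean as a floating-point number.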
Thanks for correcting me! You're perfectly correct - so the reference to single/double precision values in the manual just refers to the number of bits moved.
I still have to fix the rest as well...
Cheers & thanks for your help! Hannes
Err, so MOVLPD and MOVQ are essentially the same instruction? Just that MOVQ is far more flexible, of course.
What about speed? I have no time to test that right now.
Cheers, Hannes
Logically they do the same thing, although I have heard that you are "not supposed" to misuse them.
After fixing another bug, movlpd/movhpd don't seem to work as intended.
I still have another problem: I want to get the low quadword from [eax] and the high quadword from [eax+offset] (offset not 16) into an xmm register. It seems that such non-contiguous moves are a real problem with SSE.
movq cannot be used, since it zeroes the high quad. pshufd cannot be used, since I have 2 memory locations.
It would be great to have a way to copy 2 MMX registers into a single xmm register, but I'm not aware of one.
If somebody has an idea, that would be great!
Thanks a lot, Hannes
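For the two-address load, one possible approach is a movq (or movlpd) for the low half followed by a movhpd for the high half, since the memory-source form of movhpd writes only the upper 64 bits and leaves the lower 64 alone. A minimal sketch with SSE2 intrinsics follows; load_two_qwords is a hypothetical name, and the memcpy through a temporary is only there for C's aliasing rules (in assembly the movhpd would read [eax+offset] directly).

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: build one register from two unrelated addresses,
   low quadword from *lo and high quadword from *hi. */
static __m128i load_two_qwords(const int64_t *lo, const int64_t *hi)
{
    double tmp;
    __m128i low_only;
    __m128d both;

    low_only = _mm_loadl_epi64((const __m128i *)lo);    /* movq xmm, m64 (high half zeroed) */
    memcpy(&tmp, hi, sizeof tmp);                       /* raw 64 bits, no conversion */
    both = _mm_loadh_pd(_mm_castsi128_pd(low_only), &tmp); /* movhpd xmm, m64 (low half kept) */
    return _mm_castpd_si128(both);
}
```

The order matters: the movq has to come first because it clears the high half, and the movhpd then fills it in.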
How many bugs can exist in less than 30 lines of code?!! Forgive me AeroASM! It works perfectly now!
Thanks for your advice! Hannes