possible to move high quadword from xmm-register to memory?

Started by a_h, September 09, 2005, 05:52:27 PM


a_h

Hi!

I'm looking for a way to move the high quadword of an xmm register to a specified memory location. The low quadword is no problem, since movq does the job, but I have no idea how to do it for the high quadword.

Any ideas? Thanks for your help! Hannes


a_h

Really no other possibility?

psrldq xmm0, 8    ; shift the whole register right by 8 bytes (64 bits)

That's a bit tough performance-wise, isn't it? This is part of an attempt to replace MMX code, and that shift will surely eat up the additional performance.

Thanks for your reply! Have a nice day, Hannes


a_h

Sorry, that doesn't work, since I'm dealing with signed integers not floats.

Shifting doesn't work either, since only logical shifting is possible.

Does anybody have any ideas how to split the two quadwords into, say, two MMX registers or four general-purpose registers?

Thanks! Hannes

a_h

Hey, pshufd xmm1,xmm1,4Eh exchanges the quadwords - great. Only to find out that the algorithm has another problem elsewhere.

Yeah, assembly likes games...

Have a nice day, Hannes
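For reference, here is a minimal sketch of how the pshufd immediate picks dwords, written in C with SSE2 intrinsics rather than the assembly used in this thread. The helper names are made up for the example. Each 2-bit field of the immediate selects the source dword for one destination dword, so 4Eh (01 00 11 10b) swaps the two quadwords, while E4h (11 10 01 00b) leaves the register as it is.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* pshufd xmm, xmm, 4Eh
   imm8 = 01 00 11 10b: dest dwords 0,1 come from source dwords 2,3
   and dest dwords 2,3 come from source dwords 0,1, i.e. the two
   quadwords trade places. */
static __m128i swap_qwords(__m128i v)
{
    return _mm_shuffle_epi32(v, 0x4E);
}

/* pshufd xmm, xmm, E4h
   imm8 = 11 10 01 00b: every dword selects itself, so nothing moves. */
static __m128i identity_shuffle(__m128i v)
{
    return _mm_shuffle_epi32(v, 0xE4);
}
```

After the swap, a plain movq store of the low quadword writes what used to be the high quadword.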

AeroASM

Quote from: a_h on October 08, 2005, 08:35:52 AM
Sorry, that doesn't work, since I'm dealing with signed integers not floats.

It will still work. Unlike the FPU, MMX and SSE do not keep track of what type of data is in the registers.
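A minimal sketch of that point, in C with SSE2 intrinsics instead of raw assembly: the "double" store of the high half copies 64 bits unchanged, so a signed integer survives it. The helper names high_qword and low_qword are invented for this example, and the temporary plus memcpy is only there to satisfy C's aliasing rules. In assembly the whole thing is just movhpd [mem], xmm0.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>
#include <string.h>

/* Extract the high 64 bits of an integer vector.
   _mm_storeh_pd typically compiles to MOVHPD m64, xmm. The cast
   intrinsic generates no instruction; it only changes the C type. */
static int64_t high_qword(__m128i v)
{
    double bits;                                /* raw 64-bit container */
    _mm_storeh_pd(&bits, _mm_castsi128_pd(v));  /* store bits 127..64 */
    int64_t out;
    memcpy(&out, &bits, sizeof out);            /* reinterpret, no conversion */
    return out;
}

/* Extract the low 64 bits (MOVQ m64, xmm). */
static int64_t low_qword(__m128i v)
{
    int64_t out;
    _mm_storel_epi64((__m128i *)&out, v);
    return out;
}
```

With a vector built by _mm_set_epi64x(-5, 7), high_qword returns -5 and low_qword returns 7, even though -5 is not a meaningful double.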

a_h

Thanks for correcting me! You're perfectly correct - so the reference to single/double precision values in the manual just refers to the number of bits moved.

I still have to fix the rest as well...

Cheers & thanks for your help! Hannes

a_h

Err, so MOVLPD and MOVQ are essentially the same commands? Just that MOVQ is far more flexible, of course.

What about speed? I have no time to test that right now.

Cheers, Hannes

AeroASM

Logically they do the same thing, although I have heard that you are "not supposed" to misuse them.

a_h

After fixing another bug, the movlpd/movhpd don't seem to work as intended.

I still have another problem: I want to get the low quadword from [eax] and the high quadword from [eax+offset] (offset not 16) into an xmm register. It seems that such non-contiguous moves are a real problem with SSE.

movq cannot be used, since it zeroes the high quad. pshufd cannot be used, since I have 2 memory locations.

It would be great to have a way to copy two MMX registers into a single xmm register, but I'm not aware of one.

If somebody has an idea, that would be great!

Thanks a lot, Hannes
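One way to do the two-location load described above, sketched in C with SSE2 intrinsics: fill the low half with movq (which zeroes the high half), then overwrite only the high half with movhpd. The function name load_split is hypothetical, and the memcpy into a local double is only there to keep the C code free of aliasing problems. In assembly this would be roughly movq xmm0, [eax] followed by movhpd xmm0, [eax+offset].

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>
#include <string.h>

/* Build one vector from two unrelated 64-bit memory locations. */
static __m128i load_split(const int64_t *lo, const int64_t *hi)
{
    double hi_bits;
    memcpy(&hi_bits, hi, sizeof hi_bits);  /* byte copy, bits unchanged */

    /* MOVQ xmm, m64: loads the low quadword and zeroes the high one. */
    __m128i v = _mm_loadl_epi64((const __m128i *)lo);

    /* MOVHPD xmm, m64: replaces only the high quadword, low half kept. */
    __m128d d = _mm_loadh_pd(_mm_castsi128_pd(v), &hi_bits);

    return _mm_castpd_si128(d);            /* cast back, no instruction */
}
```

Calling load_split(&buf[0], &buf[3]) on an int64_t array puts buf[0] in the low quadword and buf[3] in the high quadword, whatever lies between them.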

a_h

How many bugs can exist in less than 30 lines of code?!! Forgive me AeroASM! It works perfectly now!

Thanks for your advice! Hannes