How to merge high quad and low quad of 2 different SSE registers?

Started by a_h, November 22, 2005, 04:30:30 PM


a_h

Currently I'm trying once more to replace existing MMX code with SSE code, but there is one line that I have no idea how to replace:

punpcklwd mm2, mm2

To do the same in SSE, I would need an instruction that interleaves the low words of the low quadword and, separately, the low words of the high quadword. punpcklwd xmm2, xmm2 doesn't work, since it interleaves all four words of the low quadword (not only its two low words), so the high quadword of the result is completely wrong.
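To illustrate the behaviour described above, here is a minimal sketch in C with SSE2 intrinsics rather than assembly. It assumes that _mm_unpacklo_epi16 compiles to punpcklwd on an xmm register; the helper name unpack_low_words_self is made up for this example.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: equivalent of "punpcklwd xmm2, xmm2",
   i.e. the register is interleaved with itself. */
static void unpack_low_words_self(const uint16_t in[8], uint16_t out[8])
{
    __m128i v = _mm_loadu_si128((const __m128i *)in);

    /* Interleaves words 0..3 of both operands: w0 w0 w1 w1 w2 w2 w3 w3.
       Words 4..7 of the input do not appear in the result at all. */
    __m128i r = _mm_unpacklo_epi16(v, v);

    _mm_storeu_si128((__m128i *)out, r);
}
```

With the input words 0..7, the result is 0 0 1 1 2 2 3 3: the high quadword holds the duplicated words 2 and 3, not words 4 and 5.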

If I do


movdqa xmm3, xmm2
punpcklwd xmm2, xmm2
punpckhwd xmm3, xmm3

then I need to merge the low quads of both registers, xmm2 and xmm3, into a single register. How could I do that?

Any ideas?

Thanks for your help! Hannes

MazeGen

I didn't try it, but this should work:

shufpd xmm2, xmm3, 10y

Bit 0 = 0 selects the low quadword of xmm2, bit 1 = 1 selects the high quadword of xmm3.
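The selection rule can be sketched in C with SSE2 intrinsics, assuming _mm_shuffle_pd maps to shufpd and _mm_unpacklo_epi64 maps to punpcklqdq. The function names are made up for this example. Note that the immediate 10 (binary) combines the low quadword of the first register with the high quadword of the second, as in the thread title. If the low quadwords of both registers are wanted, as in the question text, an immediate of 00 or punpcklqdq would presumably be the fit; the second function shows that variant.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: "shufpd a, b, 10b".
   Immediate bit 0 picks the low result quadword from a (0 = low, 1 = high).
   Immediate bit 1 picks the high result quadword from b (0 = low, 1 = high). */
static void merge_lo_a_hi_b(const uint64_t a[2], const uint64_t b[2],
                            uint64_t out[2])
{
    __m128d va = _mm_castsi128_pd(_mm_loadu_si128((const __m128i *)a));
    __m128d vb = _mm_castsi128_pd(_mm_loadu_si128((const __m128i *)b));
    __m128d r  = _mm_shuffle_pd(va, vb, 2);  /* result = { a.low, b.high } */
    _mm_storeu_si128((__m128i *)out, _mm_castpd_si128(r));
}

/* Hypothetical helper: "punpcklqdq a, b", which keeps both low quadwords. */
static void merge_lo_a_lo_b(const uint64_t a[2], const uint64_t b[2],
                            uint64_t out[2])
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    __m128i r  = _mm_unpacklo_epi64(va, vb); /* result = { a.low, b.low } */
    _mm_storeu_si128((__m128i *)out, r);
}
```

Both functions use unaligned loads and stores so that plain arrays can be passed in; in real code with 16-byte aligned data, the aligned variants would be the usual choice.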

It is really weird that MOVHPD and similar cannot work with register operands only :tdown


a_h

Hey thanks for your reply!

Nice idea! shufpd is really the solution here; unfortunately it's quite slow... However, in the meantime I realized that I don't need to do anything. punpcklbw/punpckhbw do just the same for MMX and SSE registers without messing up the order. Stupid me.
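A minimal sketch of that observation, in C with SSE2 intrinsics. The surrounding code is not shown in this thread, so interleaving the register with itself (as in the original punpcklwd mm2, mm2) is an assumption, and the function name is made up. The low and high unpack together cover all 16 input bytes in their original order, so no merge step is needed afterwards.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: duplicates every byte of a 16-byte block,
   using the punpcklbw / punpckhbw pair on an xmm register. */
static void duplicate_bytes(const uint8_t in[16], uint8_t out[32])
{
    __m128i v  = _mm_loadu_si128((const __m128i *)in);
    __m128i lo = _mm_unpacklo_epi8(v, v);  /* punpcklbw: bytes 0..7, each twice  */
    __m128i hi = _mm_unpackhi_epi8(v, v);  /* punpckhbw: bytes 8..15, each twice */

    /* Storing lo, then hi, keeps the bytes in their original order. */
    _mm_storeu_si128((__m128i *)out, lo);
    _mm_storeu_si128((__m128i *)(out + 16), hi);
}
```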

Yeah, it's really strange that movhpd/movlpd don't work on registers only.

Thanks a lot for your help! Have a nice evening, Hannes